Project 004 · robust prediction validation

Causal Regression

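Recast regression from "learning the surface correlation $X\to Y$" into "first abduce from the evidence $X$ to a latent individual cause $U$, then generate predictions through a stable mechanism $Y=f(U,\varepsilon)$".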

This page is a temporary noindex review surface: it shows only the title, abstract, math spine, source map, and paper-claim results, and does not copy the full LaTeX project.

DiscoSCM downstream validation · selected extraction only · no full LaTeX mirror · updated 2026-05-30 17:49 CST

one sentence

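One-sentence positioning

The central assumption of Causal Regression is that the observed features $X$ are not the truly stable source of the outcome $Y$; both $X$ and $Y$ are projections of a deeper latent individual causal variable $U$ and environmental noise $\varepsilon$. Prediction should therefore go beyond fitting correlations; the contrast between the two approaches is: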

conventional regression

$$\hat y=m_\theta(x)\approx\mathbb E[Y\mid X=x]$$

Learns $X\to Y$ directly. This works under strong IID conditions, but it easily mistakes shortcuts and spurious correlations for regularities.

causal regression

$$U\sim g_\phi(X),\qquad Y=f_\theta(U,\varepsilon)$$

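First infer the latent cause by abduction, then predict through a stable mechanism. Robustness comes from the mechanism layer rather than from the surface statistical layer.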

visual theory map

From "correlational prediction" to "causal mechanism": an illustrated view

Statistical regression

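Input features are treated as flat predictors; as long as the model drives down error on the training distribution, it may learn brittle shortcuts.

Risk: when label noise, distribution shift, or a confounder appears, the shortcut breaks.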

Causal regression

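The model first treats $X$ as evidence for inferring $U$, then feeds $U$ and $\varepsilon$ into the stable mechanism $f$.

Core idea: interpretability is tied to the decomposition of uncertainty into a $U$ part and an $\varepsilon$ part.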

math spine

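Math spine: from DiscoSCM to CausalEngine

This section gathers all formulas on the page into a single chain, so the page does not end up with attractive figures but unclear mathematics.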

1 · DiscoSCM premise

$$\langle U,\mathbf E,\mathbf V,\mathcal F\rangle,\qquad v_i\leftarrow f_i(pa_i,e_i;u)$$

$U$ denotes individual selection; $\mathbf E$ denotes environmental noise. Their separation is the source of the epistemic / aleatoric decomposition used later.

2 · population valuation

$$P(Y^d(x)=y\mid e)=\sum_u P(y_x;u)P(u\mid e)$$

Abduction first yields $P(u\mid e)$, then comes valuation, and finally reduction to a population-level prediction.
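A minimal sketch of the sum above on a discrete toy problem. The three individual types and all probabilities are made-up illustration values, not taken from the paper; `population_prediction` is a hypothetical helper name.

```python
# Toy illustration of abduction -> valuation -> reduction.
# All numbers below are invented for illustration only.

# Abduction: an assumed posterior P(u | e) over three hypothetical individual types.
p_u_given_e = {"u1": 0.5, "u2": 0.3, "u3": 0.2}

# Valuation: assumed P(y_x; u), the probability of outcome y under x for each type u.
p_y_given_u = {"u1": 0.9, "u2": 0.4, "u3": 0.1}


def population_prediction(p_u_given_e, p_y_given_u):
    """Reduction: sum over u of P(y_x; u) * P(u | e)."""
    return sum(p_y_given_u[u] * p for u, p in p_u_given_e.items())


prediction = population_prediction(p_u_given_e, p_y_given_u)
print(round(prediction, 4))
```

The point of the sketch is only that the population-level quantity is a posterior-weighted average of per-individual valuations.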

3 · regression analogue

$$P(U\mid Z)=\mathrm{Cauchy}(\mu_U(Z),\gamma_U(Z))$$

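In regression, $e$ is engineered as the observed feature evidence $X$ (or its representation $Z$), and the output is an approximate posterior over $U$.

Current rigorous framing: CausalEngine does not claim to solve for the true $P(U\mid X)$ exactly; it uses a trainable parametric approximation $P_\phi(U\mid Z)$ to implement the abduction idea of DiscoSCM. Nonlinear complexity is absorbed mainly by the Perception representation $Z$, while the mechanism layer stays linear in exchange for analytic propagation.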

causal engine

Four-stage architecture: Perception → Abduction → Action → Decision

Each node in the figure corresponds to a computation stage in the paper; all formulas are expanded below.

parameterized abduction

$$\mu_U(Z)=W_{loc}Z+b_{loc},\qquad \gamma_U(Z)=\mathrm{softplus}(W_{scale}Z+b_{scale})$$

$\gamma_U$ is the epistemic uncertainty: the model's uncertainty about which kind of individual or cause this sample actually is.
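The abduction head above can be sketched in plain Python as follows. This is an illustrative reading of the two formulas, not the paper's code: the helper names (`softplus`, `affine`, `abduct`) and the example weights are assumptions.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)); always strictly positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def affine(W, b, z):
    # Row-wise W z + b for a list-of-lists matrix W.
    return [sum(w * zi for w, zi in zip(row, z)) + bi for row, bi in zip(W, b)]


def abduct(z, W_loc, b_loc, W_scale, b_scale):
    """Return (mu_U, gamma_U) for a representation z."""
    mu_u = affine(W_loc, b_loc, z)
    gamma_u = [softplus(s) for s in affine(W_scale, b_scale, z)]
    return mu_u, gamma_u


# Hypothetical 2-dimensional example.
z = [1.0, -2.0]
W_loc, b_loc = [[0.5, 0.0], [0.0, 1.0]], [0.0, 0.5]
W_scale, b_scale = [[0.0, 0.0], [1.0, 1.0]], [0.0, 0.0]
mu_u, gamma_u = abduct(z, W_loc, b_loc, W_scale, b_scale)
print(mu_u, gamma_u)
```

The softplus keeps every scale strictly positive, which a Cauchy scale parameter requires.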

robust regression loss

$$\mathcal L=\log(\pi\gamma_S)+\log\!\left(1+\left(\frac{y-\mu_S}{\gamma_S}\right)^2\right)$$

The Cauchy NLL grows only logarithmically in large errors, which naturally reduces the dominance of label outliers in the gradient.
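A small sketch of this loss and of its gradient with respect to $\mu_S$, next to the squared-error gradient for comparison. Function names are illustrative; the gradient is derived by hand from the formula above, with $z=(y-\mu_S)/\gamma_S$.

```python
import math


def cauchy_nll(y, mu_s, gamma_s):
    # log(pi * gamma) + log(1 + z^2), with z the standardized residual.
    z = (y - mu_s) / gamma_s
    return math.log(math.pi * gamma_s) + math.log1p(z * z)


def cauchy_nll_grad_mu(y, mu_s, gamma_s):
    # d/d(mu) of the NLL: -2 z / (gamma * (1 + z^2)); bounded by 1 / gamma in magnitude.
    z = (y - mu_s) / gamma_s
    return -2.0 * z / (gamma_s * (1.0 + z * z))


def squared_error_grad_mu(y, mu_s):
    # d/d(mu) of (y - mu)^2; grows linearly with the residual.
    return -2.0 * (y - mu_s)


for residual in (1.0, 10.0, 100.0):
    print(
        residual,
        round(cauchy_nll(residual, 0.0, 1.0), 4),
        round(cauchy_nll_grad_mu(residual, 0.0, 1.0), 4),
        squared_error_grad_mu(residual, 0.0),
    )
```

With $\gamma_S=1$ the Cauchy gradient magnitude peaks at a residual of 1 and then shrinks, while the squared-error gradient keeps growing with the residual.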

why cauchy + linear

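Core tractability: linear stability of the Cauchy distribution

The page's earlier one-liner "$\gamma_S=|W|\gamma_U$" is not rigorous enough. The vector case needs to be spelled out: if each dimension is independently Cauchy distributed and the $j$-th score component is a linear combination, then the output is still Cauchy, with the scale weighted by element-wise absolute values.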

scalar stability

$$aX_1+bX_2\sim\mathrm{Cauchy}(a\mu_1+b\mu_2,\, |a|\gamma_1+|b|\gamma_2)$$

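Here $X_1\sim\mathrm{Cauchy}(\mu_1,\gamma_1)$ and $X_2\sim\mathrm{Cauchy}(\mu_2,\gamma_2)$ are independent. This is the key mathematical property that avoids Monte Carlo sampling.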
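A short derivation sketch of why this closure holds, using the standard characteristic function of the Cauchy distribution and assuming $X_1$ and $X_2$ are independent:

$$\varphi_{X_k}(t)=\exp\!\big(i\mu_k t-\gamma_k|t|\big),\qquad k=1,2$$

$$\varphi_{aX_1+bX_2}(t)=\varphi_{X_1}(at)\,\varphi_{X_2}(bt)=\exp\!\big(i(a\mu_1+b\mu_2)t-(|a|\gamma_1+|b|\gamma_2)|t|\big)$$

The right-hand side is again a Cauchy characteristic function, with location $a\mu_1+b\mu_2$ and scale $|a|\gamma_1+|b|\gamma_2$. The absolute values appear because $|at|=|a|\,|t|$, which is why a negative weight still adds to the scale instead of subtracting from it.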

vector score, clarified

$$S_j=\sum_i W_{ji}(U_i+\varepsilon_i)+b_j$$

$$\mu_{S_j}=\sum_i W_{ji}\mu_{U_i}+b_j$$

$$\gamma_{S_j}=\sum_i |W_{ji}|\big(\gamma_{U_i}+b_{noise,i}\big)$$

Here $|W_{ji}|$ is the element-wise absolute value; this is exactly what the old page needed to make clear.
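The location and scale formulas for the vector score can be sketched as a single propagation step. This assumes, as the scale formula suggests but the page does not state outright, that each $\varepsilon_i$ is an independent zero-location Cauchy with non-negative scale $b_{noise,i}$; `propagate_cauchy_linear` and the example numbers are hypothetical.

```python
def propagate_cauchy_linear(W, b, mu_u, gamma_u, b_noise):
    """Push independent Cauchy(mu_u[i], gamma_u[i] + b_noise[i]) inputs through S = W x + b.

    Returns (mu_s, gamma_s), one entry per output row j.
    """
    mu_s, gamma_s = [], []
    for row, bj in zip(W, b):
        # The location follows the ordinary affine map.
        mu_s.append(sum(w * m for w, m in zip(row, mu_u)) + bj)
        # The scale uses element-wise |W_ji|; the bias b_j does not affect it.
        gamma_s.append(sum(abs(w) * (g + n) for w, g, n in zip(row, gamma_u, b_noise)))
    return mu_s, gamma_s


# Hypothetical example with one output and a negative weight.
W, b = [[2.0, -1.0]], [0.5]
mu_u, gamma_u, b_noise = [1.0, 3.0], [0.5, 0.25], [0.5, 0.75]
mu_s, gamma_s = propagate_cauchy_linear(W, b, mu_u, gamma_u, b_noise)
print(mu_s, gamma_s)
```

In this example, dropping the absolute value would give a scale of 1.0 instead of 3.0, which is the error the clarified formula guards against.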

Theoretical selling points: interpretable uncertainty decomposition, analytic propagation, and a robust Cauchy likelihood are what make CausalEngine more "causal" than ordinary robust regression.

experiment snapshot

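Experiment snapshot: shown as paper claims for now

This page currently records only the results as stated in the TeX source; they have not yet been independently reproduced. Once the anonymous code runs end to end, this section should be upgraded to a verified result dashboard.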

13 synthetic comparison baselines

4 label-noise types

8 public regression datasets

MdAE: primary robustness metric

High-noise shuffle setting: MdAE reduction vs strongest baseline

Synthetic statistical: 84.0%

Synthetic causal: 55.7%

Interpretation: lower MdAE under severe label-noise corruption. This is a paper-claim visualization, not yet an independently reproduced WeHub benchmark.
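For readers unfamiliar with the metric, a minimal sketch of how such a number could be computed. It assumes MdAE means median absolute error and that "reduction" means the relative drop against the baseline's MdAE; the page does not spell out either definition, and the data below is invented, not from the paper.

```python
from statistics import median


def mdae(y_true, y_pred):
    # Median absolute error: the median of |y_true - y_pred|.
    return median(abs(t - p) for t, p in zip(y_true, y_pred))


def relative_reduction(baseline_mdae, model_mdae):
    # Percentage reduction of the model's MdAE relative to the baseline's.
    return 100.0 * (baseline_mdae - model_mdae) / baseline_mdae


# Invented toy data: the last label is a corrupted outlier.
y_true = [1.0, 2.0, 3.0, 4.0, 100.0]
y_pred = [1.1, 1.8, 3.3, 4.0, 5.0]
toy_mdae = mdae(y_true, y_pred)
print(round(toy_mdae, 4), round(relative_reduction(0.5, toy_mdae), 1))
```

Because the median ignores the single corrupted label, the toy MdAE stays near 0.2 even though one absolute error is 95.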

abstract

Abstract excerpt

The performance of standard regression models, which primarily learn statistical associations, is vulnerable to label noise. This paper proposes Causal Regression, a paradigm that shifts the focus toward learning invariant causal mechanisms. We introduce CausalEngine, a neural architecture that operationalizes this paradigm based on the Distribution-consistency Structural Causal Model (DiscoSCM). It first performs abduction to infer a distribution over latent cause, and subsequently applies a causal mechanism to make a prediction. The mathematical properties of the Cauchy distribution facilitate an analytical inference process. This design sidesteps the need for sampling-based approximations, thereby eliminating the high-variance gradients and computational overhead they introduce, leading to stable and efficient end-to-end training. This design also provides a structured form of interpretability by decomposing predictive uncertainty into two distinct sources: epistemic uncertainty, arising from incomplete knowledge of an individual, and aleatoric uncertainty, stemming from inherent environmental randomness. Our experiments demonstrate CausalEngine's significant robustness against label noise. Especially in high-noise regimes where strong baselines falter, our approach exhibits a significantly smaller drop in performance. This work suggests that shifting the modeling focus from statistical associations to causal structures is a promising direction for building AI systems that are more reliable and interpretable. Code is available at anonymous.4open.science/r/causal-regression-135C.

source map

Source status

Host: gongqian
Main: main_final_rebuttal.tex
SHA-256: 35d7747a77cb257112a0b1ae09c70cfb355a5478f1bee05b15b5bd1e90c98691
mtime: 2026-03-19T11:00:06
Code: anonymous.4open.science/r/causal-regression-135C

Public note: temporary noindex research-review page; not a final publication page.

files observed

TeX input files

file	lines	sha256 prefix
main_final_rebuttal.tex	107	35d7747a77cb…
math_commands.tex	508	90473c4d0542…
final_introduction.tex	127	feb6f1ef0090…
final_causalregression.tex	36	4f2e849fb0e2…
final_causalengine.tex	99	3a4cbe696abe…
final_experiments.tex	154	aadabd225782…
final_conclusion.tex	28	f73224a0bfe8…
final_appendix.tex	427	d45412189de3…
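A sketch of how one row of this table could be re-derived from a local copy of a file. It assumes the line count is the number of newline characters and that the prefix is the first 12 hex characters of the SHA-256 digest, which matches the width shown in the table; the actual tooling behind this page is not documented here, and `describe_tex_file` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def describe_tex_file(path, prefix_len=12):
    """Return the file name, newline count, and SHA-256 prefix for one file."""
    data = Path(path).read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    return {
        "file": Path(path).name,
        "lines": data.count(b"\n"),  # assumption: lines counted as newline characters
        "sha256_prefix": digest[:prefix_len],
    }


# Example usage (the path is hypothetical):
# print(describe_tex_file("main_final_rebuttal.tex"))
```

Comparing the full digest, not just the prefix, against the SHA-256 listed under the source status is the stronger check for the main file.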

next gates

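Next-step gates

Theory review: state explicitly that $P_\phi(U\mid Z)$ is a posterior approximation, not the exact posterior.
Math reinforcement: write vector Cauchy stability, the independence assumptions, and the element-wise absolute value of $W$ into the main text.
Experiment reproduction: rerun the high-noise benchmark from the anonymous/internal code, and separate paper claims from verified results.
Narrative upgrade: if reproduction succeeds, this page can be promoted from a temporary review surface to a formal research page.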
