Project 004 · robust prediction validation

Causal Regression

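Recast regression from "learning the surface correlation $X\to Y$" into "first abduce the latent individual cause $U$ from the evidence $X$, then generate the prediction through a stable mechanism $Y=f(U,\varepsilon)$".

This page is a temporary noindex review surface: it shows only the title, abstract, math spine, source map, and paper-claim results, and does not reproduce the full LaTeX project.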

DiscoSCM downstream validation · selected extraction only · no full LaTeX mirror · updated 2026-05-30 17:49 CST
one sentence

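One-sentence positioning

The central assumption of Causal Regression is that the observed features $X$ are not the true stable source of the outcome $Y$; both $X$ and $Y$ are projections of a deeper latent individual causal variable $U$ and environmental noise $\varepsilon$. Prediction should therefore not merely fit correlations, but follow the causal route contrasted with conventional regression here: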

conventional regression
$$\hat y=m_\theta(x)\approx\mathbb E[Y\mid X=x]$$

Learns $X\to Y$ directly. This works under strong IID conditions, but tends to mistake shortcuts and spurious correlations for regularities.

causal regression
$$U\sim g_\phi(X),\qquad Y=f_\theta(U,\varepsilon)$$

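First abduce the latent cause, then predict through a stable mechanism. Robustness comes from the mechanism level rather than from surface statistics.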

visual theory map

From "correlational prediction" to "causal mechanism": an illustrated map

Statistical regression

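Input features are treated as flat predictors; as long as the model drives down the error on the training distribution, it may learn brittle shortcuts.

Figure: nodes X (surface features), Y (label), and noise; the model learns a direct shortcut X → Y.
Risk: when label noise, distribution shift, or a confounder appears, the shortcut breaks.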

Causal regression

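The model first treats $X$ as evidence for inferring $U$, then feeds $U$ and $\varepsilon$ into the stable mechanism $f$.

Figure: X → P(U|X) (abduction, epistemic scale γ_U) → Y, with ε entering the mechanism; infer the cause, then apply the invariant mechanism f(U, ε).
Core idea: tie interpretability to the decomposition of uncertainty into $U$ and $\varepsilon$.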
math spine

Math spine: from DiscoSCM to CausalEngine

This part gathers all formulas on the page into a single chain, to avoid "nice pictures but unclear math".

1 · DiscoSCM premise
$$\langle U,\mathbf E,\mathbf V,\mathcal F\rangle,\qquad v_i\leftarrow f_i(pa_i,e_i;u)$$

$U$ denotes individual selection; $\mathbf E$ denotes environmental noise. Keeping the two separate is the source of the epistemic / aleatoric decomposition that follows.

2 · population valuation
$$P(Y^d(x)=y\mid e)=\sum_u P(y_x;u)P(u\mid e)$$

Abduction first yields $P(u\mid e)$, then comes valuation, and finally reduction to a population-level prediction.
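The abduction, valuation, reduction chain above can be sketched for a discrete latent $U$. This is only a toy illustration: the two individuals `u0` and `u1`, their probabilities, and the function name `population_valuation` are made up for the example and do not come from the paper.

```python
def population_valuation(p_y_given_u, p_u_given_e):
    """Reduction step: return sum_u P(y_x; u) * P(u | e) for a discrete latent U."""
    if abs(sum(p_u_given_e.values()) - 1.0) > 1e-9:
        raise ValueError("P(u | e) must sum to 1")
    return sum(p_y_given_u[u] * p_u for u, p_u in p_u_given_e.items())


# Abduction output: posterior over individuals given evidence e (hypothetical values).
posterior = {"u0": 0.75, "u1": 0.25}

# Valuation: probability of outcome y under intervention x for each individual
# (hypothetical values).
valuation = {"u0": 0.2, "u1": 0.6}

p_y = population_valuation(valuation, posterior)
print(p_y)  # approximately 0.3 = 0.75 * 0.2 + 0.25 * 0.6
```

In the regression setting the sum becomes an integral over a continuous $U$, which is where the closed-form Cauchy propagation replaces explicit enumeration.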

3 · regression analogue
$$P(U\mid Z)=\mathrm{Cauchy}(\mu_U(Z),\gamma_U(Z))$$

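In regression, $e$ is engineered as the observed feature evidence $X$ / representation $Z$, and the output is an approximate posterior over $U$.

Current rigorous wording: CausalEngine does not claim to solve for the true $P(U\mid X)$ exactly; it uses a trainable parametric approximation $P_\phi(U\mid Z)$ to implement the abduction idea of DiscoSCM. Nonlinear complexity is absorbed mainly by the Perception representation $Z$, while the mechanism layer stays linear in exchange for analytical propagation.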
causal engine

Four-stage architecture: Perception → Abduction → Action → Decision

Figure: operational pipeline, observed evidence → latent individual cause → invariant causal law → task prediction

  1. Perception: $Z=\mathrm{Perception}(X)$, extract the evidence representation.
  2. Abduction: $P(U\mid Z)=\mathrm{Cauchy}(\mu_U,\gamma_U)$, with $\mu_U=W_{loc}Z+b_{loc}$ and $\gamma_U=\mathrm{softplus}(W_{scale}Z+b_{scale})$.
  3. Action: $U'=U+\varepsilon$ with Cauchy noise $\varepsilon$, then $S=W_{action}U'+b_{action}$ (linear causal mechanism).
  4. Decision: $Y=\tau(S)$; for regression, $\tau(s)=s$.

Epistemic uncertainty: how unsure the model is about the individual cause $U$. Aleatoric uncertainty: irreducible environment / measurement noise.
Each node in the figure corresponds to one computational stage in the paper; all formulas are expanded below.
parameterized abduction
$$\mu_U(Z)=W_{loc}Z+b_{loc},\qquad \gamma_U(Z)=\mathrm{softplus}(W_{scale}Z+b_{scale})$$

$\gamma_U$ is the epistemic uncertainty: the model's uncertainty about "which kind of individual/cause this sample actually is".
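A minimal sketch of the parameterized abduction formula, written with plain Python lists so it runs without any dependencies. The helper names (`softplus`, `affine`, `abduct`) and all weight values are illustrative assumptions, not the paper's code.

```python
import math


def softplus(x):
    """Stable softplus log(1 + exp(x)); positive in exact arithmetic,
    though it can underflow to 0.0 for very negative x."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def affine(W, z, b):
    """Return W z + b for a list-of-rows matrix W and vectors z, b."""
    return [sum(w * zi for w, zi in zip(row, z)) + bj for row, bj in zip(W, b)]


def abduct(z, W_loc, b_loc, W_scale, b_scale):
    """Map a representation z to the Cauchy parameters (mu_U, gamma_U)."""
    mu_u = affine(W_loc, z, b_loc)
    gamma_u = [softplus(s) for s in affine(W_scale, z, b_scale)]
    return mu_u, gamma_u


# Hypothetical representation and weights for a 2-dimensional latent U.
z = [1.0, -2.0]
mu_u, gamma_u = abduct(
    z,
    W_loc=[[0.5, 0.0], [1.0, 1.0]], b_loc=[0.0, 0.5],
    W_scale=[[0.0, 0.0], [1.0, 0.0]], b_scale=[0.0, 0.0],
)
print(mu_u)     # [0.5, -0.5]
print(gamma_u)  # first entry is softplus(0) = log 2
```

The softplus keeps every scale positive without clipping, so the location and scale heads can share the same unconstrained linear form.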

robust regression loss
$$\mathcal L=\log(\pi\gamma_S)+\log\!\left(1+\left(\frac{y-\mu_S}{\gamma_S}\right)^2\right)$$

The Cauchy NLL grows only logarithmically in large errors, which naturally reduces the dominance of label outliers in the gradient.
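The loss above and its gradient with respect to $\mu_S$ can be written out directly. The sketch below is a scalar illustration with hypothetical function names; it shows that the gradient magnitude shrinks for very large residuals, whereas the squared-error gradient $-2(y-\mu_S)$ keeps growing linearly.

```python
import math


def cauchy_nll(y, mu_s, gamma_s):
    """Negative log-likelihood of y under Cauchy(mu_s, gamma_s)."""
    if gamma_s <= 0.0:
        raise ValueError("gamma_s must be positive")
    r = (y - mu_s) / gamma_s
    return math.log(math.pi * gamma_s) + math.log1p(r * r)


def cauchy_nll_grad_mu(y, mu_s, gamma_s):
    """Derivative of the loss w.r.t. mu_s: -2 r / (gamma_s * (1 + r^2))."""
    r = (y - mu_s) / gamma_s
    return -2.0 * r / (gamma_s * (1.0 + r * r))


# Compare a moderate residual with an outlier-sized one (gamma_s = 1).
for residual in (1.0, 10.0, 1000.0):
    loss = cauchy_nll(residual, 0.0, 1.0)
    grad = cauchy_nll_grad_mu(residual, 0.0, 1.0)
    squared_error_grad = -2.0 * residual
    print(residual, round(loss, 4), round(grad, 6), squared_error_grad)
```

With $\gamma_S=1$ the gradient magnitude peaks at a residual of 1 and decays roughly like $2/|y-\mu_S|$ afterwards, so a single corrupted label cannot dominate a batch update.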

why cauchy + linear

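Core tractability: linear stability of the Cauchy distribution

The page's one-line statement "$\gamma_S=|W|\gamma_U$" is not rigorous enough. The vector case needs to be spelled out: if each dimension is independently Cauchy distributed and the $j$-th dimension of the score is a linear combination, then the output is still Cauchy, with the scale weighted by element-wise absolute values.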

scalar stability
$$aX_1+bX_2\sim\mathrm{Cauchy}(a\mu_1+b\mu_2,\, |a|\gamma_1+|b|\gamma_2)$$

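This holds for independent $X_k\sim\mathrm{Cauchy}(\mu_k,\gamma_k)$, and it is the key mathematical property that avoids Monte Carlo sampling.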
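The stability rule can be sanity-checked numerically. The sketch below uses sampling only as a check on the closed form, not as part of the method: it draws independent Cauchy variables by inverse-CDF sampling and compares the empirical median and half interquartile range (which equal location and scale for a Cauchy) with the predicted parameters. The coefficients, seed, and sample size are arbitrary choices.

```python
import math
import random


def sample_cauchy(rng, mu, gamma):
    """Draw one Cauchy(mu, gamma) sample by inverting the CDF."""
    return mu + gamma * math.tan(math.pi * (rng.random() - 0.5))


def empirical_location_scale(samples):
    """Estimate (location, scale) as (median, half the interquartile range)."""
    s = sorted(samples)
    n = len(s)
    return s[n // 2], 0.5 * (s[(3 * n) // 4] - s[n // 4])


a, b = 2.0, -3.0
mu1, gamma1 = 1.0, 0.5
mu2, gamma2 = -1.0, 1.0

# Closed-form prediction from the stability rule.
predicted_loc = a * mu1 + b * mu2
predicted_scale = abs(a) * gamma1 + abs(b) * gamma2

rng = random.Random(0)
draws = [
    a * sample_cauchy(rng, mu1, gamma1) + b * sample_cauchy(rng, mu2, gamma2)
    for _ in range(200_000)
]
empirical_loc, empirical_scale = empirical_location_scale(draws)

print("predicted:", predicted_loc, predicted_scale)
print("empirical:", empirical_loc, empirical_scale)
```

Note the negative coefficient `b`: the predicted scale uses `abs(b)`, which is why the scale adds up to 4.0 rather than cancelling.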

vector score, clarified
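$$S_j=\sum_i W_{ji}(U_i+\varepsilon_i)+b_j$$
$$\mu_{S_j}=\sum_i W_{ji}\mu_{U_i}+b_j$$
$$\gamma_{S_j}=\sum_i |W_{ji}|\big(\gamma_{U_i}+b_{noise,i}\big)$$

Here $|W_{ji}|$ is the element-wise absolute value; this is exactly what the old page needed to make clear. As written, the formulas treat each $\varepsilon_i$ as zero-centered Cauchy noise with scale $b_{noise,i}\ge 0$, independent of $U_i$, so the noise contributes to the scale but not to the location.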
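The vector formulas translate into a few lines of code. This is a sketch under the stated independence assumptions, with hypothetical names and values; it also shows what goes wrong if the absolute value is dropped.

```python
def propagate(W, b, mu_u, gamma_u, b_noise):
    """Closed-form Cauchy parameters of S = W (U + eps) + b, one pair per output j.

    Assumes independent Cauchy U_i and zero-centered Cauchy noise with
    non-negative scale b_noise[i].
    """
    if any(n < 0.0 for n in b_noise):
        raise ValueError("noise scales must be non-negative")
    mu_s = [sum(w * m for w, m in zip(row, mu_u)) + bj for row, bj in zip(W, b)]
    gamma_s = [
        sum(abs(w) * (g + n) for w, g, n in zip(row, gamma_u, b_noise))
        for row in W
    ]
    return mu_s, gamma_s


# Hypothetical 2x2 mechanism with one negative weight.
W = [[1.0, -2.0], [0.5, 0.0]]
b = [0.5, -1.0]
mu_u, gamma_u = [2.0, 1.0], [0.5, 0.25]
b_noise = [0.5, 0.25]

mu_s, gamma_s = propagate(W, b, mu_u, gamma_u, b_noise)
print(mu_s)     # [0.5, 0.0]
print(gamma_s)  # [2.0, 0.5]

# Without the absolute value, the first scale would cancel to zero:
signed = sum(w * (g + n) for w, g, n in zip(W[0], gamma_u, b_noise))
print(signed)   # 0.0, which is not a valid Cauchy scale
```

These are marginal parameters for each $S_j$; outputs that share the same $U_i$ are still dependent on each other, which the per-dimension scale does not express.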

Figure: closed-form uncertainty propagation (no sampling estimator needed).

  1. U posterior: $\mathrm{Cauchy}(\mu_U,\gamma_U)$. Epistemic: incomplete knowledge about the individual cause.
  2. Add $\varepsilon$, apply $W$: $U'=U+\varepsilon$, $S=WU'+b$. Aleatoric: $b_{noise}$ enters the scale.
  3. S distribution: $\mathrm{Cauchy}(\mu_S,\gamma_S)$.
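Theoretical selling points: an interpretable uncertainty decomposition, analytical propagation, and a robust Cauchy likelihood are what make CausalEngine more "causal" than ordinary robust regression.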
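To see how the four stages fit together, here is a scalar forward pass that chains Perception, Abduction, Action, and Decision. It is a sketch under assumptions, not the paper's architecture: the `tanh` feature map standing in for Perception, the parameter names, and the split of the output scale into an "epistemic" and an "aleatoric" part are all choices made for this illustration.

```python
import math


def softplus(x):
    """Stable softplus log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def perception(x, w_feat):
    """Stage 1 (hypothetical feature map): Z = tanh(w_feat . x)."""
    return math.tanh(sum(w * xi for w, xi in zip(w_feat, x)))


def abduction(z, p):
    """Stage 2: location and scale of the Cauchy approximation of P(U | Z)."""
    return p["w_loc"] * z + p["b_loc"], softplus(p["w_scale"] * z + p["b_scale"])


def action(mu_u, gamma_u, p):
    """Stage 3: S = w_action * (U + eps) + b_action, propagated in closed form."""
    w = p["w_action"]
    return {
        "mu": w * mu_u + p["b_action"],
        "gamma": abs(w) * (gamma_u + p["noise_scale"]),
        "epistemic": abs(w) * gamma_u,            # share of the scale due to U
        "aleatoric": abs(w) * p["noise_scale"],   # share of the scale due to eps
    }


def forward(x, p):
    """Stage 4 for regression is tau(s) = s, so Y inherits the parameters of S."""
    z = perception(x, p["w_feat"])
    mu_u, gamma_u = abduction(z, p)
    return action(mu_u, gamma_u, p)


# Hypothetical parameters; in practice these would be learned.
params = {
    "w_feat": [0.3, -0.7],
    "w_loc": 1.5, "b_loc": 1.0,
    "w_scale": 0.8, "b_scale": 0.0,
    "w_action": -2.0, "b_action": 0.5,
    "noise_scale": 0.25,
}
out = forward([0.0, 0.0], params)
print(out)
```

Because every stage after Perception is an affine map on Cauchy parameters, the prediction and both uncertainty shares come out of a single deterministic pass.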
experiment snapshot

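Experiment snapshot: shown for now as paper claims

This page currently records only the results as stated in the TeX source; they have not been independently reproduced. Suggested follow-up: once the anonymous code runs end to end, upgrade this section into a verified-result dashboard.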

13 synthetic comparison baselines
4 label-noise types
8 public regression datasets
MdAE: primary robustness metric

High-noise shuffle setting: MdAE reduction vs strongest baseline

Synthetic statistical: 84.0%
Synthetic causal: 55.7%

Interpretation: lower MdAE under severe label-noise corruption. This is a paper-claim visualization, not yet an independently reproduced WeHub benchmark.
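For readers who want to recompute the headline metric during reproduction, here is a small sketch of median absolute error (MdAE) and a relative reduction against a baseline. The reduction formula `100 * (baseline - candidate) / baseline` is an assumption about how the percentages above are defined, and the toy predictions are made up.

```python
import statistics


def mdae(y_true, y_pred):
    """Median absolute error between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return statistics.median(abs(t - p) for t, p in zip(y_true, y_pred))


def relative_reduction(baseline_error, candidate_error):
    """Percentage by which the candidate error is below the baseline error."""
    return 100.0 * (baseline_error - candidate_error) / baseline_error


# Toy data (hypothetical): the candidate is accurate on most points
# but badly wrong on one, which the median largely ignores.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
baseline_pred = [2.0, 3.0, 4.0, 5.0, 6.0]
candidate_pred = [1.5, 2.0, 2.0, 4.5, 9.0]

baseline_mdae = mdae(y_true, baseline_pred)
candidate_mdae = mdae(y_true, candidate_pred)
reduction = relative_reduction(baseline_mdae, candidate_mdae)
print(baseline_mdae, candidate_mdae, reduction)  # 1.0 0.5 50.0
```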

abstract

Abstract excerpt

The performance of standard regression models, which primarily learn statistical associations, is vulnerable to label noise. This paper proposes Causal Regression, a paradigm that shifts the focus toward learning invariant causal mechanisms. We introduce CausalEngine, a neural architecture that operationalizes this paradigm based on the Distribution-consistency Structural Causal Model (DiscoSCM). It first performs abduction to infer a distribution over latent cause, and subsequently applies a causal mechanism to make a prediction. The mathematical properties of the Cauchy distribution facilitate an analytical inference process. This design sidesteps the need for sampling-based approximations, thereby eliminating the high-variance gradients and computational overhead they introduce, leading to stable and efficient end-to-end training. This design also provides a structured form of interpretability by decomposing predictive uncertainty into two distinct sources: epistemic uncertainty, arising from incomplete knowledge of an individual, and aleatoric uncertainty, stemming from inherent environmental randomness. Our experiments demonstrate CausalEngine's significant robustness against label noise. Especially in high-noise regimes where strong baselines falter, our approach exhibits a significantly smaller drop in performance. This work suggests that shifting the modeling focus from statistical associations to causal structures is a promising direction for building AI systems that are more reliable and interpretable. Code is available at anonymous.4open.science/r/causal-regression-135C.

source map

Source status

Public note: temporary noindex research-review page; not a final publication page.

files observed

TeX input files

file                         lines   sha256 prefix
main_final_rebuttal.tex      107     35d7747a77cb…
math_commands.tex            508     90473c4d0542…
final_introduction.tex       127     feb6f1ef0090…
final_causalregression.tex   36      4f2e849fb0e2…
final_causalengine.tex       99      3a4cbe696abe…
final_experiments.tex        154     aadabd225782…
final_conclusion.tex         28      f73224a0bfe8…
final_appendix.tex           427     d45412189de3…
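A file inventory like the one above (line count plus sha256 prefix) can be regenerated with the standard library when checking that the reviewed sources are unchanged. This is a sketch: the 12-character prefix length and the line-counting convention are assumptions, since the page does not say how its values were produced.

```python
import hashlib
from pathlib import Path


def describe_bytes(data, prefix_len=12):
    """Return (line count, first prefix_len hex chars of the sha256 digest)."""
    lines = data.count(b"\n")
    if data and not data.endswith(b"\n"):
        lines += 1  # count a final line that lacks a trailing newline
    return lines, hashlib.sha256(data).hexdigest()[:prefix_len]


def describe_file(path, prefix_len=12):
    """Same as describe_bytes, reading the file at `path` in binary mode."""
    return describe_bytes(Path(path).read_bytes(), prefix_len)


# Self-contained demonstration on in-memory content.
print(describe_bytes(b"\\section{Intro}\nText.\n"))
print(describe_bytes(b""))  # (0, 'e3b0c44298fc')
```

Calling `describe_file("final_conclusion.tex")` in the project directory would produce one row to compare against the inventory.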
next gates

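Next gates

  1. Theory review: make explicit that $P_\phi(U\mid Z)$ is a posterior approximation, not the exact posterior.
  2. Math strengthening: write vector Cauchy stability, the independence assumptions, and the element-wise absolute value of $W$ into the main text.
  3. Experiment reproduction: rerun the high-noise benchmark from the anonymous/internal code, and separate paper claims from verified results.
  4. Narrative upgrade: if the reproduction passes, this page can be promoted from a temporary review surface to a formal research page.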