Uplift Basics


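Uplift models predict the incremental response to a treatment, for example the increase in a user's propensity to convert after being shown an ad. The same user cannot be both treated and held out as a control, but statistical and machine-learning methods let us estimate how similar users would respond. Each user receives an estimated lift score, which guides differentiated strategies across user segments. (The goal of an uplift model is to estimate the CATE, the conditional average treatment effect.)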

Response model vs. Uplift model


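| User | Conversion rate | Conversion rate after ad | Uplift |
|------|-----------------|--------------------------|--------|
| A | 1% | 10% | 9% |
| B | 10% | 10.1% | 0.1% |

  • Response model: **looks at the conversion probability**. Ranking on the response value, it tends to show the ad to B.
  • Uplift model: **looks at the increase in conversion probability**. It models delta_Response (the uplift value) and tends to show the ad to A.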
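The contrast can be checked with a few lines of arithmetic. This is only an illustration of the two ranking rules: the dictionary layout and function names are made up for the example, and user B's post-ad rate is taken to be baseline plus uplift (10% + 0.1%), which is an assumption.

```python
# Conversion rates per user: without the ad ("base") and with the ad ("treated").
# B's treated rate is assumed to be 10% + 0.1% = 10.1% for this illustration.
users = {
    "A": {"base": 0.01, "treated": 0.10},
    "B": {"base": 0.10, "treated": 0.101},
}

def response_score(user):
    # A response model ranks by the conversion probability itself.
    return user["treated"]

def uplift_score(user):
    # An uplift model ranks by the change caused by the treatment.
    return user["treated"] - user["base"]

by_response = max(users, key=lambda name: response_score(users[name]))
by_uplift = max(users, key=lambda name: uplift_score(users[name]))
print("response model targets:", by_response)
print("uplift model targets:", by_uplift)
```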

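Note: an uplift model is a method for estimating the ITE (individual treatment effect); it does not directly estimate the ATE (average treatment effect).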


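This section introduces meta-learners. They model the outcomes of the control and treatment groups with any base learner (linear models, trees, or deep networks) and use the fitted models to predict the ITE, CATE, and ATE.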


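S-learner: the treatment T (written W in the steps below) enters a single model as a 0/1 categorical feature, and the model is used to compute the uplift for given covariates X under the different values of T.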






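  • Step 1: train a prediction model on the covariates X and the treatment indicator W.
  • Step 2: score each sample once with the treatment and once without it; the difference is the uplift.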
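The two steps above can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: the base learner is ordinary least squares via NumPy, the helper names `fit_linear`, `featurize`, and `s_learner_uplift` are made up for this example, and the synthetic data with a known true uplift is an assumption chosen so the result can be checked.

```python
import numpy as np

def fit_linear(features, target):
    """Least-squares fit with an intercept; returns a prediction function."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return lambda f: np.column_stack([np.ones(len(f)), f]) @ coef

def featurize(X, w):
    # Covariates, treatment indicator, and their interactions.
    return np.column_stack([X, w, X * w[:, None]])

def s_learner_uplift(X, w, y):
    # Step 1: a single model trained on covariates X and treatment W.
    model = fit_linear(featurize(X, w), y)
    # Step 2: score every sample with W=1 and with W=0; the gap is the uplift.
    treated = model(featurize(X, np.ones(len(X))))
    control = model(featurize(X, np.zeros(len(X))))
    return treated - control

# Synthetic data with a known uplift of 1 + 0.5 * x0 (assumed for this sketch).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
w = rng.integers(0, 2, size=n).astype(float)
true_uplift = 1.0 + 0.5 * X[:, 0]
y = X[:, 1] + w * true_uplift + rng.normal(scale=0.1, size=n)

uplift = s_learner_uplift(X, w, y)
print("mean absolute error:", np.abs(uplift - true_uplift).mean())
```

With a linear base learner the interaction terms matter: without them the predicted uplift would be the same constant for every sample. Tree-based learners can pick up such interactions on their own.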




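T-learner: the treatment group and the control group are modeled separately, and the uplift for given covariates X is computed from the two models.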





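  • Step 1: train separate prediction models on the treatment-group data and the control-group data.
  • Step 2: score each sample with both models; the difference between the two scores is the uplift.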
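A sketch of these two steps, under the same caveats as any toy example: the least-squares base learner, the helper names `fit_linear` and `t_learner_uplift`, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def fit_linear(features, target):
    """Least-squares fit with an intercept; returns a prediction function."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return lambda f: np.column_stack([np.ones(len(f)), f]) @ coef

def t_learner_uplift(X, w, y):
    # Step 1: one outcome model per group.
    mu1 = fit_linear(X[w == 1], y[w == 1])
    mu0 = fit_linear(X[w == 0], y[w == 0])
    # Step 2: score every sample with both models and take the difference.
    return mu1(X) - mu0(X)

# Synthetic data with a known uplift of 1 + 0.5 * x0 (assumed for this sketch).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
w = rng.integers(0, 2, size=n)
true_uplift = 1.0 + 0.5 * X[:, 0]
y = X[:, 1] + w * true_uplift + rng.normal(scale=0.1, size=n)

uplift = t_learner_uplift(X, w, y)
print("mean absolute error:", np.abs(uplift - true_uplift).mean())
```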




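X-learner: suited to settings where the treatment and control sample sizes differ greatly.


  1. Fit separate outcome models for the treatment group and the control group.

  2. Cross-predict: D_i is the difference between sample i's observed outcome and the outcome predicted for it by the other group's model.

  3. fit(D^1 ~ X^1) to train the treatment-group model τ_1(x), and fit(D^0 ~ X^0) to train the control-group model τ_0(x).

  4. Combine the two estimates into the CATE with a weighted average; the weights balance the difference in sample size between the two groups.

g(x) is the prior probability that sample x is assigned to the treatment group. It can be estimated from the covariates or simplified to the treatment-group share.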
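Written out, the steps above correspond to the following quantities. The formulas follow the usual X-learner formulation (Künzel et al.); the symbols $\hat\mu_0$ and $\hat\mu_1$ for the control and treatment outcome models are notation introduced here, since the original formula images are not available.

$$ D_i^1 = Y_i^1 - \hat\mu_0(X_i^1), \qquad D_i^0 = \hat\mu_1(X_i^0) - Y_i^0 $$

$$ \hat\tau(x) = g(x)\,\hat\tau_0(x) + \big(1 - g(x)\big)\,\hat\tau_1(x) $$

When the treatment group is small, g(x) is small and the estimate leans on $\hat\tau_1$, which is built on the control-group outcome model trained on the larger sample.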


  • Errors accumulate across the multiple models.
  • With multiple treatments, the number of models grows.



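  • Step 1: train separate prediction models on the treatment-group data and the control-group data.
  • Step 2: build a dataset of approximate uplift values: predict the control-group data with the treatment-group model and the treatment-group data with the control-group model, then take the difference from the observed Y in each case.
  • Step 3: train prediction models with these differences as targets to fit the uplift.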
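The three steps, plus the final weighting by the treatment share, can be sketched as follows. This is a hedged toy example: the least-squares base learner, the helper names `fit_linear` and `x_learner_uplift`, the constant weight `g`, and the imbalanced synthetic data (about 10% treated) are all assumptions made for illustration.

```python
import numpy as np

def fit_linear(features, target):
    """Least-squares fit with an intercept; returns a prediction function."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return lambda f: np.column_stack([np.ones(len(f)), f]) @ coef

def x_learner_uplift(X, w, y, g=None):
    X1, y1 = X[w == 1], y[w == 1]
    X0, y0 = X[w == 0], y[w == 0]
    # Step 1: separate outcome models for treatment and control.
    mu1, mu0 = fit_linear(X1, y1), fit_linear(X0, y0)
    # Step 2: cross-predict to get approximate per-sample uplift values.
    d1 = y1 - mu0(X1)   # treated: observed minus predicted control outcome
    d0 = mu1(X0) - y0   # control: predicted treated outcome minus observed
    # Step 3: fit uplift models on the approximate values.
    tau1, tau0 = fit_linear(X1, d1), fit_linear(X0, d0)
    # Weighting: g defaults to the treatment-group share (a simplification).
    if g is None:
        g = w.mean()
    return g * tau0(X) + (1 - g) * tau1(X)

# Imbalanced synthetic data with a known uplift (assumed for this sketch).
rng = np.random.default_rng(0)
n = 3000
X = rng.normal(size=(n, 2))
w = (rng.random(n) < 0.1).astype(int)
true_uplift = 1.0 + 0.5 * X[:, 0]
y = X[:, 1] + w * true_uplift + rng.normal(scale=0.1, size=n)

uplift = x_learner_uplift(X, w, y)
print("mean absolute error:", np.abs(uplift - true_uplift).mean())
```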




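R-learner: the numbered red boxes below refer to terms of the annotated loss function in the original figure.

  1. Red box (1)

    • Y is the observed outcome (for example a CTR label).
    • m(x) is an ordinary machine-learning model fitted to the label on the combined control and treatment data; it describes the expected outcome over all the data. Red box 1 is the label to be fitted.
  2. Red box (2)

    • $$W_i \in \{0, 1\}$$ indicates whether sample i is in the treatment group (1) or the control group (0).

    • e(x) is the propensity score, commonly used with data from non-randomized experiments; in a randomized experiment with an equal traffic split, e(x) = 0.5 is sufficient.

    • $$\tau(x)$$ is the uplift predicted by the model; red box 2 is the model's output.

  3. Red box (3): the regularization term.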
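For reference, the R-learner objective of Nie and Wager can be written as below. The original figure is not reproduced here, so the mapping of the three boxes to the terms marked (1), (2), and (3) is an assumption based on the descriptions above.

$$ \hat\tau(\cdot) = \arg\min_{\tau} \left\{ \frac{1}{n} \sum_{i=1}^{n} \Big[ \underbrace{\big(Y_i - m(X_i)\big)}_{(1)} - \underbrace{\big(W_i - e(X_i)\big)\,\tau(X_i)}_{(2)} \Big]^2 + \underbrace{\Lambda_n\big(\tau(\cdot)\big)}_{(3)} \right\} $$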


  • Model accuracy depends on the accuracy of m(x) and e(x).
  • With multiple treatments, the number of models grows.



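Step 1: using cross-validation (cross-fitting), predict one fold at a time with models trained on the remaining folds, to obtain the predicted outcome $$m(x)$$ and the propensity score e(x) for the whole dataset.

$$ m(X_i)=E(Y \mid X_i) $$

$$ e(X_i)=P(W=1 \mid X_i) $$

Step 2: minimize the loss function using the estimates from the other folds to estimate the uplift. The superscript -q(i) denotes estimates made without the fold that contains sample i.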
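Both steps can be sketched with cross-fitting as follows. Several simplifications are assumed: the base learner is least squares, e(x) is fitted with a linear probability model and clipped, τ(x) is restricted to a linear function of x, and the regularization term is omitted. The helper names `fit_linear` and `r_learner_uplift` and the synthetic data are made up for this example.

```python
import numpy as np

def fit_linear(features, target):
    """Least-squares fit with an intercept; returns a prediction function."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return lambda f: np.column_stack([np.ones(len(f)), f]) @ coef

def r_learner_uplift(X, w, y, n_folds=5, seed=0):
    n = len(y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    m_hat, e_hat = np.empty(n), np.empty(n)
    # Step 1: out-of-fold predictions of m(x) = E[Y|X] and e(x) = P(W=1|X).
    for k in range(n_folds):
        train, test = folds != k, folds == k
        m_hat[test] = fit_linear(X[train], y[train])(X[test])
        e_hat[test] = fit_linear(X[train], w[train])(X[test])
    e_hat = np.clip(e_hat, 0.01, 0.99)
    # Step 2: minimise sum[(Y - m) - (W - e) * tau(X)]^2 with tau(x) linear,
    # which reduces to least squares on residual-weighted features.
    y_res, w_res = y - m_hat, w - e_hat
    basis = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(basis * w_res[:, None], y_res, rcond=None)
    return basis @ beta

# Randomised synthetic data with a known uplift (assumed for this sketch).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
w = rng.integers(0, 2, size=n).astype(float)
true_uplift = 1.0 + 0.5 * X[:, 0]
y = X[:, 1] + w * true_uplift + rng.normal(scale=0.1, size=n)

uplift = r_learner_uplift(X, w, y)
print("mean absolute error:", np.abs(uplift - true_uplift).mean())
```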



Cumulative Uplift Curve




  • Treatment has a cost.

  • Sleeping dogs (users who respond negatively to the treatment) get treated.

AUUC (Area Under Uplift Curve)


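  • Sort the test-set samples by estimated uplift score from high to low and take the top 10%, 20%, and so on.
  • Compute G(top φ) for each cutoff:

Plot the sample ranking on the horizontal axis and G(top φ) on the vertical axis to obtain the uplift curve. The area between this curve and the random line is the AUUC, the metric used to evaluate model performance.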
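A small sketch of the procedure follows. The formula for G(top φ) did not survive in the text, so the code uses one common definition as an assumption: the difference between treatment and control conversion rates among the top φ samples, multiplied by the number of samples in that top φ. The function names and the eight-sample dataset are made up for illustration.

```python
import numpy as np

def uplift_curve(score, w, y, n_bins=10):
    # Rank samples by predicted uplift, highest first.
    order = np.argsort(-score)
    w, y = w[order], y[order]
    fractions = np.arange(1, n_bins + 1) / n_bins
    gains = []
    for phi in fractions:
        k = int(round(phi * len(y)))
        wt, yt = w[:k], y[:k]
        n_t = wt.sum()
        n_c = k - n_t
        rate_t = yt[wt == 1].sum() / n_t if n_t else 0.0
        rate_c = yt[wt == 0].sum() / n_c if n_c else 0.0
        # Assumed definition: rate difference scaled by the top-phi size.
        gains.append((rate_t - rate_c) * k)
    return np.concatenate([[0.0], fractions]), np.concatenate([[0.0], gains])

def auuc_over_random(fractions, gains):
    # Area between the curve and the straight line to its end point.
    diff = gains - fractions * gains[-1]
    return float(np.sum((diff[1:] + diff[:-1]) / 2 * np.diff(fractions)))

# Toy test set: the four highest-scored users respond only when treated.
score = np.array([0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1])
w = np.array([1, 0, 1, 0, 1, 0, 1, 0])
y = np.array([1, 0, 1, 0, 0, 0, 0, 0])

area_good = auuc_over_random(*uplift_curve(score, w, y, n_bins=4))
area_bad = auuc_over_random(*uplift_curve(-score, w, y, n_bins=4))
print("good ranking:", area_good, "reversed ranking:", area_bad)
```

A ranking that puts the persuadable users first yields a positive area, while the reversed ranking falls below the random line.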

Qini curve

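The Qini curve is similar to the uplift curve but corrects for unequal treatment and control sample sizes: the control group is rescaled to the size of the treatment group, and the cumulative result is plotted as the Qini curve.

The idea is the same as for the Cumulative Uplift Curve, except that the vertical axis can be either of the following:

  • The actual number of conversions (this is what the original figure shows).

  • The share of converted users among all users, which amounts to a normalization.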
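The rescaling can be sketched as follows, assuming the common form Q(φ) = Y_T(φ) − Y_C(φ) · N_T(φ) / N_C(φ), where Y and N are the cumulative conversions and sample counts among the top φ samples. The function name `qini_curve`, the `normalize` flag (which divides by the total number of samples, one possible reading of the second option above), and the nine-sample dataset are made up for illustration.

```python
import numpy as np

def qini_curve(score, w, y, n_bins=10, normalize=False):
    # Rank samples by predicted uplift, highest first.
    order = np.argsort(-score)
    w, y = w[order], y[order]
    fractions = np.arange(1, n_bins + 1) / n_bins
    values = []
    for phi in fractions:
        k = int(round(phi * len(y)))
        wt, yt = w[:k], y[:k]
        n_t = wt.sum()
        n_c = k - n_t
        conv_t = yt[wt == 1].sum()
        conv_c = yt[wt == 0].sum()
        # Rescale control conversions to the treatment-group size.
        scaled_c = conv_c * n_t / n_c if n_c else 0.0
        values.append(conv_t - scaled_c)
    values = np.concatenate([[0.0], values])
    if normalize:
        values = values / len(y)
    return np.concatenate([[0.0], fractions]), values

# Toy test set with twice as many treated as control samples.
score = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1])
w = np.array([1, 1, 0, 1, 1, 0, 1, 1, 0])
y = np.array([1, 1, 0, 0, 1, 1, 0, 0, 0])

fractions, qini = qini_curve(score, w, y, n_bins=3)
_, qini_share = qini_curve(score, w, y, n_bins=3, normalize=True)
print(qini, qini_share)
```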

Further reading: Causal Inference and Uplift Modeling: A review of the literature





Causal ML Package



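**Causal ML** is a Python package that provides a suite of uplift modeling and causal inference methods using machine learning algorithms based on recent research.

  • Campaign targeting optimization: an important lever for increasing ROI in an advertising campaign is to target the ad to the set of customers who will respond favorably on a given KPI, such as engagement or sales. CATE identifies these customers by estimating the effect of ad exposure on the KPI at the individual level, from A/B experiments or historical observational data.
  • Personalized engagement: a company has multiple options for interacting with its customers, such as different product choices in up-sell or different messaging channels for communications. CATE can be used to estimate the heterogeneous treatment effect for each combination of customer and treatment option, as the basis for an optimal personalized recommendation system.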

The package currently supports the following methods

  • Tree-based algorithms
    • Uplift tree/random forests on KL divergence, Euclidean Distance, and Chi-Square
    • Uplift tree/random forests on Contextual Treatment Selection
    • Causal Tree - Work-in-progress
  • Meta-learner algorithms
    • S-learner
    • T-learner
    • X-learner
    • R-learner
    • Doubly Robust (DR) learner
    • TMLE learner
  • Instrumental variables algorithms
    • 2-Stage Least Squares (2SLS)
    • Doubly Robust (DR) IV
  • Neural-network-based algorithms
    • CEVAE
    • DragonNet - with causalml[tf] installation


  1. Radcliffe, Nicholas J., and Patrick D. Surry. "Real-world uplift modelling with significance-based uplift trees." White Paper TR-2011-1, Stochastic Solutions (2011): 1-33.
  2. Zhao, Yan, Xiao Fang, and David Simchi-Levi. "Uplift modeling with multiple treatments and general response types." Proceedings of the 2017 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, 2017.
  3. Athey, Susan, and Guido Imbens. "Recursive partitioning for heterogeneous causal effects." Proceedings of the National Academy of Sciences 113.27 (2016): 7353-7360.
  4. Künzel, Sören R., et al. "Metalearners for estimating heterogeneous treatment effects using machine learning." Proceedings of the National Academy of Sciences 116.10 (2019): 4156-4165.
  5. Nie, Xinkun, and Stefan Wager. "Quasi-oracle estimation of heterogeneous treatment effects." arXiv preprint arXiv:1712.04912 (2017).
  6. Bang, Heejung, and James M. Robins. "Doubly robust estimation in missing data and causal inference models." Biometrics 61.4 (2005): 962-973.
  7. Van Der Laan, Mark J., and Daniel Rubin. "Targeted maximum likelihood learning." The international journal of biostatistics 2.1 (2006).
  8. Kennedy, Edward H. "Optimal doubly robust estimation of heterogeneous causal effects." arXiv preprint arXiv:2004.14497 (2020).
  9. Louizos, Christos, et al. "Causal effect inference with deep latent-variable models." arXiv preprint arXiv:1705.08821 (2017).
  10. Shi, Claudia, David M. Blei, and Victor Veitch. "Adapting neural networks for the estimation of treatment effects." 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), 2019.

Paper Reading

Uplift Modelling/Causal Tree

  1. Nicholas J. Radcliffe and Patrick D. Surry. Real-world uplift modelling with significance-based uplift trees. White Paper TR-2011-1, Stochastic Solutions, 2011. [Paper link]
  2. Rzepakowski, P. and Jaroszewicz, S., 2012. Decision trees for uplift modeling with single and multiple treatments. Knowledge and Information Systems, 32(2), pp.303-327. [Paper link]
  3. Yan Zhao, Xiao Fang, and David Simchi-Levi. Uplift modeling with multiple treatments and general response types. Proceedings of the 2017 SIAM International Conference on Data Mining, SIAM, 2017. [Paper link] [GitHub link]
  4. Athey, S., and Imbens, G. W. 2015. Machine learning methods for estimating heterogeneous causal effects. stat 1050(5) [Paper link]
  5. Athey, S., and Imbens, G. 2016. Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences. [Paper link] [GitHub link]
  6. C. Tran and E. Zheleva, "Learning triggers for heterogeneous treatment effects," in Proceedings of the AAAI Conference on Artificial Intelligence, 2019 [Paper link] [GitHub link]

Forest Based Estimators

  1. Wager, S. & Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association.
  2. M. Oprescu, V. Syrgkanis and Z. S. Wu. Orthogonal Random Forest for Causal Inference. Proceedings of the 36th International Conference on Machine Learning (ICML), 2019 [Paper link] [GitHub link]

Double Machine Learning

  1. V. Chernozhukov, D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, and W. Newey. Double Machine Learning for Treatment and Causal Parameters. ArXiv e-prints [Paper link] [GitHub link]
  2. V. Chernozhukov, M. Goldman, V. Semenova, and M. Taddy. Orthogonal Machine Learning for Demand Estimation: High Dimensional Causal Inference in Dynamic Panels. ArXiv e-prints, December 2017.
  3. V. Chernozhukov, D. Nekipelov, V. Semenova, and V. Syrgkanis. Two-Stage Estimation with a High-Dimensional Second Stage. 2018.
  4. X. Nie and S. Wager. Quasi-Oracle Estimation of Heterogeneous Treatment Effects. arXiv preprint arXiv:1712.04912, 2017. [Paper link]
  5. D. Foster and V. Syrgkanis. Orthogonal Statistical Learning. arXiv preprint arXiv:1901.09036, 2019 [Paper link]

Meta Learner

  1. C. Manahan, 2005. A proportional hazards approach to campaign list selection. In SAS User Group International (SUGI) 30 Proceedings.
  2. Green DP, Kern HL (2012) Modeling heterogeneous treatment effects in survey experiments with Bayesian additive regression trees. Public Opinion Quarterly 76(3):491–511.
  3. Sören R. Künzel, Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 2019. [Paper link] [GitHub link]

Deep Learning

  1. Fredrik D. Johansson, U. Shalit, D. Sontag. ICML (2016). Learning Representations for Counterfactual Inference [Paper link]
  2. Shalit, U., Johansson, F. D., & Sontag, D. ICML (2017). Estimating individual treatment effect: generalization bounds and algorithms. Proceedings of the 34th International Conference on Machine Learning [Paper link]
  3. Christos Louizos, U. Shalit, J. Mooij, D. Sontag, R. Zemel, M. Welling. NIPS (2017). Causal Effect Inference with Deep Latent-Variable Models [Paper link]
  4. Alaa, A. M., Weisz, M., & van der Schaar, M. (2017). Deep Counterfactual Networks with Propensity-Dropout [Paper link]
  5. Shi, C., Blei, D. M., & Veitch, V. NeurIPS (2019). Adapting Neural Networks for the Estimation of Treatment Effects [Paper link] [GitHub link]


  1. Shuyang Du, James Lee, Farzin Ghaffarizadeh, 2017, Improve User Retention with Causal Learning [Paper link]
  2. Zhenyu Zhao, Totte Harinen, 2020, Uplift Modeling for Multiple Treatments with Cost [Paper link]
  3. Will Y. Zou, Smitha Shyam, Michael Mui, Mingshi Wang, 2020, Learning Continuous Treatment Policy and Bipartite Embeddings for Matching with Heterogeneous Causal Effects Optimization [Paper link]
  4. Will Y. Zou, Shuyang Du, James Lee, Jan Pedersen, 2020, Heterogeneous Causal Learning for Effectiveness Optimization in User Marketing [Paper link]