PROPENSITY SCORE ESTIMATION WITH LOGISTIC REGRESSION
The most common method to estimate propensity scores is logistic regression, because it is a parametric model that is familiar to many researchers. Although there are many advanced data mining methods that can potentially outperform logistic regression, I recommend that researchers use logistic regression first because it frequently produces propensity scores that result in adequate covariate balance. If you are able to achieve covariate balance using the propensity scores estimated with logistic regression, it is not necessary to use advanced data mining methods.
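As a minimal sketch of this workflow (not the code from the video or the downloadable files), the example below fits a logistic regression with base R's `glm()`, extracts the fitted probabilities as propensity scores, and compares standardized mean differences before and after weighting. The simulated data, the variable names (`x1`, `x2`, `x3`, `treat`), the choice of weights for the average treatment effect on the treated, and the helper `std_diff()` are all assumptions made for illustration.

```r
# Simulated data: variable names and coefficients are illustrative assumptions
set.seed(123)
n <- 2000
dat <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = rbinom(n, 1, 0.4)
)
# Treatment assignment depends on the covariates through a logit model
true_logit <- -0.5 + 0.8 * dat$x1 - 0.5 * dat$x2 + 0.6 * dat$x3
dat$treat <- rbinom(n, 1, plogis(true_logit))

# Estimate propensity scores with logistic regression
ps_model <- glm(treat ~ x1 + x2 + x3, data = dat, family = binomial)
dat$ps <- fitted(ps_model)  # predicted probability of treatment

# Weights for the average treatment effect on the treated (ATT):
# 1 for treated cases, ps / (1 - ps) for untreated cases
dat$w_att <- ifelse(dat$treat == 1, 1, dat$ps / (1 - dat$ps))

# Hypothetical helper: weighted standardized mean difference,
# scaled by the standard deviation of the covariate in the treated group
std_diff <- function(x, treat, w) {
  m1 <- weighted.mean(x[treat == 1], w[treat == 1])
  m0 <- weighted.mean(x[treat == 0], w[treat == 0])
  (m1 - m0) / sd(x[treat == 1])
}

covs <- c("x1", "x2", "x3")
balance <- data.frame(
  unweighted = sapply(covs, function(v) std_diff(dat[[v]], dat$treat, rep(1, n))),
  weighted   = sapply(covs, function(v) std_diff(dat[[v]], dat$treat, dat$w_att))
)
print(round(balance, 3))
```

Weighted differences close to zero suggest that the logistic regression propensity scores balance the covariates, in which case a more complex estimation method may not be needed.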
In the video below, I review R code for propensity score estimation with logistic regression.
Code for Chapter 2 Propensity Score Estimation
| R Code for Propensity Score Estimation | |
| File Size: | 15 kb |
| File Type: | r |
| Rmarkdown Guide to Propensity Score Estimation | |
| File Size: | 11 kb |
| File Type: | rmd |
Data for Example of Propensity Score Estimation
| R Data for Propensity Score Estimation Example | |
| File Size: | 294 kb |
| File Type: | rdata |
MACHINE LEARNING FOR PROPENSITY SCORE ESTIMATION
Many machine learning methods can be used to estimate propensity scores, such as generalized boosted modeling, random forests, and neural networks.
Leite et al. (2025) conducted a comprehensive systematic review of publications from 1983 to 2023 that used machine learning to estimate propensity scores. They found that the gradient boosting machine (also called generalized boosted modeling, or GBM) is the most commonly used method. They also provide detailed guidelines for reporting propensity score estimation with machine learning.
In the video below, I show how to estimate propensity scores using the gradient boosting machine (generalized boosted modeling, or GBM) with the twang package of R.
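The following is a minimal sketch of GBM propensity score estimation with the `ps()` function of the twang package, not the code shown in the video. The simulated data, the variable names, and the tuning values (`n.trees`, `interaction.depth`, `shrinkage`, the `"es.mean"` stopping rule, and the ATT estimand) are assumptions chosen only to keep the example small and fast.

```r
library(twang)

# Simulated data: variable names and coefficients are illustrative assumptions
set.seed(123)
n <- 1000
dat <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = rbinom(n, 1, 0.4)
)
dat$treat <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x1 - 0.5 * dat$x2 + 0.6 * dat$x3))

# Fit the GBM propensity score model; the treatment must be coded 0/1
gbm_fit <- ps(
  treat ~ x1 + x2 + x3,
  data = dat,
  n.trees = 3000,            # assumed value; larger data may need more trees
  interaction.depth = 2,     # assumed value
  shrinkage = 0.01,          # assumed value
  estimand = "ATT",
  stop.method = "es.mean",   # choose the iteration minimizing the mean effect size
  verbose = FALSE
)

# Propensity scores at the selected iteration (one column per stop method)
dat$ps_gbm <- gbm_fit$ps[[1]]

# Weights implied by the chosen estimand and stop method
dat$w_gbm <- get.weights(gbm_fit, stop.method = "es.mean")

# Covariate balance before and after weighting
print(bal.table(gbm_fit))
summary(dat$ps_gbm)
```

The balance table is the main output to inspect, since the goal of the propensity score model is covariate balance rather than prediction accuracy.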
In the video below, I show how to estimate propensity scores with random forests using the party package of R.
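A minimal sketch of random forest propensity score estimation with `cforest()` from the party package is shown below; it is not the code from the video. The simulated data, the variable names, and the control settings (`ntree = 200`, `mtry = 2`) are assumptions for illustration, and the use of out-of-bag predictions is one possible design choice rather than a requirement.

```r
library(party)

# Simulated data: variable names and coefficients are illustrative assumptions
set.seed(123)
n <- 500
dat <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = rbinom(n, 1, 0.4)
)
dat$treat <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x1 - 0.5 * dat$x2 + 0.6 * dat$x3))

# A factor outcome makes cforest() grow classification trees
dat$treat_f <- factor(dat$treat, levels = c(0, 1))

rf_fit <- cforest(
  treat_f ~ x1 + x2 + x3,
  data = dat,
  controls = cforest_unbiased(ntree = 200, mtry = 2)  # assumed settings
)

# type = "prob" returns one set of class probabilities per case;
# OOB = TRUE bases each prediction on trees that did not use that case
prob_list <- predict(rf_fit, type = "prob", OOB = TRUE)

# The second entry is the probability of the second factor level (treat = 1)
dat$ps_rf <- unname(sapply(prob_list, function(p) p[2]))
summary(dat$ps_rf)
```

Because a random forest can produce propensity scores of exactly 0 or 1, it is worth inspecting the summary before computing weights from these scores.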
Example Code for Machine Learning for Propensity Score Estimation
| example_machine_learning_methods_for_propensity_score_estimation.rmd | |
| File Size: | 8 kb |
| File Type: | rmd |
| chapter4_ssocs08.rdata | |
| File Size: | 432 kb |
| File Type: | rdata |
Publications About Propensity Score Estimation
Leite, W. L., Zhang, H., Collier, Z. K., Chawala, K., Kong, L., Lee, Y., Quan, J., & Soyoye, O. (2024). Machine Learning for Propensity Score Estimation: A Systematic Review and Reporting Guidelines. OSF Pre-print.
Leite, W. L., Aydin, B., & Cetin-Berber, D. D. (2021). Imputation of Missing Covariate Data Prior to Propensity Score Analysis: A Tutorial and Evaluation of Robustness of Practical Approaches. Evaluation Review. https://doi.org/10.1177/0193841X211020245
Code for the paper
Collier, Z. K., & Leite, W. L. (2021). A Tutorial on Artificial Neural Networks in Propensity Score Analysis. Journal of Experimental Education. https://doi.org/10.1080/00220973.2020.1854158
Collier, Z. K., Leite, W. L., & Zhang, H. (2021). Estimating Propensity Scores Using Neural Networks and Traditional Methods: A Comparative Simulation Study. Communications in Statistics - Simulation and Computation. https://doi.org/10.1080/03610918.2021.1963455
Collier, Z. K., Leite, W. L., & Karpyn, A. (2021). Neural Networks to Estimate Generalized Propensity Scores for Continuous Treatment Doses. Evaluation Review. https://doi.org/10.1177/0193841X21992199