Thesis
Methods to adjust for confounding:
propensity scores and instrumental variables
E.P. Martens
See also:
http://igiturarchive.library.uu.nl/dissertations/20071211202901/full.pdf.
Contents
1 Introduction
2 An overview of methods
2.1 Methods to assess intended effects of drug treatment in observational studies are reviewed
3 Strengths and limitations of adjustment methods
3.1 "Conditioning on the propensity score can result in biased estimation of common measures of treatment effect"
3.2 An important advantage of propensity score methods compared to logistic regression analysis
3.3 Instrumental variables: application and limitations
4 Application of adjustment methods
4.1 Comparing treatment effects after adjustment with multivariable Cox proportional hazards regression and PS methods
4.2 A nonparametric application of instrumental variables in survival analysis
5 Improvement of propensity score methods
5.1 The use of the overlapping coefficient in propensity score methods
5.2 Measuring balance in propensity score methods
6 Discussion
Summary
Samenvatting (summary in Dutch)
Dankwoord (acknowledgements)
About the author/Over de schrijver
Short summary
In the evaluation of the effect of different treatments, well-conducted randomized controlled trials have
been widely accepted as the scientific standard. When on the other hand observational studies are used to
assess treatment effects, the absence of a randomized assignment of treatments will in general result in
treatment groups that are systematically different on factors that can be alternative explanations for the
observed treatment effect. Adjustment for confounding is therefore necessary in these types of studies.
An overview of such methods is given, and two of them are described further, evaluated and applied to real
data sets. Furthermore, improvements are suggested. One of these adjustment methods, propensity scores, is
increasingly used in the medical literature as an alternative to traditional regression-based methods such as
logistic regression and Cox proportional hazards regression. Nonetheless, an important advantage of propensity
scores is frequently overlooked by researchers: their treatment effect estimate is in general closer to
the true average treatment effect than that of regression methods using the odds ratio or the hazard ratio. The
difference can be substantial, especially when the number of confounding factors is more than 5, the treatment
effect is larger than an odds ratio of 1.25 (or smaller than 0.8) or the incidence proportion is between 0.05
and 0.95. An important step in the application of propensity score methods is the creation of the propensity
score model, including the check for balance. In many applications this model is routinely chosen and information
on the balance of covariates between treatment groups is missing. We proposed using a measure of balance, the
overlapping coefficient, to select the best propensity score model and to report the degree of balance uniformly.
Its inverse association with bias and the low mean squared error support the use of this measure. For smaller
sample sizes, however, the method does not seem to work well for model selection purposes. We also explored alternative
measures, the Kolmogorov-Smirnov distance and the Lévy metric, but these were slightly less promising. The other
adjustment method that has been evaluated is the method of instrumental variables. Its potential ability to
adjust for all confounders, whether observed or not, is an attractive property. We applied this method to
censored survival data and used the difference in survival probabilities as the treatment effect. Formulas
for the standard errors are provided; these can be large when the number of events is low or
toward the end of the survival curve. Nonetheless, this method is worthwhile when a suitable instrumental variable
can be found or created. In the literature a warning can be found against a weak correlation between
the instrumental variable and treatment. We demonstrated the existence of an upper bound on this correlation,
which can be a practical limitation when considerable confounding exists. An instrument that fulfills the
main assumption of the method may then have to be fairly weak or, worse, an instrument that turns out to be
strong may indicate a violation of the main assumption.
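
To make the overlapping coefficient mentioned in the summary more concrete, the following is a minimal sketch of how such a measure could be computed for one covariate: the area under the minimum of the two estimated density curves, one per treatment group. The summary does not specify which density estimator or which covariates the thesis uses, so the Gaussian kernel estimator, the rule-of-thumb bandwidth, the function names and the simulated data below are all assumptions made for illustration.

```python
import math
import random
import statistics


def gaussian_kde(sample):
    """Return a Gaussian kernel density estimate of `sample` as a function."""
    n = len(sample)
    # Rule-of-thumb bandwidth; an illustrative choice, not taken from the thesis.
    h = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density


def overlapping_coefficient(treated, untreated, grid_size=400):
    """Approximate the area under min(f_treated, f_untreated)."""
    f1, f0 = gaussian_kde(treated), gaussian_kde(untreated)
    pooled = treated + untreated
    spread = statistics.stdev(pooled)
    lo = min(pooled) - 3 * spread
    hi = max(pooled) + 3 * spread
    step = (hi - lo) / grid_size
    heights = [min(f1(lo + i * step), f0(lo + i * step))
               for i in range(grid_size + 1)]
    # Trapezoidal rule on an evenly spaced grid.
    return step * (sum(heights) - 0.5 * (heights[0] + heights[-1]))


# Simulated covariate values (hypothetical data, fixed seed for repeatability).
rng = random.Random(0)
treated = [rng.gauss(0.0, 1.0) for _ in range(300)]
balanced = [rng.gauss(0.0, 1.0) for _ in range(300)]    # same distribution
imbalanced = [rng.gauss(1.5, 1.0) for _ in range(300)]  # shifted distribution

ovl_balanced = overlapping_coefficient(treated, balanced)
ovl_imbalanced = overlapping_coefficient(treated, imbalanced)
print(f"OVL, similar groups: {ovl_balanced:.2f}")
print(f"OVL, shifted groups: {ovl_imbalanced:.2f}")
```

A value near 1 indicates that the two groups have nearly identical distributions of the covariate, while a value near 0 indicates almost no overlap. In this sketch the shifted groups should therefore score clearly lower than the similar groups.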
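
One way to see why an upper bound on the correlation between instrument and treatment must exist is sketched below. It is given under simplifying assumptions and is not necessarily the derivation used in the thesis. The symbols are illustrative and do not come from the summary: Z is the instrument, X the treatment, and U a single variable summarising the confounding. The main assumption requires Z to be uncorrelated with U. Since the correlation matrix of (Z, X, U) must be positive semidefinite, its determinant cannot be negative (the matrix environment assumes the amsmath package):

```latex
\[
\det\begin{pmatrix}
1 & \rho_{ZX} & 0 \\
\rho_{ZX} & 1 & \rho_{XU} \\
0 & \rho_{XU} & 1
\end{pmatrix}
= 1 - \rho_{ZX}^{2} - \rho_{XU}^{2} \;\ge\; 0
\quad\Longrightarrow\quad
|\rho_{ZX}| \;\le\; \sqrt{1 - \rho_{XU}^{2}} .
\]
```

For example, if the correlation between treatment and confounding is 0.8, the correlation between instrument and treatment cannot exceed 0.6 in this setting. The stronger the confounding, the weaker a valid instrument must be.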