March 24, 2021

Randomization Tests to Assess Covariate Balance When Designing and Analyzing Matched Datasets

By Zach Branson


Causal analyses for observational studies are often complicated by covariate imbalances among treatment groups, and matching methodologies alleviate this complication by finding subsets of treatment groups that exhibit covariate balance. It is widely agreed upon that covariate balance can serve as evidence that a matched dataset approximates a randomized experiment, but what kind of experiment does a matched dataset approximate? In this work, we develop a randomization test for the hypothesis that a matched dataset approximates a particular experimental design, such as complete randomization, block randomization, or rerandomization. Our test can incorporate any experimental design, and it allows for a graphical display that puts several designs on the same univariate scale, thereby allowing researchers to pinpoint which design—if any—is most appropriate for a matched dataset. After researchers determine a plausible design, we recommend a randomization-based approach for analyzing the matched data, which can incorporate any design and treatment effect estimator. Through simulation, we find that our test can frequently detect violations of randomized assignment that harm inferential results. Furthermore, through simulation and a real application in political science, we find that matched datasets with high levels of covariate balance tend to approximate balance-constrained designs like rerandomization, and analyzing them as such can lead to precise causal analyses. However, assuming a precise design should be approached with caution, because it can harm inferential results if there are still substantial biases due to remaining imbalances after matching. Our approach is implemented in the randChecks R package, available on CRAN.
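To give a flavor of the idea, the following is a minimal sketch (not the randChecks implementation, which is in R) of a randomization test for one candidate design, complete randomization with fixed group sizes. It uses a hypothetical Mahalanobis-distance balance statistic and an assumed permutation count; a small p-value suggests the observed covariate imbalance is larger than would be typical under that design, so the matched dataset does not plausibly approximate it.

```python
import numpy as np

def mahalanobis_balance(X, w):
    """Mahalanobis distance between treated and control covariate means.

    X : (n, p) covariate matrix; w : (n,) binary treatment indicator.
    """
    d = X[w == 1].mean(axis=0) - X[w == 0].mean(axis=0)
    S = np.cov(X, rowvar=False)
    return float(d @ np.linalg.solve(S, d))

def randomization_test(X, w, n_perm=2000, rng=None):
    """Approximate p-value for H0: the assignment mechanism is complete
    randomization with the observed group sizes.

    Permuting w simulates draws from that design; the p-value is the
    fraction of simulated assignments at least as imbalanced as observed.
    """
    rng = np.random.default_rng(rng)
    obs = mahalanobis_balance(X, w)
    perm_stats = np.empty(n_perm)
    for i in range(n_perm):
        perm_stats[i] = mahalanobis_balance(X, rng.permutation(w))
    return float(np.mean(perm_stats >= obs))
```

Other candidate designs (block randomization, rerandomization) would be tested the same way, with the permutation step replaced by draws from the corresponding constrained assignment distribution.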
