The Sorted Effects Method: Discovering Heterogeneous Effects Beyond their Averages

https://doi.org/10.3982/ECTA14415
p. 1911-1938

Victor Chernozhukov, Iván Fernández‐Val, Ye Luo

The partial (ceteris paribus) effects of interest in nonlinear and interactive linear models are heterogeneous as they can vary dramatically with the underlying observed or unobserved covariates. Despite the apparent importance of heterogeneity, a common practice in modern empirical work is to largely ignore it by reporting average partial effects (or, at best, average effects for some groups). While average effects provide very convenient scalar summaries of typical effects, by definition they fail to reflect the entire variety of the heterogeneous effects. In order to discover these effects much more fully, we propose to estimate and report sorted effects—a collection of estimated partial effects sorted in increasing order and indexed by percentiles. By construction, the sorted effect curves completely represent and help visualize the range of the heterogeneous effects in one plot. They are as convenient and easy to report in practice as the conventional average partial effects. They also serve as a basis for classification analysis, where we divide the observational units into most or least affected groups and summarize their characteristics. We provide a quantification of uncertainty (standard errors and confidence bands) for the estimated sorted effects and related classification analysis, and provide confidence sets for the most and least affected groups. The derived statistical results rely on establishing key, new mathematical results on Hadamard differentiability of a multivariate sorting operator and a related classification operator, which are of independent interest.
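To make the idea of sorted effects concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a logit model with a binary treatment `d`, one covariate `x`, and an interaction term. The coefficient values, the simulated data, and the function names `partial_effects` and `sorted_effects` are hypothetical and chosen only for illustration. The sketch computes one partial effect per unit, reports their empirical percentiles alongside the average partial effect, and then compares the covariate across the units with the lowest and highest 10% of effects, as a rough stand-in for the classification analysis. It does not compute the standard errors or confidence bands that the paper provides.

```python
import numpy as np

def logistic(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + np.exp(-z))

def partial_effects(x, beta):
    """Effect of switching binary d from 0 to 1 on P(y=1 | d, x)
    in a logit with a d*x interaction (an assumed model form)."""
    b0, b_d, b_x, b_dx = beta
    p1 = logistic(b0 + b_d + (b_x + b_dx) * x)
    p0 = logistic(b0 + b_x * x)
    return p1 - p0

def sorted_effects(pe, percentiles):
    """Empirical quantiles of the unit-level partial effects,
    indexed by the requested percentiles (0-100 scale)."""
    return np.quantile(pe, np.asarray(percentiles) / 100.0)

# Simulated covariate and hypothetical coefficients (not estimates from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
beta = (-0.5, 1.0, 0.8, -0.6)

pe = partial_effects(x, beta)          # one partial effect per unit
percentiles = np.arange(2, 99)         # 2nd through 98th percentile
spe = sorted_effects(pe, percentiles)  # sorted effects curve
ape = pe.mean()                        # conventional average partial effect

# Classification sketch: units in the bottom and top 10% of effects.
cut_lo, cut_hi = np.quantile(pe, [0.10, 0.90])
lowest = x[pe <= cut_lo]
highest = x[pe >= cut_hi]

print(f"APE: {ape:.3f}")
print(f"Sorted effects at the 2nd and 98th percentiles: {spe[0]:.3f}, {spe[-1]:.3f}")
print(f"Mean of x in lowest-effect group:  {lowest.mean():.3f}")
print(f"Mean of x in highest-effect group: {highest.mean():.3f}")
```

Plotting `spe` against `percentiles` with a horizontal line at `ape` gives the kind of single-plot summary the abstract describes: the average collapses the curve to one number, while the curve shows how far the effects spread around it.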

We apply the sorted effects method and classification analysis to demonstrate several striking patterns in the gender wage gap. We find that this gap is particularly strong for married women, ranging from −60% to 0% between the 2% and 98% percentiles, as a function of observed and unobserved characteristics; while the gap for never married women ranges from −40% to +20%. The most adversely affected women tend to be married, do not have college degrees, work in sales, and have high levels of potential experience.

Supplemental Material

Supplement to "The Sorted Effects Method: Discovering Heterogeneous Effects Beyond their Averages"

This zip file contains the replication files for the manuscript. It also contains an online appendix. The supplementary material contains 7 appendices with additional results and some omitted proofs. Appendix C introduces some notation. Appendix D includes a brief review of differential geometry. Appendix E gathers the proofs of the key mathematical results in Appendix A. Appendix F provides sufficient conditions for the u-Donsker properties in Section 4. Appendix G extends the theoretical analysis to include discrete covariates. Appendices H and I report the results of 3 numerical simulations and an empirical application to the effect of race on mortgage denials, respectively.