CEP/STICERD Applications Seminars

Robustly estimating heterogeneity in factorial data using Rashomon Partitions

Arun Chandrasekhar (Stanford University)

Monday 10 June 2024 12:00 - 13:30

Many of our seminars and public events this year will continue as in person or as hybrid (online and in person) events. Please check our website listings and Twitter feed @STICERD_LSE for updates.

Unless otherwise specified, in-person seminars are open to the public. Please ensure you have informed the event contact as early as possible.

Those unable to join the seminars in person are welcome to participate via Zoom if the event is hybrid.


About this event

Many statistical analyses, in both observational data and randomized control trials, ask: how does the outcome of interest vary with combinations of observable covariates? For example, how do various drug combinations affect health outcomes, or how does technology adoption depend on incentives and demographics? Ubiquitous tools (e.g. Bayesian and frequentist regression, regression trees) partition observations into homogeneous "pools" where the outcome is similar within each pool and different across pools, then compute a summary (or fit a model) within each pool. We propose learning these partitions using Rashomon Partition Sets (RPSs). We construct the RPS by enumerating all partitions that have posterior density close to that of the maximum a posteriori (MAP) partition, even if they offer very different substantive explanations. The RPS incorporates uncertainty amongst partitions with high evidence in the data, in contrast to averaging approaches (e.g. Bayesian Model Averaging, random forests) that include many partitions that a scientist would easily discard after seeing the data. Looking across partitions in the RPS provides robust explanations for heterogeneity: those that appear in all partitions in the neighborhood of the MAP. Partitions in the RPS follow two guiding principles. First, they must be scientifically coherent, which also improves computational efficiency. Second, they must be robust. We use an $\ell_0$ prior, which we show is minimax optimal, but impose no additional assumptions on the dependence structure. Conditional on being in the RPS, we can calculate the posterior of any measurable function of the vector of feature combination effects on outcomes and characterize approximation error relative to the entire posterior. We give three empirical examples: price effects on charitable giving, heterogeneity in chromosomal structure (telomere length), and the introduction of microfinance. We highlight robust conclusions, including affirmations and reversals of extant findings.
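
As a rough illustration of the enumeration idea described in the abstract, the toy sketch below lists every way of pooling the four cells of a 2x2 factorial design and keeps the poolings whose score is close to the best one. It is not the speaker's algorithm. Several things here are assumptions made purely for illustration: a penalized squared-error loss (with an $\ell_0$-style penalty on the number of pools) stands in for the negative log posterior, all set partitions are enumerated without the scientific-coherence restriction mentioned above, and the data, the function names, and the values of `lam` and `eps` are hypothetical.

```python
def set_partitions(items):
    """Yield every partition of `items` as a list of pools (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Put `first` into each existing pool in turn...
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]
        # ...or let it start a pool of its own.
        yield [[first]] + partial


def penalized_loss(partition, data, lam):
    """Mean squared error around pool means plus lam * (number of pools)."""
    n = sum(len(ys) for ys in data.values())
    sse = 0.0
    for pool in partition:
        ys = [y for cell in pool for y in data[cell]]
        mean = sum(ys) / len(ys)
        sse += sum((y - mean) ** 2 for y in ys)
    return sse / n + lam * len(partition)


def rashomon_set(data, lam, eps):
    """Return (loss, partition) pairs within eps of the best loss, best first."""
    cells = sorted(data)
    scored = [(penalized_loss(p, data, lam), p) for p in set_partitions(cells)]
    best = min(loss for loss, _ in scored)
    kept = [(loss, p) for loss, p in scored if loss <= best + eps]
    return sorted(kept, key=lambda pair: pair[0])


# Hypothetical outcomes for a 2x2 factorial design: (treatment A, treatment B).
data = {
    (0, 0): [1.0, 1.2, 0.8],
    (0, 1): [1.1, 0.9, 1.0],
    (1, 0): [1.0, 1.3, 0.7],
    (1, 1): [3.0, 3.2, 2.8],
}

rashomon = rashomon_set(data, lam=0.1, eps=0.15)
for loss, partition in rashomon:
    print(round(loss, 3), sorted(sorted(pool) for pool in partition))

# A conclusion is "robust" in this toy sense if it holds in every kept partition.
robust_11_alone = all([(1, 1)] in partition for _, partition in rashomon)
print("cell (1, 1) is isolated in every near-optimal partition:", robust_11_alone)
```

In this made-up data only the combination of both treatments shifts the outcome, so every partition that survives the threshold keeps cell (1, 1) in its own pool, while the partitions disagree on how to pool the remaining cells. Exhaustive enumeration is only feasible for a handful of cells, which is one reason restricting attention to coherent partitions matters in practice.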

Applications (Applied Micro) Seminars are held on Mondays in term time at 12:00-13:30 in SAL 3.05 in person.

Seminar organiser: Katie Smith

For further information please contact Sadia Ali: s.ali43@lse.ac.uk.
