Addressing Hidden Stratification: Fine-Grained Robustness in Coarse-Grained Classification Problems

The classes in classification tasks are often composed of finer-grained subclasses. Models trained using only the coarse-grained class labels tend to exhibit highly variable performance across different subclasses. Moreover, the subclasses are often unknown ahead of time, making it difficult to identify and reduce such performance gaps. This hidden stratification... [Read More]

Ivy: Instrumental Variable Synthesis for Causal Inference

In science and medicine, randomized controlled experiments (RCEs) are a reliable way to measure cause-and-effect relationships. However, RCEs are not always practical because of cost, time, ethics, and other concerns. A popular alternative is to use instrumental variables (IVs), variables in observational data that resemble the behavior of... [Read More]

Towards Interactive Weak Supervision with FlyingSquid

Modern machine learning models require a lot of training data to be successful. Over the past few years, we've been studying how to create labels programmatically with weak supervision to address this training data bottleneck; instead of relying on manual labels, data programming uses weak supervision (multiple noisy label sources) to automatically generate labeled... [Read More]