Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation

Identifying people, places, and things in text, called named entity disambiguation (NED), is a fundamental AI problem. Machine-based NED systems often struggle to correctly distinguish entities that appear infrequently in data, even though the majority of entities people care about when searching for information or using personal assistants are... [Read More]

Addressing Hidden Stratification: Fine-Grained Robustness in Coarse-Grained Classification Problems

The classes in classification tasks are often composed of finer-grained subclasses. Models trained using only the coarse-grained class labels tend to exhibit highly variable performance across different subclasses. Moreover, the subclasses are often unknown ahead of time, making it difficult to identify and reduce such performance gaps. This hidden stratification... [Read More]

Ivy: Instrumental Variable Synthesis for Causal Inference

In science and medicine, randomized controlled experiments (RCEs) are a reliable way to measure cause-and-effect relationships. However, RCEs are not always practical because of cost, time, ethics, and other concerns. A popular alternative is to use instrumental variables (IVs), variables in observational data that resemble the behavior of... [Read More]