Powerful Abstractions for Programming Your Training Data

Using standard models (i.e., pretrained BERT) and minimal tuning, we leverage key abstractions for programmatically building and managing training data to achieve a state-of-the-art result on SuperGLUE, a newly curated benchmark with six tasks for evaluating “general-purpose language understanding technologies.” We also give updates on Snorkel’s use in the real... [Read More]

Butterflies Are All You Need: A Universal Building Block for Structured Linear Maps

We use a type of structured matrix known as a butterfly matrix to learn fast algorithms for discrete linear transforms such as the Discrete Fourier Transform. We further introduce a hierarchy of matrix families based on composing butterfly matrices, which is capable of efficiently representing any structured matrix (any matrix... [Read More]

Learning Dependency Structures in Weak Supervision

Recently, weak supervision has been used to efficiently label large-scale training sets without traditional hand-labeled data across applications in academia and industry. However, users cannot always specify which dependencies (i.e., correlations) exist among the weak supervision sources, which could potentially number in the hundreds. We discuss a method to learn... [Read More]

Massive Multi-Task Learning with Snorkel MeTaL: Bringing More Supervision to Bear

TL;DR: We use Snorkel MeTaL to construct a simple model (pretrained BERT + linear task heads) and incorporate a variety of supervision signals (traditional supervision, transfer learning, multi-task learning, weak supervision, and ensembling) in a Massive Multi-Task Learning (MMTL) setting, achieving a new state-of-the-art score on the GLUE Benchmark and... [Read More]

Debugging Machine Learning - Reflections from DAWN Retreat

“What do you spend time on while debugging machine learning pipelines?” Responses to this question at the Fall 2018 DAWN Retreat ranged from “finding the best way to use transfer learning” to “systematically sampling from raw data.” We identify three broad themes from our discussions and explore them in this... [Read More]