Towards Interactive Weak Supervision with FlyingSquid

Modern machine learning models require large amounts of training data to be successful. Over the past few years, we've been studying how to create labels programmatically with weak supervision to address this training data bottleneck; instead of relying on manual labels, data programming uses weak supervision (multiple noisy label sources) to automatically generate labeled... [Read More]

Automating the Art of Data Augmentation

Part III Theory

As we saw in the previous blog post, data augmentation techniques have achieved remarkable gains when applied to neural network models. In this blog post, we reflect on the success of various augmentation techniques and review our recent work that studies the theoretical properties of data augmentation. [Read More]

Automating the Art of Data Augmentation

Part II Practical Methods

Rather than relying on manual search, automated data augmentation approaches promise to find more powerful parameterizations and compositions of transformations. Perhaps the biggest difficulty in automating data augmentation is how to search over the space of transformations. Such a search can be prohibitively expensive due to the large number of transformation... [Read More]