FlyingSquid: A Python Framework For Interactive Weak Supervision

This article summarizes the key points of FlyingSquid, as presented in the paper ‘Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods’, published in 2020 by Stanford researchers.

Weak supervision is a common method for building machine learning models without relying on ground truth annotations. It generates probabilistic training labels by estimating the accuracy of multiple noisy labeling sources (e.g., heuristics). While it might seem like the easiest way to get started with ML, weakly supervised training can be costly and time-consuming in practice, since existing approaches estimate the source accuracies with iterative procedures.

A group of computer science researchers from Stanford University shows that, for a class of latent variable models highly applicable to weak supervision, an explicit closed-form solution exists, obviating the need for iterative procedures like stochastic gradient descent (SGD). The research team used this insight to build the FlyingSquid framework, which learns the accuracies of the labeling sources in closed form, runs faster than previous weak supervision approaches, and requires fewer assumptions.
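To make the closed-form idea concrete, here is a minimal sketch in the spirit of the paper's triplet method. It is not FlyingSquid's actual implementation. It assumes binary labels in {-1, +1}, exactly three sources that never abstain, sources that err independently of one another given the true label, and sources that are each better than random. Under those assumptions, the average product of two sources' votes equals the product of their individual correlations with the hidden label, so three pairwise averages are enough to solve for each accuracy directly. All function names and the simulated accuracies are hypothetical.

```python
import math
import random


def simulate_votes(accuracies, n, seed=0):
    """Draw n hidden labels in {-1, +1}; each source reports the
    true label with its own probability and flips it otherwise."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        votes.append([y if rng.random() < p else -y for p in accuracies])
    return votes


def mean_product(votes, i, j):
    """Average of vote_i * vote_j: near +1 when sources i and j
    mostly agree, near -1 when they mostly disagree."""
    return sum(row[i] * row[j] for row in votes) / len(votes)


def triplet_accuracies(votes):
    """Estimate P(source is correct) for three sources without ever
    looking at the true labels and without any iterative optimizer."""
    m01 = mean_product(votes, 0, 1)
    m02 = mean_product(votes, 0, 2)
    m12 = mean_product(votes, 1, 2)
    # With a_i = E[vote_i * y], independence gives m_ij = a_i * a_j,
    # so |a_0| = sqrt(|m01 * m02 / m12|), and likewise for a_1, a_2.
    # Taking the positive root is the "better than random" assumption.
    correlations = [
        math.sqrt(abs(m01 * m02 / m12)),
        math.sqrt(abs(m01 * m12 / m02)),
        math.sqrt(abs(m02 * m12 / m01)),
    ]
    # Convert correlation a_i = 2 * p_i - 1 back to accuracy p_i.
    return [(1 + a) / 2 for a in correlations]


true_accuracies = [0.9, 0.8, 0.7]  # hypothetical values for the simulation
votes = simulate_votes(true_accuracies, n=50000)
estimated = triplet_accuracies(votes)
print([round(p, 3) for p in estimated])
```

The estimates come from a handful of averages and square roots, which is why no learning rate or number of epochs has to be tuned. The sketch breaks down if any pairwise average is close to zero, since it then divides by a value near zero.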

FlyingSquid is a framework for automatically building models from multiple noisy label sources. You can write functions that generate labels on your data, and FlyingSquid uses the agreements and disagreements between them to learn how accurate each labeling function is. The resulting label model can then be used directly in downstream applications, or its output labels can be used to train a more powerful end model.
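As a rough illustration of what "functions that generate labels" can look like, the sketch below defines three toy labeling functions for a made-up spam task and measures how often they agree. The functions, the example texts, and the helper names are all hypothetical. The vote convention (+1, -1, and 0 for abstain) is an assumption made for this example.

```python
# Hypothetical labeling functions for a toy spam task.
# Convention assumed here: +1 = spam, -1 = not spam, 0 = abstain.

def lf_contains_free(text):
    """Vote spam when the word 'free' appears, otherwise abstain."""
    return 1 if "free" in text.lower() else 0


def lf_has_greeting(text):
    """Vote not-spam when the message opens with a greeting."""
    return -1 if text.lower().startswith(("hi", "hello")) else 0


def lf_many_exclamations(text):
    """Vote spam for two or more exclamation marks, else not-spam."""
    return 1 if text.count("!") >= 2 else -1


LABELING_FUNCTIONS = [lf_contains_free, lf_has_greeting, lf_many_exclamations]


def build_label_matrix(texts):
    """One row per example, one column per labeling function."""
    return [[lf(t) for lf in LABELING_FUNCTIONS] for t in texts]


def agreement_rate(matrix, i, j):
    """Mean product of votes i and j. Agreement adds +1, disagreement
    adds -1, and a row where either function abstains adds 0."""
    return sum(row[i] * row[j] for row in matrix) / len(matrix)


texts = [
    "Hello team, meeting moved to 3pm",
    "FREE prize!!! Click now!!!",
    "Hi, free lunch in the kitchen",
    "You won!! Claim your reward!!",
]
L = build_label_matrix(texts)
for row in L:
    print(row)
print(agreement_rate(L, 1, 2))
```

No ground truth labels appear anywhere in this sketch. The only signal is the matrix of votes and the pairwise agreement statistics computed from it, which is the kind of input a label model works from.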


The researchers validated FlyingSquid on benchmark weak supervision datasets. They found that it achieves the same or higher quality compared to previous approaches without needing custom tuning, and that it recovers model parameters 170 times faster on average.



Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views, illustrating its popularity among audiences.
