Facebook AI Releases ‘Dynabench’, A Dynamic Benchmark Testing Platform For Machine Learning Systems

Facebook AI has released Dynabench, an ambitious new research platform for dynamic data collection and benchmarking. It is one of the first AI benchmarking platforms in which benchmarking happens dynamically over multiple rounds. It works by testing machine learning systems and asking human annotators to act adversarially and try to break them.

While AI research benchmarks have progressed significantly, from MNIST to ImageNet to GLUE, we are still far from having machines that truly understand natural language. Dynabench creates challenging new datasets by putting humans and models in the loop together, with the aim of measuring NLP models more accurately. This process shows where current models fall short, and the examples it collects can be used to train the next generation of AI models. It also measures how easily humans can fool AI models in a dynamic setting rather than on a static benchmark.


Dynabench uses a procedure called dynamic adversarial data collection to improve on current AI benchmarking practices. This approach to evaluating the robustness (or brittleness) of ML systems goes beyond the traditional paradigm of scoring models against a fixed, static dataset.
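To make the idea of multi-round adversarial data collection concrete, here is a minimal Python sketch. It is not Dynabench's code or API: the `KeywordModel` class, the `collect_round` and `run_rounds` functions, and the example sentences are all hypothetical, and the "retraining" step is a crude memorization stand-in. It only illustrates the loop described above: annotators propose examples, the ones that fool the model are kept, and the model is updated before the next round.

```python
from typing import Dict, List, Tuple

Example = Tuple[str, int]  # (text, gold label); 1 = positive, 0 = negative


class KeywordModel:
    """Toy stand-in for an NLP model (hypothetical, for illustration only)."""

    def __init__(self, positive_words: List[str]) -> None:
        self.positive_words = set(positive_words)
        self.memorized: Dict[str, int] = {}

    def predict(self, text: str) -> int:
        # Examples seen during "training" are answered from memory.
        if text in self.memorized:
            return self.memorized[text]
        # Otherwise predict positive if any known positive word appears.
        return int(any(w in self.positive_words for w in text.lower().split()))

    def train(self, examples: List[Example]) -> None:
        # Crude stand-in for retraining: memorize the collected examples.
        for text, label in examples:
            self.memorized[text] = label


def collect_round(model: KeywordModel, attempts: List[Example]) -> List[Example]:
    """Keep only the annotator attempts that the model gets wrong."""
    return [(text, label) for text, label in attempts if model.predict(text) != label]


def run_rounds(model: KeywordModel, rounds: List[List[Example]]):
    """Run several collection rounds; return the dataset and per-round fooling rates."""
    dataset: List[Example] = []
    fooling_rates: List[float] = []
    for attempts in rounds:
        fooled = collect_round(model, attempts)
        fooling_rates.append(len(fooled) / len(attempts))
        dataset.extend(fooled)
        model.train(dataset)  # the updated model is the target of the next round
    return dataset, fooling_rates


# Simulated annotator attempts for two rounds (made-up sentences).
round_1 = [("not good at all", 0), ("great movie", 1), ("a delightful surprise", 1)]
round_2 = [("not good at all", 0), ("hardly great", 0)]

model = KeywordModel(["good", "great"])
dataset, rates = run_rounds(model, [round_1, round_2])
print(len(dataset), rates)
```

In this toy run, an attack that worked in the first round ("not good at all") no longer fools the model in the second, so annotators have to find new weaknesses. The per-round fooling rate is the kind of signal a dynamic benchmark can track in place of a single static test score.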

With these benchmarking innovations in Dynabench, we can hope that future AI systems will make fewer mistakes, carry fewer harmful biases, and be more useful in real-world applications.

Source: https://ai.facebook.com/blog/dynabench-rethinking-ai-benchmarking

Website: https://dynabench.org/

Related Paper: https://arxiv.org/pdf/1910.14599.pdf

Related GitHub: https://github.com/facebookresearch/anli

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views.
