Singapore-Based Researchers Launch ‘AI Verify’: The First AI Governance Testing Framework (MVP)

This article is written as a summary by Marktechpost Staff based on the 'AI Verify' paper and post. All credit for this research goes to the researchers of this project.

Singapore-based researchers have launched what they describe as the first AI Governance Testing Framework and Toolkit for organizations that want to demonstrate responsible AI in a measurable way. The early-stage product, called AI Verify, aims to build trust between businesses and their stakeholders by combining technical tests with process checks.

As more products and services use AI to personalize offerings or make autonomous predictions, the public needs assurance that AI systems are fair, explainable, safe, and accountable. The objective is to increase public confidence in AI while encouraging its wider adoption. Voluntary AI governance frameworks and guidelines have already been published to help system owners and developers build trustworthy AI products and services.

AI Verify is offered as a Minimum Viable Product (MVP) to developers and owners of AI systems who want to be more transparent about their systems’ performance through technical tests and process checks. A vital part of transparency is understanding how AI models reach their decisions and whether their predictions carry unintended bias. AI systems should also be accountable and open to scrutiny. Companies are invited to join a pilot to test the MVP.

It’s hoped that the MVP will accomplish the following goals:

  • Provide organizations with the tools to build trust with their customers, suppliers, and other stakeholders. By using the MVP to demonstrate the claimed performance of their AI systems, organizations can increase stakeholders’ confidence in those systems.
  • Improve interoperability between AI governance frameworks. Businesses may benefit from the MVP’s focus on principles shared across trustworthy AI governance frameworks and regulations. The community that forms around the MVP is also expected to advance AI governance testing by exchanging knowledge and establishing best practices.

The AI Governance Testing Framework and Toolkit draws on existing AI ethics principles, guidelines, and frameworks. It covers 11 AI ethics principles grouped into five categories. The principles include data governance, accountability, human agency and oversight, inclusive growth, and societal and environmental well-being.


The following elements are essential to the Testing Framework’s structure:

  • Definitions. The Testing Framework defines each AI ethics principle.
  • Testable criteria. A set of testable criteria is assigned to each principle. The criteria combine technical and non-technical elements (e.g., processes and organizational structure) and make it possible to assess whether the principle’s intended outcomes have been achieved.
  • Testing process. The testing process establishes whether each testable criterion has been met. It can be quantitative, such as statistical and technical tests, or qualitative, such as producing documentary evidence during process checks.
  • Metrics. Each testable criterion has well-defined quantitative and/or qualitative parameters that can be measured.
  • Thresholds. As AI technology advances, it is increasingly difficult to set thresholds that define acceptable values or benchmarks for the chosen metrics, whether those thresholds come from industry or from regulators.
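
To make the structure above more concrete, here is a minimal Python sketch of how a principle, its testable criteria, testing process, metric, and optional threshold could be represented. All class and field names, and the example criteria, are hypothetical illustrations; the article does not describe AI Verify's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Criterion:
    """One testable criterion attached to a principle (hypothetical model)."""
    description: str
    process: str                       # how the criterion is checked
    metric: str                        # what is measured
    kind: str = "technical"            # "technical" test or "process" check
    threshold: Optional[float] = None  # left empty: no thresholds defined yet


@dataclass
class Principle:
    """An AI ethics principle with its definition and testable criteria."""
    name: str
    definition: str
    criteria: List[Criterion] = field(default_factory=list)


# Illustrative content only; not taken from the Testing Framework itself.
fairness = Principle(
    name="Fairness",
    definition="Placeholder definition used for illustration.",
    criteria=[
        Criterion(
            description="Selection rates are comparable across groups",
            process="Run a statistical test on model predictions",
            metric="demographic parity difference",
        ),
        Criterion(
            description="A fairness review has been documented",
            process="Inspect documentary evidence",
            metric="document present (yes/no)",
            kind="process",
        ),
    ],
)


def criteria_without_threshold(principle: Principle) -> List[str]:
    """List the criteria that have a metric but no agreed threshold."""
    return [c.description for c in principle.criteria if c.threshold is None]
```

In this sketch, a threshold is an optional field, so metrics can be collected before any acceptable value has been agreed.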

The Testing Framework does not currently define thresholds. The goal is to gather data from organizations that test their AI systems against the Testing Framework and use it to develop context-specific metrics and thresholds.

Initially, the Toolkit’s technical testing covers three principles: fairness, explainability, and robustness. By identifying and packaging popular open-source libraries into a single Toolkit, it provides a “one-stop” tool for conducting technical tests. These libraries include SHAP (SHapley Additive exPlanations) for explainability, the Adversarial Robustness Toolbox for robustness, and AIF360 and Fairlearn for fairness.
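
As an illustration of what a technical fairness test can measure, the sketch below computes a demographic parity difference, the gap between the highest and lowest rate of positive predictions across groups. It is hand-written in plain Python and does not call Fairlearn, AIF360, or AI Verify. The function names and the toy data are assumptions made for this example.

```python
def selection_rate(predictions):
    """Fraction of predictions that are positive (1)."""
    return sum(predictions) / len(predictions)


def demographic_parity_difference(predictions, groups):
    """Return (gap, per-group rates) for binary predictions.

    gap is the difference between the largest and smallest selection
    rate across the groups; 0.0 means all groups are selected equally.
    """
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: selection_rate(p) for g, p in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


# Toy data: binary predictions for eight people in two groups.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, rates = demographic_parity_difference(y_pred, group)
print(rates)  # selection rate per group
print(gap)    # gap between the most and least selected group
```

Whether a given gap is acceptable depends on the use case, which is why the choice of metric and any threshold are left to context.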

The Toolkit provides a user interface that walks users through the testing process, including a guided fairness tree that leads them to the fairness metrics relevant to their use case. It supports certain binary classification and regression models that use tabular data, such as decision trees, and produces a basic summary report to help system developers and owners interpret the test results. For the process checks, checklists indicate whether the documentary evidence specified in the Testing Framework is present.

Because it is at an early stage of development and iteration, the Toolkit currently has the following scope and limitations:

  • Supports binary classification and regression models from well-known frameworks such as scikit-learn, TensorFlow, and XGBoost. It handles tabular datasets for most principles and does not currently support unsupervised models;
  • Small-to-medium scale models (less than 2 GB) can be loaded into the toolkit through a web interface. The toolkit does not handle image datasets. More complex models and AI pipelines may not work yet; as the pilot program continues, more features will be added based on industry feedback.
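
The scope limits above could be checked before submitting a model for testing. The following is a hypothetical pre-flight check, not part of AI Verify: the names are invented for illustration, and the size limit assumes that "2 GB" means 2 × 1024³ bytes.

```python
from dataclasses import dataclass

# Limits taken from the article; the byte interpretation is an assumption.
MAX_MODEL_BYTES = 2 * 1024**3
SUPPORTED_TASKS = {"binary_classification", "regression"}
SUPPORTED_DATA = {"tabular"}


@dataclass
class ModelInfo:
    """Basic facts about a model to be tested (hypothetical structure)."""
    task: str        # e.g. "binary_classification", "regression"
    data_type: str   # e.g. "tabular", "image"
    size_bytes: int  # size of the serialized model file


def scope_issues(info: ModelInfo) -> list:
    """Return a list of reasons the model falls outside the stated scope."""
    issues = []
    if info.task not in SUPPORTED_TASKS:
        issues.append(f"unsupported task: {info.task}")
    if info.data_type not in SUPPORTED_DATA:
        issues.append(f"unsupported data type: {info.data_type}")
    if info.size_bytes >= MAX_MODEL_BYTES:
        issues.append("model is 2 GB or larger")
    return issues


in_scope = ModelInfo("binary_classification", "tabular", 500 * 1024**2)
out_of_scope = ModelInfo("clustering", "image", 3 * 1024**3)

print(scope_issues(in_scope))      # empty list: nothing out of scope
print(scope_issues(out_of_scope))  # one entry per violated limit
```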