The U.S. Food and Drug Administration (FDA), Health Canada, and the UK’s Medicines and Healthcare products Regulatory Agency (MHRA) have jointly identified ten guiding principles to inform the development of Good Machine Learning Practice (GMLP). These principles are intended to promote safe, effective, and high-quality medical devices that use Artificial Intelligence and Machine Learning (AI/ML). AI/ML technologies have the potential to derive useful insights from the vast amount of data generated during the everyday delivery of healthcare. They use software algorithms that learn from real-world use, which in some cases can be used to improve product performance. Because their development is data-driven and iterative, and because of their complexity, these technologies also raise unique considerations.
The AI/ML medical device field is evolving continuously, and so is GMLP. The ten guiding principles lay a foundation for developing Good Machine Learning Practice that addresses the unique nature of these products and encourages future growth in this rapidly progressing field. They can be used to adopt good practices that have been proven in other sectors, to tailor practices from other sectors so that they apply to medical technology and healthcare, and to create new practices specific to medical technology and healthcare. Let us look at the guiding principles.
- Multi-disciplinary expertise is leveraged throughout the total product life cycle. A thorough understanding of a model’s intended integration into the clinical workflow, along with the desired benefits and associated patient risks, helps ensure that ML-enabled medical devices are safe and effective and address clinically meaningful needs over the device’s lifecycle.
- The model design is implemented with attention to the fundamentals: good software engineering practices, data quality assurance, and data management. These practices include methodical risk management and design processes that can capture and communicate design, implementation, and risk management decisions and their rationale. They also help ensure the integrity and authenticity of data.
- The data collected should capture the relevant characteristics of the intended patient population, and the measurement inputs should be sufficiently represented in the training and test datasets so that results can reasonably be generalized to the population of interest. It is also important to manage bias, to promote generalizable performance across the intended patient population, and to identify the circumstances in which the model may underperform.
- The training and test datasets are selected and maintained so that they are independent of one another. To ensure independence, all potential sources of dependence, including patient, data acquisition, and site factors, are considered and addressed.
- Accepted, best available methods for developing a reference dataset are used to make sure that clinically relevant and well-characterized data are collected and that the limitations of the reference are understood. Where available, accepted reference datasets that promote and demonstrate model robustness and generalizability across the intended patient population are used in model development and testing.
- The product’s clinical benefits and risks are well understood, are used to derive clinically meaningful performance goals for testing, and support the conclusion that the product can safely and effectively achieve its intended use. Considerations include the impact of both global and local performance, as well as the uncertainty and variability in the device inputs and outputs.
- Where the model has a human in the loop, human factors and the human interpretability of the model outputs are taken into account. The emphasis is on the performance of the Human-AI team rather than on the performance of the model in isolation.
- Statistically sound test plans are developed and executed to show how the device performs under clinically relevant conditions. Factors to consider include the intended patient population, important subgroups, the clinical environment and how the Human-AI team uses the device, measurement inputs, and potential confounding factors.
- Users have ready access to clear, contextually relevant information that is appropriate for the intended audience. This includes the model’s performance for appropriate subgroups, acceptable inputs, known limitations, user interface interpretation, and the integration of the model into clinical workflows. Users are also informed of device modifications and updates arising from real-world performance monitoring, of the basis for decision-making where available, and of a channel for communicating product concerns to the developer.
- Deployed models can be monitored in real-world use, with a focus on maintaining or improving safety and performance. In addition, when models are periodically or continually trained after deployment, appropriate controls are in place to manage the risks of overfitting, unintended bias, or model degradation that may affect the safety and performance of the model as it is used by the Human-AI team.
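To make the principle on independent training and test datasets more concrete, here is a minimal Python sketch of a patient-level split. It is only an illustration and is not taken from the GMLP document: the function name `split_by_patient`, the `patient_id` and `site` fields, and the toy records are all hypothetical. The idea is to assign whole patients, not individual records, to either the training or the test set, so that no patient contributes data to both.

```python
import random
from collections import defaultdict


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Split records so that no patient appears in both train and test.

    `records` is a list of dicts with a hypothetical "patient_id" key.
    Whole patients (not individual records) are assigned to each side.
    """
    # Group every record under the patient it belongs to.
    by_patient = defaultdict(list)
    for rec in records:
        by_patient[rec["patient_id"]].append(rec)

    # Shuffle patient IDs reproducibly, then cut the list once.
    patient_ids = sorted(by_patient)
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))

    test = [r for pid in patient_ids[:n_test] for r in by_patient[pid]]
    train = [r for pid in patient_ids[n_test:] for r in by_patient[pid]]
    return train, test


# Toy data (made up for illustration): 5 patients with 4 records each.
records = [
    {"patient_id": f"P{i % 5}", "site": "A" if i % 2 else "B", "value": i}
    for i in range(20)
]

train, test = split_by_patient(records, test_fraction=0.4, seed=42)
train_patients = {r["patient_id"] for r in train}
test_patients = {r["patient_id"] for r in test}

# The key property: the two patient sets do not overlap.
assert train_patients.isdisjoint(test_patients)
print(len(train), "training records;", len(test), "test records")
```

This sketch addresses only patient-level dependence. Dependence through the data acquisition process or the clinical site would need similar grouping on those factors, for example by holding out entire sites.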
Avanthy Yeluri is a Dual Degree student at IIT Kharagpur. She has a strong interest in Data Science because of its numerous applications across a variety of industries, as well as its cutting-edge technological advancements and how they are employed in daily life.