Kedro, A Python Machine Learning Pipeline Framework Developed By McKinsey, Has Been Donated To The Linux Foundation

According to the Linux Foundation, McKinsey’s QuantumBlack is donating Kedro, a machine learning pipeline tool, to the open-source community. The Linux Foundation, a non-profit organization that provides a vendor-neutral home for open source projects, will maintain Kedro under its umbrella organization, the Linux Foundation AI & Data (LF AI & Data), which was created in 2018 to encourage AI innovation by fostering technical initiatives, developer communities, and enterprises.

“In LF AI & Data, we are delighted to welcome the Kedro project. It tackles the myriad issues that exist today in producing machine learning solutions. It perfectly complements our portfolio of hosted technical projects,” stated Ibrahim Haddad, LF AI & Data’s Executive Director. He added that the organization is excited to engage with the community to expand the project’s reach and offer new avenues for cooperation with its members, hosted projects, and the broader open source community.

The ML Pipelines

A machine learning pipeline orchestrates the flow of data into and out of a machine learning model. A pipeline covers the raw data, the data processing steps, the model’s predictions, and the parameters that tune the model’s behavior, codifying the workflow so that it can be shared throughout an organization.

Several tools exist for building machine learning pipelines, and Kedro is a relatively recent one. According to McKinsey, it is a Python framework that borrows principles from software engineering and applies them to data science, laying the groundwork for taking a project from an idea to a finished product.

According to Yetunde Dada, Kedro’s product lead, Kedro was created to overcome the critical flaws of one-off scripts and “glue code” by focusing on maintainable and efficient data science code. Modularity was built in partly to encourage reusable analytics code and to promote collaboration within teams.
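To make the idea of a modular pipeline concrete, here is a minimal sketch in plain Python. It is not Kedro’s API: the names `Node`, `run_pipeline`, `clean`, `fit_mean`, and `predict` are hypothetical and chosen only for illustration, and the “model” is just a mean. The sketch assumes each step is a small function that reads named inputs from a shared dictionary and writes one named output back to it.

```python
from typing import Any, Callable, Dict, List, NamedTuple, Optional


class Node(NamedTuple):
    """One pipeline step: a function plus the names of its inputs and output."""
    func: Callable[..., Any]
    inputs: List[str]
    output: str


def run_pipeline(nodes: List[Node], catalog: Dict[str, Any]) -> Dict[str, Any]:
    """Run the nodes in order, reading inputs from and writing outputs to a dict."""
    data = dict(catalog)  # copy so the caller's dictionary is not mutated
    for n in nodes:
        args = [data[name] for name in n.inputs]
        data[n.output] = n.func(*args)
    return data


def clean(raw: List[Optional[float]]) -> List[float]:
    """Data processing step: drop missing values (represented here as None)."""
    return [x for x in raw if x is not None]


def fit_mean(values: List[float]) -> float:
    """Stand-in for model training: the 'model' is simply the mean."""
    return sum(values) / len(values)


def predict(model: float, horizon: int) -> List[float]:
    """Stand-in for forecasting: repeat the mean for `horizon` steps."""
    return [model] * horizon


# Hypothetical three-step pipeline: raw data -> cleaned data -> model -> forecast
pipeline = [
    Node(clean, ["raw"], "cleaned"),
    Node(fit_mean, ["cleaned"], "model"),
    Node(predict, ["model", "horizon"], "forecast"),
]

result = run_pipeline(pipeline, {"raw": [1.0, None, 3.0], "horizon": 2})
print(result["forecast"])  # [2.0, 2.0]
```

Because each step is an ordinary function with named inputs and outputs, it can be tested on its own and reused in another pipeline, which is the kind of modularity the paragraph above describes.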

In the two and a half years it has been available on GitHub, Kedro’s community and user base have grown to over 200,000 monthly downloads and more than 100 contributors. Telkomsel, Indonesia’s largest wireless network operator, uses Kedro as the standard in its data science organization.

Use In The Future

Open-source software has become commonplace in businesses and is increasingly used in mission-critical settings. According to a 2021 Red Hat survey, 79% of firms expect their adoption of open-source software for new technologies to rise over the next two years, despite concerns about software integrity, particularly in light of recent events.

Going forward, Kedro will focus on three goals: reaching a stable API (version 1.0), providing standard interfaces to developer tools and cloud platforms, and continuing to improve its experiment tracking capabilities. With a stable API, users can be confident that upgrading to newer versions of Kedro and taking advantage of new features will be simple. Kedro currently offers basic integrations with a variety of cloud service providers, and the team wants to collaborate with those providers to make the integrations seamless. Experiment tracking, a means for data scientists to monitor their data science experiments, has opened the way for users to identify and promote production models. Based on user feedback, the team will expand this functionality with many more features.

Kedro joins Microsoft’s SynapseML, an open-source pipeline product released in November. As with Kedro, developers can use SynapseML to build systems that address cross-domain problems, including text analysis, translation, and speech processing.

Prathamesh Ingle is a Mechanical Engineer and works as a Data Analyst. He is also an AI practitioner and certified Data Scientist with an interest in applications of AI. He is enthusiastic about exploring new technologies and advancements and their real-life applications.