AWS Announces Redshift ML To Allow Users To Train Machine Learning Models With SQL

Amazon has announced the general availability of Redshift ML. Amazon Redshift already lets customers use SQL to query and combine structured and semi-structured data across data warehouses, operational databases, and data lakes; with Redshift ML, they can also create, train, and deploy machine learning models directly from an Amazon Redshift cluster.

Previously, AWS customers who wanted to use data from Amazon Redshift to train a machine learning model had to export the data to an Amazon Simple Storage Service (Amazon S3) bucket and then configure and start the training process themselves. This required a range of skills and usually more than one person, raising the barrier to entry for enterprises aiming to forecast revenue, predict customer churn, detect anomalies, and tackle similar problems.

With Redshift ML, customers can now create a model using an SQL query that specifies the training data and the output value they wish to predict. After the SQL command is executed, Redshift ML exports the data from Amazon Redshift to an S3 bucket and then calls Amazon SageMaker Autopilot to prepare the data, select an algorithm, and apply it for model training. Customers who do not wish to defer to SageMaker Autopilot can choose the algorithm themselves.
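
As a rough illustration of the workflow described above, a training request might look like the sketch below. The table customer_activity, its columns, the model and function names, the IAM role ARN, and the bucket name are all hypothetical placeholders. The statements follow the CREATE MODEL syntax as documented for Redshift ML, but the available options may change, so treat this as a starting point rather than a reference.

```sql
-- Variant 1: let SageMaker Autopilot pick the preprocessing and algorithm.
-- Every object name, the role ARN, and the bucket are placeholders.
CREATE MODEL customer_churn_auto
FROM (
    SELECT age, monthly_charges, support_calls, churn
    FROM customer_activity
    WHERE record_date < '2021-01-01'
)
TARGET churn                          -- the output value to predict
FUNCTION predict_customer_churn       -- SQL function created after training
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftMLRole'
SETTINGS (
    S3_BUCKET 'example-redshift-ml-bucket'   -- where training data is exported
);

-- Variant 2: same training data, but with the algorithm chosen explicitly
-- instead of deferring the choice to Autopilot.
CREATE MODEL customer_churn_xgb
FROM (
    SELECT age, monthly_charges, support_calls, churn
    FROM customer_activity
    WHERE record_date < '2021-01-01'
)
TARGET churn
FUNCTION predict_customer_churn_xgb
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftMLRole'
MODEL_TYPE XGBOOST
PROBLEM_TYPE BINARY_CLASSIFICATION
OBJECTIVE 'F1'
SETTINGS (
    S3_BUCKET 'example-redshift-ml-bucket'
);
```

The subquery after FROM defines the training data, and TARGET names the column the model learns to predict; every other selected column is treated as an input feature.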

All the interactions between Amazon Redshift, S3, and SageMaker, including the steps involved in training, are handled by Redshift ML. After training the model, Redshift ML uses Amazon SageMaker Neo to optimize the model for deployment and finally makes it available as an SQL function. Customers can then utilize the SQL function to apply the model to their data in queries, reports, and dashboards.
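
To make the last step concrete, here is a minimal sketch of how a trained model could be used from SQL. It assumes a model named customer_churn_auto was created with a prediction function named predict_customer_churn that takes three feature columns; those names, the customer_activity table, and its columns are hypothetical.

```sql
-- Inspect the model; it can be used for predictions once its
-- reported state is READY.
SHOW MODEL customer_churn_auto;

-- Call the generated function like any other SQL function.
-- Arguments must match the feature columns used for training,
-- in the same order.
SELECT customer_id,
       predict_customer_churn(age, monthly_charges, support_calls) AS churn_prediction
FROM customer_activity
WHERE record_date >= '2021-01-01';
```

Because the prediction is an ordinary function call, the same expression can sit inside a view or a report query that feeds a dashboard.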

Currently, Redshift ML is available in the following AWS regions: US East (Ohio), US East (N. Virginia), US West (Oregon), US West (N. California), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (Paris), Europe (Stockholm), Asia Pacific (Hong Kong), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), and South America (São Paulo).

Amazon says that with Redshift ML, customers pay only for what they use. For example, when training a new model, they pay for the Amazon SageMaker Autopilot and S3 resources used by Redshift ML. When making predictions, there is no additional cost for models imported into their Amazon Redshift cluster. Redshift ML also enables customers to use existing Amazon SageMaker endpoints for inference.
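
For the existing-endpoint case, a sketch of the registration step might look like the following. The endpoint name, function name, argument types, return type, and role ARN are hypothetical and would need to match the signature of the model actually deployed behind the SageMaker endpoint.

```sql
-- Register an already-deployed SageMaker endpoint as a SQL function.
-- No training happens here; Redshift forwards rows to the endpoint.
-- All names, types, and the role ARN below are placeholders.
CREATE MODEL remote_customer_churn
FUNCTION predict_customer_churn_remote (int, decimal(10,2), int)
RETURNS decimal(8,7)
SAGEMAKER 'example-churn-endpoint'
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftMLRole';

-- The function is then called like a locally imported model's function.
SELECT customer_id,
       predict_customer_churn_remote(age, monthly_charges, support_calls) AS churn_score
FROM customer_activity;
```

Unlike models imported into the cluster, predictions made this way run on the SageMaker endpoint, so that endpoint's own usage charges would be expected to apply.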

Amazon has recently introduced a number of high-level offerings that make its more complex services easier for developers and customers to use, and Redshift ML is another step in that direction.


Consultant Intern: Kriti Maloo is currently pursuing her B.Tech at the Indian Institute of Technology (IIT) Bhubaneswar. She is interested in data analytics and its applications in various domains. She is a bibliophile and loves to explore new advancements in technology.