Built-in Machine Learning Algorithms in Amazon SageMaker: How Do We Get There?

C1, September 21, 2021

When you hear the term “machine learning,” you may ask yourself, “How does machine learning really work?” Machine learning learns from historical data, that is, data recorded in the past. This data could come from databases, Hadoop systems, CSV files, or streaming sources such as a social media website.

Why Do We Need Machine Learning?

To use machine learning, we need a real-world use case, and there are millions of them. It could be soccer players running down a field, people walking into a store, the most-searched word on your website today, or any other real-world activity. We then collect data on that particular use case. As mentioned, the data may already be stored natively, or we can stream it into a data lake. There are many options.

We then need to combine that data set with some kind of machine learning model, and we do not necessarily know up front which model we are going to use. Significant experimentation goes into the process.

We want our data set and model to reflect the real-world use case, and there is an element of intuition here. The data and the model need to balance: if we use too many levels of depth in our trees with XGBoost, or if we do not keep parameters in check, the model can become too complex and overfit. An overfit model fits the training data too closely and does not generalize well to new data, as in the image below.

[Image: example of overfitting]

Flipping this over, if our data set is large and varied but the model is not large enough to capture its patterns, we will see underfitting.

[Image: example of underfitting]
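
The balance between overfitting and underfitting described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not XGBoost itself: the data is synthetic (a noisy sine curve), and the "model" is a hypothetical piecewise-constant predictor whose `depth` parameter halves every bin at each level, loosely like adding one more level of splits to a tree. All function names here are made up for the example.

```python
import math
import random


def make_data(n, seed):
    """Synthetic data (an assumption for this sketch): noisy sin(2*pi*x) on [0, 1)."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys


def fit_bins(xs, ys, depth):
    """Fit a piecewise-constant model with 2**depth equal-width bins.

    Each bin predicts the mean of the training targets that fall in it.
    """
    n_bins = 2 ** depth
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(int(x * n_bins), []).append(y)
    bin_means = {b: sum(v) / len(v) for b, v in groups.items()}
    fallback = sum(ys) / len(ys)  # used for bins that saw no training data
    return n_bins, bin_means, fallback


def predict(model, x):
    n_bins, bin_means, fallback = model
    return bin_means.get(int(x * n_bins), fallback)


def mse(model, xs, ys):
    """Mean squared error of the model on the given points."""
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


train_x, train_y = make_data(40, seed=0)
valid_x, valid_y = make_data(200, seed=1)

results = {}
for depth in (0, 2, 16):
    model = fit_bins(train_x, train_y, depth)
    results[depth] = (mse(model, train_x, train_y), mse(model, valid_x, valid_y))
    print(f"depth={depth:2d}  train MSE={results[depth][0]:.3f}  "
          f"valid MSE={results[depth][1]:.3f}")
```

At depth 0 the model predicts a single average everywhere and misses the curve (underfitting). At depth 16 almost every training point gets its own bin, so the training error collapses while the error on unseen data stays high (overfitting). A moderate depth sits between the two, which is the balance the experimentation is looking for.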

Breaking Down the Points of Consideration

There are three questions to consider:
What data do I have on this use case?
Which machine learning model should I use?
How do I frame the problem and map the data and the model onto it?

Each Algorithm Solves a Type of Prediction Problem

The image below shows the supported algorithms built into Amazon SageMaker.

An algorithm is a standardized method used to train a model. A model is a function that maps inputs to a set of predicted outcomes. The algorithm uses existing data to build that function, and this process is called training. Training on data from the real-world use case is what makes machine learning applicable to that use case and able to provide valuable insights.

[Image: algorithms built into Amazon SageMaker]
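
To make the distinction between algorithm, model, and training concrete, here is a minimal sketch in plain Python. It assumes the simplest possible case, fitting a straight line by ordinary least squares, and it is not one of the SageMaker built-in algorithms. The function name `train_linear` and the visitor and sales figures are hypothetical, chosen only for illustration.

```python
def train_linear(xs, ys):
    """The algorithm: ordinary least squares for a line y = slope * x + intercept.

    Takes existing (historical) data and returns the trained model,
    which is itself just a function from an input to a prediction.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def model(x):
        return slope * x + intercept

    return model


# Hypothetical historical data: daily store visitors and daily sales.
visitors = [10, 20, 30, 40]
sales = [25, 45, 65, 85]

model = train_linear(visitors, sales)  # training
prediction = model(50)                 # using the trained function on a new input
print(prediction)
```

Here `train_linear` plays the role of the algorithm, the call to it is training, and the returned `model` is the function that maps a new input to a predicted outcome.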

Focus on customer success using Data & Analytics

Achieve growth targets, gain competitive advantage, and provide better products and services by using ConvergeOne Data & Analytics to deliver better customer experiences and guide organizational strategy.

About the author: C1

C1 is transforming the industry by creating connected experiences that make a lasting impact on customers, our teams and our communities. More than 10,000 customers use C1 every day to help them build meaningful connections through innovative and secure experiences.
