AWS SageMaker

I have played around with AWS SageMaker a bit more recently. This is Amazon’s managed machine learning service that allows you to build and run machine learning models in the AWS public cloud. The nice thing about this is that you can productionize a machine learning solution very quickly, because the operational aspects – namely hosting the model and scaling an endpoint to allow inferences against the model – are removed. So-called ‘MLOps’ has almost become a field of its own, so abstracting all this complexity away and just focusing on the core of the problem you are trying to solve is very beneficial. Of course, like everything else in public cloud, this comes at a monetary cost, but it is well worth it if you don’t have specialists in this area, or just want to do a fast proof-of-concept.

I will discuss here the basic flow of creating a model in SageMaker – of course, some of these steps are general things that would be done as part of any machine learning project. The first setup step is to head on over to AWS and create a new Jupyter Notebook instance in AWS SageMaker; this is where the logic for training the model and deploying the ML endpoint will reside.

Assuming you have identified the problem you are trying to solve, you will need to identify the dataset you will use for training and evaluating the model. You will want to read the AWS documentation for the algorithm you choose, as it will likely require the data to be in a specific format for the training process. I have found that many of the built-in algorithms in SageMaker require data in different formats, which has been a bit frustrating. I recommend looking at the AWS SageMaker examples repository, as it has detailed examples of all the available algorithms, including walkthroughs that solve real-world problems.

Once you have gathered the dataset and put it in the correct format, and you have identified the algorithm you want to use, the next step is to kick off a training job. Your data will likely be stored on AWS S3, and as usual you would split it into training data and a hold-out set for later model evaluation. Make sure that the S3 bucket where you store your data is located in the same AWS region as your Jupyter Notebook instance, or you may see issues. SageMaker makes it very easy to kick off a training job. Let’s take a look at an example.

Here, I’m setting up a new training job for some experiments I was doing around anomaly detection using the Random Cut Forest (RCF) algorithm provided by AWS SageMaker. This is an unsupervised algorithm for detecting anomalous data points within a dataset.

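The code for this step didn’t survive here, so below is a sketch of what the setup might look like with the SageMaker Python SDK (v1-era parameter names). The bucket, prefix, hyperparameter values, and the `training_data` array are all placeholders for illustration:

```python
import sagemaker
from sagemaker import RandomCutForest

session = sagemaker.Session()
bucket = "my-sagemaker-bucket"   # placeholder: must be in the same region as the notebook
prefix = "rcf-anomaly-demo"      # placeholder S3 key prefix

rcf = RandomCutForest(
    role=sagemaker.get_execution_role(),
    train_instance_count=1,                           # number of EC2 instances for training
    train_instance_type="ml.m4.xlarge",               # EC2 instance type for training
    data_location="s3://{}/{}/train".format(bucket, prefix),
    output_path="s3://{}/{}/output".format(bucket, prefix),
    num_samples_per_tree=512,                         # RCF-specific hyperparameter
    num_trees=50,                                     # RCF-specific hyperparameter
)

# training_data is assumed to be a 2-D numpy array of the prepared dataset;
# record_set converts it to the RecordIO-protobuf format RCF expects.
rcf.fit(rcf.record_set(training_data))
```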

Above we are specifying things like the EC2 instance type we want the training to execute on, the number of EC2 instances, and the input and output locations of our data. The final parameters, where we specify the number of samples per tree and the number of trees, are specific to the RCF algorithm. These are known as hyperparameters. Each algorithm has its own hyperparameters that can be tuned – for example, see here for the list available when using RCF. When the above is executed, the training process starts and you will see some output in the console. Note that you are charged for model training time; once the job completes, you will see the number of seconds you have been billed for.

At this point, you have a model, but now you want to productionize it and allow inferences to be run against it. Of course, it is not as easy as train and deploy – I am completely ignoring the testing/validation of the model, and tuning based on that, as here I just want to show how effective SageMaker is at abstracting away the operational aspects of deploying a model. With SageMaker, you can deploy an endpoint, which is essentially your model hosted on a server with an API that allows queries to be run against it, with a prediction returned to the requester. The endpoint can be spun up in a few lines of code:

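Again the original snippet is missing; assuming an estimator like the RCF one trained above (here called `rcf`), the deploy call would look something like this:

```python
# Deploy the trained model behind a real-time HTTPS endpoint.
# The instance type/count here are for hosting, independent of training.
rcf_inference = rcf.deploy(
    initial_instance_count=1,
    instance_type="ml.m4.xlarge",
)
```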

Once you get confirmation that the endpoint is deployed – this will generally take a few minutes – you can use the predict function to run some inference, for example:

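As a sketch of the missing snippet – assuming the predictor object returned by the `deploy()` call (here called `rcf_inference`) and a numpy array `test_data` of rows to score – inference with the v1-era SDK might look like:

```python
from sagemaker.predictor import csv_serializer, json_deserializer

# Send rows as CSV; RCF returns a JSON document of anomaly scores.
rcf_inference.content_type = "text/csv"
rcf_inference.serializer = csv_serializer
rcf_inference.deserializer = json_deserializer

results = rcf_inference.predict(test_data)
scores = [datum["score"] for datum in results["scores"]]  # higher score = more anomalous
```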

Once you are done playing around with your model and endpoint, don’t forget to stop your Jupyter Notebook instance (you don’t need to delete it), and to delete any endpoints you have created, or you will continue to be charged.
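Cleaning up the endpoint is a one-liner – assuming a predictor object such as the hypothetical `rcf_inference` above:

```python
import sagemaker

# Delete the hosted endpoint so it stops incurring charges; the notebook
# instance itself can simply be stopped from the SageMaker console.
sagemaker.Session().delete_endpoint(rcf_inference.endpoint)
```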


AWS SageMaker is powerful in terms of putting the ability to create machine learning models and set up endpoints to serve requests to them in anybody’s hands. It is still a complex beast that requires knowledge of the machine learning process in order for you to be successful. However, in terms of being able to train a model quickly and put it into production, it is a very cool offering from AWS. You also get benefits like autoscaling of your endpoints, should you need to scale up to meet demand. There is a lot to learn about SageMaker, and I’m barely scratching the surface here, but if you are interested in ML I highly recommend you take a look.

Thoughts on AWS re:Invent 2018

I’ve just returned from AWS re:Invent 2018, Amazon Web Services’ yearly conference showcasing new services, features, and improvements to the AWS cloud. This was the 7th year of re:Invent, and my first time attending.

The scale of the conference is staggering – held across six different Las Vegas hotels over five days, with almost 60,000 attendees this year. I expected queues, and got them. Overall, though, the conference was well organized logistically. Provided I queued at least 30 minutes beforehand, I was able to make it to 95% of the sessions I planned on attending across the week.

In terms of the sessions themselves, most were very good. Over the week, I attended sixteen different sessions, made up of talks, demos, chalk talks, and hands-on sessions.

Two of my favorite sessions were ‘Optimizing Costs as you Scale on AWS’ and ‘AIOps: Steps Towards Autonomous Operations’. The former described the five pillars of cost optimization – Right Sizing, Increasing Elasticity, Picking the Right Pricing Model, Matching Usage to Storage Class, and Measuring and Monitoring. These may seem obvious, but they can often be forgotten – for example, when a proof-of-concept project ends up in production, or when a team is not too familiar with AWS and how costs can increase as an application’s usage scales up in production. This session also included insights from an AWS customer who talked through how they had applied and governed this model in their organization, which was interesting to compare and contrast with how I’ve seen it done in the past.

I also attended numerous sessions on SageMaker, AWS’s managed machine learning service (think AML on steroids). I’m looking forward to starting to play around with SageMaker; now that I have attended a hands-on lab, I am more confident about exploring some of the ideas I have where it could be applied. I looked at it earlier this year while completing my Master’s thesis, but ended up using Amazon Machine Learning instead in the interest of time (AML is a lot simpler to get up and running). AWS also announced Amazon SageMaker Ground Truth, which can be used to streamline the labelling process for machine learning models, via both human labelling and automated labelling. One other cool announcement around ML was the launch of the AWS Marketplace for Machine Learning, where you can browse 150+ pre-created algorithms and models that can be deployed directly to SageMaker. Someone may have already solved your problem!

If I was to retrospectively give myself some advice for attending re:Invent, it would be:

  1. Try to organize sessions by hotel. Moving hotels between sessions can take a long time (especially at some points of the day, due to Las Vegas traffic). Organizing your sessions so that you are in the same hotel for most of the day can be beneficial. A good thing, though, is that there is a regular shuttle between conference venues.
  2. Don’t assume you will make every session. Colleagues who had previously been to re:Invent gave me this advice, but I still assumed I would make everything. Traffic, queues or something else will inevitably disrupt your schedule at some point during the week.
  3. Leave time for lunch! Easy to forget when you’ve got a menu of exciting talks to attend. AWS provided a grab-n-go lunch option which was very handy to just grab something between sessions.

If I had one criticism of re:Invent, it would be that some of the talks labelled as advanced did not go as deep as I expected into the technical detail. I thought the hands-on labs did a good job of this though, especially the two I attended on AWS SageMaker.

Overall, re:Invent is a significant investment in the attendees you send (tickets are not cheap, not to mind accommodation, food, etc. – remember, it’s held in Vegas), but a good idea if you are taking first steps with AWS, looking at getting in deeper or optimizing your usage, or thinking about migrating existing on-premises services to the public cloud.

See here for a good summary of all the re:Invent announcements, as well as the keynote videos.