What Do Machine Learning Engineers Do?

Machine learning engineers are often considered data scientists, but the two roles have different responsibilities.

We’ve touched on the difference between the roles in a previous article. This time, we want to give you a better idea of what machine learning engineers do and how they add value to your business.

To illustrate, we will use the example of a non-emergency medical transportation (NEMT) business. An NEMT company fills the gap between a taxi service and an ambulance. NEMT providers serve patients who need to get to medical appointments but have special needs that conventional transportation can’t meet. Because these appointments aren’t emergencies, the patients don’t need an expensive ambulance.

The Value of Machine Learning Engineers

The role of machine learning engineers is to use machine learning (ML) to add value to your business or product. But what does that mean in real-world terms?

In most cases, machine learning engineers build features for products or automate workflows. Their work might be building decision support systems or an entire decision automation system.

Let’s look at our example.

NEMT businesses either maintain a fleet of owned vehicles or contract with freelance NEMT drivers. As you can imagine, one of an NEMT business’s core functions is ensuring passengers are picked up and dropped off at their medical appointments on time.

That means dispatch must have up-to-the-minute data about the route each driver will take to get to the passenger and the route from the passenger’s location to their destination. The drivers also need this data to avoid traffic backups that occur en route. Passengers need this knowledge so they can be ready when the driver arrives.

Providing this information involves real-time data analysis of traffic flow, road conditions, weather impact, and more. And it needs to be done for several routes simultaneously.

With a simple rule-based system, a software engineer would have to anticipate every possible factor and write code for each one. With so many variables interacting, writing a rule for every combination isn’t practical.

However, a machine learning engineer can build a model that learns the relationships between those variables from historical data and then uses them to make predictions. The ML solution just needs to be fed the necessary data.

In a nutshell, machine learning engineers build systems that quickly make predictions for your business and perform other tasks that would be next to impossible to do manually.

Machine Learning Engineers’ Responsibilities

What goes into building ML systems? In other words, what does a machine learning engineer’s job encompass?

Continuing with our example, let’s look at what an ML engineer does to build and maintain our NEMT company’s ride request app.

Building the Model

The ML engineer starts by choosing and preparing the necessary data. The system needs to know the requested pickup time, the distance from the driver to the customer, the average vehicle speed on each road, the current weather conditions, and traffic congestion along the route, plus perhaps a few other variables.

Each variable becomes a feature. Features are data attributes that a model uses to predict results. To get this data, an ML engineer analyzes records on past pickups that contain those variables.

After selecting the right data and consolidating it, the ML engineer removes any errors, fills in missing entries, and transforms records into a single format. When the data is ready, the ML engineer chooses an algorithm that fits the task. The most effective algorithm depends on the type of data, the predictive accuracy required, and how resource-intensive the model is to train and run.
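
The cleanup step above could look something like the following minimal sketch. It uses only Python’s standard library, and the record fields (`requested`, `distance_km`, `weather`), the sample values, and the choice to fill gaps with the median are all assumptions made for illustration, not details of a real NEMT system.

```python
from datetime import datetime
from statistics import median

# Hypothetical raw pickup records; field names and values are invented.
RAW_RECORDS = [
    {"requested": "2023-03-01 08:00", "distance_km": "4.2", "weather": "Rain"},
    {"requested": "01/03/2023 09:30", "distance_km": "", "weather": "clear"},
    {"requested": "2023-03-01 10:15", "distance_km": "-1", "weather": " CLEAR "},
    {"requested": "2023-03-01 11:00", "distance_km": "7.8", "weather": "rain"},
]

# Timestamp formats we assume appear in the historical data.
TIME_FORMATS = ("%Y-%m-%d %H:%M", "%d/%m/%Y %H:%M")


def parse_time(text):
    """Try each known timestamp format and return a datetime."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp: {text!r}")


def clean_records(raw):
    """Return records in a single format, with bad distances filled in."""
    parsed = []
    for row in raw:
        try:
            distance = float(row["distance_km"])
        except ValueError:
            distance = None  # missing entry
        if distance is not None and distance <= 0:
            distance = None  # a non-positive distance is treated as an error
        parsed.append({
            "requested": parse_time(row["requested"]),
            "distance_km": distance,
            "weather": row["weather"].strip().lower(),
        })
    # Fill missing or invalid distances with the median of the valid ones.
    known = [r["distance_km"] for r in parsed if r["distance_km"] is not None]
    fill = median(known)
    for record in parsed:
        if record["distance_km"] is None:
            record["distance_km"] = fill
    return parsed
```

Calling `clean_records(RAW_RECORDS)` returns four records with parsed timestamps, lowercase weather labels, and no missing distances. Whether the median is a sensible fill value depends on the data; it is used here only to keep the sketch short.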

For example, deep neural networks are the usual choice if you want to process images and video with high accuracy. However, training them often requires renting clusters of GPUs, and running those models in production may require specialized AI-optimized processing units.

A standard decision tree might be all you need if your task doesn’t call for that kind of model. The ML engineer would experiment with several models on a subset of the data to determine which one best fits your business requirements.
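
As a rough sketch of that kind of experiment, the code below compares two very simple candidates on a small, made-up subset of data: a baseline that always predicts the average pickup time, and a one-split decision tree (a "stump"). The data, the function names, and the use of mean absolute error as the yardstick are assumptions for illustration; a real project would likely use a library such as Scikit-learn and far more data.

```python
from statistics import mean

# Hypothetical (distance_km, minutes_to_pickup) pairs.
TRAIN = [(1.0, 5.0), (2.0, 7.0), (3.0, 9.0),
         (8.0, 21.0), (9.0, 24.0), (10.0, 25.0)]
VALID = [(2.5, 8.0), (9.5, 24.0)]


def fit_mean(train):
    """Baseline: always predict the average pickup time."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_stump(train):
    """One-split decision tree: choose the threshold with the lowest squared error."""
    xs = sorted({x for x, _ in train})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in train if x <= threshold]
        right = [y for x, y in train if x > threshold]
        left_avg, right_avg = mean(left), mean(right)
        sse = (sum((y - left_avg) ** 2 for y in left)
               + sum((y - right_avg) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_avg, right_avg)
    _, threshold, left_avg, right_avg = best
    return lambda x: left_avg if x <= threshold else right_avg


def mean_abs_error(model, data):
    """Average absolute difference between predictions and actual values."""
    return mean(abs(model(x) - y) for x, y in data)


def pick_best(candidates, train, valid):
    """Fit every candidate on train and return the name with the lowest validation error."""
    scores = {name: mean_abs_error(fit(train), valid)
              for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


best_name, scores = pick_best({"mean": fit_mean, "stump": fit_stump}, TRAIN, VALID)
```

On this toy data the stump comes out ahead of the baseline, but the point of the sketch is the workflow: fit each candidate the same way, score each on data it was not fitted on, and let the numbers inform the choice.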

Training and Deploying the Model

Once the ML engineer selects a model, they have to train it. During training, the model learns to make predictions by finding patterns in the training data set. The ML engineer also needs a testing set of historical data that the model never saw during training to determine whether it makes accurate predictions.
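
A minimal sketch of this train-then-test cycle, assuming a single feature (distance) and a straight-line model fitted by least squares, might look like this. The records, the 25% test fraction, and the function names are hypothetical; the key idea is that the error is measured only on records held back from training.

```python
import random

# Hypothetical historical records: (distance_km, minutes_to_pickup).
RECORDS = [(1, 5.2), (2, 6.9), (3, 9.1), (4, 10.8),
           (5, 13.1), (6, 15.0), (7, 16.9), (8, 19.2)]


def split_records(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and hold back a fraction for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_line(train):
    """Least-squares fit of minutes = slope * distance + intercept."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in train)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mean_abs_error(params, data):
    """Average absolute prediction error, in minutes, on the given records."""
    slope, intercept = params
    return sum(abs(slope * x + intercept - y) for x, y in data) / len(data)


train_set, test_set = split_records(RECORDS)
model = fit_line(train_set)
test_error = mean_abs_error(model, test_set)
```

If `test_error` is much larger than the error on the training records, the model has likely memorized the training data rather than learned a pattern that generalizes.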

Once the model passes the tests, the ML engineer prepares it for deployment. For our NEMT business, this involves three applications: one for dispatch, a second one for drivers, and a third for customers. There’s also the server, which holds all the backend logic.

Machine learning models are often deployed as a microservice: a small, standalone service, typically packaged in an isolated container, that other applications call over the network. To deploy our NEMT model, the ML engineer wraps it in a container, installs it on the server, and then connects the model to data sources.

The applications feed it some of the data, such as driver and customer geolocation and the vehicle’s current speed. The model also needs data on traffic flow, delays, and weather, which comes from other sources.

From this point, the model consumes the required data, calculates a prediction, and sends it back to the users. But the ML engineer’s job doesn’t end there.
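
To make the request-and-response loop concrete, here is a minimal sketch of a prediction service built on Python’s standard `http.server` module. Everything specific in it is an assumption: `predict_minutes` is a stand-in with made-up coefficients rather than a trained model, and the JSON field names and port 8080 are invented for the example. A production service would more likely use a dedicated web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict_minutes(distance_km, traffic_delay_min):
    """Stand-in for the trained model; the coefficients are made up."""
    return 2.0 * distance_km + 3.0 + traffic_delay_min


def handle_request(body):
    """Turn a JSON request body into (HTTP status, JSON response body)."""
    try:
        payload = json.loads(body)
        eta = predict_minutes(float(payload["distance_km"]),
                              float(payload["traffic_delay_min"]))
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "bad request"}).encode()
    return 200, json.dumps({"eta_minutes": eta}).encode()


class PredictionHandler(BaseHTTPRequestHandler):
    """Receives POST requests from the apps and replies with a prediction."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_request(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)


def serve(port=8080):
    """Start the service; a container would run this as its entry point."""
    HTTPServer(("0.0.0.0", port), PredictionHandler).serve_forever()
```

Keeping the prediction logic in `handle_request`, separate from the HTTP plumbing, lets the engineer test it without starting a server, for example with `handle_request(b'{"distance_km": 4, "traffic_delay_min": 2}')`.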

Monitoring the Model

Even though they tested the model on historical data, the ML engineer doesn’t know how well it works with real-time data. Therefore, they need to track the model’s performance. Monitoring and evaluating the model is a big part of a machine learning engineer’s role.

Perhaps our NEMT model predicted a ride would arrive at 8:04, but it actually didn’t arrive until 8:15. To track performance, the ML engineer sets up a monitoring infrastructure to compare real-world data to the model’s predictions. This process is ongoing for the life of the model because its accuracy may change over time.
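
The comparison itself can be quite simple. The sketch below logs predicted and actual arrival times and summarizes how far off the predictions were. The first entry is the 8:04 versus 8:15 ride from above; the date and the other two rides are hypothetical, as are the function names.

```python
from datetime import datetime

# (predicted arrival, actual arrival). Only the 8:04 vs. 8:15 pair comes from
# the example in the text; the date and the other rides are invented.
RIDE_LOG = [
    (datetime(2023, 3, 1, 8, 4), datetime(2023, 3, 1, 8, 15)),
    (datetime(2023, 3, 1, 9, 30), datetime(2023, 3, 1, 9, 28)),
    (datetime(2023, 3, 1, 10, 0), datetime(2023, 3, 1, 10, 5)),
]


def error_minutes(predicted, actual):
    """Positive when the ride arrived later than predicted."""
    return (actual - predicted).total_seconds() / 60


def mean_abs_error(log):
    """Average size of the prediction error across all logged rides."""
    errors = [abs(error_minutes(p, a)) for p, a in log]
    return sum(errors) / len(errors)
```

Here the first ride was 11 minutes late, and the average error across the three logged rides is 6 minutes. Tracking that average over time is what reveals whether the model’s accuracy is holding steady or slipping.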

Monitoring systems tell ML engineers if the model performs well or needs retraining. And retraining doesn’t mean the model is flawed or built wrong. It may just mean that it’s using outdated information to make predictions.

For example, if a major road closes for repair, the detour may delay drivers. Without this updated data, our NEMT model’s pickup time predictions become less accurate. If the ML engineer has set up the monitoring system correctly, it will fire off an alert, at which point the engineer will retrain the model with updated data.
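
One simple way to produce such an alert, sketched below, is to keep a rolling window of recent prediction errors and flag the model when their average crosses a threshold. The `DriftMonitor` class, the window size, and the 5-minute threshold are assumptions chosen for illustration, not recommended values.

```python
from collections import deque


class DriftMonitor:
    """Flags the model for retraining when recent errors grow too large."""

    def __init__(self, window=5, threshold_min=5.0):
        # Only the most recent `window` errors are kept.
        self.errors = deque(maxlen=window)
        self.threshold_min = threshold_min

    def rolling_mae(self):
        """Mean absolute error over the rides currently in the window."""
        return sum(self.errors) / len(self.errors)

    def record(self, error_minutes):
        """Log one ride's error and return True if an alert should fire."""
        self.errors.append(abs(error_minutes))
        window_full = len(self.errors) == self.errors.maxlen
        return window_full and self.rolling_mae() > self.threshold_min


# Errors in minutes: small at first, then large after a hypothetical road closure.
monitor = DriftMonitor(window=3, threshold_min=5.0)
alerts = [monitor.record(e) for e in [1, -2, 1, 9, 12, 10]]
```

With these numbers, no alert fires while the errors stay small, and the monitor starts flagging rides once most of the window is filled with large errors. Waiting for a full window avoids alerting on a single unusually late ride.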

Since real-world conditions change constantly, retraining often becomes a routine task for a machine learning engineer, in some cases a daily one.

Machine Learning Engineer’s Skill Set

From initial data analysis to training, deploying, and retraining the model, an ML engineer covers the whole machine learning part of your product. That means they need a specific background and skill set.

To begin with, they must be well-versed in statistics, data analysis, and applied mathematics. They also need to know existing ML algorithms and common architectures, such as decision trees, deep neural networks, support vector machines, and Naive Bayes.

To train the models, machine learning engineers must be familiar with standard tools, including:

  • Python, the primary programming language used in data science
  • R, an integrated suite of software facilities for data manipulation, calculation, and graphical display
  • Java and C++, two languages used to run models on big data servers
  • Scikit-learn, a Python-based library with several machine learning algorithms
  • TensorFlow, a free and open-source software library for machine learning and artificial intelligence with a focus on training and inference of deep neural networks
  • Hadoop, a distributed computing framework
  • Apache Spark, a data processing tool
  • NVIDIA CUDA, a parallel GPU computing platform for deep learning

Machine learning engineers usually start in software development, although they may also begin in data science and analytics. Either way, they have a background in computer science.

ML engineers share some of the responsibilities of data scientists, but where data scientists focus more on analytics, ML engineers are more involved in production. Our NEMT company might hire a data scientist to explore new markets or areas of expansion, for example, while its ML engineer focuses on improving product performance.

Likewise, ML engineers share responsibilities with data engineers. The difference here is that the ML engineer defines the database specifications. In contrast, the data engineer uses those specs to upload data into the database and connect it with the model.

Need a Machine Learning Engineer?

How do you know if you need to hire a machine learning engineer? Products that require machine learning are those that predict future conditions (such as our NEMT app), offer product suggestions based on previous purchases (like Amazon), or try to predict customer behavior and preferences in some other way (for example, suggesting media based on videos previously watched or songs listened to).

If that sounds like the kind of product you want to build, chances are you’ll need to hire a machine learning engineer. Or you can outsource the work to an experienced team, like the experts at Taazaa.

If you’re still not sure about your product development needs, we can help. Contact Taazaa today!

Bidhan Baruah

Bidhan is the Co-founder and Chief Operating Officer of Taazaa. He is well versed in outsourcing and off-shoring, and loves building and growing startup teams. A true Apple lover, he loves trying different phones and tablets whenever he gets time.