
In the last section, we explored how to identify a set of models that might help us solve a business problem. But just knowing which algorithms exist is not enough. The real challenge—and opportunity—comes in the next stage of the journey: training these models, testing their performance, improving them, and ultimately selecting the one that delivers the most value for your business.
Think of this stage as the playoffs of a sports league. Multiple teams (models) have qualified, but now they must compete head-to-head to see who makes it to the finals and who takes home the trophy.
Now that we have identified a set of models to try, we can move on to the next steps in the machine learning pipeline, Steps 6 through 9: training the ML models (Step 6), reporting their results on evaluation metrics (Step 7), improving them to enhance performance (Step 8), and identifying the winning model (Step 9).
This is where machine learning becomes real. The difference between a model that works in theory and one that creates business value comes down to how well we execute this stage.
Step 6: Train the ML Models: First, we have to train the model, much as we would teach a two-year-old a new task. Let us understand what training a model means.
Think of training a model like teaching a two-year-old to identify animals. You show the child many examples of cats, dogs, and birds and name each one correctly. Over time, the child learns to associate the right labels with the right features. A machine learning model works the same way: it learns from historical examples where the "correct answer" is already known.
Splitting the Dataset: Training vs Testing
Training starts with splitting the dataset into two (in some cases, three) parts:

- Training set (typically 70 to 80 percent of the data): the examples the model learns from.
- Test set (the remainder): held-out examples used to check how well the model performs on data it has never seen.
- Validation set (optional middle slice): used while building the model to tune settings and compare candidates without touching the test set.
This process ensures that the model doesn’t just memorize the training data (which would lead to overfitting), but learns general patterns that apply broadly—even to data it hasn’t seen before.
Executive Insight: If a model performs extremely well on the training set but poorly on the test set, it's like a sales associate who can ace role-play sessions but fails miserably in front of a real customer.
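To make the split concrete, here is a minimal sketch using scikit-learn. The dataset file and the column name "sale_price" are illustrative assumptions, not a fixed schema:

```python
# A minimal sketch of a train/validation/test split using scikit-learn.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("home_sales.csv")        # hypothetical dataset
X = df.drop(columns=["sale_price"])       # features
y = df["sale_price"]                      # target

# First carve out a 15% test set, kept untouched until the very end.
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.15, random_state=42)

# Split the remainder into roughly 70% train / 15% validation overall.
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.15 / 0.85, random_state=42)
```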
To illustrate the concept of training, let us discuss a business case. We will revisit the house price prediction example introduced earlier as our working case study.
1. Frame the business question. Predict the sale price of a home (in dollars) before it sells, so that as a prospective buyer in the market, we can bid the right price for the house.
2. Define target and features. Target: the final sale price in dollars. Features: attributes known before the sale, such as square footage, number of bedrooms and bathrooms, lot size, year built, and neighborhood.
3. Assemble and clean data. Join MLS and tax records; deduplicate homes; fix obvious errors (e.g., 99 bedrooms); handle missing values; standardize units; encode categorical fields (e.g., one-hot encoding for neighborhood).
4. Split the data. Training (~70%), validation (~15%), test (~15%), or use K-fold cross-validation. Keep the test set untouched until the very end.
5. Establish a baseline. Predict the median price per square foot by neighborhood (or fit a simple linear model). Record MAE/RMSE so you know whether ML beats a simple rule (see the baseline sketch after this list).
6. Choose a loss function aligned to the cost of errors. Common choices include MAE (every dollar of error counts equally), RMSE (large mispricings are penalized disproportionately), and Huber loss (a compromise between the two). Pick the one that best reflects your business penalty for big mispricing.
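To make step 5 concrete, here is a minimal baseline sketch reusing the splits from the earlier example. The column names "sqft" and "neighborhood" are illustrative assumptions:

```python
# A baseline sketch: median price per square foot by neighborhood.
from sklearn.metrics import mean_absolute_error

# Learn one number per neighborhood from the training data only.
ppsf = (y_train / X_train["sqft"]).groupby(X_train["neighborhood"]).median()

# Predict: neighborhood median price per sqft times the home's size.
# Fall back to the overall median for neighborhoods unseen in training.
pred = X_val["neighborhood"].map(ppsf).fillna(ppsf.median()) * X_val["sqft"]

baseline_mae = mean_absolute_error(y_val, pred)
print(f"Baseline MAE: ${baseline_mae:,.0f}")
```

If a trained model cannot beat this MAE, the extra complexity of ML is not yet paying for itself.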
Now, with our data prepared, we train the model. This means feeding the training data into the algorithm so it can learn the relationships between inputs (like House Features) and outputs (like House Prices).
The goal is to minimize the difference between the model's predictions and the actual results, a process guided by a mathematical function called a loss function or objective function. We discussed this in more detail in WDIS AI-ML Series: Module 2, Lesson 1: Objective function - AI is nothing but an optimization problem.
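With the data prepared, training itself is often only a few lines. A minimal sketch, reusing the splits from earlier and assuming the features have already been cleaned and encoded as numbers (step 3):

```python
# Fit a simple regression model and measure its error on the validation set.
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

model = LinearRegression()
model.fit(X_train, y_train)   # the algorithm minimizes its loss on the training data

val_pred = model.predict(X_val)
print(f"Validation MAE: ${mean_absolute_error(y_val, val_pred):,.0f}")
```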
We configure each model with specific settings, or hyperparameters (such as learning rate, number of layers, and regularization strength), which greatly influence performance. Throughout this process, we watch for two common pitfalls: overfitting and underfitting.

We will dedicate a separate chapter to overfitting and underfitting later. Until then, it is enough to know that overfitting means the model has learned the training data too well but does poorly on the test data, while underfitting means the model has not captured the patterns in the training data well enough, as shown in the image below.

[Figure: underfitting vs. a good fit vs. overfitting]

Example: If we're building a churn model, we would feed historical customer records into the training phase. The model then learns patterns that differentiate customers who stayed from those who left. This trained model can later predict which current customers might be at risk of leaving based on their latest behavior.

In the next step, we'll explore how to measure how well each model has learned and begin the process of selecting a winner.
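Before moving on, here is a minimal sketch, assuming the scikit-learn API and the splits from the earlier examples, of how overfitting shows up in practice: a model that scores far better on training data than on validation data.

```python
# Spotting overfitting: compare training error with validation error.
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

for depth in [2, 5, None]:   # None = grow the tree until it memorizes
    tree = DecisionTreeRegressor(max_depth=depth, random_state=42)
    tree.fit(X_train, y_train)
    train_mae = mean_absolute_error(y_train, tree.predict(X_train))
    val_mae = mean_absolute_error(y_val, tree.predict(X_val))
    # A tiny training error with a much larger validation error signals
    # overfitting; large errors on both signal underfitting.
    print(f"max_depth={depth}: train MAE={train_mae:,.0f}, val MAE={val_mae:,.0f}")
```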
Step 7: Report the Results of ML Models on Evaluation Metrics: Once trained, each model is scored on held-out data using metrics that match the problem, for example MAE, RMSE, and R² for regression, or accuracy, precision, and recall for classification.
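As an illustration, a minimal sketch of reporting regression metrics with scikit-learn, reusing the model and validation split from the earlier sketches:

```python
# Report standard regression metrics for a trained model.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

val_pred = model.predict(X_val)
mae = mean_absolute_error(y_val, val_pred)
rmse = np.sqrt(mean_squared_error(y_val, val_pred))
r2 = r2_score(y_val, val_pred)
print(f"MAE: ${mae:,.0f} | RMSE: ${rmse:,.0f} | R²: {r2:.3f}")
```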
Step 8: Improve ML Models to Enhance Performance: Improvement typically means tuning hyperparameters, engineering better features, or trying different algorithms, always measuring progress on the validation data rather than the test set.
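One common improvement technique is a hyperparameter search. A minimal sketch using scikit-learn's grid search with cross-validation; the model choice and parameter grid are illustrative assumptions:

```python
# A minimal hyperparameter-tuning sketch: grid search with cross-validation.
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [5, 10, None],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    scoring="neg_mean_absolute_error",  # optimize the metric the business cares about
    cv=5,                               # 5-fold cross-validation on the training data
)
search.fit(X_train, y_train)
print("Best settings:", search.best_params_)
print(f"Best CV MAE: ${-search.best_score_:,.0f}")
```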
Step 9: Identify the Winning Model: After improvement, the finalists are compared once on the untouched test set, and the winner is the model that best balances accuracy, reliability, and business fit.
Example: Finalizing a customer churn prediction model after verifying its accuracy and reliability across diverse customer segments.
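Returning to the house-price case, a minimal sketch of that final comparison, reusing the models from the earlier sketches:

```python
# Compare the tuned finalists once on the untouched test set.
from sklearn.metrics import mean_absolute_error

finalists = {
    "linear_regression": model,                      # from the training sketch
    "tuned_random_forest": search.best_estimator_,   # from the tuning sketch
}
for name, candidate in finalists.items():
    test_mae = mean_absolute_error(y_test, candidate.predict(X_test))
    print(f"{name}: test MAE = ${test_mae:,.0f}")
# The winner is not automatically the lowest-MAE model; weigh interpretability,
# latency, and maintenance cost alongside raw accuracy.
```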
Model Drift Definition: Model drift occurs when the predictive performance of a machine learning model deteriorates over time because the underlying data patterns or distributions have changed. Regular monitoring catches drift early and keeps the model accurate, reliable, and aligned with business objectives.
Understanding and proactively managing model drift helps businesses keep their machine learning models effective, trustworthy, and valuable over the long term, safeguarding business decisions and strategic investments.
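One simple way to operationalize drift monitoring, sketched below under the assumption that actual outcomes eventually become available for comparison; the function name and the 20% threshold are illustrative, not a standard:

```python
# A minimal drift-monitoring sketch: alert when recent live error rises
# well above the error measured at deployment time.
from sklearn.metrics import mean_absolute_error

def check_for_drift(y_recent, y_recent_pred, baseline_mae, tolerance=1.2):
    """Flag drift if recent MAE exceeds the deployment-time MAE by 20%."""
    recent_mae = mean_absolute_error(y_recent, y_recent_pred)
    if recent_mae > tolerance * baseline_mae:
        print(f"Drift warning: MAE rose from {baseline_mae:,.0f} to {recent_mae:,.0f}")
        return True
    return False
```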
Training and testing is where machine learning transitions from experimentation to impact. The best model is not the most complex one—it is the one that performs reliably, aligns with business costs, and continues delivering value over time.
Understanding this stage is what separates organizations that build models from organizations that build durable AI capabilities.
