Forem: Anastasiia Molodoria

How To Apply Machine Learning To Demand Forecasting

Anastasiia Molodoria — Fri, 04 Nov 2022 17:58:27 +0000

Demand and sales forecasting are of paramount importance in retail. Without forecasting, companies upset their inventory balance by ordering too many or too few products for a given period. A surplus forces a company to offer discounts to sell products or to face overstock issues. A shortage, in turn, results in lost profits. Demand and sales forecasting addresses both problems: it increases the return on inventory and estimates how likely future consumers are to buy a specific product at a specific price.

Let’s go through the process of implementing ML forecasting in retail together, discovering the key steps of building forecasting software.

Benefits of ML Demand Forecasting for Business

Looming uncertainty and changes in the market lead to highly volatile data. Unlike traditional methods, demand forecasting using machine learning is more flexible and allows the quick infusion of new information into models. That’s why ML models are adaptive and accurate enough to bring obvious benefits to the business:

Increase in sales. All needed products will be available in the store, so customers can purchase them without waiting for long delivery times.
Customer satisfaction maintenance. Warehouses will plan purchases in advance, so customers won’t face the problem of their favourite product’s absence.
Higher inventory turnover. Thanks to proper planning of goods in warehouses, slow-moving goods will not sit in stock and go stale.
Fewer spoiled products. Demand forecasting helps plan product deliveries with expiration dates in mind.
Reduced personnel costs. By analyzing and predicting future demand, we can plan an optimal number of employees for proper shift support.

ML demand forecasting methods, like other use cases of machine learning forecasting, can rely on a tremendous amount of data to make accurate predictions. However, the question of how to develop such models remains open, and we will consider it in the following section.

How to Develop an ML-Based Demand Forecasting Software

Before embarking on demand forecasting model development, you should understand the workflow of ML modelling. This offers a data-driven roadmap of how to optimize cooperation with software developers. Let’s review the process of how AI engineers at MobiDev approach ML demand forecasting tasks.

STEP 1. BRIEF DATA REVIEW

The first step when initiating the demand forecasting project is to provide the client with meaningful insights. The process includes the following steps:

Gather available data
Briefly review the data structure, accuracy, and consistency
Run a few data tests and pilots
Look through a statistical summary

In our experience, a few days is enough to understand the current situation and outline possible solutions.
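As an illustration of the "statistical summary" item above, here is a minimal sketch in plain Python. The `summarize` helper and the sample numbers are hypothetical; a real review would run on the client's actual data, typically with a data analysis library.

```python
import statistics

def summarize(values):
    """Return basic descriptive statistics; None marks a missing observation."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "stdev": statistics.stdev(present) if len(present) > 1 else 0.0,
    }

# Hypothetical daily unit sales for one product (one day is missing).
daily_sales = [12, 15, None, 14, 40, 13, 16]
summary = summarize(daily_sales)
print(summary)
```

A large gap between mean and median, as in this sample, is an early hint that the data contains spikes worth a closer look.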

STEP 2. SETTING BUSINESS GOALS AND SUCCESS METRICS

Each project is unique and has its own business goals. Therefore, this stage is key in creating an effective forecasting solution since it provides the starting point of the development process and outlines the following stages.

Before coming to the stage of developing a demand forecasting solution, a software development team needs to agree with the client/business owner on the success metrics for the model’s results evaluation. Success metrics offer a clear definition of what is “valuable” within demand forecasting. A typical message might state:

“I need a machine learning solution that predicts demand for […] products, for the next [week/month/a half-a-year/year], with […]% accuracy.”

This statement example will help you to identify what your success metrics will look like. You are expected to consider the following information:

Product Types / Categories What types of products/product categories will you forecast?

Different products/services should be considered and predicted independently for most cases. For example, the demand forecast for perishable products and subscription services coming at the same time each month will likely be different.

Time Frame What is the length of time for the demand forecast?

Short-term forecasts are commonly done for less than 12 months – 1 week/1 month/6 months.

These forecasts may have the following purposes:

Uninterrupted supply of products/services
Sales target setting and evaluating sales performance
Optimization of prices according to market fluctuations and inflation

Long-term forecasts are completed for periods longer than a year. The main purposes of long-term forecasts may include the following:

Long-term financial planning and funds acquisition
Decision-making regarding the expansion of business
Annual strategic planning

Accuracy What is the minimum expected percentage of demand forecast accuracy for making informed decisions?

Implementing retail software development projects, we were able to reach an average accuracy level of 95.96% for positions with enough data. The minimum required forecast accuracy level is set depending on your business goals.

Examples of metrics to measure the forecast accuracy are MAPE (Mean Absolute Percentage Error), MAE (Mean Absolute Error), or custom metrics.
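For reference, the two named metrics can be sketched in a few lines of Python. The sample numbers are hypothetical. Reporting "accuracy" as 100 minus MAPE is one common convention and an assumption here; the way the accuracy figure quoted above was computed is not stated.

```python
def mae(actual, predicted):
    """Mean Absolute Error, in the same units as the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Periods with zero actual demand are skipped because the
    percentage error is undefined for them.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

# Hypothetical weekly demand and forecasts for one product.
actual = [100, 80, 120, 50]
predicted = [90, 84, 126, 55]

print(mae(actual, predicted))   # average miss in units
print(mape(actual, predicted))  # average miss in percent
```

MAE is easier to read for a single product, while MAPE makes products with different sales volumes comparable.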

STEP 3. DATA UNDERSTANDING & PREPARATION

Regardless of what we’d like to predict, data quality is a critical component of an accurate demand forecast. The following data could be used for building forecasting models:

When building a forecasting model, the data is evaluated according to the following parameters:

Consistency
Accuracy
Validity
Relevance
Accessibility
Completeness
Level of detail (granularity)

In reality, the data collected by companies often isn’t ideal. It usually needs to be cleaned, analyzed for gaps and anomalies, checked for relevance, and restored. That’s why data science consultants can be involved at this stage.
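A minimal sketch of two of the cleaning tasks mentioned above, gap restoration and anomaly detection, is shown below. Linear interpolation and a robust z-score based on the median absolute deviation are assumptions chosen for illustration, not necessarily the methods used in a given project; the helper names and the threshold of 3.5 are likewise hypothetical.

```python
import statistics

def fill_gaps(values):
    """Linearly interpolate None entries; gaps at the edges copy the nearest value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            share = (i - left) / (right - left)
            filled[i] = values[left] + share * (values[right] - values[left])
    return filled

def flag_outliers(values, threshold=3.5):
    """Flag points whose robust z-score exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]

# Hypothetical daily sales with one missing day and one spike.
raw = [12, 15, None, 14, 40, 13, 16]
clean = fill_gaps(raw)
flags = flag_outliers(clean)
print(clean)
print(flags)
```

Flagged points are only candidates for review: a spike may be an error, or it may be a promotion that the model should learn from.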

Data understanding is the next task once preparation and structuring are completed. It’s not modelling yet but an excellent way to understand data by visualization. Below you can see how we visualized the data understanding process:

This visualization demonstrates data decomposition, extracting trends, and seasonal or other factors from input data. It’s divided into several graphs:

The 1st graph is an original timeline (time series visualization)
The 2nd, 3rd, and 4th graphs separately represent seasonality, trends, and noise for further analysis and forecasting
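The decomposition described above can be sketched as a classical additive decomposition: the trend is a centred moving average, the seasonal part is the average deviation from the trend at each position in the cycle, and the noise is what remains. This is a simplified illustration that assumes an odd seasonal period (such as 7 for daily data with a weekly cycle); the function name and sample data are hypothetical.

```python
def decompose_additive(series, period):
    """Split a series into trend, seasonal and residual lists (additive model)."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    n = len(series)
    if n < 2 * period:
        raise ValueError("need at least two full cycles of data")
    half = period // 2

    # Trend: centred moving average over one full cycle.
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(series[i - half:i + half + 1]) / period

    # Seasonal: mean detrended value per position in the cycle, centred on zero.
    buckets = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            buckets[i % period].append(series[i] - trend[i])
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period
    pattern = [m - offset for m in means]
    seasonal = [pattern[i % period] for i in range(n)]

    # Residual (noise): whatever trend and seasonality do not explain.
    residual = [
        None if trend[i] is None else series[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, residual

# Hypothetical three weeks of daily sales: steady growth plus a weekly pattern.
weekly_pattern = [0, -5, -5, 0, 5, 15, -10]
sales = [100 + day + weekly_pattern[day % 7] for day in range(21)]
trend, seasonal, residual = decompose_additive(sales, period=7)
```

On this synthetic series the residual is zero by construction; on real data, a large residual suggests factors that trend and seasonality alone do not capture.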

STEP 4. MACHINE LEARNING MODELS DEVELOPMENT

There are no “one-size-fits-all” forecasting algorithms. Often, demand forecasting features consist of several machine learning approaches. The choice of machine learning models depends on several factors, such as business goal, data type, data amount and quality, forecasting period, etc.

Below are the machine learning approaches we have applied for our retail clients. These approaches can also be used for most demand forecasting cases:

ARIMA/SARIMA
Regression models
XGBoost
K-Nearest Neighbours Regression
Random Forest
Long Short-Term Memory (LSTM)

Below we would like to describe in more detail 3 ML approaches for working with time series data for real demand forecasting projects.

TIME SERIES APPROACH

Time series is a sequence of data points taken at successive, equally-spaced points in time. The major components to analyze include trends, seasonality, irregularity, and cyclicity.

In the retail field, the most applicable time series models are the following:

ARIMA (auto-regressive integrated moving average) models aim to describe the auto-correlations in the time series data. When planning short-term forecasts, ARIMA can make accurate predictions.
SARIMA (Seasonal Autoregressive Integrated Moving Average) models are the extension of the ARIMA model that supports uni-variate time series data involving backshifts of the seasonal period.
Exponential Smoothing models generate forecasts by using weighted averages of past observations to predict new values. The essence of these models is in combining Error, Trend, and Seasonal components into a smooth calculation.

You can also have an understanding from the visualization below of what prediction results usually look like when talking about working with time series prediction approaches.

Let’s say you want to forecast demand for vegetables in the next month. For a time series approach, you require historical sale transaction data for at least the previous six months. If you have historical data about seasonal products – vegetables in our case – the best choice will be the SARIMA model. The forecast error, in that case, can be around 10-15%.

REGRESSION MODELS

A regression model is a statistical model that estimates the relationship between one dependent variable and one or more independent variables using a line (or a plane in the case of two or more independent variables).

Regression models are also helpful in predicting future values from past ones. They can help determine underlying trends and deal with cases involving overstated prices.

In real-life business cases, a purely linear correlation between the target feature that needs to be predicted and the rest of the available variables is quite rare. Because of this, it is important to select the proper regression model based on the client’s specific data.

A visual example of the different regression models is provided below:

Let’s say you want to calculate the demand for tomatoes based on their cost. Assuming that tomatoes grow in the summer and the price is lower because of the high tomato quantity, the demand indicator will increase by July and decrease by December.

The information required for such type of forecasting is historical transaction data, additional information about specific products (tomatoes in our case), discounts, average market cost, the amount in stock, etc. The forecast error can be 5-15%.
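To make the tomato example concrete, here is a minimal sketch of a simple linear regression of demand on price, fitted with ordinary least squares. The data points are hypothetical and perfectly linear for readability; real data would be noisier and would normally include the additional inputs listed above.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical monthly observations: average tomato price and units sold.
price = [1.0, 1.5, 2.0, 2.5, 3.0]
units = [500, 450, 400, 350, 300]

intercept, slope = fit_line(price, units)
expected_units = intercept + slope * 1.8  # demand estimate at a price of 1.8
print(intercept, slope, expected_units)
```

The negative slope captures the relationship described above: when the price falls in summer, expected demand rises.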

RANDOM FOREST

Random Forest is a well-known ensemble machine learning algorithm, done by constructing a multitude of decision trees at training time and outputting the mean/average prediction (regression) of the individual trees.

It can be used for both classification and regression problems. It can also be applied to time series forecasting on univariate and multivariate datasets, provided that lag variables and seasonal component variables are created manually.
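Creating lag and seasonal variables manually might look like the sketch below, which turns a univariate series into rows of features and targets. The helper name, the chosen lags and the sample data are hypothetical; the resulting rows could then be passed to a tabular regressor such as scikit-learn's `RandomForestRegressor`.

```python
def make_lag_features(series, lags, period):
    """Build (features, target) rows from a univariate series.

    Each row holds the values observed `lag` steps earlier, plus the
    position of the target within the seasonal cycle.
    """
    rows, targets = [], []
    start = max(lags)  # first index with all lags available
    for t in range(start, len(series)):
        features = [series[t - lag] for lag in lags]
        features.append(t % period)  # seasonal component variable
        rows.append(features)
        targets.append(series[t])
    return rows, targets

# Hypothetical weekly sales with a four-week cycle.
weekly_sales = [20, 22, 21, 25, 30, 28, 27, 31]
X, y = make_lag_features(weekly_sales, lags=[1, 2, 4], period=4)
for features, target in zip(X, y):
    print(features, "->", target)
```

Note that the first `max(lags)` observations cannot be used as targets, so long lags reduce the amount of training data.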

Random Forest is the more advanced approach that takes multiple decision trees and merges them together. By taking an average of all individual decision tree estimates, the random forest model results in more reliable forecasts.

However, despite its versatility, Random Forest has some limitations. The model may be too slow for real-time predictions when analyzing a large number of trees.

If you have no information other than the quantity data about product sales, this method may not be as valuable. In such cases, the time series approach is superior.

STEP 5. TRAINING & DEPLOYMENT

Training

When training forecasting models, data scientists usually use historical data. By processing this data, algorithms provide ready-to-use trained model(s).

Validation

This step requires the optimization of the forecasting model parameters to achieve high performance. By using a cross-validation tuning method where the training dataset is split into several equal parts, data scientists train forecasting models with different sets of hyper-parameters. The goal of this step is to figure out which set of hyper-parameters yields the most accurate forecast.
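The cross-validation idea can be sketched with a deliberately small model: a one-feature k-nearest-neighbours regressor whose only hyper-parameter is the number of neighbours. The model choice, helper names and data are assumptions made for illustration. For time-ordered data, a forward-chaining split that never trains on the future is generally preferred over the plain equal-parts split shown here.

```python
def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(target for _, target in nearest) / len(nearest)

def k_fold_mae(data, k_neighbours, folds):
    """Mean absolute error of the model, pooled over equal-sized folds."""
    size = len(data) // folds
    errors = []
    for f in range(folds):
        test = data[f * size:(f + 1) * size]
        train = data[:f * size] + data[(f + 1) * size:]
        errors.extend(abs(y - knn_predict(train, x, k_neighbours)) for x, y in test)
    return sum(errors) / len(errors)

def select_k(data, candidates, folds=3):
    """Score each candidate hyper-parameter and return the best one."""
    scores = {k: k_fold_mae(data, k, folds) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical (price, units sold) observations.
data = [(1.0, 500), (1.2, 470), (1.4, 465), (1.6, 440), (1.8, 425),
        (2.0, 400), (2.2, 370), (2.4, 365), (2.6, 340)]
best_k, scores = select_k(data, candidates=[1, 2, 3])
print(best_k, scores)
```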

Improvement

When researching the best business solutions, data scientists usually develop several machine learning models and then choose the ones that cover the project’s requirements the best. The improvement step involves the optimization of analytic results. For example, using model ensemble techniques, it’s possible to reach a more accurate forecast. In that case, the accuracy is calculated by combining the results of multiple forecasting models.
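One simple form of model ensembling is a weighted average of the forecasts from several models, sketched below. The model names, forecast values and equal default weights are hypothetical; in practice, weights are often derived from each model's validation error.

```python
def ensemble_forecast(forecasts, weights=None):
    """Combine per-model forecasts for the same horizon into a weighted average."""
    names = list(forecasts)
    if weights is None:
        weights = {name: 1.0 for name in names}  # equal weights by default
    total = sum(weights[name] for name in names)
    horizon = len(forecasts[names[0]])
    return [
        sum(weights[name] * forecasts[name][step] for name in names) / total
        for step in range(horizon)
    ]

# Hypothetical three-period forecasts from three different models.
forecasts = {
    "sarima": [100, 110, 120],
    "regression": [90, 115, 130],
    "random_forest": [95, 105, 125],
}
combined = ensemble_forecast(forecasts)
print(combined)
```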

Deployment

This stage assumes the forecasting model(s) integration into production use. We also recommend setting a pipeline to aggregate new data to use for your next AI features. This can save you a lot of data preparation work in future projects. Doing this also increases the accuracy and variety of what you could be able to forecast.

Key Factors Affecting Demand Forecasting

Demand forecast tasks depend on a lot of obvious and non-obvious factors. Here are the ones with the most impact.

PRODUCT TYPES AND MODELLING ERRORS

The product type is an important factor to consider for the demand model. For example, for a perishable item that has an actual demand of 100 cases, the prediction of selling 90 cases is preferred over the prediction of 110 cases. Missing the sales of 10 cases is a better result than wasting 10 cases, even though the actual error is the same percentage.
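The preference for under-forecasting perishable items can be expressed as an asymmetric cost function, sketched below. The cost ratio of 3 to 1 between a wasted case and a missed sale is a hypothetical assumption; the real ratio depends on margins and disposal costs.

```python
def asymmetric_cost(actual, predicted, over_cost, under_cost):
    """Cost of a forecast when over- and under-prediction are priced differently."""
    error = predicted - actual
    if error > 0:
        return over_cost * error   # surplus units that may spoil
    return under_cost * -error     # missed sales

# Example from the text: actual demand is 100 cases.
# Assumption: a wasted case costs three times as much as a missed sale.
cost_of_90 = asymmetric_cost(100, 90, over_cost=3.0, under_cost=1.0)
cost_of_110 = asymmetric_cost(100, 110, over_cost=3.0, under_cost=1.0)
print(cost_of_90, cost_of_110)
```

Both forecasts miss by 10 cases, yet under this assumption the over-forecast is three times as costly, which is why a symmetric metric alone can be misleading for perishable goods.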

REGIONAL IMPACTS ON MODEL PERFORMANCE

Predictive models are strongly influenced by regional factors that include customer behaviour and cultural determinants. They also include the following:

Marketing campaigns may be regionally specific and have a different impact that depends on where a customer is located.
Holidays may vary between regions, which might be a consideration for adjusting the model.
Legal issues/laws may limit the use of certain data in different regions.

NEW COMPETITORS ON THE MARKET

Demand forecasting is a dynamic concept. The more competitors and product alternatives are present in the market, the harder demand forecasting becomes. The competition level contains sub-factors, such as the number of alternative products and competitors.

So, it is a very good idea to add this information dynamically to your demand forecasting model.

ECONOMIC SITUATION

The state of the economy influences businesses and demand forecasting models. To put it more bluntly: periods of economic decline are likely to cause lower demand for expensive products, though sales of low-priced goods may go up. Therefore, the economic situation and economic trends, although external to the business, should be considered when building AI models.

Sales Forecasting For Retail During Uncertainty

When integrating demand forecasting systems, it’s essential to understand that they are vulnerable to anomalies or unpredictable situations. It means that machine learning models should be upgraded according to current reality.

As the demand forecasting model processes historical data, it can’t know that the demand has radically changed. For example, if last year, we had one demand indicator for medical face masks and antiviral drugs, this year, it would be completely different.

In that case, there might be several ways to get an accurate forecast. Here are the six most common ways:

Collect data about new market behaviour. Once the situation becomes more or less stable, develop a demand forecasting model from scratch.
Apply a feature engineering approach. By processing external data, news, a current market state, price index, exchange rates, and other economic factors, machine learning models are capable of making more up-to-date forecasts.
Upload the most recent data and provide it with the highest weights during model prediction. The period of a loadable dataset might vary from one to two months, depending on the products’ category. In this way, we can detect shifts in demand patterns and enhance forecast accuracy in a timely manner.
Apply the transfer learning approach. If there is any gathered historical data, we can use it to predict demand in the context of the current crisis.
Apply the information cascade modelling approach. We can forecast how people will make buying decisions according to the behaviour patterns of most people.
Apply the natural language processing (NLP) approach. NLP technology enables the processing of real comments from social networks, media platforms, and other available social sources. By utilizing text mining and sentiment analysis approaches, NLP models gather samples of customers’ conversations to detect people’s preferences, choices, sentiments, and behaviour shifts.
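The option of giving the most recent data the highest weights can be illustrated with exponentially decaying weights. This is a minimal sketch; the half-life parameter, helper names and sample data are hypothetical, and a real model would apply such weights during training rather than to a simple average.

```python
def recency_weights(n, half_life):
    """Weights that halve every `half_life` steps back in time; the newest is 1.0."""
    return [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]

def weighted_mean(values, weights):
    """Average in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical weekly demand that dropped sharply two weeks ago.
demand = [100, 100, 100, 100, 40, 40]
weights = recency_weights(len(demand), half_life=1)

plain = sum(demand) / len(demand)
recent_focused = weighted_mean(demand, weights)
print(plain, recent_focused)
```

The weighted estimate moves toward the new demand level much faster than the plain average, which is the effect this option aims for.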

During AI app development, AI engineers analyze historical data for forecasting. This forecasting cannot predict the disruption caused by a global pandemic. Such an event requires the recalibration of the machine learning models. We met this challenge using machine learning models developed for a restaurant business prior to the pandemic.

Keep in mind that once demand normalizes after a pandemic, war, or similar disruption, you need to adjust your model back. Otherwise, the model may retain the crisis pattern and project it onto the next short-term period (e.g. the next year).

Machine Learning Forecasting for Enhancing Business Intelligence

Anastasiia Molodoria — Thu, 22 Sep 2022 17:31:52 +0000

Business forecasting is imperative for making balanced financial and operational decisions. Its impact across industries has grown in recent years due to the way companies build data-driven strategies and rely on data. But let’s find out what is needed for efficient forecasting and why machine learning models have all the prerequisites for enhancing business intelligence.

In this article, we’ll go over the principles of ML forecasting functioning and the benefits it can bring if used for business purposes. Also, we will highlight the differences between machine learning forecasting models, from regression to exponential smoothing.

How AI Improves Business Forecast Accuracy

Thanks to forecasting, companies are able to better serve clients and ship orders, instead of running out of stock. This leads to a huge impact on sales and customer satisfaction. For example, knowing the demand brings an ability to manage logistics and track inventory costs, or even predict ROI for a new product. Therefore, ML forecasting models allow organizations to enhance their AI maturity, and more importantly, to solve business tasks by looking at existing data.

Nowadays, the volume of data from markets, industries, and users is skyrocketing. FinancesOnline reveals that the world will produce and consume 94 zettabytes in 2022. Such growth fuels the training of ML models, making them more robust and accurate. According to Market Research Future, the ML market share is projected to reach $106.52B by 2030, with a CAGR of 38.76% during the forecast period of 2020-2030. With increasing market share (caused by evolving cloud-based services and growth in unstructured data) comes new opportunities for building forecasting models. So, let’s figure out how these models improve business forecast accuracy and why they are more efficient than traditional approaches.

ML forecasting rests on an enormous amount of information, which can be analyzed to achieve accurate predictions and high performance rates. Unlike traditional forecasting approaches, machine learning allows companies to consider numerous business drivers and factors, and to build nonlinear algorithms that minimize loss functions (a crucial ingredient in all optimization problems).

Training any ML forecasting model requires an assessment stage, in which predicted and actual results are compared to show how well the model performs. After that, different forecasting algorithms can be compared to choose the one that produces the smallest errors. With this approach, businesses can replace traditional techniques with ML, getting the following benefits for their business forecast:

Acquiring insights and detecting hidden patterns that are difficult to trace with traditional approaches. Training ML forecasting models on big data and moving computation to the cloud is becoming a de facto industry standard.
Reduced number of errors in forecasting. For instance, McKinsey claims that AI-driven forecasting models applied to supply chain management can reduce the number of errors by 20–50%.
Ability to infuse more data in a model. External data may be valuable here and change the outcomes in terms of predictions.
Flexibility and rapid adaptability to changes. Compared to traditional non-AI approaches, ML forecasting algorithms can be quickly adapted in case of any significant changes.

Please note that we’re considering forecasting, not predictive modeling. We’ll explain the difference between these two models in simple terms.

Difference Between Forecasting & Predictive Modeling

Both forecasting and predictive algorithms are applied to address cumbersome challenges related to business planning, customer behavior, and decision-making. But, nevertheless, these techniques differ.

Forecasting modeling implies analysis of past and present data to find patterns, or trends, which allow us to estimate the probability of future events. In contrast to predicting, forecasting modeling should have traceable logic. Typical use cases include a forecast of energy consumption over the following 6–12 months, an estimate of how many customers will contact support in the next 7 days, or how many supply agreements are expected to be signed. All of this can be forecast from previous (historical) data.

Predictive modeling is the process of applying AI and data mining to assess more detailed, specific outcomes and use much more diverse data types. The difference between predictive and forecasting modeling is blurred, still, we can consider an example to understand it better. Just imagine that a credit institution plans to launch a new premium card. At this point, two questions may arise.

The first will probably be: how many cards will be issued in the next 6 months? Forecasting modeling will help us find an answer to this question thanks to analysis of similar products launched in the past. But we still don’t know whom we can recommend this card to. Here predictive modeling comes into play. It enables us to analyze a customer information database with such fields as age, salary, preferences, consumer habits, etc. With this approach, we will eventually understand which clients are more likely to use this card.

Use Cases For Machine Learning Forecasting For Business

FINANCIAL FORECASTING

Without a financial forecast, companies face disruption in processes and performance, while C-level managers tend to make incorrect decisions. That’s why companies leverage ML forecasting, which takes over mundane tasks and lets teams concentrate on understanding business drivers. Moreover, ML financial forecasting reduces human errors and the number of ineffective strategies in play, and helps predict supply, demand, inventory, future revenues, expenses, and cash flow.

For example, stakeholders of the business are aiming to know the company’s turnover and key factors for growth during the next financial period to understand and analyze areas of improvement. Based on historical key company business indicators and existing turnover information during the past periods, we can develop an ML forecasting model using deep learning or regression models. It will predict future required metrics, based also on seasonal information and other influencing factors. In this case, business owners will be able to plan the next period of time accordingly.

SUPPLY CHAIN FORECASTING

ML can fully transform management in the area of supply chains, which are becoming more globalized and sophisticated. ML-based forecasting solutions enable companies to efficiently respond to issues and threats as well as avoid under and overstocking. Machine learning algorithms for forecasting can learn relationships from a training dataset and then apply these relationships to new data. Thus, ML improves selecting and segmenting suppliers, predicting supply chain risks, inventory management, and transportation and distribution processes.

Let’s look at an example of using machine learning for supply chain forecasting. A chain of hypermarkets operates around 100 stores in different locations and has an average of 50,000 SKUs per store. For such a big chain, warehouse replenishment clearly needs to be automated. There are two main benefits in this case:

No need to store a lot of hard-to-sell products
Frequently sold products should be delivered on time

Based on the previous information on replenishment of warehouses, as well as data that shows how fast certain products are selling, we can develop an ML model for predicting the number of products per SKU. The prediction could be shown with different time horizons (e.g. daily, weekly, monthly, etc.). This can help managers properly organize the system of storing products and minimize the case of product absence.

PRICE PREDICTION

Price prediction algorithms determine how much the product must cost to be appealing to consumers, meet the company’s expectations, and assure the highest level of sales. The construction of price forecasts should take into account such factors as product features, demand, and existing trends. This approach may be perceived skeptically, yet it’s beneficial when companies enter a new market or release a new product and want to easily cope with a myriad of fluctuating factors.

Often business owners want to have an understanding of price changes for a specific product for a future period of time. Having taken into consideration client data with related price changes for a past period of time for all of the existing products, we can catch general patterns from the previous data and extrapolate them for the next periods. The positive impact could also be applied by adding external third-party data that could influence prices as well, for instance: inflation rate, holidays, seasonal patterns, etc. Wrapping up all of this data, we can develop an ML forecasting model that will be able to predict price trends for specific products.

DEMAND & SALES FORECASTING

A fluctuation in demand is a cumbersome challenge that concerns the whole e-commerce industry. That’s why companies, including manufacturers, apply ML demand forecasting to predict buyers’ behavior and find out how many products to produce or order. With ML models, it’s possible to avoid excess inventory or stockout. Moreover, such an approach to demand forecasting enables understanding the target audience and competition.

Let’s say a restaurant chain business wants to plan demand in advance. It will help the business in several ways:

to know the number of dishes that will be sold in the restaurant in order to plan food stock in advance,
to understand and define an appropriate number of employees that are required to provide quality customer service
to come up with the proper and timely marketing campaign

In order to develop a demand forecasting model and help businesses to fulfill their goals, it will be great to start by analyzing historical data of the previous periods. One of the ways to improve the model performance could be an integration of NLP algorithms as well. For example, we can consider reviews on Google for our restaurant chain, as well as the main competitors to identify the main dishes/quality of service that customers like or do not like.

FRAUD DETECTION

According to a TransUnion report, the rate of suspected digital fraud increased by 52.2% globally between 2019 and 2021. This indicates that companies should make greater efforts in the development of anti-fraud tactics. ML algorithms can detect suspicious financial transactions by learning from past data. They are already successfully applied in e-commerce, banking, healthcare, fintech, and other areas.

For instance, a cafe chain owner wants to analyze the productivity of employees. One of the main goals is to detect hidden patterns that allow employees to cheat. Different frauds like this could lead to losing money. Based on historical data, we can develop a fraud detection model that will detect anomaly patterns and notify about them. In this case, managers can precisely analyze detected anomalies and identify the root cause of such deviations in the data. In the future, such cases could be prevented by the manager to keep the business safe.

Key Machine Learning Forecasting Algorithms

Let’s look at some key machine learning forecasting algorithms to better understand how ML forecasting can be applied.

REGRESSION ALGORITHMS

ML regression models are applied to predict trends and outcomes, being capable of comprehending how variables impact each other along with the results. The dependency between variables can be both linear and nonlinear, while labeled data is required for training. After understanding the relationship of variables, regression models can predict what results will be in unseen data.

Simple and multiple linear regression and logistic regression, where a target variable has only two values, are among the most common baseline models to predict sales, stock prices, and customer behavior.

DEEP LEARNING ALGORITHMS

Time series forecasting is gradually being enriched with new deep learning algorithms. The more versatile and explainable a model is, the higher the chances for its production use. Let’s take a look at a few deep learning models for time series forecasting.

The first one is DeepAR. It’s a supervised ML algorithm created by Amazon and based on recurrent neural networks. It has proven its efficiency with datasets consisting of hundreds of interrelated time series. The advantages of the method are the possibility to use a rich set of inputs, scaling capabilities, and suitability for probabilistic forecasting.

The second one is the Temporal Fusion Transformer (TFT). It outperforms other deep learning models in terms of versatility and can be built on multiple time series. TFT performs well even if trained on a small dataset, which makes it suitable for demand forecasting, to name just one example.

The third algorithm is long short-term memory (LSTM) based upon an artificial RNN, in which the output from one step is transformed into the input of the next step. As for the architecture of LSTM, it consists of neural networks and memory cells for maintaining data, while any manipulation within the memory is performed by gates. There are three gates here: Forget, Input, and Output. However, LSTM requires plenty of resources and a long time for training.
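To show how the three gates interact, here is a minimal sketch of one step of a single-unit LSTM cell with scalar input and state, following the standard LSTM equations. The weights are hypothetical placeholders; a real network learns them during training and uses vectors and matrices instead of scalars.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """Advance a single-unit LSTM cell by one time step.

    params maps each gate name to (input weight, recurrent weight, bias).
    Returns the new output h and the new memory cell value c.
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # share of the old memory to keep
    i = gate("input", sigmoid)         # share of the new candidate to write
    o = gate("output", sigmoid)        # share of the memory to expose
    g = gate("candidate", math.tanh)   # proposed new memory content

    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

# Hypothetical weights, for illustration only.
params = {
    "forget": (0.5, 0.1, 0.0),
    "input": (0.6, 0.2, 0.0),
    "output": (0.4, 0.3, 0.0),
    "candidate": (0.9, 0.5, 0.0),
}

h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.1]:
    # The output and memory of one step feed into the next step.
    h, c = lstm_step(x, h, c, params)
print(h, c)
```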

TREE-BASED ALGORITHMS

Tree-based algorithms are supervised learning approaches. Their advantages include accuracy, stability, and suitability for mapping non-linear patterns. The idea is to split the sample into homogeneous sets based on the most significant differentiator among the input variables. The type of tree (classification or regression) depends on the target variable. Tree-based algorithms are also easy to grasp, require minimal data cleaning, and handle different types of variables. Their disadvantages include a tendency toward overfitting and a poor fit for continuous variables.

GAUSSIAN PROCESSES

Gaussian processes (GP) are inferior in popularity to other models, yet they are powerful enough for industrial application, including automatic forecasting. Gaussian processes enable us to incorporate expert opinion via kernel, though their application in forecasting depends on the number of parameters and may be expensive.

AUTO-REGRESSIVE ALGORITHMS

Auto-regressive algorithms predict future values by using the output of the previous step as an input. Forecasting algorithms of this group include ARIMA, SARIMA, and others. In ARIMA, forecasting combines autoregressive and moving-average terms. For instance, the ARIMA model can predict fuel costs or forecast a company’s revenue based on past periods. SARIMA uses the same basic idea, but it includes a seasonal component that may affect the outcomes.
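The autoregressive idea can be sketched with a first-order model, AR(1), fitted by least squares. This covers only the autoregressive part; it is not a full ARIMA or SARIMA implementation, since it has no differencing, moving-average or seasonal terms. The function names and data are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi

def forecast_ar1(series, steps):
    """Forecast several steps ahead, feeding each forecast into the next step."""
    c, phi = fit_ar1(series)
    last = series[-1]
    forecasts = []
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts

# Hypothetical series that decays toward a stable level of 20.
history = [100, 60, 40, 30, 25, 22.5]
print(forecast_ar1(history, steps=2))
```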

EXPONENTIAL SMOOTHING

Exponential smoothing is an alternative to ARIMA models. It can be applied as a forecasting model for univariate data that can be extended to support data with a systematic trend or seasonal component. In this model, forecasting is a weighted sum of past observations, yet the importance (weight) of past observations is exponentially decreased. The accuracy of prediction depends on the type of the exponential smoothing model which can be single, double, or triple. The most sophisticated exponential smoothing models take into account trends and seasonality.
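The exponentially decreasing weights are easiest to see in single (simple) exponential smoothing, sketched below. The smoothing factor `alpha` and the sample data are hypothetical; double and triple variants add trend and seasonal terms on top of the same update rule.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after the last observation.

    For single exponential smoothing this level is also the forecast
    for every future period.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

def observation_weights(alpha, count):
    """Weights of the most recent `count` observations, newest first."""
    return [alpha * (1 - alpha) ** k for k in range(count)]

sales = [10, 20, 30]
forecast = simple_exponential_smoothing(sales, alpha=0.5)
weights = observation_weights(alpha=0.5, count=3)
print(forecast)
print(weights)
```

A higher `alpha` makes the forecast react faster to recent changes, while a lower one smooths out noise.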

How to Apply Machine Learning Forecasting

Regardless of the chosen model, adopting ML practices generally looks as follows:

Define business goals and available internal data
Search for external data, namely market reports, trends, GDPs, product reviews, etc.
Structure, clean, and label data (if needed)
Identify the batch of problems to be solved with the help of forecasting
Select a baseline model (usually simple regression or tree-based models) to be used as a first reference point to start with
Improve models’ performance by implementing more sophisticated ML models or adjusting the data
Implement the model in production once it achieves acceptable results (add it to existing software and use it on more data)
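
The baseline step in the list above can be sketched as follows. Here two deliberately simple forecasters stand in for the baseline (a naive "repeat the last value" forecast and a moving average) and are compared on a holdout period using mean absolute error. The demand numbers, function names, and the choice of metric are assumptions for illustration.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(train, horizon):
    """Repeat the last observed value for every future step."""
    return [train[-1]] * horizon


def moving_average_forecast(train, horizon, window=3):
    """Repeat the average of the last `window` observations."""
    average = sum(train[-window:]) / window
    return [average] * horizon


# Hypothetical daily demand; the last three days are held out for evaluation.
daily_demand = [20, 22, 21, 23, 24, 22, 25, 26, 24, 27]
train, test = daily_demand[:7], daily_demand[7:]

scores = {
    "naive": mean_absolute_error(test, naive_forecast(train, len(test))),
    "moving_average": mean_absolute_error(test, moving_average_forecast(train, len(test))),
}
best_baseline = min(scores, key=scores.get)
print(scores, best_baseline)
```

Any more sophisticated model then has to beat the best baseline score on the same holdout period to justify its extra complexity.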

Challenges of ML Forecasting

Nothing good comes without challenges, and ML forecasting is no exception. The key challenges of business forecasting with machine learning include the following:

Insufficient amount of data to train a model
An incorrectly chosen metric to evaluate results in alignment with business needs
Imputation of missing data
Dealing with outliers/anomalies
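
Two of the challenges above, missing data and outliers, can be illustrated with a short sketch. Forward filling and the interquartile range (IQR) rule are only two of many possible techniques, and the sales numbers are invented for this example.

```python
import statistics


def forward_fill(series):
    """Replace None with the most recent observed value."""
    filled, last = [], None
    for value in series:
        if value is None:
            value = last  # stays None if the series starts with a gap
        filled.append(value)
        last = value
    return filled


def iqr_outliers(series, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(series, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, value in enumerate(series) if value < low or value > high]


# Hypothetical daily sales with two gaps and one suspicious spike.
raw_sales = [10, 11, None, 12, 95, 11, None, 10]
filled = forward_fill(raw_sales)
flagged = iqr_outliers(filled)
print(filled, flagged)
```

Whether a flagged value is an error or a real event (a promotion, a holiday) is a business question, so flagged points usually deserve a manual look before being removed.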

When working with data at the scale AI requires, businesses encounter difficulties and limitations. That’s why it’s crucial to involve experienced data science professionals and AI engineers when implementing machine learning.

5 Essential Machine Learning Algorithms For Business Applications

Anastasiia Molodoria — Wed, 31 Aug 2022 14:35:16 +0000

Businesses, from market giants like Amazon and Netflix to a small retail store somewhere in the heart of Ohio, strive to grow and improve their efficiency. Incorporating AI and Machine Learning into operational activity is one of the ways to achieve this. But due to the diversity of ML, it’s hard to choose the right method and clearly understand what benefits it can bring. So, in this article we’re going to review basic Machine Learning algorithms, explain their business applications, and provide a step-by-step guide to choosing an appropriate algorithm that will meet your business needs.

1. Regression

Regression is a fundamental ML algorithm for finding the relationship between at least two variables: a dependent (target) variable and one or more independent (predictor) variables. Understanding how variables affect each other allows us to build forecasts, analyze time series, identify cause-and-effect relationships, and measure the strength of predictors.

The goal of regression techniques is typically to explain or predict a specific numerical value using historical data. The type of regression model depends on the type and number of input variables. In total, there are more than 10 such models. Simple linear and multiple linear regression are the most popular of them.

Simple linear regression involves only one independent and one dependent variable. Multiple linear regression, which is much more common in practice, involves several explanatory (independent) variables that influence one dependent variable. A specific example can better illustrate the difference between the two.

Assume that we’re dealing with an ice cream business. With a simple linear regression, we can find the dependency between the number of sales (dependent variable) and the storage temperature of the ice cream (independent variable). Multiple linear regression uncovers deeper patterns. For instance, we can check how several independent variables (the storage temperature, pricing, and the number of flavors and staff) affect sales (the dependent variable).
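
For the simple case with one independent variable, the fit can be sketched with ordinary least squares in plain Python. The temperature and sales figures below are invented purely to keep the arithmetic easy to follow, and the function name is hypothetical.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance = sum((a - mean_x) ** 2 for a in x)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: storage temperature (degrees Celsius) and units sold.
storage_temperature = [-20, -18, -16, -14]
units_sold = [200, 190, 180, 170]

slope, intercept = fit_simple_linear_regression(storage_temperature, units_sold)
predicted_sales = slope * -15 + intercept  # prediction for a new temperature
print(slope, intercept, predicted_sales)
```

Multiple linear regression follows the same least-squares principle but solves for one coefficient per independent variable, which in practice is usually left to a numerical library.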

Linear regression is easy to comprehend, yet on its own it is often insufficient in practice because not all relationships between features (variables) follow a linear trend. Non-linear relationships, where the data follows a curve rather than a straight line, are more frequent in real-life projects.

Time series information in such projects allows us to work with regression tasks by not only finding key factors affecting the target variable but also predicting future values based on historically gathered data, including timestamps. This is one of the reasons why regression has found wide application in areas such as retail, business process optimization, recommendation systems, etc.

BUSINESS USE CASE FOR REGRESSION ALGORITHM

Let’s walk through an example of applying a regression model in a restaurant business. Were you a restaurateur, you’d probably think about cost optimization. You can satisfy this need by minimizing the number of spoiled products and by leveraging precise planning of goods purchases. We can develop a regression model that will be able to predict when and how many products to buy, considering the expiration date of different products. To make a workable model, we’d need to feed it with the following historical data:

The number of restaurant dishes that were sold during the past periods (grouped by days, weeks, etc.)
Holiday info (these days have other specifics)
Marketing campaign info

The benefit comes from a regression model that explains or predicts a numerical value using historical data. After you implement the described solution, you can plan purchases more accurately.

2. Classification

Classification is an ML technique for categorizing structured or unstructured data. It is effective in areas such as spam filtering, document classification, auto-tagging, and defect detection. Classes here may be perceived as labels or targets. By analyzing the input, the model learns how to classify new information, mapping labels or targets to the data. The main types of classification are binary, multiclass, and multilabel.

Binary classification

We train the model to classify new data into two categories: spam or non-spam emails, has or doesn’t have a lung disease, buy or don’t buy a product. An alpaca-or-llama image classifier is another example, though it represents the somewhat more complicated case of few-shot learning classification.

Training a binary model requires a dataset labeled with 0 and 1. After the model analyzes the labeled dataset, it is capable of predicting labels for new data. At the core of learning lies the ability to recognize patterns.

Multiclass classification

We train a model to classify into more than two categories. For instance, a classifier can learn how to identify cats, dogs, lizards, and other animals. To achieve reliable recognition accuracy, the model should learn the features that distinguish the categories. However, in multiclass as well as binary classification, only one category can be assigned to each data sample.

Multilabel classification

In multilabel classification, zero or more labels can be assigned to each data sample. A classifier here may recognize cats, dogs, and other animals that are depicted in one picture. A prominent example of multilabel classification is auto-tagging: blog articles can be marked with relevant tags like “AI”, “ML techniques”, “Healthcare”, and so on.
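
The "zero or more labels" idea can be shown with a small sketch. It assumes a trained model has already produced an independent score per tag; the scores, the `assign_tags` helper, and the 0.5 cutoff are all hypothetical.

```python
def assign_tags(tag_scores, threshold=0.5):
    """Return every tag whose score reaches the threshold (possibly none)."""
    return sorted(tag for tag, score in tag_scores.items() if score >= threshold)


# Hypothetical per-tag scores for two blog articles.
article_scores = {"AI": 0.91, "ML techniques": 0.77, "Healthcare": 0.12}
off_topic_scores = {"AI": 0.08, "ML techniques": 0.03, "Healthcare": 0.20}

print(assign_tags(article_scores))    # two tags pass the threshold
print(assign_tags(off_topic_scores))  # no tag passes, so the list is empty
```

Because each tag is decided independently, an article can receive several tags at once or none at all, which is exactly what separates multilabel from multiclass classification.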

While completing classification tasks, the model outputs a probability from 0 to 1 for each category, describing how confident it is that the sample belongs to that category. In binary classification, a value close to 1 means high confidence in the positive class, a value close to 0 means high confidence in the negative class, and values near 0.5 indicate uncertainty. The decision threshold can be customized depending on the specifics of the business and the client’s tasks: the cutoff does not have to be 0.5 (above means “yes”, below means “no”), and you can adjust it to the business needs.
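
The effect of moving the threshold can be seen in a short sketch that measures precision and recall at different cutoffs. The probabilities and labels are made up, and in a real project they would come from a trained model and a validation set.

```python
def classify(probabilities, threshold):
    """Turn predicted probabilities into 0/1 decisions."""
    return [1 if p >= threshold else 0 for p in probabilities]


def precision_recall(labels, predictions):
    pairs = list(zip(labels, predictions))
    true_pos = sum(1 for y, p in pairs if y == 1 and p == 1)
    false_pos = sum(1 for y, p in pairs if y == 0 and p == 1)
    false_neg = sum(1 for y, p in pairs if y == 1 and p == 0)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical spam probabilities and the true labels (1 = spam).
spam_probabilities = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
true_labels = [1, 1, 0, 1, 0, 0]

strict = precision_recall(true_labels, classify(spam_probabilities, 0.7))
lenient = precision_recall(true_labels, classify(spam_probabilities, 0.35))
print(strict, lenient)
```

On this toy data the stricter threshold avoids false alarms but misses one spam email, while the lenient one catches every spam email at the cost of one false alarm. Which trade-off is better depends on the cost of each kind of mistake for the business.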

BUSINESS USE CASE FOR CLASSIFICATION

A helpdesk assigns tags to each conversation with customers. This is done for quick and easy navigation between previous customer requests and for grouping conversations by topic. This process should be automated to reduce manual work.

The business solution in this case is based on previously tagged client data. We can develop a multilabel classification model that automates the process of assigning several tags to new conversations with customers. As a result, call center specialists will not spend time on this activity and can focus on other priority tasks instead.

3. Clustering

Clustering is an ML method that allows us to identify and group data points into organized structures. These structures make large datasets easier to grasp and manipulate, and new insights can be drawn from the grouped data. Unlike classification, clustering doesn’t require labeled data. Instead, it tries to find patterns by identifying shared or similar properties, and then applies these patterns to create separate groups (clusters).

Grouping or clustering techniques are particularly useful in business applications, where there is a need to segment or categorize large volumes of data. Typical cases include segmenting customers by different characteristics to better target marketing campaigns and recommending news articles that certain readers will enjoy. Clustering is also effective in discovering patterns in complex datasets that may not be obvious to the human eye, which makes it one of the most used AI techniques in marketing.

Clustering models differ depending on the approach. Sometimes we start with randomly initialized center points, as in K-Means and other centroid-based algorithms; other times we apply hierarchical, density-based, or distribution-based methods. All these algorithms open up opportunities for business usage in anomaly detection, image segmentation, social network analysis, improving marketing campaigns, and fraud detection.
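
A centroid-based algorithm such as K-Means alternates between two steps: assign each point to its nearest center, then move each center to the mean of its points. Below is a minimal one-dimensional sketch. Unlike standard K-Means, the starting centers are fixed by hand rather than randomly initialized so that the result is reproducible, and the spending figures are invented.

```python
def k_means_1d(points, centroids, iterations=10):
    """Cluster 1-D points around the given starting centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(point - centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments will not change any more
        centroids = new_centroids
    return centroids, clusters


# Hypothetical monthly spend per customer.
monthly_spend = [12, 15, 14, 80, 85, 78, 13, 82]
centroids, clusters = k_means_1d(monthly_spend, centroids=[0.0, 100.0])
print(centroids, clusters)
```

The two resulting groups correspond to low-spending and high-spending customers, found without any labels being provided.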

BUSINESS USE CASE FOR CLUSTERING

A retail business can serve as an example here. Imagine that the business owner intends to analyze employees’ performance and identify who is underperforming. Every day, many employees across different supermarkets handle money, so the owner wants to get a full picture of employees’ performance in order to evaluate the efficiency of operational costs.

To help with the solution, we can develop a clustering model for anomaly detection. In our case, activity is flagged as anomalous if an employee’s behavior is uncommon (differs from all the rest). By applying the clustering algorithm, we identify groups of employees whose behavior differs significantly from that of the majority of the staff. Clustering is the first step towards tackling performance issues and optimizing productivity, and the business has plenty of room to adopt other ML algorithms afterwards.

4. Deep Learning

Deep learning (DL) is a field of AI that partially emulates the approaches taken by human beings while learning. DL algorithms rely on a neural network with at least three layers that breaks problems into levels of data and then solves them. These algorithms resemble the functioning of our brains when we start to comprehend the world, learn words, and recognize new objects.

In this way, as a branch of ML, deep learning comprises algorithms that are built on multi-layer neural networks and differ from traditional AI/ML techniques (see the picture below). The key difference is that deep learning models do not require data with a set of relevant features: it’s enough to provide them with raw data, giving the algorithm a chance to define relevant features on its own. DL models tend to improve as the amount of training data increases. A deep learning model works as follows: layers of a neural network consist of neurons that transmit information to the neurons of the subsequent layer, and the model arrives at a decision when the data reaches the output layer.
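
The layer-to-layer flow described above can be sketched as a forward pass through a tiny network with an input layer, one hidden layer, and an output layer. The weights here are hand-picked for illustration; in a real model they are learned from data during training, which this sketch does not cover.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: each neuron takes a weighted sum of all inputs."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


# Hand-picked (not trained) parameters: 2 inputs -> 2 hidden neurons -> 1 output.
hidden_weights = [[1.0, -1.0], [0.5, 0.5]]
hidden_biases = [0.0, -1.0]
output_weights = [[1.0, 1.0]]
output_biases = [0.0]


def forward(inputs):
    hidden = dense(inputs, hidden_weights, hidden_biases, relu)
    output = dense(hidden, output_weights, output_biases, sigmoid)
    return hidden, output


hidden, output = forward([2.0, 1.0])
print(hidden, output)
```

The sigmoid in the last layer squeezes the result into the range from 0 to 1, so the output can be read as the confidence of a binary decision.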

Deep learning models are used for a wide variety of business applications. In healthcare, they help analyze medical images, speed up diagnostic procedures, and search for drugs. In the telecommunications and media industry, neural networks can be used for machine translation, fraud detection, and virtual assistant services. The financial industry uses them for anomaly and fraud detection, portfolio management, and risk analysis.

Summing up, DL is capable of text summarization, new image generation, speech-to-text conversion, emotion detection, and movement recognition.

BUSINESS USE CASE FOR DEEP LEARNING

Imagine a shoe business with a team providing customer support services via chats and phone. The business owner wants to be able to quickly analyze the quality of services provided by employees and check the level of customer satisfaction.

The business solution may be built on text summarization, which allows us to extract the most relevant information from the chat support text. To process audio conversations as well, we can apply speech-to-text models that extract text from the audio. Also, to analyze the overall level of customer satisfaction, we can work on sentiment analysis models that identify the tone of the conversations (positive or negative dialogues).

5. Dimensionality Reduction

Dimensionality reduction techniques involve reducing the number of input features, variables, or attributes, while maintaining as informative a dataset as possible. Why do we need it, if usually we are aiming to have the maximum amount of data for training the perfect model?

It quite frequently happens that the performance of machine learning algorithms degrades with too many input variables. A greater number of features increases the chance of overfitting the model, which leads to poor-quality results.

Thanks to dimensionality reduction, we can shorten training time and avoid overfitting. It is typically applied as a data preparation step prior to modeling.

Business usage of dimensionality reduction isn’t limited to data preparation before modeling and includes the following areas: visualization of high-dimensional data, image compression, model runtime optimization, and reducing model complexity.

BUSINESS USE CASE FOR DIMENSIONALITY REDUCTION

To better understand this technique, let’s look at a case study. A company produces and monitors many different sensors and has plenty of data that needs to be analyzed. A prediction model based on existing data may serve it well. The model should analyze historical data from numerous sensors and predict a target value based on the original data.

Sensor data is sparse, so applying ordinary machine learning algorithms without preprocessing steps will result in poor model performance. One of the best options is to use dimensionality reduction methods before modeling, reducing the number of features and leaving only the most relevant ones to obtain reliable model quality. Then, after retrieving the important data, we can apply regression or classification models to perform predictive modeling for the target feature we need to predict.
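
One of the simplest ways to cut down the number of features is to drop columns that barely change, since they carry almost no information. The sketch below uses this variance-threshold style of feature selection as a stand-in; it is not PCA or any other projection method, and the sensor names, readings, and cutoff are assumptions for illustration.

```python
import statistics


def select_informative_features(rows, feature_names, min_variance=0.01):
    """Keep only the columns whose variance reaches `min_variance`."""
    columns = list(zip(*rows))  # transpose rows into columns
    kept = [
        index for index, column in enumerate(columns)
        if statistics.pvariance(column) >= min_variance
    ]
    reduced_rows = [[row[index] for index in kept] for row in rows]
    return [feature_names[index] for index in kept], reduced_rows


# Hypothetical sensor readings: two of the four sensors never report anything.
sensor_names = ["temperature", "vibration", "door_open", "pressure"]
readings = [
    [21.0, 0.0, 0.0, 1.2],
    [23.5, 0.0, 0.0, 1.9],
    [22.1, 0.0, 0.0, 1.1],
    [24.8, 0.0, 0.0, 1.6],
]

kept_names, reduced = select_informative_features(readings, sensor_names)
print(kept_names, reduced)
```

The reduced table can then be passed to a regression or classification model, which now has fewer and more informative inputs to learn from.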

From dimensionality reduction to regression, choosing among Machine Learning techniques may be tough. The recommendations below can help with this choice.

How to Choose ML Algorithm for Your Business App

ML algorithms are commonly divided into supervised, unsupervised, and reinforcement learning, and any of these may suit a given business need. Supervised learning is based on training with labeled input and output data. (Regression and classification are algorithms of this group.) Unsupervised learning models require data with input features but without labeled output, and they are capable of finding structures within the given data. (Segmentation and clustering belong to this category.) In reinforcement learning, models solve a task by trial and error, improving through feedback on the actions they take.
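
As a rough rule of thumb, the grouping above can be written down as a tiny decision helper. The function and its parameters are hypothetical and heavily simplified; a real choice also depends on the amount and quality of data and on the business goal.

```python
def suggest_learning_type(has_labeled_output, target_is_numeric=False, learns_from_feedback=False):
    """Map a few yes/no questions about the task to a family of ML algorithms."""
    if learns_from_feedback:
        return "reinforcement learning"
    if not has_labeled_output:
        return "unsupervised learning (e.g. clustering)"
    if target_is_numeric:
        return "supervised learning (regression)"
    return "supervised learning (classification)"


print(suggest_learning_type(has_labeled_output=True, target_is_numeric=True))
print(suggest_learning_type(has_labeled_output=True))
print(suggest_learning_type(has_labeled_output=False))
```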

Once you’ve learned about the types of ML algorithms, you can take a look at a step-by-step guide to choosing an appropriate algorithm for business application:

Define the business problem and algorithms that are the most suitable for tackling it
Check available data (amount, characteristics, type, and behavior)
Think about the optimal evaluation metric and speed
Decide on a suitable number of features and parameters
Stick to a baseline model or move to a more sophisticated solution (if simple linear algorithms work well, there is no need to complicate the work)

With all the diversity of top Machine Learning algorithms, you might get confused about which method to choose. Try to adhere to a data-related or problem-related approach. Remember that better data matters more than the algorithm itself, and that an algorithm’s results can often be improved by extending the training time.