Alcohol Sales Regression Using AutoML

Introduction

This project tackles the challenge of predicting alcohol sales in Iowa, focusing on December, when sales peak during the Christmas season. Historically, liquor stores in Iowa have forecast December sales with a simple moving average of sales over the past five years. This method has proven insufficient because it ignores influential factors such as day, region, product, and vendor, and so produces inaccurate predictions. The result is either stock shortages or excess stock, with financial losses from missed sales or from the cost of unsold inventory.
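
For reference, the five-year moving-average baseline described above can be sketched in a few lines. This is only an illustration: the function name, the dictionary layout, and the sales figures are made up for the example and are not taken from the project's data or code.

```python
from statistics import mean


def moving_average_forecast(december_sales_by_year, target_year, window=5):
    """Forecast December sales for target_year as the plain mean of the
    previous `window` Decembers (hypothetical helper, for illustration)."""
    years = range(target_year - window, target_year)
    missing = [y for y in years if y not in december_sales_by_year]
    if missing:
        raise ValueError(f"missing December sales for years: {missing}")
    return mean(december_sales_by_year[y] for y in years)


# Made-up December totals, in thousands of dollars.
example_sales = {2018: 100.0, 2019: 110.0, 2020: 120.0, 2021: 130.0, 2022: 140.0}
baseline_2023 = moving_average_forecast(example_sales, 2023)
print(baseline_2023)
```

Note that the forecast depends only on the yearly totals: day, region, product, and vendor never enter the calculation, which is the limitation the project sets out to address.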

Methodology

The project employs AutoML techniques to develop a more accurate prediction model for December alcohol sales in Iowa. The methodology involves several key steps:

  1. Data Collection and Preprocessing: The team collected sales data, including historical sales figures, product types, vendor information, and regional sales data. This comprehensive dataset underwent preprocessing to clean and structure the data for analysis.

  2. Feature Selection: To address the limitations of previous forecasting methods, the project expanded the feature set to include not just historical sales data but also day of the week, region, product type, and vendor information.

  3. AutoML Implementation: The team utilized AutoML tools to automatically select the best machine learning model for the prediction task. AutoML evaluated various models based on the expanded feature set, optimizing for prediction accuracy.

  4. Model Training and Evaluation: The selected model was trained on a portion of the data, with the remaining data used for testing and validation. Evaluation metrics such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared were employed to assess model performance.
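
Steps 2 and 4 above can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not the project's actual code: the record fields (`date`, `region`, `product`, `vendor`) and the sample numbers are hypothetical, and the AutoML tool itself is not shown because the write-up does not name it.

```python
import math
from datetime import date


def add_day_of_week(record):
    """Return a copy of a sales record with a derived day-of-week feature.
    Assumes a hypothetical 'date' field in ISO format (YYYY-MM-DD)."""
    enriched = dict(record)
    # weekday(): Monday is 0, Sunday is 6
    enriched["day_of_week"] = date.fromisoformat(record["date"]).weekday()
    return enriched


def mae(actual, predicted):
    """Mean Absolute Error: average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large errors more than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r_squared(actual, predicted):
    """Share of the variance in actual sales explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical record and made-up held-out values, for illustration only.
row = add_day_of_week(
    {"date": "2022-12-24", "region": "Polk", "product": "Whiskey", "vendor": "A"}
)
actual_sales = [100.0, 150.0, 200.0, 250.0]
predicted_sales = [110.0, 140.0, 210.0, 240.0]

print(row["day_of_week"])
print(mae(actual_sales, predicted_sales))
print(rmse(actual_sales, predicted_sales))
print(r_squared(actual_sales, predicted_sales))
```

Computing the same three metrics for both the moving-average baseline and the AutoML model on the same held-out data is what makes the comparison between them meaningful.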

Results

The AutoML-based approach significantly outperformed the traditional moving sales average method. Key findings include:

  • Improved Accuracy: The AutoML model demonstrated a substantial improvement in prediction accuracy, with lower MAE and RMSE values compared to the traditional method.
  • Comprehensive Analysis: The inclusion of additional factors like product type and vendor information in the model allowed for a more nuanced understanding of sales dynamics.
  • Model Performance: The R-squared value indicated a good fit between the model’s predictions and the actual sales data, suggesting the model’s effectiveness in capturing the variability in December alcohol sales.

Conclusion and Recommendations

The application of AutoML techniques in predicting December alcohol sales in Iowa represents a significant advancement over traditional methods. The project’s success highlights the importance of incorporating a broader set of factors into sales forecasting models. Recommendations for liquor store owners and suppliers include:

  • Adoption of AutoML-based forecasting models for more accurate inventory planning.
  • Consideration of regional sales trends, product preferences, and vendor performance in stocking decisions.
  • Continuous data collection and model retraining to adapt to changing market conditions.

Check out my GitHub profile for the code!