AutoML vs. Deep Learning: A Case Study in Forecasting Iowa’s Liquor Sales

Project Overview: A Forecasting Case Study

 

During my coursework at Georgian College, I undertook a comparative analysis to forecast sales using the comprehensive Iowa Liquor Sales dataset. This project pitted two distinct machine learning methodologies against each other: a rapid AutoML approach using hyperopt-sklearn and a complex Deep Learning model using an LSTM network in TensorFlow.

The results were telling: the AutoML model achieved 85% accuracy with minimal development time, while the LSTM network reached 88%, a three-point gain that came from capturing subtler temporal patterns. This case study explores the practical trade-offs between the two approaches in a real-world business context.

 

The Business Challenge: Predicting Iowa’s Holiday “Liquid Gold Rush”

 

The Iowa liquor sales dataset is a goldmine for data science, containing over 19 million transactions since 2012. Because Iowa is an alcoholic beverage control state, all sales flow through regulated channels, which yields remarkably clean and comprehensive data.

The key business challenge is forecasting holiday demand. December sales often represent 15-20% of annual revenue, making accurate predictions essential for inventory management, cash flow, and staffing. This dataset, with its rich seasonal patterns, provides the perfect testing ground for predictive models.
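To make the seasonal claim concrete, the December share of annual revenue can be computed directly from dated sales records. The following is a minimal sketch, assuming hypothetical records of the form (ISO date, sale amount in dollars); the record layout and the figures are illustrative and are not taken from the Iowa dataset.

```python
from collections import defaultdict
from datetime import date


def december_share(records):
    """Return {year: December revenue / full-year revenue}.

    `records` is an iterable of (iso_date_string, sale_dollars) pairs.
    This layout is an assumption made for illustration.
    """
    by_year = defaultdict(float)
    december = defaultdict(float)
    for iso_day, dollars in records:
        day = date.fromisoformat(iso_day)
        by_year[day.year] += dollars
        if day.month == 12:
            december[day.year] += dollars
    # Skip years with no revenue to avoid dividing by zero.
    return {year: december[year] / total
            for year, total in by_year.items() if total > 0}


# Made-up figures for a single year.
sample = [("2023-03-15", 400.0), ("2023-07-04", 420.0), ("2023-12-20", 180.0)]
shares = december_share(sample)
```

A year whose share lands well above one twelfth (about 8.3%) has the kind of holiday concentration described above.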

 

A Tale of Two Models: AutoML vs. LSTM

 

This project directly compared the speed and accessibility of automated machine learning against the deep pattern recognition capabilities of neural networks.

Metric | AutoML (hyperopt-sklearn) | LSTM (TensorFlow)
Predictive Accuracy | 85% | 88%
Development Time | Low (Hours) | High (Days/Weeks)
Computational Cost | Moderate | High
Interpretability | High (e.g., feature importance) | Low (Complex “black box”)
Best For | Rapid prototyping, baseline models, quick business insights. | Complex sequential data, high-stakes forecasting where accuracy is paramount.
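An "accuracy" percentage needs a definition when the target is a continuous sales figure. One common convention, assumed here for illustration and not confirmed as the metric used in this project, is 100% minus the mean absolute percentage error (MAPE). A small sketch with hypothetical numbers:

```python
def forecast_accuracy(actual, predicted):
    """Return 100 * (1 - MAPE) for two equal-length sequences.

    Periods with an actual value of zero are skipped, because the
    percentage error is undefined there.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("need at least one period with non-zero actual sales")
    mape = sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)
    return 100.0 * (1.0 - mape)


# Hypothetical monthly sales and forecasts.
score = forecast_accuracy([100, 200, 400], [90, 230, 400])
```

Here the three percentage errors are 10%, 15%, and 0%, so the MAPE is about 8.3% and the score is about 91.7. Under this definition, a model can score high overall while still missing the December peak badly, so it is worth checking the peak months separately.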

Approach 1: Rapid Forecasting with AutoML (85% Accuracy)

 

My first model used hyperopt-sklearn, an AutoML framework that automates algorithm selection and hyperparameter tuning. Using Bayesian optimization, it tested hundreds of configurations of models like Random Forests, Gradient Boosting Machines, and SVMs.

After processing over 50,000 sales records, the search converged on a Gradient Boosting ensemble. This approach delivered 85% accuracy with just a few hours of development.
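The core idea, searching jointly over which algorithm to use and how to configure it, can be sketched without any library. The example below is a deliberately simplified stand-in: it uses plain random search, not hyperopt's optimizer, and two toy forecasters, not the scikit-learn models named above. All function names and data are hypothetical.

```python
import random


def seasonal_naive(history, season):
    """Forecast the next value as the value observed one season earlier."""
    return history[-season]


def exp_smoothing(history, alpha):
    """Forecast the next value with simple exponential smoothing."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


MODELS = {"seasonal_naive": seasonal_naive, "exp_smoothing": exp_smoothing}


def sample_config(rng):
    """Draw one (algorithm name, hyperparameters) pair from the search space."""
    if rng.random() < 0.5:
        return "seasonal_naive", {"season": rng.choice([4, 12])}
    return "exp_smoothing", {"alpha": rng.uniform(0.05, 0.95)}


def validation_error(series, name, params, n_holdout):
    """Mean absolute error of one-step-ahead forecasts on the last n_holdout points."""
    errors = []
    for t in range(len(series) - n_holdout, len(series)):
        forecast = MODELS[name](series[:t], **params)
        errors.append(abs(series[t] - forecast))
    return sum(errors) / len(errors)


def random_search(series, n_trials=50, n_holdout=12, seed=0):
    """Try n_trials random configurations and keep the one with the lowest error."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        name, params = sample_config(rng)
        err = validation_error(series, name, params, n_holdout)
        if best is None or err < best[0]:
            best = (err, name, params)
    return best


# Hypothetical monthly series: flat baseline with a December spike, four years.
series = ([100.0] * 11 + [180.0]) * 4
best_error, best_name, best_params = random_search(series)
```

On this toy series the seasonal forecaster with a 12-month season reproduces the holdout year exactly, which is the kind of configuration a search is meant to surface. A Bayesian optimizer differs from this loop in one respect: it uses the errors seen so far to decide which configuration to try next, instead of sampling blindly.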

  • Business Insight: The model quickly identified key sales drivers, such as geographic clustering around college towns and predictable holiday spikes. It provided a fast, reliable tool for inventory planning that could be deployed with minimal technical overhead.

 

Approach 2: Deep Dive with LSTM Neural Networks (88% Accuracy)

 

The second approach used a Long Short-Term Memory (LSTM) neural network, an architecture designed specifically for sequential data. LSTMs use sophisticated gating mechanisms (input, forget, and output gates) to remember important information over long time periods.
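For reference, the three gates can be written out explicitly. The block below gives the standard LSTM cell update in a common notation; the symbols are a notational choice, not taken from the project code. Here x_t is the input at step t, h_t the hidden state, c_t the cell state, σ the logistic sigmoid, and ⊙ element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

The additive update of c_t is what lets information persist: when the forget gate stays close to 1, the cell state is carried forward nearly unchanged across many time steps.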

This deep learning model achieved 88% accuracy, outperforming AutoML by capturing subtle temporal dependencies that simpler models missed.

  • Business Insight: The LSTM identified complex leading indicators, such as how sales patterns in October influence December demand or how weather trends in neighboring regions affect local distribution. This level of nuance allows for more sophisticated, proactive inventory planning but comes at the cost of higher complexity and development time.
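For a network to learn that earlier months inform later ones, the series first has to be cut into fixed-length input windows paired with a future target. The sketch below shows one common way to do this; the function name, the window sizes, and the monthly figures are hypothetical and are not the project's actual preprocessing.

```python
def make_windows(series, lookback, horizon=1):
    """Split a 1-D series into (input window, target) pairs.

    Each input holds `lookback` consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    if lookback < 1 or horizon < 1:
        raise ValueError("lookback and horizon must be at least 1")
    inputs, targets = [], []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback
        inputs.append(list(series[start:end]))
        targets.append(series[end + horizon - 1])
    return inputs, targets


# Hypothetical monthly totals, January through December.
monthly = [10, 12, 11, 13, 15, 14, 16, 18, 17, 19, 22, 30]
X, y = make_windows(monthly, lookback=3, horizon=2)
```

With a three-month lookback and a two-month horizon, the last window covers August through October and its target is December. Before being passed to a Keras LSTM layer, such windows would be reshaped into a three-dimensional array of shape (samples, timesteps, features).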

 

The Verdict: Choosing the Right Tool for the Forecasting Problem

 

My experience with both models revealed a crucial lesson in practical data science: the best tool depends entirely on the problem’s context.

  • AutoML excels when you need rapid results, interpretable models, and straightforward deployment. It empowers teams to generate actionable insights quickly.

  • LSTM networks justify their complexity when temporal patterns are critical and small improvements in accuracy translate to significant financial impact.

In a real-world production environment, the ideal solution is often a hybrid approach: using AutoML for rapid prototyping and baseline models, while developing specialized deep learning models for mission-critical forecasting tasks.

 

Key Takeaways for Aspiring Data Scientists

 

This project was more than an exercise in model building; it was a lesson in strategic thinking.

  1. Balance Sophistication and Practicality: The most complex algorithm isn’t always the best. A model is only valuable if stakeholders can understand, trust, and implement its results.

  2. Communication is Key: Clearly documenting your methodology, assumptions, and limitations is just as important as the model’s performance. It’s how you build trust and enable informed decision-making.

  3. Solve Real Business Problems: Portfolio projects are most effective when they demonstrate an end-to-end process that addresses a tangible business need with measurable outcomes.

 

Conclusion

 

The Iowa liquor sales project clearly illustrated that different machine learning tools unlock different layers of insight. AutoML provides speed and accessibility, while LSTMs offer greater depth for sequential data.

Ultimately, success in data science isn’t just about mastering algorithms; it’s about understanding the business context, asking the right questions, and choosing the right tool to tell the story your data is waiting to reveal.
