Machine learning applications in household-level demand prediction


Machine learning (ML) is becoming one of the most anticipated methods for predicting consumer demand. However, it remains uncertain how ML methods perform relative to traditional econometric methods at different dataset scales. This study estimates and compares the out-of-sample predictive accuracy for the household budget share of organic fresh produce using two parametric models and six ML methods under regular and large sample sizes. Results show that ML methods, particularly Logistic Elastic Net, perform better than econometric models under the regular sample size. In contrast, with big data, econometric models reach the same accuracy level as ML methods, whereas random forest shows a possible overfitting problem. This study illustrates the competence of ML methods in demand prediction, but choosing the optimal method requires considering product specifics, sample sizes, and observable features.

Applied Economics Letters