Forecasting of sales data using support vector regression.

Authors

Khan, Sirajum Munira

Advisor

Rahamataullāha Imana, E. Eica. Ema

Issue Date

2020-07-18

Keyword

Degree

Thesis (M.S.)

Department

Department of Mathematical Sciences

Other Identifiers

CardCat URL

Abstract

Sales analytics is the practice of generating insights from sales data, trends, and metrics in order to set targets and forecast future sales performance. Because of its volume, velocity, and variety, sales data readily falls under the umbrella of big data. In recent years, analyzing sales data has required more sophisticated technology and analytical methods than ever before. Support Vector Machines (SVMs) are supervised learning methods, with associated learning algorithms, that analyze data for classification and regression. The version of SVM used for regression is called Support Vector Regression (SVR). Kernel methods are a class of pattern recognition algorithms that have become popular through their wide use in SVMs. Kernel functions are useful in many applications because they provide a simple bridge from linear to nonlinear algorithms. Choosing an appropriate kernel is crucial, and the choice depends mostly on the problem at hand, that is, on what we are trying to model. We consider secondary real estate data from the city of Cincinnati in the United States. Since the data set is large and initial results indicate strong evidence of nonnormality and the presence of outliers, we employ the SVR technique with a variety of kernels (linear, Gaussian, polynomial, and sigmoid) to model and forecast sales. We also compare these models with the linear regression model. Several goodness-of-fit measures reveal that the SVR with the Gaussian kernel fits the data most adequately. Finally, we employ a cross-validation study to evaluate the forecasts generated by the different methods and find that the SVR with the Gaussian kernel generates the most accurate forecasts among the methods considered in our study.
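As an illustration of the four kernel families named in the abstract, the following minimal Python sketch writes each one in its common textbook form, together with the epsilon-insensitive loss that SVR typically minimizes. It is not the thesis's implementation: the function names, the parameters gamma, degree, coef0, and epsilon, and their default values are all assumptions chosen for illustration, and the abstract does not state which settings were used.

```python
import math

# Illustrative kernel definitions; all names and default values are assumptions.

def linear_kernel(x, z):
    """Plain inner product <x, z>."""
    return sum(a * b for a, b in zip(x, z))

def gaussian_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, z> + coef0) ** degree."""
    return (gamma * linear_kernel(x, z) + coef0) ** degree

def sigmoid_kernel(x, z, gamma=0.1, coef0=0.0):
    """Sigmoid kernel: tanh(gamma * <x, z> + coef0)."""
    return math.tanh(gamma * linear_kernel(x, z) + coef0)

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Residuals within +/- epsilon cost nothing; larger ones cost linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

KERNELS = {
    "linear": linear_kernel,
    "gaussian": gaussian_kernel,
    "polynomial": polynomial_kernel,
    "sigmoid": sigmoid_kernel,
}

# Evaluate every kernel on one hypothetical pair of feature vectors.
x, z = [1.0, 2.0], [2.0, 0.5]
for name, kernel in KERNELS.items():
    print(name, kernel(x, z))
```

Only the linear kernel is a plain inner product; the other three apply a nonlinear function to the inner product or to the distance between the two points, which is the sense in which kernels bridge linear and nonlinear models. The linear growth of the epsilon-insensitive loss, as opposed to the quadratic growth of squared error, is one commonly cited reason SVR is considered when outliers are present.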

Collections