Genetic Algorithm Based Outlier Detection Using Bayesian Information Criterion in Multiple Regression Models Having Multicollinearity Problems
Multiple linear regression (MLR) models are among the most widely used techniques in applied statistics, and they are useful devices for extracting and understanding the essential features of datasets. However, problems arise in MLR models when serious outliers or multicollinearity are present in the data. Outliers are more complicated to handle in regression than in univariate settings, because some outlying points have more influence on the fitted regression than others. An important problem with outliers is that they can strongly influence the estimated model, especially when the least squares method is used. Nevertheless, outlying observations are often the points of special interest in many practical situations. A second problem is multicollinearity, defined as linear dependencies among the independent variables. The purpose of this study is to describe multicollinearity and to present an outlier detection method for multiple regression models based on a Genetic Algorithm (GA) and the Bayesian Information Criterion (BIC). The GA with BIC is illustrated on real and simulated data for outlier detection in MLR models having multicollinearity problems.
Key Words: Bayesian Information Criterion, Genetic Algorithms, Multicollinearity, Multiple Linear Regression, Outlier Detection.
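
The idea summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of the study itself: it assumes a mean-shift formulation in which each flagged observation receives its own dummy variable, so that BIC = n log(RSS/n) + (p + m) log(n) for p regressors and m flagged points. The function names `bic_mean_shift` and `ga_outliers`, the GA settings (binary tournament selection, uniform crossover, bit-flip mutation, elitism), and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def bic_mean_shift(X, y, mask):
    """BIC of an OLS fit with one dummy per flagged observation (assumed form)."""
    n, p = X.shape
    keep = ~mask
    m = int(mask.sum())
    if n - m <= p:                        # too few unflagged points to fit
        return np.inf
    # A dummy per flagged row zeroes its residual, so the RSS equals
    # the RSS of the fit on the unflagged rows only.
    # lstsq is SVD-based and returns a minimum-norm solution, so it stays
    # defined when columns are (nearly) collinear.
    beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    rss = float(np.sum((y[keep] - X[keep] @ beta) ** 2))
    if rss <= 0.0:
        return np.inf
    return n * np.log(rss / n) + (p + m) * np.log(n)

def ga_outliers(X, y, pop_size=40, generations=60, p_mut=0.02, seed=0):
    """Search over binary outlier masks for the one with the lowest BIC."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    pop = rng.random((pop_size, n)) < 0.1  # sparse random initial masks
    pop[0] = False                         # include the "no outliers" mask
    for _ in range(generations):
        fit = np.array([bic_mean_shift(X, y, c) for c in pop])
        elite = pop[np.argmin(fit)].copy()
        # Binary tournament selection: the lower BIC of two random picks wins.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(fit[a] <= fit[b], a, b)]
        # Uniform crossover between each parent and its neighbour.
        mates = np.roll(parents, 1, axis=0)
        cross = rng.random((pop_size, n)) < 0.5
        children = np.where(cross, parents, mates)
        # Bit-flip mutation.
        children ^= rng.random((pop_size, n)) < p_mut
        children[0] = elite                # elitism: keep the best mask so far
        pop = children
    fit = np.array([bic_mean_shift(X, y, c) for c in pop])
    best = int(np.argmin(fit))
    return pop[best], float(fit[best])

# Simulated example: two nearly collinear regressors and two planted outliers.
rng = np.random.default_rng(1)
n = 30
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)        # nearly collinear with x1
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 + rng.normal(scale=0.5, size=n)
y[[3, 17]] += 15.0                         # planted outliers
mask, bic = ga_outliers(X, y)
print(np.flatnonzero(mask), bic)
```

Each chromosome is a binary vector with one bit per observation, and the BIC penalty of log(n) per flagged point discourages the search from labelling ordinary observations as outliers. The least squares step here only tolerates collinear columns numerically; it does not by itself remedy multicollinearity.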