Estimation of the regression equation for zero-inflated data – an applied study of car accidents in Karbala
A Thesis Submitted to
the Council of Administration and Economics / Karbala University in Partial Fulfillment of the Requirements for the Degree of Master of Science in Statistics
Presented by
Zahraa AbdulAmeer Ali AL-Mosawy
Supervised By
Prof. Dr. Abdul hussian. H . HABEAB . AL-Tai
Car accidents are an important phenomenon because of their direct relationship with living conditions in urban population centers and with the basic components of society (human, animal, property). Car accident data can follow complex models that are difficult to predict because of the complex nature of the variables that affect them. This thesis therefore addresses the topic of "estimating the best regression equation for car accident data when the data follow some discrete distributions (a comparative study)". Of the discrete distributions considered (Poisson and binomial), the focus was on the Poisson distribution, especially the zero-inflated Poisson distribution.
The thesis aims to estimate the best regression equation under the zero-inflated Poisson distribution. Four estimation methods were used: maximum likelihood, moments, percentage, and shrinkage. To achieve the research objectives, a number of simulation experiments were conducted under the assumed distribution (zero-inflated Poisson), using these methods to estimate the zero-inflation parameter. The experiments covered several sample sizes (small, medium, and large) and different values of the two parameters: the zero-inflation parameter (λ) and the second parameter (β) of the Poisson distribution. The results of the simulation experiments were then compared using the mean squared error (MSE) of the estimates of the two parameters, broken down by estimation method, parameter value, and sample size.
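The simulation design described above can be illustrated with a minimal sketch. It is not the thesis's actual procedure: it assumes an intercept-only zero-inflated Poisson model with no covariates, uses only the method of moments, and the names `pi` (zero-inflation probability) and `lam` (Poisson mean), the helper functions, and the chosen parameter values and sample sizes are all hypothetical choices made for illustration. The thesis labels its parameters λ and β and does not state its exact parameterization here.

```python
import math
import random
import statistics


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method (adequate for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_zip(n, pi, lam, rng):
    """Draw n zero-inflated Poisson counts: a structural zero with probability pi, else Poisson(lam)."""
    return [0 if rng.random() < pi else sample_poisson(lam, rng) for _ in range(n)]


def zip_moment_estimates(y):
    """Method-of-moments estimates (pi_hat, lam_hat) for an intercept-only ZIP sample.

    Uses E[Y] = (1 - pi) * lam and Var[Y] = (1 - pi) * lam * (1 + pi * lam).
    Returns None when the sample is all zeros, since the estimates are then undefined.
    """
    m = statistics.fmean(y)
    if m == 0:
        return None
    s2 = statistics.pvariance(y, mu=m)
    if s2 <= m:
        # No overdispersion in the sample: fall back to a plain Poisson fit.
        return 0.0, m
    lam_hat = m + s2 / m - 1.0
    pi_hat = 1.0 - m / lam_hat
    return pi_hat, lam_hat


def simulate_mse(n, pi, lam, reps=500, seed=0):
    """Monte Carlo MSE of the moment estimates of (pi, lam) for sample size n."""
    rng = random.Random(seed)
    se_pi = se_lam = 0.0
    used = 0
    for _ in range(reps):
        est = zip_moment_estimates(sample_zip(n, pi, lam, rng))
        if est is None:
            continue  # skip degenerate all-zero samples
        pi_hat, lam_hat = est
        se_pi += (pi_hat - pi) ** 2
        se_lam += (lam_hat - lam) ** 2
        used += 1
    return se_pi / used, se_lam / used


if __name__ == "__main__":
    # Hypothetical "small, medium, large" sample sizes and parameter values.
    for n in (30, 100, 500):
        mse_pi, mse_lam = simulate_mse(n, pi=0.3, lam=2.0)
        print(f"n={n:4d}  MSE(pi)={mse_pi:.5f}  MSE(lam)={mse_lam:.5f}")
```

Repeating the same loop with other estimators (for example maximum likelihood) and other parameter values would give a comparison table of the kind the abstract describes, with MSE reported by method, parameter value, and sample size.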
From the simulation experiments, it was concluded that the method of moments is the best of the estimation methods used. The study was also applied to real data: daily counts of car accidents, arranged as seven daily readings per week over 51 weeks. All of the aforementioned methods were applied to the real data in the same way as in the simulation, so that the estimation methods could be compared on the same footing. The applied results also showed the superiority of the method of moments over the other methods, which is consistent with the simulation results and indicates the suitability of this estimation method for the zero-inflated regression model.
Among the most important conclusions, the results showed the superiority of the method of moments over the other methods in both the experimental and applied parts of the study.