The Comparison of the Computer Numbers in the Houses in Turkey with the Pooled Count Data Models

The influence of information technologies on our lives is steadily increasing as computers and the internet become part of every area of life. As technological developments benefit the informatics sector, computer use in households has become widespread. The objective of this study is to explain the factors affecting the number of computers owned by households using count data models. To this end, the Standard Poisson, Poisson Quasi-Maximum Likelihood and Negative Binomial regression models were compared to determine the most suitable method. The study uses data from the 'Household Budget Survey' conducted by the Turkish Statistical Institute (TUIK) between 2002 and 2010.


INTRODUCTION
With today's transition to the information society, the use of information and communication technologies in economic and social life has become widespread. The rapid development of these technologies since the beginning of the 1990s has increased consumer demand for computers, the internet, mobile phones, hand-held computers and other modern information and communication technology products.
The wealth of individuals, institutions and countries tends to be directly proportional to their level of access to information and communication technologies (Seferoğlu et al., 2008). Information and communication technologies, whose use is increasing day by day, considerably shape the economic behavior of all economic decision-making units. For this reason, it is important to identify why consumers prefer and demand the information and communication technologies they use, and to specify the relationship between these devices and socio-economic and demographic factors (Kaynak and Karaca, 2012).
The computer, which has an important share among information and communication technology products, was first used in Turkey by the General Directorate for Highways in 1960. The number of computers in society has gradually increased over time.
In the literature, few scientific studies have used econometric methods to identify the determinants of the number of computers in households or have focused on these numbers.
Existing research has been limited to statistical rates and comparisons. Within this scope, the study that surveyed the possession and use of information technologies throughout Turkey was the field research on "ability and use determination" conducted by TUBITAK BULTEN within the scope of the TUENA project in 1997 (Atilgan, 2003). In the following years, the "Information and Communication Technology Usage Survey in Household" began to be published in reports by TÜİK (Turkish Statistical Institute).

METHODOLOGY
Standard Poisson Regression Model:
The widely used Poisson regression method underlies the count data models, which are a special type of regression. The Poisson Regression Model (PRM) is generally accepted as a suitable approach for analyzing rare events that are counted over a certain period of time (Zou, 2003).
In the Poisson Regression Model, the dependent variable takes discrete count values rather than categorical ones, and the probability of each count is determined by the Poisson distribution. For the number of occurrences of an event, the Poisson probability function is:

$$\Pr(Y = y) = \frac{e^{-\mu}\mu^{y}}{y!}, \quad y = 0, 1, 2, \ldots \tag{1}$$

where µ is the mean number of occurrences per unit of time. The first two moments,

$$E[Y] = \mu, \qquad V[Y] = \mu \tag{2}$$

show the well-known equality of the mean and the variance for the Poisson distribution (Cameron and Trivedi, 2005). This property is known as equidispersion. Cases in which the count variable has a variance larger than its mean are referred to as overdispersion (Selim and Uçdoğruk, 2003). Cases in which the mean is larger than the variance are referred to as underdispersion. When the mean count is low, the Poisson distribution is skewed, and as the mean count increases, the Poisson distribution converges to the normal distribution (Osgood, 2000). An important assumption of the Poisson process is that events are independent of each other. In this context, no past event has an effect on any event likely to happen in the future.
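Equidispersion can be checked numerically. The following minimal Python sketch evaluates the Poisson probability function and sums over a truncated support to approximate the mean and the variance; the value µ = 0.2, the truncation point and the function names are illustrative assumptions, not quantities taken from the study.

```python
import math

def poisson_pmf(y, mu):
    """Pr(Y = y) = exp(-mu) * mu**y / y! for a Poisson variable with mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def poisson_moments(mu, y_max=100):
    """Approximate the mean and variance by summing over y = 0..y_max."""
    probs = [poisson_pmf(y, mu) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# Illustrative mean (an assumption for this sketch).
mean, var = poisson_moments(0.2)
print(mean, var)  # the two values agree up to floating-point error
```

For a low mean such as this one, most of the probability mass sits at zero, which is the skewness noted above.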
The Poisson Regression Model is derived from the Poisson distribution by expressing the relationship between the mean parameter µ and the covariates x through the parameters β (Cameron and Trivedi, 2001):

$$\mu = E[y \mid x] = \exp(x'\beta)$$

In the statistical literature, this model is known as the log-linear model, because the logarithm of the conditional mean is linear in the parameters: $\ln E[y \mid x] = x'\beta$ (Cameron and Trivedi, 1998).
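A log-linear model of this kind is usually estimated by maximizing the Poisson log-likelihood. The sketch below does so with Newton-Raphson iterations for an intercept and a single covariate. The data, the urban dummy and the function name `fit_poisson` are hypothetical, and this is not the estimation routine used in the study.

```python
import math

def fit_poisson(x, y, iterations=25):
    """Newton-Raphson for ln E[y|x] = b0 + b1*x; returns (b0, b1)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only solution
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector X'(y - mu)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix X' diag(mu) X = [[a, b], [b, c]]
        a = sum(mu)
        b = sum(mi * xi for xi, mi in zip(x, mu))
        c = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = a * c - b * b
        # Newton step: inverse information matrix times the score
        b0 += (c * g0 - b * g1) / det
        b1 += (a * g1 - b * g0) / det
    return b0, b1

# Hypothetical data: urban dummy and number of computers per household.
urban = [0, 0, 0, 0, 1, 1, 1, 1]
computers = [0, 0, 1, 0, 1, 0, 2, 1]
b0, b1 = fit_poisson(urban, computers)
print(math.exp(b0), math.exp(b0 + b1))  # fitted means for rural and urban households
```

With a single dummy regressor the fitted means reproduce the two group averages, which makes the result easy to verify by hand; exp(b1) is then the ratio of the urban mean to the rural mean.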
An important feature of the Poisson Regression Model is that, as long as the mean function is correctly specified, it yields consistent estimators of β even under over- or underdispersion. If the first two moments are correctly specified, the Poisson estimator is asymptotically efficient (Selim and Uçdoğruk, 2003). Moreover, even if the distribution of y is misspecified, the Quasi-Maximum Likelihood estimator can be used: even when the assumed distribution is wrong, Quasi-Maximum Likelihood estimates can still be used for statistical inference (Şahin, 2002).

The Poisson Quasi-Maximum Likelihood (PQML):
Quasi-Maximum Likelihood estimation is the estimation, by the Maximum Likelihood method, of a model whose true distribution is unknown or has been misspecified. Because econometric studies rely on the assumption that the specified density function is correct, the Quasi-Maximum Likelihood method is frequently used alongside the Maximum Likelihood method for comparison (Cameron and Trivedi, 2001).
Especially in models for count data, the Poisson Quasi-Maximum Likelihood function is chosen because it is easy to compute, is available in many software packages, gives satisfactory results, yields consistent estimators with asymptotically valid standard errors, is efficient under a more flexible variance assumption, and has strong robustness properties (Wooldridge, 1997).
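How quasi-likelihood standard errors differ from the usual Poisson ones can be seen in the simplest possible case. The sketch below assumes an intercept-only model E[y] = exp(b0) and compares the maximum likelihood standard error, which relies on variance equal to mean, with a robust (sandwich) standard error that does not. The data and the function name are hypothetical, and the intercept-only setting is a simplification of the models compared in the study.

```python
import math

def intercept_only_se(counts):
    """Return (ml_se, robust_se) for b0 in the model E[y] = exp(b0)."""
    n = len(counts)
    mu = sum(counts) / n                    # fitted mean, exp(b0_hat)
    a = n * mu                              # A: sum of mu_i (expected information)
    b = sum((y - mu) ** 2 for y in counts)  # B: sum of squared residuals (x_i = 1)
    ml_se = math.sqrt(1.0 / a)              # valid only if variance equals mean
    robust_se = math.sqrt(b) / a            # sandwich form sqrt(B) / A
    return ml_se, robust_se

# Hypothetical underdispersed counts (sample variance below the mean).
counts = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0]
ml_se, robust_se = intercept_only_se(counts)
print(ml_se, robust_se)
```

When the variance is smaller than the mean, as in this toy sample, the robust standard error is smaller than the maximum likelihood one; under overdispersion the ordering reverses.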
The Negative Binomial Model:
The Poisson and Negative Binomial distributions are defined by their variance functions; that is, they are characterized by the relationship between the variance and the mean of the distribution (Dauxois et al., 2006). The Poisson Regression Model, based on the Poisson distribution, is used to model the relationship between one or more independent variables and a dependent variable expressed as a count. The Negative Binomial Regression Model is used as an alternative to the Poisson Regression Model to obtain efficient parameter estimates in the presence of overdispersion.
The Poisson Regression Model is restrictive in several respects. The first restriction is that events occur independently over time. The second is the assumption that the conditional mean equals the conditional variance. This assumption is very hard to satisfy, and it fails in the presence of overdispersion. The Poisson assumptions are expected to hold in practice; when they do not, the model produces downward-biased, spuriously small estimated standard errors (Cameron and Trivedi, 1986).
The distribution of many count data sets is consistent with the Negative Binomial distribution. Moreover, the additional parameter in the variance makes the Negative Binomial Regression Model more flexible (Baier and Wernecke, 2003). In short, the Poisson and Negative Binomial Regression Models differ in their assumptions about the conditional mean and the conditional variance.
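The difference between the two variance assumptions can be written out explicitly. The block below uses the common quadratic (NB2) parameterization with dispersion parameter α; this parameterization and the symbol α are assumptions made for illustration, since the text does not state which form is used.

```latex
% Poisson: conditional variance equals the conditional mean
\operatorname{Var}(y \mid x) = \mu, \qquad \mu = \exp(x'\beta)

% Negative Binomial (NB2, assumed): one extra parameter \alpha \ge 0
\operatorname{Var}(y \mid x) = \mu + \alpha \mu^{2}

% As \alpha \to 0 the Poisson variance is recovered; for \alpha > 0
% the variance exceeds the mean
\alpha > 0 \;\Longrightarrow\; \operatorname{Var}(y \mid x) > E[y \mid x]
```

Because α cannot be negative in this form, the model can only represent a variance at or above the mean, never below it.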
Our first extension of the Poisson regression model is to add a parameter that lets the conditional variance of y exceed the conditional mean. This model is the Negative Binomial Regression Model (NBRM). Although the NBRM can be derived in several ways, the most common derivation is based on unobserved heterogeneity. In the PRM, for a given value of x, the conditional mean is known to be $\mu = \exp(x'\beta)$. In the NBRM, µ is replaced by the random variable $\tilde{\mu}$ (Long, 1997):

$$\tilde{\mu} = \exp(x'\beta + \varepsilon)$$

where ε is a random error assumed to be uncorrelated with x. ε captures the combined effect of unobserved variables omitted from the model. In the PRM, variation in µ is introduced through observed heterogeneity: different values of x result in different values of µ, but all individuals with the same x have the same µ. In the NBRM, variation in $\tilde{\mu}$ arises both from variation in x between individuals and from the unobserved heterogeneity represented by ε. For a given combination of values of the independent variables, there is a distribution of $\tilde{\mu}$ values rather than a single µ (Long, 1997). The NBRM can be estimated by maximum likelihood. Despite the computational difficulty, the log-likelihood function is maximized with numerical methods (Cameron and Trivedi, 1998).

Note to Table 1: H0: The data show a normal distribution; H1: The data do not show a normal distribution. Decision: at the α = 0.05 significance level, H0 is rejected based on the p-value of the combined K-S statistic.

Material and Method:
In this study, raw data from the Household Budget Surveys carried out by the Turkish Statistical Institute over the period 2002-2010 were pooled into a single data set in order to identify the determinants of the number of computers owned by households in Turkey; the analysis covers 96,750 people aged 15 and above.
The base category consists of males and females who are 15-29 years old, live in the countryside, work in agriculture, are illiterate, and belong to the low-expenditure and low-income group.
The explanatory variables used in the models for the number of computers in households are sex, age dummies, the educational status of males and females, whether the family lives in an urban area or in the countryside, the number of students in the household, year dummies, computer ownership, income and expenditure groups, and the occupational groups of males and females. For both males and females, the service occupational group comprises legislators, senior officials, managers, professionals, technicians and associate professionals, clerks and customer service workers, and service workers and shop and market sales workers, while the craftsmen-plant occupational group comprises craftsmen and workers in related jobs, and plant and machine operators and assemblers.
Because our dependent variable, the number of computers owned by the household, is a count, the study first presents frequencies and percentages; the most appropriate count data model is then determined by testing for dispersion, followed by the econometric interpretation. In terms of educational status, 4% of the males are illiterate and 7% are university graduates or above. Among the females, 20% are illiterate. For both males and females, primary school graduates form the largest group.

Subsidiary statistics:
While 24% of males are unemployed, the figure for females is 75%. Males mostly work in the service sector; 19% of males work in the craftsmen and plant group, 12% in the agricultural sector and about 8% as unskilled workers. As for the occupational distribution of females, 6% are in the service sector, 12% are in the agricultural sector and 3% work as unskilled workers. Females work least in the craftsmen and plant occupational group.
The total annual income and total expenditure of households were deflated using the Consumer Price Index with 2003 as the base year, and a logarithmic transformation was applied. A Kolmogorov-Smirnov test was applied to check whether the log-transformed total annual income and expenditure values conform to the normal distribution. The Kolmogorov-Smirnov test results are given in Table 1. Because the log-transformed total income and expenditure values did not follow a normal distribution, outlying observations were removed. In this way, data consistent with the normal distribution were obtained. The continuous log-transformed total income and expenditure variables were then converted into categorical variables by exploiting the properties of the normal distribution. The low, middle and high income and expenditure classes of the two categorical variables were obtained from the $\bar{x} \pm Z s$ formula using the statistics in Table 2. Assuming that the middle income and expenditure classes cover 68% of the data (i.e., Z = 1), the class intervals in Table 3 were formed.

Total household income and expenditure appear to be among the most important variables associated with an increase in the number of computers in the household. Among the income groups, about 11% of the sample are in the low income group and 76% are in the middle income group. Among the expenditure groups, the middle group has the highest share, while the low and high expenditure groups account for 13% and 14%, respectively. Those living in the countryside make up 32% of the sample, while those living in urban areas make up 67%. As for internet ownership, 11% have an internet connection and 88% do not. As for computer ownership, the largest group is the one with no computer; 19% have exactly one computer.
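The deflation, log transformation and three-way classification described above can be sketched in a few lines of Python. The income figures, the single price index value and the function name `classify` are hypothetical; the study's actual class limits come from its Table 2 statistics, which are not reproduced here.

```python
import math
import statistics

def classify(log_values, z=1.0):
    """Label each value low / middle / high using mean +/- z * sd as cut-offs."""
    mean = statistics.mean(log_values)
    sd = statistics.stdev(log_values)
    lower, upper = mean - z * sd, mean + z * sd
    labels = []
    for value in log_values:
        if value < lower:
            labels.append("low")
        elif value > upper:
            labels.append("high")
        else:
            labels.append("middle")
    return labels

# Hypothetical nominal annual incomes and a hypothetical price index
# expressed relative to the base year (2003 = 1.00).
nominal_income = [4000, 6500, 8000, 9000, 10000, 11000, 12500, 16000, 30000]
cpi = 1.25
log_real_income = [math.log(v / cpi) for v in nominal_income]

labels = classify(log_real_income)
print(labels)
```

With Z = 1 the middle class covers roughly 68% of a normally distributed variable, which is why the normality check comes before the classification.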

APPLICATION
To select the most suitable pooled count data model for the number of computers owned by households, the descriptive statistics of the dependent variable must first be interpreted. Zeros account for 77452/96570 = 0.802 (80.2%) of the observations of the dependent variable, the total number of computers in the household; its mean is 0.204 and its variance is 0.177. Determining which count data model is most suitable requires checking for overdispersion. A dispersion parameter less than 1 indicates underdispersion, while a value greater than 1 indicates overdispersion (Yeşilova et al., 2006). The dispersion parameter of the dependent variable is less than 1, which shows that there is underdispersion in the data set. The fact that the mean is larger than the variance also supports the presence of underdispersion. Accordingly, because the Negative Binomial Model is used only in the case of overdispersion, it is not a suitable model for our data set. Moreover, although the share of zeros in the data set is high, Zero-Inflated Regression Models cannot be used because of the underdispersion.
In line with the purpose of the study, the Standard Poisson estimates, which can be used under underdispersion, were compared with the Poisson Quasi-Maximum Likelihood (PQML) estimates (Selim and Uçdoğruk, 2003). The results are given in Table 4.
Although the parameter estimates of the Poisson Regression Model are consistent, its standard errors are higher than those of the PQML. Because the purpose of this study is to identify the determinants of the number of computers in households, marginal effects are not reported.
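The dispersion check can be reproduced from the reported summary statistics. The sketch below assumes that the dispersion measure is the simple variance-to-mean ratio, which the text does not state explicitly, and compares the observed share of zeros with the share a Poisson distribution with the same mean would imply.

```python
import math

# Summary statistics reported for the dependent variable.
zero_count = 77452
n_obs = 96570
mean = 0.204
variance = 0.177

zero_share = zero_count / n_obs        # observed share of zeros
dispersion = variance / mean           # assumed measure: variance-to-mean ratio
poisson_zero_share = math.exp(-mean)   # Pr(Y = 0) under a Poisson with this mean

print(round(zero_share, 3), round(dispersion, 3), round(poisson_zero_share, 3))
```

The ratio is about 0.87, below 1, and the observed share of zeros is slightly lower than the share implied by a Poisson distribution with the same mean, so under these assumptions the zeros alone do not point to zero inflation.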
When the PQML results given in Table 4 are examined, the number of computers in the household increases with the number of students. However, it is notable that the increase is smaller for the 60-and-above age group than for the other age groups, for both males and females. In other words, people in this age group use computers less and are less able to adapt to technology.
As for educational status, higher levels of education for both males and females are associated with more computers in the household relative to the illiterate.
Those in the service and craftsmen-plant sectors, the unemployed and unskilled workers have more computers than those working in the agricultural sector. The smallest increase is for the unemployed among females and for unskilled workers among males.
Those living in urban areas have more computers than those living in the countryside and, as expected, an internet connection at home increases the number of computers.
Regarding the differences between years, the increase in the parameter estimates from 2002 to 2010 is notable.

CONCLUSION
With information and communication technologies advancing day by day, interest in information and communication technology products is notable. Among these products, the importance of the computer is gradually increasing. Over the period from 2002 to 2010, individuals were observed to begin owning more than one computer, and identifying the determinants of the number of computers in households became the purpose of our study.
Consequently, the continuous development of technology and its falling cost over the last ten years have made the computer nearly an ordinary good and have made access to computers and technology easier. At the same time, as social networks came into common use, the computer became a personal device and an indispensable part of people's lives. As a result, the number of computers in households has increased.

Table 1 :
Kolmogorov-Smirnov test results of total income and total expenditure variables

When Table 1 is examined, males in the 15-29 age group make up 8%, while the 30-39 and 40-49 age groups each account for about 27%. Twenty percent are in the 50-59 age group and about 16% are in the 60-and-above age group. For the females, 16% are in the age group of 15-29 while the

Table 2 :
Descriptive statistics for total income and total expenditure variables

Table 3 :
Frequencies and percentages for all variables

Table 4 :
Estimation results for the standard Poisson regression and Poisson quasi-maximum likelihood models
Base category: males and females aged 15-29 who live in the countryside, work in the agricultural sector, are illiterate and are in the low income and expenditure group in 2002.