Investigation and Study on Students’ Online Shopping Consumption under the Background of Big Data

With the rapid development of the economy and the Internet, online shopping has become an indispensable part of people’s lives, and college students have become a main force in online shopping. Although online shopping has gradually matured, many problems remain that are worth discussing. Against the background of big data, and based on a questionnaire survey of the online shopping consumption of students at the International College of Zhengzhou University, this article uses basic statistical analysis methods, correspondence analysis, and logistic regression, carried out in SPSS and Excel, to analyze students’ online shopping consumption. It describes the current status of students’ online shopping, identifies differences in their online shopping behavior, and analyzes the psychological characteristics of online shopping consumption in order to guide students’ consumption concepts correctly. Logistic regression analysis is then used to identify the key factors that influence students’ online shopping frequency. Finally, based on these conclusions, corresponding countermeasures are proposed for schools, e-commerce businesses, and the government.


INTRODUCTION
With the rapid development of information technology and the continuous growth in the volume of data, the accumulated data holds different kinds of potential value. In 2011, McKinsey & Company proposed that the era of "big data" had arrived, and data has become indispensable in every industry. Mining big data will bring different benefits to various industries and enterprises [1]. In the book "The Era of Big Data", Viktor Mayer-Schönberger lists a number of application examples of big data and sets forth and predicts its future trends. Big data subverts traditional technology, its value is inestimable, and it has become a focus of attention for the media and for all walks of life.
Since its advent, the Internet has gradually become part of our lives and has increasingly affected China's economic development. According to the 36th survey report of the China Internet Network Information Center, the Chinese Internet is still developing steadily and rapidly: as of June 2015, the number of netizens in China had reached 668 million, and the Internet penetration rate was 48.8%. Netizens spend an average of 25.6 hours a week on the Internet. In terms of occupational structure, students account for 24.6% of netizens, which shows that students are a main force on the Internet. In terms of network applications, the Internet has made communication easier for everyone: the number of netizens using instant messaging has reached 606.26 million, a utilization rate of 90.8%. The Internet is also used to read news, search, play online games, and so on, and the number of online shopping users has reached 374 million. As China's economy changes from being driven by external demand to being driven by domestic demand, online shopping plays an important role in achieving consumption-driven economic growth [2].
3rd International Symposium on Big Data and Applied Statistics, Journal of Physics: Conference Series 1616 (2020) 012009, IOP Publishing, doi: 10.1088/1742-6596/1616/1/012009
College students are one of the main groups of Chinese netizens. They live in a special environment, are free to manage their own funds, and shop online more frequently than other groups. Their online consumption behavior reflects the consumption characteristics of online groups. With the rapid development of the Internet, the consumption concepts and consumption structure of college students have also changed, and various problems in online shopping have emerged. Analyzing these problems in depth and proposing countermeasures will be important if online shopping consumption is to drive the economy.
This paper investigates the online shopping consumption of students at the International College of Zhengzhou University in order to understand its current status, analyze differences in online shopping behavior, identify the factors that affect online shopping frequency, and propose corresponding solutions that promote the better development of e-commerce.

Survey target, survey unit and survey content
Considering the purpose of the survey and various practical factors, students from the first year to the fourth year of the International College of Zhengzhou University were chosen as the survey target. The survey unit is each student, from freshman to senior, at the International College of Zhengzhou University. The ratio of men to women in the sample was allocated to match that of the college, in order to make the investigation as objective and scientific as possible.

Investigation methods
This survey is mainly based on online questionnaires distributed through questionnaire websites and email. Online surveys make it more convenient to collect samples over a wide range, and they are faster and cheaper. Questionnaires were distributed to students of the International College of Zhengzhou University by random sampling. The survey lasted one month, from questionnaire design to collection and data entry. According to the 3:7 ratio of men to women in the college, 66 boys and 154 girls were randomly selected to fill out the questionnaire. A total of 220 questionnaires were distributed and 212 were recovered, an effective rate of 96.4%.
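The sample allocation and the effective rate reported for the survey follow from simple arithmetic. The sketch below is a minimal illustration, not the authors' procedure; the helper name `allocate_quota` is hypothetical, and it assumes the total divides evenly by the sum of the ratio weights.

```python
def allocate_quota(total, ratio):
    """Split `total` questionnaires across groups in proportion to `ratio`.

    Hypothetical helper for illustration. It uses integer division, so it
    assumes `total` divides evenly by the sum of the ratio weights.
    """
    weight_sum = sum(ratio.values())
    return {group: total * weight // weight_sum for group, weight in ratio.items()}


# Figures taken from the survey description: 220 questionnaires, 3:7 male-to-female.
quota = allocate_quota(220, {"male": 3, "female": 7})
effective_rate = 212 / 220  # questionnaires recovered / questionnaires distributed

print(quota)                    # {'male': 66, 'female': 154}
print(f"{effective_rate:.1%}")  # 96.4%
```

For totals that do not divide evenly, a rounding rule would have to be chosen; the text does not say which one was used.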

Design of the questionnaire
The subject of this survey is "Student Internet Consumption". The questionnaire is divided into five parts: (1) basic personal information of the respondent, seven questions in total; (2) the status of students' online shopping, measured in five aspects: online shopping frequency, products purchased online, reasons for online shopping, online shopping platforms, and payment methods; (3) factors affecting the frequency of students' online shopping, two questions in total; (4) students' online shopping psychology, five questions in total; (5) an open question on students' views on online shopping.

Data analysis methods
Descriptive analysis collects, collates, and computes summary indicators for all relevant units in the survey in order to describe the data. It describes the basic statistical characteristics of the population, and the following three basic statistical analysis methods are used in this article [4].
Frequency analysis and cross-tabulation: frequency analysis is used to understand the respondents' basic characteristics (such as gender ratio and grade ratio) and to grasp the distribution of the data. Based on the collected sample data, a two-dimensional or multi-dimensional contingency table is generated, and on this basis it is analyzed whether the two variables are associated. For example: the difference analysis of online shopping behavior of college students of different genders.
Multiple-response analysis: this is the statistical method for the multiple-choice questions of the questionnaire. The total frequency of each answer option is counted first, followed by frequency analysis (one variable) and cross-tabulation analysis (several variables). For example: the reasons why students choose online shopping, and their choice of online shopping platforms.
Chi-square test: the chi-square test is a non-parametric test that is mainly used to compare two or more sample rates (composition ratios) and to analyze the association between two categorical variables. The main idea is to compare the degree of agreement, or goodness of fit, between the theoretical frequencies and the observed frequencies [5].
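As an illustration of the chi-square test of independence, the sketch below applies Pearson's statistic to the gender by online shopping experience counts reported later in the paper (147 girls, all with experience; 65 boys, 2 of them without). It is a minimal hand-rolled version and not the SPSS procedure the authors ran; the function name is hypothetical, and using the uncorrected Pearson statistic is an assumption. For a 2x2 table the test has one degree of freedom, so the P value can be obtained from the complementary error function.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and P value for a 2x2 contingency table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Theoretical frequency under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: girls, boys. Columns: has online shopping experience, has none.
observed = [[147, 0], [63, 2]]
chi2, p_value = chi_square_2x2(observed)
print(round(chi2, 3), round(p_value, 3))  # approximately 4.566 0.033
```

The resulting P value is close to the 0.033 reported for the gender comparison. Because two of the expected counts are below 5, an exact test is often preferred for a table like this one.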
Correspondence analysis studies the association between two or more categorical variables. The basic idea is to take the contingency table of two variables as the object of study and to use "dimension reduction" to reveal graphically the connection between the categories of the variables [4]. For example: the relationship between living expenses and the range of average monthly online shopping consumption.
Logistic regression and multiple linear regression have many similarities; the biggest difference lies in their dependent variables. When the explained variable is a multi-category variable, multinomial logistic regression should be used. The basic idea of the multinomial logistic regression model is similar to that of binary logistic regression: its purpose is to analyze the comparison between each category of the explained variable and the reference category [4].

Analysis of the status quo of online shopping for students at the International College of Zhengzhou University
It can be seen from Table 1 that the majority of students at the International College of Zhengzhou University shop online 2-3 times a month, and only 12.38% shop online more than five times a month. This suggests that the students' online shopping consumption is in a healthy state and that they can control their desire to shop online. Among the products purchased online, clothing accounts for the largest proportion, 85.24%, followed by daily necessities, 72.38%. Because college students now live on campus, they need to buy all kinds of daily necessities themselves. The purchase of teaching aids online accounts for 40.95%, which shows that buying teaching aids online is an indispensable part of college life. Only a handful of college students buy digital products online. Digital products are luxury goods for college students, who must allocate their own living expenses, so few of them buy digital products online.

Difference Analysis of Students' Online Shopping Behaviors at the International College of Zhengzhou University
In this survey, students of the International College of Zhengzhou University were the survey objects. Of the 212 valid questionnaires recovered, only two selected the "no online shopping experience" option, accounting for 0.94% of the total; the other 210 students, or 99.06%, have online shopping experience. As shown in Table 3, 147 respondents are girls, accounting for 69.34% of the total, and 65 are boys, accounting for 30.66%. Both students without online shopping experience are boys. Female college students therefore account for a greater proportion of online shoppers than male college students. In the chi-square test results shown in Table 4, the P value is 0.033, which is less than the significance level of 0.05, so the null hypothesis is rejected: students of different genders differ significantly in online shopping experience.
Among the 210 students with online shopping experience, juniors account for the largest share, 41.51% of the total, followed by sophomores at 24.06%. Seniors have the lowest share, 16.98%. The investigation suggests the following explanations. First-year students have only just entered the university and are still adapting; for the first time, they are free to allocate their own living expenses, including to online shopping. Third-year students, who have lived at the university for two years, are well adapted to university life and have begun to need various textbooks and teaching materials, which increases their demand for online shopping. Second-year students have begun to study specialized subjects, so their study pressure is high and their time for online shopping is greatly reduced.
Fourth-year students have started attending various job fairs and spend much less time online than before. Table 6 shows the frequency distribution obtained by cross-tabulating grade and online shopping years. The test results in Table 5 show that the P value is 0.371, which is greater than the significance level of 0.05, meaning that there is no significant difference in online shopping frequency among the four grades.
Table 7 shows that, among the 210 students, 80 have living expenses of 500-1000 yuan, accounting for 38.09%, and 87 have living expenses of 1000-1500 yuan, accounting for 41.43%. The minorities with living expenses of 500 yuan or less and of more than 2000 yuan account for 3.8% and 3.33% respectively. It can be seen that the largest group of students has living expenses in the range of 1000-1500 yuan.
As can be seen from Table 4.8, 95 college students with online shopping experience spend an average of 100 to 300 yuan per month on online shopping, accounting for 44.81% of the total. The number of students who spend more than 500 yuan per month on online shopping is the smallest, only 13, or 6.13%. There are 64 students who spend less than 100 yuan per month on average, a proportion of 30.66%. A correspondence analysis of living expenses and average monthly online shopping consumption, shown in Figure 4.6, indicates that students with living expenses below 500 yuan or between 500 and 1000 yuan mostly spend 100 yuan or less per month on online shopping, while students with living expenses between 1000 and 1500 yuan mostly spend between 100 and 300 yuan. In general, the higher the living expenses, the higher the online shopping expenditure.

Analysis of influencing factors of online shopping consumption frequency of students of International College of Zhengzhou University
Based on the analysis of the collected data and on consideration of the actual situation, the following hypotheses are proposed about the factors that affect college students' online shopping consumption.
The influence of individual characteristics of college students on the frequency of online shopping: First, the gender of consumers is weakly correlated with the frequency of online purchases, and the types of goods they choose to buy are similar to those in traditional shopping. Second, the grade of consumers is strongly and negatively correlated with the frequency of online purchases: the higher the grade, the lower the frequency of online purchases. Third, consumers' living expenses are positively correlated with the frequency of online shopping: students with higher living expenses shop online more frequently.
The impact of commodity factors on the frequency of online shopping: First, the quality of goods purchased online has a positive impact on the frequency of online shopping. Second, the service attitude of online sellers has a positive impact on the frequency of online shopping.
The influence of related behavior variables on online shopping frequency: The average time consumers spend online is positively correlated with the frequency of online shopping.
To study the factors influencing online shopping among students of the International College of Zhengzhou University, that is, the factors that affect their online shopping frequency, this paper establishes a generalized Logit model. The purpose is to analyze the comparison between each category of the explained variable and the reference category. Formula (1) is as follows:
$\ln\left(\frac{P_j}{P_J}\right) = \beta_{0j} + \sum_{i=1}^{k} \beta_{ij} X_i$ (1)
Here P_j is the probability that the explained variable falls in the j-th category, P_J is the probability that it falls in the J-th category (j ≠ J), and the J-th category is the reference category; \beta_{0j} and \beta_{ij} denote the intercept and the coefficients of the j-th comparison, and X_i are the explanatory variables.
The dependent variable in this study has 4 levels. Level 1 (online shopping 1 time) is taken as the reference group, and each of the other levels is compared with it, giving 3 Logistic models: level 2 (online shopping 2-3 times) compared with level 1; level 3 (online shopping 4-5 times) compared with level 1; and level 4 (online shopping more than 5 times) compared with level 1, i.e.:
$\ln\left(\frac{P_j}{P_1}\right) = \beta_{0j} + \sum_{i=1}^{5} \beta_{ij} X_i, \quad j = 2, 3, 4$
After single-factor analysis of online shopping frequency against the research hypotheses, five variables were entered into the regression analysis: living expenses (X1), average online time (X2), online seller service attitude (X3), online shopping brand (X4), and online shopping product quality (X5).
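To make the generalized Logit comparison concrete, the sketch below converts the log-odds of each category against the reference category, ln(P_j / P_1), into category probabilities. The coefficient values and the two-predictor input are hypothetical and chosen only for illustration; they are not the estimates reported in the paper, and the function name is likewise an assumption.

```python
import math

# Hypothetical coefficients for illustration only; NOT the paper's estimates.
# Each non-reference category has an intercept and one slope per predictor.
COEFFICIENTS = {
    "2-3 times": {"intercept": 0.5, "slopes": [0.3, 0.2]},
    "4-5 times": {"intercept": -0.2, "slopes": [0.5, 0.4]},
    "more than 5 times": {"intercept": -1.0, "slopes": [0.8, 0.6]},
}
REFERENCE = "1 time"  # level 1 is the reference category


def category_probabilities(x, coefficients=COEFFICIENTS, reference=REFERENCE):
    """Return the probability of each category for predictor values `x`."""
    # Linear predictor ln(P_j / P_reference) for every non-reference category.
    log_odds = {
        category: c["intercept"] + sum(b * xi for b, xi in zip(c["slopes"], x))
        for category, c in coefficients.items()
    }
    # The reference category has log-odds 0, so its odds equal 1.
    denominator = 1.0 + sum(math.exp(v) for v in log_odds.values())
    probabilities = {reference: 1.0 / denominator}
    for category, value in log_odds.items():
        probabilities[category] = math.exp(value) / denominator
    return probabilities


probs = category_probabilities([1.0, 2.0])
for category, p in probs.items():
    print(f"{category}: {p:.3f}")
```

A positive coefficient in one of the three comparisons raises the odds of that frequency level relative to "online shopping 1 time", which is how the Wald tests for each variable are read.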
After multinomial logistic regression analysis, the model fitting information table (Table 10) and the likelihood ratio test table (Table 11) are obtained. From Table 10, the -2 log-likelihood of the null model is 243.971 and that of the current model is 170.620, so the likelihood ratio chi-square value is 73.351. The probability P value is 0.000, which is below the significance level of 0.05, so the null hypothesis of the significance test of the regression equation is rejected: the model is considered valid, and the linear relationship between the explanatory variables as a whole and the generalized Logit P is significant.
A likelihood ratio test was then performed on the contribution of each independent variable as the five variables entered the model step by step, giving the chi-square statistics ... .008.
The results of the regression model composed of the above five variables are shown in Table 12. From the Logistic regression results for students' online shopping frequency, the following can be seen. The P values of the Wald statistic for the living expenses variable (X1) are less than the significance level of 0.05 in all three online shopping frequency categories, indicating that the amount of college students' living expenses has a significant impact on online shopping frequency. Compared with the "online shopping 1 time" reference group, the higher a student's living expenses, the more times per month the student shops online.
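As a check on the model fitting information, the likelihood ratio chi-square is the difference between the two -2 log-likelihood values. The sketch below reproduces that arithmetic and evaluates an upper-tail probability. The degrees of freedom are not stated in the text, so `ASSUMED_DF` is a purely hypothetical value, and the closed-form tail function used here is valid only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even `df`.

    Uses the exact identity P(X > x) = exp(-x/2) * sum_{i < df/2} (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)**i / i! incrementally
        total += term
    return math.exp(-half) * total


neg2ll_null = 243.971   # -2 log-likelihood of the null model (from the text)
neg2ll_model = 170.620  # -2 log-likelihood of the current model (from the text)
lr_statistic = neg2ll_null - neg2ll_model

ASSUMED_DF = 30  # hypothetical: the degrees of freedom are not given in the text
p_value = chi2_sf_even_df(lr_statistic, ASSUMED_DF)

print(round(lr_statistic, 3))  # 73.351
print(p_value < 0.05)          # True
```

Under this assumed number of degrees of freedom the P value is far below 0.05, which is in line with the reported P value of 0.000.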
For the average daily online time variable (X2), the P value of the Wald statistic for an online time of 1 hour (X2 = 1) is less than the significance level of 0.05 in the "online shopping 2-3 times" category, indicating that one hour of online time per day has a significant impact on shopping online 2-3 times a month. The P value of the Wald statistic for an online time of 3-4 hours (X2 = 3) is less than the significance level of 0.05 in the "online shopping more than 5 times" category. It can thus be seen that, compared with "online shopping 1 time", the more time students spend online each day, the more often they shop online each month.
The P values of the Wald statistics for the online seller service attitude variable (X3) are all greater than the significance level of 0.05. This shows that the seller's service attitude has no significant effect on the frequency of online shopping.
For the online shopping brand variable (X4), the P values of the Wald statistic are less than the significance level of 0.05 in the "online shopping 2-3 times" and "online shopping more than 5 times" categories. Compared with "online shopping 1 time", the brand of goods therefore has a certain impact on college students' online shopping frequency, although the effect is not significant in every category. It can be seen that the higher the brand awareness of a product, the more college students trust the brand and the more frequently they purchase.
For the online shopping product quality variable (X5), the P values of the Wald statistic are greater than the significance level of 0.05, which means that, compared with "online shopping 1 time", the quality of goods has no significant impact on students' online shopping frequency.
In the likelihood ratio test, all P values are less than the significance level of 0.05, indicating that each variable adds explanatory power to the model: if any of these five variables were removed, the -2 log-likelihood value would change significantly.

Conclusion
First, students shop online 2-3 times a month on average, and the main types of goods purchased are clothing, daily necessities, girls' cosmetics, and teaching aids. The main reason the surveyed students give for shopping online is that it is convenient and can meet their needs, followed by affordable prices. At the same time, as online shopping has developed, problems have followed. According to the survey data, the problem of goods being out of stock is relatively serious.
Second, examining the differences in online shopping behavior in terms of gender, grade, and living expenses shows that students of different genders differ significantly in online shopping, with girls making up the larger proportion. Grade, however, has little effect on the frequency of online shopping. In addition, the higher students' living expenses, the more they spend on online shopping.
Third, the survey data show that most students pursue brands, advocate individuality, and are prone to impulsive consumption.
Fourth, logistic regression analysis found that the range of students' living expenses, the time spent online each day, and the brand of the goods affected students' online shopping frequency, whereas the seller's service attitude and the quality of the goods showed no significant effect. Therefore, limiting the time students spend online in the dormitory not only benefits their physical and mental health but also moderates their online shopping frequency and reduces careless spending. At the same time, increasing the popularity of commodity brands is an important way to attract student customers. Merchants should take corresponding countermeasures on this basis to promote the development of e-commerce.