Travel Behavior Analysis Using 2016 Qingdao’s Household Traffic Surveys and Baidu Electric Map API Data



Introduction
It is axiomatic that a model can never be better than the data from which it is estimated [1]. Household traffic surveys are mainly used in transport planning and urban planning. In the early days, household traffic surveys were customarily conducted by telephone or by face-to-face interviewing, the latter having become expensive and dangerous to accomplish in most urban areas of the North American continent [2]. Outside North America, face-to-face interviews are still done, although costs are high and there are threats to the safety of interviewers. The computer-assisted telephone interview (CATI) survey is presently the main method in some countries, especially in North America. However, the overall response rate is low. To overcome the shortfall in trips reported by CATI [3,4], some countries, such as the US and Switzerland, have tried conducting household traffic surveys using GPS location devices [5][6][7][8][9]. However, this approach also faces problems, for example: (1) high expense: GPS devices are fairly expensive, with passive devices capable of storing many days' worth of data costing on the order of US$750 each; (2) signal loss: serious degradation of the signal often happens in various circumstances, including tunnels, urban canyons, and heavy tree canopies, and in certain types of vehicles.
The past 20 years have seen a tremendous increase in internet use and computer-mediated communication. Research on online populations has led to an increase in the use of online surveys [10][11][12][13]. On the one hand, online surveys have advantages, including access to individuals in distant locations, the ability to reach difficult-to-contact participants, and the convenience of automated data collection, which reduces researcher time and effort [14]. On the other hand, their disadvantages are also obvious: uncertainty over the validity of the data, sampling issues, and concerns surrounding the design, implementation, and evaluation of an online survey [14].
To offset the deficiencies of any single survey method, several methods were combined to collect accurate traffic data in Qingdao's third traffic survey (2016). They include an online survey, face-to-face interviewing, a public transport survey, a traffic flow survey, an exit-entry survey, and other surveys (see Section 2).
Considering the disadvantages of household traffic survey data, in this paper we analyze travel behavior by trip purpose and travel mode (car and bus) by combining the 2016 household traffic survey data with Baidu electric map API data. The accuracy of the Baidu electric map API data is also validated against an actual taxi OD pair survey. This paper is organized as follows. The details of Qingdao's 2016 traffic survey are described in Section 2, and the Baidu electric map API data used in our research are presented in Section 3. In Section 4, the results for trip time and distance by purpose and mode are shown and analyzed in detail. Finally, we discuss our work in Section 5.

Household Traffic Survey Data in Qingdao
2.1. Necessity of the Third Traffic Survey. Qingdao conducted two household traffic surveys, in 2002 and 2010. The first survey, in 2002, established the basic patterns of residents' travel at that time and had an important effect on traffic network analysis. To keep pace with the city's rapid development and the needs of subway construction, the government conducted the second traffic survey in 2010, which played an important supporting role in the design of subway lines 1, 2, 3, 4, and 8. Since then, great changes have taken place in Qingdao's population size and structure, urban function and area, residents' occupational structure, trip structure, and car ownership, and the original survey data can no longer support resident trip analysis. In view of this, the government conducted the third traffic survey in 2016. Figures 1 and 2 give the average trips, average trip time, and average distance from 1992 to 2015, and Figure 3 shows the urban area expansion from 1992 to 2015.

Electric Map API Data
Although traditional traffic surveys can collect some of the data we need, most of them, especially online surveys and face-to-face interviews, acquire only the OD information, trip mode, and trip time. It is difficult to obtain trip route information, and the reported trip time is usually inaccurate. Fortunately, electric map APIs (e.g., Google Maps, Apple Maps, Baidu Maps, and AutoNavi Maps) can provide trip route and time information, which remedies this defect of traditional traffic surveys. In this paper, we use Baidu Maps to find the routes.
Baidu Maps supplies a Web API v2.0 service for developers, through which the route planning service can be requested over HTTP/HTTPS. Table 3 shows a data sample from Baidu Maps Web API v2.0.
As shown in Table 3, the API recommends 5 routes for one OD pair. For bus travelers, the returned data include the route length, travel time, initial walk distance, initial walk time, travel distance by bus, travel time by bus, arrival walk distance, arrival walk time, and so on.
However, the validity of the data from Baidu Maps Web API v2.0 is not known in advance. We therefore take taxis as the test object and conduct a taxi-following investigation. The accuracy of Baidu Maps Web API v2.0 is checked by comparing the recommended data with the data from the actual taxi investigation.
Table 4 gives the layout of the taxi-following questionnaire. As shown in Table 4, the questionnaire records the OD pair, departure time, arrival time, the intersections passed, and the road and traffic conditions.
Here, we randomly chose 300 OD pairs in Qingdao from all the taxi survey ODs. Figure 5 shows the recommended routes for the 300 OD pairs generated by Baidu Maps Web API v2.0.

Accuracy Analysis on Recommended Routes Generated by Baidu Maps Web API v2.0. The coincidence factor of intersections is used to verify the accuracy of the Web API. Specifically, the matching rate of the $i$th OD pair can be written as follows:

$$p_i = \frac{m_i}{n_i}, \quad i = 1, 2, \ldots, N. \tag{1}$$

In (1), $p_i$ is the matching rate of the $i$th OD pair; the larger $p_i$ is, the higher the accuracy of the Web API data. $m_i$ is the number of intersections shared by the Web API's recommended route and the actual taxi survey route of the $i$th OD pair, $n_i$ is the number of intersections on the actual taxi survey route of the $i$th OD pair, and $N$ is the total number of surveyed OD pairs. Accordingly, the average matching rate over all OD pairs can be written as follows:

$$\bar{p} = \frac{1}{N} \sum_{i=1}^{N} p_i. \tag{2}$$

According to the statistical analysis, the average matching rate over all OD pairs is 90.74%, a high ratio that reflects the high accuracy of Web API v2.0. However, not all OD pairs have high matching rates. Here we choose three kinds of OD pairs to analyze; they are given in Table 5. As shown in Table 5, for the OD pair MIXC and Shiyan community, the taxi survey route and Baidu's recommended route pass through the same intersections. The intersection overlap is 100%, which reflects the high accuracy of the Baidu API. The OD pair Taidong and Jinguihuayuan community gives a different result: three intersections differ, and the matching rate is 66.7%. The reason is explicable: the trip took place during the evening peak, and traffic congestion made the driver change the route. More surprisingly, the survey route and the recommended route barely overlap for the OD pair Wanda Plaza and Fuan community, where the matching rate is only 14.3%. According to our investigation, Yanji Road was under construction and repair on the survey day, so the taxi driver changed the trip route.
In short, although the intersection matching rates of some OD pairs are low, the causes are understandable. Unforeseen circumstances, such as bad weather, accidents, and congestion, can make drivers deviate from the recommended routes.
Overall, across the 300 OD pairs, the average matching rate is high (90.74%), so the recommended routes can be considered accurate and credible.
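The intersection matching rate and its average over OD pairs can be illustrated with a minimal Python sketch. The intersection names are hypothetical placeholders rather than survey data, and counting shared intersections as a set intersection is an assumption about how the coincidence number is obtained. The route sizes are chosen only to reproduce ratios of the kind reported in Table 5.

```python
def matching_rate(recommended, surveyed):
    """Share of the surveyed route's intersections that also lie on the recommended route."""
    surveyed_set = set(surveyed)
    if not surveyed_set:
        raise ValueError("surveyed route has no intersections")
    shared = set(recommended) & surveyed_set
    return len(shared) / len(surveyed_set)


# Hypothetical (recommended, surveyed) intersection lists for three OD pairs.
od_pairs = [
    # identical routes: 4 of 4 surveyed intersections shared
    (["A1", "A2", "A3", "A4"], ["A1", "A2", "A3", "A4"]),
    # 6 of 9 surveyed intersections shared
    ([f"B{k}" for k in range(1, 7)] + ["X1", "X2", "X3"],
     [f"B{k}" for k in range(1, 10)]),
    # 1 of 7 surveyed intersections shared
    (["C1", "Y1", "Y2", "Y3", "Y4"], [f"C{k}" for k in range(1, 8)]),
]

rates = [matching_rate(rec, sur) for rec, sur in od_pairs]
average_rate = sum(rates) / len(rates)

for rate in rates:
    print(f"{rate:.1%}")
print(f"average: {average_rate:.1%}")
```

Dividing by the number of intersections on the surveyed route, rather than on the recommended route, follows the definition in the text: the taxi's actual route is treated as the reference.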

Accuracy Analysis on Recommended Trip Time Generated by Baidu Maps Web API v2.0. To further verify the validity of the Baidu Maps Web API data, we also compare travel times. Here, two kinds of data are used to verify the accuracy of the recommended trip time generated by Baidu Maps Web API v2.0. One is the data from the 300 taxi OD surveys; as shown in Table 4, the trip time is also recorded. The ratio between the recommended trip time and the actual taxi survey time is used to verify the accuracy of the Baidu electric map API data. Specifically, the time matching rate of the $i$th OD pair can be written as follows:

$$r_i = \frac{t_i^{\mathrm{API}}}{t_i^{\mathrm{taxi}}}, \tag{3}$$

where $t_i^{\mathrm{API}}$ is the trip time recommended by the Baidu electric map API for the $i$th OD pair and $t_i^{\mathrm{taxi}}$ denotes the actual taxi survey trip time of the $i$th OD pair. The average time ratio over all $N$ OD pairs can be calculated as follows:

$$\bar{r} = \frac{1}{N} \sum_{i=1}^{N} r_i. \tag{4}$$

According to the statistical analysis, the average time ratio over all OD pairs is 78.16%, which is lower than the average route matching rate (90.74%). Unlike the trip route, trip time is highly random: it varies even for the same route at the same time on different days, or for the same route at different times on the same day. Therefore, we consider the ratio (78.16%) high enough to reflect the high accuracy of the Baidu electric map API data.
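As a small worked illustration of the time ratio described above, the following Python sketch computes the per-pair ratio of recommended to surveyed trip time and its average. The trip times are invented for illustration only and are not taken from the taxi survey.

```python
from statistics import mean


def time_ratio(api_minutes, taxi_minutes):
    """Ratio of the API's recommended trip time to the surveyed taxi trip time."""
    if taxi_minutes <= 0:
        raise ValueError("surveyed trip time must be positive")
    return api_minutes / taxi_minutes


# Hypothetical (recommended, surveyed) trip times in minutes for three OD pairs.
trips = [(18.0, 24.0), (12.0, 15.0), (30.0, 40.0)]

ratios = [time_ratio(api, taxi) for api, taxi in trips]
average_ratio = mean(ratios)
print(f"average time ratio: {average_ratio:.2%}")
```

Note that this ratio exceeds 100% whenever the surveyed trip is faster than the recommendation, so the average should be read together with the spread of the individual ratios.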

Data Fusion and Traveler Behavior Analysis in Time and Space
4.1. Data Fusion, Processing, and Curve Fitting. Trip surveys can obtain accurate OD pairs, trip modes, travel purposes, and so on. However, they cannot capture travel route information, and the reported travel time is inaccurate. Because the Web API provides a route planning service with high accuracy, this paper takes advantage of both kinds of data by integrating them and then analyzes traffic behavior based on the integrated data. Table 6 gives a sample of the integrated data. It includes the OD pair, the length of the route between the origin and destination, the origin coordinate, the destination coordinate, the intersections passed, trip purpose, trip mode, walk distance, and walk time.
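The integration step can be sketched as a join between survey records and route records. This is only a minimal illustration: the field names, the `trip_id` key, and all values are hypothetical, and the paper does not state how the two sources were actually matched.

```python
# Hypothetical survey records: OD coordinates, purpose, and mode come from the household survey.
survey_trips = [
    {"trip_id": 1, "purpose": "work", "mode": "car",
     "origin": (36.07, 120.38), "destination": (36.10, 120.42)},
    {"trip_id": 2, "purpose": "school", "mode": "bus",
     "origin": (36.08, 120.36), "destination": (36.09, 120.40)},
    {"trip_id": 3, "purpose": "shopping", "mode": "bus",
     "origin": (36.06, 120.35), "destination": (36.07, 120.39)},
]

# Hypothetical route records: length, time, and walking legs come from the map API.
api_routes = {
    1: {"route_length_km": 8.6, "travel_time_min": 24.0,
        "walk_distance_km": 0.0, "walk_time_min": 0.0},
    2: {"route_length_km": 6.1, "travel_time_min": 35.0,
        "walk_distance_km": 0.7, "walk_time_min": 9.0},
}


def fuse(trips, routes):
    """Attach route attributes to each survey trip; skip trips with no route."""
    fused = []
    for trip in trips:
        route = routes.get(trip["trip_id"])
        if route is None:
            continue  # no recommended route available for this trip
        fused.append({**trip, **route})
    return fused


fused_trips = fuse(survey_trips, api_routes)
for row in fused_trips:
    print(row["trip_id"], row["purpose"], row["mode"], row["route_length_km"])
```

Dropping trips without a route keeps the fused table complete; keeping them with empty route fields would be an equally reasonable choice.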
Matplotlib is used for plotting in Python; specifically, pylab and pyplot are the main processing tools. The hist function in pyplot is used to set the intervals and count the samples in each interval. Here, the data are divided into 150 intervals, chosen by considering the numerical span and the frequency statistics. Taking each interval's median value as the independent variable and its count as the dependent variable, the least-squares method is used for curve fitting. The mean and standard deviation of the travel time and distance are computed with pandas in Python.
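The binning and fitting procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: it uses `numpy.histogram` in place of `pyplot.hist` (both return the counts and the bin edges), fits a degree-4 polynomial because the functional form of the fitted curve is not given in the text, and runs on synthetic distances rather than the survey data.

```python
import numpy as np


def histogram_fit(values, bins=150, degree=4):
    """Bin the data, then least-squares fit a polynomial to (bin midpoint, count)."""
    counts, edges = np.histogram(values, bins=bins)
    midpoints = (edges[:-1] + edges[1:]) / 2.0  # middle of each interval
    coeffs = np.polyfit(midpoints, counts, deg=degree)  # least-squares coefficients
    return midpoints, counts, coeffs


# Synthetic trip distances in km (illustrative only, seeded for repeatability).
rng = np.random.default_rng(0)
distances_km = np.abs(rng.normal(loc=7.0, scale=3.0, size=5000))

midpoints, counts, coeffs = histogram_fit(distances_km)
fitted_counts = np.polyval(coeffs, midpoints)

# Sample mean and standard deviation; ddof=1 matches the default of pandas' std().
mean_km = distances_km.mean()
std_km = distances_km.std(ddof=1)
print(f"mean = {mean_km:.2f} km, std = {std_km:.2f} km")
```

With 150 bins and a few thousand samples, individual bin counts are noisy, which is one reason to summarize the histogram by a smooth fitted curve.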

Travel Behavior Analysis with Different Purposes. This section discusses travel behavior for 6 purposes (i.e., work, home, school, taking children to school, shopping, and company business) in terms of trip distance and time.
Figure 6 shows the trip distance distribution for the 6 purposes. As shown in Figure 6, the curves for all trip purposes are approximately normally distributed. For most purposes, trip distances are concentrated below 10 kilometers. This is related to the urban size and to the distribution of residences, companies, and schools. Figure 7 shows the distribution of schools, residential communities, companies, supermarkets, and shopping arcades in Qingdao. From Figure 7, we can see that, unlike in western developed countries, Qingdao's schools, residential communities, companies, supermarkets, and shopping arcades are concentrated in the urban area, which reflects the non-separation of work and residence that characterizes most Chinese cities. It is worth noting that students have the shortest travel distance; the average travel distance is 5.34 kilometers. For company business, the travel distance distribution is dispersed, and it has the longest average travel distance (13.37 kilometers), which may be related to the randomness of company business.
As shown in Figure 8, as with the travel distance distribution, going to school and taking children to school have the shortest average travel times, 22.54 minutes and 19.44 minutes, respectively. Because some parents drive their children to school, the average travel time of taking children to school is less than that of children going to school by themselves. However, travel time is not always directly proportional to travel distance. Taking company business and work as an example, although company business has the longest travel distance, work, rather than company business, has the longest travel time. For all purposes, the standard deviation of travel time is larger than that of travel distance. This may be due to the randomness of travel time. Unlike the travel route, travel time is more easily influenced by road and traffic conditions, such as signal control, traffic congestion, weather, accidents, and driving habits. Another interesting finding is that most trip purposes have two peaks, at travel times of approximately 10 minutes and 30 minutes.
Figures 9, 10, and 11 show the travel distance distribution by bus and car for different purposes. For bus travelers, the travel distance peaks for all purposes are below 5 km, and the average travel distance is 7.66 km. Compared with work, going home, and company business, the travel distance distributions for going to school, taking children to school, and shopping are more concentrated; their average travel distance is about 6.4 km. Company business still has the longest average travel distance, 9.51 km. Compared with bus travelers, car users travel longer distances; their average travel distance is 8.58 km. Work, going home, shopping, and company business trips by car are longer than those by bus. The most representative example is company business travel, at 13.65 km, about 4 km more than by bus, which is attributable to the high convenience and comfort of car travel.
Interestingly, for going to school and taking children to school, car travelers cover shorter distances than bus travelers. The average travel distance by car is only 4.82 km for going to school and 5.93 km for taking children to school, whereas the corresponding averages by bus are 6.08 km and 6.18 km, respectively. This is understandable. A car can provide door-to-door service, but a bus cannot. Bus travelers must first go to a bus stop and then travel by bus, and some bus lines do not follow the optimal route from home to school, whereas car travelers generally choose the optimal route. In addition, bus travelers must still get from the bus stop to the school. Thus, compared with car users, bus travelers cover a longer distance. Therefore, to alleviate congestion, traffic managers should improve bus service to curb the rapid growth of car demand. As shown in Figure 13, most bus travelers' travel durations are between 20 and 60 minutes, whereas most car travelers' travel durations are between 2 and 40 minutes.

Discussion
This paper integrates resident trip survey data with electric map API data. To verify the accuracy of the electric map API data, we take taxis as the test object and conduct a taxi-following investigation. The accuracy of the recommended routes and travel times is analyzed. According to the statistical analysis, the average route matching rate over all OD pairs is 90.74%, which reflects the high accuracy of the electric map API data.
Not all OD pairs have high matching rates. Unforeseen circumstances, such as bad weather, accidents, and congestion, can make drivers deviate from the recommended routes. According to the statistical analysis, the average trip time ratio over all OD pairs is 78.16%, which is lower than the average route matching rate (90.74%). Unlike the trip route, trip time is highly random: it varies even for the same route at the same time on different days, or for the same route at different times on the same day. Therefore, we consider the ratio (78.16%) high enough to reflect the high accuracy of the electric map API data.
Based on the fused data, travel behavior for different purposes and modes is analyzed. We found that, for most purposes, trip distances are concentrated below 10 kilometers. This is related to the urban size and to the distribution of residences, companies, and schools. Unlike in western developed countries, Qingdao's schools, residential communities, companies, supermarkets, and shopping arcades are concentrated in the urban area, which reflects the non-separation of work and residence that characterizes most Chinese cities. It is worth noting that students have the shortest travel distance; the average is 5.34 kilometers. The travel distance distribution of company business is dispersed, and it has the longest average travel distance (13.37 kilometers), which may be related to the randomness of company business. For all purposes, the standard deviation of travel time is greater than that of travel distance. This may be due to the randomness of travel time. Unlike the travel route, travel time is more easily influenced by road and traffic conditions, such as signal control, traffic congestion, weather, accidents, and driving habits.
For bus travelers, the travel distance peaks for all purposes are below 5 km, and the average travel distance is 7.66 km. Compared with work, going home, and company business, the travel distance distributions for going to school, taking children to school, and shopping are more concentrated. Car users travel longer distances than bus travelers, with an average travel distance of 8.58 km. Work, going home, shopping, and company business trips by car are longer than those by bus. The most representative example is company business travel, at 13.65 km by car, about 4 km more than by bus (9.51 km), which is attributable to the high convenience and comfort of car travel. Surprisingly, for going to school and taking children to school, car travelers cover shorter distances than bus travelers.
Unlike the travel distance distribution, the travel time distribution differs greatly between car and bus. Bus travelers spend more time than car travelers to complete their trips. This large difference in travel time induces more car travel demand, which increases traffic congestion. Therefore, to alleviate congestion, traffic managers should improve bus service to curb the rapid growth of car demand.

Figure 4: The main traffic survey point.

Figure 5: Recommendation routes of the 300 OD pairs.

Figure 6: Trip distance distribution with 6 travel purposes.

Figure 7: Distributions of residential communities, schools, offices, and shopping centers in Qingdao.

Figure 9: Car and bus travel distance distribution with different purposes.

Figure 10: Comparison of travel distance with different purpose by bus.

Figure 12: Car and bus travel time distribution with different purposes.

Figure 13: Comparison of bus travel time with different purposes.

Table 1: Traffic survey content.

Table 5: Matching rate of three kinds of OD pairs.

Table 6: Sample of the integrated data.