Ridesharing accessibility from the human eye: Spatial modeling of built environment with street-level images
Computers, Environment and Urban Systems

Scholarly interest in the accessibility of ridesharing services stems from debates within the transportation and planning communities on the inequality of access to transit and the growing digital divide embedded within novel forms of transit services. Contributing to such discussions, this paper considers the city of Atlanta as a case study and explores the links between the spatial disparity of accessibility to different Uber ridesharing products and features of the built environment extracted from Google Street View (GSV) imagery. The variability of wait time for an Uber service is used as a proxy of accessibility, while semantic image segmentation is performed on GSV imagery using the deep learning model DeepLabv3+ to identify notable spatial features captured at the eye-level perspective around service pick-up points. Results from spatial models show that proportions of built environment features such as buildings, vegetation, and terrain are associated with longer waiting times. In contrast, larger salient regions with foreground features are associated with shorter waiting times for several Uber service products.


Introduction
In the last decade, the sharing economy boom has ushered in technologies that have transformed multiple industries. Among these, ridesharing is an innovation in the Vehicle-for-Hire (VFH) market traditionally dominated by taxis. Ridesharing has introduced significant changes to the urban transportation system with the expectation of reducing the number of automobiles needed to satisfy existing travel demand by optimizing the match between riders and drivers. The practice of sharing rides can be traced back to as early as the carpooling clubs of WWII (Chan & Shaheen, 2012). The most recent technology-enabled ridesharing has proliferated since the late 1990s, when information and communication technologies, such as the Internet, mobile phones, and social networking, were integrated into automated ridesharing software (Chan & Shaheen, 2012).
Ridesharing is hailed by its proponents for its societal benefits, such as enabling better access to goods and opportunities by providing access to underutilized services (Cohen & Kietzmann, 2014; Hamari, Sjöklint, & Ukkonen, 2016), reduction of CO2 emissions and fossil fuel dependency (Global e-Sustainability Initiative, 2008), and mitigation of urban congestion. Despite these purported benefits, ridesharing, as led by transportation network companies, has been criticized for intensifying urban transport challenges (Diao, Kong, & Zhao, 2021). Critical questions surrounding accessibility and spatial equity arise in academic literature and urban policy debates to probe whether ridesharing services assuage or exacerbate existing spatial divides (e.g., Hughes & MacKenzie, 2016; Wang & Mu, 2018). Notably, urban scholars are interested in how the physical and socioeconomic characteristics of the built environment contribute to such issues.
Against this backdrop, this study investigates the relationship between ridesharing accessibility and the built environment. Uber, which debuted in 2009, is the most popular ridesharing mobile application today and now operates in more than 785 metropolitan areas across 85 countries (Uber Technologies Inc, 2021). In 2022, Uber still dominated the U.S. market, accounting for 72% of U.S. ridesharing spending (Bloomberg Second Measure, 2022). Considering Uber's popularity in adoption and high market penetration, this study takes Uber as an example of various ridesharing services and focuses on exploring Uber's accessibility. The built environment has long been studied as "an innate driver of traveler needs" (Cervero & Kockelman, 1997; Chen, Feng, Ding, Yu, & Yao, 2021), whose features were found to have profound impacts on human perception of cities (Lynch, 1964). Among the numerous data sources that describe the built environment, we use street-level images. Compared with a bird's-eye view, street-level images have the advantage of providing a wide range of street landscapes closer to a human's perspective from the ground. Such a ground view mimics the human-eye view of travelers from the street (Wang & Vermeulen, 2021). This perspective is well aligned with the objective of this study: to understand ridesharing accessibility from a user's perspective.
This research contributes to the literature in the following manner. First, the relationship between the built environment and ridesharing accessibility remains under-researched due to the relative newness of ridesharing data and the lack of tools to measure the built environment at fine granularity. Novel data sources such as street view imagery and innovation in computational methods such as computer vision and deep learning introduce the possibility of delineating the built environment in finer detail to assess its effects on accessibility at a non-aggregated level. Second, while many studies on ridesharing have been conducted at aggregated geographical units (Gerte, Konduri, & Eluru, 2018; Sabouri, Park, Smith, Tian, & Ewing, 2020; Yu & Peng, 2019; Yu & Peng, 2020; Bao, Liu, Yu, & Wu, 2017; Zhu et al., 2022), this study is performed at the individual point level, allowing for an understanding of the association between the built environment and ridesharing accessibility at a finer and more customized scale. Third, we expand on existing studies with Uber data by incorporating product differentiation of Uber services in our analysis. In assessing accessibility, we distinguish between UberX, UberXL, UberBlack, UberSELECT, and Uber SUV, accounting for the different products that may target users with different needs and preferences from a built environment perspective.
The remainder of this paper is organized as follows. Section 2 reviews the literature on ridesharing accessibility, the built environment, and novel computation techniques for measuring the built environment. Section 3 illustrates the data and methods. Section 4 reports the results and robustness checks. Section 5 discusses the impact of this research and concludes.

Built environment and accessibility
The built environment has long sustained interest among scholars in urban studies and other related social science disciplines. In his seminal work Image of the City, Kevin Lynch broke down the componentry of the built environment that influences the imageability and human experience of cities into paths, edges, districts, nodes, and landmarks (Lynch, 1964). Later, Cervero and Kockelman (1997) identified the three "D variables" (density, diversity, and design) to describe the characteristics of the built environment. More recently, destination accessibility and distance to transit were further highlighted as essential features of the built environment by Ewing and Cervero (2001). As the built environment's definition and scope continued to evolve over the past decades, a vast body of work began investigating the connection between the built environment, travel behavior, and accessibility. Research has shown that individuals' travel choices depend on their socioeconomic status and built environment characteristics (Ewing & Cervero, 2010).
The question of accessibility has attracted vast attention from scholars in transportation planning. Historically, accessibility is defined as "the opportunity which an individual or type of person at a given location possesses to take part in a particular activity or set of activities" (Hansen, 1959). Previous research has shown that higher accessibility is associated with areas of cities with higher density and large employment concentrations (Hanson & Schwab, 1987; Muraco, 1972). The concept of accessibility can be further distinguished into active accessibility and passive accessibility: the former refers to how easily subjects located in a given zone can conduct activities (individual-based), and the latter refers to how easily activities in a certain zone can be reached by users and services (location-based) (Cascetta, Cartenì, & Montanino, 2013). Beyond reflecting the spatial distribution of transit networks and opportunities, accessibility is also a manifestation of temporal development based on the time-geographic perspective (Miller, 2003; Weber & Kwan, 2003). There is a tradition of location analysis in accessibility studies to minimize the average access time or distance (Kwan, Murray, O'Kelly, & Tiefelsdorf, 2003). Compared with location-based measures, this time-based measure of accessibility has the benefit of more sensitively mirroring socio-demographic, economic, and cultural constraints (Miller, 2003). Travel time between work and home has been used in numerous studies as a proxy for accessibility to understand underlying inequality along racial, gender, and socioeconomic lines in urban areas (Tribby & Zandbergen, 2012; Preston & McLafferty, 1999).
Innovative ridesharing services employ algorithmic optimization to match supply and demand. A central question is whether such optimization alleviates or aggravates the existing spatial divide. As critics have specified, the current mainstream trend of smart city development and technology in urban space embodies a technocratic fantasy (Datta, 2015), one which is founded upon a neoliberal ethos (Kitchin, 2015) and which reinforces pre-existing biases and accelerates privatization (Benjamin, 2019). A systematic study of accessibility that includes physical features of the built environment is needed to understand the impact of ridesharing services in the urban realm and to determine appropriate policies to encourage or regulate current practices.
A handful of studies have examined the accessibility of ridesharing services. The accessibility of VFH services such as Uber and Lyft is significantly impacted by transportation infrastructure and socioeconomic characteristics (Jiang, Chen, Mislove, & Wilson, 2018), population density, urban land use intensity, and the availability of public transit (Wang & Mu, 2018). As far as the built environment is concerned, only density and land use have been considered in previous studies. This research uses street-level images to enhance the resolution of spatial and temporal measures when examining the interaction between the built environment and ridesharing accessibility. Compared with previous studies measuring the density and land use of the built environment from a bird's-eye view, street-level imagery can integrate individuals' perspectives into the measurement of the built environment, providing an objective way of measuring the built environment from a human-eye perspective. Moreover, with street-level images, this study can complement previous studies focusing on meso-scale built environment features and explore the impacts of micro-scale built environment features on ridesharing accessibility.

Measuring built environment with street view imagery
Before the arrival of AI-assisted methodologies, researchers relied on surveys, remote sensing images, and aggregated census data to quantify the built environment. The recent emergence of street view imagery (SVI) within the larger paradigm shifts of machine intelligence has provided a novel data source for researchers to read and understand the urban landscape from the street-level human-eye perspective (Anguelov et al., 2010; Badland, Opit, Witten, Kearns, & Mavoa, 2010; Ibrahim, Haworth, & Cheng, 2020). Seeing the city from the street level provides a more human-oriented horizontal reading of visual features of the built world that are not captured by bird's-eye-view data sources such as aerial or satellite imagery. Since its launch in 2007, Google Street View has become arguably the most widely used service for street-view image provision, offering omnidirectional and panoramic coverage of more than 90 countries.
Coupled with the availability of street-view images, the maturity of deep learning and computer vision technologies for image recognition in the past decade (LeCun, Bengio, & Hinton, 2015) has paved new avenues to understand the built environment through large-scale image data for the first time (Reichstein et al., 2019). Semantic segmentation using convolutional neural network models allows for accurate classification of individual attributes of the built environment, such as urban greenery, buildings, sky, and ground, at a pixel level (Chen, Papandreou, Kokkinos, Murphy, & Yuille, 2017; Chen, Papandreou, Schroff, & Adam, 2017; Long, Shelhamer, & Darrell, 2015; Badrinarayanan, Kendall, & Cipolla, 2017). Researchers have applied street-view images to a variety of applications and research problems, from evaluating property prices in real estate research (Wang, Hu, Tang, & Zhuo, 2020; Yang, Rong, Kang, Zhang, & Chegut, 2021), to assessing the level of walkability in a neighborhood in public health and socioeconomic studies (Koo, Guhathakurta, & Botchwey, 2022; Yin & Wang, 2016), to revisiting and measuring earlier concepts of urban perceptual qualities such as imageability and aesthetics (Ma et al., 2021).
This study will be the first to employ street-view images to quantify the built environment in the context of assessing its relationship with ridesharing accessibility. The use of street-view images will not only provide finer-resolution data but also offer a more human-oriented perspective. Research results will help shed light on potential spatial inequality relating to ridesharing services driven by market demands and potential solutions to urban transport problems.

Uber accessibility and street-level images
We measure Uber accessibility from a time perspective. Previous studies on ridesharing services in various contexts have used the waiting time in their exploration of ridesharing accessibility. For instance, Hughes and MacKenzie (2016) measured the relationship between Uber wait times and socioeconomic indicators in Greater Seattle. Previous studies in Atlanta have adopted the expectation and variability of Uber wait time as proxies to measure the spatial disparities of ridesharing accessibility (Wang & Mu, 2018) and the impacts of road network structure on Uber accessibility (Wang, Chen, Mu, & Zhang, 2020). Furthermore, Shokoohyar, Sobhani, and Ramezanpour Nargesi (2020) used the average wait time and standard error of wait time as a proxy to explore the determinants of Uber accessibility in Philadelphia. Building upon previous studies, this study measures the wait time for an Uber product service at a given location as a proxy for Uber accessibility. Taking the city of Atlanta as the study area, we employ a systematic sampling approach to capture at least one random sample point in each neighborhood and at least one random sample point every two square miles. For each of the 152 random sampling locations, we used the Uber Developers API to collect the wait times for all available Uber products roughly every 30 minutes for a month, resulting in a total of over 360,000 data points. The Uber Developers API provides estimations of pick-up waiting times based on the GPS data from their rides. Although Uber claims that these estimates are very close to the actual data, they can still vary according to real-world situations. These estimations from the Uber Developers API have been widely used and validated in existing studies as a good proxy for investigating ride-sourcing networks (e.g., Hughes & MacKenzie, 2016; Wang & Mu, 2018).
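As a rough illustration of how repeated wait-time estimates can be reduced to an expectation and a variability measure per location and product, the following minimal sketch groups hypothetical records and computes the mean and the sample standard deviation. The record layout, the function name `summarize_wait_times`, and the choice of the standard deviation as the variability measure are assumptions made for illustration; they are not taken from the study's own pipeline.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_wait_times(records):
    """Group wait-time estimates by (location, product) and return the
    mean and sample standard deviation of each group."""
    groups = defaultdict(list)
    for location_id, product, wait_seconds in records:
        groups[(location_id, product)].append(wait_seconds)

    summary = {}
    for key, waits in groups.items():
        # stdev needs at least two observations; report 0.0 otherwise.
        spread = stdev(waits) if len(waits) > 1 else 0.0
        summary[key] = {"mean": mean(waits), "sd": spread, "n": len(waits)}
    return summary


# Hypothetical records: (location id, product name, estimated wait in seconds).
records = [
    (1, "UberX", 240), (1, "UberX", 300), (1, "UberX", 360),
    (1, "UberBlack", 600),
    (2, "UberX", 180), (2, "UberX", 420),
]
summary = summarize_wait_times(records)
print(summary[(1, "UberX")])
```

With roughly one estimate every 30 minutes over a month, each location and product pair would contribute on the order of a thousand observations to its group.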
For each sample site, we acquired its corresponding street-level images from Google's Street View Static API. Specifically, we collected eight images for each point to ensure complete coverage of the built environment information, with headings of 0, 45, 90, 135, 180, 225, 270, and 315 degrees. Additionally, we set the pitch parameter to 0 (i.e., the camera was held horizontal relative to the vehicle) and the field of view to 90 degrees. Finally, we obtained each image at 640 pixels by 640 pixels (the highest resolution Google allows for non-premium users). Fig. 1 shows the locations of all 152 random sampling points along with their corresponding average wait time for UberX in this study.
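The eight requests per sample point can be sketched as follows. The endpoint and the `size`, `location`, `heading`, `pitch`, `fov`, and `key` parameters belong to the Street View Static API; the helper name `build_streetview_urls`, the example coordinates, and the placeholder key are hypothetical and only serve the illustration. The sketch builds the request URLs and does not download anything.

```python
from urllib.parse import urlencode

STREETVIEW_ENDPOINT = "https://maps.googleapis.com/maps/api/streetview"
HEADINGS = [0, 45, 90, 135, 180, 225, 270, 315]  # degrees, as in the text


def build_streetview_urls(lat, lng, api_key, size="640x640", pitch=0, fov=90):
    """Return one Street View Static API URL per compass heading."""
    urls = []
    for heading in HEADINGS:
        params = {
            "size": size,                 # image width x height in pixels
            "location": f"{lat},{lng}",   # sample point coordinates
            "heading": heading,           # camera compass direction
            "pitch": pitch,               # 0 keeps the camera horizontal
            "fov": fov,                   # horizontal field of view
            "key": api_key,
        }
        urls.append(f"{STREETVIEW_ENDPOINT}?{urlencode(params)}")
    return urls


# Hypothetical sample point in Atlanta and a placeholder key.
urls = build_streetview_urls(33.7490, -84.3880, api_key="YOUR_API_KEY")
for url in urls[:2]:
    print(url)
```

Because each image spans 90 degrees and headings are 45 degrees apart, adjacent images overlap, which is one way to make sure no part of the surroundings is missed.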

Image semantic segmentation with DeepLabv3+
Semantic segmentation is a fundamental task in computer vision, aiming to assign semantic labels to every pixel in an image (Everingham et al., 2015; Mottaghi et al., 2014). Earlier successful semantic segmentation systems relied on hand-crafted features combined with flat classifiers, such as support vector machines (Fulkerson, Vedaldi, & Soatto, 2009), random forests (Shotton, Johnson, & Cipolla, 2008), and boosting (Shotton, Winn, Rother, & Criminisi, 2009). However, substantial improvements in the performance of these systems were constrained by the limited expressive ability of hand-crafted features. More recently, deep convolutional neural networks have emerged as a transformative force for advancing the performance of semantic segmentation systems, aided by their excellent feature learning ability. Deep convolutional neural network-based semantic segmentation approaches can be categorized into three groups. The first approach typically employs a cascade of bottom-up image segmentation, followed by a deep convolutional neural network used for region classification (Girshick, Donahue, Darrell, & Malik, 2014). The second approach relies on using the features extracted from the deep convolutional neural network for dense image labeling and couples those features with independently obtained segmentations (Farabet, Couprie, Najman, & LeCun, 2012). The third approach directly predicts dense category-level pixel labels through a deep convolutional neural network, which only involves pixel-wise classification (Long et al., 2015). These segmentation-free approaches directly apply a deep convolutional neural network to the entire image in a fully convolutional fashion and have become the mainstream application of deep convolutional neural networks to semantic segmentation systems.
DeepLabv3+ is a recently proposed semantic segmentation model, which extends DeepLabv3 by employing an encoder-decoder structure to gradually capture the semantic information and high-level features. DeepLabv3+ is capable of extracting multiscalar semantic information in a computationally efficient way because of its atrous spatial pyramid pooling approach (see Chen, Papandreou, Kokkinos, Murphy, & Yuille, 2017 for details of the algorithm). Because of the features above, DeepLabv3+ has achieved excellent performance on multiple datasets (Chen, Zhu, Papandreou, Schroff, & Adam, 2018). More recently, it has been applied to analyze street-level images in empirical urban research (e.g., Kim, Lee, Hipp, & Ki, 2021; Wang & Vermeulen, 2021).
In this work, we employed the DeepLabv3+ model to segment Google Street View (GSV) images. Specifically, we used the Xception model as the network backbone of DeepLabv3+ and trained the DeepLabv3+ model on the Cityscapes dataset (Cordts et al., 2016), which is a large-scale dataset containing high-quality pixel-level annotations of 5,000 images and about 20,000 coarsely annotated images. Then, we applied the trained DeepLabv3+ model to segment all Google Street View images into 19 categories, namely bicycle, building, bus, car, fence, motorcycle, person, pole, rider, road, sidewalk, sky, terrain, traffic light, traffic sign, train, truck, vegetation, and wall. Fig. 2 provides an example of this study's image segmentation results via DeepLabv3+. We calculated the percentage of each category within each image, and categories covering at least 1% of the area of a panorama were considered in the subsequent analysis. For example, in Fig. 2, roads occupy the most extensive area (26%), followed by vegetation (21%) and buildings (18%). As eight GSV panoramas were collected for each sampling point, the average proportion of each category among the eight panoramas was taken. Finally, we chose the six most common categories (i.e., road, building, wall, vegetation, terrain, and car) for the following empirical analysis (Section 3.4).
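The step from segmented images to point-level category proportions can be sketched as below. The sketch takes label masks as nested lists of category names, computes per-image pixel shares, applies the 1% cut-off, and averages over the images of one sampling point. Treating shares under the cut-off as zero before averaging is an assumption, since the text does not say how such categories enter the average; the function names and the toy masks are likewise hypothetical.

```python
from collections import Counter


def category_shares(label_mask):
    """Return {label: share of pixels} for one segmented image."""
    counts = Counter(label for row in label_mask for label in row)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def point_level_shares(masks, min_share=0.01):
    """Average per-image shares over all images of one sampling point.

    Assumption: a category covering less than min_share of an image
    contributes zero for that image.
    """
    totals = Counter()
    for mask in masks:
        for label, share in category_shares(mask).items():
            if share >= min_share:
                totals[label] += share
    return {label: total / len(masks) for label, total in totals.items()}


# Two toy 2 x 5 label masks standing in for the eight 640 x 640 images.
mask_a = [["road"] * 5,
          ["road", "vegetation", "vegetation", "building", "sky"]]
mask_b = [["road"] * 5,
          ["building"] * 5]
shares = point_level_shares([mask_a, mask_b])
print(shares)
```

In practice the masks would be integer arrays produced by the segmentation model, but the counting logic is the same.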

Additional variables from computer vision
Although there is no consensus on how to define a salient region, the term usually refers to areas of an image with semantic content. In a GSV panorama, a salient region can be conceptualized as foreground features (e.g., buildings, cars, trees) as opposed to background features (e.g., open space, sky). We extracted additional information from the salient region analysis of the corresponding GSV panoramas for each sample point. For each GSV panorama, we applied Gaussian filtering and identified the salient region via Otsu's thresholding method (Otsu, 1979). Otsu's thresholding method has been widely applied for its mathematical simplicity and computational efficiency. Fig. 3 shows an example of the salient region of a GSV panorama. We calculated the salient region ratio, ranging from 0 to 1, as the ratio between the area of salient regions and the area of the entire GSV panorama. The higher the value, the larger the proportion of foreground features in the corresponding GSV panorama.
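The thresholding step can be illustrated with a small pure-Python version of Otsu's method, which picks the grey level that maximizes the between-class variance of the two resulting pixel groups. This is a sketch under stated assumptions: the input is a flat list of 8-bit grey values that has already been smoothed, and pixels brighter than the threshold are counted as salient. The text does not specify which side of the threshold is treated as foreground, so that polarity, like the function names, is hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance
    of the classes {value <= t} and {value > t}."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue            # no pixels in the lower class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break               # all pixels are in the lower class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def salient_region_ratio(pixels):
    """Share of pixels above the Otsu threshold (assumed to be foreground)."""
    t = otsu_threshold(pixels)
    return sum(1 for value in pixels if value > t) / len(pixels)


# Toy image: 75 dark pixels and 25 bright pixels.
pixels = [30] * 75 + [200] * 25
threshold = otsu_threshold(pixels)
ratio = salient_region_ratio(pixels)
print(threshold, ratio)
```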
Similarly, the major colors (i.e., colors occupying at least 1% of the corresponding image) of all GSV images were identified via the Colorgram package (https://github.com/obskyr/colorgram.py). Colorgram.py is a Python version of the JavaScript library colorgram.js, which often produces better results compared to alternative color extraction libraries. We then counted the number of major colors for each image. As eight GSV panoramas were taken at every single sample point, we calculated the average value from the salient region analysis and the color analysis, resulting in the variables of salient region ratio (Srr) and number of colors (NumColor), the latter serving as an indicator of the heterogeneity of the street view, for the following spatial modeling.

Spatial modeling
To understand the association between the built environment from street-level images and ridesharing accessibility from Uber wait times, we build upon previous studies (e.g., Hughes & MacKenzie, 2016; Wang & Mu, 2018; Sabouri et al., 2020) and add neighborhood-scale population density (PopDen) and road network density (RoadDen) as neighborhood-level control variables. Therefore, our model specification can be conceptually illustrated as follows, with the six major categories derived from GSV images (Road to Car) plus four additional variables introduced in sections 3.3 (Srr, NumColor) and 3.4 (PopDen, RoadDen):
$$\text{Uber accessibility} = f(\text{Road}, \text{Building}, \text{Wall}, \text{Vegetation}, \text{Terrain}, \text{Car}, \text{Srr}, \text{NumColor}, \text{PopDen}, \text{RoadDen}) \tag{1}$$
To select the appropriate regression model for the current study, we performed a spatial autocorrelation analysis to estimate whether Uber accessibility is related to geographical positions. Global Moran's I was estimated to identify whether Uber accessibility at a location is influenced by neighboring locations. We constructed the spatial weight matrix by assessing the spatial autocorrelation within the spatial context of a fixed number of close neighbors.
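Global Moran's I under k-nearest-neighbor weights can be sketched in a few lines. The example below builds row-standardized weights from planar coordinates and evaluates the statistic on two toy series, one with a smooth spatial trend and one that alternates between neighbors. Row standardization, Euclidean distance on planar coordinates, and the function names are assumptions for illustration; the study itself uses four nearest neighbors, whereas the toy data use k = 2.

```python
import math


def knn_weights(coords, k):
    """Row-standardised k-nearest-neighbour weights as a list of dicts,
    one dict {neighbour index: weight} per observation."""
    weights = []
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        weights.append({j: 1.0 / k for j in neighbours})
    return weights


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(w for row in weights for w in row.values())
    cross = sum(
        w * dev[i] * dev[j]
        for i, row in enumerate(weights)
        for j, w in row.items()
    )
    return (n / s0) * cross / sum(d * d for d in dev)


# Six toy points on a line.
coords = [(float(x), 0.0) for x in range(6)]
w = knn_weights(coords, k=2)
trend_i = morans_i([1, 2, 3, 4, 5, 6], w)          # similar neighbours
alternating_i = morans_i([1, -1, 1, -1, 1, -1], w)  # dissimilar neighbours
print(trend_i, alternating_i)
```

A positive value indicates that nearby locations tend to have similar values and a negative value indicates the opposite. Significance testing, for example by permutation, is omitted from this sketch.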
The neighbor relationships identified in the spatial weight matrices ensure that every target case is connected to a certain number of neighboring cases, even when the density of cases' spatial locations varies across the study area. The four nearest neighbors of each target case were used in the computations, as they displayed a better performance than the alternative spatial weighting methods that we evaluated (see robustness tests in Section 4.3 for details). Table 1 shows the values of Moran's I for Uber accessibility of different Uber car types under the spatial weight of four nearest neighbors. The results suggest the existence of spatial autocorrelation for the accessibility of all Uber car types, which lays the foundation for applying spatial regression models. While Uber accessibility is spatially autocorrelated, the independent variables, such as the street landscape of a specific observation, tend to be more independent and primarily determined by the local physical built environment. The Spatial Lag Model (SLM) controls for spatial autocorrelation in the dependent variable only, that is, the spatial dependence effects between observations lie in the dependent variable of Uber accessibility. We performed two Lagrange Multiplier (LM) tests to evaluate the spatial dependence effects according to the spatial model selection procedure (Anselin, Florax, & Rey, 2013) and to select an adequate model to analyze the association between street landscape and Uber accessibility of all Uber car types. The Lagrange Multiplier diagnostics for the spatial error model (LMerr) and for the spatial lag model (LMlag) were both significant, suggesting that a spatial model is needed. Hence, the robust versions of LMerr and LMlag were applied, and only the Robust Lagrange Multiplier diagnostic for the spatial lag model (RLMlag) was significant (Table 1).
This empirical result confirmed that a spatial lag model would be more appropriate for our analysis than a spatial error model. Therefore, this study uses spatial lag models to analyze the impact of the street landscape on Uber accessibility. The equation of SLM is displayed as follows:
$$Y = \rho W Y + X\beta + \varepsilon \tag{2}$$
where Y is the dependent variable of Uber accessibility; ρ is the spatial regression coefficient of the endogenous interaction effects (represented by WY) between the sample observations; W is the weight matrix describing the spatial location relationships of the sample observations; X represents the explanatory variables such as the street landscape, population density, road density, and their interaction terms; β is a vector of corresponding coefficients; and ε refers to the error term.

Results
We built our spatial lag regression based on our model specification using the dependent variable UberX accessibility (Avg_UberX) to investigate the association between street landscape and UberX accessibility as measured by the average wait time (Eq. 1 and Eq. 2). To further explore the potential moderation effects of road network density on the relationship between the significant street landscape variables and Uber accessibility, we added the corresponding interaction terms to the preceding model. Table 2 presents the final regression results, including the interaction terms for UberX accessibility. A significant likelihood ratio (LR) test suggested that this model with a spatial lag term performs significantly better than traditional ordinary least squares (OLS) regression. Additionally, the LM test for residual autocorrelation is not significant, suggesting that the spatial autocorrelation has been accounted for by adding the spatial lag term. Generally, there was a positive spatial autocorrelation (Rho > 0) in our regression analysis of the association between street landscape and UberX accessibility. All explanatory variables are significant in the model and supported by the Wald statistics. A high value of the Nagelkerke pseudo-R² (0.763) suggests that this model fits the data well.
Because of the spatial lag term, the coefficient in a spatial lag model represents only the short-term direct impact of an independent variable on the dependent variable. Therefore, the total effect (see LeSage and Pace (2009) for details) was computed to incorporate the direct and indirect effects of spatial dependence on UberX accessibility from neighbors. Both control variables, population density (PopDen) and road network density (RoadDen), were found to exert a highly significant total impact on the accessibility of UberX. Concerning street landscape features, the street-level greenness (Vegetation) measured at the observation points plays a critical role in UberX accessibility. When controlling for other covariates, the regression reveals that a higher share of vegetation (predominantly big trees) in the street view is associated with a longer waiting time for UberX. The positive association between the proportion of terrain (primarily urban parks and grassland) in the street landscape and the average UberX waiting time confirms the importance of urban green spaces in streets for UberX accessibility. One possible explanation is that observation points with large portions of vegetation or terrain in the street-level images may be located in suburban areas with fewer Uber services available. Another interpretation could be that places with more vegetation or terrain in the street landscape can be less connected to road networks, costing extra time for UberX drivers to reach. Additionally, Uber tends to serve more business-concentrated areas that are often located in concrete jungles. The proportions of vegetation and terrain showed a much larger total effect (0.843 and 0.628, respectively) on UberX accessibility than the other explanatory variables.
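The distinction between a coefficient and its total effect can be made explicit with a short derivation. It is a sketch following the general approach of LeSage and Pace (2009), writing the spatial lag model as $Y = \rho W Y + X\beta + \varepsilon$ and assuming a row-standardized weight matrix $W$ and $|\rho| < 1$; the text does not state which normalization was used, so the closed form for the average total effect is an assumption tied to that choice.

```latex
% Reduced form of the spatial lag model
Y = (I_n - \rho W)^{-1} (X\beta + \varepsilon)

% Effect of explanatory variable x_k on all observations
\frac{\partial Y}{\partial x_k^{\top}} = (I_n - \rho W)^{-1} \beta_k

% Average direct effect: mean of the diagonal elements
\bar{M}_{\text{direct}}(k) = \frac{1}{n} \operatorname{tr}\!\left[ (I_n - \rho W)^{-1} \right] \beta_k

% Average total effect: mean row sum; with row-standardized W,
% (I_n - \rho W)^{-1} \iota_n = (1 - \rho)^{-1} \iota_n, so
\bar{M}_{\text{total}}(k) = \frac{1}{n} \iota_n^{\top} (I_n - \rho W)^{-1} \iota_n \, \beta_k = \frac{\beta_k}{1 - \rho}

% Average indirect (spillover) effect
\bar{M}_{\text{indirect}}(k) = \bar{M}_{\text{total}}(k) - \bar{M}_{\text{direct}}(k)
```

With a positive $\rho$, the total effect is therefore larger in magnitude than the coefficient itself, because a change at one location feeds back through its neighbors.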
There are other important indicators of UberX accessibility. Our results suggest that the salient region ratio (Srr) in the street views is associated with better accessibility of UberX. A one percent increase in the salient region ratio is associated with an approximately 30% decrease in the average waiting time for UberX. Salient regions refer to foreground scenes, such as a landmark, which are more eye-catching. In contrast, more buildings in the street views are found to correspond to an increase in the average waiting time of UberX, although the association is significant only at the 0.1 level (p = 0.073). One interpretation is that more buildings in the street views of an observation point can be accompanied by crowdedness and narrower streets, which could be an obstacle to the accessibility of UberX.

Moderation effect of road network density
Our exploration of the moderating effects of road network density on street landscape variables provides new insight into the relationship between street landscape and Uber accessibility. The interaction term RoadDen * Srr is found to be significantly and positively correlated with the average waiting time of UberX, opposite to the coefficient direction of Srr on UberX accessibility, which suggests that road network density counteracts the relationship between Srr and UberX accessibility. In other words, when the proportion of foreground features in street views increases, an observation point with dense road networks will experience a smaller decrease in the average waiting time of UberX than an observation point with sparse road networks.
Additionally, road network density plays a significant role in mitigating the impact of the proportion of terrain on UberX accessibility. For example, consider two observation points located in dense road networks (A) and sparse road networks (B), respectively. When the proportion of terrain increases in the surroundings, A will experience a smaller increase in the average waiting time of UberX than B. Nevertheless, we did not find a similar moderating effect of road network density on the relationship between the proportion of vegetation or buildings and UberX accessibility. As road network density has both direct and moderation effects on UberX accessibility, the road networks, in a broad sense, may have more profound and nuanced implications for ridesharing accessibility issues.
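The moderation effects described above can be read from the marginal effect of a street landscape variable when an interaction with road network density is present. The notation below is illustrative and not taken from the paper's tables: $\beta_{\text{Srr}}$ and $\beta_{\text{Terrain}}$ denote main-effect coefficients and $\gamma$ denotes an interaction coefficient. The sign of the terrain interaction is inferred from the description of its mitigating role rather than quoted, and the spatial multiplier is left aside for readability.

```latex
% Wait time with interaction terms (other covariates collected in "...")
Y = \dots + \beta_{\text{Srr}}\,\text{Srr} + \gamma_{\text{Srr}}\,(\text{RoadDen} \times \text{Srr})
      + \beta_{\text{Terrain}}\,\text{Terrain} + \gamma_{\text{Terrain}}\,(\text{RoadDen} \times \text{Terrain}) + \dots

% Marginal effects conditional on road network density
\frac{\partial Y}{\partial\,\text{Srr}} = \beta_{\text{Srr}} + \gamma_{\text{Srr}}\,\text{RoadDen},
\qquad \beta_{\text{Srr}} < 0,\; \gamma_{\text{Srr}} > 0

\frac{\partial Y}{\partial\,\text{Terrain}} = \beta_{\text{Terrain}} + \gamma_{\text{Terrain}}\,\text{RoadDen},
\qquad \beta_{\text{Terrain}} > 0,\; \gamma_{\text{Terrain}} < 0 \ \text{(inferred)}
```

In both cases the interaction coefficient has the opposite sign to the main effect, so a higher road network density pulls the marginal effect toward zero: the wait-time reduction associated with Srr becomes smaller, and so does the wait-time increase associated with terrain.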

Robustness check
A series of robustness tests was conducted with alternative spatial weighting methods, and their performance in the model diagnostics was compared. For the k-nearest-neighbor weighting method, we chose k = 3 and k = 5 to compare with our modeling results when k = 4. Additionally, distance-band weight methods were applied with three distance bands (i.e., 2.60 km, 2.75 km, and 3.00 km). Table 3 displays the regression results with different spatial weight matrices. All the models fit the data well. Regarding individual explanatory variables, the evaluation results on the relationship between built environment features and UberX accessibility are consistent under different spatial weight matrices.
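To make the contrast between the two weighting schemes concrete, the sketch below builds distance-band neighbor lists for the three band widths named above. Unlike k nearest neighbors, a distance band does not guarantee every point a neighbor, and the neighbor sets change with the band width. The coordinates are toy values in projected kilometres, and the function name and the use of Euclidean distance are assumptions for illustration.

```python
import math


def distance_band_neighbours(coords_km, band_km):
    """Return {i: [j, ...]} with all points j within band_km of point i."""
    neighbours = {i: [] for i in range(len(coords_km))}
    for i, (xi, yi) in enumerate(coords_km):
        for j, (xj, yj) in enumerate(coords_km):
            if i != j and math.hypot(xi - xj, yi - yj) <= band_km:
                neighbours[i].append(j)
    return neighbours


# Toy sample points in projected kilometres (hypothetical).
coords_km = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.7), (6.0, 6.0)]

results = {}
for band in (2.60, 2.75, 3.00):
    nb = distance_band_neighbours(coords_km, band)
    isolated = [i for i, js in nb.items() if not js]
    results[band] = nb
    print(band, nb, "isolated:", isolated)
```

In this toy layout the third point gains a neighbor only once the band reaches 2.75 km, while the fourth point stays isolated under all three bands.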

Built environment features in the close proximity of pick-up points
Our results show that several built environment features from GSV panoramas are significantly correlated with UberX accessibility. To the best of our knowledge, this is the first empirical work investigating the relationship between built environment features and ridesharing accessibility at a very fine scale and from a pedestrian perspective, i.e., the 360-degree view at the pick-up points. We tackle the research question with a more human-oriented data source rather than traditional bird's-eye view images. Literature using traditional land-use datasets has confirmed the influence of the built environment on Uber accessibility. For example, Uber has been suggested to be more accessible in denser areas and areas with higher road network densities (Jiang et al., 2018). Additionally, Shokoohyar, Sobhani, and Ramezanpour Nargesi (2020) reported that Uber accessibility is not associated with public facilities such as police departments and health centers. This study with street-level imagery, therefore, complements the existing findings on spatial determinants of Uber accessibility by showing that the location where travelers request a ridesharing service matters at a fine scale. First, we found that the proportion of foreground features (i.e., the salient region) is positively related to UberX accessibility, meaning that the larger the salient regions within close proximity to a pick-up point, the shorter the expected waiting time for an UberX to pick up a customer. Second, the proportions of buildings, vegetation, and terrain in the surroundings of the pick-up points are positively correlated with UberX waiting time (i.e., lower UberX accessibility).
Although SVI is limited in that it is collected from vehicles, whose perspective might not always be consistent with the viewpoint of pedestrians, street-level images still have an advantage over the many land-use datasets that use a bird's-eye view: they mimic the situation in which a customer requests an Uber and can present most of the actual built environment from a first-person view. When everything else is held constant, pick-up points surrounded by more buildings or tall trees (i.e., vegetation), or located in the middle of urban parks or grasslands (i.e., terrain), will tend to have a longer waiting time for Uber. Pinning a pick-up location with certain built environment features may therefore reduce the wait time for ridesharing services.
These findings on the street-level built environment from street view images are generally aligned with previous studies based on taxi data or travel surveys. For example, Guo and Karimi (2017) focused on spatial-temporal inflow and outflow mobility patterns across different spatial areas within New York City and found that the spatial environment caused such spatiotemporal patterns of taxi trips. More recently, Zhang, Wu, Zhu, and Liu (2019) investigated taxi trips' pick-up and drop-off locations and revealed that street-level images could explain up to 66.5% of the hourly variation in taxi trips in Beijing. It is also worth mentioning that we did not find a significant correlation between the presence of cars and the accessibility of UberX, which resonates with the study by Goel et al. (2018) across 34 cities in Great Britain: the number of cars in Google Street View images was not correlated with car commuters.
3 Empirical analysis shows that some points would become islands when the distance band was set below 2.60 km.
Nevertheless, our empirical results show that the number of major colors in the surrounding areas, i.e., the heterogeneity of the street view, does not significantly correlate with UberX accessibility. The colorscape of streets reflects built environment aesthetics (Stamps, 2013) and is linked to the level of mixed land use (Wang & Vermeulen, 2021). While Sabouri et al. (2020) reported that Uber demand was positively correlated with land use entropy, this study did not find a corresponding relationship from a colorscape perspective.
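The feature proportions discussed in this section can be read as pixel shares of a segmented street view image. The sketch below is an illustration under stated assumptions, not the study's pipeline: it assumes the segmentation output is a 2D grid of Cityscapes train IDs (road = 0, building = 2, vegetation = 8, terrain = 9, car = 13), and the function name and toy label map are hypothetical.

```python
from collections import Counter

# Cityscapes train IDs for the classes discussed in this section (assumed label encoding).
CITYSCAPES_TRAIN_IDS = {"road": 0, "building": 2, "vegetation": 8, "terrain": 9, "car": 13}

def class_proportions(label_map, class_ids=CITYSCAPES_TRAIN_IDS):
    """Share of pixels per class in a 2D label map (list of rows of integer class IDs)."""
    counts = Counter(pixel for row in label_map for pixel in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("label_map is empty")
    return {name: counts.get(cid, 0) / total for name, cid in class_ids.items()}

# Toy 4 x 5 label map standing in for a segmented panorama (10 = sky, not reported here).
toy_label_map = [
    [10, 10, 8, 8, 2],
    [10, 8, 8, 2, 2],
    [9, 9, 0, 0, 13],
    [9, 0, 0, 0, 0],
]

props = class_proportions(toy_label_map)
for name, share in props.items():
    print(f"{name}: {share:.2f}")
```

Because the pre-trained label set is fixed, all building types fall under the single building ID, so a per-subtype proportion would require a model trained on finer categories.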

Built environment features in the neighborhood scale of pick-up points
Previous studies of the relationship between the built environment and ridesharing accessibility have primarily focused on built environment characteristics at the neighborhood scale (e.g., Hughes & MacKenzie, 2016; Sabouri et al., 2020; Wang & Mu, 2018) and found that population density and road network density play important roles in ridesharing accessibility. This study confirms such findings by including the population density and road network density of the neighborhoods where the pick-up points are located. Pick-up points in neighborhoods with higher population densities are correlated with higher UberX accessibility (i.e., less waiting time), confirming Hughes and MacKenzie's (2016) finding in Seattle. While the proportion of roads in the surrounding area of a pick-up point is not significantly associated with UberX accessibility, the density of the road network in the neighborhood of a pick-up point is significantly correlated with higher UberX accessibility, consistent with a previous study in Philadelphia (Shokoohyar, Sobhani, & Ramezanpour Nargesi, 2020).
In addition to supporting the direct positive impact of road network density on ridesharing accessibility reported in previous studies, this study reveals that road network density at the neighborhood scale also moderates the effects of built environment features at the scale of the pick-up point. First, while higher proportions of both vegetation and terrain are associated with a longer waiting time for UberX, road network density mitigates the effect of the proportion of terrain but not that of vegetation. Put differently, when comparing the waiting time for UberX at two pick-up points with high road network density, the pick-up point with a high share of terrain (urban parks, grassland) will have a shorter waiting time than the point with a high share of vegetation (big trees). Similarly, road network density negatively moderates the role of the salient region ratio in UberX accessibility. When two pick-up points attract equal human attention in their surrounding areas, the one located in an area with higher road network density will have a longer waiting time for UberX. Further studies are needed to disentangle the nexus between road network structures and salient regions in an urban setting.

Heterogeneous levels of accessibility across different Uber products
As mentioned by Uber Inc., there are different Uber products across the market to meet the needs of different travelers and customers. At the time of the data collection, there were five different Uber services. In addition to UberX, which represents the low-cost Uber product that has been widely used in urban research, other available products include UberXL, UberBLACK, UberSELECT, and UberSUV. These products vary in the number of seats, base rate, cost per minute, and cost per mile, along with the minimum payment and cancelation fee. We re-ran our main model, replacing the dependent variable with the wait time of each of the remaining four Uber products. Similar to their roles in UberX accessibility, the proportions of buildings, vegetation, and terrain in close proximity to pick-up points are significantly and positively correlated with the waiting time for all four remaining Uber products. However, the salient region ratio was only significant for the accessibility of UberXL and UberSELECT. Previous studies have found that although people with a higher willingness to pay to reduce their travel time use ridesharing more often in general (Alemi, Circella, Mokhtarian, & Handy, 2019), UberX resembles public transit systems (e.g., Wang & Mu, 2018; Deka & Fei, 2019; Jin, Kong, & Sui, 2019), suggesting that more socioeconomic variables are associated with its service availability and accessibility.
The primary difference between UberX and UberXL is their capacity of four and six passengers, respectively. Therefore, neighborhood-level built environment features, i.e., population density and road network density, contribute similarly to the accessibility of both services. UberBLACK and UberSUV have the highest costs among the five Uber services, and built environment features at the neighborhood scale play no significant role in their accessibility. Finally, UberSELECT is a mid-range service with the same capacity as UberX but about twice the price. Consequently, we found that some neighborhood-scale built environment features matter in predicting its accessibility (i.e., population density), while others do not hold significance (i.e., road network density).
Table 3 Robustness check with different spatial matrices (Y = Avg_UberX).

Limitations and future work
There are several caveats and limitations in the current study, which cast light on future work. First, we employed a pre-trained DeepLabv3+ model for segmentation. Although the model was well-trained on the Cityscapes dataset with street-level images of urban scenes, we had little control over the categories of built environment features included in the initial analysis. For example, there are different types of buildings, such as single-family houses, rowhouses, duplex flats, and commercial complexes. Our approach has consolidated these nuances into a single category: building. Future work can investigate the subcategories of built environment features from street-level images. Second, this study used the estimated Uber waiting time from the Uber Developer API to measure Uber accessibility. Although the companies claim that such estimates represent the actual data with high probability, we should be aware that these estimated waiting times might be used by the companies to attract and retain customers. Future work could consider using the actual waiting time to more accurately model the relationship between the built environment from the user perspective and ridesharing accessibility. A comparison between the actual and estimated waiting times can provide empirical support for the validity of applying datasets from the Uber Developer API or from other ride-sourcing service companies in research on ride-sourcing platforms. Third, this study has only provided a snapshot of the relationship between built environment features and ridesharing accessibility due to the current data availability of both Uber wait time and GSV. Future work can apply alternative sources of ridesharing data (e.g., Lyft) and street-level imagery (e.g., Mapillary and StreetSide) to provide a more complete view of this relationship. Likewise, we could only infer the correlations between built environment features and ridesharing accessibility.
In the future, additional studies can be conducted with quasi-experimental designs to reveal the causal relationship between the two. Moreover, this study uses Uber as an example to explore the impacts of the built environment on ridesharing accessibility. However, there are other ride-sourcing companies, such as Lyft (a major competitor of Uber in the US). There have been studies comparing Uber and Lyft in terms of their pick-up waiting time, trip duration, and the associated influential factors, including trip characteristics and weather conditions. Future studies should consider exploring how the relationships between the built environment, as measured with SVI, and ridesharing accessibility differ between Uber and Lyft.
The outbreak of the COVID-19 pandemic has had profound impacts on the sharing economy industries and has brought both challenges (such as for car-sharing) and opportunities (such as for dockless shared bikes/e-bikes) to sustainable shared mobility (Shokouhyar, Shokoohyar, Sobhani, & Gorizi, 2021). On the one hand, built environment features are associated with COVID-19 cases and deaths (e.g., Li, Peng, He, Wang, & Feng, 2021). On the other hand, policies such as lockdowns and the desire to avoid physical contact between individuals in several types of shared mobility have severely restricted individuals' usage of ridesharing services. Concerns about social distancing have also contributed to changes in urban mobility patterns because of changes in individuals' demands for social participation and activity. Although previous studies have acknowledged the importance of the built environment in sustainable transportation (e.g., Shokoohyar, Jafari Gorizi, Ghomi, Liang, & Kim, 2022), it is worth exploring whether built environment features still play an important role in the usage and accessibility of ridesharing services during and after the pandemic era.

Data availability
Data will be made available on request.