Modeling Strategic Interventions to Increase Attendance at Youth Community Centers

Abstract: Community centers play a crucial role in urban environments, providing physical and educational services to their surrounding communities, particularly for students. Benefits for students include enhanced academic outcomes, fewer behavioral problems, and increased school attendance. Such centers are particularly vital for low-income and racial minority students, as they are pivotal in giving them outside-of-school learning opportunities. However, the determinants of attendance at community centers remain largely unexplored. The novelty of our research lies in combining census data, Boston Centers for Youth and Families (BCYF) attendance data, and specific center attributes to develop human mobility gravitational models that are used, for the first time, to predict attendance across the BCYF network. Using those models, we simulated the potential effects on general and student attendance of changing center attributes, such as facilities and operating hours. We also studied the impact of changing the walking accessibility of those centers on their attendance patterns. We found that the most cost-effective policy to increase BCYF attendance is changing each center's educational and recreational offerings, which has a far greater effect than any accessibility intervention. Our results provide insights into potential policy changes that could optimize the attendance and reach of BCYF community centers among under-served populations.


Introduction
In numerous urban environments, from metropolitan areas to large towns, community centers are a key part of society [1,2]. Often publicly funded and run by the government, these centers all share one common goal: to serve the community [3]. They do this in various ways, and each community center, or network thereof, is different. Nevertheless, these centers operate on tight budgets and must allocate them carefully when deciding their offerings, such as classes, facilities, or opening hours, so that they reach as many people as possible within their communities. This creates a strong need for tools that help policymakers understand why people go to certain community centers. With this knowledge, community centers will be able to direct their resources in a way that maximizes attendance.
In general, policymakers and community center directors lack comprehensive tools or models that measure how demand depends on the accessibility and attributes of community centers. However, accessibility to amenities has been heavily studied over the past 50 years for other types of infrastructure, namely commercial or retail infrastructure, using gravitational models, accessibility to transportation, and Agent-Based Modeling [4][5][6][7][8].
For a long time, gravitational models, such as the Huff model [4], have been the baseline for measuring attendance at a facility. They combine attractiveness and distance as the main factors influencing the likelihood of the public visiting a facility. Although they have mainly been used in retail settings, gravitational models have been refined over time to account for many factors influencing consumer attendance and are thus well suited to our study.
Another important factor when considering the likelihood that an individual visits any kind of facility, whether public or commercial, is the accessibility of transportation to said facility. Sevtsuk et al. [5] as well as Wang et al. [9] have shown how a facility's popularity is directly impacted by how difficult it is to reach, which relates to our work with community centers, as public transportation and pedestrian accessibility are important to enabling access.
More recent advancements in the research of public service facility accessibility include taking into consideration spatial accessibility and equity [10][11][12], oftentimes working with Geographic Information System (GIS) data in order to simulate urban mobility to public or commercial facilities.
Agent-Based Modeling (ABM) has also been used to understand these interactions in urban settings, allowing researchers to directly simulate mobility to a community center and measure how both accessibility and feature changes can impact a community's behavior. Doorley et al. [7] and Antonelli [8] have shown how effective ABM is in predicting behavioral responses to infrastructure changes.
Nevertheless, these methodologies have never been employed in the context of community centers. This research attempts to fill this gap by developing and applying advanced human mobility gravitational models to predict attendance at Boston Centers for Youth and Families (BCYF).
This study is the first to apply enhanced gravitational Huff models to community centers, using unique datasets that include census data and specific center attributes. Our approach provides a novel analytical framework for understanding and predicting attendance patterns, which has not been explored in the literature. This extension is crucial, as it adapts well-established models to a new and under-served domain, which is key to fostering more equitable urban communities. Our research allows center directors and policymakers to make better, data-driven decisions to benefit their respective infrastructures.

Methodology
We focus on the Boston Centers for Youth and Families (BCYF) [3] (see Figure 1). Our collaboration with BCYF directors provided access to key data, enabling an in-depth analysis of the BCYF system, its utilization patterns, and the diverse factors influencing engagement within the urban demographic [13,14]. Following established methodologies identified in our literature review [4,5,15] and using a unique dataset of community center attendance across the BCYF network, we developed enhanced gravitational Huff models that account for transportation mode. We built two models: one represents the total population, and the other represents the population aged 5 through 17. Both models use the same equations; the only difference is that they were trained with different eligible populations and thus provide guidance for actions targeted at each population. We separate the two populations because they behave differently. For example, the weekly schedule of someone aged 5 through 17 is very different from that of an adult, as one follows the academic day whereas the other follows the work day. Furthermore, adults have much more independence when it comes to mobility and are more likely to drive than to take public transportation [16], which creates an important difference in mobility behavior. The accuracy of our models allows us to study potential interventions and offer insights into better, data-driven decisions for policymakers or center directors seeking to increase attendance at BCYF community centers.

Data
We use a variety of data to analyze attendance patterns at the BCYF community centers. Our research utilized anonymous data from visits to each community center, including variables such as age, class, encrypted name, and contact ID. To calculate the number of unique visitors at each center, we identified distinct encrypted names and contact IDs. The data were anonymized using the SHA-512 algorithm [17]. For consistency across all data, we limited our time range to 1 January 2022 to 31 May 2023. We also accessed a dataset of visit data for the BCYF Quincy Community Center alone, which includes the same variables plus a zip code and let us fit our probability equations for distance.
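The unique-visitor count described above can be sketched as follows. This is a minimal illustration, assuming each visit record carries a center name, a visitor name, and a contact ID; the record layout, the sample values, and the helper names `anonymize` and `unique_visitors` are hypothetical and not taken from the BCYF data pipeline.

```python
import hashlib


def anonymize(value: str) -> str:
    """Return the SHA-512 hex digest of a string (128 hex characters)."""
    return hashlib.sha512(value.encode("utf-8")).hexdigest()


# Hypothetical visit records: (center, visitor name, contact ID).
visits = [
    ("Center A", "Ana Diaz", "c-001"),
    ("Center A", "Ana Diaz", "c-001"),  # repeat visit by the same person
    ("Center A", "Ben Lee", "c-002"),
    ("Center B", "Ana Diaz", "c-001"),
]


def unique_visitors(records):
    """Count distinct (hashed name, contact ID) pairs per center."""
    seen = {}
    for center, name, contact_id in records:
        seen.setdefault(center, set()).add((anonymize(name), contact_id))
    return {center: len(people) for center, people in seen.items()}


counts = unique_visitors(visits)
```

Because the hash is deterministic, repeat visits by the same person collapse to one entry per center, while the plain-text name never needs to be stored.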
Furthermore, we gathered the different attributes of each center [3], as shown in Table 1. We used these data to optimize the attractiveness function as well as to inform our interventions. We also used geographic and demographic data from the U.S. Census. This included, for each Census Block Group (CBG) [18], the number of current residents, the age distribution (specifically the total and 5-17-year-old populations), and socioeconomic data on the number of people below the poverty line.

Models
We developed two models: one encompassing the entire available population and another focusing on individuals aged 5-17. Our methodology, akin to the Huff model, employs a gravitational approach. In this framework, the likelihood of an individual from CBG $c$ visiting community center $z$ is inversely related to the distance between $c$ and $z$ and directly related to the attractiveness of $z$:

$$P_{cz} \propto \frac{s_z}{f(d_{cz})}, \qquad (1)$$

where $f(d)$ is a function that grows with distance $d$, and $s_z$ measures the attractiveness of community center $z$. Traditional forms of $f(d)$ are power law or exponential functions, and they vary depending on the transportation modal choice [15].
To determine the functional forms of these modal choice functions, we used two datasets. The first used attendance data from a specific community center, the BCYF Quincy Community Center. These data contained $N_{cz}$, the number of visitors to the center from each zip code in the Boston area. Using a regression for the probability $P_{cz} = N_{cz}/E_c$ (where $E_c$ is the population in each CBG $c$), we found that a power law fit of the form $N_{cz} \sim d/(2 + d^{\gamma})$ describes the data accurately for large distances, with exponent $\gamma = 2.47 \pm 0.35$ and a Root Mean Square Error (RMSE) of 0.288.
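To make the fitting step concrete, the sketch below estimates the exponent of the decay form d/(2 + d^γ) by minimizing the RMSE. It is an illustration under stated assumptions: the grid search and the synthetic, noise-free data are ours, and the regression procedure actually used for the Quincy data is not specified in the text.

```python
import math


def decay(d: float, gamma: float) -> float:
    """Distance-decay form quoted in the text: d / (2 + d**gamma)."""
    return d / (2.0 + d ** gamma)


def rmse(pred, obs):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def fit_gamma(distances, observed, lo=1.0, hi=4.0, steps=3001):
    """Grid search for the exponent that minimizes the RMSE."""
    best_gamma, best_err = lo, float("inf")
    for k in range(steps):
        g = lo + (hi - lo) * k / (steps - 1)
        err = rmse([decay(d, g) for d in distances], observed)
        if err < best_err:
            best_gamma, best_err = g, err
    return best_gamma, best_err


# Synthetic data generated with gamma = 2.47 (illustration only, in km).
distances = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
observed = [decay(d, 2.47) for d in distances]
gamma_hat, err = fit_gamma(distances, observed)
```

With real visit rates the minimum would not be exact, and the spread of acceptable exponents would give an uncertainty comparable to the ±0.35 reported above.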
Since driving is the most common mode of transportation at large distances, we used the previous fit to model the probability that an individual from CBG $c$ visits a community center $z$ by driving. Here, we follow the traditional Huff model [4], where the probability is proportional to the attractiveness of the center, $s_z$, and inversely proportional to the distance. Furthermore, from our analysis of the data and previous research conducted by Ibaragoyen et al. (2023) [19] and Hidalgo et al. (2020) [20], we discovered that individuals typically opt not to drive for distances shorter than d  [19], we approximated the probability of visiting a community center $z$ by walking as the probability of walking to an elementary school, and our fit gave A = 0.93 ± 0.00344352, B = 0.903 ± 0.00359974, and C = 2.21 ± 0.03706364. Finally, from Ibaragoyen et al.'s (2023) [19] research, we estimated the probability of visiting a community center $z$ by public transportation using a functional form with $\beta = 3$ and $d_{\min} \simeq 0.5$ km. We tested that our results do not depend critically on these assumptions for their functional form.
In our equations, $\alpha$ is a normalization so that the probabilities of going to all community centers $z$ from a CBG $c$ sum to one:

$$\sum_z P_{cz} = 1.$$

To determine the number of individuals visiting community center $z$ from a CBG $c$, we multiply each probability by the relevant eligible population, $E_c$, which varies depending on the model used. In our primary model, the eligible population is the total population of the CBG. Conversely, in our student model, the eligible population is restricted to those aged 5-17 within each CBG. For each modal probability $P_{cz}$, this results in

$$N_{cz} = \phi\, E_c\, P_{cz},$$

where $\phi$ is a normalization factor so that $\sum_{c,z} N_{cz}$ equals the known number of individuals that attend the entire network of BCYF community centers. From the BCYF network attendance data, we know that the total number of people attending the entire network is 42,686 for the total population model and 20,720 for the 5-17 age group model.
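The allocation just described can be sketched in a few lines. This is a simplified illustration, assuming a single distance-decay function in place of the separate driving, walking, and public transportation probabilities; the CBG names, center names, populations, distances, and attractiveness values are hypothetical, and only the network total of 42,686 comes from the text.

```python
def decay(d, gamma=2.47):
    """Single distance-decay form, d / (2 + d**gamma), used for every trip."""
    return d / (2.0 + d ** gamma)


def huff_visitors(population, distance, attractiveness, total_visitors):
    """Allocate visitors N[c][z] = phi * E_c * P_cz, with sum_z P_cz = 1."""
    raw = {}
    for c, e_c in population.items():
        weights = {z: s * decay(distance[c][z]) for z, s in attractiveness.items()}
        alpha = 1.0 / sum(weights.values())  # per-CBG normalization
        raw[c] = {z: e_c * alpha * w for z, w in weights.items()}
    grand_total = sum(sum(row.values()) for row in raw.values())
    phi = total_visitors / grand_total  # rescale to the known network total
    return {c: {z: phi * n for z, n in row.items()} for c, row in raw.items()}


# Hypothetical toy network: eligible population per CBG, distances in km.
population = {"cbg_1": 1000, "cbg_2": 3000}
distance = {
    "cbg_1": {"center_a": 2.0, "center_b": 5.0},
    "cbg_2": {"center_a": 6.0, "center_b": 3.0},
}
attractiveness = {"center_a": 2.0, "center_b": 1.0}

visitors = huff_visitors(population, distance, attractiveness, 42686)
network_total = sum(sum(row.values()) for row in visitors.values())
```

In this competitive form every CBG sends a number of visitors proportional to its eligible population, so changing a center's attractiveness only redistributes visitors between centers.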
However, this enhanced Huff model remains a competitive one, in which the total number of visitors is constant. In other words, demand never increases. Changes in the center attributes, or in the accessibility to them, only change how the visitors are distributed between centers. In reality, demand is elastic, and the number of visitors to all community centers could increase if center attributes or accessibility to centers were altered. In fact, there is evidence that attendance at public infrastructure increases with accessibility. For example, in [5], it was found that the number of visitors to retail centers decreases when residents have less access to these centers. To account for this accessibility difference between individuals, as was done in [5], we added a third component to our Equation (6), which decreases the number of visitors from each CBG $c$ to a community center $z$ using the same inverse decay function used in our driving model, $P_{acc}(d) = d/(2 + d^{2.33})$. Thus, our final equations are

$$N_{cz} = \phi\, E_c\, P_{cz}\, P_{acc}(d_{cz}).$$

Note that this change allows for an increasing demand if accessibility or attractiveness to centers is increased.
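The effect of the accessibility factor on total demand can be illustrated with a small sketch. It assumes a single travel mode whose choice probability uses the same decay function as the accessibility factor, and it holds the normalization factor fixed at one; the toy network and all numbers are hypothetical.

```python
def p_acc(d, exponent=2.33):
    """Accessibility factor quoted in the text: d / (2 + d**2.33)."""
    return d / (2.0 + d ** exponent)


def expected_visitors(population, distance, attractiveness, phi=1.0):
    """N_cz = phi * E_c * P_cz * P_acc(d_cz), with sum_z P_cz = 1 per CBG."""
    visitors = {}
    for c, e_c in population.items():
        weights = {z: s * p_acc(distance[c][z]) for z, s in attractiveness.items()}
        norm = sum(weights.values())
        visitors[c] = {
            z: phi * e_c * (w / norm) * p_acc(distance[c][z])
            for z, w in weights.items()
        }
    return visitors


def network_total(visitors):
    """Sum visitors over all CBGs and centers."""
    return sum(sum(row.values()) for row in visitors.values())


population = {"cbg_1": 1000, "cbg_2": 1500}
attractiveness = {"center_a": 1.0, "center_b": 1.0}
baseline = {
    "cbg_1": {"center_a": 4.0, "center_b": 6.0},
    "cbg_2": {"center_a": 5.0, "center_b": 3.0},
}
# Same network, but center_a is effectively 2 km closer to cbg_1.
improved = {
    "cbg_1": {"center_a": 2.0, "center_b": 6.0},
    "cbg_2": {"center_a": 5.0, "center_b": 3.0},
}

total_before = network_total(expected_visitors(population, baseline, attractiveness))
total_after = network_total(expected_visitors(population, improved, attractiveness))
```

With the normalization factor held fixed, the network total rises when a center becomes more accessible, which the purely competitive model cannot reproduce. Note that d/(2 + d^2.33) peaks near 1.2 km, so in this sketch the gain applies to trips longer than that.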
Finally, we modeled $s_z$, the attractiveness of each community center $z$, as a linear combination of the community center attributes $x_{z,i}$:

$$s_z = \sum_i w_i\, x_{z,i}, \qquad (12)$$

where $w_i$ is the relative weight assigned to attribute $i$. Table 1 shows the attributes of each of our community centers, ranging from the number of hours open to the number of classes or facilities. We also have binary variables, such as having a pool, computer lab, dance studio, or fitness center.
These attributes are not independent of one another. For example, the number of facilities is heavily correlated with the existence of a computer lab, dance studio, and fitness center (see Figure 2). Thus, we employ a feature selection process, discarding attributes that are strongly correlated with one another in order to prevent multicollinearity. We therefore model each center's attractiveness using the number of classes, whether the center is based inside a school, and the possession of a pool, dance studio, fitness center, and computer lab.
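A correlation-based feature selection of this kind can be sketched as follows. The greedy rule, the 0.9 threshold, and the attribute values are assumptions made for illustration; the text does not state the threshold or the exact procedure used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.9):
    """Greedily keep attributes whose |r| with every kept attribute is below threshold."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical attribute table: one value per center.
features = {
    "n_classes": [120, 40, 85, 60, 150, 30],
    "has_pool": [1, 0, 0, 1, 1, 0],
    "n_facilities": [6, 2, 4, 3, 7, 1],  # nearly collinear with n_classes here
    "school_based": [0, 1, 0, 0, 0, 1],
}
selected = drop_correlated(features)
```

The order of the dictionary decides which member of a correlated pair survives, so attributes considered most interpretable should be listed first.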

Optimization of the Model
Since the transportation modal probabilities are fixed to those in the existing literature, our model only depends on the relative weights assigned to the attributes of a center, as in Equation (12). We fit them using an optimization algorithm [21] to minimize the RMSE between our predictions and the actual attendance at each center. The result of this optimization is shown in Figure 3, where we can see that our models for the total and 5-17-year-old populations are very accurate, with RMSE = 0.090 and RMSE = 0.063, respectively. During our optimization, we found that the most important attributes for the attractiveness of a center are the number of classes offered and the possession of a pool. These two factors held the most weight, as can be seen in Figure 3B. Interestingly, centers that are based inside a school are less attractive. This is likely because centers located within schools offer fewer hours and classes, since they are closed during the school day, and lack fitness center facilities, as can be seen from the correlations in Figure 2.
These findings inform our interventions, as they identify the most influential factors that can be altered in order to increase attendance.
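The weight-fitting step can be illustrated with a deliberately simplified stand-in. The sketch below ignores distances, predicts each center's share of attendance from its attractiveness alone, and uses a basic coordinate search instead of the optimization algorithm cited as [21]; the attribute rows, the "true" weights, and all function names are hypothetical. The first weight is pinned to one because shares are unchanged when all weights are rescaled.

```python
import math


def attractiveness(weights, attrs):
    """Linear combination of a center's attributes."""
    return sum(w * x for w, x in zip(weights, attrs))


def predicted_shares(weights, centers):
    """Share of attendance per center, proportional to (positive) attractiveness."""
    scores = [max(attractiveness(weights, a), 1e-9) for a in centers]
    total = sum(scores)
    return [s / total for s in scores]


def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def fit_weights(centers, observed, n_sweeps=200, step=1.0):
    """Coordinate search over the weights; the first weight stays at 1."""
    weights = [1.0] * len(centers[0])
    best = rmse(predicted_shares(weights, centers), observed)
    for _ in range(n_sweeps):
        improved = False
        for i in range(1, len(weights)):
            for delta in (step, -step):
                trial = weights[:]
                trial[i] += delta
                err = rmse(predicted_shares(trial, centers), observed)
                if err < best:
                    weights, best, improved = trial, err, True
        if not improved:
            step /= 2.0  # refine the search once no move helps
    return weights, best


# Hypothetical rows: [classes / 100, has pool, school based].
centers = [[1.2, 1, 0], [0.4, 0, 1], [0.85, 0, 0], [0.6, 1, 1], [1.5, 1, 0]]
observed = predicted_shares([1.0, 0.5, -0.3], centers)  # synthetic target
initial_err = rmse(predicted_shares([1.0, 1.0, 1.0], centers), observed)
weights, err = fit_weights(centers, observed)
```

A gradient-free search is used here only to keep the example dependency-free; any bounded minimizer of the same RMSE objective would play the same role.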

Interventions
Given the high accuracy of our models, as evidenced by an RMSE of 0.090 for the total population model and 0.063 for the 5-17-year-old population model, we can study the impact on attendance of two interventions: enhancing the center offerings and changing the walking accessibility to centers.
The first is an intervention regarding the attractiveness of centers, where we change their attributes so that more individuals are enticed to visit them. Note that, as Figure 2 shows, individual attributes of a center are not independent. For example, we cannot change whether the center is school-based independently of the number of classes. To determine the direction of an intervention, we used a principal component analysis (PCA) of the attributes to suggest possible linear combinations of changes to a center's attributes. We found that the first principal component is mostly described by the number of classes, having a fitness center, being based inside a school, and having a dance studio; see Figure 2B. Thus, we chose our interventions along the direction of the first principal component. Namely, we studied how attendance at a particular community center changes when the center becomes independent from a school, when we add a fitness center, and when we increase the number of classes the center offers by one hundred. Our results for different centers subject to this intervention can be found in Figure 4. The effect of the intervention is substantial, with percentage increases in attendance of 100-130% in the total population and around 40% in the 5-17-year-old population. We also studied the percentage change in attendance of the low-income population, calculated by multiplying our predicted attendance for each CBG by the fraction of people living under the poverty line in that CBG. In this intervention, we did not see any significant difference in attendance change between the total and low-income populations.
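The low-income calculation described above amounts to a weighted sum followed by a percentage change. The sketch below shows the arithmetic with hypothetical model outputs for one center; the CBG names, attendance figures, and poverty fractions are invented for illustration.

```python
def percent_change(before, after):
    """Percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before


def low_income_attendance(attendance_by_cbg, poverty_fraction):
    """Weight each CBG's predicted attendance by its share of residents below the poverty line."""
    return sum(n * poverty_fraction[c] for c, n in attendance_by_cbg.items())


# Hypothetical predicted attendance at one center, before and after an intervention.
before = {"cbg_1": 100.0, "cbg_2": 50.0}
after = {"cbg_1": 180.0, "cbg_2": 120.0}
poverty = {"cbg_1": 0.10, "cbg_2": 0.30}

total_change = percent_change(sum(before.values()), sum(after.values()))
low_income_change = percent_change(
    low_income_attendance(before, poverty),
    low_income_attendance(after, poverty),
)
```

In this toy case the total rises by 100% and the low-income estimate by 116%, because the larger relative gain occurs in the CBG with the higher poverty fraction. When gains are spread evenly across CBGs, the two percentages coincide.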
Our second intervention modifies the walking accessibility of a particular center. To this end, we modified the function $P^{(w)}_{cz}(s_z, d_{cz})$ so that it becomes more likely that an individual will walk to the center. This can be achieved by doubling the B coefficient in Equation (3) so that the function decays more slowly with distance. Our results for this intervention, shown in Figure 4, Panels B and D, still show a substantial change. For the low-income population, the percentage change is higher than for the total population, suggesting that changing the walking accessibility of a center has a greater impact on the low-income communities around it.

The geographical effects of these interventions can be seen in Figure 5. Both interventions significantly increase attendance from nearby CBGs. Nevertheless, Figure 5B shows that the increase in attendance extends to CBGs further away from the community center. We attribute this to the fact that driving is the most probable modal choice for larger distances.

Conclusions and Future Work
This study offers a detailed analysis of attendance patterns at BCYF community centers, utilizing enhanced Huff models based on census and real attendance data. Our findings reveal significant insights into factors influencing attendance and effective policy interventions for increasing it. Although our paper contains data based only on BCYF centers, we believe that our methodology and model can be applied to other networks of community centers, provided that attendance data are available. We found that the most impactful factor is enhancing the educational and recreational offerings at each center, notably more than improving walking accessibility. Centers within schools are less attractive, suggesting that making centers independent, thereby allowing for more classes, longer hours, and fitness facilities, would significantly boost attendance.
Predicted attendance increases substantially with certain interventions: independence, added fitness centers, and additional classes could lead to attendance increases of 100-130% for the entire population and around 40% for the 5-17 age group, with similar rises in low-income populations. Modifications to walking accessibility show notable impacts as well, with a 20-40% increase in the entire population and a 15-40% increase in the 5-17-year-old population. With this type of intervention, however, we see a larger percentage increase for the low-income population, suggesting that improving pedestrian access greatly benefits under-served communities. Walking accessibility changes could include road closures, limiting the number or type of vehicles permitted on the road past a certain hour, or additional crossing guards. The City of Boston already has similar changes in place, such as school crossing guards and the closing of Memorial Drive, indicating that such changes are feasible.
Despite these insights, this study has limitations: incomplete data for smaller centers, no detailed treatment of transit accessibility, and a simplified linear attractiveness function in the model. Future research could enhance the model accuracy with a more comprehensive data collection and transportation accessibility analysis. Nevertheless, our model's high accuracy renders it a valuable and robust tool, capable of providing policymakers and center directors with data-driven insights to optimize the BCYF program's reach.
under license for this study. Data are available from the authors upon reasonable request and with permission of BCYF.

Figure 1. Map of the BCYF community centers in the Boston area. Symbol size is proportional to attendance, and colors indicate if they are school-based.

Figure 2. (A) Correlation matrix between center attributes. Each square represents how correlated two attributes are. A score of 1 would mean a perfect positive correlation, and a score of −1 would mean a perfect negative correlation. A score of zero would mean that the attributes are not correlated at all. (B) Biplot of the first two PCA components of center attributes. Each dot represents a community center, projected onto the space defined by the first two principal components. The first principal component (PC1) explains 33.3% of the variance, and PC2 explains 22.8% of the variance in the data.

Figure 3. Model fitting results. Relationship between our model predictions and real data for the attendance at each community center for the total population (A) and 5-17-year-old population (C). The coefficient weights found for our model for the total population (B) and the 5-17-year-old population (D). For illustration purposes, the coefficient for the number of classes is normalized to one.

Figure 4. Effect of interventions. (A) Percent changes in attendance for total and low-income total populations when changing center attributes. (B) Percent changes in attendance for total and low-income total populations when changing the walking accessibility to centers. (C) Percent changes in attendance for 5-17-year-old and low-income 5-17-year-old populations when changing center attributes. (D) Percent changes in attendance for 5-17-year-old and low-income 5-17-year-old populations when changing the walking accessibility to centers.

Figure 5. Geographical representation of the effects of each intervention labeled with the square root of the total attendance per CBG.

Table 1. Attributes of the different BCYF community centers considered in the study.