Afterschool Landscape Tool: White Paper

Introduction

Dallas Afterschool in collaboration with the Child Poverty Action Lab developed the Afterschool Landscape Tool to understand the landscape of afterschool programming across Dallas County and identify where gaps currently exist. The index was developed to aid Dallas Afterschool and other stakeholders making programmatic investments across Dallas County.

To achieve this aim, data was compiled within our two indices: existing afterschool environment (supply) and current neighborhood conditions (demand). This data was then standardized and combined into each index. Indicators of high demand were weighted to calculate each index for a given census tract, based on the [year] Census. The intersection of each index was then used to determine neighborhoods with high demand and low supply across Dallas County.

The Afterschool LandscapeTool focuses on the current supply and prospective demand of afterschool programs available for children between 5 and 14 years of age. For the purposes of this tool, afterschool programs are defined as programs that run at least 4 afternoons per week for three hours per day. This Index does not track enrichment programs that meet once or twice a week such as sports programs, karate classes, art classes, dance class, or other clubs or similar activities.

The Scores

The Afterschool Landscape Tool contains two indices scored on a scale of 0 to 100.

Existing After School Environment (Supply Index)

Census tracts with a high score on this index have a higher density of afterschool programs and more program seats available relative to the number of eligible children (between the ages of 5 to 14) in a given geographic area.

High Need Neighborhood Conditions (Demand Index)

Census tracts with a high score on this index have a higher? proportion of families and households with afterschool program needs. A high score indicates a larger proportion of single-mother households, children below poverty, lower median household income, food insecurity, and lower transportation accessibility.

Data

The Afterschool Landscape Tool was calculated using data from the US Census Bureau, the American Community Survey, and local Dallas Afterschool data on program capacity and distribution.

American Community Survey

Data from the U.S. Census Bureau’s American Community Survey was accessed via Kyle Walker’s tidycensus R package, drawing on estimates and percentages from tables concerning demographic, economic, housing, and family characteristics of households. An exhaustive list of the specific fields used can be found in the project’s source code - https://github.com/childpovertyactionlab/dallas-afterschool/.

Dallas Afterschool

Data on afterschool program locations and attributes was provided by Dallas Afterschool. Dallas Afterschool continually updates this database, pulling from Texas Department of Family and Protective Services and Child and Adult Food Care Program databases as well as their own knowledge of the afterschool sector.

An extract of their database, as of February 15, 2022, is hosted on the project Github (link). Parents, guardians, and families can go to www.findafterschooldallas.org for a searchable, updated tool to help them find programming for their children. For further inquiries around afterschool program data, please contact Dallas Afterschool’s Director of Advocacy, Yanela Montoya.

Packages

In order to complete this analysis a number of open source R packages developed by the R community were used.

Methodology

Data Collection + Processing

Data used for the Afterschool Program Access Index was sourced from a variety of public and private sources as detailed in the ‘Data’ section above. Data for this project largely comes at two different scales: Census tract level data produced by the Census Bureau, as well as address-level afterschool program data produced by Dallas Afterschool.

Data from the U.S. Census Bureau was accessed through the TidyCensus package in R at the census tract level. Remaining data was accessed as a flat file from original data sources.

All data was then processed using a suite of R packages from the such as sf, dplyr, and stringr in order to filter out any programs that were not related to after school programming across Dallas County. Points features were then summarized to census tract levels and metrics related to percent, total value, density, and rate per capita were calculated.

For indicators in the total value category the goal was to sum all relevant inputs to and obtain the sum value as an indicator (e.g. total Dallas Afterschool Programs). Indicators in the rate per capita category contained inputs which were divided by the relevant population which was divided by 10,000 (e.g. Program seats per 10,000 eligible children). The percentage category was calculated by dividing the relevant input by the relevant population (e.g. percent single-mother households). With these three categories we were able to create a holistic understanding of the supply and demand of afterschool programs.

All data frames were exported into a geopackage database for further analysis.

Test for Normality + Standardization

Once the indicators were calculated it was necessary to transform our variables into a proper standard before calculating the indexes. To do so a test of normality (Shapiro-Wilks) was conducted in order to determine if the data fell within a normal distribution. Indicators which were determined to not fall within a normal distribution by the Shapiro-Wilks test were subsequently log transformed. It was also determined that all indicators must be directionally aligned, to do so any indicators which were related to negative outcomes were multiplied by negative one. This transformation meant that negative indicators would also have negative values, whereas positive indicators would have positive values. The next step involved taking the log transformed indicators and converting them into z-scores in order to have all values within the same scale.

Principal Components Analysis + Score Creation

With a number of metrics included in this analysis, there is considerable overlap or correlation between many of the indicators included for each score. For example, the poverty rate and median household income are both measures of relative wealth in a community. Both are used here, however, to help further identify communities not all where a larger share of residents are living in poverty but where incomes are low relative to other communities even if the poverty rate is low. To include the details and indicators that were identified for the Afterschool Landscape Tool we determined that principal components analysis would help us identify variables that were highly related and in need of adjustment prior to calculating any of the scores. Once scores were calculated, each was converted into a 0 to 100 scale indicating low (0) and high (100) supply or demand for a given community.

Typology Group Creation

Once each score was generated the jenks distribution was calculated in order to identify groups between very low to very high (supply or demand) grouping each score into one of five groups.

Each score was then compared against one another to determine the combination of relative supply and demand in a census tract. These groups were then used to determine the three major typologies.

These typologies were determined by comparing the five previously identified groups and determining where supply and demand compared against one another.

Limitations * Further Changes

One important note with this analysis is that because it is only using afterschool programming affiliated with Dallas Afterschool, it may be missing additional program types such as on campus extra curricular club activities or private programming such as art or music classes that students may attend.

Sources