Lessons learned in working with this dataset:

## # A tibble: 41 x 25
##     rank city  med_park_size_d… med_park_size_p… park_pct_city_d…
##    <dbl> <chr>            <dbl>            <dbl> <chr>           
##  1     1 Wash…              1.4              5   21%             
##  2     2 St. …              3.2             15   15%             
##  3     3 Minn…              5.7             27.5 15%             
##  4     4 Arli…              2.4             10   11%             
##  5     5 Port…              4.9             22.5 18%             
##  6     6 Irvi…              6.1             30   27%             
##  7     7 San …              1.3              5   20%             
##  8     8 Cinc…              4.4             20   14%             
##  9     9 New …              1.1              5   22%             
## 10    10 Chic…              2.2             10   10%             
## # … with 31 more rows, and 20 more variables: park_pct_city_points <dbl>,
## #   pct_near_park_data <chr>, pct_near_park_points <dbl>,
## #   spend_per_resident_data <chr>, spend_per_resident_points <dbl>,
## #   basketball_data <dbl>, basketball_points <dbl>, dogpark_data <dbl>,
## #   dogpark_points <dbl>, playground_data <dbl>, playground_points <dbl>,
## #   rec_sr_data <dbl>, rec_sr_points <dbl>, restroom_data <dbl>,
## #   restroom_points <dbl>, splashground_data <dbl>, splashground_points <dbl>,
## #   amenities_points <dbl>, total_points <dbl>, city_dup <chr>
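
Several columns in the preview above, such as park_pct_city_data and spend_per_resident_data, are stored as character strings (for example "21%"), so they need to be converted to numbers before any modeling. Below is a minimal Python sketch of that cleaning step. The helper name `parse_number` is hypothetical, and the assumption that spending values are formatted like "$200" is mine; the preview does not show them.

```python
def parse_number(text):
    """Convert a string such as "21%" or "$1,234" to a float.

    Returns None for empty strings. The "$" and "," handling is an
    assumption about how spending values are formatted.
    """
    cleaned = text.strip().replace("$", "").replace("%", "").replace(",", "")
    return float(cleaned) if cleaned else None


# Values in the style of the park_pct_city_data column shown in the preview.
percentages = [parse_number(value) for value in ["21%", "15%", "11%"]]

# Assumed format for spend_per_resident_data (not shown in the preview).
spending = parse_number("$200")
```

Stripping the symbols rather than matching a strict pattern keeps the helper short, at the cost of silently accepting oddly formatted input.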

How does the amount of spending on parks per capita relate to the cities’ scores in different years?

Diminishing returns for additional spending, especially in recent years

Looking at the 2020 data in particular, there appear to be diminishing returns in scores for spending above $200 per resident. This section explores that in more detail.

Comparing a linear model fit on the whole dataset with one fit only on cities spending at most $200 per resident, the restricted model fits slightly better. This is the case when looking at just the adjusted R^2 values …

adj_rSquare_allData adj_rSquare_spendingAtMost200
0.7171783 0.7248579
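
The comparison can be sketched in Python as follows. This is only an illustration of the technique: the spending and points values are made up (they are not taken from parks.csv), the function names are hypothetical, and a single-predictor least-squares line is assumed. Adjusted R^2 is computed as 1 - (1 - R^2)(n - 1)/(n - p - 1), with n observations and p predictors.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def adjusted_r_squared(xs, ys, n_predictors=1):
    """Adjusted R^2 of the least-squares line through (xs, ys)."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    n = len(ys)
    return 1 - (1 - r_squared) * (n - 1) / (n - n_predictors - 1)


# Made-up illustrative data: scores rise with spending, then level off.
spending = [50, 100, 150, 200, 250, 300, 350]
points = [22, 39, 61, 80, 85, 88, 90]

# Keep only the (hypothetical) cities spending at most $200 per resident.
subset = [(s, p) for s, p in zip(spending, points) if s <= 200]
subset_spending = [s for s, _ in subset]
subset_points = [p for _, p in subset]

adj_all = adjusted_r_squared(spending, points)
adj_at_most_200 = adjusted_r_squared(subset_spending, subset_points)
print(adj_all, adj_at_most_200)
```

With data that flattens out at high spending, the restricted fit has the higher adjusted R^2, which is the same pattern as in the table above.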

This better fit for the reduced model is also apparent when looking at the model residuals as a function of per capita spending. The residuals from the full model show a clear trend for cities with more spending, but they are scattered more randomly in the model that considers only cities with at most $200 per capita spending.
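
One way to check for such a trend numerically is to fit a second line through the residuals of the high-spending cities and look at its slope. The Python sketch below does this with made-up values (not taken from parks.csv); the function names and the $200 cutoff variable are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys):
    """Observed minus fitted values for the least-squares line."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up illustrative data: scores rise with spending, then level off.
spending = [50, 100, 150, 200, 250, 300, 350]
points = [22, 39, 61, 80, 85, 88, 90]
cutoff = 200  # dollars per resident

full_residuals = residuals(spending, points)

# Residuals of the full model for cities spending more than the cutoff.
high = [(s, r) for s, r in zip(spending, full_residuals) if s > cutoff]
high_spending = [s for s, _ in high]
high_residuals = [r for _, r in high]

# A slope far from zero means the residuals are not randomly scattered.
trend_slope, _ = fit_line(high_spending, high_residuals)
print(trend_slope)
```

In this made-up example the slope is negative: the full model increasingly overpredicts scores as spending rises past the cutoff.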

Conclusion: An asymptote is expected whenever the response is measured as a fraction of a fixed maximum, so its presence isn’t surprising, but it’s useful to see that the scores level off around $200 per capita spending. This could potentially help guide overall city budgets to maximize the accessibility and cultural importance of parks while balancing other civic priorities.

Are there consistent differences in park features and amenities by state?

Group the cities by state, then check whether park characteristics differ consistently between states or regions of the country (e.g., are splash pads more common in areas with longer summers and warmer weather overall?).
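
The grouping step can be sketched in Python as below. The records are placeholders (the city names, state codes "AA" and "BB", and values are made up), and the presence of a `state` field is an assumption, since the preview above shows only a `city` column. The column name splashground_data comes from the dataset; `median_by_state` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import median


def median_by_state(rows, column):
    """Median of `column` across the cities of each state."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["state"]].append(row[column])
    return {state: median(values) for state, values in sorted(grouped.items())}


# Placeholder records; the "state" field is assumed, not part of the preview.
rows = [
    {"city": "City 1", "state": "AA", "splashground_data": 3.0},
    {"city": "City 2", "state": "AA", "splashground_data": 5.0},
    {"city": "City 3", "state": "BB", "splashground_data": 1.0},
]

summary = median_by_state(rows, "splashground_data")
print(summary)
```

The median is used here rather than the mean so that one unusual city does not dominate a state with only a few cities in the data.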

In most states, per-capita park spending differs markedly from city to city.

The number of playgrounds per 10,000 residents varies widely across cities in different states, but is typically around 3.

There are far fewer dog parks than playgrounds (~1 dog park per 100,000 residents), but dogs have many more to choose from in Idaho and Oregon (~5 dog parks per 100,000 residents).

Splash pads appear more common in the Mid-Atlantic and New England (PA, NY, MA).

There are around 2 restrooms per 10,000 residents in most states, but there’s a fair amount of variability. Parks in Minnesota generally have the most restrooms.

There are generally about 3 basketball hoops per 10,000 residents, with large variation among cities in Virginia and California.

The number of recreation and senior centers per 20,000 residents varies a lot between states, with 1 center per 20,000 residents being the most common value.

The percentage of residents living within a 10-minute walk of a park varies widely, from 35% to 100%.