8 NYC 311 Data Exploration
8.1 Introduction
The NYC 311 system includes a large collection of non-emergency service requests submitted by New Yorkers. Using the nycOpenData package, specifically the nyc_311 function, we pulled a random sample of 10,000 requests to explore how different types of complaints vary across boroughs and agencies. We begin by examining overall patterns through descriptive tables and visualizations, and then use a chi-square test to determine whether complaint types differ by borough. To enhance the analysis, an interactive Shiny application is included that allows users to filter the data by borough and agency, adjust how many top complaints to view, and examine the most common issues within each category.
8.2 Loading Libraries
First, we load the libraries needed to run the Shiny app and perform the analysis.
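The chunk for this step is not shown in the rendered text. Judging from the functions used later in the chapter (`%>%`, `count()`, `ggplot()`, `kable()`, and the Shiny inputs), a plausible set of libraries is sketched below. The exact list is an assumption, not the author's confirmed setup.

```r
# Assumed package list, inferred from the functions used in this chapter.
library(nycOpenData)  # nyc_311() data access
library(dplyr)        # %>%, count(), group_by(), summarise(), slice_head()
library(ggplot2)      # ggplot(), geom_col(), facet_wrap()
library(knitr)        # kable() tables
library(shiny)        # interactive inputs and reactive plots
```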
8.3 Loading The Dataset
Then, we load the dataset into R so it can be cleaned, explored, and used in the Shiny app.
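A minimal sketch of the loading step follows. The wrapper name `load_311` is hypothetical, and the `limit` argument of `nyc_311()` is an assumption that should be checked against `?nyc_311`. The source does not say how the sample was randomized, so the sketch makes no attempt to reproduce that part.

```r
# Hypothetical wrapper around nyc_311(); `limit` is an assumed argument name.
load_311 <- function(n = 10000) {
  nycOpenData::nyc_311(limit = n)
}

# Usage (requires an internet connection and the nycOpenData package):
# data_311 <- load_311()
```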
8.4 Data Cleaning
Since this analysis focuses on boroughs, agencies, and complaint types, we standardized the formatting of these columns to ensure consistency across the dataset.
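One way this kind of standardization might look is sketched below on a few made-up rows. The toy data frame `raw_311`, the result name `clean_311`, and the specific rules (title case for boroughs and complaint types, upper case for agency codes) are assumptions for illustration. Only the column names come from the chapter.

```r
library(dplyr)
library(stringr)

# Toy rows standing in for the raw 311 pull (values are made up).
raw_311 <- data.frame(
  borough = c("BRONX", " brooklyn ", "STATEN ISLAND"),
  agency = c("hpd", "NYPD ", "nypd"),
  complaint_type = c("HEAT/HOT WATER", "Noise - Residential", "illegal  parking"),
  stringsAsFactors = FALSE
)

# str_squish() trims and collapses whitespace; the case functions
# then give every value in a column the same capitalization.
clean_311 <- raw_311 %>%
  mutate(
    borough = str_to_title(str_squish(borough)),   # "BRONX" -> "Bronx"
    agency = str_to_upper(str_squish(agency)),     # "hpd" -> "HPD"
    complaint_type = str_to_title(str_squish(complaint_type))
  )
```

After this step, values such as `" brooklyn "` and `"BROOKLYN"` would fall into the same group when counted.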
8.5 Descriptive Statistics
Next, we summarized the number of 311 complaints by borough and identified the three most frequent complaint types within each borough by grouping the data, counting occurrences, and ordering the results. We then visualized these patterns using a faceted bar chart to compare complaint trends across boroughs.
boro_comp <- data_311 %>% count(borough, sort = TRUE)
boro_comp %>%
  kable(caption = "Number of 311 Complaints by Borough")

| borough | n |
|---|---|
| Bronx | 3455 |
| Brooklyn | 2494 |
| Queens | 1945 |
| Manhattan | 1617 |
| Staten Island | 480 |
| Unspecified | 9 |
# Count complaints per borough and type, then keep the three largest per borough
top3 <- data_311 %>%
  group_by(borough, complaint_type) %>%
  summarise(n = n(), .groups = "drop") %>%
  arrange(borough, desc(n)) %>%
  group_by(borough) %>%
  slice_head(n = 3)

# Horizontal bars, one panel per borough
top3 %>%
  ggplot(aes(x = reorder(complaint_type, n), y = n)) +
  geom_col(fill = "cornflowerblue") +
  facet_wrap(~ borough, scales = "free") +
  coord_flip() +
  labs(title = "Top 3 Complaint Types in Each Borough",
       x = "Complaint Type",
       y = "Count") +
  theme_minimal()
Figure 8.1: Faceted bar chart showing the top three most frequent 311 complaint types in each NYC borough. Each panel represents a borough, with bars indicating the number of complaints for the three most common complaint categories within that borough.
8.5.0.1 Interpretation:
Across all NYC boroughs, Noise – Residential is the most common 311 complaint, showing that noise issues are a widespread citywide concern. Heat/Hot Water and Illegal Parking also appear frequently, especially in the Bronx, Manhattan, and Queens, pointing to ongoing building maintenance and traffic-related problems. Staten Island shows a slightly different pattern, with Blocked Driveway ranking among its top complaints. Overall, the plot highlights both shared concerns across boroughs and a few borough-specific issues.
8.6 Inferential Testing: Chi-Square
Additionally, we run a chi-square test to examine whether complaint types are distributed differently across boroughs.
##
## Pearson's Chi-squared test
##
## data: tab
## X-squared = 7842.4, df = 530, p-value < 2.2e-16
8.6.0.1 Interpretation:
The chi-square test showed a significant association between complaint type and borough, χ²(530) = 7842.4, p < .001. This means that the distribution of complaint types is not the same across boroughs: certain types of complaints occur more frequently in some boroughs than in others.
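The printed output names its input `tab`, which suggests a contingency table of complaint type by borough passed to `chisq.test()`. The sketch below shows that pattern on a small made-up data set. The data frame `toy_311` and its counts are invented for illustration, so its statistic will not match the result reported above.

```r
# Toy data: three complaint types across three boroughs (counts are made up).
types <- c("Heat/Hot Water", "Noise - Residential", "Illegal Parking")
toy_311 <- data.frame(
  borough = rep(c("Bronx", "Brooklyn", "Queens"), times = c(60, 50, 40)),
  complaint_type = c(
    rep(types, times = c(30, 20, 10)),  # Bronx
    rep(types, times = c(15, 20, 15)),  # Brooklyn
    rep(types, times = c(8, 12, 20))    # Queens
  )
)

# Cross-tabulate complaint type (rows) by borough (columns) and test independence.
tab <- table(toy_311$complaint_type, toy_311$borough)
res <- chisq.test(tab)

res$parameter      # degrees of freedom: (3 - 1) * (3 - 1) = 4
min(res$expected)  # smallest expected cell count, useful for checking sparsity
```

The degrees of freedom equal (rows - 1) × (columns - 1), which is how a table with many complaint types and six borough labels reaches a value in the hundreds. On the full data, rare complaint types are likely to produce small expected counts. In that case `chisq.test(tab, simulate.p.value = TRUE)` is one option for a p-value that does not rely on the large-sample approximation.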
8.7 Interactive Exploration of NYC 311 Complaint Patterns
Here, we set up the inputs for the interactive 311 explorer. Users can choose a borough, select an agency, and adjust how many top complaint types they want to see. These inputs automatically update the plot below.
This interactive bar chart displays the most frequent complaint types for a selected agency. Users can filter by agency and adjust the number of top complaint types shown, with bar lengths representing the total number of complaints.
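The app code itself is not shown in the rendered text. A minimal standalone sketch of the described behavior follows, assuming three inputs (borough, agency, number of top complaint types) and one reactive bar chart. The helper `top_complaints()`, the input IDs, and the toy data frame `toy_311` are hypothetical names chosen for illustration.

```r
library(shiny)
library(dplyr)
library(ggplot2)

# Toy data standing in for the cleaned 311 sample (values are made up).
toy_311 <- data.frame(
  borough = c("Bronx", "Bronx", "Bronx", "Bronx", "Brooklyn", "Brooklyn"),
  agency = c("HPD", "HPD", "HPD", "NYPD", "NYPD", "NYPD"),
  complaint_type = c("Heat/Hot Water", "Heat/Hot Water", "Plumbing",
                     "Noise - Residential", "Illegal Parking",
                     "Noise - Residential")
)

# Hypothetical helper: the n_top most frequent complaint types
# for one borough and one agency.
top_complaints <- function(data, boro, agency_sel, n_top) {
  data %>%
    filter(borough == boro, agency == agency_sel) %>%
    count(complaint_type, sort = TRUE) %>%
    head(n_top)
}

ui <- fluidPage(
  selectInput("borough", "Borough", choices = sort(unique(toy_311$borough))),
  selectInput("agency", "Agency", choices = sort(unique(toy_311$agency))),
  sliderInput("n_top", "Number of top complaint types",
              min = 1, max = 10, value = 5),
  plotOutput("top_plot")
)

server <- function(input, output, session) {
  # renderPlot() re-runs whenever any of the three inputs changes.
  output$top_plot <- renderPlot({
    top_complaints(toy_311, input$borough, input$agency, input$n_top) %>%
      ggplot(aes(x = reorder(complaint_type, n), y = n)) +
      geom_col(fill = "cornflowerblue") +
      coord_flip() +
      labs(x = "Complaint Type", y = "Count") +
      theme_minimal()
  })
}

# Build the app object; launch it with shiny::runApp(app).
app <- shinyApp(ui, server)
```

Keeping the filtering logic in a plain function makes it possible to check the counts outside the app, for example with `top_complaints(toy_311, "Bronx", "HPD", 1)`.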