Fatal Police Encounters — Point Pattern Analysis NYC

Overview

This project analyzes the spatial pattern of fatal police encounters in New York City using point pattern methods, producing a counter-map that reframes risk in urban space. Rather than mapping crime risk, the project visualizes and evaluates the spatial concentration of fatal police violence using the Fatal Encounters dataset, testing whether incidents are randomly distributed or statistically clustered across NYC's five boroughs.

Four analytical components structure the study: (1) first-order intensity analysis using Kernel Density Estimation at multiple bandwidths, (2) second-order spatial dependence testing using Ripley's K function applied globally and by borough grouping, (3) inhomogeneity adjustment to separate population concentration from true spatial clustering of incidents, and (4) a MAUP (modifiable areal unit problem) sensitivity test using a regular fishnet grid, which checks the robustness of findings across alternative spatial aggregation schemes.

5 NYC boroughs analyzed
3 KDE bandwidths tested
Rates normalized per 10,000 residents
✓ Significant clustering confirmed citywide

Study Area & Population Context

The study area encompasses all five NYC boroughs — Manhattan, Bronx, Brooklyn, Queens, and Staten Island. All layers were projected into NAD 1983 (2011) State Plane New York Long Island (US Feet) to minimize distortion in distance-based analyses. A population choropleth using 2020 census tracts reveals strong spatial inhomogeneity: Manhattan and portions of Brooklyn and Queens show particularly high population density, making raw incident counts a misleading measure of risk without population normalization.

New York City study area — five boroughs

Study area — NYC five borough boundary. Analytical extent for all point pattern methods. Projection: NAD 1983 (2011) State Plane NY Long Island.

NYC population density by 2020 census tracts

NYC population — 2020 census tracts choropleth. Strong spatial inhomogeneity is evident, particularly in Manhattan. Source: NYC Open Data.

Data Sources

Dataset	Description	Format	Source
FatalEncounters_NYC	Fatal police encounters from 2000–present, clipped to NYC boundary	Vector point	Fatal Encounters (Finch et al., 2019)
NYC_Borough_Boundaries	Borough-level administrative polygons	Vector polygon	NYC Open Data
NYC_Boundary	Dissolved NYC boundary polygon	Vector polygon	Derived from boroughs
NYC_Census_Tracts_2020	Census tracts with 2020 population counts	Vector polygon	NYC Open Data

Methods — Four-Component Analytical Framework

Data preparation began with duplicate point removal using the Near tool to ensure a 1:1 relationship between location and event — a prerequisite for Ripley's K, which assumes independent events at distinct coordinates. The dissolved NYC boundary was used as the study area extent, with analyses run globally and by contiguous borough groupings to assess water barrier effects.
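The deduplication step can be sketched outside a GIS as follows. This is a minimal illustration, not the project's actual workflow: the function name `drop_coincident_points`, the 1 ft tolerance, and the sample coordinates are assumptions, and coordinates are taken to be in State Plane feet.

```python
from math import hypot

def drop_coincident_points(points, tolerance_ft=1.0):
    """Keep a point only if it lies farther than tolerance_ft from every kept point."""
    kept = []
    for x, y in points:
        # O(n^2) scan: fine for a few hundred events, slow for very large sets.
        if all(hypot(x - kx, y - ky) > tolerance_ft for kx, ky in kept):
            kept.append((x, y))
    return kept

# Hypothetical coordinates in feet. The second point repeats the first exactly,
# and the fourth lies 0.5 ft from it, so both are dropped.
sample = [(1000.0, 2000.0), (1000.0, 2000.0), (1500.0, 2600.0), (1000.4, 2000.3)]
unique_points = drop_coincident_points(sample)
print(unique_points)
```

The tolerance controls what counts as "the same location"; a value of zero would remove only exact coordinate matches.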

01
Kernel Density Estimation — First-Order Intensity
KDE visualizes the continuous spatial intensity of fatal encounters. Three bandwidths — 3,000 ft, 6,000 ft, and 10,000 ft — were tested. The 6,000 ft surface was selected as the final visualization: it balanced local cluster detail with regional interpretability, avoiding fragmentation at 3,000 ft and over-smoothing at 10,000 ft.
02
Ripley's K Function — Second-Order Spatial Dependence
Multi-Distance Spatial Cluster Analysis tested whether clustering exceeded complete spatial randomness across 12 distance bands spanning neighborhood to borough scales. Edge correction (simulate outer boundary values) was applied. Analysis was run globally for NYC and separately for three contiguous borough groupings: Manhattan & Bronx, Brooklyn & Queens, and Staten Island.
03
Inhomogeneity Adjustment — Population-Normalized Rates
Incident counts were spatially joined to census tracts and normalized to FE_per10k (incidents per 10,000 residents). An expected value per tract was calculated proportional to population share, and an excess measure derived (Excess_FE = Observed − Expected) to identify tracts with disproportionately high incident counts beyond demographic expectation.
04
MAUP Sensitivity Test — Fishnet Grid
A uniform fishnet grid at 2,000 ft and 4,000 ft cell sizes was generated and clipped to the NYC boundary. Encounter points were spatially joined per cell; population was area-weighted from census tract intersections. Incident rates per 10,000 residents were computed per cell and compared against the tract-based inhomogeneity results to test boundary dependency.

Kernel Density Estimation — Bandwidth Comparison

All three bandwidth surfaces consistently identify Upper Manhattan, the Bronx, and central Brooklyn as primary hotspot zones. Bandwidth choice controls the trade-off between local precision and regional legibility.

Bandwidth	3,000 ft	6,000 ft	10,000 ft
Result	Fragmented local clusters — fine-grained but hard to interpret regionally	Coherent regional clusters — balances detail and readability ✓ Selected	Over-smoothed — merges distinct clusters into broad, undifferentiated zones

KDE 3,000 ft bandwidth — fatal police encounters NYC

KDE — 3,000 ft bandwidth. Localized clusters visible but fragmented. Reveals micro-hotspots at the cost of regional coherence.

KDE 6,000 ft bandwidth — selected final surface

KDE — 6,000 ft bandwidth (selected). Coherent regional clusters in Upper Manhattan, South Bronx, and central Brooklyn. Best balance of local and regional interpretability.

KDE — 10,000 ft bandwidth. Over-smoothed — merges distinct clusters into undifferentiated mass. Water barriers become a significant artifact at this scale.

The stability of primary hotspot locations across all three bandwidths — particularly Upper Manhattan, the South Bronx, and central Brooklyn — indicates that the spatial intensity pattern is not an artifact of any single smoothing parameter choice.
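The bandwidth trade-off described above can be reproduced with a small sketch. It assumes a quartic kernel normalized to integrate to one in two dimensions; whether this matches the kernel used to produce the project's surfaces is an assumption, and the function name `quartic_kde` and the event coordinates are hypothetical.

```python
from math import hypot, pi

def quartic_kde(x, y, events, bandwidth_ft):
    """Estimated intensity (events per square foot) at (x, y) using a quartic kernel."""
    total = 0.0
    for ex, ey in events:
        d = hypot(x - ex, y - ey)
        if d < bandwidth_ft:  # events beyond the bandwidth contribute nothing
            total += (1.0 - (d / bandwidth_ft) ** 2) ** 2
    # 3 / (pi * h^2) makes each kernel integrate to 1 over the plane.
    return 3.0 / (pi * bandwidth_ft ** 2) * total

# Hypothetical events: a tight group of three points plus one point 8,000 ft away.
events = [(0.0, 0.0), (500.0, 0.0), (0.0, 500.0), (8000.0, 0.0)]
for h in (3000.0, 6000.0, 10000.0):
    at_cluster = quartic_kde(0.0, 0.0, events, h)
    midpoint = quartic_kde(4000.0, 0.0, events, h)
    print(f"h={h:>7.0f} ft  cluster={at_cluster:.3e}  midpoint={midpoint:.3e}")
```

In this toy pattern the midpoint between the two groups receives no density at 3,000 ft, so the groups stay separate, while at 10,000 ft the kernels overlap and the surface bridges them. The peak value at the cluster also falls as the bandwidth grows.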

Ripley's K Function — Clustering Results

The global K function shows the observed K curve consistently above the expected K and above the upper confidence envelope across nearly all distance bands — indicating statistically significant clustering citywide. Borough grouping analyses reveal that this clustering is not uniform: it varies in magnitude by geographic context, partly shaped by water separation between boroughs.

NYC — Global

Strongly Clustered

Observed K exceeds Expected K and upper confidence envelope across nearly all distance bands. Statistically significant clustering at both neighborhood and citywide scales.

Manhattan & Bronx

Strongly Clustered

Strongest and most persistent deviations above the confidence envelope. Water separation from other boroughs does not suppress clustering — it may amplify it by containing incident density within a smaller analysis window.

Brooklyn & Queens

Clustered — Moderate

Statistically significant clustering confirmed, though intensity is comparatively more diffuse than the Manhattan & Bronx grouping. Pattern consistent across distance bands.

Staten Island

Variable — Small Sample

Greater variability in K results due to the smaller number of recorded events. The pattern does not approximate complete spatial randomness, but results should be interpreted cautiously because K estimates are unstable when the number of events is small.
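The comparison of observed and expected K can be sketched as follows. This is a naive estimator under stated assumptions: it applies no edge correction and builds no simulation envelope, so it illustrates the statistic rather than the significance test reported above. The function name `ripley_k`, the window size, and the point coordinates are hypothetical.

```python
from math import hypot, pi

def ripley_k(points, distances, area):
    """Naive Ripley's K estimate (no edge correction) for each distance in `distances`."""
    n = len(points)
    k_values = []
    for d in distances:
        pairs = 0
        for i, (xi, yi) in enumerate(points):
            for j, (xj, yj) in enumerate(points):
                if i != j and hypot(xi - xj, yi - yj) <= d:
                    pairs += 1  # ordered pairs, so each neighbor relation counts twice
        k_values.append(area * pairs / (n * (n - 1)))
    return k_values

# Hypothetical pattern: two tight groups inside a 10,000 ft x 10,000 ft window.
points = [(1000, 1000), (1100, 1050), (1050, 1150),
          (8000, 8000), (8100, 8050), (8050, 8150)]
distances = [500, 1000, 2000]
observed = ripley_k(points, distances, area=10_000 * 10_000)
expected = [pi * d ** 2 for d in distances]  # K under complete spatial randomness
clustered = [obs > exp for obs, exp in zip(observed, expected)]
print(list(zip(distances, observed, expected, clustered)))
```

Observed K above pi * d^2 indicates more neighbors within distance d than complete spatial randomness would produce. Judging significance additionally requires an envelope from repeated random simulations, which this sketch omits.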

Inhomogeneity Adjustment — Population-Normalized Risk

Standard point pattern analysis assumes spatial homogeneity — that the underlying population at risk is evenly distributed. In NYC, this assumption is clearly unrealistic. Population density varies dramatically across census tracts, meaning raw incident counts mechanically reflect where people live, not necessarily where risk is elevated. The inhomogeneity adjustment separates population concentration from true spatial concentration of police violence.

Rate Map — Incidents per 10,000 Residents

FE_per10k = (Join_Count / POP2020) × 10,000

Normalizes raw incident counts by local tract population. Reveals where encounter rates are elevated relative to local population size — not simply where more people live.
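The rate formula translates directly into code. The sketch below reuses the field names from the formula (`Join_Count`, `POP2020`, `FE_per10k`); the tract records are invented, and returning `None` for tracts with no residents is an assumption about how a zero denominator might be handled, not a description of the project's processing.

```python
def rate_per_10k(join_count, pop2020):
    """Incidents per 10,000 residents; None when the tract has no residents."""
    if pop2020 <= 0:
        return None  # no denominator, so no rate can be computed
    return join_count * 10_000 / pop2020

# Hypothetical tract records.
tracts = [
    {"tract": "A", "Join_Count": 2, "POP2020": 4000},
    {"tract": "B", "Join_Count": 1, "POP2020": 0},
    {"tract": "C", "Join_Count": 0, "POP2020": 2500},
]
for t in tracts:
    t["FE_per10k"] = rate_per_10k(t["Join_Count"], t["POP2020"])
    print(t["tract"], t["FE_per10k"])
```

Tracts with very small populations can produce extreme rates from a single incident, which is worth keeping in mind when reading any rate map built this way.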

Fatal police encounters per 10,000 residents — inhomogeneity adjustment rate map, NYC

Incidents per 10,000 residents — population-normalized rate map. High-rate tracts span Manhattan, the Bronx, and parts of Brooklyn and Staten Island, including areas outside the densest population cores. Source: Author's analysis.

Expected vs. Excess

Expected_FE = (Tract Population / Total NYC Population) × Total Incidents

Excess_FE = Observed Incidents − Expected_FE

The Expected map shows where incidents would occur if distributed strictly proportional to population share. Excess identifies tracts where observed counts exceed this baseline — neighborhoods experiencing disproportionate concentration beyond what demographics alone predict.
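A short sketch of the expected and excess calculation, using the formulas above. The function name `expected_and_excess` and the three tract records are hypothetical; only the field names `Join_Count`, `POP2020`, `Expected_FE`, and `Excess_FE` come from the text.

```python
def expected_and_excess(tracts):
    """Add Expected_FE and Excess_FE to each tract record and return the records."""
    total_pop = sum(t["POP2020"] for t in tracts)
    total_incidents = sum(t["Join_Count"] for t in tracts)
    for t in tracts:
        # Expected count if incidents were spread strictly by population share.
        t["Expected_FE"] = t["POP2020"] / total_pop * total_incidents
        t["Excess_FE"] = t["Join_Count"] - t["Expected_FE"]
    return tracts

# Hypothetical tracts: 10 incidents among 10,000 residents in total.
tracts = expected_and_excess([
    {"tract": "A", "Join_Count": 1, "POP2020": 5000},
    {"tract": "B", "Join_Count": 6, "POP2020": 3000},
    {"tract": "C", "Join_Count": 3, "POP2020": 2000},
])
for t in tracts:
    print(t["tract"], round(t["Expected_FE"], 2), round(t["Excess_FE"], 2))
```

Because expected counts are a reallocation of the observed total, excess values sum to zero across all tracts: every tract above its baseline is offset by tracts below it.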

Inhomogeneity Adjustment — Expected incidents map NYC

Expected incidents — where fatal encounters would occur if distributed strictly proportional to population share. High expected values concentrate in densely populated Manhattan and portions of Brooklyn and Queens. This map primarily reflects demographic structure.

Inhomogeneity Adjustment — Excess incidents map NYC

Excess incidents — observed minus expected. Red tracts exceed their population-proportional baseline. Persistent excess concentrations appear in Upper Manhattan, segments of the Bronx, central Brooklyn, and parts of Staten Island — clustering not explained by population density alone.

The excess map provides the clearest evidence of spatial disproportionality. Clustering observed in the Ripley's K analysis is not solely a reflection of population density: after controlling for population share, it persists in specific neighborhoods across the Bronx, Upper Manhattan, and central Brooklyn.

MAUP Sensitivity Analysis — Fishnet Grid

Because census tracts vary substantially in size and shape across NYC, patterns in tract-level rates may partly reflect administrative boundaries rather than underlying spatial processes. A sensitivity test using a regular fishnet grid at two resolutions evaluates whether the identified hotspots are boundary-dependent artifacts or robust spatial signals.

MAUP fishnet 4,000 ft — encounter counts per cell

Fishnet 4,000 ft — encounter count per grid cell. Pattern is smoother and more generalized than the 2,000 ft grid, but primary high-rate areas persist in Upper Manhattan, South Bronx, and central Brooklyn. Source: Author's analysis.

MAUP fishnet 2,000 ft — cases per 10,000 people

Fishnet 2,000 ft — encounters per 10,000 residents (area-weighted population allocation). Localized high-rate clusters visible in Upper Manhattan, South Bronx, central Brooklyn, and parts of Staten Island. Source: Author's analysis.

Core hotspot regions persist across both fishnet resolutions and align with the tract-based inhomogeneity results — confirming that the identified concentration patterns are not artifacts of census tract boundary configuration. Results are interpreted as robust at the neighborhood-to-community scale.
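The area-weighted population allocation used for the fishnet cells can be sketched as below. To stay self-contained, the example stands in axis-aligned rectangles for census tracts, which is a simplifying assumption; real tracts are irregular polygons and would need a geometry library. The function names, coordinates, populations, and encounter count are all hypothetical.

```python
def overlap_area(a, b):
    """Area shared by two axis-aligned rectangles given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)

def cell_population(cell, tracts):
    """Allocate each tract's population to the cell in proportion to overlapping area."""
    population = 0.0
    for bounds, pop in tracts:
        tract_area = (bounds[2] - bounds[0]) * (bounds[3] - bounds[1])
        population += pop * overlap_area(cell, bounds) / tract_area
    return population

# Hypothetical rectangular "tracts" (feet) paired with their populations.
tracts = [((0, 0, 4000, 2000), 8000), ((0, 2000, 4000, 4000), 2000)]
cell = (0, 0, 2000, 4000)            # covers the western half of both tracts
pop = cell_population(cell, tracts)  # half of each tract's residents
rate = 3 * 10_000 / pop              # 3 hypothetical encounters in the cell
print(pop, rate)
```

Area weighting assumes residents are spread evenly within each tract. That assumption weakens where tracts contain parks, water, or industrial land, so cell populations are approximations.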

Discussion

All four analytical components converge: fatal police encounters in NYC are not randomly distributed. The global Ripley's K function confirms statistically significant clustering across both neighborhood and citywide distance scales. Borough subgroup analyses show that this clustering is not driven by a single borough — it is distributed unevenly, with Manhattan and the Bronx showing the strongest concentration and Brooklyn and Queens showing moderate but significant patterns.

The inhomogeneity adjustment is critical in this context. Without population normalization, high incident counts in dense Manhattan tracts would mechanically reflect exposure rather than elevated risk. The excess analysis demonstrates that clustering persists after controlling for demographic structure — certain areas exhibit disproportionately high encounter counts even relative to their population share. This strengthens the inference that spatial concentration cannot be attributed solely to background population density.

Policy implications should be approached cautiously. Spatial concentration does not imply causal explanation. However, identifying statistically significant concentration areas may support targeted inquiry into structural conditions, policing practices, and community-level characteristics. Spatial evidence functions as a diagnostic tool rather than a definitive explanatory model: point pattern analysis produces a measure of spatial concentration, not a complete account of the social processes producing it.

Limitations

Crowd-sourced data quality. The Fatal Encounters dataset may contain underreporting, geocoding inaccuracies, or classification inconsistencies. As a crowd-sourced compilation, it represents the most robust publicly available dataset for this topic — but is not officially comprehensive.
Euclidean distance in Ripley's K. The K function relies on Euclidean distance, which does not account for physical barriers such as waterways. NYC's inter-borough water separation influences clustering results for borough groupings — particularly for Staten Island, which is isolated by the Upper Bay.
Population as the only exposure variable. Census population is used as the proxy for the at-risk population but does not incorporate policing intensity, patrol patterns, socioeconomic disadvantage, or built environment variables that may influence spatial concentration.
KDE water barrier artifact. Kernel Density Estimation spreads intensity continuously across space regardless of physical barriers. Waterways between boroughs are not respected by the smoothing kernel, so intensity from events on one shore can spill across the water, which may slightly inflate density estimates near coastlines.
Homogeneity assumption in Ripley's K. The standard Ripley's K framework assumes homogeneity under CSR. The inhomogeneous K function (Baddeley et al., 2000) would provide a more formally correct second-order test in the presence of spatial heterogeneity, and is a recommended extension for future analysis.