Overview

Water distribution infrastructure in urban environments faces increasing challenges from aging pipes, material degradation, soil conditions, and topographic stress. Seattle's water system includes thousands of pipe segments varying in age, diameter, pressure class, and material — all factors that influence the likelihood of a main break, which can disrupt service, damage property, and require costly emergency response.

This project developed a reproducible, automated Python-based script tool for ArcGIS Pro that generates a composite failure risk index (RISK_IDX) for every water main segment in Seattle. The tool reads pipe attributes directly from Seattle Public Utilities' open dataset, normalizes each attribute into a risk metric, and computes a weighted composite score — producing a spatially interpretable risk layer across the entire city network.

The tool is designed to be highly portable: because it relies exclusively on attributes present in most utility datasets (age, material, diameter, pipe class, relining history), it can be adapted to any municipality that maintains comparable infrastructure records.

5 risk components in the composite index
0–1 continuous RISK_IDX range
2 script tool parameters
7 fields written to the output feature class

Study Area & Output Maps

Figure 1 — Study area: Seattle's water main network loaded from Seattle Public Utilities open data. Pipe segments vary in age, material, diameter, and pressure class.

Figure 2 — Water mains prior to risk scoring. Attribute fields reviewed: INSTALL_YR, DIAMETER, PIPE_CLASS, MATERIAL, RELINED_YR.

Figure 3 — Final RISK_IDX output. Older neighborhoods (Capitol Hill, Ballard, Fremont) show elevated risk from cast iron pipes. Relined central corridors appear lower risk.

The Risk Index Formula

The composite risk index is a weighted sum of five normalized component scores, each derived from pipe attribute fields and scaled to 0–1. The weights reflect the relative importance assigned to each factor as a driver of pipe failure. They sum to 1.0, so RISK_IDX also stays within 0–1.

RISK_IDX — Composite Formula
RISK_IDX = (
    0.30 × AGE_RISK      — pipe age normalized over 0–100 years
  + 0.25 × MAT_RISK     — material type lookup (CI=0.8, DI=0.6, PVC=0.3)
  + 0.20 × DIAM_RISK    — smaller diameter = higher risk (4–36")
  + 0.15 × CLASS_RISK   — lower pressure class = higher risk (50–300)
  + 0.10 × RELINE_RISK — relined pipe = 0.2, unrelined = 1.0
)
| Component | Source Field(s) | Weight | Logic |
|---|---|---|---|
| AGE_RISK | INSTALL_YR / INSTALL_DT | 30% | Age in years ÷ 100, clamped 0–1. Falls back from year to datetime field. |
| MAT_RISK | MATERIAL | 25% | Lookup table maps material code to risk value. Cast iron highest. |
| DIAM_RISK | DIAMETER | 20% | Inverted normalization: 4" = highest risk, 36" = lowest. Clamped in inches. |
| CLASS_RISK | PIPE_CLASS | 15% | Reverse-normalized 50–300 psi range. Lower pressure class = higher risk. |
| RELINE_RISK | RELINED_YR / RELINED_DT | 10% | Binary: relined pipe = 0.2 (lower risk), unrelined = 1.0 (default). |
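
As a rough illustration of the logic described above, the component scores and the weighted sum can be sketched in plain Python with no arcpy dependency. The function names, the `clamp01` helper, and the explicit `current_year` argument are hypothetical. The ranges (0–100 years, 4–36 inches, class 50–300), the relining values, and the weights come from the text. How the tool itself handles null or out-of-range values is not stated, so this sketch simply clamps.

```python
def clamp01(x):
    """Clamp a number to the 0-1 range."""
    return max(0.0, min(1.0, x))

def age_risk(install_year, current_year):
    """Age in years / 100, clamped to 0-1."""
    return clamp01((current_year - install_year) / 100.0)

def diam_risk(diameter_in):
    """Inverted normalization over 4-36 inches: 4 in -> 1.0, 36 in -> 0.0."""
    return clamp01((36.0 - diameter_in) / (36.0 - 4.0))

def class_risk(pipe_class):
    """Reverse normalization over class 50-300: 50 -> 1.0, 300 -> 0.0."""
    return clamp01((300.0 - pipe_class) / (300.0 - 50.0))

def reline_risk(relined):
    """Binary score: relined pipe = 0.2, unrelined = 1.0."""
    return 0.2 if relined else 1.0

def composite(age, mat, diam, cls, reline):
    """Weighted sum of the five component scores (weights sum to 1.0)."""
    return 0.30 * age + 0.25 * mat + 0.20 * diam + 0.15 * cls + 0.10 * reline

# Worked example with made-up attribute values: an unrelined 8-inch cast iron
# main (MAT_RISK = 0.8), class 150, installed in 1950 and scored in 2024.
example = composite(
    age_risk(1950, 2024),  # 74 / 100 = 0.74
    0.8,                   # cast iron lookup value
    diam_risk(8),          # (36 - 8) / 32 = 0.875
    class_risk(150),       # (300 - 150) / 250 = 0.6
    reline_risk(False),    # 1.0
)
print(round(example, 3))  # 0.787
```

The example pipe is hypothetical and is not a record from the Seattle dataset. It only shows how the five terms combine: 0.222 + 0.200 + 0.175 + 0.090 + 0.100 = 0.787.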

Material Risk Lookup

Pipe material is the second most heavily weighted risk factor. The lookup table encodes known corrosion and failure susceptibility across material types commonly found in Seattle's aging network.

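| Code | Material | Risk Value | Category |
|---|---|---|---|
| CI / CAST IRON | Cast Iron | 0.80 | High |
| AC | Asbestos Cement | 0.75 | Med-High |
| DI | Ductile Iron | 0.60 | Med-High |
| STEEL | Steel | 0.50 | Moderate |
| PVC | PVC | 0.30 | Low |
| HDPE | HDPE | 0.30 | Low |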
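
A lookup like this is usually more robust if the MATERIAL value is normalized before matching. The sketch below is one way to do that. The `mat_risk` function and the `UNKNOWN_MATERIAL_RISK` fallback are assumptions for illustration: the text does not say what value the tool assigns to blank or unlisted material codes.

```python
# Risk values taken from the material lookup described above.
MATERIAL_RISK = {
    "CI": 0.80, "CAST IRON": 0.80,
    "AC": 0.75,
    "DI": 0.60,
    "STEEL": 0.50,
    "PVC": 0.30,
    "HDPE": 0.30,
}

# Assumption: the fallback for blank or unlisted codes is not given in the
# text; 0.5 is a placeholder chosen for this sketch only.
UNKNOWN_MATERIAL_RISK = 0.5

def mat_risk(material):
    """Return the lookup risk for a MATERIAL value, ignoring case and padding."""
    if material is None:
        return UNKNOWN_MATERIAL_RISK
    key = str(material).strip().upper()
    return MATERIAL_RISK.get(key, UNKNOWN_MATERIAL_RISK)

print(mat_risk(" cast iron "))  # 0.8
print(mat_risk("PVC"))          # 0.3
print(mat_risk(None))           # 0.5 (placeholder fallback)
```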

Output Risk Scale — RISK_IDX Results

The final RISK_IDX ranges from 0.372 to 0.794 across Seattle's water main network. Five natural-break classes from the output legend define the risk tiers visible in the map.

| RISK_IDX range | Risk tier |
|---|---|
| 0.372–0.550 | Lower risk |
| 0.550–0.650 | Moderate |
| 0.650–0.700 | Elevated |
| 0.700–0.758 | High |
| 0.758–0.794 | Highest |
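
For use outside the map legend, the same tiers could be assigned in code. The following is a minimal sketch using the class breaks listed above. The `risk_tier` function is hypothetical, and treating each break as the inclusive upper bound of its class is an assumption about how boundary values are handled.

```python
from bisect import bisect_left

# Class breaks and tier labels taken from the output legend.
BREAKS = [0.550, 0.650, 0.700, 0.758]
TIERS = ["Lower risk", "Moderate", "Elevated", "High", "Highest"]

def risk_tier(risk_idx):
    """Return the tier label for a RISK_IDX value.

    Assumption: each break is the inclusive upper bound of its class,
    so a value of exactly 0.550 is labeled "Lower risk".
    """
    return TIERS[bisect_left(BREAKS, risk_idx)]

print(risk_tier(0.372))  # Lower risk
print(risk_tier(0.660))  # Elevated
print(risk_tier(0.794))  # Highest
```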

Methodology — Development Workflow

Python — Core Script Tool

The complete tool runs as an ArcGIS Pro script tool with exactly two parameters. The core logic uses arcpy.da.UpdateCursor to iterate over every pipe segment and write all five component risk values plus the composite index.

"""
Water main risk index (attribute-only).
Param 0: in_mains (Feature Class / Layer)
Param 1: out_mains (Feature Class)
"""
import arcpy
import datetime

material_risk_lookup = {
    "CI": 0.8,  "CAST IRON": 0.8,
    "AC": 0.75, "DI": 0.6,
    "STEEL": 0.5, "PVC": 0.3, "HDPE": 0.3
}

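# Composite risk index (weights sum to 1.0), computed per segment
# from the five component scores
def composite_risk(age_risk, mat_risk, diam_risk, class_risk, reline_risk):
    return (
        0.30 * age_risk +
        0.25 * mat_risk +
        0.20 * diam_risk +
        0.15 * class_risk +
        0.10 * reline_risk
    )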

# Helper: add DOUBLE field only if not already present
def add_double_field(fc, name):
    if name not in [f.name for f in arcpy.ListFields(fc)]:
        arcpy.management.AddField(fc, name, "DOUBLE")

# Output fields written per segment:
# AGE_YRS · AGE_RISK · DIAM_RISK · MAT_RISK
# CLASS_RISK · RELINE_RISK · RISK_IDX
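
The excerpt above omits the cursor loop itself. The sketch below shows one way the pieces might fit together, and it is not the project's actual script. The input field names are those listed under Figure 2 and the output field names are those listed in the excerpt. Everything else is an assumption: the `score_segment` and `run_tool` names, copying the input with CopyFeatures, and the 0.5 fallback for null attributes. The fallback from the year fields to the datetime fields is left out for brevity. `arcpy` is imported inside `run_tool` so the scoring logic can be exercised outside ArcGIS Pro.

```python
import datetime

FIELDS_IN = ["INSTALL_YR", "DIAMETER", "PIPE_CLASS", "MATERIAL", "RELINED_YR"]
FIELDS_OUT = ["AGE_YRS", "AGE_RISK", "DIAM_RISK", "MAT_RISK",
              "CLASS_RISK", "RELINE_RISK", "RISK_IDX"]
MATERIAL_RISK = {"CI": 0.8, "CAST IRON": 0.8, "AC": 0.75, "DI": 0.6,
                 "STEEL": 0.5, "PVC": 0.3, "HDPE": 0.3}

def clamp01(x):
    """Clamp a number to the 0-1 range."""
    return max(0.0, min(1.0, x))

def score_segment(install_yr, diameter, pipe_class, material, relined_yr, this_year):
    """Return the seven output values for one pipe, in FIELDS_OUT order.

    Assumption: null or zero inputs get a mid-range 0.5 score (and AGE_YRS = 0);
    the defaults used by the real tool are not stated in the text.
    """
    age_yrs = this_year - install_yr if install_yr else 0
    age = clamp01(age_yrs / 100.0) if install_yr else 0.5
    diam = clamp01((36.0 - diameter) / 32.0) if diameter else 0.5
    mat = MATERIAL_RISK.get(str(material or "").strip().upper(), 0.5)
    cls = clamp01((300.0 - pipe_class) / 250.0) if pipe_class else 0.5
    reline = 0.2 if relined_yr else 1.0
    idx = 0.30 * age + 0.25 * mat + 0.20 * diam + 0.15 * cls + 0.10 * reline
    return [age_yrs, age, diam, mat, cls, reline, idx]

def run_tool(in_mains, out_mains):
    """Copy the input mains, add the output fields, and score every segment."""
    import arcpy  # imported here so score_segment works without ArcGIS Pro

    arcpy.management.CopyFeatures(in_mains, out_mains)
    existing = {f.name for f in arcpy.ListFields(out_mains)}
    for name in FIELDS_OUT:
        if name not in existing:
            arcpy.management.AddField(out_mains, name, "DOUBLE")

    this_year = datetime.date.today().year
    n_in = len(FIELDS_IN)
    with arcpy.da.UpdateCursor(out_mains, FIELDS_IN + FIELDS_OUT) as cursor:
        for row in cursor:
            inputs = list(row[:n_in])
            cursor.updateRow(inputs + score_segment(*inputs, this_year))

# Inside the script tool, the two parameters would be passed in like this:
# run_tool(arcpy.GetParameterAsText(0), arcpy.GetParameterAsText(1))
```

Keeping the scoring in a pure function separate from the cursor loop means the arithmetic can be checked with ordinary unit tests before the tool is run against the full network.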

Dataset Inventory

| Dataset | Source | Type | CRS | Status |
|---|---|---|---|---|
| Water_Mains.shp | Seattle Public Utilities (data.seattle.gov) | Shapefile | NAD83 HARN WA North | Active |
| Break / Outage Events | Seattle Public Utilities (Outage Viewer) | Feature Service | WGS84 Web Mercator | Planned |
| SSURGO Soil Polygons | USDA NRCS (websoilsurvey.sc.egov.usda.gov) | Shapefile | NAD83 | Planned |
| 1m DTM (King County West 2021) | WA LiDAR Portal (lidarportal.dnr.wa.gov) | Raster (GeoTIFF) | NAD83 HARN WA North | Planned |
| WSDOT Traffic Sections (AADT) | WA Spatial Data Hub (geo.wa.gov) | Shapefile | NAD83 HARN WA North | Planned |
| City_Limits.shp | Seattle Open Data (data.seattle.gov) | Shapefile | NAD83 HARN WA North | Active |

Limitations

Future Work

Slope Integration

Preprocess DTM using Add Surface Information to create a SLOPE_MEAN field per pipe. Normalize to represent topographic pressure stress as an additional risk component.

Soil Corrosivity

Derive a corrosivity index from SSURGO component tables and spatially join it to the water mains. This would account for soil-driven corrosion of buried iron pipe.

Machine Learning

Combine historical break records with attribute and environmental variables to train a gradient boosting or random forest model using scikit-learn and ArcGIS Spatially Enabled DataFrames.

The tool's modular structure and reliance on universally available pipe attributes (age, material, diameter, relining, class) make it directly replicable in any municipality that maintains comparable infrastructure records — positioning it for broader use in both academic and professional utility contexts.