Municipal Government

Civic Infrastructure AI

Multi-pipeline AI platform: 197 asset types detected, 86K trees assessed, 99%+ crash extraction accuracy, 6,400+ automated tests

Civic infrastructure AI case study — production AI system processing public-sector documents for a government agency, with privacy-preserving classification and audit trails

The Problem

Manual inspection of street-level imagery, tree hazards, road conditions, and crash reports at city scale — across millions of images and thousands of documents

A major municipal government preparing for a global sporting event needed to catalog infrastructure assets, assess tree hazards, evaluate roadway conditions, and extract structured data from crash reports across an entire metropolitan area. Manual inspection could not scale. Street-level imagery existed, but nothing analyzed it automatically. More than 93,000 street trees needed hazard scoring, and 763 miles of roadway required condition assessment at 20-foot intervals. Thousands of crash reports from multiple jurisdictions sat as scanned PDFs and TIFFs with no structured data. PII in public imagery and crash documents created a compliance risk that blocked deployment.

We built a multi-pipeline AI platform on AWS. The infrastructure detection pipeline uses Claude Sonnet and LiDAR point cloud processing to classify 197 asset types from street-level imagery, with position refinement, cross-collection deduplication, and interactive GeoJSON reports. The system runs on ECS Fargate with parallel workers and Bedrock Batch inference for 50% cost savings at scale.
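As a rough illustration of cross-collection deduplication, the sketch below keeps the highest-confidence detection when two detections of the same asset type from different imagery collections fall within a small radius of each other. The `Detection` fields, the 3-meter radius, and the greedy strategy are assumptions made for this example; the page does not describe the production rules.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    asset_type: str
    lat: float
    lon: float
    collection_id: str  # hypothetical: which imagery collection produced it
    confidence: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def deduplicate(detections, radius_m=3.0):
    """Greedy dedup: visit detections from most to least confident and drop
    any that repeats an already-kept asset of the same type seen in a
    different collection within radius_m (assumed threshold)."""
    kept = []
    for det in sorted(detections, key=lambda d: d.confidence, reverse=True):
        is_duplicate = any(
            k.asset_type == det.asset_type
            and k.collection_id != det.collection_id
            and haversine_m(k.lat, k.lon, det.lat, det.lon) <= radius_m
            for k in kept
        )
        if not is_duplicate:
            kept.append(det)
    return kept
```

The pairwise scan is quadratic, which is fine for a sketch; at city scale a spatial index would normally replace the inner loop.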

A tree hazard assessment pipeline processed 86,164 street trees across 7 major event sites, achieving a 97.2% assessment rate. KD-tree bearing-based image selection finds the 15 best views per tree from 4.4 million camera records. A semantic retry mechanism improved 8,048 assessments from non-assessable to assessable — identifying 24 critical-priority and 3,085 high-priority trees. A separate roadway assessment pipeline extracts 12+ condition attributes from 205,707 survey points across 763 miles at 20-foot intervals.
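One way to picture KD-tree bearing-based image selection: index camera positions in a 2-D KD-tree, query the cameras near a tree, keep those whose heading points toward the tree, and return the best few. The sketch below is a minimal pure-Python version under stated assumptions. The local metric coordinate frame, the 30-meter radius, the 45-degree tolerance, and the ranking by angular offset then distance are all illustrative choices, not the production parameters.

```python
import math
from typing import NamedTuple, Optional


class Camera(NamedTuple):
    x: float        # meters east in a local projected frame (assumed)
    y: float        # meters north
    heading: float  # degrees clockwise from north
    image_id: str


class Node(NamedTuple):
    cam: Camera
    axis: int
    left: Optional["Node"]
    right: Optional["Node"]


def build(cams, depth=0):
    """Build a balanced 2-D KD-tree by splitting on the median each level."""
    if not cams:
        return None
    axis = depth % 2
    cams = sorted(cams, key=lambda c: c[axis])
    mid = len(cams) // 2
    return Node(cams[mid], axis,
                build(cams[:mid], depth + 1),
                build(cams[mid + 1:], depth + 1))


def within(node, x, y, radius, out):
    """Append every camera within radius of (x, y) to out."""
    if node is None:
        return
    c = node.cam
    if math.hypot(c.x - x, c.y - y) <= radius:
        out.append(c)
    delta = (x, y)[node.axis] - c[node.axis]
    near, far = (node.left, node.right) if delta < 0 else (node.right, node.left)
    within(near, x, y, radius, out)
    # The far side can only hold matches if the split plane is within radius.
    if abs(delta) <= radius:
        within(far, x, y, radius, out)


def bearing_deg(from_x, from_y, to_x, to_y):
    """Compass bearing from one point to another: 0 = north, 90 = east."""
    return math.degrees(math.atan2(to_x - from_x, to_y - from_y)) % 360.0


def angle_diff(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def select_views(index, tree_x, tree_y, k=15, radius=30.0, max_offset=45.0):
    """Return up to k cameras that are near the tree and facing it."""
    nearby = []
    within(index, tree_x, tree_y, radius, nearby)
    scored = []
    for cam in nearby:
        dist = math.hypot(cam.x - tree_x, cam.y - tree_y)
        if dist == 0.0:
            continue  # bearing is undefined when camera and tree coincide
        offset = angle_diff(cam.heading, bearing_deg(cam.x, cam.y, tree_x, tree_y))
        if offset <= max_offset:
            scored.append((offset, dist, cam))
    scored.sort(key=lambda t: (t[0], t[1]))
    return [cam for _, _, cam in scored[:k]]
```

The index is built once over all camera records and then queried per tree, so each lookup touches only the cameras near that tree instead of the full set.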

A crash report extraction system handles multi-jurisdiction document processing with a hybrid OCR + LLM pipeline achieving 99.36–99.78% field accuracy. Pre-extraction PII redaction blacks out 40–65 sensitive fields in document images before any ML processing — zero PII reaches cloud AI services. A crash narrative analyzer extracts 30 structured fields from narrative text across 13 jurisdictions with post-check validation rules. The full platform spans 6,400+ automated tests across Python and Rust codebases.
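The ordering matters more than the mechanics in pre-extraction redaction: sensitive regions are blacked out on the image first, and only the redacted copy is passed downstream. The sketch below shows that ordering on a grayscale image held as a list of pixel rows. The `Box` type, the idea that boxes come from a per-form template, and the `extractor` callable are hypothetical stand-ins for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def redact(image, boxes, fill=0):
    """Return a copy of a grayscale image (list of rows of ints) with every
    box filled with a solid value. The input image is left untouched."""
    height = len(image)
    width = len(image[0]) if height else 0
    out = [row[:] for row in image]
    for b in boxes:
        # Clamp each box to the image bounds before filling.
        for y in range(max(0, b.top), min(height, b.bottom)):
            for x in range(max(0, b.left), min(width, b.right)):
                out[y][x] = fill
    return out


def extract_with_redaction(image, pii_boxes, extractor):
    """Redact first, then hand only the redacted copy to the extractor
    (a hypothetical callable standing in for the OCR + LLM stage)."""
    return extractor(redact(image, pii_boxes))
```

Because `extract_with_redaction` never passes the original image along, the extractor cannot see the blacked-out pixels regardless of how it is implemented.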

Editorial notes

Mandate

Collapse fragmented review cycles into a single delivery cadence without weakening municipal controls.

Signal

Design authority came from making evidence and operator confidence visible at every stage, not hiding complexity behind marketing language.

Operational insert

A multi-domain AI platform built for municipal trust

The core problem was not building one model. It was building five specialized AI pipelines — infrastructure, trees, roads, crash documents, and narratives — each with domain-specific accuracy requirements, and making their evidence legible enough for operators to act on without interpretive guesswork.

Domain coverage

Infrastructure assets, tree hazards, roadway conditions, crash reports, and narrative analysis — each pipeline operates independently but shares a common compliance and delivery framework.

Risk control

Pre-extraction PII redaction blacks out sensitive fields in document images before any ML processing. Zero PII reaches cloud AI services. SOC 2 certified.

Operator confidence

Interactive Leaflet maps, GeoJSON downloads, and drill-down reports were structured to support municipal review and audit visibility — not just raw model throughput.

Diagram showing civic infrastructure AI workflow from imagery intake to privacy-safe municipal review

Context

A major municipal government preparing for a global event needed to catalog 197 infrastructure asset types, assess 93K street trees for hazards, evaluate 763 miles of roadway, and extract structured data from thousands of multi-jurisdiction crash reports.

Constraint

PII in imagery and documents blocked deployment. Manual inspection could not scale. Each domain (infrastructure, trees, roads, crashes) required specialized AI pipelines with domain-specific accuracy validation.

Intervention

Built a multi-pipeline AI platform: infrastructure detection (Claude + LiDAR), tree hazard scoring (KD-tree image selection, semantic retry), roadway assessment (12+ attributes at 20ft intervals), and crash extraction (hybrid OCR + LLM, pre-extraction PII redaction, 99%+ accuracy).

Outcome

197 asset types detected, 86K trees assessed (97.2% rate, 24 critical found), 99.36–99.78% crash extraction accuracy, pre-extraction PII compliance, 6,400+ tests — delivered in 35 weeks.

Architecture

Multi-domain AI platform from imagery to structured intelligence

Vision Detection

Claude Sonnet and multi-model inference classify 197 infrastructure asset types and 12+ roadway condition attributes from street-level imagery. LiDAR point clouds provide sub-meter geospatial accuracy. Bedrock Batch reduces inference cost by 50%.

Tree Hazard Assessment

KD-tree bearing-based image selection across 4.4M camera records finds the 15 best views per tree. Semantic retry improves 8,048 assessments. 86,164 trees scored on a 0–100 hazard scale across 13 condition factors.

Document Extraction

Hybrid OCR (Textract) + LLM (Claude) pipeline extracts 167–215 fields per crash report with field-type-aware consensus. Pre-extraction PII redaction blacks out sensitive fields in images before any ML processing. 99.36–99.78% accuracy across 3 jurisdictions.
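Field-type-aware consensus can be read as: normalize the OCR and LLM values according to the field's type, accept them when they agree, and otherwise fall back to a per-type preference while flagging the field for review. The sketch below illustrates that reading only. The type names, the normalization rules, and the preference table are assumptions for this example and are not taken from the production system.

```python
import re

# Hypothetical preference when the two sources disagree.
PREFERRED = {"number": "ocr", "date": "ocr", "text": "llm"}


def normalize(value, field_type):
    """Reduce a raw value to a comparable form for its field type."""
    text = (value or "").strip()
    if field_type == "number":
        return re.sub(r"[^\d.\-]", "", text)
    if field_type == "date":
        # Keep digits only so 01/02/2024 and 01-02-2024 compare equal.
        return re.sub(r"\D", "", text)
    return " ".join(text.lower().split())


def consensus(field_type, ocr_value, llm_value):
    """Pick a value for one field and say where it came from."""
    n_ocr = normalize(ocr_value, field_type)
    n_llm = normalize(llm_value, field_type)
    if not n_ocr and not n_llm:
        return {"value": None, "source": "none", "needs_review": False}
    if n_ocr and n_ocr == n_llm:
        return {"value": ocr_value.strip(), "source": "both", "needs_review": False}
    if not n_ocr:
        return {"value": llm_value.strip(), "source": "llm", "needs_review": True}
    if not n_llm:
        return {"value": ocr_value.strip(), "source": "ocr", "needs_review": True}
    # Both sources produced a value and they disagree.
    preferred = PREFERRED.get(field_type, "llm")
    chosen = ocr_value if preferred == "ocr" else llm_value
    return {"value": chosen.strip(), "source": preferred, "needs_review": True}
```

For instance, `consensus("date", "01/02/2024", "01-02-2024")` treats the two spellings as agreeing, while `consensus("number", "35", "36")` keeps the OCR reading and marks the field for review.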

Data Platform

Results stored in Snowflake with geospatial indexing. Interactive HTML reports with Leaflet maps, GeoJSON downloads, and drill-down details. REST APIs serve extraction results to downstream systems. 6,400+ automated tests across Python and Rust.
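For the GeoJSON downloads, a result set can be serialized as a standard FeatureCollection of Point features. The sketch below assumes each asset is a dictionary with `lat` and `lon` keys plus arbitrary attributes; those key names and the sample attributes are hypothetical. The one detail worth noting is that GeoJSON stores coordinates as longitude first, then latitude.

```python
import json


def to_feature_collection(assets):
    """Convert asset dicts into a GeoJSON FeatureCollection of Points."""
    features = []
    for asset in assets:
        features.append({
            "type": "Feature",
            # GeoJSON coordinate order is [longitude, latitude].
            "geometry": {"type": "Point",
                         "coordinates": [asset["lon"], asset["lat"]]},
            # Everything except the position becomes a feature property.
            "properties": {k: v for k, v in asset.items()
                           if k not in ("lat", "lon")},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    sample = [{"asset_type": "fire_hydrant", "lat": 40.0, "lon": -100.0,
               "confidence": 0.9}]
    print(json.dumps(to_feature_collection(sample), indent=2))
```

A file in this shape loads directly into Leaflet through its `L.geoJSON` layer, which fits the map-based reports described above.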

Tech Stack

Compute

AWS ECS Fargate + Bedrock Batch

AI Models

Claude Sonnet 4.6, Qwen 3 VL, Grounding DINO

Languages

Python + Rust (54K LOC Rust)

Data

Snowflake, S3, Textract OCR

Spatial

LiDAR point clouds, KD-tree indexing

Compliance

Pre-extraction PII redaction, SOC 2

Results

197

Infrastructure asset types

86K

Trees assessed (97.2% rate)

99%+

Crash extraction accuracy

763 mi

Roadway assessed

6,400+

Automated tests

50%

Inference cost reduction
