CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
NEVER EVER RUN GIT COMMANDS!!
Project Overview
Property Map is a full-stack geospatial application for visualizing UK property data on an interactive map. It combines Land Registry price-paid data, EPC energy certificates, postcode geolocation, TFL journey times, Index of Deprivation scores, crime statistics, ethnicity data, broadband speeds, school ratings, road noise, and OpenStreetMap POIs into a single wide parquet file, then serves aggregated H3 hexagon statistics and POI data via a Rust backend.
Commands
All commands run through the Task runner (task). Python commands use uv run. Frontend commands use npm run from frontend/.
# Development servers
task dev:server # Rust backend on :8001 (cargo run --release)
task dev:frontend # Webpack dev server on :3030 (proxies /api to :8001)
# Data pipeline
task prepare # Build wide.parquet from all pre-downloaded sources
# Quality
task lint # Lint all: Python (ruff) + TypeScript (ESLint+Prettier) + Rust (clippy+fmt)
task format # Auto-fix formatting for all languages
task test # Python tests (fuzzy join, haversine, POI counts)
task check # Full validation: lint + build + test
# Building
task build:frontend # TypeScript typecheck + webpack production build
task build:server # cargo build --release (NOTE: dir is wrong in Taskfile, run from server-rs/)
# Granular lint/format
task lint:python # uv run ruff check .
task lint:frontend # eslint + prettier --check
task lint:rust # cargo clippy -- -D warnings && cargo fmt --check
task format:python # ruff check --fix && ruff format
task format:frontend # eslint --fix + prettier --write
task format:rust # cargo fmt --all
Running individual tests:
uv run pytest pipeline/utils/test_haversine.py # Single test file
uv run pytest pipeline/utils/test_haversine.py -k "test_name" # Single test
Architecture
Data Flow
Raw sources → [Download scripts] → data/*.parquet
→ [Fuzzy join EPC ↔ Price-Paid] → epc_pp.parquet
→ [Merge all datasets] → wide.parquet
→ [Rust server loads into memory + precomputes H3 + spatial grid]
→ [Frontend renders deck.gl H3HexagonLayer over MapLibre GL]
Data Pipeline (pipeline/)
Python + Polars. Two phases:
- Download (pipeline/download/) - Each script fetches one raw dataset into data/
- Transform (pipeline/transform/) - Joins and derives features:
  - join_epc_pp.py - Fuzzy-joins EPC ↔ price-paid by address within postcode buckets
  - merge.py - Main pipeline: joins all datasets → wide.parquet with human-readable column names
  - transform_poi.py - Filters POIs, maps to friendly names + emoji (exhaustive category validation)
  - poi_proximity.py - Counts POIs within 2km per postcode using 0.05° spatial grid
  - crime.py - Aggregates crime CSVs into yearly averages by LSOA
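As a rough illustration of the fuzzy-join step, the sketch below matches addresses inside a single postcode bucket by scoring every compatible pair and then assigning greedily from the highest score down. It is not the code in join_epc_pp.py: difflib.SequenceMatcher stands in for thefuzz.token_sort_ratio so the example needs no third-party packages, and the names token_sort_score, numeric_tokens, greedy_match and the 80-point threshold are assumptions made for illustration. "Numeric token compatibility" is interpreted here as requiring the same set of numbers in both addresses.

```python
import re
from difflib import SequenceMatcher


def token_sort_score(a: str, b: str) -> float:
    """Stand-in for thefuzz.token_sort_ratio: compare token-sorted strings (0-100)."""
    norm_a = " ".join(sorted(a.lower().split()))
    norm_b = " ".join(sorted(b.lower().split()))
    return 100.0 * SequenceMatcher(None, norm_a, norm_b).ratio()


def numeric_tokens(address: str) -> set[str]:
    """All digit runs in the address (house and flat numbers)."""
    return set(re.findall(r"\d+", address))


def greedy_match(epc: list[str], pp: list[str], min_score: float = 80.0) -> list[tuple[str, str]]:
    """Match addresses within ONE postcode bucket, best-scoring pairs first."""
    candidates = []
    for i, a in enumerate(epc):
        for j, b in enumerate(pp):
            if numeric_tokens(a) != numeric_tokens(b):
                continue  # "12 High St" must never match "14 High St"
            score = token_sort_score(a, b)
            if score >= min_score:
                candidates.append((score, i, j))

    candidates.sort(reverse=True)  # highest score first
    used_epc, used_pp, matches = set(), set(), []
    for _score, i, j in candidates:
        if i in used_epc or j in used_pp:
            continue  # each record is assigned at most once
        used_epc.add(i)
        used_pp.add(j)
        matches.append((epc[i], pp[j]))
    return matches


bucket_epc = ["12 High Street", "Flat 2, 14 High Street"]
bucket_pp = ["14 HIGH STREET FLAT 2", "HIGH STREET 12"]
matches = greedy_match(bucket_epc, bucket_pp)
```

Greedy assignment is simple and fast, but it does not guarantee the globally best pairing. Restricting candidates to one postcode bucket keeps the pairwise comparison small.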
Critical: column renaming in merge.py — The pipeline renames columns from snake_case to human-readable names before writing wide.parquet. The Rust server auto-discovers features from whatever column names exist in the parquet. Key renames:
- pp_address → Address per Property Register
- postcode → Postcode
- latest_price → Last known price
- duration → Leashold/Freehold
- total_floor_area → Total floor area (sqm)
- current_energy_rating → Current energy rating
The server and frontend must handle these human-readable names. See the full rename map in merge.py.
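A minimal sketch of the effect of the rename step, to show why everything downstream sees the human-readable names. RENAMES copies a few entries from the list above, and rename_columns is a hypothetical helper. merge.py holds the authoritative map, and the real pipeline works on Polars frames rather than plain lists.

```python
# A few entries copied from the rename list; the full map lives in merge.py.
RENAMES = {
    "pp_address": "Address per Property Register",
    "postcode": "Postcode",
    "latest_price": "Last known price",
    "total_floor_area": "Total floor area (sqm)",
}


def rename_columns(columns: list[str]) -> list[str]:
    """Hypothetical helper: map known snake_case names, pass others through."""
    return [RENAMES.get(name, name) for name in columns]


renamed = rename_columns(["postcode", "latest_price", "some_other_column"])
```

Because the server discovers features from the parquet column names, any name that appears in this map's values is what the API and frontend will receive.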
Backend (server-rs/)
Rust + Axum. Loads parquet into memory at startup.
Structure (uses Rust 2018 module style — foo.rs + foo/ directory, not foo/mod.rs):
- data.rs + data/ - Property and POI data loading
- parsing.rs + parsing/ - Filter parsing and bounds parsing
- routes.rs + routes/ - One file per endpoint
- utils.rs + utils/ - GridIndex, hashing, interned columns
- consts.rs - Key constants (histogram bins, H3 range, max enum cardinality, excluded columns)
API endpoints:
- GET /api/features - Feature metadata with histograms and 2nd/98th percentiles
- GET /api/hexagons?resolution=&bounds=&filters= - H3 aggregates (min/max per feature per hex)
- GET /api/hexagon-properties?h3=&resolution=&filters=&limit=&offset= - Paginated properties within a hexagon
- GET /api/pois?bounds=&categories= - POIs by bounds (max 5000)
- GET /api/poi-categories - Available POI category names
Serves frontend/dist/ as static fallback in production.
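The 2nd/98th percentiles reported by /api/features can be estimated from a histogram without sorting the data. The sketch below shows one way to do that in two passes, written in Python for brevity even though the server is Rust. The function name, the fixed-width bins, and the default of 100 bins are assumptions; the real bin count is defined in consts.rs.

```python
import math


def histogram_percentiles(values, percentiles=(2.0, 98.0), num_bins=100):
    """Approximate percentiles from a fixed-width histogram, skipping NaN (null)."""
    # Pass 1: find the value range and the number of non-null values.
    lo, hi, n = math.inf, -math.inf, 0
    for v in values:
        if not math.isnan(v):
            lo, hi, n = min(lo, v), max(hi, v), n + 1
    if n == 0:
        return None
    if lo == hi:
        return [lo for _ in percentiles]

    # Pass 2: fill the bin counts.
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        if not math.isnan(v):
            counts[min(int((v - lo) / width), num_bins - 1)] += 1

    # Walk the cumulative counts and interpolate linearly inside the bin
    # that contains each target rank.
    results = []
    for p in percentiles:
        target = p / 100.0 * n
        cumulative = 0
        for b, c in enumerate(counts):
            if c > 0 and cumulative + c >= target:
                frac = (target - cumulative) / c
                results.append(lo + (b + frac) * width)
                break
            cumulative += c
    return results


data = [float(i) for i in range(101)] + [float("nan")]
p2, p98 = histogram_percentiles(data)
```

The result is an approximation whose error is bounded by one bin width, which is usually enough for choosing a color-scale range.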
Data representation (unified model):
- All features (numeric and enum): row-major flat Vec<f32>, NaN = null
- Enum features: stored as f32 indices (0.0, 1.0, 2.0...) with enum_values: FxHashMap<usize, Vec<String>> mapping feature index → string values
- String fields (address, postcode): interned/packed for memory efficiency
- The server accepts the parquet path as a CLI argument (defaults to data_sources/processed/wide.parquet)
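The unified model above can be pictured with a small Python sketch. It mirrors the layout only and is not the Rust implementation: the feature list, the sample values, and get_value are made up for illustration, and Python floats are 64-bit where the server stores f32.

```python
import math

FEATURES = ["Last known price", "Total floor area (sqm)", "Current energy rating"]
NUM_FEATURES = len(FEATURES)

# Enum features keep their string values in a side table keyed by feature index.
enum_values = {2: ["A", "B", "C", "D", "E", "F", "G"]}

# Two properties, row-major: all features of row 0, then all features of row 1.
feature_data = [
    450_000.0, 82.0, 2.0,           # row 0: rating index 2 -> "C"
    310_000.0, math.nan, math.nan,  # row 1: floor area and rating are null
]


def get_value(row: int, feat_idx: int):
    """Decode one cell: NaN -> None, enum index -> string, otherwise the number."""
    raw = feature_data[row * NUM_FEATURES + feat_idx]
    if math.isnan(raw):
        return None
    if feat_idx in enum_values:
        return enum_values[feat_idx][int(raw)]
    return raw
```

Storing enum indices in the same flat array means filtering and aggregation can treat every feature the same way, with the string lookup needed only when building a response.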
Frontend (frontend/)
React 18 + TypeScript. deck.gl H3HexagonLayer over MapLibre GL. TailwindCSS. No state management library — pure React hooks.
Key patterns:
- App.tsx manages all state, API fetching (150ms debounce), and URL state sync (300ms debounce)
- URL encodes view/filters/POI categories/active tab as query params for shareable links
- AbortControllers cancel in-flight requests on new queries
- Zoom → H3 resolution: <7→7, <9.5→8, <11→9, <13→10, ≥13→11
- Bounds quantized to 0.01° to match backend caching
- Properties pane uses feature names from API response (human-readable), not hardcoded field names
- Proxy: dev server on :3030 proxies /api to :8001; also handles VS Code /proxy/PORT patterns
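The zoom thresholds and the bounds quantization from the list above can be sketched as follows. This is Python for illustration only; the real logic lives in the TypeScript frontend. Both function names are hypothetical, and rounding the bounds outward (so the quantized box always covers the viewport) is an assumption, as is the south,west,north,east argument order borrowed from the backend's bounds format.

```python
import math


def zoom_to_h3_resolution(zoom: float) -> int:
    """Thresholds from the list: <7→7, <9.5→8, <11→9, <13→10, ≥13→11."""
    if zoom < 7:
        return 7
    if zoom < 9.5:
        return 8
    if zoom < 11:
        return 9
    if zoom < 13:
        return 10
    return 11


def quantize_bounds(south, west, north, east, step=0.01):
    """Snap bounds to a 0.01 degree grid, rounding outward (an assumption)."""
    inv = round(1 / step)  # 100 grid lines per degree for step=0.01
    return (
        math.floor(south * inv) / inv,
        math.floor(west * inv) / inv,
        math.ceil(north * inv) / inv,
        math.ceil(east * inv) / inv,
    )


quantized = quantize_bounds(51.4712, -0.1278, 51.5289, -0.0561)
```

Quantizing means small pans produce identical request URLs, so repeated requests can hit the same cache entry.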
Frontend Design Guide (STRICT — must be followed for all UI changes)
The frontend uses Tailwind's darkMode: 'class' strategy. The dark class is toggled on <html>. Every visible element must have both light and dark styles. Never add a light-only color class without its dark: counterpart. Run task build:frontend after any UI change to verify.
Theme System
- State: App.tsx owns a theme state ('light' | 'dark' | 'system'), persisted in localStorage under the key theme, default 'system'.
- Effective theme: When 'system', resolved via window.matchMedia('(prefers-color-scheme: dark)'). A change listener re-renders on OS preference flip.
- Toggle cycle: light → dark → system → light. Three-way, not binary.
- Flash prevention: index.html contains an inline <script> that applies the dark class before first paint. Keep that script's localStorage/matchMedia logic in sync with App.tsx: when the logic changes in one place, update the other.
- Prop plumbing: effectiveTheme ('light' | 'dark') is passed as a prop to <Map> and <HomePage>. Components that need the resolved theme must receive it as a prop; do not read localStorage or matchMedia inside child components.
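The resolution and toggle rules above reduce to two small pure functions. The sketch below expresses them in Python for illustration; the actual code is React/TypeScript in App.tsx, and the names effective_theme, next_theme, and os_prefers_dark (standing in for the matchMedia result) are assumptions.

```python
CYCLE = ["light", "dark", "system"]


def effective_theme(theme: str, os_prefers_dark: bool) -> str:
    """Resolve the stored preference to the theme that is actually rendered."""
    if theme == "system":
        return "dark" if os_prefers_dark else "light"
    return theme


def next_theme(theme: str) -> str:
    """Three-way toggle: light → dark → system → light."""
    return CYCLE[(CYCLE.index(theme) + 1) % len(CYCLE)]
```

Only the resolved value ('light' or 'dark') should travel down as a prop; the stored preference, which may be 'system', stays in App.tsx.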
Color Token Reference
Every UI element must use the correct token from this table. Do not invent new pairings.
| Role | Light class | Dark class | Hex (dark) |
|---|---|---|---|
| Page / pane background | bg-warm-50 or bg-white | dark:bg-warm-900 | #1c1917 |
| Card / elevated surface | bg-white | dark:bg-warm-800 | #292524 |
| Inset / recessed surface | bg-warm-100 or bg-warm-50 | dark:bg-warm-800 | #292524 |
| Input / select background | bg-white | dark:bg-warm-800 or dark:bg-warm-900 | |
| Primary border | border-warm-200 | dark:border-warm-700 | #44403c |
| Subtle border (dividers) | border-warm-100 | dark:border-warm-800 | #292524 |
| Primary text (headings) | text-navy-950 or implicit dark | dark:text-warm-100 | #f5f5f4 |
| Body text | text-warm-700 | dark:text-warm-300 | #d6d3d1 |
| Secondary text (labels, hints) | text-warm-500 or text-warm-600 | dark:text-warm-400 | #a8a29e |
| Disabled / placeholder text | text-warm-400 / placeholder-warm-400 | dark:text-warm-500 / dark:placeholder-warm-500 | #78716c |
| Accent text (links, actions) | text-teal-600 | dark:text-teal-400 | #1de4c3 |
| Accent hover text | hover:text-teal-800 | dark:hover:text-teal-300 | #51f7d9 |
| Accent background (highlights) | bg-teal-50 | dark:bg-teal-900/30 | |
| Active ring / focus ring | ring-teal-400 | same (works in both) | |
| Price / key metric text | text-teal-700 | dark:text-teal-400 | |
| Remove / close button | text-warm-400 hover:text-warm-700 | dark:hover:text-warm-300 | |
| Checkbox accent | accent-teal-600 | same (works in both) | |
| Header (unchanged both modes) | bg-navy-900 text-white | same | |
Mapping Rules for Specific Contexts
Sidebars (Filters, POIPane, PropertiesPane, right-pane tabs):
- Container: bg-white dark:bg-warm-900
- Inner cards / dropdown menus: bg-white dark:bg-warm-800
- Borders: border-warm-200 dark:border-warm-700
- Tab text (active): add dark:text-warm-100
- Tab text (inactive): text-warm-600 dark:text-warm-400
Map overlays (PostcodeSearch, MapLegend, POI popup, loading indicator):
- Background: bg-white dark:bg-warm-800
- Text: dark:text-warm-200
- Semi-transparent variants: use /90 opacity suffix (e.g. dark:bg-warm-800/90)
- Deck.gl tooltip (inline styles, not Tailwind): use #292524 bg / #e7e5e4 text / rgba(0,0,0,0.5) shadow in dark.
- Deck.gl postcode labels (RGBA arrays): [220,220,220,220] text / [30,30,30,200] outline in dark; inverse in light.
Map basemaps:
- Light: https://basemaps.cartocdn.com/gl/voyager-gl-style/style.json
- Dark: https://basemaps.cartocdn.com/gl/dark-matter-gl-style/style.json
- handleMapLoad must only apply label/water tweaks in light mode. Dark Matter has good defaults.
HomePage (landing page):
- Page bg: bg-warm-50 dark:bg-warm-900
- Cards: bg-white dark:bg-warm-800 with border-warm-200 dark:border-warm-700
- Backdrop-blur panels: use /60 or /40 opacity on both bg-warm-50 and dark:bg-warm-900
- HexCanvas: reads isDark ref; uses dimmer fill (#058172) and stroke (#0a665b) at 60% opacity multiplier.
- All headings: dark:text-warm-100. All body: dark:text-warm-300 or dark:text-warm-400.
DataSourcesPage:
- Same card pattern as above. Footer is already dark (bg-navy-900), so no changes are needed.
- License badges: bg-warm-100 dark:bg-warm-700 text-warm-600 dark:text-warm-300
- Links: text-teal-600 dark:text-teal-400
DataSources floating button (on map):
- bg-white/90 dark:bg-warm-800/90 with text-teal-600 dark:text-teal-400
Rules for New Components
- Every bg-white needs dark:bg-warm-800 or dark:bg-warm-900. Pane-level = warm-900, card-level = warm-800.
- Every border-warm-200 needs dark:border-warm-700.
- Every text-warm-* needs a dark:text-warm-* counterpart. Follow the token table; don't guess.
- Every text-teal-600 needs dark:text-teal-400. Every hover:text-teal-800 needs dark:hover:text-teal-300.
- Every bg-teal-50 needs dark:bg-teal-900/30.
- Every hover:bg-warm-50 needs dark:hover:bg-warm-700 or dark:hover:bg-warm-800.
- Inputs and selects: always add dark:bg-warm-800 dark:text-warm-200 dark:border-warm-700. Placeholders get dark:placeholder-warm-500.
- Checkboxes: always include accent-teal-600 rounded.
- Do not use Tailwind dark: classes inside deck.gl layers or canvas code. Use the theme prop / ref and conditional JS values.
- Do not add transition-* classes for theme switching. The global CSS rule in index.css handles transitions for background-color, border-color, and color on all standard HTML elements. Adding per-element transition classes will conflict.
- Never hardcode hex colors in JSX style= props for themed elements (except deck.gl tooltip and canvas, which can't use Tailwind). Use the Tailwind classes from the token table instead.
- The header (bg-navy-900) is identical in both themes. Do not add dark variants to it.
Verification Checklist (for any UI PR)
- task build:frontend passes with no errors
- Every new bg-*, text-*, border-* class has a dark: counterpart (search your diff)
bg-*,text-*,border-*class has adark:counterpart (search your diff) - Toggle through all three modes (light → dark → system) with no flash
- Map basemap switches when theme changes
- Sidebars, dropdowns, and popups are readable in both modes
- HomePage and DataSourcesPage adapt correctly
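The "search your diff" step can be partly automated. The sketch below is a hypothetical helper, not a tool that exists in this repository. It checks one className string against the pairings stated in the rules for new components. It covers only the exact classes listed in REQUIRED_DARK, so opacity variants such as bg-white/90 and the text-warm-* family still need a manual check against the token table.

```python
# Light class -> acceptable dark counterparts, copied from the rules above.
REQUIRED_DARK = {
    "bg-white": {"dark:bg-warm-800", "dark:bg-warm-900"},
    "border-warm-200": {"dark:border-warm-700"},
    "text-teal-600": {"dark:text-teal-400"},
    "hover:text-teal-800": {"dark:hover:text-teal-300"},
    "bg-teal-50": {"dark:bg-teal-900/30"},
    "hover:bg-warm-50": {"dark:hover:bg-warm-700", "dark:hover:bg-warm-800"},
}


def missing_dark_classes(class_name: str) -> list[str]:
    """Return the light classes in class_name that lack a dark: counterpart."""
    classes = set(class_name.split())
    missing = []
    for light, darks in REQUIRED_DARK.items():
        if light in classes and not (classes & darks):
            missing.append(light)
    return sorted(missing)


problems = missing_dark_classes("bg-white border-warm-200 dark:border-warm-700")
```

Here problems contains only bg-white, because the border already has its dark counterpart.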
Coding Preferences
- Unified data models over special-casing: Prefer storing different data types uniformly (e.g., enums as f32 indices alongside numeric features) rather than maintaining separate code paths
- Terse tests: Test what matters in as few tests as possible — don't overcomplicate with excessive setup or edge cases that don't add value
- Extract and organize: Group related utilities into proper modules (e.g., utils/, parsing/) rather than leaving helpers scattered
- Inline module tests: Place #[cfg(test)] mod tests { } at the bottom of each module file rather than in separate test files
Rust Code Style (server-rs)
Follow these conventions in all Rust code:
- Module style: Use Rust 2018 module naming: foo.rs + foo/ directory, NOT foo/mod.rs
- Imports over inline paths: Import items at the top of the file; don't use crate:: inline in code

      // Good
      use crate::utils::generate_priorities;
      let p = generate_priorities(n);

      // Bad
      let p = crate::utils::generate_priorities(n);

- Tracing macros: Import and use the short form, not the fully qualified path

      // Good
      use tracing::{info, warn};
      info!("message");

      // Bad
      tracing::info!("message");

- JSON serialization: Use serde_json with #[derive(Serialize)] structs, not manual string building
- Precompute at startup: For static/rarely-changing responses, compute once at startup and store in AppState
- Unique placeholders: When injecting content into HTML, use distinctive markers like __NARROWIT_OG_TAGS__ that won't accidentally match other content
Key Implementation Details
- Spatial sort: Rows sorted by 0.01° grid cell at load time for cache-friendly sequential access
- Row-major layout: feature_data[row * num_features + feat_idx], so all features (numeric and enum) for one property are contiguous
- H3 precomputation: Resolutions 4–12 computed in parallel (rayon) at startup
- Histogram percentiles without sorting: O(n) two-pass algorithm: build histogram, then interpolate percentiles
- Startup precomputation: Static responses (like /api/features) are computed once at startup and cached in AppState
- POI transform validation: Fails if any OSM category is unmapped, which guarantees exhaustive coverage
- Fuzzy join: Groups by postcode, uses thefuzz.token_sort_ratio with numeric token compatibility, greedy assignment from highest score
- Filter bounds format: south,west,north,east (not standard bbox order)
- POI proximity: Uses 0.05° grid (~5km cells) to reduce candidates before haversine distance check
- OG tag injection: Uses <meta name="x-og-placeholder" content="__NARROWIT_OG_TAGS__"/> placeholder in HTML, replaced at runtime by middleware
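The POI proximity item above (grid prefilter, then haversine check) can be sketched as follows. This is an illustration under assumptions, not the code in poi_proximity.py: the function names, the 3x3 neighbourhood lookup, and the Earth radius constant are choices made here, while the 0.05° cell size and 2km radius come from the text.

```python
import math
from collections import defaultdict

CELL_DEG = 0.05            # grid cell size from the text
RADIUS_KM = 2.0            # search radius from the text
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cell_of(lat, lon):
    """Integer grid cell containing the point."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def build_grid(pois):
    """Bucket (lat, lon) POIs by grid cell."""
    grid = defaultdict(list)
    for lat, lon in pois:
        grid[cell_of(lat, lon)].append((lat, lon))
    return grid


def count_within_radius(grid, lat, lon):
    """Count POIs within RADIUS_KM, checking only the 3x3 block of nearby cells."""
    cy, cx = cell_of(lat, lon)
    count = 0
    # A 0.05 degree cell is wider than 2 km at UK latitudes, so the 3x3 block
    # around the query point contains every possible match.
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            for plat, plon in grid.get((cy + dy, cx + dx), ()):
                if haversine_km(lat, lon, plat, plon) <= RADIUS_KM:
                    count += 1
    return count


pois = [(51.5074, -0.1278), (51.5154, -0.1410), (51.60, -0.1278), (52.0, 0.0)]
grid = build_grid(pois)
nearby = count_within_radius(grid, 51.5074, -0.1278)
```

The grid lookup replaces a scan over every POI with a scan over at most nine cells, and the exact haversine test then removes candidates that fall in a neighbouring cell but outside the radius.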