# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

NEVER EVER RUN GIT COMMANDS!!

## Project Overview

Property Map is a full-stack geospatial application for visualizing UK property data on an interactive map. It combines Land Registry price-paid data, EPC energy certificates, postcode geolocation, TFL journey times, Index of Deprivation scores, crime statistics, ethnicity data, broadband speeds, school ratings, road noise, and OpenStreetMap POIs into a single wide parquet file, then serves aggregated H3 hexagon statistics and POI data via a Rust backend.

## Commands

All commands use the [Task](https://taskfile.dev) runner. Python uses `uv run`. Frontend uses `npm run` from `frontend/`.

```bash
# Development servers
task dev:server            # Rust backend on :8001 (cargo run --release)
task dev:frontend          # Webpack dev server on :3001 (proxies /api to :8001)

# Data pipeline
task prepare               # Build wide.parquet from all pre-downloaded sources

# Assets
task download:map-assets   # Download font glyphs + twemoji PNGs into frontend/public/assets/

# Quality
task lint                  # Lint all: Python (ruff) + TypeScript (ESLint+Prettier) + Rust (clippy+fmt)
task format                # Auto-fix formatting for all languages
task test                  # Python tests (fuzzy join, haversine, POI counts)
task check                 # Full validation: lint + build + test

# Building
task build:frontend        # TypeScript typecheck + webpack production build
task build:server          # cargo build --release (NOTE: dir is wrong in Taskfile, run from server-rs/)

# Granular lint/format
task lint:python           # uv run ruff check .
task lint:frontend         # eslint + prettier --check
task lint:rust             # cargo clippy -- -D warnings && cargo fmt --check
task format:python         # ruff check --fix && ruff format
task format:frontend       # eslint --fix + prettier --write
task format:rust           # cargo fmt --all
```

Running individual tests:

```bash
uv run pytest pipeline/utils/test_haversine.py                 # Single test file
uv run pytest pipeline/utils/test_haversine.py -k "test_name"  # Single test
```

## Architecture

### Data Flow

```
Raw sources → [Download scripts] → data/*.parquet
           → [Fuzzy join EPC ↔ Price-Paid] → epc_pp.parquet
           → [Merge all datasets] → wide.parquet
           → [Rust server loads into memory + precomputes H3 + spatial grid]
           → [Frontend renders deck.gl H3HexagonLayer over MapLibre GL]
```

### Data Pipeline (`pipeline/`)

Python + Polars. Two phases:

1. **Download** (`pipeline/download/`) — Each script fetches one raw dataset into `data/`
2. **Transform** (`pipeline/transform/`) — Joins and derives features:
   - `join_epc_pp.py` — Fuzzy-joins EPC ↔ price-paid by address within postcode buckets
   - `merge.py` — **Main pipeline**: joins all datasets → `wide.parquet` with human-readable column names
   - `transform_poi.py` — Filters POIs, maps to friendly names + emoji (exhaustive category validation)
   - `poi_proximity.py` — Counts POIs within 2km per postcode using 0.05° spatial grid
   - `crime.py` — Aggregates crime CSVs into yearly averages by LSOA

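The grid-then-haversine idea behind `poi_proximity.py` can be sketched as below. The pipeline itself is Python + Polars; the sketch is written in Rust only to match the existing code samples in this file. Every name in it (`haversine_km`, `build_grid`, `count_within`) is hypothetical, and the 3x3 neighbourhood scan assumes the search radius is smaller than one 0.05° cell, which holds for 2km at UK latitudes.

```rust
use std::collections::HashMap;

const EARTH_RADIUS_KM: f64 = 6371.0;
const CELL_DEG: f64 = 0.05; // grid size quoted above

type Grid = HashMap<(i32, i32), Vec<(f64, f64)>>;

/// Great-circle distance in km between two (lat, lon) points given in degrees.
pub fn haversine_km(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (p1, p2) = (lat1.to_radians(), lat2.to_radians());
    let dlat = p2 - p1;
    let dlon = (lon2 - lon1).to_radians();
    let a = (dlat / 2.0).sin().powi(2) + p1.cos() * p2.cos() * (dlon / 2.0).sin().powi(2);
    2.0 * EARTH_RADIUS_KM * a.sqrt().asin()
}

fn cell(lat: f64, lon: f64) -> (i32, i32) {
    ((lat / CELL_DEG).floor() as i32, (lon / CELL_DEG).floor() as i32)
}

/// Bucket POIs by grid cell once, so each query only touches nearby candidates.
pub fn build_grid(pois: &[(f64, f64)]) -> Grid {
    let mut grid = Grid::new();
    for &(lat, lon) in pois {
        grid.entry(cell(lat, lon)).or_default().push((lat, lon));
    }
    grid
}

/// Count POIs within `radius_km` of a point by scanning the 3x3 block of cells
/// around it and confirming each candidate with the exact haversine distance.
pub fn count_within(grid: &Grid, lat: f64, lon: f64, radius_km: f64) -> usize {
    let (cy, cx) = cell(lat, lon);
    let mut n = 0;
    for dy in -1..=1 {
        for dx in -1..=1 {
            if let Some(bucket) = grid.get(&(cy + dy, cx + dx)) {
                n += bucket
                    .iter()
                    .filter(|&&(plat, plon)| haversine_km(lat, lon, plat, plon) <= radius_km)
                    .count();
            }
        }
    }
    n
}
```

The grid only narrows the candidate set; the haversine check decides membership, so a coarser or finer cell size changes cost but not the result as long as the radius fits within one cell.
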
**Critical: column renaming in `merge.py`** — The pipeline renames columns from snake_case to human-readable names before writing `wide.parquet`. The Rust server auto-discovers features from whatever column names exist in the parquet. Key renames:

- `pp_address` → `Address per Property Register`
- `postcode` → `Postcode`
- `latest_price` → `Last known price`
- `duration` → `Leashold/Freehold`
- `total_floor_area` → `Total floor area (sqm)`
- `current_energy_rating` → `Current energy rating`

The server and frontend must handle these human-readable names. See the full rename map in `merge.py`.

### Backend (`server-rs/`)

Rust + Axum. Loads parquet into memory at startup.

**Structure** (uses Rust 2018 module style — `foo.rs` + `foo/` directory, not `foo/mod.rs`):

- `data.rs` + `data/` — Property and POI data loading
- `parsing.rs` + `parsing/` — Filter parsing and bounds parsing
- `routes.rs` + `routes/` — One file per endpoint
- `utils.rs` + `utils/` — GridIndex, hashing, interned columns
- `consts.rs` — Key constants (histogram bins, H3 range, max enum cardinality, excluded columns)

**API endpoints:**

- `GET /api/features` — Feature metadata with histograms and 2nd/98th percentiles
- `GET /api/hexagons?resolution=&bounds=&filters=&fields=` — H3 aggregates (min/max per feature per hex), AABB-filtered to bounds
- `GET /api/postcodes?bounds=&filters=&fields=` — Postcode polygon aggregates, AABB-filtered to bounds
- `GET /api/postcode/:postcode` — Single postcode lookup (centroid + polygon)
- `GET /api/hexagon-properties?h3=&resolution=&filters=&limit=&offset=` — Paginated properties within a hexagon
- `GET /api/pois?bounds=&categories=` — POIs by bounds (max 5000)
- `GET /api/poi-categories` — Available POI category names

Serves `frontend/dist/` as static fallback in production.

**Data representation (unified model):**

- All features (numeric and enum): row-major flat `Vec<f32>`, NaN = null
- Enum features: stored as f32 indices (0.0, 1.0, 2.0...) with `enum_values: FxHashMap<usize, Vec<String>>` mapping feature index → string values
- String fields (address, postcode): interned/packed for memory efficiency
- The server accepts the parquet path as a CLI argument (defaults to `data_sources/processed/wide.parquet`)

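A minimal sketch of the unified model may make the layout easier to picture. It is not the server's actual type: `FeatureTable`, `value` and `enum_label` are hypothetical names, and `std::collections::HashMap` stands in for `FxHashMap` so the snippet has no external dependencies.

```rust
use std::collections::HashMap;

/// Hypothetical container mirroring the unified model described above.
pub struct FeatureTable {
    pub num_features: usize,
    /// Row-major: all features of row 0, then row 1, and so on. NaN marks a null.
    pub feature_data: Vec<f32>,
    /// Feature index -> enum labels. Numeric features have no entry.
    pub enum_values: HashMap<usize, Vec<String>>,
}

impl FeatureTable {
    /// Raw value of one cell, or None when the cell is null (NaN).
    pub fn value(&self, row: usize, feat_idx: usize) -> Option<f32> {
        let v = self.feature_data[row * self.num_features + feat_idx];
        if v.is_nan() {
            None
        } else {
            Some(v)
        }
    }

    /// Label of an enum cell, or None for nulls and for numeric features.
    pub fn enum_label(&self, row: usize, feat_idx: usize) -> Option<&str> {
        let idx = self.value(row, feat_idx)? as usize;
        self.enum_values.get(&feat_idx)?.get(idx).map(String::as_str)
    }
}
```

Because enums and numerics share one `f32` buffer, filtering and aggregation code can treat every feature the same way and only consult `enum_values` when a label has to be shown.
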
### Frontend (`frontend/`)

React 18 + TypeScript. deck.gl `H3HexagonLayer` over MapLibre GL. TailwindCSS. No state management library — pure React hooks.

**Architecture:**

- `App.tsx` — Minimal router: loads features/POI categories, handles page navigation (home/dashboard/data-sources/faq)
- `MapPage.tsx` — Dashboard layout: composes map + left/right panes, uses custom hooks for all logic
- Custom hooks in `hooks/` encapsulate stateful logic:
  - `useMapData` — Hexagon/postcode fetching, bounds, loading state, color range calculation
  - `useFilters` — Filter state and handlers (add/remove/change/drag/pin)
  - `useHexagonSelection` — Selection state, area stats, properties fetching
  - `usePOIData` — POI fetching with debounce
  - `usePaneResize` — Reusable pane resize handlers
  - `useTheme` — Theme state with localStorage persistence
  - `useUrlSync` — URL state synchronization

**Key patterns:**

- URL encodes view/filters/POI categories/active tab as query params for shareable links
- AbortControllers cancel in-flight requests on new queries (150ms debounce)
- Zoom → H3 resolution defined in `consts.ts` `ZOOM_TO_RESOLUTION_THRESHOLDS`: `<7.5→5, <9.5→6, <10.5→8, <12→9, ≥12→10`
- `POSTCODE_ZOOM_THRESHOLD = 15`: below 15 shows H3 hexagons, at/above 15 shows postcode polygons
- Viewport bounds computed via `getBoundsFromViewState()` in `map-utils.ts` — uses Web Mercator math with **TILE_SIZE=512** (MapLibre/deck.gl convention, NOT 256)
- Properties pane uses feature names from API response (human-readable), not hardcoded field names
- Proxy: dev server on :3001 proxies `/api` to :8001; also handles VS Code `/proxy/PORT` patterns

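The zoom thresholds and the viewport-bounds math above can be illustrated with a short sketch. The real code is TypeScript in `consts.ts` and `map-utils.ts`; this version is in Rust only to match the existing code samples in this file. The function names are hypothetical, and the bounds calculation assumes a view with no pitch or bearing.

```rust
use std::f64::consts::PI;

/// MapLibre/deck.gl treat the world as 512 px wide at zoom 0, not 256.
const TILE_SIZE: f64 = 512.0;
const POSTCODE_ZOOM_THRESHOLD: f64 = 15.0;

/// (exclusive upper zoom bound, H3 resolution) pairs copied from the list above.
const ZOOM_TO_RESOLUTION_THRESHOLDS: [(f64, u8); 4] =
    [(7.5, 5), (9.5, 6), (10.5, 8), (12.0, 9)];

/// First threshold the zoom falls below wins; zoom >= 12 maps to resolution 10.
pub fn zoom_to_resolution(zoom: f64) -> u8 {
    ZOOM_TO_RESOLUTION_THRESHOLDS
        .iter()
        .find(|(max_zoom, _)| zoom < *max_zoom)
        .map(|&(_, res)| res)
        .unwrap_or(10)
}

/// At or above the threshold the map shows postcode polygons instead of hexagons.
pub fn shows_postcodes(zoom: f64) -> bool {
    zoom >= POSTCODE_ZOOM_THRESHOLD
}

/// Returns (south, west, north, east) in degrees for an unrotated, unpitched view.
pub fn bounds_from_view(
    lat: f64,
    lng: f64,
    zoom: f64,
    width_px: f64,
    height_px: f64,
) -> (f64, f64, f64, f64) {
    let world = TILE_SIZE * 2f64.powf(zoom);
    // Forward Web Mercator: project the view centre into world pixels.
    let cx = (lng + 180.0) / 360.0 * world;
    let cy = (1.0 - lat.to_radians().tan().asinh() / PI) / 2.0 * world;
    // Inverse projection for the viewport edges.
    let to_lng = |x: f64| x / world * 360.0 - 180.0;
    let to_lat = |y: f64| (PI * (1.0 - 2.0 * y / world)).sinh().atan().to_degrees();
    (
        to_lat(cy + height_px / 2.0),
        to_lng(cx - width_px / 2.0),
        to_lat(cy - height_px / 2.0),
        to_lng(cx + width_px / 2.0),
    )
}
```

With `TILE_SIZE = 512`, a 512x512 viewport at zoom 0 spans the whole world; using 256 would double the computed span in each direction, so the request bounds would no longer match what is drawn on screen.
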
**Shared UI Components (`frontend/src/components/ui/`):**

- `icons/` — One file per icon (CloseIcon, InfoIcon, EyeIcon, PlusIcon, ChevronIcon, FilterIcon, LightbulbIcon, DownloadIcon, MapPinIcon, CheckIcon, ClipboardIcon, SunIcon, MoonIcon, SpinnerIcon). All accept `className` prop. **Never inline SVGs** — always extract to this folder.
- `IconButton.tsx` — Reusable icon button wrapper with consistent hover states. Accepts `active` prop for teal highlight.
- `SearchInput.tsx` — Styled search input with dark mode support. Used in Filters, POIPane, PropertiesPane.
- `PaneHeader.tsx` — Reusable pane header with title, optional subtitle, info button, and close button.
- `SelectionButtons.tsx` — "All" / "None" selection buttons for checkbox lists.
- `TabButton.tsx` — Tab button with active state styling. Used in right pane tabs.
- `EmptyState.tsx` — Empty state display with icon, title, description. Also exports `PaneEmptyState` for centered pane messages.
- `CheckboxList.tsx` — Checkbox list with toggle logic. Variants for array and Set-based selection.

**Shared Components (`frontend/src/components/`):**

- `FeatureInfoPopup.tsx` — Popup showing feature name, description, detail, and "View data source" link.
- `FeatureIcons.tsx` — `FeatureActions` component combining eye/info/add/remove icons for feature rows.

**Shared Utilities (`frontend/src/lib/`):**

- `api.ts` — `apiUrl(endpoint, params?)` builds API URLs. `logNonAbortError(label, err)` and `isAbortError(err)` for error handling.
- `features.ts` — `groupFeaturesByCategory(features)` groups FeatureMeta[] by their `group` field.
- `format.ts` — `formatNumber(value, decimals)` for number formatting. `calculateHistogramMean(histogram)` for weighted mean calculation.
- `property-fields.ts` — `getNum(property, ...keys)` for getting numeric property values with fallback field names.

When adding new UI, prefer using these shared components over inline implementations to maintain consistency.

**When to extract vs inline:**

- Extract to `hooks/`: Stateful logic with useState/useEffect/useCallback that can be named as a cohesive unit (e.g., `useFilters`, `useMapData`). If a component has 5+ related state variables and handlers, extract them to a hook.
- Extract to page component: Layout + hook composition for a major view (e.g., `MapPage` composes `useMapData` + `useFilters` + child components). Keep App.tsx focused on routing.
- Extract to `ui/` component: Repeated 3+ times with same styling (buttons, inputs, icons)
- Extract to `lib/`: Pure functions used across components (formatting, calculations, lookups)
- Keep inline: One-off UI specific to a single component

**Component size guideline:** If a component exceeds ~300 lines, look for extraction opportunities. Large components are usually doing too much — split into hooks (for logic) and child components (for UI sections).

**Naming conventions:**

- UI components: PascalCase, noun-based (`TabButton`, `EmptyState`)
- Utilities: camelCase verb-based (`formatNumber`, `calculateHistogramMean`)

## Frontend Design Guide (STRICT — must be followed for all UI changes)

The frontend uses Tailwind's `darkMode: 'class'` strategy. The `dark` class is toggled on `<html>`. Every visible element must have both light and dark styles. **Never add a light-only color class without its `dark:` counterpart.** Run `task build:frontend` after any UI change to verify.

### Theme System

- **State**: `App.tsx` owns a `theme` state (`'light' | 'dark' | 'system'`), persisted in `localStorage` under the key `theme`, default `'system'`.
- **Effective theme**: When `'system'`, resolved via `window.matchMedia('(prefers-color-scheme: dark)')`. A `change` listener re-renders on OS preference flip.
- **Toggle cycle**: light → dark → system → light. Three-way, not binary.
- **Flash prevention**: `index.html` contains an inline `<script>` that applies the `dark` class before first paint. Its localStorage/matchMedia logic must mirror `App.tsx`; if the logic in `App.tsx` changes, update the script to match.
- **Prop plumbing**: `effectiveTheme` (`'light' | 'dark'`) is passed as a prop to `<Map>` and `<HomePage>`. Components that need the resolved theme must receive it as a prop — do not read localStorage or matchMedia inside child components.

### Color Token Reference

Every UI element must use the correct token from this table. Do not invent new pairings.

| Role | Light class | Dark class | Hex (dark) |
|------|------------|------------|------------|
| **Page / pane background** | `bg-warm-50` or `bg-white` | `dark:bg-warm-900` | #1c1917 |
| **Card / elevated surface** | `bg-white` | `dark:bg-warm-800` | #292524 |
| **Inset / recessed surface** | `bg-warm-100` or `bg-warm-50` | `dark:bg-warm-800` | #292524 |
| **Input / select background** | `bg-white` | `dark:bg-warm-800` or `dark:bg-warm-900` | |
| **Primary border** | `border-warm-200` | `dark:border-warm-700` | #44403c |
| **Subtle border (dividers)** | `border-warm-100` | `dark:border-warm-800` | #292524 |
| **Primary text (headings)** | `text-navy-950` or implicit dark | `dark:text-warm-100` | #f5f5f4 |
| **Body text** | `text-warm-700` | `dark:text-warm-300` | #d6d3d1 |
| **Secondary text (labels, hints)** | `text-warm-500` or `text-warm-600` | `dark:text-warm-400` | #a8a29e |
| **Disabled / placeholder text** | `text-warm-400` / `placeholder-warm-400` | `dark:text-warm-500` / `dark:placeholder-warm-500` | #78716c |
| **Accent text (links, actions)** | `text-teal-600` | `dark:text-teal-400` | #1de4c3 |
| **Accent hover text** | `hover:text-teal-800` | `dark:hover:text-teal-300` | #51f7d9 |
| **Accent background (highlights)** | `bg-teal-50` | `dark:bg-teal-900/30` | |
| **Active ring / focus ring** | `ring-teal-400` | same — works in both | |
| **Price / key metric text** | `text-teal-700` | `dark:text-teal-400` | |
| **Remove / close button** | `text-warm-400 hover:text-warm-700` | `dark:hover:text-warm-300` | |
| **Checkbox accent** | `accent-teal-600` | same — works in both | |
| **Header (unchanged both modes)** | `bg-navy-900 text-white` | same | |

### Mapping Rules for Specific Contexts

**Sidebars (Filters, POIPane, PropertiesPane, right-pane tabs):**

- Container: `bg-white dark:bg-warm-900`
- Inner cards / dropdown menus: `bg-white dark:bg-warm-800`
- Borders: `border-warm-200 dark:border-warm-700`
- Tab text (active): add `dark:text-warm-100`
- Tab text (inactive): `text-warm-600 dark:text-warm-400`

**Map overlays (PostcodeSearch, MapLegend, POI popup, loading indicator):**

- Background: `bg-white dark:bg-warm-800`
- Text: `dark:text-warm-200`
- Semi-transparent variants: use `/90` opacity suffix (e.g. `dark:bg-warm-800/90`)
- Deck.gl tooltip (inline styles, not Tailwind): use `#292524` bg / `#e7e5e4` text / `rgba(0,0,0,0.5)` shadow in dark.
- Deck.gl postcode labels (RGB arrays): `[220,220,220,220]` text / `[30,30,30,200]` outline in dark; inverse in light.

**Map basemaps:**

- Self-hosted Protomaps tiles served from PMTiles via `/api/tiles/{z}/{x}/{y}`
- Style built by `@protomaps/basemaps` library with `namedFlavor(theme)` for light/dark
- Font glyphs and twemoji PNGs served locally from `frontend/public/assets/` (no external CDN deps at runtime)
- `CopyWebpackPlugin` copies `frontend/public/` → `dist/` on build; Rust `ServeDir` fallback serves them in prod
- Download assets with `task download:map-assets` (script: `pipeline/download/map_assets.py`)

**HomePage (landing page):**

- Page bg: `bg-warm-50 dark:bg-warm-900`
- Cards: `bg-white dark:bg-warm-800` with `border-warm-200 dark:border-warm-700`
- Backdrop-blur panels: use `/60` or `/40` opacity on both `bg-warm-50` and `dark:bg-warm-900`
- HexCanvas: reads `isDark` ref; uses dimmer fill (`#058172`) and stroke (`#0a665b`) at 60% opacity multiplier.
- All headings: `dark:text-warm-100`. All body: `dark:text-warm-300` or `dark:text-warm-400`.

**DataSourcesPage:**

- Same card pattern as above. Footer is already dark (`bg-navy-900`) — no changes needed.
- License badges: `bg-warm-100 dark:bg-warm-700 text-warm-600 dark:text-warm-300`
- Links: `text-teal-600 dark:text-teal-400`

**DataSources floating button (on map):**

- `bg-white/90 dark:bg-warm-800/90` with `text-teal-600 dark:text-teal-400`

### Rules for New Components

1. **Every `bg-white` needs `dark:bg-warm-800` or `dark:bg-warm-900`.** Pane-level = warm-900, card-level = warm-800.
2. **Every `border-warm-200` needs `dark:border-warm-700`.**
3. **Every `text-warm-*` needs a `dark:text-warm-*` counterpart.** Follow the token table — don't guess.
4. **Every `text-teal-600` needs `dark:text-teal-400`.** Every `hover:text-teal-800` needs `dark:hover:text-teal-300`.
5. **Every `bg-teal-50` needs `dark:bg-teal-900/30`.**
6. **Every `hover:bg-warm-50` needs `dark:hover:bg-warm-700` or `dark:hover:bg-warm-800`.**
7. **Inputs and selects**: always add `dark:bg-warm-800 dark:text-warm-200 dark:border-warm-700`. Placeholders get `dark:placeholder-warm-500`.
8. **Checkboxes**: always include `accent-teal-600 rounded`.
9. **Do not use Tailwind `dark:` classes inside deck.gl layers or canvas code.** Use the `theme` prop / ref and conditional JS values.
10. **Do not add `transition-*` classes for theme switching.** The global CSS rule in `index.css` handles transitions for `background-color`, `border-color`, and `color` on all standard HTML elements. Adding per-element transition classes will conflict.
11. **Never hardcode hex colors in JSX `style=` props for themed elements** (except deck.gl tooltip and canvas, which can't use Tailwind). Use the Tailwind classes from the token table instead.
12. **The header (`bg-navy-900`) is identical in both themes.** Do not add dark variants to it.

### Verification Checklist (for any UI PR)

- [ ] `task build:frontend` passes with no errors
- [ ] Every new `bg-*`, `text-*`, `border-*` class has a `dark:` counterpart (search your diff)
- [ ] Toggle through all three modes (light → dark → system) with no flash
- [ ] Map basemap switches when theme changes
- [ ] Sidebars, dropdowns, and popups are readable in both modes
- [ ] HomePage and DataSourcesPage adapt correctly

## Coding Preferences

- **Unified data models over special-casing**: Prefer storing different data types uniformly (e.g., enums as f32 indices alongside numeric features) rather than maintaining separate code paths
- **Terse tests**: Test what matters in as few tests as possible — don't overcomplicate with excessive setup or edge cases that don't add value
- **Extract and organize**: Group related utilities into proper modules (e.g., `utils/`, `parsing/`) rather than leaving helpers scattered
- **Inline module tests**: Place `#[cfg(test)] mod tests { }` at the bottom of each module file rather than in separate test files
- **Decompose large React components**: Extract stateful logic into custom hooks (`useXxx`), extract page layouts into page components. App.tsx should only handle routing and initial data loading. Each hook should encapsulate one cohesive concern (e.g., `useFilters` owns filter state + all filter handlers).

## Rust Code Style (server-rs)

Follow these conventions in all Rust code:

1. **Module style**: Use Rust 2018 module naming — `foo.rs` + `foo/` directory, NOT `foo/mod.rs`
2. **Imports over inline paths**: Import items at the top of the file, don't use `crate::` inline in code

   ```rust
   // Good
   use crate::utils::generate_priorities;
   let p = generate_priorities(n);

   // Bad
   let p = crate::utils::generate_priorities(n);
   ```

3. **Tracing macros**: Import and use short form, not fully qualified

   ```rust
   // Good
   use tracing::{info, warn};
   info!("message");

   // Bad
   tracing::info!("message");
   ```

4. **JSON serialization**: Use `serde_json` with `#[derive(Serialize)]` structs, not manual string building
5. **Precompute at startup**: For static/rarely-changing responses, compute once at startup and store in `AppState`
6. **Unique placeholders**: When injecting content into HTML, use distinctive markers like `__PERFECT_POSTCODES_OG_TAGS__` that won't accidentally match other content

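As an illustration of rule 6, placeholder replacement might look like the sketch below. The marker string is the one this guide documents for OG tag injection; the helper `inject_og_tags` and its return convention are hypothetical and not taken from the server code.

```rust
/// Placeholder tag documented in this guide. It is distinctive enough that it
/// should never collide with ordinary page content.
const OG_PLACEHOLDER: &str =
    r#"<meta name="x-og-placeholder" content="__PERFECT_POSTCODES_OG_TAGS__"/>"#;

/// Hypothetical helper: swap the placeholder for rendered OG tags.
/// Returns None when the placeholder is absent, so the caller can serve the
/// page unchanged instead of silently dropping the tags.
pub fn inject_og_tags(html: &str, og_tags: &str) -> Option<String> {
    html.contains(OG_PLACEHOLDER)
        .then(|| html.replacen(OG_PLACEHOLDER, og_tags, 1))
}
```

Matching on the whole tag rather than a short token such as `OG_TAGS` keeps the replacement from touching user-visible text that happens to contain similar words.
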
## Key Implementation Details

- **Spatial sort**: Rows sorted by 0.01° grid cell at load time for cache-friendly sequential access
- **Row-major layout**: `feature_data[row * num_features + feat_idx]` — all features (numeric and enum) for one property are contiguous
- **H3 precomputation**: Resolutions 4–12 computed in parallel (rayon) at startup
- **Histogram percentiles without sorting**: O(n) two-pass algorithm — build histogram, interpolate percentiles
- **Startup precomputation**: Static responses (like `/api/features`) are computed once at startup and cached in `AppState`
- **POI transform validation**: Fails if any OSM category is unmapped — guarantees exhaustive coverage
- **Fuzzy join**: Groups by postcode, uses `thefuzz.token_sort_ratio` with numeric token compatibility, greedy assignment from highest score
- **Filter bounds format**: `south,west,north,east` (not the standard `west,south,east,north` bbox order)
- **Server-side AABB filtering**: Both `/api/hexagons` and `/api/postcodes` filter results by bounding-box intersection with query bounds. Hexagons use `h3_cell_bounds()` (h3o returns degrees, not radians). Postcodes compute polygon AABB from vertices. See `bounds_intersect()` in `parsing/bounds.rs`.
- **GridIndex returns slightly more than requested**: The 0.01° grid cells mean properties up to ~1km outside the viewport may be returned. The AABB filter in the route handlers catches these extras.
- **POI proximity**: Uses 0.05° grid (~5km cells) to reduce candidates before haversine distance check
- **OG tag injection**: Uses `<meta name="x-og-placeholder" content="__PERFECT_POSTCODES_OG_TAGS__"/>` placeholder in HTML, replaced at runtime by middleware

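The histogram-based percentile idea from the list above can be sketched as follows. This is one plausible reading of "build histogram, interpolate percentiles", not the server's code: the function name, the caller-supplied bin count, and the handling of constant or all-null columns are assumptions.

```rust
/// Approximate the percentiles `ps` (each in 0.0..=1.0) of the non-NaN values
/// without sorting: one pass for the range, one pass to fill a fixed-bin
/// histogram, then a walk over the bins with linear interpolation.
pub fn histogram_percentiles(values: &[f32], bins: usize, ps: &[f64]) -> Option<Vec<f64>> {
    // Pass 1: range and count of non-null values.
    let (mut min, mut max, mut n) = (f64::INFINITY, f64::NEG_INFINITY, 0usize);
    for v in values.iter().filter(|v| !v.is_nan()) {
        let v = *v as f64;
        min = min.min(v);
        max = max.max(v);
        n += 1;
    }
    if n == 0 || bins == 0 {
        return None;
    }
    if max == min {
        // Constant column: every percentile is that single value.
        return Some(vec![min; ps.len()]);
    }
    let width = (max - min) / bins as f64;

    // Pass 2: histogram counts. The maximum value is clamped into the last bin.
    let mut counts = vec![0usize; bins];
    for v in values.iter().filter(|v| !v.is_nan()) {
        let b = (((*v as f64 - min) / width) as usize).min(bins - 1);
        counts[b] += 1;
    }

    // Walk cumulative counts and interpolate inside the bin holding the target rank.
    let percentile = |p: f64| {
        let target = p * n as f64;
        let mut cum = 0.0;
        for (i, &c) in counts.iter().enumerate() {
            let c = c as f64;
            if c > 0.0 && cum + c >= target {
                let frac = (target - cum) / c;
                return min + (i as f64 + frac) * width;
            }
            cum += c;
        }
        max
    };
    Some(ps.iter().map(|&p| percentile(p)).collect())
}
```

Accuracy is bounded by the bin width, which is usually acceptable for choosing colour ranges such as the 2nd/98th percentiles while avoiding an O(n log n) sort per feature.
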
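For the bounds format and AABB filtering described above, a minimal sketch could look like this. `bounds_intersect` borrows its name from `parsing/bounds.rs`, but the body, the `Bounds` struct and `parse_bounds` are assumptions. Antimeridian wrapping is ignored, which is harmless for UK data.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bounds {
    pub south: f64,
    pub west: f64,
    pub north: f64,
    pub east: f64,
}

/// Parse a `south,west,north,east` query value. Returns None on a wrong field
/// count, a non-numeric field, or inverted bounds.
pub fn parse_bounds(s: &str) -> Option<Bounds> {
    let parts = s
        .split(',')
        .map(|p| p.trim().parse::<f64>().ok())
        .collect::<Option<Vec<f64>>>()?;
    match parts.as_slice() {
        &[south, west, north, east] if south <= north && west <= east => {
            Some(Bounds { south, west, north, east })
        }
        _ => None,
    }
}

/// Inclusive axis-aligned bounding-box overlap test: the boxes intersect when
/// they overlap on both the latitude axis and the longitude axis.
pub fn bounds_intersect(a: &Bounds, b: &Bounds) -> bool {
    a.south <= b.north && a.north >= b.south && a.west <= b.east && a.east >= b.west
}
```

A route handler would parse the query bounds once, then keep only the hexagons or postcodes whose own AABB passes `bounds_intersect` against it, which removes the extras a coarse grid lookup returns.
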
## Rust Performance Patterns (server-rs)

**Lookup optimization:**

- `AppState.feature_name_to_index: FxHashMap<String, usize>` for O(1) feature lookups (used in filter parsing, field selection)
- Never use `.position()` on feature_names in hot paths — always use the prebuilt HashMap
- Enum filters use `FxHashSet<u32>` (f32 bits) for O(1) contains checks instead of `Vec::contains`

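A rough sketch of both lookups follows. It uses `std::collections` in place of the `Fx*` types so it compiles without dependencies, and the function names are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Built once at startup so request handlers never scan `feature_names`.
pub fn build_name_index(feature_names: &[String]) -> HashMap<String, usize> {
    feature_names
        .iter()
        .enumerate()
        .map(|(i, name)| (name.clone(), i))
        .collect()
}

/// Allowed enum indices stored as raw f32 bit patterns, because f32 itself is
/// neither Hash nor Eq.
pub fn enum_filter(allowed_indices: &[usize]) -> HashSet<u32> {
    allowed_indices
        .iter()
        .map(|&i| (i as f32).to_bits())
        .collect()
}

/// O(1) membership test for one cell. A NaN (null) cell never matches, since
/// no allowed index encodes to a NaN bit pattern.
pub fn passes(filter: &HashSet<u32>, cell: f32) -> bool {
    filter.contains(&cell.to_bits())
}
```

Comparing bit patterns is safe here because enum cells are written as exact small non-negative integers, so equal indices always share the same bits.
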
**Hot loop patterns:**

- Hoist conditional branches outside loops when possible (e.g., `if has_selective` check moved outside aggregation loop in hexagons.rs)
- Use `into_par_iter()` for file I/O (postcode GeoJSON loading) and CPU-bound startup work (H3 precomputation)

**Cardinality counting:**

- Use `FxHashSet` with `f32::to_bits()` for O(n) unique value counting instead of collect→sort→dedup O(n log n)
- For enum ordering, convert order slice to `FxHashSet` before filtering to get O(1) contains

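The counting approach might be sketched like this, again with `std::collections::HashSet` standing in for `FxHashSet`. The early exit against a cardinality cap is an assumption inspired by the "max enum cardinality" constant mentioned for `consts.rs`, not something this guide specifies.

```rust
use std::collections::HashSet;

/// Count distinct non-null values in a single O(n) pass.
/// Returns None as soon as more than `max_cardinality` distinct values are
/// seen, so high-cardinality numeric columns are rejected early.
pub fn count_unique(values: &[f32], max_cardinality: usize) -> Option<usize> {
    let mut seen: HashSet<u32> = HashSet::new();
    for v in values.iter().filter(|v| !v.is_nan()) {
        seen.insert(v.to_bits());
        if seen.len() > max_cardinality {
            return None;
        }
    }
    Some(seen.len())
}
```
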
**Data structure choices:**

- CSR (Compressed Sparse Row) for GridIndex — single flat `values` array + `offsets` array eliminates per-cell Vec overhead
- `Box<[f32]>` for fixed-size aggregation arrays — avoids Vec capacity field (8 bytes saved per cell)
- Bit-packed booleans for flags like `is_approx_build_date` — 8x memory savings vs `Vec<bool>`

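To make the CSR layout and the bit-packing concrete, here is a sketch under stated assumptions. The field names, the `u32` row ids and the per-cell `cursor` used during the fill pass are illustrative choices, not the actual `GridIndex` in `utils/`.

```rust
/// Hypothetical CSR grid: the rows in cell `c` are `values[offsets[c]..offsets[c + 1]]`.
pub struct GridIndex {
    min_lat: f64,
    min_lon: f64,
    cell_deg: f64,
    rows: usize,
    cols: usize,
    offsets: Vec<u32>,
    values: Vec<u32>,
}

impl GridIndex {
    pub fn build(points: &[(f64, f64)], cell_deg: f64) -> Self {
        // Pass 1: extent of the data.
        let (mut min_lat, mut max_lat) = (f64::MAX, f64::MIN);
        let (mut min_lon, mut max_lon) = (f64::MAX, f64::MIN);
        for &(lat, lon) in points {
            min_lat = min_lat.min(lat);
            max_lat = max_lat.max(lat);
            min_lon = min_lon.min(lon);
            max_lon = max_lon.max(lon);
        }
        let rows = ((max_lat - min_lat) / cell_deg) as usize + 1;
        let cols = ((max_lon - min_lon) / cell_deg) as usize + 1;
        let cell_of = |lat: f64, lon: f64| {
            let r = ((lat - min_lat) / cell_deg) as usize;
            let c = ((lon - min_lon) / cell_deg) as usize;
            r * cols + c
        };
        // Pass 2: count per cell one slot ahead, then prefix-sum into start offsets.
        let mut offsets = vec![0u32; rows * cols + 1];
        for &(lat, lon) in points {
            offsets[cell_of(lat, lon) + 1] += 1;
        }
        for i in 1..offsets.len() {
            offsets[i] += offsets[i - 1];
        }
        // Pass 3: fill the flat values array using a write cursor per cell.
        let mut cursor = offsets.clone();
        let mut values = vec![0u32; points.len()];
        for (i, &(lat, lon)) in points.iter().enumerate() {
            let c = cell_of(lat, lon);
            values[cursor[c] as usize] = i as u32;
            cursor[c] += 1;
        }
        GridIndex { min_lat, min_lon, cell_deg, rows, cols, offsets, values }
    }

    /// Row ids stored in the cell containing (lat, lon); empty outside the grid.
    pub fn cell(&self, lat: f64, lon: f64) -> &[u32] {
        let r = ((lat - self.min_lat) / self.cell_deg).floor();
        let c = ((lon - self.min_lon) / self.cell_deg).floor();
        if !(r >= 0.0 && c >= 0.0 && r < self.rows as f64 && c < self.cols as f64) {
            return &[];
        }
        let idx = r as usize * self.cols + c as usize;
        &self.values[self.offsets[idx] as usize..self.offsets[idx + 1] as usize]
    }
}

/// Bit-packed flags: one bit per row instead of one byte per `bool`.
pub struct BitFlags {
    words: Vec<u64>,
}

impl BitFlags {
    pub fn new(len: usize) -> Self {
        BitFlags { words: vec![0; (len + 63) / 64] }
    }
    pub fn set(&mut self, i: usize) {
        self.words[i / 64] |= 1u64 << (i % 64);
    }
    pub fn get(&self, i: usize) -> bool {
        (self.words[i / 64] >> (i % 64)) & 1 == 1
    }
}
```

The three passes correspond to the min/max, count and fill steps: two flat arrays replace one heap-allocated `Vec` per cell, and a cell lookup is a pair of adjacent offset reads.
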
**What NOT to optimize:**

- String cloning in JSON responses (~10-20 small strings) — negligible vs serialization overhead
- GridIndex 3-pass build (min/max → count → fill) — necessary for CSR without O(n) extra memory
- `Arc<str>` for enum values — complexity not worth modest benefit