Add learnings

Andras Schmelczer 2026-02-07 15:31:08 +00:00
parent adaebfbd2a
commit c9651d25c7


@@ -15,7 +15,7 @@ All commands use [Task](https://taskfile.dev) runner. Python uses `uv run`. Fron
```bash
# Development servers
task dev:server # Rust backend on :8001 (cargo run --release)
task dev:frontend # Webpack dev server on :3030 (proxies /api to :8001)
task dev:frontend # Webpack dev server on :3000 (proxies /api to :8001)
# Data pipeline
task prepare # Build wide.parquet from all pre-downloaded sources
@@ -92,7 +92,9 @@ Rust + Axum. Loads parquet into memory at startup.
**API endpoints:**
- `GET /api/features` — Feature metadata with histograms and 2nd/98th percentiles
- `GET /api/hexagons?resolution=&bounds=&filters=` — H3 aggregates (min/max per feature per hex)
- `GET /api/hexagons?resolution=&bounds=&filters=&fields=` — H3 aggregates (min/max per feature per hex), AABB-filtered to bounds
- `GET /api/postcodes?bounds=&filters=&fields=` — Postcode polygon aggregates, AABB-filtered to bounds
- `GET /api/postcode/:postcode` — Single postcode lookup (centroid + polygon)
- `GET /api/hexagon-properties?h3=&resolution=&filters=&limit=&offset=` — Paginated properties within a hexagon
- `GET /api/pois?bounds=&categories=` — POIs by bounds (max 5000)
- `GET /api/poi-categories` — Available POI category names
@@ -109,14 +111,61 @@ Serves `frontend/dist/` as static fallback in production.
React 18 + TypeScript. deck.gl `H3HexagonLayer` over MapLibre GL. TailwindCSS. No state management library — pure React hooks.
**Architecture:**
- `App.tsx` — Minimal router: loads features/POI categories, handles page navigation (home/dashboard/data-sources/faq)
- `MapPage.tsx` — Dashboard layout: composes map + left/right panes, uses custom hooks for all logic
- Custom hooks in `hooks/` encapsulate stateful logic:
- `useMapData` — Hexagon/postcode fetching, bounds, loading state, color range calculation
- `useFilters` — Filter state and handlers (add/remove/change/drag/pin)
- `useHexagonSelection` — Selection state, area stats, properties fetching
- `usePOIData` — POI fetching with debounce
- `usePaneResize` — Reusable pane resize handlers
- `useTheme` — Theme state with localStorage persistence
- `useUrlSync` — URL state synchronization
**Key patterns:**
- `App.tsx` manages all state, API fetching (150ms debounce), and URL state sync (300ms debounce)
- URL encodes view/filters/POI categories/active tab as query params for shareable links
- AbortControllers cancel in-flight requests on new queries
- Zoom → H3 resolution: `<7→7, <9.5→8, <11→9, <13→10, ≥13→11`
- Bounds quantized to 0.01° to match backend caching
- AbortControllers cancel in-flight requests on new queries (150ms debounce)
- Zoom → H3 resolution defined in `consts.ts` `ZOOM_TO_RESOLUTION_THRESHOLDS`: `<7.5→5, <9.5→6, <10.5→8, <12→9, ≥12→10`
- `POSTCODE_ZOOM_THRESHOLD = 15`: below 15 shows H3 hexagons, at/above 15 shows postcode polygons
- Viewport bounds computed via `getBoundsFromViewState()` in `map-utils.ts` — uses Web Mercator math with **TILE_SIZE=512** (MapLibre/deck.gl convention, NOT 256)
- Properties pane uses feature names from API response (human-readable), not hardcoded field names
- Proxy: dev server on :3030 proxies `/api` to :8001; also handles VS Code `/proxy/PORT` patterns
- Proxy: dev server on :3000 proxies `/api` to :8001; also handles VS Code `/proxy/PORT` patterns
**Shared UI Components (`frontend/src/components/ui/`):**
- `Icons.tsx` — Central icon library (CloseIcon, InfoIcon, EyeIcon, PlusIcon, ChevronIcon, FilterIcon, LightbulbIcon). All icons accept `className` prop for sizing.
- `IconButton.tsx` — Reusable icon button wrapper with consistent hover states. Accepts `active` prop for teal highlight.
- `SearchInput.tsx` — Styled search input with dark mode support. Used in Filters, POIPane, PropertiesPane.
- `PaneHeader.tsx` — Reusable pane header with title, optional subtitle, info button, and close button.
- `SelectionButtons.tsx` — "All" / "None" selection buttons for checkbox lists.
- `TabButton.tsx` — Tab button with active state styling. Used in right pane tabs.
- `EmptyState.tsx` — Empty state display with icon, title, description. Also exports `PaneEmptyState` for centered pane messages.
- `CheckboxList.tsx` — Checkbox list with toggle logic. Variants for array and Set-based selection.
**Shared Components (`frontend/src/components/`):**
- `FeatureInfoPopup.tsx` — Popup showing feature name, description, detail, and "View data source" link.
- `FeatureIcons.tsx` — `FeatureActions` component combining eye/info/add/remove icons for feature rows.
**Shared Utilities (`frontend/src/lib/`):**
- `api.ts` — `apiUrl(endpoint, params?)` builds API URLs. `logNonAbortError(label, err)` and `isAbortError(err)` for error handling.
- `features.ts` — `groupFeaturesByCategory(features)` groups `FeatureMeta[]` by their `group` field.
- `format.ts` — `formatNumber(value, decimals)` for number formatting. `calculateHistogramMean(histogram)` for weighted mean calculation.
- `property-fields.ts` — `getNum(property, ...keys)` for getting numeric property values with fallback field names.
When adding new UI, prefer using these shared components over inline implementations to maintain consistency.
**When to extract vs inline:**
- Extract to `hooks/`: Stateful logic with useState/useEffect/useCallback that can be named as a cohesive unit (e.g., `useFilters`, `useMapData`). If a component has 5+ related state variables and handlers, extract them to a hook.
- Extract to page component: Layout + hook composition for a major view (e.g., `MapPage` composes `useMapData` + `useFilters` + child components). Keep App.tsx focused on routing.
- Extract to `ui/` component: Repeated 3+ times with same styling (buttons, inputs, icons)
- Extract to `lib/`: Pure functions used across components (formatting, calculations, lookups)
- Keep inline: One-off UI specific to a single component
**Component size guideline:** If a component exceeds ~300 lines, look for extraction opportunities. Large components are usually doing too much — split into hooks (for logic) and child components (for UI sections).
**Naming conventions:**
- UI components: PascalCase, noun-based (`TabButton`, `EmptyState`)
- Utilities: camelCase verb-based (`formatNumber`, `calculateHistogramMean`)
## Frontend Design Guide (STRICT — must be followed for all UI changes)
@@ -221,6 +270,7 @@ Every UI element must use the correct token from this table. Do not invent new p
- **Terse tests**: Test what matters in as few tests as possible — don't overcomplicate with excessive setup or edge cases that don't add value
- **Extract and organize**: Group related utilities into proper modules (e.g., `utils/`, `parsing/`) rather than leaving helpers scattered
- **Inline module tests**: Place `#[cfg(test)] mod tests { }` at the bottom of each module file rather than in separate test files
- **Decompose large React components**: Extract stateful logic into custom hooks (`useXxx`), extract page layouts into page components. App.tsx should only handle routing and initial data loading. Each hook should encapsulate one cohesive concern (e.g., `useFilters` owns filter state + all filter handlers).
## Rust Code Style (server-rs)
@@ -259,5 +309,32 @@ Follow these conventions in all Rust code:
- **POI transform validation**: Fails if any OSM category is unmapped — guarantees exhaustive coverage
- **Fuzzy join**: Groups by postcode, uses `thefuzz.token_sort_ratio` with numeric token compatibility, greedy assignment from highest score
- **Filter bounds format**: `south,west,north,east` (not standard bbox order)
- **Server-side AABB filtering**: Both `/api/hexagons` and `/api/postcodes` filter results by bounding-box intersection with query bounds. Hexagons use `h3_cell_bounds()` (h3o returns degrees, not radians). Postcodes compute polygon AABB from vertices. See `bounds_intersect()` in `parsing/bounds.rs`.
- **GridIndex returns slightly more than requested**: The 0.01° grid cells mean properties up to ~1km outside the viewport may be returned. The AABB filter in the route handlers catches these extras.
- **POI proximity**: Uses 0.05° grid (~5km cells) to reduce candidates before haversine distance check
- **OG tag injection**: Uses `<meta name="x-og-placeholder" content="__NARROWIT_OG_TAGS__"/>` placeholder in HTML, replaced at runtime by middleware
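
A minimal sketch of the bounds handling described above, assuming plain `f64` degrees and the `south,west,north,east` order. The names `Bounds`, `parse_bounds`, `polygon_aabb`, and `aabb_intersects` are hypothetical; the real `bounds_intersect()` in `parsing/bounds.rs` may have a different signature. The sketch ignores boxes that cross the antimeridian.

```rust
/// Axis-aligned bounding box in degrees, in the order the API uses:
/// south, west, north, east.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds {
    south: f64,
    west: f64,
    north: f64,
    east: f64,
}

/// Parse a `south,west,north,east` query value. Returns None on malformed input.
fn parse_bounds(s: &str) -> Option<Bounds> {
    let parts: Vec<f64> = s
        .split(',')
        .map(|p| p.trim().parse::<f64>())
        .collect::<Result<_, _>>()
        .ok()?;
    match parts.as_slice() {
        [south, west, north, east] if south <= north && west <= east => Some(Bounds {
            south: *south,
            west: *west,
            north: *north,
            east: *east,
        }),
        _ => None,
    }
}

/// Bounding box of a polygon given as (lat, lon) vertices. None if empty.
fn polygon_aabb(vertices: &[(f64, f64)]) -> Option<Bounds> {
    let (first, rest) = vertices.split_first()?;
    let mut b = Bounds { south: first.0, west: first.1, north: first.0, east: first.1 };
    for &(lat, lon) in rest {
        b.south = b.south.min(lat);
        b.north = b.north.max(lat);
        b.west = b.west.min(lon);
        b.east = b.east.max(lon);
    }
    Some(b)
}

/// Two boxes intersect unless one lies entirely to one side of the other.
fn aabb_intersects(a: &Bounds, b: &Bounds) -> bool {
    a.south <= b.north && a.north >= b.south && a.west <= b.east && a.east >= b.west
}
```

A route handler would parse the query bounds once, then keep only the hexagons or postcodes whose box passes `aabb_intersects`, which also drops the extras that the coarser grid lookup lets through.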
## Rust Performance Patterns (server-rs)
**Lookup optimization:**
- `AppState.feature_name_to_index: FxHashMap<String, usize>` for O(1) feature lookups (used in filter parsing, field selection)
- Never use `.position()` on feature_names in hot paths — always use the prebuilt HashMap
- Enum filters use `FxHashSet<u32>` (f32 bits) for O(1) contains checks instead of `Vec::contains`
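
A rough sketch of both lookups, using `std::collections` as a stand-in for `FxHashMap`/`FxHashSet` so it compiles without extra crates. `build_feature_index` and `EnumFilter` are hypothetical names, not the server's actual types.

```rust
use std::collections::{HashMap, HashSet};

/// Build the name -> column index map once at startup, so request handling
/// never scans `feature_names` with `.position()`.
fn build_feature_index(feature_names: &[String]) -> HashMap<String, usize> {
    feature_names
        .iter()
        .enumerate()
        .map(|(i, name)| (name.clone(), i))
        .collect()
}

/// Enum filter holding the allowed values as raw f32 bit patterns.
/// f32 is not `Hash`, but its `u32` bit pattern is.
struct EnumFilter {
    allowed_bits: HashSet<u32>,
}

impl EnumFilter {
    fn new(allowed: &[f32]) -> Self {
        Self {
            allowed_bits: allowed.iter().map(|v| v.to_bits()).collect(),
        }
    }

    /// O(1) expected membership test. Note: bitwise comparison treats
    /// 0.0 and -0.0 as different values.
    fn matches(&self, value: f32) -> bool {
        self.allowed_bits.contains(&value.to_bits())
    }
}
```

Bitwise equality is a reasonable fit when enum values are encoded as exact float codes; it would be the wrong tool for values produced by arithmetic.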
**Hot loop patterns:**
- Hoist conditional branches outside loops when possible (e.g., `if has_selective` check moved outside aggregation loop in hexagons.rs)
- Use `into_par_iter()` for file I/O (postcode GeoJSON loading) and CPU-bound startup work (H3 precomputation)
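
To illustrate the hoisting pattern only: a toy aggregation, not the code in hexagons.rs. `max_naive` and `max_hoisted` are hypothetical, and the mask is assumed to have the same length as `values`. The optimizer may perform this transformation on its own in simple cases, so measure before relying on it.

```rust
/// Checks whether a mask exists on every iteration.
fn max_naive(values: &[f32], selected: Option<&[bool]>) -> f32 {
    let mut best = f32::NEG_INFINITY;
    for (i, &v) in values.iter().enumerate() {
        if let Some(mask) = selected {
            if !mask[i] {
                continue;
            }
        }
        best = best.max(v);
    }
    best
}

/// Decides once whether a mask exists, then runs a loop specialised for that case.
fn max_hoisted(values: &[f32], selected: Option<&[bool]>) -> f32 {
    let mut best = f32::NEG_INFINITY;
    match selected {
        Some(mask) => {
            for (&v, &keep) in values.iter().zip(mask) {
                if keep {
                    best = best.max(v);
                }
            }
        }
        None => {
            for &v in values {
                best = best.max(v);
            }
        }
    }
    best
}
```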
**Cardinality counting:**
- Use `FxHashSet` with `f32::to_bits()` for O(n) unique value counting instead of collect→sort→dedup O(n log n)
- For enum ordering, convert order slice to `FxHashSet` before filtering to get O(1) contains
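
The two counting approaches side by side, as a sketch. `std::collections::HashSet` stands in for `FxHashSet`, and both function names are hypothetical.

```rust
use std::collections::HashSet;

/// O(n) expected: hash each value's bit pattern and count distinct patterns.
fn count_unique(values: &[f32]) -> usize {
    let mut seen: HashSet<u32> = HashSet::new();
    for v in values {
        seen.insert(v.to_bits());
    }
    seen.len()
}

/// O(n log n) baseline: copy, sort, dedup. Uses the same bitwise notion of
/// equality so the two functions agree on every input.
fn count_unique_sorted(values: &[f32]) -> usize {
    let mut sorted: Vec<f32> = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    sorted.dedup_by(|a, b| a.to_bits() == b.to_bits());
    sorted.len()
}
```

The hash version also avoids copying the column, which matters more than the asymptotic difference when the column is large and has few distinct values.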
**Data structure choices:**
- CSR (Compressed Sparse Row) for GridIndex — single flat `values` array + `offsets` array eliminates per-cell Vec overhead
- `Box<[f32]>` for fixed-size aggregation arrays — avoids Vec capacity field (8 bytes saved per cell on 64-bit targets)
- Bit-packed booleans for flags like `is_approx_build_date` — 8x memory savings vs `Vec<bool>`
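
A compact sketch of the CSR layout and of bit-packed flags. `CsrGrid` and `BitFlags` are hypothetical names, and the real GridIndex may differ in field layout and cell addressing. This version uses a temporary per-cell cursor during the fill pass for simplicity.

```rust
/// CSR grid: `offsets[c]..offsets[c + 1]` is the slice of `values`
/// (point indices) that fall in cell `c`.
struct CsrGrid {
    cols: usize,
    offsets: Vec<u32>,
    values: Vec<u32>,
}

impl CsrGrid {
    /// Three passes over `points` (lat, lon): extent, per-cell counts, fill.
    fn build(points: &[(f64, f64)], cell_deg: f64) -> Option<Self> {
        if points.is_empty() || !(cell_deg > 0.0) {
            return None;
        }
        // Pass 1: min/max extent, which fixes the grid dimensions.
        let (mut min_lat, mut max_lat) = (f64::INFINITY, f64::NEG_INFINITY);
        let (mut min_lon, mut max_lon) = (f64::INFINITY, f64::NEG_INFINITY);
        for &(lat, lon) in points {
            min_lat = min_lat.min(lat);
            max_lat = max_lat.max(lat);
            min_lon = min_lon.min(lon);
            max_lon = max_lon.max(lon);
        }
        let rows = ((max_lat - min_lat) / cell_deg) as usize + 1;
        let cols = ((max_lon - min_lon) / cell_deg) as usize + 1;
        let cell_of = |lat: f64, lon: f64| -> usize {
            let r = (((lat - min_lat) / cell_deg) as usize).min(rows - 1);
            let c = (((lon - min_lon) / cell_deg) as usize).min(cols - 1);
            r * cols + c
        };
        // Pass 2: count points per cell, then prefix-sum counts into offsets.
        let mut offsets = vec![0u32; rows * cols + 1];
        for &(lat, lon) in points {
            offsets[cell_of(lat, lon) + 1] += 1;
        }
        for i in 1..offsets.len() {
            offsets[i] += offsets[i - 1];
        }
        // Pass 3: write each point index into the next free slot of its cell.
        let mut cursor = offsets.clone();
        let mut values = vec![0u32; points.len()];
        for (i, &(lat, lon)) in points.iter().enumerate() {
            let c = cell_of(lat, lon);
            values[cursor[c] as usize] = i as u32;
            cursor[c] += 1;
        }
        Some(Self { cols, offsets, values })
    }

    /// Point indices stored in the cell at (row, col). Panics if out of range.
    fn cell_points(&self, row: usize, col: usize) -> &[u32] {
        let c = row * self.cols + col;
        &self.values[self.offsets[c] as usize..self.offsets[c + 1] as usize]
    }
}

/// One bit per flag instead of one byte per `bool`.
struct BitFlags {
    words: Vec<u64>,
    len: usize,
}

impl BitFlags {
    fn from_bools(flags: &[bool]) -> Self {
        let mut words = vec![0u64; (flags.len() + 63) / 64];
        for (i, &flag) in flags.iter().enumerate() {
            if flag {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        Self { words, len: flags.len() }
    }

    fn get(&self, i: usize) -> bool {
        assert!(i < self.len, "flag index out of range");
        (self.words[i / 64] & (1u64 << (i % 64))) != 0
    }
}
```

The flat `values` array means a cell lookup is two offset reads and one slice, with no per-cell allocation or pointer chasing.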
**What NOT to optimize:**
- String cloning in JSON responses (~10-20 small strings) — negligible vs serialization overhead
- GridIndex 3-pass build (min/max → count → fill) — necessary for CSR without O(n) extra memory
- `Arc<str>` for enum values — complexity not worth modest benefit