Add postcode boundary calculation
parent f9bd218a3e
commit f5e6894c0f
14 changed files with 1384 additions and 717 deletions
23
pipeline/transform/postcode_boundaries/__init__.py
Normal file
@@ -0,0 +1,23 @@
"""Generate postcode boundary polygons from OA boundaries, INSPIRE parcels, and UPRN data.

Produces per-district GeoJSON files compatible with the Rust server's postcode loader.
Each postcode gets a polygon (or MultiPolygon) guaranteed to be contained within its
Output Area(s), with 100% OA coverage and no overlaps between postcodes within an OA.

Algorithm per OA:
1. Single-postcode OA → entire OA polygon assigned to that postcode
2. Multi-postcode OA:
   a. Assign INSPIRE parcels to postcodes via UPRN point-in-polygon majority vote
   b. Union INSPIRE parcels per postcode, clip to OA → "claimed" area
   c. Distribute remaining (unclaimed) OA area via Voronoi of UPRN points
   d. Final polygon = claimed + Voronoi share

Memory-efficient design (<12GB total):
- INSPIRE polygons stored as raw coordinate bytes in parquet; Shapely objects built
  lazily per-OA via numpy bbox pre-filter (~100-500 candidates at a time)
- UPRNs kept as sorted polars DataFrame with offset dict (Arrow storage, ~1.2GB)
- OA processing runs sequentially (no multiprocess INSPIRE duplication)

Output format: {output}/units/{DISTRICT}.geojson with properties.postcodes and
properties.mapit_code fields matching server-rs/src/data/postcodes.rs expectations.
"""
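Step 2a of the docstring (assigning INSPIRE parcels to postcodes by majority vote of the UPRN points they contain) can be illustrated with a small standalone sketch. This is not the code from this commit: the function names `point_in_ring` and `assign_parcels`, the input shapes, and the sample coordinates and postcodes are all hypothetical, and a plain ray-casting test stands in for the Shapely point-in-polygon check the module presumably uses. It also assumes parcels do not overlap and that ties go to the postcode seen first.

```python
from collections import Counter


def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_parcels(parcels, uprns):
    """Majority-vote assignment of parcels to postcodes.

    parcels: {parcel_id: ring}, uprns: [(x, y, postcode)] (hypothetical shapes).
    Returns {parcel_id: postcode}; parcels containing no UPRN are left out,
    which corresponds to area that would fall through to the Voronoi step.
    """
    votes = {pid: Counter() for pid in parcels}
    for x, y, postcode in uprns:
        for pid, ring in parcels.items():
            if point_in_ring(x, y, ring):
                votes[pid][postcode] += 1
                break  # assumption: parcels do not overlap
    # most_common(1) breaks ties in favour of the postcode counted first.
    return {pid: c.most_common(1)[0][0] for pid, c in votes.items() if c}


# Three adjacent square parcels; the third contains no UPRN.
parcels = {
    "p1": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "p2": [(2, 0), (4, 0), (4, 2), (2, 2)],
    "p3": [(4, 0), (6, 0), (6, 2), (4, 2)],
}
uprns = [
    (0.5, 0.5, "AB1 2CD"),
    (1.5, 1.0, "AB1 2CD"),
    (1.0, 1.5, "AB1 2EF"),
    (3.0, 1.0, "AB1 2EF"),
]
assigned = assign_parcels(parcels, uprns)
print(assigned)
```

In this sketch `p1` goes to the postcode holding two of its three UPRNs, `p2` goes to its single UPRN's postcode, and `p3` stays unassigned. The nested loop is quadratic and only suits an illustration; the bbox pre-filter mentioned in the docstring is what would keep the candidate set small per OA.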