{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# EPC Data Analysis\n", "\n", "Exploratory analysis of the Energy Performance Certificate (EPC) dataset (~29M rows, converted from CSV to parquet for memory efficiency)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2026-01-28T22:06:26.543050Z", "iopub.status.busy": "2026-01-28T22:06:26.542977Z", "iopub.status.idle": "2026-01-28T22:06:27.083569Z", "shell.execute_reply": "2026-01-28T22:06:27.083187Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Rows: 28,847,961\n", "Columns: 93\n" ] }, { "data": { "text/html": [ "
| LMK_KEY | ADDRESS1 | ADDRESS2 | ADDRESS3 | POSTCODE | BUILDING_REFERENCE_NUMBER | CURRENT_ENERGY_RATING | POTENTIAL_ENERGY_RATING | CURRENT_ENERGY_EFFICIENCY | POTENTIAL_ENERGY_EFFICIENCY | PROPERTY_TYPE | BUILT_FORM | INSPECTION_DATE | LOCAL_AUTHORITY | CONSTITUENCY | COUNTY | LODGEMENT_DATE | TRANSACTION_TYPE | ENVIRONMENT_IMPACT_CURRENT | ENVIRONMENT_IMPACT_POTENTIAL | ENERGY_CONSUMPTION_CURRENT | ENERGY_CONSUMPTION_POTENTIAL | CO2_EMISSIONS_CURRENT | CO2_EMISS_CURR_PER_FLOOR_AREA | CO2_EMISSIONS_POTENTIAL | LIGHTING_COST_CURRENT | LIGHTING_COST_POTENTIAL | HEATING_COST_CURRENT | HEATING_COST_POTENTIAL | HOT_WATER_COST_CURRENT | HOT_WATER_COST_POTENTIAL | TOTAL_FLOOR_AREA | ENERGY_TARIFF | MAINS_GAS_FLAG | FLOOR_LEVEL | FLAT_TOP_STOREY | FLAT_STOREY_COUNT | … | WALLS_ENERGY_EFF | WALLS_ENV_EFF | SECONDHEAT_DESCRIPTION | SHEATING_ENERGY_EFF | SHEATING_ENV_EFF | ROOF_DESCRIPTION | ROOF_ENERGY_EFF | ROOF_ENV_EFF | MAINHEAT_DESCRIPTION | MAINHEAT_ENERGY_EFF | MAINHEAT_ENV_EFF | MAINHEATCONT_DESCRIPTION | MAINHEATC_ENERGY_EFF | MAINHEATC_ENV_EFF | LIGHTING_DESCRIPTION | LIGHTING_ENERGY_EFF | LIGHTING_ENV_EFF | MAIN_FUEL | WIND_TURBINE_COUNT | HEAT_LOSS_CORRIDOR | UNHEATED_CORRIDOR_LENGTH | FLOOR_HEIGHT | PHOTO_SUPPLY | SOLAR_WATER_HEATING_FLAG | MECHANICAL_VENTILATION | ADDRESS | LOCAL_AUTHORITY_LABEL | CONSTITUENCY_LABEL | POSTTOWN | CONSTRUCTION_AGE_BAND | LODGEMENT_DATETIME | TENURE | FIXED_LIGHTING_OUTLETS_COUNT | LOW_ENERGY_FIXED_LIGHT_COUNT | UPRN | UPRN_SOURCE | REPORT_TYPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| str | str | str | str | str | i64 | str | str | i64 | i64 | str | str | date | str | str | str | date | str | i64 | i64 | i64 | i64 | f64 | i64 | f64 | i64 | i64 | i64 | i64 | i64 | i64 | f64 | str | str | str | str | f64 | … | str | str | str | str | str | str | str | str | str | str | str | str | str | str | str | str | str | str | i64 | str | f64 | f64 | f64 | str | str | str | str | str | str | str | datetime[μs] | str | i64 | i64 | i64 | str | i64 |
| "0013ac0dc0f80dd06448efce74287d… | "43 GAYAL CROFT" | "MILTON KEYNES" | null | "MK5 7HX" | 10003769440 | "C" | "B" | 75 | 89 | "House" | "Mid-Terrace" | 2022-08-22 | "E06000042" | "E14000822" | null | 2022-12-08 | "rental" | 75 | 90 | 175 | 61 | 1.9 | 31 | 0.7 | 55 | 44 | 340 | 342 | 98 | 63 | 61.0 | "Single" | "Y" | null | null | null | … | "Good" | "Good" | "None" | "N/A" | "N/A" | "Pitched, insulated (assumed)" | "Good" | "Good" | "Boiler and radiators, mains ga… | "Good" | "Good" | "Programmer, room thermostat an… | "Good" | "Good" | "Low energy lighting in 75% of … | "Very Good" | "Very Good" | "mains gas (not community)" | 0 | null | null | 2.4 | 0.0 | "N" | "natural" | "43 GAYAL CROFT, MILTON KEYNES" | "Milton Keynes" | "Milton Keynes South" | null | "England and Wales: 1996-2002" | 2022-12-08 22:29:20 | "Rented (social)" | 8 | null | 25018528 | "Energy Assessor" | 100 |
| "20daddd4e5ce9abcca2efab2270af5… | "30 Ditton Road" | null | null | "DA6 8JL" | 10003624703 | "D" | "B" | 59 | 84 | "House" | "Mid-Terrace" | 2022-11-03 | "E09000004" | "E14000558" | null | 2022-11-03 | "marketed sale" | 51 | 81 | 279 | 93 | 4.6 | 49 | 1.6 | 80 | 80 | 755 | 452 | 117 | 68 | 93.0 | "Unknown" | "Y" | null | null | null | … | "Poor" | "Poor" | "None" | "N/A" | "N/A" | "Pitched, 100 mm loft insulatio… | "Average" | "Average" | "Boiler and radiators, mains ga… | "Good" | "Good" | "Programmer and room thermostat" | "Average" | "Average" | "Low energy lighting in all fix… | "Very Good" | "Very Good" | "mains gas (not community)" | 0 | null | null | 2.64 | 0.0 | "N" | "natural" | "30 Ditton Road" | "Bexley" | "Bexleyheath and Crayford" | "BEXLEYHEATH" | "England and Wales: 1950-1966" | 2022-11-03 16:38:48 | "Owner-occupied" | 11 | null | 100020205591 | "Energy Assessor" | 100 |
| "0019793879cf7ab92be22ba552e992… | "The Paddocks" | "Bridge Road" | "Stoke Bruerne" | "NN12 7SE" | 10003816660 | "D" | "C" | 65 | 74 | "House" | "Detached" | 2022-12-16 | "E06000062" | "E14000942" | null | 2022-12-16 | "marketed sale" | 57 | 67 | 146 | 103 | 8.3 | 38 | 6.4 | 130 | 130 | 1008 | 959 | 184 | 93 | 217.0 | "Single" | "N" | null | null | null | … | "Good" | "Good" | "None" | "N/A" | "N/A" | "Pitched, insulated at rafters" | "Average" | "Average" | "Boiler and radiators, oil" | "Average" | "Good" | "Programmer, room thermostat an… | "Good" | "Good" | "Low energy lighting in all fix… | "Very Good" | "Very Good" | "oil (not community)" | 0 | null | null | 2.36 | 0.0 | "N" | "natural" | "The Paddocks, Bridge Road, Sto… | "West Northamptonshire" | "South Northamptonshire" | "TOWCESTER" | "England and Wales: 1983-1990" | 2022-12-16 10:47:55 | "Owner-occupied" | 22 | null | 10023964427 | "Energy Assessor" | 100 |
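
The preview rows pair each numeric score (`CURRENT_ENERGY_EFFICIENCY`, `POTENTIAL_ENERGY_EFFICIENCY`) with a letter band (`CURRENT_ENERGY_RATING`, `POTENTIAL_ENERGY_RATING`). A minimal sketch of that mapping follows, assuming the dataset uses the standard EPC band thresholds (A from 92, B 81-91, C 69-80, D 55-68, E 39-54, F 21-38, G below that). The helper `score_to_band` and the `BAND_FLOORS` table are hypothetical names introduced for illustration, and the check uses only the three rows shown above.

```python
# Standard EPC rating bands: lowest score in each band, best band first.
# Assumption: the *_ENERGY_RATING letters in this dataset follow these thresholds.
BAND_FLOORS = [(92, "A"), (81, "B"), (69, "C"), (55, "D"), (39, "E"), (21, "F"), (1, "G")]


def score_to_band(score: int) -> str:
    """Return the rating letter for an efficiency score (hypothetical helper)."""
    for floor, band in BAND_FLOORS:
        if score >= floor:
            return band
    # Assumption for this sketch: scores below 1 are placed in the lowest band.
    return "G"


# (current score, current band, potential score, potential band)
# copied from the three preview rows above.
sample_rows = [
    (75, "C", 89, "B"),
    (59, "D", 84, "B"),
    (65, "D", 74, "C"),
]

# Collect every (score, reported band, derived band) triple that disagrees.
mismatches = []
for cur_score, cur_band, pot_score, pot_band in sample_rows:
    for score, band in ((cur_score, cur_band), (pot_score, pot_band)):
        derived = score_to_band(score)
        if derived != band:
            mismatches.append((score, band, derived))

print("mismatches:", mismatches)
```

Agreement on three rows is not a check on the full dataset. This sketch also maps any score above 100 to "A", so implausibly large scores would need handling before the mapping is applied more widely.
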
| column | dtype | null_count | null_pct |
|---|---|---|---|
| str | str | i64 | f64 |
| "LMK_KEY" | "String" | 0 | 0.0 |
| "ADDRESS1" | "String" | 49 | 0.0 |
| "ADDRESS2" | "String" | 14135472 | 49.0 |
| "ADDRESS3" | "String" | 26056812 | 90.3 |
| "POSTCODE" | "String" | 0 | 0.0 |
| … | … | … | … |
| "FIXED_LIGHTING_OUTLETS_COUNT" | "Int64" | 11903726 | 41.3 |
| "LOW_ENERGY_FIXED_LIGHT_COUNT" | "Int64" | 19926976 | 69.1 |
| "UPRN" | "Int64" | 646277 | 2.2 |
| "UPRN_SOURCE" | "String" | 646277 | 2.2 |
| "REPORT_TYPE" | "Int64" | 0 | 0.0 |
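
The `null_pct` column appears to be `null_count` divided by the total row count, expressed as a percentage and rounded to one decimal place. The sketch below recomputes it in plain Python from the figures shown in the table; the `null_pct` helper is a hypothetical name, and the rounding rule is an inference from the displayed values rather than something the notebook states.

```python
TOTAL_ROWS = 28_847_961  # "Rows" value printed by the notebook

# (column, null_count, reported null_pct) copied from the summary table above.
reported = [
    ("ADDRESS1", 49, 0.0),
    ("ADDRESS2", 14_135_472, 49.0),
    ("ADDRESS3", 26_056_812, 90.3),
    ("FIXED_LIGHTING_OUTLETS_COUNT", 11_903_726, 41.3),
    ("LOW_ENERGY_FIXED_LIGHT_COUNT", 19_926_976, 69.1),
    ("UPRN", 646_277, 2.2),
]


def null_pct(null_count: int, total_rows: int) -> float:
    """Share of null values as a percentage, rounded to one decimal (assumed rule)."""
    return round(100 * null_count / total_rows, 1)


# Recompute the percentage for every listed column and print it next to the reported one.
recomputed = {col: null_pct(n, TOTAL_ROWS) for col, n, _ in reported}
for col, n, pct in reported:
    print(f"{col:<30} {n:>12,} {recomputed[col]:>6.1f} (reported {pct})")
```

With one-decimal rounding, a column such as `ADDRESS1` shows `0.0` even though it has 49 nulls, so `null_count` is the column to consult when checking whether a field is fully populated.
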
| PROPERTY_TYPE | mean_efficiency | median_efficiency | count |
|---|---|---|---|
| str | f64 | f64 | u32 |
| "Flat" | 69.745031 | 72.0 | 8236696 |
| "Maisonette" | 64.894323 | 68.0 | 710695 |
| "House" | 63.306193 | 65.0 | 17437884 |
| "Bungalow" | 60.703275 | 63.0 | 2448109 |
| "Park home" | 45.450847 | 48.0 | 14577 |
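
Two consistency checks can be run on this grouped table: the group counts should add up to the total row count, and the count-weighted average of the group means should reproduce the overall mean of `CURRENT_ENERGY_EFFICIENCY`. The sketch below uses only numbers that appear in the notebook output; the variable names are illustrative.

```python
TOTAL_ROWS = 28_847_961  # "Rows" value printed by the notebook

# PROPERTY_TYPE -> (mean_efficiency, count), copied from the table above.
groups = {
    "Flat": (69.745031, 8_236_696),
    "Maisonette": (64.894323, 710_695),
    "House": (63.306193, 17_437_884),
    "Bungalow": (60.703275, 2_448_109),
    "Park home": (45.450847, 14_577),
}

# Check 1: the five groups should cover every row.
total_count = sum(count for _, count in groups.values())

# Check 2: the weighted mean of group means is the overall mean.
weighted_mean = sum(mean * count for mean, count in groups.values()) / total_count

print(f"rows covered: {total_count:,} of {TOTAL_ROWS:,}")
print(f"weighted mean efficiency: {weighted_mean:.4f}")
```

The counts cover all rows, which suggests that `PROPERTY_TYPE` has no nulls and takes only these five values. The means are printed to six decimals, so the recombined mean can only be expected to match the overall figure up to rounding.
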
| column | count | null_count | mean | std | min | q25 | median | q75 | max |
|---|---|---|---|---|---|---|---|---|---|
| str | u32 | u32 | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
| "CURRENT_ENERGY_EFFICIENCY" | 28847961 | 0 | 64.953829 | 14.289267 | 0.0 | 58.0 | 67.0 | 74.0 | 13060.0 |
| "POTENTIAL_ENERGY_EFFICIENCY" | 28847961 | 0 | 79.197588 | 11.097917 | 0.0 | 75.0 | 81.0 | 85.0 | 13071.0 |
| "ENERGY_CONSUMPTION_CURRENT" | 28847961 | 0 | 253.700647 | 173.148972 | -320866.0 | 174.0 | 233.0 | 308.0 | 312273.0 |
| "CO2_EMISSIONS_CURRENT" | 28847961 | 0 | 3.819772 | 3.585926 | -2854.8 | 2.1 | 3.2 | 4.7 | 4863.0 |
| "TOTAL_FLOOR_AREA" | 28847959 | 2 | 86.904086 | 118.035704 | 0.0 | 59.48 | 77.0 | 99.0 | 530331.552 |
| "HEATING_COST_CURRENT" | 28840897 | 7064 | 708.726959 | 644.316629 | -944269.0 | 364.0 | 571.0 | 860.0 | 201800.0 |
| "LIGHTING_COST_CURRENT" | 28840912 | 7049 | 82.130655 | 589.723862 | -768.0 | 54.0 | 73.0 | 100.0 | 3.15769e6 |
| "HOT_WATER_COST_CURRENT" | 28841075 | 6886 | 154.051112 | 102.004094 | -38254.0 | 94.0 | 120.0 | 183.0 | 66675.0 |
| "NUMBER_HABITABLE_ROOMS" | 25398115 | 3449846 | 4.215609 | 1.915856 | 0.0 | 3.0 | 4.0 | 5.0 | 3424.0 |
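
Several extremes in this summary sit far from the quartiles, for example a maximum of 13060 for `CURRENT_ENERGY_EFFICIENCY` against a 75th percentile of 74, and negative minima for costs and consumption. One way to flag such columns is Tukey's rule, which marks values outside `q25 - 1.5 * IQR` and `q75 + 1.5 * IQR`. This is a heuristic introduced here for illustration, not a method taken from the notebook; the `tukey_fences` helper and the multiplier `k=1.5` are assumptions.

```python
# (column, min, q25, q75, max) copied from the summary table above.
summary = [
    ("CURRENT_ENERGY_EFFICIENCY", 0.0, 58.0, 74.0, 13060.0),
    ("POTENTIAL_ENERGY_EFFICIENCY", 0.0, 75.0, 85.0, 13071.0),
    ("ENERGY_CONSUMPTION_CURRENT", -320866.0, 174.0, 308.0, 312273.0),
    ("CO2_EMISSIONS_CURRENT", -2854.8, 2.1, 4.7, 4863.0),
    ("TOTAL_FLOOR_AREA", 0.0, 59.48, 99.0, 530331.552),
    ("HEATING_COST_CURRENT", -944269.0, 364.0, 860.0, 201800.0),
    ("LIGHTING_COST_CURRENT", -768.0, 54.0, 100.0, 3.15769e6),
    ("HOT_WATER_COST_CURRENT", -38254.0, 94.0, 183.0, 66675.0),
    ("NUMBER_HABITABLE_ROOMS", 0.0, 3.0, 5.0, 3424.0),
]


def tukey_fences(q25: float, q75: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences at k interquartile ranges beyond the quartiles."""
    iqr = q75 - q25
    return q25 - k * iqr, q75 + k * iqr


# For each column, record whether the reported min or max lies outside the fences.
flags = {}
for col, col_min, q25, q75, col_max in summary:
    lower, upper = tukey_fences(q25, q75)
    flags[col] = {
        "min_below_fence": col_min < lower,
        "max_above_fence": col_max > upper,
    }
    print(f"{col:<28} fences=({lower:.2f}, {upper:.2f}) {flags[col]}")
```

A flag only says that an extreme is unusual relative to the bulk of the distribution. Whether a value is actually invalid (a negative heating cost, a floor area of 0) needs domain limits, which the notebook output does not state.
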