Don't cut off heads

2026-05-06 22:05:14 +01:00
commit 3f77f0e94b
parent 4601f7aaea
7 changed files with 287 additions and 110 deletions
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Frame
-A small e-ink photo frame for our home. It pulls from our self-hosted [Immich](https://immich.app/) library, checks the self-hosted [Home Assistant](https://www.home-assistant.io/) to see if anyone is home, and shows a photo on the [PhotoPainter](https://www.waveshare.com/wiki/PhotoPainter) (Waveshare 7.3" 6-colour panel hooked up to a Raspberry Pi Zero 2W) for everyone to enjoy.
+A small e-ink photo frame for our home. It pulls from a self-hosted [Immich](https://immich.app/) library, checks a self-hosted [Home Assistant](https://www.home-assistant.io/) to see if anyone is home, and shows a photo on the [PhotoPainter](https://www.waveshare.com/wiki/PhotoPainter) (Waveshare 7.3" 6-colour panel hooked up to a Raspberry Pi Zero 2W) for everyone to enjoy.
 <p align="center"><img src="photos/frame.jpg" alt="The frame showing a dithered landscape photo with a small overlay reading '2 years ago' and 'Palmeiras'" width="420"></p>
 <p align="center"><sub><em>The bottom corners show the photo's capture age and EXIF location. This one was taken in Palmeiras two years ago.</em></sub></p>
@@ -19,7 +19,7 @@ It was a fun afternoon project with Claude Code, a bit of experimenting with dif
 1. Quits if it's between midnight and 7am.
 2. Asks Home Assistant whether anyone in `HA_PRESENCE` is home. If not, quits to preserve power and not strain the e-ink unnecessarily.
-3. Picks a random photo from Immich. The pool is weighted: ~30% "on this day" memories (10% if only the ±3-day fallback fires), ~18% favourites, ~36% the last 30 days, and ~36% everything else. A 7-day rolling history avoids repeats; orientation match gets 4x the weight of mismatch. See [immich.py](./src/lib/immich.py).
+3. Picks a random photo from Immich. The pool is weighted: ~30% "on this day" memories (10% if only the ±3-day fallback fires), ~18% favourites, ~36% the last 30 days, and ~36% everything else. A 7-day rolling history avoids repeats; orientation match gets 4x the weight of mismatch. Before accepting a candidate, the picker verifies that every detected head fits inside the crop with a small safety margin; rejected candidates are skipped. See [immich.py](./src/lib/immich.py).
 4. Crops around any detected faces, boosts contrast and saturation (both lacking on e-ink), dithers down to the 6-colour palette, and pushes it to the panel. The capture age and EXIF location are painted into the bottom corners as white-on-black-stroke text, so dithering can't smear the edges.
 ## Image pipeline
@@ -28,14 +28,17 @@ The two choices that matter most are `face_aware_crop` and Atkinson dithering.
 ### Cropping
-Obviously, the frame can only be in orientation at a time but I didn't want to limit it to only show portrait or landscape photos. So the `face_aware_crop` function resize-crops to fill the frame while keeping all faces within the frame. A landscape shot with room around the subject usually crops cleanly to portrait this way. For finding faces, it relies on the bounding boxes returned by Immich.
+The frame can only be in one orientation at a time, but I didn't want to limit it to only show portrait or landscape photos. So `face_aware_crop` resize-crops to fill the frame while biasing the crop around the faces returned by Immich. A landscape shot with room around the subject usually crops cleanly to portrait this way.
-See the following example from [crop_compare.ipynb](./notebooks/crop_compare.ipynb) that shows how the head bounding boxes affect the final crop.
+The important guardrail is `heads_fit_in_crop`: before the picker accepts a downloaded candidate, it checks the exact crop window against each face box extended upward to cover the head and padded by `HEAD_SAFETY_MARGIN`. If the crop would cut into any visible padded head area, the photo is rejected and another candidate is tried. That keeps the aggressive landscape-to-portrait crop from shaving off heads in group photos or edge-framed shots.
 See the following examples from [crop_compare.ipynb](./notebooks/crop_compare.ipynb) that show how the head bounding boxes affect the final crop and which candidates would be accepted or rejected.
 <p align="center">
-  <img src="photos/crop_compare_portrait.png" alt="Crop comparison showing original photos with face boxes, naive centre crops, and face-aware crops for a portrait frame target" width="760">
+  <img src="photos/crop_compare_portrait.png" alt="Crop comparison showing original photos with face boxes, naive centre crops, and accepted face-aware crops for a portrait frame target" width="760">
 </p>
 ### Dithering
 The panel can only show six colours: black, white, red, yellow, blue, and green. There's no intensity control like on an LCD, every pixel is one of those six. To get anything legible we have to dither, and there are many algorithms with wildly different running times. [dither_compare.ipynb](./notebooks/dither_compare.ipynb) shows a comparison between a few.
--- a/notebooks/crop_compare.ipynb
+++ b/notebooks/crop_compare.ipynb
--- a/photos/crop_compare_landscape.png
+++ b/photos/crop_compare_landscape.png
--- a/photos/crop_compare_portrait.png
+++ b/photos/crop_compare_portrait.png
--- a/src/display.py
+++ b/src/display.py
@@ -12,7 +12,12 @@ sys.path.append(str(Path(__file__).parent / "lib"))
 from crop import face_aware_crop
 from env import load_env, require
 from homeassistant import HomeAssistantClient
-from immich import ImmichClient, get_random_photo_from_album, get_random_photo_of_people
+from immich import (
    ImmichClient,
    get_random_photo_from_album,
    get_random_photo_of_people,
    target_size_for_orientation,
 )
 from overlay import format_age, format_location
 # waveshare_epd is imported lazily only when a render will actually happen.
@@ -97,9 +102,11 @@ def main() -> None:
        try:
            epd.init()
            img = Image.open(image_path).convert("RGB")
-            faces = client.get_asset_faces(asset["id"])
+            faces = asset.get("_faces")
            if faces is None:
                faces = client.get_asset_faces(asset["id"])
            print(f"Faces: {len(faces)}")
-            target_w, target_h = (480, 800) if args.orientation in (90, 270) else (800, 480)
+            target_w, target_h = target_size_for_orientation(args.orientation)
            img = face_aware_crop(img, target_w, target_h, faces)
            if args.orientation:
                img = img.rotate(args.orientation, expand=True)
--- a/src/lib/crop.py
+++ b/src/lib/crop.py
@@ -1,4 +1,4 @@
-"""Resize-to-cover with face-aware positioning.
+"""Resize-to-cover with face-aware positioning and head-fit checks.
 When a portrait source is cropped onto a landscape target, the face joint-span
 centre lands on the top third of the crop window instead of the middle, so the
@@ -6,6 +6,7 @@ eyes sit on the upper-third line where landscape composition naturally reads.
 """
 import math
 from dataclasses import dataclass
 from PIL import Image
@@ -14,6 +15,102 @@ from PIL import Image
 # bare face centre.
 HEAD_EXTENSION = 0.4
 # Extra room around the head-extended face box required before a photo is
 # accepted for display.
 HEAD_SAFETY_MARGIN = 0.08
 _FIT_EPSILON = 1e-6
@dataclass(frozen=True)
 class CropGeometry:
    """Resize-to-cover geometry used by the face-aware crop."""
    resized_size: tuple[int, int]
    crop_box: tuple[int, int, int, int]
    head_boxes: list[tuple[float, float, float, float]]
 def _cover_size(img_w: int, img_h: int, target_w: int, target_h: int) -> tuple[int, int]:
    img_aspect = img_w / img_h
    target_aspect = target_w / target_h
    if img_aspect < target_aspect:
        return target_w, math.ceil(target_w / img_aspect)
    return math.ceil(target_h * img_aspect), target_h
 def _head_boxes(
    faces: list[dict], img_w: int, img_h: int, new_w: int, new_h: int
 ) -> list[tuple[float, float, float, float]]:
    boxes = []
    for f in faces:
        sx = new_w / (f.get("imageWidth") or img_w)
        sy = new_h / (f.get("imageHeight") or img_h)
        x1 = f["boundingBoxX1"] * sx
        y1 = f["boundingBoxY1"] * sy
        x2 = f["boundingBoxX2"] * sx
        y2 = f["boundingBoxY2"] * sy
        face_w = x2 - x1
        face_h = y2 - y1
        x_margin = face_w * HEAD_SAFETY_MARGIN
        y_margin = face_h * HEAD_SAFETY_MARGIN
        boxes.append(
            (
                x1 - x_margin,
                y1 - face_h * HEAD_EXTENSION - y_margin,
                x2 + x_margin,
                y2 + y_margin,
            )
        )
    return boxes
 def face_aware_crop_geometry(
    image_size: tuple[int, int], target_w: int, target_h: int, faces: list[dict]
 ) -> CropGeometry:
    """Return the resize size, crop box, and safety-expanded head boxes."""
    img_w, img_h = image_size
    new_w, new_h = _cover_size(img_w, img_h, target_w, target_h)
    cx, cy = new_w / 2, new_h / 2
    boxes = _head_boxes(faces, img_w, img_h, new_w, new_h) if faces else []
    if boxes:
        cx = (min(b[0] for b in boxes) + max(b[2] for b in boxes)) / 2
        cy = (min(b[1] for b in boxes) + max(b[3] for b in boxes)) / 2
    y_anchor = target_h / 3 if img_h > img_w and target_w > target_h else target_h / 2
    x_off = max(0, min(int(cx - target_w / 2), new_w - target_w))
    y_off = max(0, min(int(cy - y_anchor), new_h - target_h))
    return CropGeometry(
        resized_size=(new_w, new_h),
        crop_box=(x_off, y_off, x_off + target_w, y_off + target_h),
        head_boxes=boxes,
    )
 def heads_fit_in_crop_size(
    image_size: tuple[int, int], target_w: int, target_h: int, faces: list[dict]
 ) -> bool:
    """True when the face-aware crop keeps every visible head area inside."""
    if not faces:
        return True
    geometry = face_aware_crop_geometry(image_size, target_w, target_h, faces)
    new_w, new_h = geometry.resized_size
    crop_x1, crop_y1, crop_x2, crop_y2 = geometry.crop_box
    return all(
        max(0, head_x1) >= crop_x1 - _FIT_EPSILON
        and max(0, head_y1) >= crop_y1 - _FIT_EPSILON
        and min(new_w, head_x2) <= crop_x2 + _FIT_EPSILON
        and min(new_h, head_y2) <= crop_y2 + _FIT_EPSILON
        for head_x1, head_y1, head_x2, head_y2 in geometry.head_boxes
    )
 def heads_fit_in_crop(image: Image.Image, target_w: int, target_h: int, faces: list[dict]) -> bool:
    """True when `face_aware_crop` would keep all heads inside the output frame."""
    return heads_fit_in_crop_size(image.size, target_w, target_h, faces)
 def face_aware_crop(
    image: Image.Image, target_w: int, target_h: int, faces: list[dict]
@@ -25,35 +122,7 @@ def face_aware_crop(
    the top third of the crop window (rule of thirds) instead of the middle.
    Plain centre crop when no faces.
    """
-    img_w, img_h = image.size
-    img_aspect = img_w / img_h
-    target_aspect = target_w / target_h
-    if img_aspect < target_aspect:
-        new_w = target_w
-        new_h = math.ceil(target_w / img_aspect)
-    else:
-        new_w = math.ceil(target_h * img_aspect)
-        new_h = target_h
+    geometry = face_aware_crop_geometry(image.size, target_w, target_h, faces)
+    new_w, new_h = geometry.resized_size
    resized = image.resize((new_w, new_h), Image.LANCZOS)
-
-    cx, cy = new_w / 2, new_h / 2
-    if faces:
-        boxes = []
-        for f in faces:
-            sx = new_w / (f.get("imageWidth") or img_w)
-            sy = new_h / (f.get("imageHeight") or img_h)
-            x1 = f["boundingBoxX1"] * sx
-            y1 = f["boundingBoxY1"] * sy
-            x2 = f["boundingBoxX2"] * sx
-            y2 = f["boundingBoxY2"] * sy
-            boxes.append((x1, y1, x2, y2))
-        cx = (min(b[0] for b in boxes) + max(b[2] for b in boxes)) / 2
-        y_lo_ext = min(b[1] - (b[3] - b[1]) * HEAD_EXTENSION for b in boxes)
-        y_hi = max(b[3] for b in boxes)
-        cy = (y_lo_ext + y_hi) / 2
-    y_anchor = target_h / 3 if img_h > img_w and target_w > target_h else target_h / 2
-    x_off = max(0, min(int(cx - target_w / 2), new_w - target_w))
-    y_off = max(0, min(int(cy - y_anchor), new_h - target_h))
-    return resized.crop((x_off, y_off, x_off + target_w, y_off + target_h))
+    return resized.crop(geometry.crop_box)
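Reviewer note: the head-fit rule added to `crop.py` can be checked by hand without Pillow or Immich. The sketch below re-derives it on plain numbers. It is not the code from this commit: `cover_size`, `padded_head_box`, and `heads_fit` are hypothetical names, faces are passed as plain `(x1, y1, x2, y2)` pixel tuples in source-image coordinates rather than Immich face dicts, and the 1600x1200 photo is made up. Only the two constants and the arithmetic are taken from the diff above.

```python
import math

HEAD_EXTENSION = 0.4       # copied from crop.py in this commit
HEAD_SAFETY_MARGIN = 0.08  # copied from crop.py in this commit
FIT_EPSILON = 1e-6


def cover_size(img_w, img_h, target_w, target_h):
    """Smallest resize that fully covers the target, as in _cover_size."""
    img_aspect = img_w / img_h
    if img_aspect < target_w / target_h:
        return target_w, math.ceil(target_w / img_aspect)
    return math.ceil(target_h * img_aspect), target_h


def padded_head_box(face, sx, sy):
    """Scale a face box, extend it upward for the head, then pad all sides."""
    x1, y1, x2, y2 = face[0] * sx, face[1] * sy, face[2] * sx, face[3] * sy
    w, h = x2 - x1, y2 - y1
    return (
        x1 - w * HEAD_SAFETY_MARGIN,
        y1 - h * HEAD_EXTENSION - h * HEAD_SAFETY_MARGIN,
        x2 + w * HEAD_SAFETY_MARGIN,
        y2 + h * HEAD_SAFETY_MARGIN,
    )


def heads_fit(image_size, target, faces):
    """True when the face-centred crop window contains every padded head."""
    if not faces:
        return True
    (img_w, img_h), (target_w, target_h) = image_size, target
    new_w, new_h = cover_size(img_w, img_h, target_w, target_h)
    boxes = [padded_head_box(f, new_w / img_w, new_h / img_h) for f in faces]
    # Centre the crop on the joint span of all padded head boxes.
    cx = (min(b[0] for b in boxes) + max(b[2] for b in boxes)) / 2
    cy = (min(b[1] for b in boxes) + max(b[3] for b in boxes)) / 2
    # Portrait source on a landscape target anchors on the upper third.
    y_anchor = target_h / 3 if img_h > img_w and target_w > target_h else target_h / 2
    x_off = max(0, min(int(cx - target_w / 2), new_w - target_w))
    y_off = max(0, min(int(cy - y_anchor), new_h - target_h))
    # Head boxes are clamped to the resized image, so padding that falls
    # outside the photo itself never causes a rejection.
    return all(
        max(0, b[0]) >= x_off - FIT_EPSILON
        and max(0, b[1]) >= y_off - FIT_EPSILON
        and min(new_w, b[2]) <= x_off + target_w + FIT_EPSILON
        and min(new_h, b[3]) <= y_off + target_h + FIT_EPSILON
        for b in boxes
    )


# One centred face in a landscape photo survives the portrait crop.
print(heads_fit((1600, 1200), (480, 800), [(700, 400, 900, 600)]))
# Two faces at opposite edges cannot both fit in a 480-wide window.
print(heads_fit((1600, 1200), (480, 800), [(100, 400, 300, 600), (1300, 400, 1500, 600)]))
```

In the second case the crop is centred between the two faces, so both padded head boxes fall outside the 480-pixel-wide window and the photo would be rejected.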
--- a/src/lib/immich.py
+++ b/src/lib/immich.py
@@ -8,16 +8,20 @@ from datetime import UTC, datetime, timedelta
 from pathlib import Path
 from urllib.request import Request
 from crop import heads_fit_in_crop
 from net import urlopen_with_retry
 from PIL import Image
 HISTORY_FILE = Path(__file__).parent.parent / "photo_history.json"
 CACHE_DIR = Path(tempfile.gettempdir()) / "frame_cache"
 # Soft preference for picking photos whose orientation matches the frame.
 # Mismatched-orientation photos still appear, just less often, since
-# face_aware_crop handles them via the rule-of-thirds composition.
+# face_aware_crop can often compose them without losing heads.
 ORIENTATION_MATCH_WEIGHT = 0.8
 ORIENTATION_DIFFER_WEIGHT = 0.2
 FRAME_LANDSCAPE = (800, 480)
 FRAME_PORTRAIT = (480, 800)
 _ROTATED_EXIF_ORIENTATIONS = {5, 6, 7, 8, "5", "6", "7", "8"}
@@ -184,6 +188,11 @@ def _bias_by_orientation(candidates: list[dict], frame_portrait: bool) -> list[d
    return pool
 def target_size_for_orientation(orientation: int) -> tuple[int, int]:
    """Pre-rotation crop target for the Waveshare panel."""
    return FRAME_PORTRAIT if orientation in (90, 270) else FRAME_LANDSCAPE
 def _on_this_day_candidates(assets: list[dict]) -> tuple[list[dict], bool]:
    """Photos taken on today's month-day in past years, with a ±3-day fallback.
@@ -241,6 +250,75 @@ def _pick_weighted_random(assets: list[dict]) -> dict:
    return random.choice(pool)
 def _asset_label(asset: dict) -> str:
    return asset.get("originalFileName") or asset.get("originalPath") or asset.get("id", "unknown")
 def _download_if_heads_fit(
    client: ImmichClient, asset: dict, target_w: int, target_h: int
 ) -> tuple[Path, dict] | None:
    faces = client.get_asset_faces(asset["id"])
    with tempfile.NamedTemporaryFile(prefix="immich_photo_", suffix=".jpg", delete=False) as tmp:
        dest = Path(tmp.name)
    path = client.download_asset(asset["id"], dest)
    try:
        if faces:
            with Image.open(path) as img:
                fits = heads_fit_in_crop(img, target_w, target_h, faces)
            if not fits:
                path.unlink(missing_ok=True)
                print(
                    f"Rejected photo: {_asset_label(asset)} "
                    f"(heads do not fit {target_w}x{target_h} crop)"
                )
                return None
    except Exception:
        path.unlink(missing_ok=True)
        raise
    selected = dict(asset)
    selected["_faces"] = faces
    return path, selected
 def _pick_eligible_and_download(
    client: ImmichClient,
    candidates: list[dict],
    target_w: int,
    target_h: int,
    rejected_ids: set[str],
 ) -> tuple[Path, dict] | None:
    remaining = [a for a in candidates if a.get("id") not in rejected_ids]
    while remaining:
        asset = _pick_weighted_random(remaining)
        asset_id = asset["id"]
        result = _download_if_heads_fit(client, asset, target_w, target_h)
        if result is not None:
            return result
        rejected_ids.add(asset_id)
        remaining = [a for a in remaining if a.get("id") not in rejected_ids]
    return None
 def _pick_eligible_with_orientation_bias(
    client: ImmichClient,
    candidates: list[dict],
    target_w: int,
    target_h: int,
    frame_portrait: bool,
    rejected_ids: set[str],
 ) -> tuple[Path, dict] | None:
    biased_candidates = _bias_by_orientation(candidates, frame_portrait)
    result = _pick_eligible_and_download(
        client, biased_candidates, target_w, target_h, rejected_ids
    )
    if result is None and len(biased_candidates) < len(candidates):
        print("No eligible photos in picked orientation pool, trying other orientations")
        result = _pick_eligible_and_download(client, candidates, target_w, target_h, rejected_ids)
    return result
 def _pick_and_download(
    client: ImmichClient, assets: list[dict], orientation: int, source_label: str
 ) -> tuple[Path, dict]:
@@ -249,18 +327,33 @@ def _pick_and_download(
    displayed, created_at = _load_history()
    candidates = [a for a in assets if a.get("id") not in displayed]
    history_filtered = len(candidates) < len(assets)
    if not candidates:
        print(f"All {len(assets)} photos shown, picking from full list")
        candidates = assets
    else:
        print(f"Photos: {len(candidates)} new / {len(assets)} total")
-    candidates = _bias_by_orientation(candidates, orientation in (90, 270))
-    asset = _pick_weighted_random(candidates)
-    with tempfile.NamedTemporaryFile(prefix="immich_photo_", suffix=".jpg", delete=False) as tmp:
-        dest = Path(tmp.name)
-    path = client.download_asset(asset["id"], dest)
+    target_w, target_h = target_size_for_orientation(orientation)
+    rejected_ids: set[str] = set()
+    frame_portrait = orientation in (90, 270)
+    result = _pick_eligible_with_orientation_bias(
+        client, candidates, target_w, target_h, frame_portrait, rejected_ids
+    )
+    if result is None and history_filtered:
+        print("No eligible new photos after head-fit checks, picking from full list")
+        result = _pick_eligible_with_orientation_bias(
+            client, assets, target_w, target_h, frame_portrait, rejected_ids
+        )
+    if result is None:
+        raise ValueError(
+            f"No photos in {source_label} can be cropped to {target_w}x{target_h} "
+            "without cutting off heads"
+        )
+    path, asset = result
    displayed.add(asset["id"])
    _save_history(displayed, created_at)
    return path, asset
@@ -291,4 +384,4 @@ def get_random_photo_from_album(
    if not assets:
        raise ValueError(f"No photos in album: {album_name}")
-    return _pick_and_download(client, assets, orientation, f"album: {album_name}")
+    return _pick_and_download(client, assets, orientation, f"album {album_name!r}")
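Reviewer note: the selection flow in `immich.py` is easier to follow with the download and Immich calls stripped out. The sketch below shows only the control flow, under stated assumptions: `pick_until_accepted` and `pick_with_fallback` are hypothetical names, `accepts` stands in for the download plus `heads_fit_in_crop` step, the choice is uniform rather than weighted, and the orientation-bias stage is left out. As in the commit, the rejected set is shared between passes so no photo is checked twice.

```python
import random


def pick_until_accepted(candidates, accepts, rejected, rng=random):
    """Try random candidates until one is accepted; None if all are rejected."""
    remaining = [c for c in candidates if c["id"] not in rejected]
    while remaining:
        choice = rng.choice(remaining)
        if accepts(choice):
            return choice
        rejected.add(choice["id"])
        remaining = [c for c in remaining if c["id"] not in rejected]
    return None


def pick_with_fallback(new_photos, all_photos, accepts, rng=random):
    """Prefer photos not shown recently, then fall back to the full list."""
    rejected = set()  # shared, so the fallback pass skips known failures
    result = pick_until_accepted(new_photos, accepts, rejected, rng)
    if result is None and len(new_photos) < len(all_photos):
        result = pick_until_accepted(all_photos, accepts, rejected, rng)
    if result is None:
        raise ValueError("no photo can be cropped without cutting off heads")
    return result


# Made-up data: "fits" plays the role of the head-fit check result.
photos = [
    {"id": "a", "fits": False},
    {"id": "b", "fits": False},
    {"id": "c", "fits": True},
]
checked = []


def accepts(photo):
    checked.append(photo["id"])
    return photo["fits"]


# "a" and "b" are the unseen photos; both fail, so the full list is tried.
picked = pick_with_fallback(photos[:2], photos, accepts)
print(picked["id"], len(checked))
```

Each photo reaches `accepts` at most once, which matters on the Pi because every check in the real code costs a download.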