Don't cut off heads
Some checks failed
lint / ruff (push) Failing after 32s

Andras Schmelczer 2026-05-06 22:05:14 +01:00
parent 4601f7aaea
commit 3f77f0e94b
7 changed files with 287 additions and 110 deletions


@ -1,6 +1,6 @@
# Frame # Frame
A small e-ink photo frame for our home. It pulls from our self-hosted [Immich](https://immich.app/) library, checks the self-hosted [Home Assistant](https://www.home-assistant.io/) to see if anyone is home, and shows a photo on the [PhotoPainter](https://www.waveshare.com/wiki/PhotoPainter) (Waveshare 7.3" 6-colour panel hooked up to a Raspberry Pi Zero 2W) for everyone to enjoy. A small e-ink photo frame for our home. It pulls from a self-hosted [Immich](https://immich.app/) library, checks a self-hosted [Home Assistant](https://www.home-assistant.io/) to see if anyone is home, and shows a photo on the [PhotoPainter](https://www.waveshare.com/wiki/PhotoPainter) (Waveshare 7.3" 6-colour panel hooked up to a Raspberry Pi Zero 2W) for everyone to enjoy.
<p align="center"><img src="photos/frame.jpg" alt="The frame showing a dithered landscape photo with a small overlay reading '2 years ago' and 'Palmeiras'" width="420"></p> <p align="center"><img src="photos/frame.jpg" alt="The frame showing a dithered landscape photo with a small overlay reading '2 years ago' and 'Palmeiras'" width="420"></p>
<p align="center"><sub><em>The bottom corners show the photo's capture age and EXIF location. This one was taken in Palmeiras two years ago.</em></sub></p> <p align="center"><sub><em>The bottom corners show the photo's capture age and EXIF location. This one was taken in Palmeiras two years ago.</em></sub></p>
@ -19,7 +19,7 @@ It was a fun afternoon project with Claude Code, a bit of experimenting with dif
1. Quits if it's between midnight and 7am. 1. Quits if it's between midnight and 7am.
2. Asks Home Assistant whether anyone in `HA_PRESENCE` is home. If not, quits to preserve power and not strain the e-ink unnecessarily. 2. Asks Home Assistant whether anyone in `HA_PRESENCE` is home. If not, quits to preserve power and not strain the e-ink unnecessarily.
3. Picks a random photo from Immich. The pool is weighted: ~30% "on this day" memories (10% if only the ±3-day fallback fires), ~18% favourites, ~36% the last 30 days, and ~36% everything else. A 7-day rolling history avoids repeats; orientation match gets 4x the weight of mismatch. See [immich.py](./src/lib/immich.py). 3. Picks a random photo from Immich. The pool is weighted: ~30% "on this day" memories (10% if only the ±3-day fallback fires), ~18% favourites, ~36% the last 30 days, and ~36% everything else. A 7-day rolling history avoids repeats; orientation match gets 4x the weight of mismatch. Before accepting a candidate, the picker verifies that every detected head fits inside the crop with a small safety margin; rejected candidates are skipped. See [immich.py](./src/lib/immich.py).
4. Crops around any detected faces, boosts contrast and saturation (both lacking on e-ink), dithers down to the 6-colour palette, and pushes it to the panel. The capture age and EXIF location are painted into the bottom corners as white-on-black-stroke text, so dithering can't smear the edges. 4. Crops around any detected faces, boosts contrast and saturation (both lacking on e-ink), dithers down to the 6-colour palette, and pushes it to the panel. The capture age and EXIF location are painted into the bottom corners as white-on-black-stroke text, so dithering can't smear the edges.
## Image pipeline ## Image pipeline
@ -28,14 +28,17 @@ The two choices that matter most are `face_aware_crop` and Atkinson dithering.
### Cropping ### Cropping
Obviously, the frame can only be in orientation at a time but I didn't want to limit it to only show portrait or landscape photos. So the `face_aware_crop` function resize-crops to fill the frame while keeping all faces within the frame. A landscape shot with room around the subject usually crops cleanly to portrait this way. For finding faces, it relies on the bounding boxes returned by Immich. The frame can only be in one orientation at a time, but I didn't want to limit it to only show portrait or landscape photos. So `face_aware_crop` resize-crops to fill the frame while biasing the crop around the faces returned by Immich. A landscape shot with room around the subject usually crops cleanly to portrait this way.
See the following example from [crop_compare.ipynb](./notebooks/crop_compare.ipynb) that shows how the head bounding boxes affect the final crop. The important guardrail is `heads_fit_in_crop`: before the picker accepts a downloaded candidate, it checks the exact crop window against each face box extended upward to cover the head and padded by `HEAD_SAFETY_MARGIN`. If the crop would cut into any visible padded head area, the photo is rejected and another candidate is tried. That keeps the aggressive landscape-to-portrait crop from shaving off heads in group photos or edge-framed shots.
See the following examples from [crop_compare.ipynb](./notebooks/crop_compare.ipynb) that show how the head bounding boxes affect the final crop and which candidates would be accepted or rejected.
<p align="center"> <p align="center">
<img src="photos/crop_compare_portrait.png" alt="Crop comparison showing original photos with face boxes, naive centre crops, and face-aware crops for a portrait frame target" width="760"> <img src="photos/crop_compare_portrait.png" alt="Crop comparison showing original photos with face boxes, naive centre crops, and accepted face-aware crops for a portrait frame target" width="760">
</p> </p>
### Dithering ### Dithering
The panel can only show six colours: black, white, red, yellow, blue, and green. There's no intensity control like on an LCD, every pixel is one of those six. To get anything legible we have to dither, and there are many algorithms with wildly different running times. [dither_compare.ipynb](./notebooks/dither_compare.ipynb) shows a comparison between a few. The panel can only show six colours: black, white, red, yellow, blue, and green. There's no intensity control like on an LCD, every pixel is one of those six. To get anything legible we have to dither, and there are many algorithms with wildly different running times. [dither_compare.ipynb](./notebooks/dither_compare.ipynb) shows a comparison between a few.

File diff suppressed because one or more lines are too long

Binary file not shown. Size: 4.5 MiB before, 4.5 MiB after.

Binary file not shown. Size: 3 MiB before, 2.9 MiB after.


@ -12,7 +12,12 @@ sys.path.append(str(Path(__file__).parent / "lib"))
from crop import face_aware_crop from crop import face_aware_crop
from env import load_env, require from env import load_env, require
from homeassistant import HomeAssistantClient from homeassistant import HomeAssistantClient
from immich import ImmichClient, get_random_photo_from_album, get_random_photo_of_people from immich import (
ImmichClient,
get_random_photo_from_album,
get_random_photo_of_people,
target_size_for_orientation,
)
from overlay import format_age, format_location from overlay import format_age, format_location
# waveshare_epd is imported lazily only when a render will actually happen. # waveshare_epd is imported lazily only when a render will actually happen.
@ -97,9 +102,11 @@ def main() -> None:
try: try:
epd.init() epd.init()
img = Image.open(image_path).convert("RGB") img = Image.open(image_path).convert("RGB")
faces = client.get_asset_faces(asset["id"]) faces = asset.get("_faces")
if faces is None:
faces = client.get_asset_faces(asset["id"])
print(f"Faces: {len(faces)}") print(f"Faces: {len(faces)}")
target_w, target_h = (480, 800) if args.orientation in (90, 270) else (800, 480) target_w, target_h = target_size_for_orientation(args.orientation)
img = face_aware_crop(img, target_w, target_h, faces) img = face_aware_crop(img, target_w, target_h, faces)
if args.orientation: if args.orientation:
img = img.rotate(args.orientation, expand=True) img = img.rotate(args.orientation, expand=True)


@ -1,4 +1,4 @@
"""Resize-to-cover with face-aware positioning. """Resize-to-cover with face-aware positioning and head-fit checks.
When a portrait source is cropped onto a landscape target, the face joint-span When a portrait source is cropped onto a landscape target, the face joint-span
centre lands on the top third of the crop window instead of the middle, so the centre lands on the top third of the crop window instead of the middle, so the
@ -6,6 +6,7 @@ eyes sit on the upper-third line where landscape composition naturally reads.
""" """
import math import math
from dataclasses import dataclass
from PIL import Image from PIL import Image
@ -14,6 +15,102 @@ from PIL import Image
# bare face centre. # bare face centre.
HEAD_EXTENSION = 0.4 HEAD_EXTENSION = 0.4
# Extra room around the head-extended face box required before a photo is
# accepted for display.
HEAD_SAFETY_MARGIN = 0.08
_FIT_EPSILON = 1e-6


@dataclass(frozen=True)
class CropGeometry:
    """Resize-to-cover geometry used by the face-aware crop."""

    resized_size: tuple[int, int]
    crop_box: tuple[int, int, int, int]
    head_boxes: list[tuple[float, float, float, float]]


def _cover_size(img_w: int, img_h: int, target_w: int, target_h: int) -> tuple[int, int]:
    img_aspect = img_w / img_h
    target_aspect = target_w / target_h
    if img_aspect < target_aspect:
        return target_w, math.ceil(target_w / img_aspect)
    return math.ceil(target_h * img_aspect), target_h


def _head_boxes(
    faces: list[dict], img_w: int, img_h: int, new_w: int, new_h: int
) -> list[tuple[float, float, float, float]]:
    boxes = []
    for f in faces:
        sx = new_w / (f.get("imageWidth") or img_w)
        sy = new_h / (f.get("imageHeight") or img_h)
        x1 = f["boundingBoxX1"] * sx
        y1 = f["boundingBoxY1"] * sy
        x2 = f["boundingBoxX2"] * sx
        y2 = f["boundingBoxY2"] * sy
        face_w = x2 - x1
        face_h = y2 - y1
        x_margin = face_w * HEAD_SAFETY_MARGIN
        y_margin = face_h * HEAD_SAFETY_MARGIN
        boxes.append(
            (
                x1 - x_margin,
                y1 - face_h * HEAD_EXTENSION - y_margin,
                x2 + x_margin,
                y2 + y_margin,
            )
        )
    return boxes


def face_aware_crop_geometry(
    image_size: tuple[int, int], target_w: int, target_h: int, faces: list[dict]
) -> CropGeometry:
    """Return the resize size, crop box, and safety-expanded head boxes."""
    img_w, img_h = image_size
    new_w, new_h = _cover_size(img_w, img_h, target_w, target_h)

    cx, cy = new_w / 2, new_h / 2
    boxes = _head_boxes(faces, img_w, img_h, new_w, new_h) if faces else []
    if boxes:
        cx = (min(b[0] for b in boxes) + max(b[2] for b in boxes)) / 2
        cy = (min(b[1] for b in boxes) + max(b[3] for b in boxes)) / 2

    y_anchor = target_h / 3 if img_h > img_w and target_w > target_h else target_h / 2
    x_off = max(0, min(int(cx - target_w / 2), new_w - target_w))
    y_off = max(0, min(int(cy - y_anchor), new_h - target_h))
    return CropGeometry(
        resized_size=(new_w, new_h),
        crop_box=(x_off, y_off, x_off + target_w, y_off + target_h),
        head_boxes=boxes,
    )


def heads_fit_in_crop_size(
    image_size: tuple[int, int], target_w: int, target_h: int, faces: list[dict]
) -> bool:
    """True when the face-aware crop keeps every visible head area inside."""
    if not faces:
        return True

    geometry = face_aware_crop_geometry(image_size, target_w, target_h, faces)
    new_w, new_h = geometry.resized_size
    crop_x1, crop_y1, crop_x2, crop_y2 = geometry.crop_box
    return all(
        max(0, head_x1) >= crop_x1 - _FIT_EPSILON
        and max(0, head_y1) >= crop_y1 - _FIT_EPSILON
        and min(new_w, head_x2) <= crop_x2 + _FIT_EPSILON
        and min(new_h, head_y2) <= crop_y2 + _FIT_EPSILON
        for head_x1, head_y1, head_x2, head_y2 in geometry.head_boxes
    )


def heads_fit_in_crop(image: Image.Image, target_w: int, target_h: int, faces: list[dict]) -> bool:
    """True when `face_aware_crop` would keep all heads inside the output frame."""
    return heads_fit_in_crop_size(image.size, target_w, target_h, faces)

def face_aware_crop( def face_aware_crop(
image: Image.Image, target_w: int, target_h: int, faces: list[dict] image: Image.Image, target_w: int, target_h: int, faces: list[dict]
@ -25,35 +122,7 @@ def face_aware_crop(
the top third of the crop window (rule of thirds) instead of the middle. the top third of the crop window (rule of thirds) instead of the middle.
Plain centre crop when no faces. Plain centre crop when no faces.
""" """
img_w, img_h = image.size geometry = face_aware_crop_geometry(image.size, target_w, target_h, faces)
img_aspect = img_w / img_h new_w, new_h = geometry.resized_size
target_aspect = target_w / target_h
if img_aspect < target_aspect:
new_w = target_w
new_h = math.ceil(target_w / img_aspect)
else:
new_w = math.ceil(target_h * img_aspect)
new_h = target_h
resized = image.resize((new_w, new_h), Image.LANCZOS) resized = image.resize((new_w, new_h), Image.LANCZOS)
return resized.crop(geometry.crop_box)
cx, cy = new_w / 2, new_h / 2
if faces:
boxes = []
for f in faces:
sx = new_w / (f.get("imageWidth") or img_w)
sy = new_h / (f.get("imageHeight") or img_h)
x1 = f["boundingBoxX1"] * sx
y1 = f["boundingBoxY1"] * sy
x2 = f["boundingBoxX2"] * sx
y2 = f["boundingBoxY2"] * sy
boxes.append((x1, y1, x2, y2))
cx = (min(b[0] for b in boxes) + max(b[2] for b in boxes)) / 2
y_lo_ext = min(b[1] - (b[3] - b[1]) * HEAD_EXTENSION for b in boxes)
y_hi = max(b[3] for b in boxes)
cy = (y_lo_ext + y_hi) / 2
y_anchor = target_h / 3 if img_h > img_w and target_w > target_h else target_h / 2
x_off = max(0, min(int(cx - target_w / 2), new_w - target_w))
y_off = max(0, min(int(cy - y_anchor), new_h - target_h))
return resized.crop((x_off, y_off, x_off + target_w, y_off + target_h))
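
Reviewer note: the head-fit check added in this file is easier to follow with concrete numbers. The block below is a minimal sketch, not the repository's code. It restates the same arithmetic with faces given as plain `(x1, y1, x2, y2)` tuples in source-pixel coordinates (an assumption made for brevity; the real code reads Immich's `boundingBox*` keys), it omits `_FIT_EPSILON`, and the names `cover_size`, `padded_head_box`, and `heads_fit` are hypothetical.

```python
import math

HEAD_EXTENSION = 0.4       # value taken from the diff
HEAD_SAFETY_MARGIN = 0.08  # value taken from the diff

Box = tuple[float, float, float, float]


def cover_size(img_w: int, img_h: int, target_w: int, target_h: int) -> tuple[int, int]:
    """Smallest resize of the image that still covers the target in both dimensions."""
    if img_w / img_h < target_w / target_h:
        return target_w, math.ceil(target_w * img_h / img_w)
    return math.ceil(target_h * img_w / img_h), target_h


def padded_head_box(face: Box, sx: float, sy: float) -> Box:
    """Scale a face box, extend it upward to cover the head, then pad every side."""
    x1, y1, x2, y2 = face[0] * sx, face[1] * sy, face[2] * sx, face[3] * sy
    w, h = x2 - x1, y2 - y1
    return (
        x1 - w * HEAD_SAFETY_MARGIN,
        y1 - h * HEAD_EXTENSION - h * HEAD_SAFETY_MARGIN,
        x2 + w * HEAD_SAFETY_MARGIN,
        y2 + h * HEAD_SAFETY_MARGIN,
    )


def heads_fit(image_size: tuple[int, int], target: tuple[int, int], faces: list[Box]) -> bool:
    """True when the crop centred on the joint head span contains every padded head."""
    if not faces:
        return True
    img_w, img_h = image_size
    target_w, target_h = target
    new_w, new_h = cover_size(img_w, img_h, target_w, target_h)
    boxes = [padded_head_box(f, new_w / img_w, new_h / img_h) for f in faces]

    # Centre the crop on the joint span of all padded head boxes.
    cx = (min(b[0] for b in boxes) + max(b[2] for b in boxes)) / 2
    cy = (min(b[1] for b in boxes) + max(b[3] for b in boxes)) / 2
    y_anchor = target_h / 3 if img_h > img_w and target_w > target_h else target_h / 2
    x_off = max(0, min(int(cx - target_w / 2), new_w - target_w))
    y_off = max(0, min(int(cy - y_anchor), new_h - target_h))

    # Clamp each box to the resized image first: padding that already lies
    # outside the photo cannot be cut off by the crop.
    return all(
        max(0, x1) >= x_off
        and max(0, y1) >= y_off
        and min(new_w, x2) <= x_off + target_w
        and min(new_h, y2) <= y_off + target_h
        for x1, y1, x2, y2 in boxes
    )


centred = [(700, 400, 900, 600)]
spread = [(100, 400, 300, 600), (1300, 400, 1500, 600)]
print(heads_fit((1600, 1200), (480, 800), centred))  # True
print(heads_fit((1600, 1200), (480, 800), spread))   # False
print(heads_fit((1600, 1200), (800, 480), spread))   # True
```

For a 1600x1200 source and a 480x800 portrait target, the resized image is 1067x800 and the crop window is only 480 px wide. A single centred face fits, while two faces near opposite edges cannot both fit, so that photo would be rejected for the portrait frame but accepted for the landscape one.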


@ -8,16 +8,20 @@ from datetime import UTC, datetime, timedelta
from pathlib import Path from pathlib import Path
from urllib.request import Request from urllib.request import Request
from crop import heads_fit_in_crop
from net import urlopen_with_retry from net import urlopen_with_retry
from PIL import Image
HISTORY_FILE = Path(__file__).parent.parent / "photo_history.json" HISTORY_FILE = Path(__file__).parent.parent / "photo_history.json"
CACHE_DIR = Path(tempfile.gettempdir()) / "frame_cache" CACHE_DIR = Path(tempfile.gettempdir()) / "frame_cache"
# Soft preference for picking photos whose orientation matches the frame. # Soft preference for picking photos whose orientation matches the frame.
# Mismatched-orientation photos still appear, just less often, since # Mismatched-orientation photos still appear, just less often, since
# face_aware_crop handles them via the rule-of-thirds composition. # face_aware_crop can often compose them without losing heads.
ORIENTATION_MATCH_WEIGHT = 0.8 ORIENTATION_MATCH_WEIGHT = 0.8
ORIENTATION_DIFFER_WEIGHT = 0.2 ORIENTATION_DIFFER_WEIGHT = 0.2
FRAME_LANDSCAPE = (800, 480)
FRAME_PORTRAIT = (480, 800)
_ROTATED_EXIF_ORIENTATIONS = {5, 6, 7, 8, "5", "6", "7", "8"} _ROTATED_EXIF_ORIENTATIONS = {5, 6, 7, 8, "5", "6", "7", "8"}
@ -184,6 +188,11 @@ def _bias_by_orientation(candidates: list[dict], frame_portrait: bool) -> list[d
return pool return pool
def target_size_for_orientation(orientation: int) -> tuple[int, int]:
"""Pre-rotation crop target for the Waveshare panel."""
return FRAME_PORTRAIT if orientation in (90, 270) else FRAME_LANDSCAPE
def _on_this_day_candidates(assets: list[dict]) -> tuple[list[dict], bool]: def _on_this_day_candidates(assets: list[dict]) -> tuple[list[dict], bool]:
"""Photos taken on today's month-day in past years, with a ±3-day fallback. """Photos taken on today's month-day in past years, with a ±3-day fallback.
@ -241,6 +250,75 @@ def _pick_weighted_random(assets: list[dict]) -> dict:
return random.choice(pool) return random.choice(pool)
def _asset_label(asset: dict) -> str:
    return asset.get("originalFileName") or asset.get("originalPath") or asset.get("id", "unknown")


def _download_if_heads_fit(
    client: ImmichClient, asset: dict, target_w: int, target_h: int
) -> tuple[Path, dict] | None:
    faces = client.get_asset_faces(asset["id"])
    with tempfile.NamedTemporaryFile(prefix="immich_photo_", suffix=".jpg", delete=False) as tmp:
        dest = Path(tmp.name)
    path = client.download_asset(asset["id"], dest)

    try:
        if faces:
            with Image.open(path) as img:
                fits = heads_fit_in_crop(img, target_w, target_h, faces)
            if not fits:
                path.unlink(missing_ok=True)
                print(
                    f"Rejected photo: {_asset_label(asset)} "
                    f"(heads do not fit {target_w}x{target_h} crop)"
                )
                return None
    except Exception:
        path.unlink(missing_ok=True)
        raise

    selected = dict(asset)
    selected["_faces"] = faces
    return path, selected


def _pick_eligible_and_download(
    client: ImmichClient,
    candidates: list[dict],
    target_w: int,
    target_h: int,
    rejected_ids: set[str],
) -> tuple[Path, dict] | None:
    remaining = [a for a in candidates if a.get("id") not in rejected_ids]
    while remaining:
        asset = _pick_weighted_random(remaining)
        asset_id = asset["id"]
        result = _download_if_heads_fit(client, asset, target_w, target_h)
        if result is not None:
            return result
        rejected_ids.add(asset_id)
        remaining = [a for a in remaining if a.get("id") not in rejected_ids]
    return None


def _pick_eligible_with_orientation_bias(
    client: ImmichClient,
    candidates: list[dict],
    target_w: int,
    target_h: int,
    frame_portrait: bool,
    rejected_ids: set[str],
) -> tuple[Path, dict] | None:
    biased_candidates = _bias_by_orientation(candidates, frame_portrait)
    result = _pick_eligible_and_download(
        client, biased_candidates, target_w, target_h, rejected_ids
    )
    if result is None and len(biased_candidates) < len(candidates):
        print("No eligible photos in picked orientation pool, trying other orientations")
        result = _pick_eligible_and_download(client, candidates, target_w, target_h, rejected_ids)
    return result

def _pick_and_download( def _pick_and_download(
client: ImmichClient, assets: list[dict], orientation: int, source_label: str client: ImmichClient, assets: list[dict], orientation: int, source_label: str
) -> tuple[Path, dict]: ) -> tuple[Path, dict]:
@ -249,18 +327,33 @@ def _pick_and_download(
displayed, created_at = _load_history() displayed, created_at = _load_history()
candidates = [a for a in assets if a.get("id") not in displayed] candidates = [a for a in assets if a.get("id") not in displayed]
history_filtered = len(candidates) < len(assets)
if not candidates: if not candidates:
print(f"All {len(assets)} photos shown, picking from full list") print(f"All {len(assets)} photos shown, picking from full list")
candidates = assets candidates = assets
else: else:
print(f"Photos: {len(candidates)} new / {len(assets)} total") print(f"Photos: {len(candidates)} new / {len(assets)} total")
candidates = _bias_by_orientation(candidates, orientation in (90, 270)) target_w, target_h = target_size_for_orientation(orientation)
rejected_ids: set[str] = set()
frame_portrait = orientation in (90, 270)
result = _pick_eligible_with_orientation_bias(
client, candidates, target_w, target_h, frame_portrait, rejected_ids
)
asset = _pick_weighted_random(candidates) if result is None and history_filtered:
with tempfile.NamedTemporaryFile(prefix="immich_photo_", suffix=".jpg", delete=False) as tmp: print("No eligible new photos after head-fit checks, picking from full list")
dest = Path(tmp.name) result = _pick_eligible_with_orientation_bias(
path = client.download_asset(asset["id"], dest) client, assets, target_w, target_h, frame_portrait, rejected_ids
)
if result is None:
raise ValueError(
f"No photos in {source_label} can be cropped to {target_w}x{target_h} "
"without cutting off heads"
)
path, asset = result
displayed.add(asset["id"]) displayed.add(asset["id"])
_save_history(displayed, created_at) _save_history(displayed, created_at)
return path, asset return path, asset
@ -291,4 +384,4 @@ def get_random_photo_from_album(
if not assets: if not assets:
raise ValueError(f"No photos in album: {album_name}") raise ValueError(f"No photos in album: {album_name}")
return _pick_and_download(client, assets, orientation, f"album: {album_name}") return _pick_and_download(client, assets, orientation, f"album {album_name!r}")