schmelczer-dev/src/content/posts/fleeting-garden-webgpu-drawing.md at db8d4597df1a92914f10bcf53e338ca3cc7aba38

Andras Schmelczer db8d4597df Maybe clean up

2026-05-25 09:49:09 +01:00

The Problem

Physarum-style agent simulations are a well-trodden idea. Sense the surrounding trail, turn toward what you like, deposit a bit of your own colour, repeat. Drop a million of these on a texture and you get the familiar branching networks that look biological from a distance.

The interesting question is not how to make one run. It is how to make one feel like something specific. A generic physarum visual converges to the same family of structures regardless of input, which is why so many of them stop being interesting after the first thirty seconds. User input has to do more than seed the initial condition; it has to remain a force inside the system.

The second part of the problem is variety. The same engine had to produce visibly different behaviour under different presets, so that switching vibes felt like changing seasons rather than nudging one slider. That ruled out separate behaviour code per preset, which had been the obvious shape of the first prototype and had not survived contact with the second preset.

Constraints

The toy had to be a single static file. No server, no account, no save state. Open the URL, draw, close the tab. That is the deal the metaphor makes with the user, and the deployment story falls out of it: vite build produces one HTML file, which a CI job rsyncs to a static host.

It had to be WebGPU only. Compute shaders are the right tool for this kind of simulation, and writing a Canvas2D or WebGL fallback would have meant either a second implementation or a watered-down primary one. The browserslist query is literally `supports webgpu and last 2 years`, and anything older gets a clear message instead of a degraded experience.
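As a rough illustration of the "clear message" path, a feature check might look like the sketch below. The names `NavigatorLike`, `hasWebGpu`, and `startupMessage`, and the message text, are hypothetical; the post does not show the real check.

```typescript
// Minimal shape of the object being probed; in a browser this would be `navigator`.
type NavigatorLike = { gpu?: unknown };

// Returns true when the WebGPU entry point is exposed at all.
function hasWebGpu(nav: NavigatorLike | undefined): boolean {
  return nav !== undefined && nav.gpu !== undefined && nav.gpu !== null;
}

// Hypothetical wording; returns null when the app can start normally.
function startupMessage(nav: NavigatorLike | undefined): string | null {
  return hasWebGpu(nav)
    ? null
    : "This page needs WebGPU. Please try a recent browser with WebGPU enabled.";
}
```

The presence of `navigator.gpu` is only the first gate. `navigator.gpu.requestAdapter()` can still resolve to `null` on unsupported hardware, so a fuller check would await that as well.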

It had to run on consumer hardware at sixty frames per second. The number of agents is the obvious lever, so it had to be adaptive. The number of WGSL pipelines is the less obvious one, so the architecture had to keep each frame's compute work split across a small number of focused shaders rather than one fat kernel.

Design

The simulation is split into six compute stages, written across ten WGSL files. Each stage has one job:

- Agent step advances every agent by one frame. It samples the trail texture at a sensor offset, picks a turn direction, moves, and deposits a small amount of colour into the next frame's trail texture.
- Diffusion blurs and decays the trail texture, so old marks soften and disappear.
- Brush writes user strokes into the trail texture and a separate "source" texture that the agent shader can read.
- Eraser has two variants. One clears a region of the trail texture, the other kills agents inside the eraser radius.
- Agent generation handles spawning new agents along a stroke, resizing the agent buffer when the cap changes, and compacting the buffer after erasure so dead slots do not waste GPU time.
- Render reads the final trail texture and produces the canvas image, with the palette and grain applied at the last moment.

Most of these are a few dozen lines of WGSL, and the longest one (agent step) is under 300. Keeping them small is what made the simulation tunable; once they grew tangled, the tuning loop slowed to a crawl.
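To make the diffusion stage concrete, here is a CPU-side sketch in TypeScript of one blur-and-decay pass over a single channel. It is not the WGSL from the project: the 3×3 box kernel, the edge handling, and the `decay` parameter are assumptions chosen to keep the example short.

```typescript
// One diffusion pass over a single-channel trail stored row-major.
// `decay` in [0, 1] is a hypothetical parameter: 1 keeps everything, 0 clears.
function diffuse(src: Float32Array, width: number, height: number, decay: number): Float32Array {
  const dst = new Float32Array(src.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let sum = 0;
      let count = 0;
      // Average the 3x3 neighbourhood, skipping cells outside the texture.
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          const nx = x + dx;
          const ny = y + dy;
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          sum += src[ny * width + nx];
          count++;
        }
      }
      // Blur first, then fade, so old marks both soften and disappear.
      dst[y * width + x] = (sum / count) * decay;
    }
  }
  return dst;
}
```

Writing into a fresh `dst` rather than updating in place mirrors the read-one-texture, write-the-next pattern the agent step is described as using.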

The Reaction Matrix

The piece of the design I would defend hardest is the reaction matrix. Each vibe carries a 3×3 table of colour-to-colour affinities. When an agent of colour i senses the trail in front of it, the three channels of that sample are weighted by row i of the matrix to decide whether to turn left, turn right, or hold course. That is the entire behaviour rule.

The matrix is nine numbers in {-1, 0, 1}, and it captures most of what makes the six vibes feel different. Aurora Mycelium has a cyclic preference where each colour chases the next, so its agents wind into ribbons. Velvet Observatory has every off-diagonal entry negative, so the colours repel each other and settle into separate islands. Paper Lantern Fog has the matrix filled with ones, which collapses the three colours into one cooperative blob.
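The three tables described above can be written out as a sketch. Only the relationships stated in the prose are taken from the post; the diagonal entries, the remaining cells of the cyclic table, and the use of three sensors (left, ahead, right) are assumptions borrowed from common physarum setups, and every name here is hypothetical.

```typescript
type Row = [number, number, number];
type Matrix = [Row, Row, Row];
type Rgb = [number, number, number];

// Entries the post does not state are guesses.
const cyclic: Matrix = [
  [0, 1, 0], // colour 0 is drawn to colour 1
  [0, 0, 1], // colour 1 is drawn to colour 2
  [1, 0, 0], // colour 2 is drawn to colour 0
];
const repelling: Matrix = [
  [1, -1, -1], // every off-diagonal entry negative
  [-1, 1, -1],
  [-1, -1, 1],
];
const cooperative: Matrix = [
  [1, 1, 1], // all ones: the colours behave as one
  [1, 1, 1],
  [1, 1, 1],
];

// Attraction of an agent of colour `i` to one trail sample: dot(row i, sample).
function affinity(m: Matrix, i: 0 | 1 | 2, sample: Rgb): number {
  const row = m[i];
  return row[0] * sample[0] + row[1] * sample[1] + row[2] * sample[2];
}

// Returns -1 (turn left), 0 (hold course) or 1 (turn right).
// Ties between left and right resolve to the right in this sketch.
function steer(m: Matrix, i: 0 | 1 | 2, left: Rgb, ahead: Rgb, right: Rgb): -1 | 0 | 1 {
  const l = affinity(m, i, left);
  const a = affinity(m, i, ahead);
  const r = affinity(m, i, right);
  if (a >= l && a >= r) return 0;
  return l > r ? -1 : 1;
}
```

With the cyclic table, a colour-0 agent that finds colour 1 on its left turns toward it; with the repelling table the same agent holds course instead, because that sample now scores below empty space.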

Putting the personality of a vibe in a small, legible matrix was deliberate. The earlier prototype had a behaviour function per preset, and that route did not survive the second vibe — every new mood became a new branch in a switch statement. A 3×3 matrix is small enough that I can read it and predict the rough shape of the result, which made tuning new vibes a matter of editing a table rather than writing code.

Input and Mirroring

The drawing pipeline is intentionally simple. A pointer event becomes a series of stroke segments, each segment spawns agents along its length, and the agents' initial angle points along the stroke with a small amount of jitter. The mirror slider folds each stroke into N copies rotated around the centre, which is the cheapest way I could think of to give the user a sense of composition without a layers panel.
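The mirror fold amounts to rotating each stroke point around the canvas centre in equal angular steps. A minimal sketch, assuming plain rotational symmetry and using the hypothetical names `Point` and `mirrorPoint`:

```typescript
type Point = { x: number; y: number };

// Returns `copies` points: the original plus its rotations about `centre`
// in equal steps of 2π / copies. `copies` stands in for the slider value.
function mirrorPoint(p: Point, centre: Point, copies: number): Point[] {
  const out: Point[] = [];
  const dx = p.x - centre.x;
  const dy = p.y - centre.y;
  for (let k = 0; k < copies; k++) {
    const angle = (2 * Math.PI * k) / copies;
    const cos = Math.cos(angle);
    const sin = Math.sin(angle);
    // Standard 2D rotation of the offset, then translate back to the centre.
    out.push({ x: centre.x + dx * cos - dy * sin, y: centre.y + dx * sin + dy * cos });
  }
  return out;
}
```

Applying this to both endpoints of a segment yields the mirrored segments. The agents' initial heading would need the same rotation so that they still point along their copy of the stroke.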

Spawning competes with an adaptive cap. If the framerate drops below the target, the cap shrinks; if there is headroom, it grows. When the cap is hit, new agents overwrite older ones in a circular buffer. That overwrite is what gives the garden its decay: a stroke you drew thirty seconds ago is gone not because anything erased it, but because its agents have been replaced.
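The cap adjustment and the overwrite behaviour can be sketched on the CPU as follows. The constants, the 0.9 and 1.05 factors, and the names `nextCap` and `AgentRing` are all hypothetical; the post gives the behaviour but not the numbers, and the real buffer lives on the GPU.

```typescript
// Hypothetical tuning constants; the post does not give the real ones.
const TARGET_FRAME_MS = 1000 / 60;
const MIN_CAP = 1000;
const MAX_CAP = 1000000;

// Shrink the cap when a frame ran over budget, grow it when there was headroom.
function nextCap(cap: number, frameMs: number): number {
  const factor = frameMs > TARGET_FRAME_MS ? 0.9 : 1.05;
  return Math.min(MAX_CAP, Math.max(MIN_CAP, Math.round(cap * factor)));
}

// Fixed-capacity store where a spawn past the cap overwrites the oldest slot.
class AgentRing<T> {
  private slots: T[] = [];
  private next = 0;
  private cap: number;

  constructor(cap: number) {
    this.cap = cap;
  }

  spawn(agent: T): void {
    if (this.slots.length < this.cap) {
      this.slots.push(agent); // still filling up
    } else {
      this.slots[this.next] = agent; // full: replace the oldest agent
    }
    this.next = (this.next + 1) % this.cap;
  }

  agents(): readonly T[] {
    return this.slots;
  }
}
```

This sketch keeps the capacity fixed for the lifetime of the ring. Resizing when the cap changes, which the agent generation stage handles, is left out.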

Vibes as URLs

Switching vibes is the only stateful action in the app, and the chosen vibe is encoded in the URL query string. That makes the link itself the share format. A snapshot is a PNG you download; a "send your friend this preset" is a URL with ?vibe=tidepool-lantern on the end. The URL parser is tolerant about accents, casing, and whitespace, because the names are the kind of thing people retype rather than copy.
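A tolerant parser of this kind can be quite small. The sketch below is one possible shape rather than the project's code: the slug list is inferred from the names in the post (only `tidepool-lantern` appears verbatim), and `normaliseVibeName` and `parseVibe` are hypothetical names.

```typescript
// Slugs inferred from the vibe names mentioned in the post.
const KNOWN_VIBES = ["aurora-mycelium", "velvet-observatory", "paper-lantern-fog", "tidepool-lantern"];

// Lower-cases, strips accents, and collapses whitespace or underscores into hyphens.
function normaliseVibeName(raw: string): string {
  return raw
    .normalize("NFD") // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .trim()
    .replace(/[\s_]+/g, "-");
}

// Returns the matching slug, or null so the caller can fall back to a default.
function parseVibe(raw: string | null): string | null {
  if (raw === null) return null;
  const slug = normaliseVibeName(raw);
  return KNOWN_VIBES.indexOf(slug) >= 0 ? slug : null;
}
```

In a browser the raw value would typically come from `new URLSearchParams(location.search).get("vibe")`, which returns `null` when the parameter is absent.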

What Worked

The reaction matrix earned its place. Six presets later, I have not had to extend it. Every new vibe so far has been a recolouring plus a different table, sometimes with tweaks to the diffusion or sensor parameters, and the underlying simulation has not changed. At this scale, configuration is cheaper to evolve than code. Adding a tenth number to the matrix would be a tax on every existing vibe; tuning the nine I have is a few minutes of editing a file.

Splitting the compute work across small WGSL stages held up for the same reason in a different form. When the agent-erase shader started killing the wrong agents, I could open one short file and reason about it without touching anything else. The cost of running more pipelines is the bind-group setup, and that was lost in the noise compared to the simulation work itself.

The single-file build is the part I underestimated. The whole app, including all CSS and JavaScript, is one HTML file; the piano samples sit beside it and are preloaded at startup. That makes deployment trivial — rsync and done — but the part that actually matters is that the file is self-contained enough to hand around. I can attach it to an email or drop it on a USB stick and it runs offline, which is the closest a web app gets to feeling like an object.

What I Would Change

The intro animation cost more than it should have. Agents fly in from off-screen to spell out the title, then transition to steady-state behaviour. The choreography is tied to a single `progress` value running from 0 to 1 that bleeds into timing, easing, and target positions across three different shaders, and that coupling is what makes the intro the part of the code I would least want to refactor today. If I rebuilt this, I would model the intro as its own dispatch with its own agent buffer and hand off to the steady-state pipeline at the boundary.

Property tests would help more than I expected. The simulation has invariants that hand-written unit tests are bad at finding — agent count stays under the cap, every drawn stroke produces a positive-coloured deposit on the next frame, the eraser does not leak agents past its radius — and these are exactly the shape of claim a generator-based test would falsify quickly.
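The shape of such a test, for the first invariant only, might look like the sketch below. `Model` is a hypothetical stand-in for the real agent store, and the hand-rolled generator replaces a proper property-testing library purely to keep the example dependency-free.

```typescript
// Tiny deterministic generator so a failure can be replayed from its seed.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Hypothetical stand-in for the real agent store: just a count and a cap.
class Model {
  count = 0;
  cap = 100;
  spawn(n: number): void {
    this.count = Math.min(this.cap, this.count + n);
  }
  erase(n: number): void {
    this.count = Math.max(0, this.count - n);
  }
  setCap(cap: number): void {
    this.cap = cap;
    this.count = Math.min(this.count, cap);
  }
}

// Runs random operation sequences and returns the first seed that breaks the
// invariant "agent count stays within [0, cap]", or null if none does.
function findCounterexample(runs: number, steps: number): number | null {
  for (let seed = 1; seed <= runs; seed++) {
    const rng = makeRng(seed);
    const model = new Model();
    for (let i = 0; i < steps; i++) {
      const op = Math.floor(rng() * 3);
      const n = Math.floor(rng() * 200);
      if (op === 0) model.spawn(n);
      else if (op === 1) model.erase(n);
      else model.setCap(n + 1);
      if (model.count < 0 || model.count > model.cap) return seed;
    }
  }
  return null;
}
```

A dedicated library such as fast-check would add shrinking, which reduces a failing sequence to a minimal one. The stroke-deposit and eraser-radius invariants would need the simulation itself under test rather than a counter.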

The mobile experience is good enough rather than good. Pointer events behave, but small screens make the toolbar fight the canvas for space, and the agent cap has to shrink hard to keep the framerate up. A real fix means rethinking the toolbar layout and probably making the cap-versus-resolution tradeoff a user-visible choice.

The part I would keep is the asymmetry. You shape the gesture; the garden owns the response. The trail decay and the refusal of save state both look like missing features in isolation, and both stop looking that way the moment the garden is allowed to be fleeting. Most of the rest of the design is what fell out of taking that idea seriously.
