---
title: A Python Framework Where Doing the Right Thing Is the Default
description: My MSc thesis. 33 catalogued ML deployment habits, a decorator-shaped Python API, and a survey of working engineers on which actually got adopted.
date: 2026-05-09
projectPeriod: '2022'
thumbnail:
  src: ./_assets/great-ai.png
  alt: Example Python code using the GreatAI API.
tags: ['ai', 'systems', 'tools']
featuredOrder: 1
role: Researcher and framework author
stack: ['Python', 'decorators', 'FastAPI', 'survey design']
scale: 33 deployment habits surveyed, 6 proposed additions, framework evaluated by working data scientists and engineers
outcome: A pip-installable framework, an MSc thesis, and one strong opinion about API surface area
audience: recruiter-relevant
links:
  - label: PyPI
    url: https://pypi.org/project/great-ai/
  - label: Project site
    url: https://great-ai.scoutinscience.com
  - label: MSc thesis
    url: /media/downloads/great-ai-andras-schmelczer.pdf
    download: true
media:
  - type: image
    src: ./_assets/great-ai.png
    alt: Example Python code using GreatAI decorators and prediction helpers.
    caption: A working GreatAI service is about ten lines on top of a plain prediction function.
---

By the end of 2021 I had stopped believing the people skipping ML deployment best practices were the problem. They knew the list. They agreed with the list. They had a deadline, and every item on the list cost five lines of glue. My MSc thesis turned that into the actual research question: not "what should engineers do" but "what API shape makes doing the right thing cheaper than not." The framework that fell out, `great-ai`, is a decorator on a plain Python function. The thesis behind it is the part worth reading.

## The thing nobody wants to admit

The literature has a long list of habits you should adopt when shipping an ML service: track inputs, version models, expose health checks, log decisions, keep predictions reproducible. Everyone agrees with the list. Almost nobody implements all of it.

I spent the bulk of the thesis cataloguing 33 such habits, proposing 6 more, and surveying engineers on which ones actually got applied in their day jobs. The data was pretty clear about the failure mode: it wasn't ignorance, it wasn't laziness, it wasn't budget. It was that the cost of doing the right thing, five lines of glue per habit multiplied across a stack, was higher than the visible cost of skipping it. So skipping it became the default.

So the real research question wasn't "what should engineers do." It was "what API shape makes doing the right thing cheaper than not."

## The framework's bet

- **A decorator on a plain function.** `@GreatAI.create` turns a regular Python function into a deployed service with metadata, request tracing, and a versioned interface. No inheritance, no project layout, no enforced directory structure. The mental cost is one import.
- **Implicit behaviour only for cross-cutting concerns.** Logging, versioning, and metadata are implicit. Anything touching business logic stays explicit. The rule: if it would surprise me when I'm debugging, it shouldn't be implicit.
- **Own the contract, leave the storage alone.** Where you persist logs, models, or metrics is your choice; GreatAI defines the shape and provides defaults. The model registry stays somebody else's library.

The survey backed up the central premise: ease of use and functionality both matter for adoption, and they're independent axes. A framework that ticks every box but is awkward to use will lose to a smaller one that is easy to pick up.
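
The three bets in this section can be sketched with nothing but the standard library. This is a rough illustration of the pattern, not GreatAI's implementation: `service`, `Trace`, `TraceStore`, and `InMemoryStore` are hypothetical names invented for the example, and the real `@GreatAI.create` does considerably more.

```python
import functools
import inspect
import time
import uuid
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Protocol


@dataclass
class Trace:
    """Hypothetical record of one call; not GreatAI's actual schema."""

    trace_id: str
    function: str
    version: str
    inputs: Dict[str, Any]
    output: Any
    seconds: float


class TraceStore(Protocol):
    """The contract: any object with a matching save() can be the storage."""

    def save(self, trace: Trace) -> None: ...


class InMemoryStore:
    """Default storage, so the decorator works with zero configuration."""

    def __init__(self) -> None:
        self.traces: List[Trace] = []

    def save(self, trace: Trace) -> None:
        self.traces.append(trace)


DEFAULT_STORE = InMemoryStore()


def service(version: str = "0.1.0", store: TraceStore = DEFAULT_STORE) -> Callable:
    """Wrap a plain function so every call is traced and versioned implicitly."""

    def decorate(func: Callable) -> Callable:
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Trace:
            # Cross-cutting concern: capture the inputs by parameter name.
            inputs = dict(signature.bind(*args, **kwargs).arguments)
            started = time.perf_counter()
            output = func(*args, **kwargs)  # business logic stays untouched
            trace = Trace(
                trace_id=str(uuid.uuid4()),
                function=func.__name__,
                version=version,
                inputs=inputs,
                output=output,
                seconds=time.perf_counter() - started,
            )
            store.save(trace)  # where it ends up is the caller's choice
            return trace

        return wrapper

    return decorate


@service(version="1.2.0")
def predict_sentiment(text: str) -> str:
    return "positive" if "good" in text.lower() else "negative"


result = predict_sentiment("A good day")
print(result.output, result.version, len(DEFAULT_STORE.traces))
```

The prediction function itself contains no logging, versioning, or timing code; all of that lives in the wrapper. Passing `store=` with any object that has a compatible `save` method swaps the persistence layer without touching the function.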

## What I'd change

- I'd narrow further. Anything GreatAI did that overlapped with MLflow, BentoML, or modern observability stacks would go. The durable bit was always the decorator and the catalogue behind it.
- I'd publish the survey instrument separately. The 33-habit catalogue and the adoption-vs-impact methodology outlive the framework. People still ask about that part.
- I'd stop calling them "best practices." I used that phrase in the thesis and it aged into corporate-speak. The honest name is "things that hurt later if you skip them."