
Running Local Inference on the New Gemma Models — From Departmental Hardware
How small teams are deploying quantised Gemma models on commodity GPUs to run private, offline pipelines. No cloud, no data leaving the building.
7 pieces on sole traders — practical workflows, case studies and field notes.

Stanford's Hazy Research has shipped the first credible open-source framework for personal AI agents that run on your own hardware. For UK operators, local-first has stopped being a manifesto and started being a curl command.

Anthropic just dropped Claude Fable 5 into the $20 tier and MiniMax M3 matches it on agentic work. For a small team, the value question has quietly flipped.

A 27B model that reportedly tops consumer-hardware leaderboards and fits in a single 24GB card at Q4. For a sole trader or a small professional-services team, that is the sweet spot worth understanding.

An illustrative reconciliation scenario, grounded in reported UK retail AI pilots, shows how automating vendor mapping turns a dreaded finance chore into a background task — and what it saves.

Beyond the headline funds, the government runs practical support most small firms have never heard of. Here's what BridgeAI and the AI Skills Boost actually offer, who delivers them, and how to use them.

May 2026's runtime updates look like housekeeping. For a solo operator running models on a MacBook, they quietly remove some of the friction that makes local AI feel like hard work.