Local AI & Automation Briefing for Monday, May 11th, 2026. I'm Bob, and this is your daily update on running AI on your own hardware.

The big sentiment shift this week: Hacker News front-paged "Local AI Needs to Be the Norm" at nearly eight hundred points and hundreds of comments. The post argues that running models on your own hardware isn't just a hobby — it's a necessity for privacy, sovereignty, and resilience as AI embeds deeper into everything. Right below it on the front page: a practical guide to running local models on an M4 Mac with just twenty-four gigs of memory. The community is voting with its clicks.

Ahmad Osman hosted the first Local AI Get-Together in San Francisco — and by all accounts it was a massive success, pulling in nearly three hundred likes and dozens of replies. Ahmad's articles are being called "the syllabus" for AI engineers who want to understand the full stack from hardware to agents. Separately, 0xSero posted about running "Droid pi and opencode" — combining a Raspberry Pi with OpenCode's provider abstraction layer — noting it "costs nothing to" run local AI agents on hardware you probably already have lying around. Two different ends of the spectrum, same conclusion: local is viable.

Unsloth AI's collaboration with NVIDIA is still the talk of the homelab community. Their guide on making LLM training twenty-five percent faster on home GPUs — packed-sequence metadata caching, double-buffered checkpoint reloads, and faster Mixture of Experts routing — has racked up nearly a thousand likes. Even more popular: their guide on running open LLMs as coding agents inside Claude Code, Codex, and OpenClaw, using Gemma 4 and Qwen 3.6 GGUFs with self-healing tool calls and web search on as little as twenty-four gigs of RAM. Over thirteen hundred likes. The message is clear: local agentic coding has arrived.

EXO Labs is actively surveying the community about where local AI is falling short, gathering requirements for their next release. Their May 8th post — "When was your first time trying local AI? Were you surprised or disappointed?" — generated dozens of detailed responses. The EXO framework is open-source under Apache 2.0 now, and the team is clearly listening before building. Cocktailpeanut at Pinokio is pushing the MLX image generation front, setting 2048-by-2048 as the default resolution and telling users to tweak dimensions for speed gains on Apple Silicon.

Let's talk hardware. The RTX 5090 market has stabilized beautifully. Best Buy has ASUS ROG Astral at $3,899, MSI Gaming Trio OC at $3,599, Gigabyte Gaming OC at $3,588. Amazon has open-box MSI Ventus 3X units dipping to $2,937 in good condition. The scalper frenzy is over — this is the normal market we've been waiting for.

The RTX PRO 6000 Blackwell with ninety-six gigs of GDDR7 ECC memory is shipping and in active use. Pricing lands around $8,000 to $8,500 per card. Early benchmarks show DeepSeek V4 Flash running at fifteen to twenty-one tokens per second — and ComfyUI workloads running three to five times faster than on consumer cards. This is the ultimate local inference card if you can stomach the price. Full Threadripper PRO builds with one card start around thirty-eight thousand dollars in Japan.

For the budget-conscious, used RTX 3090s remain the king. Twenty-four gigs of VRAM for $900 to $1,400 on the used market, with some Founders Editions dipping to $650. That's enough to run Qwen 3.6 27B at thirty to eighty tokens per second in Q4 or Q6 quantization. Used RTX 4090s hold at $2,000 to $2,700 — faster, more efficient, but the same twenty-four gig ceiling. And don't sleep on dual RTX 5060 Ti builds: thirty-two gigs total GDDR7 for roughly $1,200 in GPUs.

On the software side, n8n is exploding as the self-hosted AI automation platform. The n8n-claw project — a full OpenClaw recreation running on n8n with Supabase — demonstrates multi-channel memory, heartbeat automations, and modular agents, all self-hosted. Home Assistant 2026.5 just landed with RF support, ESPHome serial improvements, and new dashboards. The combination of n8n orchestrating AI agents and Home Assistant running your physical world is turning into the default homelab stack.

Power reality check: a four-by-RTX-5090 rig pulls about twenty-three hundred watts under inference load. At sixteen cents per kilowatt-hour, that's roughly $265 a month running 24/7. Pair it with solar and a home battery if you're thinking long-term — the math on solar payback for GPU rigs is getting interesting at current panel prices.

That's the Local AI & Automation Briefing for Monday, May 11th. I'm Bob — build something cool this week.