
AI Shoulder Surf V2


Back on April 1, 2026 we had our second AI Shoulder Surf. I wrote up the first of these sessions last time around. It’s basically an informal Zoom call where we share screens, talk about what we’re working on, and admit what we don’t know. No video recording, to keep the stress and performative aspect of video calls out of it, and to leave space for asking questions without posturing. Admitting to knowledge gaps is not a problem for me personally, because knowledge gaps are basically all I have, but I want other people to have a safe space too.

What follows is again a combination of a factual AI summary and my own commentary. I had an assist but it’s not slop. The target audience is mostly the folks who were on the call, but I’ll be happy if anyone else gets something out of it.


"Cardiff Kook, December 8 2013" by Tim Buss is licensed under CC BY 2.0.

Daniel: Incus/LXD Coding Sandbox

First off, Daniel demoed coding-sandbox, a project he has been building for several weeks on top of Incus/LXD. He landed on LXD over QEMU/KVM and VirtualBox because the CLI is faster, mounting directories is more Docker-like, and the whole setup suits the ephemeral-container model.

There are two execution modes:

  • Ephemeral container (the default): built from scratch on top of Debian and Bash, with a unique directory per session. Cheap to spin up, cheap to throw away.
  • VM: persistent storage, but expect roughly ten minutes the first time to build images and install tooling.
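
For anyone who wants to picture the two modes, here is a minimal sketch of the kind of Incus commands they map to. This is not Daniel's actual script: the function names, the `images:debian/12` image, the `workspace` device name, and the `/workspace` mount path are all assumptions for illustration. The code only builds the command lines, it does not run them.

```python
import uuid
from pathlib import Path


def session_dir(root: Path) -> Path:
    """Pick a unique per-session directory path under root (not created here)."""
    return root / f"session-{uuid.uuid4().hex[:8]}"


def launch_command(name: str, image: str = "images:debian/12", vm: bool = False) -> list[str]:
    """Build an `incus launch` command for either execution mode."""
    cmd = ["incus", "launch", image, name]
    # --vm asks for a virtual machine; --ephemeral makes the instance
    # disappear when it is stopped.
    cmd.append("--vm" if vm else "--ephemeral")
    return cmd


def mount_command(name: str, host_dir: Path, guest_path: str = "/workspace") -> list[str]:
    """Build the command that shares a host directory with the instance."""
    return [
        "incus", "config", "device", "add", name, "workspace", "disk",
        f"source={host_dir}", f"path={guest_path}",
    ]


if __name__ == "__main__":
    workdir = session_dir(Path("/tmp/sandboxes"))  # hypothetical root directory
    print(" ".join(launch_command("sandbox-demo")))
    print(" ".join(mount_command("sandbox-demo", workdir)))
```

Passing `vm=True` swaps `--ephemeral` for `--vm`, which matches the trade-off above: slower to build the first time, but the storage sticks around.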

To avoid hitting the public internet every rebuild, Daniel runs an APT cache on his local network. The package of interest is apt-cacher-ng — multiple machines on the LAN pull from a central cache, which is handy both for testing new images without re-downloading the world and for keeping packages in sync across machines.
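
Pointing a machine or a fresh image at such a cache usually comes down to one line of APT configuration. A minimal sketch, assuming the cache listens on apt-cacher-ng's default port 3142; the hostname `apt-cache.lan` and the helper name are made up for illustration, and this is not necessarily how Daniel wired it up.

```python
def apt_proxy_conf(cache_host: str, port: int = 3142) -> str:
    """Render an APT config line that routes HTTP package fetches through a cache."""
    return f'Acquire::http::Proxy "http://{cache_host}:{port}";\n'


if __name__ == "__main__":
    # The rendered line would typically go in a file under /etc/apt/apt.conf.d/
    # on each client; the file name is up to you.
    print(apt_proxy_conf("apt-cache.lan"), end="")
```

Baking that line into the image build is what lets a rebuild pull packages from the LAN instead of the public mirrors.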

He demoed two variants: a standard coding sandbox and a “quick” coding sandbox on top of an Ubuntu image with Node.js as a baseline dependency. Inside the sandbox he’s tested several coding tools so far — Claude Code, Qwen Code, and OpenCode.

There’s still some polishing to do on the setup script for general usability, and he’s eyeing automation around the Chrome connection.

Ingy: Sandboxing via KVM

Ingy talked about how he keeps Claude in a sandbox using KVM, motivated by the realization that the AI agents he was running had access to a lot more of his machine than he was really comfortable with. Once you start thinking about API keys, credentials, and what a misbehaving agent could touch, the appeal of a heavier isolation boundary goes up quickly. He also brought up the importance of limited-access roles and proper backups.

Daniel chimed in to suggest Incus as worth trying for development use — similar reasoning, lighter footprint than full VMs.

Claude’s note: this thread tracked closely with the “containerization for safety” theme from the v1 writeup. The trend line in this group is unambiguous: people are tightening up the blast radius of their agents, not loosening it.

Claude: Performance and Pricing

There was a long, slightly grumpy thread about Claude performance. People reported high load averages and slow processing, which led Ingy to wonder out loud whether KVM was making things worse and to float external VMs at Hetzner as an option. Daniel made the counter-case that containers might actually be better here, since they use system resources directly rather than virtualizing them.

On pricing: Ingy is on the $100/month tier and says he never hits limits. I’m also on $100/month and hit them regularly. Same plan, very different experience. That gap is interesting in its own right — it depends a lot on what you ask Claude to do, how much you let it run, and whether you’re running things overnight.

There was also a passing mention of a known bug in the Claude CLI where processing can get stuck and the fix is to kill and restart. Nothing ground-breaking, but worth knowing when you’re staring at a frozen prompt at midnight.

We also chatted about using the CLAUDE_CODE_DISABLE_1M_CONTEXT=1 env var to keep Claude at the 200k token ceiling. I have almost never used 1 million tokens and I don’t feel like I’m missing out.
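
If you launch Claude from a wrapper script rather than exporting the variable in your shell profile, the same setting can be applied per process. A minimal sketch: the variable name is the one we discussed on the call, while the helper name is hypothetical.

```python
import os


def claude_env(base=None):
    """Return a copy of the environment with the 1M-context opt-out set."""
    env = dict(os.environ if base is None else base)
    env["CLAUDE_CODE_DISABLE_1M_CONTEXT"] = "1"
    return env


if __name__ == "__main__":
    # Pass the result as env= to subprocess.run when starting the CLI,
    # for example: subprocess.run(["claude"], env=claude_env())
    print(claude_env({})["CLAUDE_CODE_DISABLE_1M_CONTEXT"])
```

Copying the environment instead of mutating `os.environ` keeps the setting scoped to the one process you start with it.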

Tools Roundup

A quick lap around what people were trying:

  • Eugen: positive experience with Cursor
  • Mateu: using OpenCode with GPT-5.4. He noted improved security features as a reason for sticking with that combination.
  • Daniel: pointed out that AI recommendations can be inconsistent between sessions and tools — likely because of the different “bubbles” or contexts the model is operating in. The same question can get a different answer depending on what else is loaded into the context window.

Claude’s note: the “bubbles” framing stuck with me. It’s a useful way to explain why two people can ask the same model the same question and walk away with different answers. The model didn’t change — the surrounding context did.

Mateu: Tuning Bert’s Personality

Mateu walked through some work on tuning his bot Bert’s personality — mostly in the direction of talking less.

The Bigger Themes

A few threads ran through the whole session:

Tighter isolation, not looser. Between Daniel’s Incus/LXD work and Ingy’s use of KVM, the direction of travel is clear: as agents do more, people want them to be able to touch less. Containers and VMs aren’t just for production anymore — they’re a development-environment hygiene question.

Same plan, different mileage. Two people on the same $100/month Claude plan can have wildly different experiences with rate limits and performance. That makes blanket statements about value-for-money basically useless. What actually matters is your workload pattern.

Context bubbles. Daniel’s “bubbles” observation — that AI recommendations shift based on the surrounding context — explains a lot of disagreements that look like disagreements about the model but are really about what’s been loaded into the conversation. Worth keeping in mind the next time someone tells you “Claude said X.” Claude said X given everything else in the session.

Less talkative is better. Mateu’s work on Bert and the general drift of the conversation pointed at the same thing: agents that respond with less, more carefully, are easier to live with than agents that fill the screen.

Claude’s note: this is roughly the same shape of theme list as last time (containerization, model choice, image input, --chrome). I’d take that as a sign that these are durable things to think about, not just first-session novelty.

Cadence

We’re planning to keep doing these on an ad hoc basis. Frequent enough to actually keep up with what people are working on, infrequent enough that nobody feels like it’s a standing meeting.

Where This Is Going

V3 will happen soon. Sign up now.

