
AI Shoulder Surf V1

·1277 words·6 mins·
AI automation

A while back I thought of having some regular office hours so that I could share my screen with some friends and we could just talk about what we are learning and working on. I think office hours are hard to sell unless you’re some kind of expert or thought leader. I am neither of those things, but I do enjoy showing what I’m working on. So, I proposed a “shoulder surf” session where people could share screens and gab. No video recording was made, in order to remove the stress and the performative aspect of video calls. It also leaves space for asking questions and admitting you have knowledge gaps. Admitting to not having answers is not a problem for me personally, because knowledge gaps are basically all that I have, but I want other people to have a safe space.

I posted something on LinkedIn, texted some former colleagues whom I have been keeping in touch with, and we set up a meeting. I thought we’d have a handful of folks, but we had a dozen and I learned a lot. This was the first session of at least two. (I’m not making big plans.)

What follows is a combination of a factual AI summary of the call and some of my comments. So, I did have an assist, but it’s also not slop. The target audience of this post is the folks who were on the call, but maybe some others will find the digest version useful.


“Peaking over @brad_frost’s shoulder at An Event Apart” by Jeremy Keith is licensed under CC BY 2.0.

Mateu: OpenClaw and AI-Generated Video

Mateu kicked things off by demonstrating how he used AI to produce OpenClaw: Introduction & Memory Search Overview, a YouTube video explaining OpenClaw — a project he’s been experimenting with. He runs it inside a Linux container on Proxmox and used Manus to generate the video.

On the cost side, Mateu has been happy with the Codex model at $20/month and finds it genuinely useful for day-to-day work.

Ingy: Headless Claude in a Makefile

Ingy talked about makes, and an automation that he put together just a few minutes before the call. It centres on a version-update skill that runs Claude in headless mode. The skill instructs Claude to:

  1. Run make version-check and capture the output
  2. Find all lines indicating an outdated version (<file> <old-version> -> <new-version>)
  3. Update the version string in each corresponding .mk file
  4. Re-run make version-check to confirm everything is current
  5. Commit the changes with a descriptive message

This is then wired up as a GitHub Actions workflow that runs on a daily schedule. It checks out the repo, passes the ANTHROPIC_API_KEY, runs make version-update, and pushes any commits if versions changed.
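Ingy’s actual workflow file wasn’t shared, but the shape he described might look something like the sketch below. Everything here is an assumption except the pieces named in the text: the daily schedule, the checkout, the `ANTHROPIC_API_KEY`, `make version-update`, and the push.

```yaml
# Hypothetical sketch of the workflow described above; the cron time,
# action versions, and step names are assumptions, not Ingy's file.
name: version-update
on:
  schedule:
    - cron: "0 5 * * *"   # once a day
permissions:
  contents: write
jobs:
  version-update:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run the headless version update
        run: make version-update
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
      - name: Push any commits
        run: git push
```

The nice property of this shape is that the workflow itself knows nothing about versions; all the judgment lives in the skill file that Claude reads.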

The interesting part isn’t the mechanics — it’s the philosophy behind it. Ingy described this as a shift toward building agent-friendly tooling rather than just human-centric tools. The skill file is the spec; Claude is the executor. Watching a scheduled workflow kick off a Claude session that reads files, reasons about version drift, and commits changes is one of those moments where the future feels very close.

Olaf: The “Talk About Us” Skill

I presented the talk-about-us skill, which I wrote about at length in an earlier post. The short version: it’s derived from Anil Dash’s framework for writing copy that other people can repeat accurately without you in the room.

The skill audits text for jargon, distinctiveness, emotional resonance, and value-first messaging. It gives frank feedback and rewrite suggestions while preserving the author’s voice. Friends have found it useful for blog posts, sponsor emails, resumes, and grant proposals. It gives you the honest outside perspective that’s hard to get when you’re too close to your own work.

The skill itself was written using superpowers’ skill-writing skill — a nice example of using the tools to build the tools.

Nico: The Bot That Works While You Sleep

Nico showed off the most ambitious automation of the afternoon: koan, a bot system designed to maximize Claude quota usage during off-hours. While he’s asleep, the bot:

  • Processes a queue of GitHub issues
  • Implements fixes and creates pull requests
  • Responds to PR comments
  • Runs multiple refinement passes: implementation, refactor, security audit

The queue is accessible via Telegram, which means Nico can add tasks to the pipeline from his phone before bed and wake up to completed pull requests. It’s queue-based, runs unattended, and uses iterative passes to improve code quality rather than trying to get everything right in one shot. The combination of multiple refinement stages and overnight execution is clever — the quota limits that would be annoying during the day become a non-issue when you have eight hours.
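Nico didn’t walk through koan’s internals, but the core idea — a task queue drained by a fixed sequence of refinement passes — can be sketched in a few lines. The pass names follow his description; everything else (function names, the string-based stand-in for an actual agent call) is hypothetical.

```python
# Minimal sketch of a queue drained by iterative refinement passes.
# run_pass() is a placeholder for a real headless agent invocation.
from collections import deque

PASSES = ["implementation", "refactor", "security-audit"]

def run_pass(task: str, stage: str) -> str:
    # In a real system this would call the agent and return its result;
    # here we just record which stage ran on which task.
    return f"{task}:{stage}"

def drain(queue: deque) -> list[str]:
    results = []
    while queue:
        task = queue.popleft()
        # Each task gets every pass, in order, before the next task starts.
        for stage in PASSES:
            results.append(run_pass(task, stage))
    return results

overnight = deque(["issue-42", "issue-43"])
log = drain(overnight)
```

The same loop works whether tasks arrive from Telegram, a GitHub issue queue, or anywhere else — the transport is orthogonal to the drain-and-refine structure.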

The Bigger Themes

A few threads ran through the whole session:

Specification over code. Todd made the case that the future of development is writing detailed specifications rather than code. The measure of success isn’t whether the code is correct — it’s whether the tests validate the expected behavior. He argued this requires roughly 10x more tests than traditional development. The Agent Skills specification is worth reading if this resonates with you.

Model selection matters. Most people in the group have settled on Claude Opus for serious development work. The $100/month cost is real, but the quality difference justifies it when you’re using AI as a core part of your workflow rather than an occasional helper.

Editor’s note: One person mentioned that they are using Haiku exclusively and are having issues getting quality code back out of it. I do think it’s going to be an uphill battle to get great results out of Haiku for everything. There are, however, some things it does very well. I use it to extract event and organization details out of web pages for My Mind is Racing. I would not use it to produce code, but for less complex tasks it can be a significant money saver.

Containerization for safety. There was broad agreement that running AI agents in isolated environments — Docker, KVM, Incus — is important for anything that touches real systems. You want controlled, auditable access, not a process that can reach anywhere it wants. Karol Galanciak has written a solid walkthrough on Claude on Incus if you want a practical starting point.

Editor’s note: I’ve spent a fair amount of time getting custom Docker containers set up so that I can let Claude essentially run hog wild while I sleep. While this is a good fit for my particular use case right now, I’m looking forward to trying out more lightweight solutions.

Image input is underused. Several people mentioned that feeding screenshots and photos to Claude for debugging is surprisingly effective. When you’re staring at a visual bug or a confusing terminal output, sometimes the fastest path is just to take a screenshot and ask.

--chrome is worth knowing about. Claude’s --chrome flag enables browser control, which opens up front-end testing automation that would otherwise require a separate framework. A few people hadn’t heard of it and were immediately interested.

Editor’s note: Playwright MCP Server can be very handy for inspecting web content as well as the network requests involved.

Addendum

I’m not sure if this actually came up, but I’ve been using claude-pulse, a status line for Claude Code that gives you some visibility into what the agent is doing right now, especially around token usage. I get a lot out of that.

Where This Is Going

We’ll do this again at least one more time. If you want to join the next session, reach out.

