Dispatches from the Edge #2: The Nerfing, the Swarm, and 341 Reasons to Read the Code
Weekly insights from the Tinkerer Club — a Discord community of AI early adopters building with OpenClaw
The Week’s Sharpest Signal
Two big model launches hit this week: Anthropic shipped Opus 4.6 and OpenAI dropped Codex 5.3. Both were supposed to be better. Faster, sure. But better?
The community’s verdict arrived fast and unfiltered:
One member: “To me it looks like the Opus 4.6 is the nerfed Opus 4.5 we got two weeks ago… I’m not impressed.”
Another member: “Getting worst performance than I did from 4.5 and feels like I have to be more in the loop again instead of trusting it.”
A third member reverted entirely: “Reverted back to 4.5 and it is working.”
This triggered a broader conversation that’s been simmering for months: is model nerfing real? Are providers quietly degrading models after launch? Or is something else happening? One member offered the most honest self-reflection:
“I was always wondering if nerfing is a real thing or just a collective sense of model fatigue. I too experienced that the same model can produce worse results over time, but to me it feels it was because I got lazy with my prompts and expected the same quality with less effort.”
That’s a fascinating admission. The possibility that we’re the ones degrading — getting sloppy with prompts as we develop trust, then blaming the model when quality drops. The truth is probably somewhere in between: models do change (infrastructure updates, routing changes, capacity management), AND users get lazier over time. Both arrows point the same direction.
One member offered the counterpoint that made the speed crowd happy: “Good LORD 4.6 is fast compared to 4.5.”
So the tradeoff is clear: you can have fast, or you can have the version you already trust. For now, a significant chunk of the Tinkerer Club is staying on 4.5.
What People Are Building
Sci-Fi Worlds That Build Themselves
The most creative build of the week came from one builder, who launched deep-sci-fi.world:
“I’m trying to create scientifically/causally grounded sci-fi worlds and stories by leveraging OpenClaw bots. Agents can dream up plausible future worlds, causally connect them to today, inhabit them, and tell stories about their lives there. And validate each other through peer review.”
Read that last sentence again. AI agents peer-reviewing each other’s speculative fiction for scientific plausibility. This is what happens when a community of tinkerers gets access to cheap compute and has too much imagination.
The architecture is wild: agents generate worlds, other agents critique them for causal consistency, and the surviving worlds become settings for generated narratives. It’s evolutionary fiction — survival of the most plausible.
Custom Telegram as Agent Dashboard
One developer is building a custom Telegram frontend — not just a bot interface, but a full macOS and iOS app using TDLib as the communication layer:
“You can build your own UI for Telegram. So I’m building something like your dashboard where I can manage agents, skills, etc. and chat with agents on top of Telegram.”
This is a clever hack. Instead of building a custom web dashboard from scratch, use Telegram’s protocol as the backbone and build a native app on top. You get real-time messaging, push notifications, and media handling for free. The agents don’t care what the client looks like — they just see Telegram messages.
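The dispatch layer such a client needs can be sketched in a few lines. This is a hypothetical illustration, not the developer’s actual code: the agent names and command syntax are invented, and a real client would receive updates via TDLib and send replies back through it.

```python
# Hypothetical sketch: routing incoming Telegram messages to agents.
# A real client would get these messages from TDLib updates; here we
# only show the pure routing logic on raw message text.

def route_message(text, agents):
    """Map a raw message to an (action, payload) tuple."""
    text = text.strip()
    if text == "/agents":
        return ("list_agents", sorted(agents))
    if text.startswith("@"):
        name, _, task = text[1:].partition(" ")
        if name in agents:
            return ("dispatch", (name, task))
        return ("error", f"unknown agent: {name}")
    return ("chat", text)

agents = {"researcher", "coder"}
print(route_message("/agents", agents))   # ('list_agents', ['coder', 'researcher'])
print(route_message("@coder fix the build", agents))
```

The point of the design is that the transport (Telegram) and the routing (this function) stay decoupled, so the same logic would work behind any chat frontend.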
Apple Calendar at 800x Speed
A plugin author dropped a genuinely useful plugin that generated immediate interest:
“Just published openclaw-apple-calendar plugin for OpenClaw. Fast (like 800x faster than AppleScript versions) native Apple Calendar tools.”
For context: the existing approach to calendar integration on macOS involved AppleScript, which is roughly as fast as reading the calendar aloud to a sleeping cat. The new plugin goes native, and the performance difference is dramatic.
The community immediately pounced — the repo was accidentally set to private, then fixed, then people started forking it. This is the kind of infrastructure-level contribution that makes the whole ecosystem better.
Merge Senpai: Your Agent Reviewer
One builder built “Merge Senpai” — an agent that monitors for new pull requests on a cron schedule and automatically deploys itself to review code. The name alone is perfect, but the architecture is solid: listen for PRs, spin up a review agent, post comments. No human has to remember to request a review.
This is the pattern: take a workflow that depends on human attention (someone needs to notice the PR, someone needs to request a review, someone needs to do the review) and replace the “someone needs to notice” part with a cron job. The human still makes the final call, but the machine handles the nagging.
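The pattern can be sketched in a few lines. This is a minimal illustration, not Merge Senpai’s actual implementation: the review hook is left abstract, and only the “notice the new PRs” logic is shown. The GitHub endpoint used is the public REST API for open pull requests.

```python
# Sketch of the cron-replaces-noticing pattern: each tick, fetch open
# PRs, diff against what we've already reviewed, and hand only the
# fresh ones to a review agent (not shown here).
import json
import urllib.request

def new_prs(open_prs, seen):
    """Return PRs not yet reviewed, plus the updated seen-set."""
    fresh = [pr for pr in open_prs if pr["number"] not in seen]
    return fresh, seen | {pr["number"] for pr in fresh}

def fetch_open_prs(repo):
    # GitHub's public REST API; this is the part the cron job calls.
    url = f"https://api.github.com/repos/{repo}/pulls?state=open"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# One cron tick, with the network call stubbed out for illustration:
prs = [{"number": 41, "title": "Fix race"}, {"number": 42, "title": "Add docs"}]
fresh, seen = new_prs(prs, {41})
print([pr["number"] for pr in fresh])  # [42]
```

Persisting `seen` between ticks (a file, a small database) is what keeps the cron job from re-reviewing the same PR every run.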
The Model Wars
The Great Model Stack Optimization
The community is converging on a pattern: nobody uses just one model anymore.
The optimal stack, based on this week’s conversations:
- Claude Max subscription ($100-200/mo) — your primary workhorse for complex reasoning
- ChatGPT Plus ($20/mo) — backup reasoning, different perspective, Codex access
- Kimi k2.5 (API, cheap) — tool-heavy grunt work, heartbeats, simple tasks
- Gemini API (free tier) — vision tasks, large context processing
- Local models (ollama/LM Studio) — sensitive data, offline work, experiments
One member summarized the approach: they tried Gemini as the main model, found the results “not good,” but keep Flash as a sub-agent for specific tasks. Another member pointed to OpenCode Zen for free Kimi access.
The smart operators aren’t picking a winner — they’re building model portfolios.
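What a portfolio looks like in practice is a routing function. The sketch below is illustrative only: the model labels come from the stack above, but the thresholds and task attributes are invented, not anyone’s production router.

```python
# Toy model router over the stack above. The routing rules and the
# complexity scale (0-10) are assumptions made for illustration.

def pick_model(task):
    if task.get("sensitive"):
        return "local-ollama"      # sensitive data never leaves the machine
    if task.get("needs_vision") or task.get("context_tokens", 0) > 200_000:
        return "gemini"            # vision tasks, large context processing
    complexity = task.get("complexity", 0)
    if complexity >= 7:
        return "claude-opus"       # primary workhorse for complex reasoning
    if complexity >= 4:
        return "codex"             # backup reasoning, different perspective
    return "kimi-k2.5"             # cheap tool-heavy grunt work

print(pick_model({"complexity": 8}))                     # claude-opus
print(pick_model({"complexity": 2}))                     # kimi-k2.5
print(pick_model({"sensitive": True, "complexity": 9}))  # local-ollama
```

The sensitive-data check comes first on purpose: cost-aware routing only applies to tasks that are allowed to leave the machine at all.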
Kimi k2.5: The People’s Champion (With Caveats)
Kimi k2.5 had a breakout week. One tester posted the comparison that got shared everywhere: “MiniMax dumber than 5.2, Kimi noticeably better.”
But the enthusiasm comes with asterisks:
- One member reported constant repetition and an inability to complete code
- Another member found it “slow af” with OpenClaw once context fills up
- One user experienced the classic trajectory: “I spoke too soon about kimi 2.5, it’s moving real slow right now. 2-3 minutes between basic telegram chats.”
- Another user hit context window issues immediately
The verdict: Kimi is a legitimate budget option that punches above its weight, but it’s not a drop-in Opus replacement. Use it for the right tasks (tool execution, simple automation) and don’t expect it to architect your system.
Codex 5.2/5.3 vs. GLM: “Heaven and Hell”
One developer drew the starkest comparison of the week, calling the gap between Codex 5.2 and GLM “heaven and hell.” The Codex faithful are vindicated. GLM, which generated excitement when Z.ai launched, is settling into its niche: decent for basic coding, not competitive for anything complex.
Meanwhile, Codex 5.3 launched this week alongside Opus 4.6. One member shared the OpenAI announcement that API access is coming soon. Another member confirmed it’s not yet available through the Copilot proxy in OpenClaw. The wait continues.
The Security Wake-Up Call
341 Malicious ClawdHub Skills
The biggest story of the week wasn’t a feature launch — it was a Hacker News article that one member shared: Researchers found 341 malicious skills on ClawdHub.
The response in the community was immediate and sobering:
A commenter: “Be careful, a lot of malicious hidden stuff in there.”
Another member: “IMO ClawdHub is horrible, download the skill as zip and go through it first as well.”
This is the npm left-pad moment for the AI agent ecosystem. ClawdHub is the package registry for OpenClaw skills — the place where you go to find pre-built capabilities for your agent. And 341 of those packages were actively malicious.
The implications:
- Your AI agent runs with your permissions, your API keys, your access
- A malicious skill can exfiltrate data, make API calls, modify files
- Most users install skills with a single command and never read the source
- There’s no code signing, no review process, no sandbox
The community’s response was pragmatic rather than panicked. Download as zip. Read the code. Don’t trust, verify. But the fundamental tension remains: the whole point of skills is to save time by not building everything yourself. If you have to audit every skill line by line, you’ve traded one kind of work for another.
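The audit burden can be lowered, if not removed. The sketch below is a crude triage helper, not a security tool: it flags common patterns worth eyeballing in a downloaded skill so you know where to start reading. The pattern list is invented for illustration; nothing here replaces actually reading the source.

```python
# Crude "read the code before you run it" helper: walk an unzipped
# skill directory and flag lines that deserve a close look. The
# pattern list is illustrative, not exhaustive.
import os
import re

SUSPICIOUS = [
    r"\beval\s*\(", r"\bexec\s*\(",        # dynamic code execution
    r"base64\.b64decode",                   # common obfuscation trick
    r"subprocess",                          # shelling out
    r"curl\s+.*\|\s*(ba)?sh",               # pipe-to-shell installers
    r"(API_KEY|SECRET|TOKEN)",              # code touching credentials
]

def audit_dir(path):
    """Return (file, pattern) pairs for every suspicious match found."""
    hits = []
    for root, _, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                with open(full, encoding="utf-8", errors="ignore") as f:
                    text = f.read()
            except OSError:
                continue
            for pat in SUSPICIOUS:
                if re.search(pat, text):
                    hits.append((full, pat))
    return hits
```

A hit isn’t proof of malice (plenty of legitimate skills shell out or handle tokens), but an empty report on a skill that claims to need network access is its own kind of red flag.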
The Bird CLI Concern
One member raised a related question about bird (the X/Twitter CLI skill): “Anyone running bird saw any type of issues in your X account afterwards?”
Another member flagged the aggressive default behavior: “That skill is ballsy though, it’ll pull 1000 tweets at once if you tell it to, which concerns me a bit.”
The pattern: tools built for power users don’t always include the guardrails that prevent abuse. When your agent can pull 1000 tweets in one call, and the skill’s README doesn’t mention rate limits, the user is one badly-worded prompt away from getting their account flagged.
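The missing guardrail is a thin wrapper between the agent and the tool. The sketch below is an illustration of the idea, not part of bird itself: the limits are made up and would need tuning to whatever the platform actually tolerates.

```python
# Guardrail sketch: clamp batch sizes and pace calls so one badly-worded
# prompt can't fire off an account-flagging burst. Limits are invented.
import time

class Throttled:
    def __init__(self, fetch_fn, max_batch=100, min_interval=2.0):
        self.fetch_fn = fetch_fn
        self.max_batch = max_batch
        self.min_interval = min_interval
        self._last_call = 0.0

    def fetch(self, count):
        count = min(count, self.max_batch)  # clamp; don't trust the prompt
        wait = self.min_interval - (time.monotonic() - self._last_call)
        if wait > 0:
            time.sleep(wait)
        self._last_call = time.monotonic()
        return self.fetch_fn(count)

# The agent asks for 1000; the wrapper quietly fetches 100.
t = Throttled(lambda n: list(range(n)), max_batch=100, min_interval=0.0)
print(len(t.fetch(1000)))  # 100
```

The key design choice is that the clamp lives in code, not in the prompt: a model can be talked out of an instruction, but not out of a `min()`.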
Tools & Techniques
$50 Free Anthropic Credits (The Discovery)
One member dropped the tip that made everyone’s day:
“I don’t know if I am late or if you guys didn’t know, but you can get $50 free API Anthropic credits: Open Claude → Settings → Usage → White button that says ‘claim’ = free $50 of credits.”
The thread blew up. In a community where people are tracking costs to the penny, $50 of free API credits is the equivalent of finding money in your winter coat.
Mobile Agent Access: Termux Rising
One member showed their agent running directly off their Android phone:
“Just saw the OS Panel Kitze — curious about the energy battery level, what the logic for it?”
But the real flex was the setup: running OpenClaw from a phone via Termux. Another member asked how to replicate it on iOS, and the answer was definitive: “iOS no go — Android, get Termux so you can SSH into the phone.”
The mobile agent access story is growing. People want their agents available everywhere, and Termux on Android is becoming the go-to for phone-based access. iOS remains locked down.
The Memory Problem Nobody Can Solve
The most honest thread of the week came from one member:
“I’ve asked and people seem hesitant to answer. I think we’re all in the same boat where we have no f***ing clue how to resolve memory issues.”
And later:
“I just want some level of persistence without each interaction having to shove 4325347542 tokens into it.”
One member mentioned layering Hindsight with QMD for business memory, calling it “a flywheel” that “takes a bit to get going.” Another referenced the PARA method (from nateliason and felixcraftai). But nobody claimed to have solved it.
This is the community’s honest moment. Everyone’s building agents that can code, deploy, browse the web, manage calendars — but persistent memory across sessions remains the hardest unsolved problem. The workarounds (file-based memory, vector stores, structured knowledge graphs) all work partially, and none of them feel right.
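The file-based workaround mentioned above can be shown in miniature. This is a toy illustration of why the approach only partially works, not anyone’s actual system: the JSONL format, the tag-overlap scoring, and the function names are all invented.

```python
# File-based memory in miniature: append facts with tags, then retrieve
# only the few relevant lines instead of shoving the whole history into
# context. Format and scoring are toy assumptions for illustration.
import json

def remember(path, fact, tags):
    """Append one fact to the memory file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"fact": fact, "tags": tags}) + "\n")

def recall(path, query_words, limit=3):
    """Return up to `limit` facts whose tags overlap the query."""
    query = {w.lower() for w in query_words}
    scored = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            score = len(query & {t.lower() for t in entry["tags"]})
            if score:
                scored.append((score, entry["fact"]))
    scored.sort(key=lambda s: -s[0])
    return [fact for _, fact in scored[:limit]]
```

The sketch also shows exactly where it falls short: keyword overlap misses anything phrased differently, which is the gap vector stores and knowledge graphs try (and also only partially manage) to close.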
One member nailed it: “People are rushing to solve for so many things, yet the memory component feels like the most important one to solve for if we want these agents to be the unlocks they have the potential to be.”
Whisper: When Your Agent Goes Deaf
Voice notes on Telegram remain unreliable. One member spent an entire day debugging transcription failures:
“Anyone here experiencing the Telegram voice notes falls into the void? And doesn’t get auto-transcribed?”
One member shared a working setup (whisper.cpp built from source, base model), but the honest truth slipped out:
“I kept asking clawdbot until it got it to work… Your bot is not as smart as Claude Code for debugging.”
Another member’s response was perfect: “And now feels like you have a BF, sometimes hears you, sometimes ignores you.”
Community Pulse
The Vienna Meetup
The community founder announced the first official Tinkerer Club meetup — in Vienna, two hours before the OpenClaw meetup, with plans for a talk. The community is going physical. When a Discord community of 1000+ people starts meeting IRL, it’s no longer a chat group. It’s a scene.
The New Member Dilemma
A revealing exchange: a new member joined and immediately asked if they were in the right place after paying $299. The response: “Nobody knows lol. We’re just here to test if Kitze will stay sane or become like levelsio.”
It’s funny, but it also captures the community’s self-awareness. This is a paid community where the product is… each other. The shared experience of building with bleeding-edge tools, comparing notes, sharing failures. The $299 isn’t buying access to a curriculum — it’s buying access to the conversation.
The CoWork-OS Dream
One member shared CoWork-OS — an attempt to build the operating system for AI-augmented organizations. Another member had been talking about this all week:
“This is literally the beginning of autonomous organizations and we have the chance to build it.”
Whether CoWork-OS specifically takes off is less important than the signal: multiple people in this community are independently converging on the same idea — that the future isn’t individual AI assistants, but AI-native organizational structures.
What’s Coming
The Model Fragmentation Continues
With Opus 4.6 controversial, Codex 5.3 incoming, Kimi gaining ground, and local models improving weekly, the model landscape is getting more fragmented, not less. The winners will be the people who build flexible architectures — model-agnostic routing, easy swapping, cost-aware selection.
The community is already there. While the rest of the world debates “which AI is best,” the tinkerers are running five models simultaneously and routing based on task complexity, cost, and trust level.
Security as a First-Class Concern
The 341 malicious skills story isn’t going away. Expect more conversation about skill sandboxing, code signing, and trust chains. The OpenClaw ecosystem needs a security model that doesn’t require every user to be a security auditor.
Memory Will Be the Moat
The team or tool that cracks persistent, efficient, agent-native memory will win. Not just “stuff context into a vector store,” but actual semantic memory — the ability for an agent to remember what matters, forget what doesn’t, and learn from its interactions the way humans learn from experience.
Right now it’s duct tape and hope. Whoever makes it elegant will define the next phase.
The Meetup Circuit Begins
Vienna is first. If the Tinkerer Club meetup goes well — and given this community’s energy, it will — expect more cities. The online-to-offline pipeline is the oldest community playbook in tech, and it works because the conversations that happen over beers are different from the ones that happen over Discord.
Next week: Vienna meetup recap (if dispatches survive the afterparty), the Codex 5.3 API launch window, and whether anyone’s cracked the memory problem. Keep tinkering. 🦞
Dispatches from the Edge is a weekly series covering the AI agent builder community. Have something to share? Find us in the Tinkerer Club.