A hundred times this. It's fine until it isn't. And jacking these Claws into shared conversation spaces is quite literally pushing the afterburners to max on simonw's lethal trifecta. A lot of people are going to get burned hard by this. Every blackhat is eyes-on this right now - we're literally giving a drunk robot the keys to everything.
I understand that things can go wrong and there can be security issues, but I see at least two other issues:
1. what if, ChadGPT style, ads are added to the answers (like OpenAI said it'd do, hence the new "ChadGPT" name)?
2. what if the current prices really are unsustainable and the thing goes 10x?
Are we living some golden age where we can both query LLMs on the cheap and not get ad-infected answers?
I read several comments in different threads made by people saying: "I use AI because search results are too polluted and the Web is unusable"
And I now do the same:
"Gemini, compare me the HP Z640 and HP Z840 workstations, list the features in a table" / "Find me which Xeon CPU they support, list me the date and price of these CPU when they were new and typical price used now".
How long before I get twelve ads along with paid vendors recommendations?
This look nice! I was curious about being allowed to use a Claude Pro/Max subscription vs an API key, since there's been so much buzz about that lately, so I went looking for a solid answer.
"After installing Claude Code onto your machine, run claude in your terminal and follow the prompts to authenticate. The SDK will use this authentication automatically."
OP here. Yes! This was a big motivation for me to try and build this. Nervous Anthropic is gonna shut down my account for using Clawdbot.
This project uses the Agents SDK so it should be kosher in regards to terms of service. I couldn't figure out how to get the SDK running inside the containers to properly use the authenticated session from the host machine so I went with a hacky way of injecting the oauth token into the container environment. It still should be above board for TOS but it's the one security flaw that I know about (malicious person in a WhatsApp group with you can prompt inject the agent to share the oauth key).
If anyone can help out with getting the authenticated session to work properly with the agents running in containers it would be much appreciated.
I went down this rabbit hole a bit recently trying to use claude inside fence[0] and it seems that on macOS, claude stores this token inside Keychain. I'm not sure there's a way to expose that to a container... my guess would be no, especially since it seems the container is Linux, and also because keeping the Keychain out of reach of containers seems like it would be paramount. But someone might know better!
True. There’s a setting for Claude code though where you can add apiKeyHelper which is a script you add that gets the token for Claude Code. I imagine you can use that but haven’t quite figured out how to wire it up
Wow, thanks for posting that, news to me! In this case I don’t understand why there was a whole brouhaha with OpenClaw and the like - I guess they were invoking it without the official SDK? Because this makes it seem like if you have the sub you can build any agentic thing you like and still use your subscription, as long as you can install and login to Claude code on the machine running it.
Tons of chatter on Twitter making it sound like you'll get permabanned for doing this but... 1) how would they know if my requests are originating from Claude Code vs. OpenClaw? 2) how are we violating... anything? I'm working within my usage limits...
$70 or whatever to check if there's milk... just use your Claude Max subscription.
> how would they know if my requests are originating from Claude Code vs. OpenClaw
How wouldn't they know? Claude Code is proprietary they can put whatever telemetry they want in there.
> how are we violating... anything? I'm working within my usage limits...
It's well known that Claude code is heavily discounted compared to market API rates. The best interpretation of this is that it's a kind of marketing for their API. If you are not using Claude code for what it's intended for, then it's violating at least the spirit of that deal.
It was with OpenCode, but a LOT of the commentariat is insisting that running OpenClaw through subscription creds instead of API is out of TOS and will get you banhammered.
One of the things that makes Clawdbot great is the allow all permissions to do anything. Not sure how those external actions with damaging consequences get sandboxed with this.
Apple containers have been great especially that each of them maps 1:1 to a dedicated lightweight VM. Except for a bug or two that appeared in the early releases, things seem to be working out well. I believe not a lot of projects are leveraging it.
A general code execution sandbox for AI code or otherwise that used Apple containers is https://github.com/instavm/coderunner It can be hooked to Claude code and others.
It's more (exactly?) like pulling a .sh file hosted on someone else's website and running it as root, except the contents of the file are generated by a LLM, no one reads them, and the owner of the website can change them without your knowledge.
I feel like a lot of non technical people who are vibe coding or vibe using these models, focus on hallucinations and believe that as the hallucinations are reduced in benchmarks, and over estimate their ability to create safe prompts that will keep these models in line.
I think most people fail to estimate the real threat that malicious prompts can cause because it is not that common, its like when credit cards were launched, cc fraud and the various ways it could be perpetrated followed not soon after. The real threats aren’t visible yet but rest assured there are actors working to take advantage and many unfortunate examples will be seen before general awareness and precaution will prevail….
I think these days if I’m going to be actively promoting code I’ve created (with Claude, no shade for that), I’ll make sure to write the documentation, or at the very least the readme, by hand. The smell of LLM from the docs of any project puts me off even when I like the idea of the project itself, as in this case. It’s hard to describe why - maybe it feels like if you care enough to promote it, you should care to try and actually communicate, person to person, to the human being promoted at. Dunno, just my 2c and maybe just my own preference. I’d rather read a typo-ridden five line readme explaining the problem the code is there to solve for you and me,the humans, not dozens of lines of perfectly penned marketing with just the right number of emoji. We all know how easy it is to write code these days. Maybe use some of that extra time to communicate with the humans. I dunno.
OP here. Appreciate your perspective but I don't really accept the framing, which feels like it's implying that I've been caught out for writing and coding with AI.
I don't make any attempt to hide it. Nearly every commit message says "Co-Authored-By: Claude Opus 4.5". You correctly pointed out that there were some AI smells in the writing, so I removed them, just like I correct typos, and the writing is now better.
I don't care deeply about this code. It's not a masterpiece. It's functional code that is very useful to me. I'm sharing it because I think it can be useful to other people. Not as production code but as a reference or starting point they can use to build (collaboratively with claude code) functional custom software for themselves.
I spent a weekend giving instructions to coding agents to build this. I put time and effort into the architecture, especially in relation to security. I chose to post while it's still rough because I need to close out my work on it for now - can't keep going down this rabbit hole the whole week :) I hope it will be useful to others.
BTW, I know the readme irked you but if you read it I promise it will make a lot more sense where this project is coming from ;)
Hey, you do you, I’m glad you appreciate my perspective. I wasn’t trying to catch you out but I see how it came across that way - I apologise for my edit, I had hoped the ;) would show that I meant it in jest rather than in meanness but I shouldn’t have added it in the first place.
As I said in my comment, no shade for writing the code with Claude. I do it too, every day.
I wasn’t “irked” by the readme, and I did read it. But it didn’t give me a sense that you had put in “time and effort” because it felt deeply LLM-authored, and my comment was trying to explore that and how it made me feel. I had little meaningful data on whether you put in that effort because the readme - the only thing I could really judge the project by - sounded vibe coded too. And if I can’t tell if there has been care put into something like the readme how can I tell if there’s been care put into any part of the project? If there has and if that matters - say, I put care into this and that’s why I’m doing a show HN about it - then it should be evident and not hidden behind a wall of LLM-speak! Or at least; that’s what I think. As I said in a sibling comment, maybe I’m already a dinosaur and this entire topic won’t matter in a few years anyway.
Project releases with llms have grown to be less about the functionality and more about convincing others to care.
Before the proof of work of code in a repo by default was a signal of a lot of thought going into something. Now this flood of code in these vibe coded projects is by default cheap and borderline meaningless. Not throwing shade or anything at coding assistants. Just the way it goes
I don’t want to come off like I’m shitting on the poster here. I’ve definitely made that kind of careless mistake, probably a dozen times this week. And maybe we’re heading to a future where nobody even reads the readme anymore because they won’t be needed because an agent can just conjure one from the source code at will, so maybe it actually straight up doesn’t matter. I’ve just been thinking about what it means to release software nowadays, and I think the window for releasing software for clout and credit is closing, since creating software basically requires a Claude subscription and an idea now, so fewer people are impressed by the thing simply existing, and the standard of care for a project released for that aim (of clout) needs to be higher than it maybe needed to be in the past. But who knows, I’m probably already a dinosaur in today’s world, and I really don’t mean to shit on the OP - it’s a good idea for a project and it makes a lot of sense for it to exist. I just can’t tell if any actual care has gone into it, and if not, why promote?
Interesting choice to use native Apple Containers over Docker.
I assume this is to keep the footprint minimal on a Mac Mini without the overhead of the Docker VM, but does this limit the agent's ability to run standard Linux tooling? Or are you relying on the AI to just figure out the BSD/macOS equivalents of standard commands?
Is this an official Anthropic project? Because that repo doesn't exist.
Or is this just so hastily thrown together that the Quick Start is a hallucination?
That's not a facetious question, given this project's declared raison d'etre is security and the subtle implication that OpenClaw is an insecure unreviewed pile of slop.
Fixed, thanks. Claude Code likes to insert itself and anthropic everywhere.
If it somehow wasn't abundantly clear: this is a vibe coded weekend project by a single developer (me).
It's rough around the edges but it fits my needs (talking with claude code that's mounted on my obsidian vault and easily scheduling cron jobs through whatsapp). And I feel a lot better running this than a +350k LOC project that I can't even begin to wrap my head around how it works.
This is not supposed to be something other people run as is, but hopefully a solid starting point for creating your own custom setup.
To be honest, when I see many vibecoded apps, I just build my own duplicate with Claude Code. It's not that useful to use someone else's vibecode. The idea is enough, or the evidence that it works for someone else means I can just build it myself with Claude Code and I can make it specific to my needs.
Minor nitpick, it looks like about 2500 lines of typescript (I am on a mobile device, so my LOC estimate may be off). Also, Apple container looks really interesting.
I somewhat like the idea of not using MCP as much as it is being hyped.
It's certainly helpful for some things, but at the same time - I would rather improved CLI tools get created that can be used by humans and llm tools alike.
> running it scares the crap out of me
A hundred times this. It's fine until it isn't. And jacking these Claws into shared conversation spaces is quite literally pushing the afterburners to max on simonw's lethal trifecta. A lot of people are going to get burned hard by this. Every blackhat is eyes-on this right now - we're literally giving a drunk robot the keys to everything.
I understand that things can go wrong and there can be security issues, but I see at least two other issues:
1. what if, ChadGPT style, ads are added to the answers (like OpenAI said it'd do, hence the new "ChadGPT" name)?
2. what if the current prices really are unsustainable and the thing goes 10x?
Are we living some golden age where we can both query LLMs on the cheap and not get ad-infected answers?
I read several comments in different threads made by people saying: "I use AI because search results are too polluted and the Web is unusable"
And I now do the same:
"Gemini, compare me the HP Z640 and HP Z840 workstations, list the features in a table" / "Find me which Xeon CPU they support, list me the date and price of these CPU when they were new and typical price used now".
How long before I get twelve ads along with paid vendors recommendations?
This look nice! I was curious about being allowed to use a Claude Pro/Max subscription vs an API key, since there's been so much buzz about that lately, so I went looking for a solid answer.
Thankfully the official Agent SDK Quickstart guide says that you can: https://platform.claude.com/docs/en/agent-sdk/quickstart
In particular, this bit:
"After installing Claude Code onto your machine, run claude in your terminal and follow the prompts to authenticate. The SDK will use this authentication automatically."
OP here. Yes! This was a big motivation for me to try and build this. Nervous Anthropic is gonna shut down my account for using Clawdbot.
This project uses the Agents SDK so it should be kosher in regards to terms of service. I couldn't figure out how to get the SDK running inside the containers to properly use the authenticated session from the host machine so I went with a hacky way of injecting the oauth token into the container environment. It still should be above board for TOS but it's the one security flaw that I know about (malicious person in a WhatsApp group with you can prompt inject the agent to share the oauth key).
If anyone can help out with getting the authenticated session to work properly with the agents running in containers it would be much appreciated.
I went down this rabbit hole a bit recently trying to use claude inside fence[0] and it seems that on macOS, claude stores this token inside Keychain. I'm not sure there's a way to expose that to a container... my guess would be no, especially since it seems the container is Linux, and also because keeping the Keychain out of reach of containers seems like it would be paramount. But someone might know better!
0: https://github.com/Use-Tusk/fence
True. There’s a setting for Claude code though where you can add apiKeyHelper which is a script you add that gets the token for Claude Code. I imagine you can use that but haven’t quite figured out how to wire it up
Wow, thanks for posting that, news to me! In this case I don’t understand why there was a whole brouhaha with OpenClaw and the like - I guess they were invoking it without the official SDK? Because this makes it seem like if you have the sub you can build any agentic thing you like and still use your subscription, as long as you can install and login to Claude code on the machine running it.
Tons of chatter on Twitter making it sound like you'll get permabanned for doing this but... 1) how would they know if my requests are originating from Claude Code vs. OpenClaw? 2) how are we violating... anything? I'm working within my usage limits...
$70 or whatever to check if there's milk... just use your Claude Max subscription.
> how would they know if my requests are originating from Claude Code vs. OpenClaw
How wouldn't they know? Claude Code is proprietary they can put whatever telemetry they want in there.
> how are we violating... anything? I'm working within my usage limits...
It's well known that Claude code is heavily discounted compared to market API rates. The best interpretation of this is that it's a kind of marketing for their API. If you are not using Claude code for what it's intended for, then it's violating at least the spirit of that deal.
Was there a brouhaha with OpenClaw or was that with OpenCode?
It was with OpenCode, but a LOT of the commentariat is insisting that running OpenClaw through subscription creds instead of API is out of TOS and will get you banhammered.
I think you’re right and it was OpenCode. The semantic collisions are going to becpme more of a problem in the coming Cambrian explosion of software
One of the things that makes Clawdbot great is the allow all permissions to do anything. Not sure how those external actions with damaging consequences get sandboxed with this.
Apple containers have been great especially that each of them maps 1:1 to a dedicated lightweight VM. Except for a bug or two that appeared in the early releases, things seem to be working out well. I believe not a lot of projects are leveraging it.
A general code execution sandbox for AI code or otherwise that used Apple containers is https://github.com/instavm/coderunner It can be hooked to Claude code and others.
> One of the things that makes Clawdbot great is the allow all permissions to do anything.
Is this materially different than giving all files on your system 777 permissions?
It's vastly different.
It's more (exactly?) like pulling a .sh file hosted on someone else's website and running it as root, except the contents of the file are generated by a LLM, no one reads them, and the owner of the website can change them without your knowledge.
I feel like a lot of non technical people who are vibe coding or vibe using these models, focus on hallucinations and believe that as the hallucinations are reduced in benchmarks, and over estimate their ability to create safe prompts that will keep these models in line.
I think most people fail to estimate the real threat that malicious prompts can cause because it is not that common, its like when credit cards were launched, cc fraud and the various ways it could be perpetrated followed not soon after. The real threats aren’t visible yet but rest assured there are actors working to take advantage and many unfortunate examples will be seen before general awareness and precaution will prevail….
Thanks! Was hoping someone would do something more sane like this.
Openclaw is very useful, but like you I share the sentiment of it being terrifying, even before you introduce the social network aspect.
My Mac mini is currently literally switched off for this very reason.
I think these days if I’m going to be actively promoting code I’ve created (with Claude, no shade for that), I’ll make sure to write the documentation, or at the very least the readme, by hand. The smell of LLM from the docs of any project puts me off even when I like the idea of the project itself, as in this case. It’s hard to describe why - maybe it feels like if you care enough to promote it, you should care to try and actually communicate, person to person, to the human being promoted at. Dunno, just my 2c and maybe just my own preference. I’d rather read a typo-ridden five line readme explaining the problem the code is there to solve for you and me,the humans, not dozens of lines of perfectly penned marketing with just the right number of emoji. We all know how easy it is to write code these days. Maybe use some of that extra time to communicate with the humans. I dunno.
Edit: I see you, making edits to the readme to make it sound more human-written since I commented ;) https://github.com/gavrielc/nanoclaw/commit/40d41542d2f335a0...
OP here. Appreciate your perspective but I don't really accept the framing, which feels like it's implying that I've been caught out for writing and coding with AI.
I don't make any attempt to hide it. Nearly every commit message says "Co-Authored-By: Claude Opus 4.5". You correctly pointed out that there were some AI smells in the writing, so I removed them, just like I correct typos, and the writing is now better.
I don't care deeply about this code. It's not a masterpiece. It's functional code that is very useful to me. I'm sharing it because I think it can be useful to other people. Not as production code but as a reference or starting point they can use to build (collaboratively with claude code) functional custom software for themselves.
I spent a weekend giving instructions to coding agents to build this. I put time and effort into the architecture, especially in relation to security. I chose to post while it's still rough because I need to close out my work on it for now - can't keep going down this rabbit hole the whole week :) I hope it will be useful to others.
BTW, I know the readme irked you but if you read it I promise it will make a lot more sense where this project is coming from ;)
Hey, you do you, I’m glad you appreciate my perspective. I wasn’t trying to catch you out but I see how it came across that way - I apologise for my edit, I had hoped the ;) would show that I meant it in jest rather than in meanness but I shouldn’t have added it in the first place.
As I said in my comment, no shade for writing the code with Claude. I do it too, every day.
I wasn’t “irked” by the readme, and I did read it. But it didn’t give me a sense that you had put in “time and effort” because it felt deeply LLM-authored, and my comment was trying to explore that and how it made me feel. I had little meaningful data on whether you put in that effort because the readme - the only thing I could really judge the project by - sounded vibe coded too. And if I can’t tell if there has been care put into something like the readme how can I tell if there’s been care put into any part of the project? If there has and if that matters - say, I put care into this and that’s why I’m doing a show HN about it - then it should be evident and not hidden behind a wall of LLM-speak! Or at least; that’s what I think. As I said in a sibling comment, maybe I’m already a dinosaur and this entire topic won’t matter in a few years anyway.
Project releases with llms have grown to be less about the functionality and more about convincing others to care.
Before the proof of work of code in a repo by default was a signal of a lot of thought going into something. Now this flood of code in these vibe coded projects is by default cheap and borderline meaningless. Not throwing shade or anything at coding assistants. Just the way it goes
I agree 100% with you. It's even worse though. They haven't checked if the Readme has hallucinated it or not (spoiler: it has):
https://news.ycombinator.com/item?id=46850317
I don’t want to come off like I’m shitting on the poster here. I’ve definitely made that kind of careless mistake, probably a dozen times this week. And maybe we’re heading to a future where nobody even reads the readme anymore because they won’t be needed because an agent can just conjure one from the source code at will, so maybe it actually straight up doesn’t matter. I’ve just been thinking about what it means to release software nowadays, and I think the window for releasing software for clout and credit is closing, since creating software basically requires a Claude subscription and an idea now, so fewer people are impressed by the thing simply existing, and the standard of care for a project released for that aim (of clout) needs to be higher than it maybe needed to be in the past. But who knows, I’m probably already a dinosaur in today’s world, and I really don’t mean to shit on the OP - it’s a good idea for a project and it makes a lot of sense for it to exist. I just can’t tell if any actual care has gone into it, and if not, why promote?
If you run openclaw on a spare laptop or VM and give it read only access to whatever it needs, doesn’t that eliminate most of the risk?
If you're letting it communicate with the outside world, you risk the leak and abuse of anything sensitive in the data it has access to.
Interesting choice to use native Apple Containers over Docker.
I assume this is to keep the footprint minimal on a Mac Mini without the overhead of the Docker VM, but does this limit the agent's ability to run standard Linux tooling? Or are you relying on the AI to just figure out the BSD/macOS equivalents of standard commands?
If only there were some way to answer your own question. Maybe with some kind of engine that searches.
omg these ai generated comments are maddening lol
What makes you think it's an AI comment?
Or is this just so hastily thrown together that the Quick Start is a hallucination?
That's not a facetious question, given this project's declared raison d'etre is security and the subtle implication that OpenClaw is an insecure unreviewed pile of slop.
Claude hallucinated that repo here in this commit https://github.com/gavrielc/nanoclaw/commit/dbf39a9484d9c66b...
I like that Claude's hypothesis was that Anthropic created openclaw and this anti-openclaw :)
> This is the anti-[OpenClaw](https://github.com/anthropics/openclaw).
Fixed, thanks. Claude Code likes to insert itself and anthropic everywhere.
If it somehow wasn't abundantly clear: this is a vibe coded weekend project by a single developer (me).
It's rough around the edges but it fits my needs (talking with claude code that's mounted on my obsidian vault and easily scheduling cron jobs through whatsapp). And I feel a lot better running this than a +350k LOC project that I can't even begin to wrap my head around how it works.
This is not supposed to be something other people run as is, but hopefully a solid starting point for creating your own custom setup.
Seems to be fixed now
To be honest, when I see many vibecoded apps, I just build my own duplicate with Claude Code. It's not that useful to use someone else's vibecode. The idea is enough, or the evidence that it works for someone else means I can just build it myself with Claude Code and I can make it specific to my needs.
I like the idea of a smaller version of OpenClaw.
Minor nitpick, it looks like about 2500 lines of typescript (I am on a mobile device, so my LOC estimate may be off). Also, Apple container looks really interesting.
Can you use MCP tools? I saw that with open claw they moved away from that which I personally didn't like but
I somewhat like the idea of not using MCP as much as it is being hyped.
It's certainly helpful for some things, but at the same time - I would rather improved CLI tools get created that can be used by humans and llm tools alike.
It uses a wrapper in places to consume MCPs as clis.
The singularity, but instead successive exponential improvement, its excessive exponential slop which passes the Turing test for programmers.
lol, I might finally have to upgrade my Mac mini to Tahoe. Yofi.