If you're interacting with stateful systems (which you usually are with this kind of command), --dry-run can still have a race condition.
The tool tells you what it would do in the current situation, you take a look and confirm that that's alright. Then you run it again without --dry-run, in a potentially different situation.
That's why I prefer Terraform's approach of having a "plan" mode. It doesn't just tell you what it would do but does so in the form of a plan it can later execute programmatically. Then, if any of the assumptions made during planning have changed, it can abort and roll back.
As a nice bonus, this pattern gives a good answer to the problem of having "if dry_run:" sprinkled everywhere: You have to separate the planning and execution in code anyway, so you can make the "just apply immediately" mode simply execute(plan()).
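A minimal sketch of that shape (the names and the injected `stat`/`remove` hooks are illustrative, not from any real tool): the plan records the state each assumption was based on, applying it re-checks that state first, and "just apply immediately" really is execute(plan()).

```python
def plan_removal(paths, stat):
    """Planning phase: pair each path with the state observed now.

    `stat` is any function mapping a path to a comparable state token
    (e.g. an mtime or a content hash); it is injected so the sketch
    stays testable.
    """
    return [(path, stat(path)) for path in paths]

def apply_plan(plan, stat, remove):
    """Execution phase: re-check every planning assumption, then act."""
    for path, seen in plan:
        if stat(path) != seen:
            raise RuntimeError(f"{path} changed since planning; aborting")
    for path, _ in plan:
        remove(path)

def run(paths, stat, remove, dry_run=False):
    plan = plan_removal(paths, stat)
    if dry_run:
        return plan  # the plan itself is the dry-run output
    apply_plan(plan, stat, remove)
    return plan
```

The dry-run branch shrinks to "stop after planning", and the race window shrinks to the moment between the re-check and the action.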
I think it's configurable, but my experience with Terraform is that by default, when you `terraform apply`, it refreshes state, which seems to be tantamount to running a new plan. I.e., it's not simply executing what's in the plan; it's effectively running a fresh plan and using that. The plan is more like a preview.
That is the default, but the correct (and poorly documented and supported) way to use terraform is to save the plan and re-use it when you apply. See the -out parameter to terraform plan, and then never apply again without it.
The snippet I have that demonstrates the plan-then-execute pattern is this:

    import glob
    import os

    def gather(paths):
        files = []
        for pattern in paths:
            files.extend(glob.glob(pattern))
        return files

    def execute(files):
        for f in files:
            os.remove(f)

    files = gather([os.path.join(tmp_dir, "*.txt")])
    if dryrun:
        print(f"Would remove: {files}")
    else:
        execute(files)
I introduced dry-run at my company and I've been happy to see it spread throughout the codebase, because it's a coding practice that more than pays for itself.
>That's why I prefer Terraform's approach of having a "plan" mode. It doesn't just tell you what it would do but does so in the form of a plan it can later execute programmatically. Then, if any of the assumptions made during planning have changed, it can abort and roll back.
Not to take anything away from your comment, but just to add a related story... the previous big AWS outage had an unforeseen race condition between their DNS planner and DNS executor:
>[...] Right before this event started, one DNS Enactor experienced unusually high delays needing to retry its update on several of the DNS endpoints. As it was slowly working through the endpoints, several other things were also happening. First, the DNS Planner continued to run and produced many newer generations of plans. Second, one of the other DNS Enactors then began applying one of the newer plans and rapidly progressed through all of the endpoints. The timing of these events triggered the latent race condition. When the second Enactor (applying the newest plan) completed its endpoint updates, it then invoked the plan clean-up process, which identifies plans that are significantly older than the one it just applied and deletes them. At the same time that this clean-up process was invoked, the first Enactor (which had been unusually delayed) applied its much older plan to the regional DDB endpoint, overwriting the newer plan. The check that was made at the start of the plan application process, which ensures that the plan is newer than the previously applied plan, was stale by this time due to the unusually high delays in Enactor processing. [...]
I was thinking that he's describing implementing an initial algebra for a functor (≈AST) and an F-Algebra for evaluation. But I guess those are different words for the same things.
This is why I think things like devops benefit from the traditional computer science education. Once you see the pattern, whatever project you were assigned looks like something you've done before. And your users will appreciate the care and attention.
Yeah any time you're translating "user args" and "system state" to actions + execution and supporting a "dry run" preview it seems like you only really have two options: the "ad-hoc quick and dirty informal implementation", or the "let's actually separate the planning and assumption checking and state checking from the execution" design.
I had a similar (but not as good) thought, which was to separate the action from the planning in code, then inject the action system. So `--dry-run` would inject a ConsoleOutput() action interface, while without it a LiveExecutor() is passed (I'm sure there's a better name).
Assuming our system is complex enough. I guess it sits between if dry_run and execute(plan()) in its complexity.
Yes! I'm currently working on a script that modifies a bunch of sensitive files, and this is the approach I'm taking to make sure I don't accidentally lose any important data.
I've split the process into three parts:
1. Walk the filesystem, capture the current state of the files, and write out a plan to disk.
2. Make sure the state of the files from step 1 has not changed, then execute the plan. Capture the new state of the files. Additionally, log all operations to disk in a journal.
3. Validate that no data was lost or unexpectedly changed using the captured file state from steps 1 and 2. Manually look at the operations log (or dump it into an LLM) to make sure nothing looks off.
These three steps can be three separate scripts, or three flags to the same script.
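A sketch of the drift check that steps 1 and 2 rely on, with a hypothetical injected `read` function standing in for the real filesystem:

```python
import hashlib

def snapshot(paths, read):
    """Step 1 helper: capture a content hash per file so later steps can detect drift.

    `read` maps a path to its bytes; injected so the sketch works against
    a real filesystem (e.g. lambda p: open(p, "rb").read()) or a fake one.
    """
    return {p: hashlib.sha256(read(p)).hexdigest() for p in paths}

def unchanged(paths, snap, read):
    """Step 2 guard: refuse to execute if anything drifted since planning."""
    return snapshot(paths, read) == snap
```

Serializing the snapshot dict to disk alongside the plan gives step 3 the "before" state to validate against.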
I like that idea! For an application like Terraform, Ansible or the like, it seems ideal.
For something like in the article, I’m pretty sure a plan mode is overkill though.
Planning mode must involve making a domain specific language or data structure of some sort, which the execution mode will interpret and execute. I’m sure it would add a lot of complexity to a reporting tool where data is only collected once per day.
It's not strictly related to the original theme, but I want to mention this.
Ansible's implementation is okay, but not perfect (plus, this is difficult to implement properly). For cases like file changes it works, but if you install a package and rely on it later, the --check run will fail. So I find myself adding conditions like "is this a --check run?"
Ansible is treated as an idempotent tool, which it's not. If I delete a package from the list, then it will pollute the system until I create a set of "tearing-down" jobs.
No need to overthink it. In any semi-modern language you can (de)serialize anything to and from JSON, so it's really not that hard. The only thing you need to do is have a representation for the plan in your program. Which I will argue is probably the least error-prone way to implement --dry-run anyway (as opposed to sprinkling branches everywhere).
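For illustration, a tiny plan representation along these lines (the `Step` shape and verbs are made up for the example):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Step:
    verb: str   # e.g. "remove", "rename" -- hypothetical verbs
    args: list  # plain JSON-friendly values only

def dump_plan(steps):
    """Serialize the plan for review, storage, or a later apply step."""
    return json.dumps([asdict(s) for s in steps], indent=2)

def load_plan(text):
    """Reconstruct the plan; the executor then dispatches on each verb."""
    return [Step(**d) for d in json.loads(text)]
```

A round trip through JSON is lossless as long as the arguments stay JSON-friendly, which is the real constraint this approach imposes.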
> you can (de)serialize anything to and from JSON, so it's really not that hard
First, it is hard, especially in an at least somewhat portable manner.
Second, serialization only matters if you cannot (storage, IPC) pass data around in-memory anyway. That's not the problem raised, though. Whatever the backing implementation, the plan, ultimately, consists of some instructions (verbs in parent) over objects (arguments in parent). Serializing instructions any other way than dropping non-portable named references requires one to define execution language, which is not an easy feat.
> The only thing you need to do is have a representation for the plan in your program.
That "only" is doing heavier lifting than you probably realize. Such a representation, which is by the way specified to be executable bidirectionally (roll-back capabilities), is a full-blown program, so you end up implementing a language spec, codegen, and execution engines. In cases of relatively simple business models, that is going to be the majority of the engineering effort.
> First, it is hard, especially in at least somewhat portable manner.
I'm curious what portability concerns you've run into with JSON serialization. Unless you need to deal with binary data for some reason, I don't immediately see an issue.
> Such representation, which is by the way specified to be executable bidirectionally (roll back capabilities), is a full blown program
Of course this depends on the complexity of your problem, but I'd imagine this could be as simple as a few configuration flags for some problems. You have a function to execute the process that takes the configuration and a function to roll back that takes the same configuration. This does tie the representation very closely to the program itself so it doesn't work if you want to be able to change the program and have previously generated "plans" continue to work.
Right, but you still have to define every "verb" your plan will have, their "arguments", etc. No need to write a parser (even Java can serialize/deserialize stuff), as you say, but you have to meta-engineer the tool, not just script a series of commands.
> That's why I prefer Terraform's approach of having a "plan" mode. It doesn't just tell you what it would do but does so in the form of a plan it can later execute programmatically. Then, if any of the assumptions made during planning have changed, it can abort and roll back.
And how do you imagine doing that for the "rm" command?
If the changes your command makes are strictly to a relational database, then `--dry-run` becomes quite easy to implement: just start a transaction and never commit it. This avoids polluting the entire command with `if dryRun` checks everywhere. I've found this to be a great approach.
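A sketch of the idea with sqlite3 (the table name and query are invented for the example; any relational database with transactions works the same way):

```python
import sqlite3

def deactivate_stale_users(conn, dry_run=False):
    """Run the update inside a transaction; a dry run simply never commits."""
    try:
        cur = conn.execute(
            "UPDATE users SET active = 0 WHERE last_login < '2020-01-01'"
        )
        affected = cur.rowcount  # report what the change would touch
        if dry_run:
            conn.rollback()      # discard everything; the DB is untouched
        else:
            conn.commit()
        return affected
    except Exception:
        conn.rollback()
        raise
```

The inner code path is identical in both modes; the only difference is the final commit/rollback decision.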
One thing that I didn't see mentioned is making your own dry run for tools that don't have it built in.
It doesn't always work but sometimes I use `diff` to help with that. For example, if you have a complicated `sed` replacement that you plan to run on a file, you can use diff like this `diff -u <(echo "hello") <(echo "hello" | sed "s/hello/hi/g")` to help show what has changed.
I like the opposite too: -commit or -execute, where it is assumed that running with defaults is immutable, like a dry run, simplifying validation complexity and making the go-live explicit.
I've biased towards this heavily in the last 8 or so years now.
I've yet to have anyone mistakenly modify anything when they need to pass --commit, whereas I've repeatedly had people accidentally modify stuff because they forgot --dry-run.
Totally agree it shouldn't be for basic tools; but if I'm ever developing a script that performs any kind of logic before reaching out to a DB or vendor API and modifies 100k user records, creating a flag to just verify the sanity of the logic is a necessity.
For most of these local data manipulation type of commands, I'd rather just have them behave dangerously, and rely on filesystems snapshots to rollback when needed. With modern filesystems like zfs or btrfs, you can take a full snapshot every minute and keep it for a while to negate the damage done by almost all of these scripts. They double as a backup solution too.
Yeah, but that's because it's implemented poorly. It literally asks you to confirm deletion of each file individually, even for thousands of files.
What it should do is generate a user-friendly overview of what's to be deleted, by grouping files together by some criteria, e.g. by directory, so you'd only need to confirm a few times regardless of how many files you want to delete.
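Something like this grouping is cheap to sketch (the function name is mine, not anyone's actual implementation):

```python
import os
from collections import Counter

def deletion_overview(paths):
    """Summarize doomed files by directory for a short, reviewable prompt."""
    counts = Counter(os.path.dirname(p) or "." for p in paths)
    return [f"{d}: {n} file(s)" for d, n in sorted(counts.items())]
```

The user confirms a handful of directory-level lines instead of thousands of individual files.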
I find it important to include system information in here as well, so just copy-pasting an invocation from system A to system B does not run.
For example, our database restore script has a parameter `--yes-delete-all-data-in` and it needs to be parametrized with the PostgreSQL cluster name. So a command with `--yes-delete-all-data-in=pg-accounting` works on exactly one system and not on other systems.
My latest script which deletes the entire content of a downloaded Sharepoint (locally only) and the relevant MS365 account from the computer runs by default in a read-only mode. You have to run it with an explicit flag to allow for changes.
Also, before it actually deletes the account, you need to explicitly type DELETE-ACCOUNT in order to confirm that this is indeed your intent.
So far, nobody managed to screw up, even in heated situations at client's place.
I like this pattern a lot, but it's important that the code in the dry path is representative. I've been bitten a few too many times by dry code that just runs `print("would have updated ID: 123")`, but not actually running most of the code in the hot path. Then when I run it for real, some of the prep for the write operation has a bug / error, so my dry run didn't actually reveal much to me.
Put another way: your dry code should do everything up until the point that database writes / API calls / etc actually happen. Don't bail too early
This is why I end up creating an interface for the calls that perform modifications. Then I have one implementation that logs and one implementation that does the actual work. I end up with the output being as representative as possible as to what would happen. I also feel a lot more comfortable that a dry run truly won't write anything when the only class that could ever actually write anything is not even instantiated in dry run mode. I don't get that same comfort when there's a ton of branches sprinkled throughout checking for a dry run flag.
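A minimal version of that interface split, with invented names:

```python
import os

class LiveExecutor:
    """The only class that ever touches the real system."""
    def remove(self, path):
        os.remove(path)

class DryRunExecutor:
    """Same interface, but only records what would have happened."""
    def __init__(self):
        self.log = []
    def remove(self, path):
        self.log.append(f"would remove {path}")

def cleanup(files, executor):
    # The business logic never branches on a dry-run flag; it just
    # runs the same path against whichever executor was injected.
    for f in files:
        executor.remove(f)
```

Because `cleanup` is identical in both modes, the dry run exercises all of the logic up to the final side effect.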
Doesn’t this conflate dry-running with integration testing? AFAIK the purpose of a dry run is to understand what will happen, not to test what will happen. For the latter we have testing.
> AFAIK the purpose of a dry run is to understand what will happen, not to test what will happen. For the latter we have testing.
Not really. Testing is a way to increase confidence that code does what it is specified to do, because it is cheaper than full-blown formal analysis :)
The problem raised by OP here is granularity. An operation like `update(record, field, value)` is itself a tree of smaller sub-operations that may do some permissions checking, locking, network calls, even checking for presence of the record if it has upsert semantics, all of which could fail. A dry run with a plan that is too coarse can succeed while the actual operation fails over things left unchecked.
For little scripts, I'm not writing unit tests- running it is the test. But I want to be able to iterate without side effects, so it's important that the dry mode be as representative as possible for what'll happen when something is run for real.
You understand how subjective that is, right? Someone might expect that the database doesn't do the last commit step, while other people are perfectly happy that the database engine checks that it has enough writing permissions and is running as a user that can start the process without problems.
Sure, where you draw the line will vary between projects. As long as its exact placement doesn't matter too much.
For me personally, I tend to draw the line at write operations. So in your example, I'd want a dry run to verify the permissions that it can (if I expect those to be a problem). But if that can't easily be done without a write, then maybe it's not worth it. There are also situations where you want a dry run to be really fast, so you forego some checks (allowing for more surprises later). Really just depends.
In order to make it work without polluting the code-base I find that I have to move the persistence into an injectable strategy, which makes it good anyway. If you keep passing in `if dry_run:` everywhere you're screwed.
Also, if I'm being honest, it's much better to use `--wet-run` for the production run than to ask people to run `--dry-run` for the test run. Less likely to accidentally fire off the real stuff.
One nice way to do things, if you can get away with it, is to model the actions your application takes explicitly, and pass them to a central thing that actually handles them. Then there can be one place in your code that actually needs to understand whether it's doing a dry run or not. Ideally this would be just returning them from your core logic, "functional core, imperative shell" style.
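A toy sketch of that shape (the record fields and verbs are placeholders):

```python
def core(records):
    """Functional core: pure logic that returns intended actions as data."""
    return [("deactivate", r["id"]) for r in records if r["stale"]]

def shell(actions, apply, dry_run=False):
    """Imperative shell: the single place that knows about dry-run."""
    for verb, arg in actions:
        if dry_run:
            print(f"would {verb} {arg}")
        else:
            apply(verb, arg)
```

All the interesting logic lives in `core`, which is trivially unit-testable, while the dry-run decision exists in exactly one place.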
I totally agree with both this and the comment you replied to. The common thread is that you can architect the application in such a way that dry vs. wet running can be handled transparently, and in general these are just good designs.
we rely on this separation for our agents. the llm outputs a declarative json plan that gets validated against security policies before the runtime executes any side effects. catching a bad command during validation is way cheaper than rolling back a container.
That undo program is called nilfs2, which unfortunately never became popular. I'll simply quote the kernel docs:
> NILFS2 is a log-structured file system (LFS) supporting continuous snapshotting. In addition to versioning capability of the entire file system, users can even restore files mistakenly overwritten or destroyed just a few seconds ago.
No rule is ironclad. I think matching risk to functionality is usually a good idea. For example, most modern Linuxen carry `rm` protected against root removal with `--no-preserve-root`. That is indeed `rm --wet-run` by a different name in the dangerous case.
Sure, in those cases - but if a command has a chance of nuking prod, you want some extra step in there. Preferably something that can't be muscle-memoried through.
I don't like the sound of `--wet-run`, but on more than one occasion I've written tools (and less frequently services) that default to `dry-run` and require `--no-dry-run` to actually make changes.
For services, I prefer having them detect where they are running. Ie if it's running in a dev environment, it's going to use a dev db by default.
Design patterns are one of those things where you have to go through the full cycle to really use it effectively. It goes through the stages:
no patterns. -> Everything must follow the gang of four's patterns!!!! -> omg I can't read code anymore I'm just looking at factories. No more patterns!!! -> Patterns are useful as a response to very specific contexts.
I remember being religious about strategy patterns on an app I developed once where I kept the db layer separated from the code so that I could do data management as a strategy. Theoretically this would mean that if I ever switched DBs it would be effortless to create a new strategy and swap it out using a config. I could even do tests using in memory structures instead of DBs which made TDD ultra fast.
DB switchover never happened and the effort I put into maintaining the pattern was more than the effort it would have taken me to swap a db out later :,) .
There's some truth to this, since some design patterns can simply be implemented "for good" in a sufficiently powerful language, but I don't find it's true in general. Unfortunately, it has become something of a thought-terminating cliché. Some common design patterns are so flexible that if you really implemented them in full generality as, say, some library function, its interface would be so complex that it likely wouldn't be a net win.
> Some common design patterns are so flexible that if you really implemented them in full generality as, say, some library function, its interface would be so complex that it likely wouldn't be a net win
Then I would say you have not arrived at the optimal solution. Keep looking.
A while ago when I was working with PowerShell a lot, I got spoiled by the easy inclusion of `-DryRun` flags in all my scripts.
Nowadays I still use the technique for a lot of the tools I make. I typically do a Terraform-like approach: create a plan, validate the plan, render the plan to the user, exit early if we're in dry-run mode, and apply the plan as the final step. Whether dry-run is enabled by default depends on the risk of the operations and who will be using the tool.
This makes it exceedingly clear what actions the tool will do. Plus it has the added benefit of being able to split the plan and apply steps into two separate steps. For example, you can create a plan, serialize it to JSON, store it (e.g. in VCS, Jira ticket, whatever) then apply it during a change window.
    plan = createPlan()
    print(plan.string())
    if (dryRun) {
        return
    }
    plan.apply()
I usually do the opposite and add a --really flag to my CLI utilities, so that they are read-only by default and extra effort is needed to screw things up.
I've committed "--i-meant-that" (for a destroy-the-remote-machine command that normally, without the arg, gives you a message and 10s to hit ^C if you're not sure) for some particularly impatient coworkers. It never ended up being used inappropriately, which is luck (but we never quantified how much luck :-)
One of the kick-ass features of PowerShell is that you only need to add `[CmdletBinding(SupportsShouldProcess)]` to get the `-WhatIf` dry-run for your functions.
Powershell cmdlet macros (and the resultant parameter consistency with flags and verbs) are a great example that should make its way over to bash/zsh scripts. Given we are rewriting so many Unix utilities, it would be a great time to learn from a positive example.
Even better, it enables both -WhatIf and -Confirm, and provides a ShouldProcess function that interacts with both and with the user's impact threshold preferences. Really cool.
In one (internal) CLI I maintain, I actually put the `if not dry_run:` inside the code which calls the REST API, because I have a setting to log HTTP calls as CURL commands, and that way in dry-run mode I can get the HTTP calls it would have made without it actually making them.
And this works well if your CLI command is simply performing a single operation, e.g. call this REST API
But the moment it starts to do anything more complex: e.g. call API1, and then send the results of API1 to API2 – it becomes a lot more difficult
Of course, you can simulate what API1 is likely to have returned; but suddenly you have something a lot more complex and error-prone than just `if not dry_run:`
Having 1 place (or just generally limiting them) that does the things keeps the dry_run check from polluting the entire codebase. I maintain a lot of CLI tooling that's run by headless VMs in automation pipelines and we do this with basically every single tool.
I prefer “--really-do”, so the default behaviour of the tool is to do nothing. That’s more fault-tolerant for the scenario where you forget to add “--dry-run”.
For me the ideal case is three-state. When run interactively with no flags, print a dry run result and prompt the user to confirm the action; and choose a default for non-interactive invocations. In both cases, accept either a --dry-run or a --yes flag that indicates the choice to be made.
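One way to sketch that three-state decision (the names and the "safe default is dry" choice for non-interactive runs are mine):

```python
import sys

def decide(dry_run=False, yes=False, interactive=None, ask=input):
    """Return 'dry' or 'run' given flags, TTY state, and (maybe) a prompt."""
    if dry_run:
        return "dry"
    if yes:
        return "run"
    if interactive is None:
        interactive = sys.stdin.isatty()
    if not interactive:
        return "dry"  # assumed safe default for non-interactive invocations
    answer = ask("Shown above was a dry run. Proceed for real? [y/N] ")
    return "run" if answer.strip().lower() == "y" else "dry"
```

The `ask` and `interactive` parameters are injectable so the decision logic itself stays testable without a terminal.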
This should always be included in any application that has a clear plan-then-execute flow, and it's definitely nice to have in other cases as well.
> The downside is that the dryRun-flag pollutes the code a bit. In all the major phases, I need to check if the flag is set, and only print the action that will be taken, but not actually doing it.
From the article: "I added –dry-run on a whim early on in the project. I was surprised at how useful I found it to be."
Not to be overly critical (I think it's great OP found value in adding and using --dry-run), but I am willing to bet that this was a suggestion/addition from a coding agent (and most likely Claude code/Opus). Having used it myself to build various CLI tools in different languages, it almost always creates that option when iterating on CLIs. To the point where it's almost a tell. I wonder if we're entering a moment of convergence where all the tools will have similar patterns/options because they are similarly written by agents.
> Early in the development process, when testing the incomplete application, I remembered that Subversion (the version control system after CVS, before Git) had a –dry-run option.
> I remembered how helpful that was, so I decided to add it to my command as well.
He mentions the reason he added it, and it's a compelling enough story to be true.
Of course and I am not trying to point fingers. But I do think it's interesting because it's also possible that it is confabulation. Not lying, but genuinely constructing coherent explanations for decisions whose true origins are different than we recall. I think working with coding agents has already made this immensely more common.
I had the equivalent of --dry-run in my kdecvs-build script from 2003 (where it was called --pretend) so it's not that spontaneous an idea that it must have been dreamed up by an AI.
Any time you have a script that needs to run for a long time or might involve destructive actions, having a way to run the script in a "tell me what you would do without actually doing it" mode is a fairly obvious user story to throw in.
Again, it's completely possible that OP and you are the wonderful exceptions (untouched and uninspired by coding agents) that have been using these patterns for as long as you can remember. My comment revolved around the psychological phenomenon, not whether dry-run is a clever/novel idea. It's about how we might tell ourselves stories about the origin of our ideas when working with those tools.
It seems to have originated in the US with Fire Departments:
> These reports show that a dry run in the jargon of the fire service at this period [1880s–1890s] was one that didn’t involve the use of water, as opposed to a wet run that did.
Interestingly, the one place I have seen "dry run" actually mean "dry run" is using an air compressor to check that a water loop (in a computer) doesn't leak, by seeing if there's no drop in pressure.
Funny, I recalled a tool called "molly-guard", which solves the problem of wanting to reboot a Unix server while possibly being on the wrong one. It asks you to type the server name.
Anybody who rebooted a wrong server can say that this tool is brilliant.
These kinds of obstacles don't work for any action that the user does repeatedly.
Dialogs that pop up and ask "Are you sure you want to delete ...?" -> users just automatically click yes, because they already did that the last 10 times and just want to get on with their work.
Logged in to server "alpha" instead of "delta" because you thought that's the right one. Tool asks you to write the server name. You type "alpha" because you know you're on alpha. Reboots the wrong server.
GitHub asks you to confirm the repo name before deleting by typing it into a text field. The user looks at what the repo name is and types it without thinking. Or, like lazy me, marks and drags the displayed name into the field, so you don't even have to type.
The point is, users already decided to do the action when they started. It's nearly impossible to consistently make them stop and re-evaluate their decision, because that's extremely high friction and annoying. They quickly learn to circumvent the friction as efficiently as possible (i.e. without thinking about it).
A better solution is to just do it, but let the user undo it if it was a mistake (not always possible of course).
> Or, like lazy me, mark and drag the displayed name into the field
Yes, the same, unfortunately. This rail guard works for only a couple of days or weeks when the perception is fresh. After that, it's always "select, copy, paste, confirm" without thinking.
I believe the better approach is to practice "Chaos Monkeys" and improve the infrastructure to survive even if someone makes a mistake.
> Github ask you to confirm the repo name before deleting by typing it into a text field. User looks at what the repo name is and types it without thinking. Or, like lazy me, mark and drag the displayed name into the field, so you don't even have to type.
Pretty easy to solve this sort of stuff with timers: "Sure, the repo will be deleted in <timer>. Press here to abort: <button>."
Or have the timers on the backend and just show the user an undo button.
I think dry-run mode is sometimes useful for many programs (and I sometimes do use them). In some cases you can use standard I/O so that it is not needed, because you can control what is done with the output. Sometimes you might miss something, especially if the code is messy, although security systems might help a bit. However, you can sometimes make the code less messy if the I/O is handled in a different way that makes this possible, e.g. by making the functions that make changes (the I/O parts of your program) handle them in a way that reduces the number of places you need to check for dry run, if only a few functions need to. My idea of a system with capability-based security would allow this (among many other benefits; a capability-based system has a lot of benefits beyond only the security system). Even with existing security mechanisms it can be done (e.g. with file permissions), although not as well as with capability-based security.
We have an internal framework for building migrations, and the "dry run" is a core part of the dev cycle. It allows you to test your replication plan and transformations without touching the target. Not to mention, a load that could take >24 hours completes in minutes.
I love `--dry-run` flags for CLI tooling I build. If you plan your applications around this kind of functionality upfront, I find it doesn't have to pollute your code too much. In a language like Go or Rust, I'll use an option/builder design pattern, and whatever I'm ultimately writing to (remote file system, database, pubsub, etc.) will instead write to a logger. I find this incredibly helpful in local dev, but it's also useful in production. Even with high test coverage, it can be a bit spooky to turn on a new, consequential feature, especially one that mutates data. I like to use dry run and enable this in our production envs just to ensure that things meet the functional and performance qualities we expect before actually enabling. This has definitely saved our bacon before (so many edge cases with prod data and request traffic).
I use a similar strategy for API design. Every API call is wrapped in a large database transaction, and I either roll back or commit the transaction based on dry-run or wet-run flags. This works well as long as you don’t need to touch the file system. I even wrap emails this way—emails are first written to a database queue, and an external process picks them up every few seconds.
The code is not littered with dry-run flag checks; the internal code doesn’t even know that a dry run is possible. Everything is rolled back at the end if needed.
All database referential integrity checks run correctly.
Some drawbacks: any audit logging should run in a separate transaction if you want to log dry runs.
    --nuke  Delete all of the archives stored. To protect against accidental
            data loss, tarsnap will ask you to type the text "No Tomorrow"
            when using the --nuke command.
I like to use a "--no-clobber" flag to set a script to not delete any information but reuse the previous configuration or files, and otherwise error out if that's not possible.
I like doing the same in CI jobs, like in Jenkins I'll add a DRY_RUN parameter, that makes the whole job readonly. A script that does the deployment would then only write what would be done.
Sort of a strange article. You don't see that many people _not_ praising --dry-run (speaking of which, the author should really learn to use long options with a double dash).
I only saw the em dash in the thread link, but I do know that an iPad "wants" to turn a double dash into an em dash automatically. I have no idea how to disable that default.
I would love to have this available in git. I know if you make mistakes you can use the reflog, but if you need 5 tries to get something right reading the reflog quickly becomes impossible. Plus there are operations, like rebase or merge, that feel the need to make 50 entries in the reflog.
I've resorted to copying the entire directory (including the .git part) and then trying on the copy. The issue is that I'm working on a C++ program that has a few gigabytes of data.
The way OverlayFS works is that there's a base directory. And then there's an overlay directory that only contains the changes. Docker is based on OverlayFS.
There are two main ways overlays are used. First, at build time: each line/command generates a new overlay based on the previous base, so when you do something like
FROM debian
RUN apt-get update
it creates a base from the debian image, and then creates an overlay that only contains the changes introduced by apt-get update.
If you use docker inspect on the image, you get JSON showing exactly where the overlay directories are; you just need to navigate to the overlay directory.
Second: at runtime. (Assuming you are not using volumes; and if you do use volumes, just make sure each volume starts out empty instead of sharing your host files.) OverlayFS is used for runtime file changes as well: the last image layer is used as a base, and every file changed during runtime is added to the runtime overlay. That filesystem won't be deleted if you only stop the Docker container; the runtime files will still be present, and you can reach them by docker inspecting the running container and then navigating the overlay fs as you would any directory.
You can also just use OverlayFS directly: as far as I recall, you use mount and umount while specifying the overlay filesystem type and parameters like lowerdir (the base), upperdir (the overlay), and workdir. Combining a chain of overlays is a bit more complex, but it's the same interface.
If you're interacting with stateful systems (which you usually are with this kind of command), --dry-run can still have a race condition.
The tool tells you what it would do in the current situation, you take a look and confirm that that's alright. Then you run it again without --dry-run, in a potentially different situation.
That's why I prefer Terraform's approach of having a "plan" mode. It doesn't just tell you what it would do but does so in the form of a plan it can later execute programmatically. Then, if any of the assumptions made during planning have changed, it can abort and roll back.
As a nice bonus, this pattern gives a good answer to the problem of having "if dry_run:" sprinkled everywhere: You have to separate the planning and execution in code anyway, so you can make the "just apply immediately" mode simply execute(plan()).
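A minimal sketch of that execute(plan()) shape, riffing on the file-removal example elsewhere in the thread (the verb names are made up):

```python
import os
import tempfile

def plan(tmp_dir):
    """Planning phase: inspect the world, decide actions, touch nothing."""
    return [("remove", os.path.join(tmp_dir, name))
            for name in sorted(os.listdir(tmp_dir)) if name.endswith(".txt")]

def execute(steps):
    """Execution phase: carry out a previously computed plan."""
    for verb, path in steps:
        if verb == "remove":
            os.remove(path)

tmp_dir = tempfile.mkdtemp()
open(os.path.join(tmp_dir, "a.txt"), "w").close()

steps = plan(tmp_dir)
print(steps)      # dry run: the plan is inspectable data, not side effects
execute(steps)    # "apply immediately" mode is literally execute(plan())
```

Dry-run mode simply stops after printing the plan, so there is no `if dry_run:` anywhere inside the logic itself.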
I think it's configurable, but my experience with Terraform is that by default, when you `terraform apply`, it refreshes state, which seems tantamount to running a new plan. I.e., it's not simply executing what's in the plan; it's effectively running a fresh plan and using that. The plan is more like a preview.
That is the default, but the correct (and poorly documented and supported) way to use terraform is to save the plan and re-use it when you apply. See the -out parameter to terraform plan, and then never apply again without it.
Totally agree, and this is covered in an (identically named?) Google Research blog [1].
Just last week I was writing a demo-focused Python file called `safetykit.py`, which has its first demo as this:
The snippet which demonstrates the plan-then-execute pattern I have is this:

    import glob, os

    def gather(paths):
        files = []
        for pattern in paths:
            files.extend(glob.glob(pattern))
        return files

    def execute(files):
        for f in files:
            os.remove(f)

    files = gather([os.path.join(tmp_dir, "*.txt")])
    if dryrun:
        print(f"Would remove: {files}")
    else:
        execute(files)

I introduced dry-run at my company and I've been happy to see it spread throughout the codebase, because it's a coding practice that more than pays for itself.

[1] https://www.gresearch.com/news/in-praise-of-dry-run/
>That's why I prefer Terraform's approach of having a "plan" mode. It doesn't just tell you what it would do but does so in the form of a plan it can later execute programmatically. Then, if any of the assumptions made during planning have changed, it can abort and roll back.
Not to take anything away from your comment, but just to add a related story... the previous big AWS outage had an unforeseen race condition between their DNS planner and DNS executor:
>[...] Right before this event started, one DNS Enactor experienced unusually high delays needing to retry its update on several of the DNS endpoints. As it was slowly working through the endpoints, several other things were also happening. First, the DNS Planner continued to run and produced many newer generations of plans. Second, one of the other DNS Enactors then began applying one of the newer plans and rapidly progressed through all of the endpoints. The timing of these events triggered the latent race condition. When the second Enactor (applying the newest plan) completed its endpoint updates, it then invoked the plan clean-up process, which identifies plans that are significantly older than the one it just applied and deletes them. At the same time that this clean-up process was invoked, the first Enactor (which had been unusually delayed) applied its much older plan to the regional DDB endpoint, overwriting the newer plan. The check that was made at the start of the plan application process, which ensures that the plan is newer than the previously applied plan, was stale by this time due to the unusually high delays in Enactor processing. [...]
previous HN thread: https://news.ycombinator.com/item?id=45677139
And just like that, you find yourself implementing a compiler (specs to plan) and a virtual machine (plan to actions)!
> And just like that, you find yourself implementing a compiler (specs to plan) and a virtual machine (plan to actions)!
Not just any compiler, but a non-typesafe, ad-hoc, informally specified grammar with a bunch of unspecified or under-specified behaviour.
Not sure if we can call this a win :-)
Greenspun's tenth rule in action!
I was thinking that he's describing implementing an initial algebra for a functor (≈AST) and an F-Algebra for evaluation. But I guess those are different words for the same things.
This is why I think things like devops benefit from the traditional computer science education. Once you see the pattern, whatever project you were assigned looks like something you've done before. And your users will appreciate the care and attention.
I think you're already doing that? The only thing that's added is serializing the plan to a file and then deserializing it to make the changes.
Yeah, any time you're translating "user args" and "system state" into actions + execution and supporting a "dry run" preview, it seems like you only really have two options: the ad-hoc, quick-and-dirty informal implementation, or the design that actually separates the planning, assumption checking, and state checking from the execution.
I had a similar (but not as good) thought, which was to separate the action from the planning in code, then inject the action system. So --dry-run would pass the ConsoleOutput() action interface, but without it passes a LiveExecutor() (I'm sure there's a better name).
Assuming our system is complex enough. I guess it sits between if dry_run and execute(plan()) in its complexity.
Yes! I'm currently working on a script that modifies a bunch of sensitive files, and this is the approach I'm taking to make sure I don't accidentally lose any important data.
I've split the process into three parts:
1. Walk the filesystem, capture the current state of the files, and write out a plan to disk.
2. Make sure the state of the files from step 1 has not changed, then execute the plan. Capture the new state of the files. Additionally, log all operations to disk in a journal.
3. Validate that no data was lost or unexpectedly changed using the captured file state from steps 1 and 2. Manually look at the operations log (or dump it into an LLM) to make sure nothing looks off.
These three steps can be three separate scripts, or three flags to the same script.
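A hypothetical sketch of steps 1 and 2 of that scheme (content hashes stand in for the captured file state; the journal and the step-3 validation are omitted for brevity):

```python
import hashlib
import json
import os
import tempfile

def snapshot(paths):
    """Record a content hash for every file the plan will touch."""
    return {p: hashlib.sha256(open(p, "rb").read()).hexdigest() for p in paths}

def execute_plan(plan_path):
    """Refuse to execute if any file changed since the plan was written."""
    with open(plan_path) as f:
        plan = json.load(f)
    if snapshot(plan["state"]) != plan["state"]:
        raise RuntimeError("files changed since planning; aborting")
    for path in plan["delete"]:
        os.remove(path)

# Step 1: walk the files and write the plan (plus observed state) to disk.
tmp = tempfile.mkdtemp()
victim = os.path.join(tmp, "old.log")
with open(victim, "w") as f:
    f.write("stale data")
plan_path = os.path.join(tmp, "plan.json")
with open(plan_path, "w") as f:
    json.dump({"state": snapshot([victim]), "delete": [victim]}, f)

# Step 2: verify nothing changed, then execute.
execute_plan(plan_path)
```

If anything touched `old.log` between the two steps, the hash comparison fails and nothing is deleted.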
I like that idea! For an application like Terraform, Ansible or the like, it seems ideal.
For something like in the article, I’m pretty sure a plan mode is overkill though.
Planning mode must involve making a domain specific language or data structure of some sort, which the execution mode will interpret and execute. I’m sure it would add a lot of complexity to a reporting tool where data is only collected once per day.
It's not strictly related to the original theme, but I want to mention this.
Ansible's implementation is okay, but not perfect (and this is difficult to implement properly). For cases like file changes it works, but if you install a package and rely on it later, the --check run will fail. So I find myself adding conditions like "is this a --check run?"
Ansible is treated as an idempotent tool, which it's not. If I delete a package from the list, it will pollute the system until I create a set of "tearing-down" jobs.
Probably, Nix is a better alternative.
No need to overthink it. In any semi-modern language you can (de)serialize anything to and from JSON, so it's really not that hard. The only thing you need to do is have a representation for the plan in your program. Which I will argue is probably the least error-prone way to implement --dry-run anyway (as opposed to sprinkling branches everywhere).
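A small sketch of the JSON round-trip being described, using dataclasses (the `Step` shape and verbs are illustrative, not any particular tool's format):

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict

@dataclass
class Step:
    verb: str      # e.g. "remove", "create"
    target: str    # the object the verb applies to

def save_plan(steps, path):
    with open(path, "w") as f:
        json.dump([asdict(s) for s in steps], f)

def load_plan(path):
    with open(path) as f:
        return [Step(**d) for d in json.load(f)]

plan = [Step("remove", "/tmp/a.txt"), Step("create", "/tmp/b.txt")]
path = os.path.join(tempfile.mkdtemp(), "plan.json")
save_plan(plan, path)
assert load_plan(path) == plan   # dataclasses compare field-by-field
```

This does tie the plan format to the current version of the program, which is the caveat raised downthread.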
> you can (de)serialize anything to and from JSON, so it's really not that hard
First, it is hard, especially in at least somewhat portable manner.
Second, serialization only matters if you cannot pass data around in-memory anyway (storage, IPC). That's not the problem raised, though. Whatever the backing implementation, the plan ultimately consists of some instructions (verbs in the parent) over objects (arguments in the parent). Serializing instructions any other way than dropping in non-portable named references requires one to define an execution language, which is no easy feat.
> The only thing you need to do is have a representation for the plan in your program.
That "only" is doing lifting heavier than you probably realize. Such representation, which is by the way specified to be executable bidirectionally (roll back capabilities), is a full blown program, so you end up implementing language spec, godegen and execution engines. In cases of relatively simple business models that is going to be the majority of the engineering effort.
> First, it is hard, especially in at least somewhat portable manner.
I'm curious what portability concerns you've run into with JSON serialization. Unless you need to deal with binary data for some reason, I don't immediately see an issue.
> Such representation, which is by the way specified to be executable bidirectionally (roll back capabilities), is a full blown program
Of course this depends on the complexity of your problem, but I'd imagine this could be as simple as a few configuration flags for some problems. You have a function to execute the process that takes the configuration and a function to roll back that takes the same configuration. This does tie the representation very closely to the program itself so it doesn't work if you want to be able to change the program and have previously generated "plans" continue to work.
Right, but you still have to define every "verb" your plan will have, their "arguments", etc. No need to write a parser (even Java can serialize/deserialize stuff), as you say, but you have to meta-engineer the tool, not just script a series of commands.
> That's why I prefer Terraform's approach of having a "plan" mode. It doesn't just tell you what it would do but does so in the form of a plan it can later execute programmatically. Then, if any of the assumptions made during planning have changed, it can abort and roll back.
And how do you imagine doing that for the "rm" command?
If the changes your command makes are strictly to a relational database, then `--dry-run` becomes quite easy to implement: just start a transaction and never commit it. This avoids polluting the entire command with `if dryRun` checks everywhere. I've found this to be a great approach.
One thing that I didn't see mentioned is making your own dry run for tools that don't have it built in.
It doesn't always work but sometimes I use `diff` to help with that. For example, if you have a complicated `sed` replacement that you plan to run on a file, you can use diff like this `diff -u <(echo "hello") <(echo "hello" | sed "s/hello/hi/g")` to help show what has changed.
I've written about the value of dry run too at: https://nickjanetakis.com/blog/cli-tools-that-support-previe...
I like the opposite too: -commit or -execute, where running with defaults is assumed to be as immutable as the dry run, simplifying validation complexity and making the go-live explicit.
I've biased towards this heavily in the last 8 or so years now.
I've yet to have anyone mistakenly modify anything when they need to pass --commit, while I've repeatedly had people accidentally modify stuff because they forgot --dry-run.
I wouldn’t want most things to work this way:
There is a time and a place for it, but it should not be the majority of use cases.

Totally agree it shouldn't be for basic tools; but if I'm ever developing a script that performs any kind of logic before reaching out to a DB or vendor API and modifies 100k user records, creating a flag to just verify the sanity of the logic is a necessity.
set -o noclobber
Yep. First thing I do for this kind thing is make a preview=true flag so I don’t accidentally run destructive actions.
For most of these local data-manipulation type commands, I'd rather just have them behave dangerously and rely on filesystem snapshots to roll back when needed. With modern filesystems like ZFS or Btrfs, you can take a full snapshot every minute and keep it for a while to negate the damage done by almost all of these scripts. They double as a backup solution too.
I used to have alias rm='rm -i' for a few years to be careful, but I took it out once I realised that I had just begun adding -f all the time
See also rm -I (capital i), which only prompts when deleting directories or >3 files
Yeah, but that's because it's implemented poorly. It literally asks you to confirm deletion of each file individually, even for thousands of files.
What it should do is generate a user-friendly overview of what's to be deleted, by grouping files together by some criteria, e.g. by directory, so you'd only need to confirm a few times regardless of how many files you want to delete.
Even in those basic examples, it probably would be useful. `cp` to a blank file? No problem. `cp` over an existing file? Yeah, I want to be warned.
`rm` a single file? Fine. `rm /`? Maybe block that one.
That last one would error without doing anything anyway because it's not recursive.
Uhuh:
--dry-run should default to true
In PowerShell there's a setting for this:
https://learn.microsoft.com/en-us/powershell/module/microsof...
There was a tool I used some time ago that required typing in a word or phrase to acknowledge that you know it's doing the run for real.
Pros and cons to each but I did like that because it was much more difficult to fat finger or absentmindedly use the wrong parameter.
Yeah I'm more of a `--wet-run` `-w` fan myself. But it does depend on how serious/annoying the opposite is.
I've done that, but I hate the term "wet run."
I use "live run" now, which I think gets the point across without being sort of uncomfortable.
--with-danger
--make-it-so
--do-the-thing
--go-nuts
--safety-off
So many fun options.
I'm a fan of --safety-off. It gives off an 'aim away from face' or 'mishandle me and I'll blow a chunk out of your DB' vibe.
I find it important to include system information in here as well, so just copy-pasting an invocation from system A to system B does not run.
For example, our database restore script has a parameter `--yes-delete-all-data-in` and it needs to be parametrized with the PostgreSQL cluster name. So a command with `--yes-delete-all-data-in=pg-accounting` works on exactly one system and not on other systems.
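A sketch of that guard in Python with argparse (the cluster name constant and function are hypothetical; in practice the name would come from local host configuration):

```python
import argparse

CLUSTER_NAME = "pg-accounting"   # hypothetical: read from local config in reality

def check_target(argv, cluster_name=CLUSTER_NAME):
    """Refuse to run unless the destructive flag names *this* system."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--yes-delete-all-data-in", dest="target", required=True)
    args = parser.parse_args(argv)
    if args.target != cluster_name:
        raise SystemExit(f"refusing: this host runs {cluster_name!r}, "
                         f"not {args.target!r}")
    return True

check_target(["--yes-delete-all-data-in=pg-accounting"])   # matches this host
# check_target(["--yes-delete-all-data-in=pg-staging"])    # would abort
```

Because the flag value must match the local cluster, a command line pasted from system A is inert on system B.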
It's in the UI not the command line, but I like Chromium's thisisunsafe
I've done a few --execute --i-know-what-im-doing for some more dangerous scripts
May I recommend --I-take-responsibility-for-the-outcome-of-proceeding and require a capital I?
--commit is solid too
Moist run is the way.
I have a parallel directory deduper that uses hard links and adopted this pattern exactly.
By default it'll only tell you which files are identical between the two parallel directory structures.
If you want it to actually replace the files with hard links, you have to use the --execute flag.
rmlint will only run a previous dry-run snapshot.
This is something I learnt here.
My latest script which deletes the entire content of a downloaded Sharepoint (locally only) and the relevant MS365 account from the computer runs by default in a read-only mode. You have to run it with an explicit flag to allow for changes.
Also, before it actually deletes the account, you need to explicitly type DELETE-ACCOUNT in order to confirm that this is indeed your intent.
So far, nobody managed to screw up, even in heated situations at client's place.
Just don’t randomly mix and match the approaches or you are in for a bad time.
I like this pattern a lot, but it's important that the code in the dry path is representative. I've been bitten a few too many times by dry code that just runs `print("would have updated ID: 123")`, but not actually running most of the code in the hot path. Then when I run it for real, some of the prep for the write operation has a bug / error, so my dry run didn't actually reveal much to me.
Put another way: your dry code should do everything up until the point that database writes / API calls / etc actually happen. Don't bail too early
This is why I end up creating an interface for the calls that perform modifications. Then I have one implementation that logs and one implementation that does the actual work. I end up with the output being as representative as possible as to what would happen. I also feel a lot more comfortable that a dry run truly won't write anything when the only class that could ever actually write anything is not even instantiated in dry run mode. I don't get that same comfort when there's a ton of branches sprinkled throughout checking for a dry run flag.
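A minimal sketch of that injected-implementation pattern (class and method names are made up):

```python
class LiveExecutor:
    """The only class that ever mutates anything."""
    def remove(self, path):
        import os
        os.remove(path)

class DryRunExecutor:
    """Same interface, but only records what would happen."""
    def __init__(self):
        self.log = []
    def remove(self, path):
        self.log.append(f"would remove {path}")

def cleanup(executor, paths):
    # The business logic runs identically in both modes; it never
    # sees a dry_run flag, only the injected executor.
    for p in paths:
        executor.remove(p)

dry = DryRunExecutor()
cleanup(dry, ["/tmp/a", "/tmp/b"])
print("\n".join(dry.log))
# would remove /tmp/a
# would remove /tmp/b
```

In dry-run mode the mutating class is never even instantiated, which is where the extra confidence comes from.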
Doesn't this conflate dry-running with integration testing? AFAIK the purpose of a dry-run is to understand what will happen, not to test what will happen. For the latter we have testing.
> AFAIK the purpose of a dry-run is to understand what will happen
Right - so the dry-run has to actually do as much of 'what will happen' as possible, except the actual things.
You want to put the check as far down, close to the 'action' as possible. You don't want any additional business logic gated by the dry run check.
> AFAIK the purpose of a dry-run is to understand what will happen, not to test what will happen. For the latter we have testing.
Not really. Testing is a way to increase confidence that code does what it is specified to do, because it is cheaper than full-blown formal analysis :)
The problem raised by OP here is granularity. An operation like `update(record, field, value)` is itself a tree of smaller sub-operations that may do permissions checking, locking, network calls, even checking for the presence of the record if it has upsert semantics, all of which could fail. A dry run with a plan that is too coarse can succeed while the actual operation fails over things left unchecked.
Yes, but it depends on the context.
For little scripts, I'm not writing unit tests- running it is the test. But I want to be able to iterate without side effects, so it's important that the dry mode be as representative as possible for what'll happen when something is run for real.
You understand how subjective that is, right? Someone might expect that the database doesn't do the last commit step, while other people are perfectly happy that the database engine checks that it has enough write permissions and is running as a user that can start the process without problems.
Sure, where you draw the line will vary between projects. As long as its exact placement doesn't matter too much.
For me personally, I tend to draw the line at write operations. So in your example, I'd want a dry run to verify the permissions that it can (if I expect those to be a problem). But if that can't easily be done without a write, then maybe it's not worth it. There are also situations where you want a dry run to be really fast, so you forego some checks (allowing for more surprises later). Really just depends.
I'd argue the dry run is a form of integration testing: Essentially the writes are mocked, but the reads are still functional.
In order to make it work without polluting the code-base, I find that I have to move the persistence into an injectable strategy, which makes the design better anyway. If you keep passing `if dry_run:` everywhere, you're screwed.
Also, if I'm being honest, it's much better to use `--wet-run` for the production run than to ask people to run `--dry-run` for the test run. Less likely to accidentally fire off the real stuff.
One nice way to do things, if you can get away with it, is to model the actions your application takes explicitly, and pass them to a central thing that actually handles them. Then there can be one place in your code that actually needs to understand whether it's doing a dry run or not. Ideally this would be just returning them from your core logic, "functional core, imperative shell" style.
I totally agree with both this and the comment you replied to. The common thread is that you can architect the application in such a way that dry vs. wet running can be handled transparently, and in general these are just good designs.
That’s what I prefer as well. A generation step and an execution step where the executor can be just a logger or the real deal.
We rely on this separation for our agents. The LLM outputs a declarative JSON plan that gets validated against security policies before the runtime executes any side effects. Catching a bad command during validation is way cheaper than rolling back a container.
I don't want to have to type rm --wet-run tempfile.tmp every time, or mkdir -p --yes-really-do-it /usr/local/bin
The program should default to actually doing whatever thing you're asking it to do.
On the other hand it would be great if every tool had an --undo argument that would undo the last thing that program did.
That undo program is called nilfs2, which unfortunately never became popular. I'll simply quote the kernel docs:
> NILFS2 is a log-structured file system (LFS) supporting continuous snapshotting. In addition to versioning capability of the entire file system, users can even restore files mistakenly overwritten or destroyed just a few seconds ago.
https://docs.kernel.org/filesystems/nilfs2.html
https://wiki.archlinux.org/title/NILFS2
https://en.wikipedia.org/wiki/NILFS
No rule is ironclad. I think matching risk to functionality is usually a good idea. For example, most modern Linuxes ship an `rm` that refuses to remove root unless you pass `--no-preserve-root`. That is indeed `rm --wet-run` by a different name for the dangerous case.
Sure, in those cases - but if a command has a chance of nuking prod, you want some extra step in there. Preferably something that can't be muscle-memoried through.
I don't like the sound of `--wet-run`, but on more than one occasion I've written tools (and less frequently services) that default to `dry-run` and require `--no-dry-run` to actually make changes.
For services, I prefer having them detect where they are running. Ie if it's running in a dev environment, it's going to use a dev db by default.
This is where design patterns come in handy, even though folks roll their eyes at them.
Design patterns are one of those things where you have to go through the full cycle to really use it effectively. It goes through the stages:
no patterns. -> Everything must follow the gang of four's patterns!!!! -> omg I can't read code anymore I'm just looking at factories. No more patterns!!! -> Patterns are useful as a response to very specific contexts.
I remember being religious about strategy patterns on an app I developed once where I kept the db layer separated from the code so that I could do data management as a strategy. Theoretically this would mean that if I ever switched DBs it would be effortless to create a new strategy and swap it out using a config. I could even do tests using in memory structures instead of DBs which made TDD ultra fast.
DB switchover never happened and the effort I put into maintaining the pattern was more than the effort it would have taken me to swap a db out later :,) .
What about the productivity gains from in memory db for tests though? Hard to measure I guess
Design patterns exist to paper over language deficiencies. Use a language which is not deficient.
There's some truth to this, since some design patterns can simply be implemented "for good" in a sufficiently powerful language, but I don't find it's true in general. Unfortunately, it has become something of a thought-terminating cliché. Some common design patterns are so flexible that if you really implemented them in full generality as, say, some library function, its interface would be so complex that it likely wouldn't be a net win.
> Some common design patterns are so flexible that if you really implemented them in full generality as, say, some library function, its interface would be so complex that it likely wouldn't be a net win
Then I would say you have not arrived at the optimal solution. Keep looking.
Just my two cents - but a general purpose language is going to need to be coupled with design patterns in order to be useful for different tasks.
I'm using MVC design patterns for some codebases, I'm using DDD plus Event sourcing and Event Driven for others.
I suspect that you are thinking of a small subset of design patterns (eg. Gang of Four derived patterns like Visitor, Strategy, or Iterator )
> I'm using MVC design patterns for some codebases, I'm using DDD plus Event sourcing and Event Driven for others.
All examples of OO nonsense. There is only one pattern you need (functions) (tongue-in-cheek): https://fsharpforfunandprofit.com/fppatterns/
Edit: Also consider using the ELM architecture instead of MVC: https://guide.elm-lang.org/architecture/
I typically reach for Go when I am building things; Bubble Tea, a TUI library, uses the Elm architecture and enforces it on people using it.
Did not like it, won't use it again.
MVC is well understood and works perfectly in my experience.
Like what?
First class functions and iterators are probably examples of what they mean, in terms of language features that obsolete (GoF) design patterns
Here you go: https://fsharpforfunandprofit.com/fppatterns/
A while ago when I was working with PowerShell a lot, I got spoiled by the easy inclusion of `-DryRun` flags in all my scripts.
Nowadays I still use the technique for a lot of the tools I make. I typically do a Terraform-like approach: create a plan, validate the plan, render the plan to the user, exit early if we're in dry-run mode, and apply the plan as the final step. Whether dry-run is enabled by default depends on the risk of the operations and who will be using the tool.
This makes it exceedingly clear what actions the tool will do. Plus it has the added benefit of being able to split the plan and apply steps into two separate steps. For example, you can create a plan, serialize it to JSON, store it (e.g. in VCS, Jira ticket, whatever) then apply it during a change window.
I usually do the opposite and add a --really flag to my CLI utilities, so that they are read-only by default and extra effort is needed to screw things up.
I've committed "--i-meant-that" for some particularly impatient coworkers, on a destroy-the-remote-machine command that normally (without the arg) gives you a message and 10s to hit ^C if you're not sure. It never ended up being used inappropriately, which is luck (but we never quantified how much luck :-).
I like the timer idea. I do something kinda similar by prompting the user to enter some short random code to continue.
I guess the goal for both is to give the user a chance to get out of autopilot, and avoid up-arrowing and re-executing.
Came here to say the same
One of the kick-ass features of PowerShell is that you only need to add `[CmdletBinding(SupportsShouldProcess)]` to have the `-WhatIf` dry-run for your functions.
Quite handy.
PowerShell cmdlet attributes (and the resulting parameter consistency with flags and verbs) are a great pattern that should make its way over to bash/zsh scripts. Given we are rewriting so many Unix utilities, it would be a great time to learn from a positive example.
Even better, it enables both -WhatIf and -Confirm, and provides a ShouldProcess function that interacts with both and with the user's impact threshold preferences. Really cool.
In one (internal) CLI I maintain, I actually put the `if not dry_run:` inside the code which calls the REST API, because I have a setting to log HTTP calls as CURL commands, and that way in dry-run mode I can get the HTTP calls it would have made without it actually making them.
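A sketch of that choke-point idea: one wrapper owns all HTTP, and in dry-run mode it prints a curl equivalent instead of sending the request (URL and function names are invented for the example):

```python
import json
import urllib.request

def call_api(method, url, payload, dry_run=False):
    """Single choke point for HTTP; dry-run prints the curl equivalent."""
    body = json.dumps(payload)
    if dry_run:
        print(f"curl -X {method} -H 'Content-Type: application/json' "
              f"-d '{body}' {url}")
        return None
    req = urllib.request.Request(url, data=body.encode(),
                                 headers={"Content-Type": "application/json"},
                                 method=method)
    return urllib.request.urlopen(req)

call_api("POST", "https://api.example.com/v1/users", {"name": "alice"},
         dry_run=True)
```

The printed commands double as a reproducible record of exactly what the tool would have done.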
And this works well if your CLI command is simply performing a single operation, e.g. call this REST API
But the moment it starts to do anything more complex: e.g. call API1, and then send the results of API1 to API2 – it becomes a lot more difficult
Of course, you can simulate what API1 is likely to have returned; but suddenly you have something a lot more complex and error-prone than just `if not dry_run:`
Having 1 place (or just generally limiting them) that does the things keeps the dry_run check from polluting the entire codebase. I maintain a lot of CLI tooling that's run by headless VMs in automation pipelines and we do this with basically every single tool.
What's your obsession with REST and HTTP for console tools?!
REST bloat is insane. Kids nowadays want EVERYTHING to run over IP/TCP/HTTPS. Why?!
Learn to write local tools first.
I prefer "--really-do", so the default behaviour of the tool is to do nothing. That's more fault-tolerant for the scenario where you forget to add "--dry-run".
For me the ideal case is three-state. When run interactively with no flags, print a dry run result and prompt the user to confirm the action; and choose a default for non-interactive invocations. In both cases, accept either a --dry-run or a --yes flag that indicates the choice to be made.
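That three-state logic can be sketched as a single decision function (flag names follow the comment; the prompt wording is invented):

```python
import sys

def should_apply(dry_run, yes, interactive=None, ask=input):
    """Return True to apply changes, False to stop after the preview."""
    if dry_run and yes:
        raise SystemExit("--dry-run and --yes are mutually exclusive")
    if dry_run:
        return False
    if yes:
        return True
    if interactive is None:
        interactive = sys.stdin.isatty()
    if not interactive:
        return False   # non-interactive default: the safe choice
    return ask("Apply these changes? [y/N] ").strip().lower() == "y"

# Interactive invocation with no flags: print the preview, then prompt.
print(should_apply(dry_run=False, yes=False, interactive=True,
                   ask=lambda prompt: "y"))   # True
```

The caller prints the dry-run preview unconditionally, then consults `should_apply` to decide whether to proceed.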
This should always be included in any application that has a clear plan-then-execute flow, and it's definitely nice to have in other cases as well.
> The downside is that the dryRun-flag pollutes the code a bit. In all the major phases, I need to check if the flag is set, and only print the action that will be taken, but not actually doing it.
Sounds like a case for the state machine pattern
From the article: "I added –dry-run on a whim early on in the project. I was surprised at how useful I found it to be."
Not to be overly critical (I think it's great OP found value in adding and using --dry-run), but I am willing to bet that this was a suggestion/addition from a coding agent (and most likely Claude code/Opus). Having used it myself to build various CLI tools in different languages, it almost always creates that option when iterating on CLIs. To the point where it's almost a tell. I wonder if we're entering a moment of convergence where all the tools will have similar patterns/options because they are similarly written by agents.
> Early in the development process, when testing the incomplete application, I remembered that Subversion (the version control system after CVS, before Git) had a –dry-run option.
> I remembered how helpful that was, so I decided to add it to my command as well.
He mentions the reason he added it, and it's a compelling enough story to be true.
Of course and I am not trying to point fingers. But I do think it's interesting because it's also possible that it is confabulation. Not lying, but genuinely constructing coherent explanations for decisions whose true origins are different than we recall. I think working with coding agents has already made this immensely more common.
I had the equivalent of --dry-run in my kdecvs-build script from 2003 (where it was called --pretend) so it's not that spontaneous an idea that it must have been dreamed up by an AI.
Any time you have a script that needs to run for a long time or might involve destructive actions, having a way to run the script in a "tell me what you would do without actually doing it" mode is a fairly obvious user story to throw in.
Again, it's completely possible that OP and you are the wonderful exceptions (untouched and uninspired by coding agents) that have been using these patterns for as long as you can remember. My comment revolved around the psychological phenomenon, not whether dry-run is a clever/novel idea. It's about how we might tell ourselves stories about the origin of our ideas when working with those tools.
Related: https://news.ycombinator.com/item?id=27263136
I’m interested to know the etymology and history of the term. Somehow I imagine an inked printing press as the “wet run.”
It seems to have originated in the US with Fire Departments:
> These reports show that a dry run in the jargon of the fire service at this period [1880s–1890s] was one that didn’t involve the use of water, as opposed to a wet run that did.
https://www.worldwidewords.org/qa/qa-dry1.htm
Interestingly, the one place I have seen "dry run" actually mean a literally dry run is using an air compressor to check that a water loop (in a computer) doesn't leak, by verifying there's no drop in pressure.
Funny, I recall a tool called "molly-guard" which solves the problem of wanting to reboot a Unix server while possibly being logged into the wrong one: it asks you to type the server's name.
Anybody who has rebooted the wrong server can tell you that this tool is brilliant.
Like "--dry-run" but for "reboot."
These kinds of obstacles don't work for any action that the user performs repeatedly.
Dialogs that pop up and ask "Are you sure you want to delete ...?" -> users just automatically click yes, because they already did that the last 10 times and just want to get on with their work.
Logged in to server "alpha" instead of "delta" because you thought that's the right one. Tool asks you to write the server name. You type "alpha" because you know you're on alpha. Reboots the wrong server.
GitHub asks you to confirm the repo name before deleting by typing it into a text field. The user looks at what the repo name is and types it without thinking. Or, like lazy me, selects and drags the displayed name into the field, so you don't even have to type.
The point is, users already decided to do the action when they started. It's nearly impossible to consistently make them stop and re-evaluate their decision, because that's extremely high friction and annoying. They quickly learn to circumvent the friction as efficiently as possible (i.e. without thinking about it).
A better solution is to just do it, but let the user undo it if it was a mistake (not always possible of course).
> Or, like lazy me, mark and drag the displayed name into the field
Yes, same here, unfortunately. This guard rail works for only a couple of days or weeks while the perception is fresh. After that, it's always "select, copy, paste, confirm" without thinking.
I believe the better approach is to practice "Chaos Monkeys" and improve the infrastructure to survive even if someone makes a mistake.
> Github ask you to confirm the repo name before deleting by typing it into a text field. User looks at what the repo name is and types it without thinking. Or, like lazy me, mark and drag the displayed name into the field, so you don't even have to type.
Pretty easy to solve this sort of stuff with timers: "Sure, the repo will be deleted in <timer>. Press here to abort: <button>."
Or have the timers on the backend and just show the user an undo button.
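A rough sketch of that backend-timer-plus-undo idea in Python (the `DelayedAction` class and the repo-deletion action are invented for illustration):

```python
import threading

class DelayedAction:
    """Run an action after a grace period unless cancelled (the undo button)."""
    def __init__(self, action, delay_seconds):
        self.done = threading.Event()
        self._action = action
        self._timer = threading.Timer(delay_seconds, self._fire)
        self._timer.start()

    def _fire(self):
        self._action()
        self.done.set()

    def undo(self):
        # Cancel before the timer fires; True means we undid it in time.
        self._timer.cancel()
        return not self.done.is_set()

deleted = []
pending = DelayedAction(lambda: deleted.append("repo"), delay_seconds=5.0)
# User clicks "undo" within the grace period: nothing was deleted.
assert pending.undo()
assert deleted == []
```

In a real service the pending action would live server-side (e.g. a row with a "delete after" timestamp), so the undo survives page reloads; the in-process timer just shows the shape of the mechanism.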
I think dry-run mode is sometimes useful for many programs (and I do sometimes use it). In some cases you can use standard I/O instead, so that it is not needed, because you can control what is done with the output. You might still miss something, especially if the code is messy, although security systems might help a bit. However, you can sometimes make the code less messy by handling the I/O differently: structure the functions that make changes (the I/O parts of your program) so that the number of places you need to check for dry run is reduced, since only a few functions need the check. My idea of a system with capability-based security would allow this (among many other benefits; a capability-based system has a lot of benefits beyond only the security). Even with existing security mechanisms it can be done (e.g. with file permissions), although not as well as with capability-based security.
In even more praise of Ctrl+Z.
(Don't try it in a terminal window, though)
We have an internal framework for building migrations, and "dry run" is a core part of the dev cycle. It allows you to test your replication plan and transformations without touching the target. Not to mention, a load that could take >24 hours completes in minutes.
I love `--dry-run` flags for CLI tooling I build. If you plan your applications around this kind of functionality upfront, I find it doesn't have to pollute your code too much. In a language like Go or Rust, I'll use an option/builder design pattern, and whatever I'm ultimately writing to (remote file system, database, pubsub, etc.) will instead write to a logger. I find this incredibly helpful in local dev, but it's also useful in production. Even with high test coverage, it can be a bit spooky to turn on a new, consequential feature, especially one that mutates data. I like to enable dry run in our production envs just to ensure that things meet the functional and performance qualities we expect before actually enabling the feature. This has definitely saved our bacon before (so many edge cases with prod data and request traffic).
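The same sink-swap idea rendered as a Python sketch (the commenter describes doing this in Go/Rust with a builder, but the shape is the same; the sink classes here are invented for illustration):

```python
import logging

class DatabaseSink:
    """The real destination: writes go to the database."""
    def __init__(self, conn):
        self.conn = conn
    def write(self, record):
        self.conn.execute("INSERT INTO events (payload) VALUES (?)", (record,))

class DryRunSink:
    """Stands in for the real sink; logs what would have been written."""
    def __init__(self):
        self.log = logging.getLogger("dry-run")
        self.records = []
    def write(self, record):
        self.records.append(record)
        self.log.info("would write: %r", record)

def build_sink(dry_run, conn=None):
    # Decided once at startup; the rest of the code never checks dry_run,
    # it just calls sink.write() and doesn't know which sink it has.
    return DryRunSink() if dry_run else DatabaseSink(conn)

sink = build_sink(dry_run=True)
sink.write("user-created")
```

The key property is that the `if dry_run` check appears exactly once, at construction time, instead of being sprinkled through every code path.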
I use a similar strategy for API design. Every API call is wrapped in a large database transaction, and I either roll back or commit the transaction based on dry-run or wet-run flags. This works well as long as you don’t need to touch the file system. I even wrap emails this way—emails are first written to a database queue, and an external process picks them up every few seconds.
To continue, this design has additional benefits:
The code is not littered with dry-run flag checks; the internal code doesn’t even know that a dry run is possible. Everything is rolled back at the end if needed.
All database referential integrity checks run correctly.
Some drawbacks: any audit logging should run in a separate transaction if you want to log dry runs.
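A minimal sketch of that transaction-wrapped pattern with sqlite3 (the tables and the request handler are invented for illustration; the email goes into a queue table, matching the commenter's design):

```python
import sqlite3

def handle_request(conn, email_to, dry_run):
    # Every write, including the outgoing email, goes into one transaction.
    conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
    conn.execute("INSERT INTO email_queue (recipient) VALUES (?)", (email_to,))
    if dry_run:
        conn.rollback()   # nothing persists; integrity checks still ran
    else:
        conn.commit()     # an external process later drains email_queue

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("CREATE TABLE email_queue (recipient TEXT)")
conn.commit()

handle_request(conn, "a@example.com", dry_run=True)   # dry run: rolled back
handle_request(conn, "a@example.com", dry_run=False)  # wet run: committed
```

After the dry-run call both tables are still empty; only the committed call leaves rows behind. The audit-log caveat above follows directly: anything written inside this transaction disappears on rollback, so dry-run audit records need their own connection.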
Dry run is great, but if you are using your script in a serious pipeline like that, you may want to go the extra mile and write tests:
https://github.com/shellspec/shellspec
I didn't know about --dry-run until last summer Claude Code added it to a script it had created.
What if the tool required an "un-safeword" to do destructive things?
"Do you really want to 'rm -rf /'? Type 'fiberglass' to proceed."
There is a package called molly-guard that makes you type the computer's hostname when you are trying to do a shutdown or restart. I love it.
Like tarsnap's --nuke command:
Probably inspired by Vinum's "NO FUTURE" for destructive operations. (Vinum was a raid system used on older versions of FreeBSD.)
I like to use the flag "--no-clobber" for this: it sets a script to not delete any information but reuse the previous configuration or files, and otherwise error out if that's not possible.
Agreed. For me, a good --help, a dry run, and a readme with good examples have been the norm for work tools for a while.
It's even more relevant now that you can get the LLMs/CLI agents to use your deterministic CLI tools.
I like doing the same in CI jobs, like in Jenkins I'll add a DRY_RUN parameter, that makes the whole job readonly. A script that does the deployment would then only write what would be done.
Funny enough, when creating CLIs with Claude Code (and Github Copilot), they've both added `--dry-run` to my CLIs without me even prompting it.
I prefer the inverse, better, though. Default off, and then add `--commit` or `--just-do-it` to make it actually run.
The fact that every single `--` has become — drives me bananas. For a technical blog this oversight is so sloppy.
And it's more important than ever in the age of coding agents.
Sort of a strange article. You don't see that many people _not_ praising --dry-run (speaking of which, the author should really learn to use long options with a double dash).
I'm not aware of any CLI frameworks that accept an emdash for long options, but I'm here for it. "A CLI framework for the LLM era"
I only saw the emdash in the thread link, but I do know that an iPad "wants" to turn a double dash into an emdash automatically. I have no idea how to disable that default.
If you’re using an iPad, the implicit assumption is that Apple knows better than you what you want to do.
I would love to have this available in git. I know if you make mistakes you can use the reflog, but if you need 5 tries to get something right reading the reflog quickly becomes impossible. Plus there are operations, like rebase or merge, that feel the need to make 50 entries in the reflog.
I've resorted to copying the entire directory (including the .git part) and then trying on the copy. The issue is that I'm working on a C++ program that has a few gigabytes of data.
Consider using work trees. They have separate reflogs, you'd still get separate directories, but less duplication.
I use --dry-run when I'm coding and I control the code.
Otherwise it's not very wise to trust the application on what should be a deputy responsibility.
Nowadays I'd probably use OverlayFS (or just Docker) to see what the changes would be, without ever risking the original FS.
How do you easily diff what changed between Docker and host?
The way OverlayFS works is that there's a base directory. And then there's an overlay directory that only contains the changes. Docker is based on OverlayFS.
There are two main ways overlays are used. First, at build time: each line/command generates a new overlay on top of the previous base, so when you do something like

  FROM debian
  RUN apt-get update

it creates a base from the debian image, and then creates an overlay that only contains the changes introduced by apt-get update.
If you use docker inspect or docker show on the image you get a json showing exactly where the overlay directories are, you just need to navigate the overlay directory.
Second: at runtime. Assuming you are not using volumes (and if you do use volumes, just make sure each volume starts out empty instead of sharing your host files), OverlayFS is used for runtime file changes as well: the last image layer is used as a base, and every file changed during runtime is added to the runtime overlay. That filesystem won't be deleted if you only stop the container; the runtime files will still be present, and you can reach them by docker inspecting the running containers and then navigating the overlay directory as you would any directory.
You can also just use OverlayFS directly: as far as I recall, you mount and unmount while specifying the overlay filesystem type and parameters like lowerdir and upperdir. Composing a chain of overlays is a bit more complex, but it's the same interface.
--dry-run
--really
--really-really
--yolo
You'll like fontconfig then, which has both --force and --really-force
https://man.archlinux.org/man/fc-cache.1.en
pffft, if you aren't dropping production databases first thing in the morning by accident, how are you going to wake yourself up :-)