Got Parameters? Just Use Docopt
Written by J. David Smith
Published on 7 September 2017
It's one of those days where I am totally unmotivated to accomplish anything (despite the fact that I technically already have – the first draft of my qual survey is done!). So, here's a brief aside that's been in the back of my mind for a few months now.
It is extremely common for the simulations in my line of work (or our line of work; hi, fellow student!) to have a large set of parameters. The way that this is handled varies from person to person, and at this point I feel as though I've seen everything: I've seen simple getopt usage, I've seen home-grown command-line parsers, I've seen compile-time #defines used to switch models! Worse, proper documentation on what the parameters mean and what valid inputs are is as inconsistent as the implementations themselves. Enough. There is a better way.
Docopt is a library that parses a documentation string for your command line interface and automatically builds a parser from it. It is available in basically any language you care about. (This includes C, C++, Python, Rust, R, and even Shell! Language is not an excuse for skipping this.) Take, for example, this CLI that I used for a re-implementation of my work on Socialbots. (See here for context on what the parameters mean, aside from ζ, which has never actually been used.)
Simulation for <conference>.

Usage:
  recon <graph> <inst> <k> (--etc | --hmnm | --zeta <zeta> | --etc-zeta <zeta>) [options]
  recon (-h | --help)

Options:
  -h --help            Show this screen.
  --etc                Expected triadic closure acceptance.
  --etc-zeta <zeta>    Expected triadic closure acceptance with ζ.
  --zeta <zeta>        HM + ζ acceptance.
  --hmnm               Non-Monotone HM acceptance.
  --degree-incentive   Enable degree incentive in acceptance function.
  --wi                 Use the WI delta function.
  --fof-scale <scale>  Set B_fof(u) = <scale> B_f(u). [default: 0.5]
  --log <log>          Log to write output to.

This isn't a simple set of parameters, but it is far from the most complex I've worked with. Just in this example, we have positional arguments (<graph> <inst> <k>) followed by mutually-exclusive settings (--etc | --hmnm | ...) followed by optional parameters ([options]). Here is how you'd parse this with the Rust version of Docopt:
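use docopt::Docopt;
use serde::Deserialize;

const USAGE: &str = ""; // the docstring above

// Only Deserialize is needed to read arguments into the struct.
#[derive(Deserialize)]
struct Args {
    // parameter types, e.g.
    arg_graph: String,
    arg_k: usize,
    flag_wi: bool,
    // ...
}

fn main() {
    // Build the parser from USAGE, parse the process arguments,
    // and exit with a message on any error.
    let args: Args = Docopt::new(USAGE)
        .and_then(|d| d.deserialize())
        .unwrap_or_else(|e| e.exit());
}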
This brief incantation:
- Parses the documentation string, making sure it can be interpreted.
- Correctly handles using recon -h and recon --help to print the docstring.
- Automatically deserializes every given parameter.
- Exits with a descriptive (if sometimes esoteric, in this implementation) error message if a parameter is missing or of the wrong type.
The same thing, but in C++, is:
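#include <map>
#include <string>
#include "docopt.h"

static const char USAGE[] = R"()"; // the docstring above goes between the parentheses

int main(int argc, char* argv[]) {
    std::map<std::string, docopt::value> args
        = docopt::docopt(USAGE,
                         {argv + 1, argv + argc},
                         true,            // show help if requested
                         "Version 0.1");  // version string
}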
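One thing the C++ snippet does not do for you is check types: if you expect <k> to be a number, you have to confirm that yourself. Here is a minimal sketch of such a check using only the C++17 standard library. The helper name parse_count is hypothetical and not part of docopt; it assumes you have already pulled the argument out as a std::string.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper (not part of docopt): parse the entire string as a
// non-negative integer. Returns std::nullopt if the string is empty,
// contains any non-digit character, or does not fit in std::size_t.
std::optional<std::size_t> parse_count(const std::string& text) {
    std::size_t value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    // Reject both conversion errors and trailing garbage such as "12abc".
    if (ec != std::errc() || ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

With docopt.cpp, the string would presumably come from something like args["<k>"].asString(); if parse_count returns nothing, print an error and exit rather than running the simulation with a garbage value.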
Although in this version type validation must be done manually (e.g. if you expect a number but the user provides an arbitrary string, you must check that the given string can actually be converted to a number), this is still dramatically simpler than any parsing code I've seen in the wild. Even better: your docstring is always up to date with the parameters that you actually take. (Of course, certain amounts of bitrot are always possible. For example, you could add a parameter but never implement handling for it. However, you can't accidentally add or rename a flag and then never add it to the docstring, which is far more common in my experience.) So – for your sanity and mine – please just use Docopt (or another CLI-parsing library) to read your parameters. These libraries are easy to statically link into your code (to avoid .dll/.so not found issues), and so your code remains easy to move from machine to machine in compiled form. Please. You won't regret it.