Looking for suggestions on how to approach a programming problem in the experimental setup for my PhD research. I think I could probably solve all of it if I learned to write derive macros, but I want to avoid that route until I have exhausted any existing solutions that might take less time.
In a single Cargo project, I have a set of algorithms implemented in Rust, a set of inputs (problem "instances") stored as ASCII files, and a set of "domains". Each domain comprises a subset of the instances and a Rust module that reads the input files in that domain's particular format and performs operations on the instances once they are represented in memory.
Each algorithm has a different number of parameters, and of different types, that affect its behavior.
Each domain also has a different number of parameters, also of different types, that affect its methods' behavior. (The algorithms call these domain methods when solving problem instances.)
All parameter values must be specified on the command line at run time, for whichever algorithm is chosen and whichever domain the problem instance belongs to.
There are some dependent result variables that are common across algorithms or domains, and others that are specific to only one or a few algorithms or domains.
I would like to store the algorithm and problem domain parameter specifications (valid types/ranges, etc.) in a single place, and programmatically generate (and regenerate, as algorithms and domains are added or revised):
1. command line argument parser
2. execution commands covering the cross product of [all the algorithm configurations] with [the sum of [the cross product of each domain's parameter ranges and its set of instances]], where an "algorithm configuration" is an assignment of specific values to each of that algorithm's parameters
3. logging format for parameter values (independent variables) and results (dependent variables)
4. data analysis parsing (read in the result/log files to some standard tabular form)
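To make the "single place" idea concrete, here is a rough sketch under stated assumptions, not a recommendation of a particular design: a plain data table of parameter specs from which a usage line, range validation, and log column names are all derived. The ParamSpec type, the beam_search algorithm, and its two parameters are made up for illustration, and only f64 parameters are handled.

```rust
/// One scalar parameter of an algorithm or domain (hypothetical schema).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ParamSpec {
    pub name: &'static str,
    pub min: f64,
    pub max: f64,
}

/// Hypothetical parameter table for a made-up algorithm "beam_search".
pub const BEAM_SEARCH: &[ParamSpec] = &[
    ParamSpec { name: "width", min: 1.0, max: 1024.0 },
    ParamSpec { name: "weight", min: 1.0, max: 5.0 },
];

/// Derive a usage line from the table.
pub fn usage(component: &str, specs: &[ParamSpec]) -> String {
    let flags: Vec<String> = specs
        .iter()
        .map(|p| format!("--{} <{}..={}>", p.name, p.min, p.max))
        .collect();
    format!("{} {}", component, flags.join(" "))
}

/// Check one parameter assignment against the same table.
pub fn validate(specs: &[ParamSpec], name: &str, value: f64) -> Result<(), String> {
    let spec = specs
        .iter()
        .find(|p| p.name == name)
        .ok_or_else(|| format!("unknown parameter `{name}`"))?;
    if value < spec.min || value > spec.max {
        return Err(format!("`{name}` = {value} is outside {}..={}", spec.min, spec.max));
    }
    Ok(())
}

/// Derive log/table column names from the same table.
pub fn log_header(specs: &[ParamSpec]) -> Vec<&'static str> {
    specs.iter().map(|p| p.name).collect()
}
```

With this table, usage("beam_search", BEAM_SEARCH) yields "beam_search --width <1..=1024> --weight <1..=5>". The trade-off compared with a derive macro is that the table is checked at run time rather than compile time, and parameters of other types would need an enum or trait on top.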
I don't think there's a single crate for all this, so what would a sane approach look like?
My own thoughts/past efforts:
For (1), I have tried Clap, but it does not support multiple commands (or are they called subcommands?) in series, which I think I need because each execution specifies an algorithm (and all its parameter values) and a problem instance (and all its domain's parameter values). There is a hack: require -- somewhere in the command line to split the algorithm and domain arguments, then run Clap separately on each part. I have done this in the past, but it feels yucky (maybe I should get over that).
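For reference, a minimal sketch of that split using only the standard library. The function name split_at_separator is made up here, and the sketch assumes the first -- is the separator and that the program name has already been stripped from the argument list.

```rust
/// Split raw arguments at the first `--` into (algorithm args, domain args).
/// Returns `None` when the separator is missing.
pub fn split_at_separator(args: &[String]) -> Option<(&[String], &[String])> {
    let pos = args.iter().position(|a| a == "--")?;
    // Everything before the separator goes to the algorithm parser,
    // everything after it to the domain parser; the `--` itself is dropped.
    Some((&args[..pos], &args[pos + 1..]))
}
```

Each half can then be handed to its own parser. With Clap's derive API that would presumably be Parser::try_parse_from on each slice, keeping in mind that it treats the first item as the binary name, so the algorithm or domain name can sit in that slot.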
For (2), I like Strum's EnumIter derive, but I have not found an equivalent for structs, which is the format in which I'd like to specify parameters, their types, and ranges (like for Clap derive). Learning how to write a derive macro is on my todo list; maybe that is the approach to take? I am currently writing a bunch of list comprehensions and cross products by hand in Python, which is a nuisance, especially to keep up to date as the Rust code changes. Scalar parameters usually have bounded valid ranges, and different preferred sampling schemes (linear, exponential, or some weird composition of the two).
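As a sketch of what the Python list comprehensions could look like on the Rust side, assuming all swept parameters are f64: two sampling helpers (linear and exponential) and a Cartesian product over any number of parameter axes. The function names are illustrative, and the composed sampling schemes are not covered.

```rust
/// `n` evenly spaced values from `lo` to `hi` inclusive (requires n >= 2).
pub fn linear(lo: f64, hi: f64, n: usize) -> Vec<f64> {
    assert!(n >= 2, "need at least two samples");
    let step = (hi - lo) / (n - 1) as f64;
    (0..n).map(|i| lo + step * i as f64).collect()
}

/// `n` geometrically spaced values from `lo` to `hi` inclusive
/// (requires lo > 0 and n >= 2).
pub fn exponential(lo: f64, hi: f64, n: usize) -> Vec<f64> {
    assert!(lo > 0.0 && n >= 2, "need lo > 0 and at least two samples");
    let ratio = (hi / lo).powf(1.0 / (n - 1) as f64);
    (0..n).map(|i| lo * ratio.powi(i as i32)).collect()
}

/// Cartesian product: every way of picking one value from each axis.
pub fn cross_product(axes: &[Vec<f64>]) -> Vec<Vec<f64>> {
    // Start with a single empty configuration and extend it axis by axis.
    let mut configs: Vec<Vec<f64>> = vec![Vec::new()];
    for axis in axes {
        let mut next = Vec::with_capacity(configs.len() * axis.len());
        for prefix in &configs {
            for &value in axis {
                let mut config = prefix.clone();
                config.push(value);
                next.push(config);
            }
        }
        configs = next;
    }
    configs
}
```

For example, cross_product(&[linear(1.0, 5.0, 3), exponential(1.0, 1024.0, 11)]) gives 33 configurations. The itertools crate's multi_cartesian_product adaptor covers the same ground lazily if materializing the whole grid is a concern.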
For (3), serde would probably be fine, as long as the solution to (1) has an output type that serde can serialize, which should be a very easy requirement to satisfy. Currently just using log::info! to output "key: value" strings, and writing each info!() by hand and ...
...For (4) I've written my own ugly parser in Python which I have to update manually (painfully) anytime I add a new algorithm or domain or modify an existing algorithm or domain.
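To illustrate how (3) and (4) could share one definition, here is a small sketch that writes "key: value" lines from a list of fields and reads them back into a record, standard library only. The function names and the example keys are assumptions, and it treats every value as a string; a serde-serializable struct could replace the hand-built field list.

```rust
use std::collections::BTreeMap;

/// Render one run as `key: value` lines.
pub fn to_log_lines(fields: &[(&str, String)]) -> Vec<String> {
    fields.iter().map(|(k, v)| format!("{k}: {v}")).collect()
}

/// Parse `key: value` lines back into one record.
/// Lines that do not contain `: ` are skipped.
pub fn parse_record<'a>(lines: impl IntoIterator<Item = &'a str>) -> BTreeMap<String, String> {
    lines
        .into_iter()
        .filter_map(|line| line.split_once(": "))
        .map(|(k, v)| (k.trim().to_string(), v.trim().to_string()))
        .collect()
}
```

Because both directions live next to each other, adding a field to a run changes the written log and the parsed record together, and each parsed record maps directly onto one row of a table.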
Super grateful in advance for any advice or guidance, including of the variety "you're trying to solve the wrong problem"...