I am trying to set up a system to take an existing binary, and partially modify its contents by adding a configuration as an internal resource.
My use case for this is an editor interface for building a UI of sorts.
This editor would then modify a prebuilt "engine" binary which would then run the interface using the created configuration.
My reason for not just doing this at compile time is that I don't want to force users to run the whole Rust build pipeline if at all possible.
So far my research has come up empty. Would anyone have suggestions on how to approach this?
One option, which you probably don't want in this case, is finding an existing block of literal data in the binary and replacing it. This only works if the data has a fixed length, which probably isn't the case for your editor data.
Many executable formats don't care if there is extra data at the end, so your editor can simply append the data to the end, then append the length, and the engine can open itself, read the length from the last bytes, and then seek backwards to find the start of the data.
If you are running on Windows, executables can have structured "resources" embedded in them that can be edited. I don't know much about this mechanism.
If you are running on macOS, applications are packages (directories that are treated as single units by the file manager) and you can put ordinary files in the resources directory of that package, separate from the executable.
None of these are specific to Rust, because Rust does not compile or run programs in a special way.
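As a rough illustration of the append-then-length approach described above, here is a minimal sketch using only the standard library. The function names `append_payload` and `read_payload`, and the choice of an 8-byte little-endian length trailer, are assumptions made for the example, not an established format.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

// Size of the trailer that stores the payload length (an assumed layout).
const TRAILER_LEN: u64 = 8;

/// Editor side: append `payload`, then its length as 8 little-endian bytes.
pub fn append_payload(binary: &Path, payload: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new().append(true).open(binary)?;
    file.write_all(payload)?;
    file.write_all(&(payload.len() as u64).to_le_bytes())?;
    Ok(())
}

/// Engine side: read the length from the last bytes, then seek backwards
/// to the start of the payload and read it.
pub fn read_payload(binary: &Path) -> io::Result<Vec<u8>> {
    let mut file = File::open(binary)?;
    let file_len = file.metadata()?.len();
    if file_len < TRAILER_LEN {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "file too short to contain a length trailer",
        ));
    }

    file.seek(SeekFrom::End(-(TRAILER_LEN as i64)))?;
    let mut len_bytes = [0u8; 8];
    file.read_exact(&mut len_bytes)?;
    let payload_len = u64::from_le_bytes(len_bytes);

    // Reject lengths that would point before the start of the file.
    if payload_len > file_len - TRAILER_LEN {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "payload length exceeds file size",
        ));
    }

    file.seek(SeekFrom::Start(file_len - TRAILER_LEN - payload_len))?;
    let mut payload = vec![0u8; payload_len as usize];
    file.read_exact(&mut payload)?;
    Ok(payload)
}
```

The engine would call something like `read_payload(&std::env::current_exe()?)` at startup. A real format would probably also want a magic marker before the length, so that an unmodified engine can tell it has no payload rather than misreading its own trailing bytes.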
I believe there is a library from the Rust for Linux people that lets you use macros as booleans and then flip every such location in the binary from true to false (or the other way) at once. It was obviously pretty unsafe, and I can't remember the name, but you could look at how they did it.
Interestingly, the ZIP archive format has its index at the end for this very reason¹: You can put whatever you want in front of it and it remains a valid zip file. So if you put together all of your UI assets into a zip file and append it to the executable, any zip library should be able to open the executable file and access the contents without any special effort on your part.
With a little bit of abstraction, it probably wouldn't be too hard to come up with a system that can read either from a directory on the filesystem (for testing or macOS packages) or from a zip archive (standalone or appended to the executable).
¹ Edit: See @kpreid's reply below. I probably overstated the reason why ZIP is designed this way.
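The "little bit of abstraction" mentioned above might look something like the following sketch. The trait `AssetSource`, the types `DirSource` and `MemorySource`, and the file name `title.txt` are all hypothetical. `MemorySource` is only a stand-in here: a zip-backed source would implement the same trait using whichever zip library you pick, which I have left out to keep the example free of dependencies.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Anything the engine can load named assets from.
pub trait AssetSource {
    fn read(&self, name: &str) -> io::Result<Vec<u8>>;
}

/// Reads assets from a directory on the filesystem
/// (useful for testing, or for a macOS package's resources directory).
pub struct DirSource {
    pub root: PathBuf,
}

impl AssetSource for DirSource {
    fn read(&self, name: &str) -> io::Result<Vec<u8>> {
        fs::read(self.root.join(name))
    }
}

/// Stand-in for an archive-backed source: assets already held in memory.
pub struct MemorySource {
    pub files: HashMap<String, Vec<u8>>,
}

impl AssetSource for MemorySource {
    fn read(&self, name: &str) -> io::Result<Vec<u8>> {
        self.files.get(name).cloned().ok_or_else(|| {
            io::Error::new(
                io::ErrorKind::NotFound,
                format!("no asset named {}", name),
            )
        })
    }
}

/// Engine code depends only on the trait, not on where assets come from.
pub fn load_title(source: &dyn AssetSource) -> io::Result<String> {
    let bytes = source.read("title.txt")?;
    String::from_utf8(bytes).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

With this shape, the engine can pick a `DirSource` during development and an archive-backed source in the shipped binary without touching the code that consumes the assets.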
Interestingly, the ZIP archive format has its index at the end for this very reason: You can put whatever you want in front of it and it remains a valid zip file.
I don't know the design history, but there are other reasons to do this. For example, files can be compressed and written to the file as they are found, and then the index (however big it is) written after all files have been written — it allows a ZIP archive to be created in a “streaming” fashion despite file systems not supporting inserting variable-length data at the beginning of a file. Wikipedia’s description focuses on the ability to append files to a ZIP archive by writing new files and a new directory, but that's the same thing in a sense. They do also mention “self-extracting archives”, which are essentially the same thing as OP’s problem, just without intent to write the decompressed files.
With the increasing use of code signing to vet programs for authenticity, I would encourage you not to modify existing binaries, if you can avoid doing so. Changing a binary will invalidate any signatures over that binary, requiring the user to either re-sign the binary themselves (technically challenging; may cost money), submit the binary back to you so that you can re-sign it, or disable code signing.
I have some opinions about this trend overall, but regardless, vendors are doing this and we'll have to accommodate it.
Separately, this approach will also not work on systems where binaries are installed by a user with different privileges than the user running them, which includes most *nix package management systems and some app store systems.
Personally, my recommendation would be to package the modifiable parts separately - as a ZIP, for example, or as a directory of files. Your program can look for this package on startup and load it using normal filesystem APIs, and this will not affect any code signing or verification the way that modifying the binary would. Your editor can make this process reasonably transparent by knowing where to put the files, so that the user doesn't have to guess.
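For what it's worth, the startup lookup described above can be quite small. This is a sketch under assumptions: the directory name `ui_config` and the helper names `config_dir_for` and `load_config_file` are made up for illustration, and placing the package next to the executable is just one possible convention.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Directory expected to hold the editor's output, next to the executable.
/// The name "ui_config" is a hypothetical choice.
pub fn config_dir_for(exe: &Path) -> Option<PathBuf> {
    exe.parent().map(|dir| dir.join("ui_config"))
}

/// Load one file from the package using ordinary filesystem APIs.
pub fn load_config_file(exe: &Path, name: &str) -> io::Result<Vec<u8>> {
    let dir = config_dir_for(exe).ok_or_else(|| {
        io::Error::new(
            io::ErrorKind::NotFound,
            "executable path has no parent directory",
        )
    })?;
    fs::read(dir.join(name))
}
```

The engine would pass `std::env::current_exe()?` as `exe`, and the editor would write into the same directory, so the user never has to know where the files live.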
Unfortunately, my particular use case prevents the use of sidecar files.
If there were a way to ship Rust code in an unreadable form that the end user could compile quickly, I would be able to use that, but I don't think such a thing exists?
If you can tolerate having debug builds where the files are not embedded but loaded from the file system, and then a longer release build where they are embedded, there is the rust-embed crate for that.
But I guess you'd always have to run the part of the pipeline that creates the final files for embedding.
Since the binary here serves as an interpreter and the appended data represents a program to be run within it, the combination of the two creates a unique program. This unique program would need to be verified and validated independently, so it’s reasonable to require a separate signature for each individual concatenation.