I set up a project that explains this concept better, sorry this was so confusing: Internatal String Localization Database Management System
I am new to rust, not programming, and Localization is always been a pain, so the first thing I want to learn about is how to localize my app and documentation, but currently, I do not see a clear path, since the documentation using cargo doc also needs to be localized.
I found Redox-OS and found out they need a localization system, and this is where the problem comes in, looking at solutions, currently, they all require API breakage, so I think I have a better solution, but I need to find out if it is possible, and can it get support to make it happen.
I have the full text of what I will talk about below, here at my Github Repository
I want to focus on a way that does not require any API changes to use this system of Localization System for Strings, that works for cargo doc, as well as any rust app, with no API changes, does not seem possible, with the current methods, so a new method is required, so let me give an example.
When I declare a string that needs to be localized, I run into this issue, for example, I want to add a button that needs to be localized, but all my code looks like this:
let selector = Selector::from("button");
The word "button", is the only reference I have to this text that needs to be localized, and I do not want to break this API, by having to make it look like this:
let selector = Selector::from(localize_this("button"));
The drawback of this approach of calling a function to localize a string is that it has to make a call to it, every time it creates this object,
and what happens if it breaks, or takes to long,
not to mention this does not even help me with cargo doc,
so I have a totally different approach, that I would like to propose to the Rust Community.
In the example of let selector = Selector::from("button");
the compiler can follow the quoted text back to the variable it is assigned to, the compiler will overwrite a po file on a full build, for later processing, so all it needs to do is create the po file, and append to the po file, this does not seem like a hard function to write, and it does not require a lot of processing, so minimal impact on the compiler, and can be off by default, so there is no impact by default.
Normally these files are called a po file, and I see no reason to name it differently, just because the format is enhanced. This po file will have extra schema, besides the normal msgid and msgstr, we want to track this variable name, and the file and path it is in, so now I have all the data I need to localize my strings, and they are stored in maybe an XML file, with attributes for the data-types for the msgid, msgstr, path, and other properties you might want to track, to make it easier for the Translator, and programmers, to translate this later.
The compiler will need a switch to compile these po files (--po), into mo files, only I will not be using that strategy, instead, it will optimize it, removing redundant msgid, creating a dif file, that is compared to any existing dif file, so it can upgrade it, so you do not lose any translations already made, and it creates a new po file with this data, and using a code generator, it will create an rs file, with one public function, that is a mapped array, using the msgid as a key, as such, whenever the variable is read, it needs to read from this function, using the msgid as the key to the mapped array, this is a simpy getter function, and requires no code change in my example to work, it is all done by the compiler itself.
I was thinking this function library, can be a static or dynamic library, making it easy to localize, just name the files according to their country code, and you can unload, and reload it again to make changes, or to load another language at run-time.
In the link above, I talk about writing an app that will work from the command line, as well as a GUI, called Redox-Lingo, or in this case could be called Rust-Lingo, same concept. This app can read the po files, and allow translators to modify the strings, not the msgid, so I would recommend encoding them, for example, instead of using readable text like "button 1", "button 2", instead use "button_1", and "button_2", this makes it easier to map an array, or enumerate it, whatever is faster, and makes it easier to find, you could also make it ugly to make it even easier, for example, "LOCALIZE_button_1", now you can just grep "LOCALIZE_" to find everything needing translation (better yet have Lingo do it), for each language, by just changing the locale, so this can be automated by Lingo, creating a todo list. The Lingo app can compile the po, and reload the library.
This makes it easy, since the from the existing codes point of view, no changes are required, the compiler does all the linkage when you start up the app, so there needs to be an entry point where the compiler links in a call to a localization initialization function, that creates an init file, to track changes, this is the mechanism used to know what state the system is in, it first checks to make sure the library exists for the current locale if it does, it marks it online in the init file if it does not exist, it creates it, if there are no po files, it will create them, and give a warning about a new init, thus on startup, the app will always work. It can check for po file changes, and recompile them automatically, and it can even check a repository for changes, and download them if required, so your system is always up to date at run-time.
This concept requires changes to the rust compiler, to create the po files, and create the init call, plus it needs to change the way it reads from the string variables marked as localized, as well as a way to mark them, or you can just localize all strings, you would have to test the performance on localizing all strings, any untranslated string, will be the original msgstr, which is also the msgid, so unless a translator changes it, it will remain the same, so maybe comments should also be saved with the po file, for example:
// Lingo:Off This is a Lingo comment, do not translate this button
let selector = Selector::from("button");
The compiler can then test for "// Lingo:", if it exists, it outputs it as a property to the po file for this variable, making it easy for everyone to localize.
You will want to standard the comments for example:
Do not localize
// Lingo:Off And a comment that gets ignored after off
Localize
// Lingo:On And a comment that gets ignored after on
You can add more rules as needed.
The advantage of this concept is that it makes no changes to any API, and only the compiler needs to be enhanced, giving the advantage, that all the strings are typesafe, and the getter function is in a static, or dynamic library, which is ever best, so it can be optimized, and now it works for all rust apps, and cargo doc.
I need help building a proof of concept for this method, the compiler can be standalone, it actually does not need to be in the rust compiler, so it is just a matter of creating a compiler app and trying to get support for it.
I call it a compiler, it is actually just a parser, I was thinking the main rust compiler could create the po file, but there are other ways, all you need to do is find the path to the string, and look for a comment on it, then it is just a matter of writing a code generator, to create the getter, and it can call the rust compiler to compile it.
There might also be a way to deconstruct the compiled code using the debugger to traverse the stack, looking for what you need, then outputting it.
If we can get one of the compiler team members to point me in the right direction, so I do not have to plow through all the code, searching for how to do this, at the same time ask them if they can explain how to output this information to a file, that would help get this project started.
Update: The Localization System that I am describing, does not actually do any Localization, so you can not compare it to projects like Fluent, gettext, or any other system like them, this is not that type of system, the goal of this project is just to create the po files, and by po files, I am not referring to the ancient po file format, but a better format, with more information, I only use the term po, so that you get the idea that it is an external file, with a msgid, msgstr, variable name, file name, and file path, so that any back-end, like Fluent, gettext and so on, can use this file, to do that translation, in this case, I actually want to create a project called Lingo, that does this, I hope I am making myself clear.
When you are writing an application, it would be nice not to have to worry about Localizing, and with this system that is possible, this is not a replacement for other localization systems, it is a system designed to create files that need to be translated, with 100% coverage, without the need for the programmer to have to code each string that needs to be localized.
By making the localization a static or dynamic library call, you do not have the need for any other system to translate the string, that is done at compile-time, and the strings are loaded by the library, that can be reloaded at run-time, making it easier to update than most systems, and this is low level, the only change to the rust system, would be a call to the init function at startup, and use the libraries getter for all strings.
The other benefit to this system, over other systems is that the rust compiler can use it, making it localized with no effort, in fact, it requires no effort since it is this low level, it is just part of the language, something you do not have to think about once you get it to work.
The reason there are as many Localization projects as there are right now is because they are required, and why, because Rust does not have it built-in, and that is what I am asking to do, instead of making Localization an afterthought, or a second-hand citizen, that has to be handled by the programmer, that is not a good way to localize, letting the compiler do it is.
Thanks
Flesh