I've been using Rust for 2 months, and I had a hard time convincing my team to redevelop an enterprise application from Java/Spring to Rust/Actix-web.
Everything went well until I discovered that an official Actix-web middleware embeds some code (HTML + JS) and some weird text in the release build (--release).
Analyzing the strings contained in the executable, I found references to Amazon, Facebook, and other oddities.
This happens even in an empty project containing only Actix-web.
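For anyone who wants to repeat this kind of inspection, here is a minimal sketch of a `strings`-like scan, assuming plain printable ASCII is all you care about. The function name `printable_runs`, the minimum length of 6, and the command-line shape are my own choices for illustration, not part of any Actix-web tooling.

```rust
use std::env;
use std::fs;

/// Collect runs of printable ASCII that are at least `min_len` bytes long,
/// similar in spirit to the Unix `strings` tool.
fn printable_runs(data: &[u8], min_len: usize) -> Vec<String> {
    let mut runs = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    for &byte in data {
        if byte == b' ' || byte.is_ascii_graphic() {
            current.push(byte);
        } else {
            // A non-printable byte ends the current run.
            if current.len() >= min_len {
                runs.push(String::from_utf8_lossy(&current).into_owned());
            }
            current.clear();
        }
    }
    // Flush a run that reaches the end of the data.
    if current.len() >= min_len {
        runs.push(String::from_utf8_lossy(&current).into_owned());
    }
    runs
}

fn main() -> std::io::Result<()> {
    // Hypothetical usage: <program> <path-to-binary> <keyword>
    let args: Vec<String> = env::args().collect();
    if args.len() != 3 {
        eprintln!("usage: <program> <binary> <keyword>");
        return Ok(());
    }
    let data = fs::read(&args[1])?;
    let keyword = args[2].to_lowercase();
    // Print only the runs that mention the keyword, case-insensitively.
    for run in printable_runs(&data, 6) {
        if run.to_lowercase().contains(&keyword) {
            println!("{}", run);
        }
    }
    Ok(())
}
```

Running it against the release executable with a keyword such as `facebook` should list the surrounding text, which makes it easier to compare findings between builds.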
Here is a sample of what I found in the final executable (from an empty project containing only actix-web with its compression middleware enabled):
I tried to find the cause, but I'm still a beginner in this language, so I tried the best I could to find a logical explanation.
I opened a ticket describing what I found, and I included everything needed to reproduce the experiment. Unfortunately I didn't get any answer from the Actix project team, and I fully understand that these good people have a life and that every second spent for the open source community is out of kindness and not out of obligation.
I'm relying on your experience to help me find the cause, because the migration project is completely stopped for security reasons: we can't deploy a program containing unknown code in an enterprise environment.
Is it test code from the library that ended up in the final build by mistake?
Can someone explain what is really going on?
This is likely due to the inclusion of the compression algorithm Brotli, written by Google (and FOR Google, really), which uses a static dictionary formed using real-world data -- the most popular strings traveling through their own servers. So naturally, it'd contain popular website names. I'm somewhat surprised they didn't remove trademarked names from the dictionary, though. This is exactly why I thought Brotli was a little silly when it came out.
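To illustrate why a static dictionary has to be shipped inside the binary, here is a toy sketch of dictionary-based coding. It is NOT Brotli's actual format: the three-entry `DICTIONARY`, the `Token` type, and the `encode`/`decode` functions are all invented for this example. The point is only that the decoder cannot expand a dictionary reference unless it carries the dictionary text itself.

```rust
/// Toy static dictionary. Brotli's real dictionary is far larger and is
/// encoded differently; these entries are illustrative only.
const DICTIONARY: [&str; 3] = ["facebook", "amazon", ".min.js"];

#[derive(Debug, PartialEq)]
enum Token {
    /// A byte copied through unchanged.
    Literal(u8),
    /// An index into DICTIONARY, standing in for the whole entry.
    DictRef(usize),
}

/// Replace any dictionary entry found in the input with a short reference.
fn encode(input: &str) -> Vec<Token> {
    let bytes = input.as_bytes();
    let mut tokens = Vec::new();
    let mut pos = 0;
    while pos < bytes.len() {
        let hit = DICTIONARY
            .iter()
            .position(|word| bytes[pos..].starts_with(word.as_bytes()));
        match hit {
            Some(index) => {
                tokens.push(Token::DictRef(index));
                pos += DICTIONARY[index].len();
            }
            None => {
                tokens.push(Token::Literal(bytes[pos]));
                pos += 1;
            }
        }
    }
    tokens
}

/// Expand tokens back into bytes. This step needs the dictionary text,
/// which is why it ends up embedded in the compiled program.
fn decode(tokens: &[Token]) -> Vec<u8> {
    let mut out = Vec::new();
    for token in tokens {
        match token {
            Token::Literal(byte) => out.push(*byte),
            Token::DictRef(index) => out.extend_from_slice(DICTIONARY[*index].as_bytes()),
        }
    }
    out
}

fn main() {
    let input = "jquery.min.js";
    let tokens = encode(input);
    assert_eq!(decode(&tokens), input.as_bytes());
    println!("{} bytes became {} tokens", input.len(), tokens.len());
}
```

In this toy version the strings "facebook" and "amazon" would show up in the compiled executable even if no request ever contains them, which matches the kind of finding described in the original post.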
Thanks @jessa0 for your responsiveness, what you say is really interesting. I searched the .cargo folder to find these occurrences in the library's source code, but I didn't find all of the embedded words.
Is it normal to have an HTML structure and JS file names?
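For the search through the .cargo folder mentioned above, a byte-level search is a bit more thorough than a text search, since it also looks inside files that are not valid UTF-8. Below is a minimal sketch using only the standard library; the helper names `contains` and `find_files` and the command-line shape are hypothetical. Keep in mind that a search like this can still miss data that is stored in a non-text form (for example as numeric byte literals), which may be one reason some words did not turn up.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Return true if `needle` occurs anywhere in `haystack`.
/// An empty needle is treated as "not found" to avoid a zero-size window.
fn contains(haystack: &[u8], needle: &[u8]) -> bool {
    !needle.is_empty() && haystack.windows(needle.len()).any(|w| w == needle)
}

/// Walk `dir` recursively and record every file whose bytes contain `needle`.
/// Note: directory symlinks are followed, so a symlink loop would recurse forever.
fn find_files(dir: &Path, needle: &[u8], hits: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            find_files(&path, needle, hits)?;
        } else if let Ok(data) = fs::read(&path) {
            // Unreadable files are skipped rather than aborting the search.
            if contains(&data, needle) {
                hits.push(path);
            }
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Hypothetical usage: <program> <directory> <text-to-find>
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 3 {
        eprintln!("usage: <program> <directory> <text>");
        return Ok(());
    }
    let mut hits = Vec::new();
    find_files(Path::new(&args[1]), args[2].as_bytes(), &mut hits)?;
    for path in hits {
        println!("{}", path.display());
    }
    Ok(())
}
```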
But keep in mind that the answer to the question "why is there the string 'facebook' in my executable" does not in any way help answer the question "what is the definition of untrusted code I'm using and how do I prevent untrusted code from being included in my binary".
If we look at the strings included in the executable, we find blocks of an HTML page structure with JavaScript file names. A website is code, especially if it is included in a web server that will be exposed to the public.
Here are some file names included in the binary: checked.min.js, jquery.js, time.js, header.min.js, etc.
But as @jessa0 said, this structure is also part of the dictionary, so it is justified. It still leaves some doubts when we don't know the source (thanks to you, I know now).
I think users should be warned that this kind of dictionary exists in the compression module, because it increases the size of the binary, which matters for those doing embedded development.
(Adding company names to the dictionary is very questionable.)