Advice on how to embed data in a library/executable

nicethor · July 17, 2020, 2:45pm

I need ~100k rows by ~10 columns of (unchanging) data to correct/classify a stream of incoming (changing) records. What is the best Rust way to do this? Do I make a SQLite db (maybe as part of my build.rs) and then embed that file into the library? Or do I include! a CSV and then process that into an in-memory SQLite db at the start of execution? Or something else?

drewkett · July 17, 2020, 2:56pm

What’s the size of a row of data? Is it mostly numbers and/or short strings?

What queries do you need to make? Are you looking up a row by a single column or are you doing some sort of search for best match?

It sounds like it’s probably not that much data. If the queries are just lookups by an index or something, my inclination would be to store it in a Vec in memory and create HashMaps for any lookups.
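
A minimal sketch of that Vec-plus-HashMap idea, assuming a made-up four-column row (the names `Row`, `index_by_code` and `sample_rows` are hypothetical and not from the thread). The index stores positions into the Vec, so the rows themselves are held only once:

```rust
use std::collections::HashMap;

// Hypothetical row layout: the thread does not name the real columns.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    code: String,
    name: String,
    number: u32,
    class: char,
}

// Map each distinct `code` to the positions of the rows that carry it.
fn index_by_code(rows: &[Row]) -> HashMap<&str, Vec<usize>> {
    let mut index: HashMap<&str, Vec<usize>> = HashMap::new();
    for (pos, row) in rows.iter().enumerate() {
        index.entry(row.code.as_str()).or_default().push(pos);
    }
    index
}

// Tiny stand-in for the ~100k-row table (values are made up).
fn sample_rows() -> Vec<Row> {
    let raw = [
        ("KX", "Klaxxon", 93322, 'K'),
        ("AB", "Abbott", 10001, 'A'),
        ("KX", "Klaxxon West", 93323, 'K'),
    ];
    raw.iter()
        .map(|&(code, name, number, class)| Row {
            code: code.to_string(),
            name: name.to_string(),
            number,
            class,
        })
        .collect()
}
```

Calling `index_by_code(&sample_rows_vec)` and then `index.get("KX")` would give the positions of both "KX" rows. One such map per lookup column is probably enough for simple queries.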

nicethor · July 17, 2020, 2:59pm

Mostly short strings, like | KX | Klaxxon | 93322 | K | ... |
Frequently I'll need to match three of the columns to get all the possible rows that meet the criteria, and then do a little judging of which is the best option.
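
One way to sketch the three-column match is a HashMap keyed by a tuple of the three columns, followed by a pass over the candidates to pick a winner. Everything here is an assumption for illustration: the column names, the `score` field, and the "highest score wins" rule stand in for whatever judging is actually needed.

```rust
use std::collections::HashMap;

// Hypothetical columns; `score` stands in for whatever the judging step uses.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    code: String,
    region: String,
    class: char,
    score: u32,
}

// Composite key made of the three columns being matched.
type Key<'a> = (&'a str, &'a str, char);

// Group row positions by the three-column key.
fn build_index(rows: &[Row]) -> HashMap<Key<'_>, Vec<usize>> {
    let mut index: HashMap<Key<'_>, Vec<usize>> = HashMap::new();
    for (pos, row) in rows.iter().enumerate() {
        let key = (row.code.as_str(), row.region.as_str(), row.class);
        index.entry(key).or_default().push(pos);
    }
    index
}

// Fetch every candidate for the key, then keep the one with the highest score.
fn best_match<'a>(
    rows: &'a [Row],
    index: &HashMap<Key<'a>, Vec<usize>>,
    key: Key<'a>,
) -> Option<&'a Row> {
    index
        .get(&key)?
        .iter()
        .map(move |&pos| &rows[pos])
        .max_by_key(|row| row.score)
}

// Made-up sample data.
fn sample_rows() -> Vec<Row> {
    let raw = [
        ("KX", "North", 'K', 40),
        ("KX", "North", 'K', 90),
        ("KX", "South", 'K', 70),
    ];
    raw.iter()
        .map(|&(code, region, class, score)| Row {
            code: code.to_string(),
            region: region.to_string(),
            class,
            score,
        })
        .collect()
}
```

With the sample data, `best_match(&rows, &index, ("KX", "North", 'K'))` would pick the row scored 90 out of the two candidates.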

Ok, I'll go back to looking at building ~5-10 HashMaps to index it. Do I just use include! then to pull the CSV into the final lib/exe?
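
For reference, the standard macro for embedding a text file as a `&'static str` is `include_str!` (plain `include!` expects Rust source). A rough sketch of parsing such embedded text at startup follows. The file name, the comma-separated layout, and the names `Row`, `parse_line` and `parse_rows` are assumptions, and an inline string stands in for the embedded file so the snippet is self-contained. The naive `split(',')` does not handle quoted fields; a real CSV parser such as the csv crate would.

```rust
// In a real crate this would probably be something like:
//     const DATA: &str = include_str!("data.csv"); // hypothetical path
// An inline string stands in here so the sketch compiles on its own.
const DATA: &str = "KX,Klaxxon,93322,K\nAB,Abbott,10001,A\n";

// Rows borrow directly from the embedded text, so no strings are copied.
#[derive(Debug, PartialEq)]
struct Row<'a> {
    code: &'a str,
    name: &'a str,
    number: u32,
    class: char,
}

// Parse one record; `n` is the 1-based record number used in error messages.
fn parse_line(n: usize, line: &str) -> Result<Row<'_>, String> {
    let mut fields = line.split(',').map(str::trim);
    let mut next = || {
        fields
            .next()
            .ok_or_else(|| format!("record {}: missing field", n))
    };
    let code = next()?;
    let name = next()?;
    let number = next()?
        .parse::<u32>()
        .map_err(|e| format!("record {}: {}", n, e))?;
    let class = next()?
        .chars()
        .next()
        .ok_or_else(|| format!("record {}: empty class", n))?;
    Ok(Row { code, name, number, class })
}

// Parse every non-empty line, stopping at the first malformed record.
fn parse_rows(data: &str) -> Result<Vec<Row<'_>>, String> {
    data.lines()
        .filter(|line| !line.trim().is_empty())
        .enumerate()
        .map(|(i, line)| parse_line(i + 1, line))
        .collect()
}
```

`parse_rows(DATA)` would run once at startup, and the resulting Vec can then feed whatever HashMaps are needed.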

2e71828 · July 17, 2020, 3:03pm

If you want to avoid the cost of parsing the data on every invocation, you could use build.rs to parse it at build time into repr(C) structs, and then use include_bytes! to embed the result.
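
A hedged sketch of the build-time idea. Note that it swaps in a different technique from the one described: instead of casting the embedded bytes to repr(C) structs (which needs unsafe code and care about alignment), it writes fixed-width records and decodes them explicitly. The record layout, `RECORD_LEN`, `Record`, `encode`, `decode` and the file name in the comment are all hypothetical.

```rust
// Hypothetical fixed-width layout per record:
// 2-byte code, 4-byte little-endian number, 1-byte class.
const RECORD_LEN: usize = 7;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Record {
    code: [u8; 2],
    number: u32,
    class: u8,
}

// Roughly what build.rs would do before writing the buffer into OUT_DIR.
fn encode(records: &[Record]) -> Vec<u8> {
    let mut out = Vec::with_capacity(records.len() * RECORD_LEN);
    for r in records {
        out.extend_from_slice(&r.code);
        out.extend_from_slice(&r.number.to_le_bytes());
        out.push(r.class);
    }
    out
}

// Roughly what the library would do with the embedded bytes, e.g. from
//     static DATA: &[u8] = include_bytes!(concat!(env!("OUT_DIR"), "/table.bin"));
// ("table.bin" is a made-up file name).
fn decode(bytes: &[u8]) -> Option<Vec<Record>> {
    // Reject buffers that are not a whole number of records.
    if bytes.len() % RECORD_LEN != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(RECORD_LEN)
            .map(|c| Record {
                code: [c[0], c[1]],
                number: u32::from_le_bytes([c[2], c[3], c[4], c[5]]),
                class: c[6],
            })
            .collect(),
    )
}
```

Decoding like this still touches every record at startup, but it skips text parsing and validation of the original CSV, which is likely where most of the cost would be.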

drewkett · July 17, 2020, 4:15pm

Just to follow up on what @2e71828 said: if you did use build.rs with include_bytes!, you could use it in the following way. It still builds the maps at run time, but I don't think you could avoid that with the built-in HashMap. If you want everything done at build time, it looks like phf supports compile-time hash maps as well.

Playground Link
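
The linked playground code is not reproduced in this thread. As a separate, minimal sketch of the same general idea (embedded bytes, with the map built at run time), the following builds the index once on first use via `std::sync::OnceLock` and reuses it afterwards. The 3-byte record layout, the `DATA` literal standing in for `include_bytes!` output, and the function names are assumptions for illustration only.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// Stand-in for the output of `include_bytes!(...)`.
// Hypothetical layout: 2-byte code followed by a 1-byte class, per record.
static DATA: &[u8] = b"KXKABAKXQ";
const RECORD_LEN: usize = 3;

// Index from 2-byte code to record positions, built on first call and then cached.
fn code_index() -> &'static HashMap<[u8; 2], Vec<usize>> {
    static INDEX: OnceLock<HashMap<[u8; 2], Vec<usize>>> = OnceLock::new();
    INDEX.get_or_init(|| {
        let mut map: HashMap<[u8; 2], Vec<usize>> = HashMap::new();
        for (pos, rec) in DATA.chunks_exact(RECORD_LEN).enumerate() {
            map.entry([rec[0], rec[1]]).or_default().push(pos);
        }
        map
    })
}

// Return the class byte of every record that carries the given code.
fn classes_for(code: &[u8; 2]) -> Vec<u8> {
    code_index()
        .get(code)
        .map(|positions| {
            positions
                .iter()
                .map(|&pos| DATA[pos * RECORD_LEN + 2])
                .collect()
        })
        .unwrap_or_default()
}
```

The cost of building the map is paid once per process, on the first lookup, rather than being avoided entirely.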

