Hi all
A best practice question on date range storage for rust. For reasonably large data sets is it better to. (Assume the data stores the users favorite color, and that we could need that answer for any or every day)
A) store date ranged data with just the from and to records? Like this (inception and current can of course be dates)
Fields:
Person / color / from date. / to date
Values:
John / blue / inception / 3/31/2021
John / green / 4/1/2021 / 7/31/2021
John / red / 8/1/2021 / current
Fran / green / inception / current
...etc. for each persons' preference
B) store each day's values for each person
Fields:
Person / color / date
... prior periods...
John / blue / 3/29/2021
John / blue / 3/30/2021
John / blue / 3/31/2021
John / green / 4/1/2021
John / green / 4/2/2021
Fran / green / 3/29/2021
Fran / green / 3/30/2021
Fran / green / 3/31/2021
Fran / green / 4/1/2021
Fran / green / 4/2/2021
... etc for every day in the calendar for every person in the database.
Obviously option 2 is a ton more data but it seems like the response time may be faster at scale.
Very cuious about the real difference.
The database should matter but i dont know what the better options are for rust or if rust has a preference.
Yep - i understand the correct way to structure the database. My question is if the trade off of making the database less normalized is worth it for performance when it comes to using date ranges.
The answer may be to always model the database correctly no mattter what, but thats sort of my question.
There is no general answer to the question of whether denormalization is worth the benefit. You'll have to measure performance of the two approaches using realistic workloads and data sets for your particular use case, with a database of your choosing. Even more importantly, you'll need to try to predict future queries you may need and whether they can be performed using the denormalized version. This is all about the database and how you use it; nothing to do with Rust.
The less data you have, the faster it gets.
In your two examples, there is no need for "to date" or "current". It is just the date the person prefers a certain color. The transition from one color to the next is defined by that next color appearing. Or do you want to have gaps between end date and start date?
If you want to avoid searching for every day, assign a unique id for every person (how many?) that points to the last db-entry. But I think I wouldn't do that.
If you store your data in a graph, finding the last date entry is extremely simple and fast.