Date range handling

Hi all
A best practice question on date range storage for rust. For reasonably large data sets is it better to. (Assume the data stores the users favorite color, and that we could need that answer for any or every day)

A) store date ranged data with just the from and to records? Like this (inception and current can of course be dates)

Fields:
Person / color / from date. / to date

Values:
John / blue / inception / 3/31/2021
John / green / 4/1/2021 / 7/31/2021
John / red / 8/1/2021 / current
Fran / green / inception / current
...etc. for each persons' preference

B) store each day's values for each person

Fields:
Person / color / date

... prior periods...
John / blue / 3/29/2021
John / blue / 3/30/2021
John / blue / 3/31/2021
John / green / 4/1/2021
John / green / 4/2/2021
Fran / green / 3/29/2021
Fran / green / 3/30/2021
Fran / green / 3/31/2021
Fran / green / 4/1/2021
Fran / green / 4/2/2021
... etc for every day in the calendar for every person in the database.

Obviously option 2 is a ton more data but it seems like the response time may be faster at scale.

Very cuious about the real difference.

The database should matter but i dont know what the better options are for rust or if rust has a preference.

Thanks for your time.

One solution, which might be a little over-engineered for your use case, would be to use a time series database.

No matter where and how you store it, I'd read about Database normalization - Wikipedia first.
Edit: And about the ISO 8601 - Wikipedia standard for dates.

1 Like

Yep - i understand the correct way to structure the database. My question is if the trade off of making the database less normalized is worth it for performance when it comes to using date ranges.
The answer may be to always model the database correctly no mattter what, but thats sort of my question.

There is no general answer to the question of whether denormalization is worth the benefit. You'll have to measure performance of the two approaches using realistic workloads and data sets for your particular use case, with a database of your choosing. Even more importantly, you'll need to try to predict future queries you may need and whether they can be performed using the denormalized version. This is all about the database and how you use it; nothing to do with Rust.

The less data you have, the faster it gets.
In your two examples, there is no need for "to date" or "current". It is just the date the person prefers a certain color. The transition from one color to the next is defined by that next color appearing. Or do you want to have gaps between end date and start date?

If you want to avoid searching for every day, assign a unique id for every person (how many?) that points to the last db-entry. But I think I wouldn't do that.
If you store your data in a graph, finding the last date entry is extremely simple and fast.

This is helpful thanks.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.