Wave-buoy: Storing data to SD card robustly and transmitting over unstable connection

Hi,

I am working on an open-source wave-buoy and drifter for search-and-rescue, meteorological and oceanographic purposes: GitHub - gauteh/sfy: A low-cost drifting wave buoy for near-shore deployments. The firmware and data server are written in Rust and use the blues.io notecard to send wave data and GPS trajectories home. We've already tested it in the surf outside Norway:


There's a hackster.io article coming.

However, the buoy needs an SD card so that no data is lost if the connection is temporarily unavailable. I was planning on using embedded-sdmmc, or maybe littlefs2.

My concern is how to keep track of what has been sent without modifying anything more than necessary on the SD card. Preferably the latest messages or data should be sent first, which means that I can't just increment an ID from the start. The connection could also be lost while transmitting, which would leave "holes" in what has been sent, because new messages are added before all unsent data is transmitted. I would prefer not to have to scan through all packages and modify a sent bit on them, so that they can remain immutable. I think a list of unsent packages stored in a file on the SD card could quickly grow very large and maybe become slow to write. It would also be susceptible to corruption. Do you have any suggestions for how to keep track of what has been sent in a robust way that can handle power loss without losing data?

Cheers, Gaute

It sounds like the buoy receives information back from the server to know what failed to send?

In that case the server could ask for specific messages (or a range of them) to be sent again.

Yes, it could be kept track of by the server. But the cellular connection is on for a few minutes and then stays off most of the time to save battery. I think it would make things more complex, since the buoy has all the information it needs to know what to send next. Sending a huge list, or especially many small requests, from the server to the buoy would be very slow (I think). Even if a range is requested, the buoy needs to know which messages in the range were sent successfully.

I think it would be nice to have a way to do this anyway, since there might be more data on the SD card than can realistically be sent, so the server or user could prioritize what is interesting. But I think that would be for a later version.

In that case...
If the (re)sending algorithm prioritizes old messages first, you would only have to track the last message that failed to send (or the last ID, so that 0..id have all been sent). That information could be added to the messages stored on the SD card. It would "waste" a small amount of extra space, but nothing needs to be edited at a later stage, and in case of power loss the buoy can resume at the ID stored in the last message.
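A minimal sketch of that oldest-first scheme, under stated assumptions: the `Header` layout, its field names and the `resume` helper are hypothetical and not taken from the sfy firmware. Each appended record carries the current "everything below this ID has been sent" mark, so recovery only needs the last readable record and nothing already written is modified.

```rust
/// Hypothetical header prepended to every package appended to the card.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Header {
    /// Monotonically increasing ID of this package.
    pub id: u32,
    /// When this record was written, all packages with an ID below this
    /// value had been transmitted (i.e. the range 0..sent_below).
    pub sent_below: u32,
}

impl Header {
    /// Serialize as two little-endian u32 values.
    pub fn to_bytes(self) -> [u8; 8] {
        let mut b = [0u8; 8];
        b[..4].copy_from_slice(&self.id.to_le_bytes());
        b[4..].copy_from_slice(&self.sent_below.to_le_bytes());
        b
    }

    /// Inverse of `to_bytes`.
    pub fn from_bytes(b: [u8; 8]) -> Header {
        Header {
            id: u32::from_le_bytes([b[0], b[1], b[2], b[3]]),
            sent_below: u32::from_le_bytes([b[4], b[5], b[6], b[7]]),
        }
    }
}

/// After a reboot, the last readable header says where to continue.
/// Returns (next ID to write, next ID to send).
pub fn resume(last: Option<Header>) -> (u32, u32) {
    match last {
        Some(h) => (h.id + 1, h.sent_below),
        None => (0, 0),
    }
}
```

The mark can lag behind reality by however many packages were sent since the last write, so after a power loss a few packages may be sent twice; the server would have to tolerate duplicates.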

Yes, but I want to prioritize newest messages first.

There are lots of good interval data structures for dealing with ranges,
like union/find or interval trees (or whatever you want to call them).
I kind of think, though, that you could use a priority queue/binary heap and a monotonically increasing ID,
and collect unsent data into a range of sorts... (I didn't use Range because of reasons though)

Only lightly tested, and doesn't really solve syncing to disk or power loss, but here's my attempt at prioritizing by newest message, and leaving a trail of ranges of unsent IDs...

Cool.

As in every thing you want to send to the server must, at some point, be sent? There can be zero data loss?

What quantity of data are you transferring?

Thanks, something like this could work. Need to make it not use allocations, but I think all I need is in heapless.

Yeah, I figured so. I kind of think the priority queue aspect is perhaps not even necessary there; you might be able to get away with a normal stack, but the peek_mut + pop aspect is fairly critical and I'm uncertain which Rust data structures besides BinaryHeap have that. Perhaps the priority queue aspect is useful for some part of loading from disk though... anyhow, there are probably improvements that could be made to it.

Also, I think there is a packet-dropping bug in there and that it should be: if ... { pop() } else { subtract }. Sorry for that.
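This is not the attempt mentioned above, just a minimal sketch of the same idea with the pop-or-subtract fix applied. `Unsent` and its methods are hypothetical names, and it assumes IDs increase monotonically. It uses a plain `Vec` as a stack of inclusive ID ranges; on the buoy a fixed-capacity container would presumably take its place.

```rust
/// Inclusive ranges (lo, hi) of unsent IDs. The last element holds the
/// newest IDs, so the newest message is always sent first.
#[derive(Default)]
pub struct Unsent {
    ranges: Vec<(u32, u32)>,
}

impl Unsent {
    /// Record a freshly logged package. Extends the top range when the ID
    /// is contiguous with it, otherwise starts a new range above the hole.
    pub fn push(&mut self, id: u32) {
        if let Some(top) = self.ranges.last_mut() {
            if top.1.checked_add(1) == Some(id) {
                top.1 = id;
                return;
            }
        }
        self.ranges.push((id, id));
    }

    /// Newest ID still waiting to be sent.
    pub fn peek(&self) -> Option<u32> {
        self.ranges.last().map(|&(_, hi)| hi)
    }

    /// Call after the ID returned by `peek` was transmitted successfully.
    pub fn mark_sent(&mut self) {
        if let Some(&(lo, hi)) = self.ranges.last() {
            if lo == hi {
                // The range held a single ID: drop it entirely.
                self.ranges.pop();
            } else if let Some(top) = self.ranges.last_mut() {
                // Otherwise shrink the range from its newest end.
                top.1 = hi - 1;
            }
        }
    }

    /// Remaining ranges, oldest first (e.g. for writing a checkpoint).
    pub fn ranges(&self) -> &[(u32, u32)] {
        &self.ranges
    }
}
```

With IDs 0 to 5 logged, 5 and 4 sent, and then 6 logged, the stack holds `(0, 3)` and `(6, 6)`: ID 6 goes out next, after which sending continues backwards from 3. The state is one pair of integers per gap rather than one entry per package.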

The buoy is logging acceleration in packages of 1024*3 components, approximately 20 seconds at the current sample rate. The data is stored as half::f16s. So it should be about 1.2 MB / hour, which is manageable.
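As a back-of-the-envelope check of that figure, assuming exactly 20 s per package (the real duration is only approximately that) and counting the raw samples only; the constant and function names are made up for the example:

```rust
/// Values per package: 1024 samples of 3 acceleration components.
pub const VALUES_PER_PACKAGE: u32 = 1024 * 3;
/// An IEEE 754 half-precision float (`half::f16`) occupies 2 bytes.
pub const BYTES_PER_VALUE: u32 = 2;
/// Assumption: exactly 20 s of data per package.
pub const SECONDS_PER_PACKAGE: u32 = 20;

/// Raw payload size of one package in bytes.
pub const fn package_bytes() -> u32 {
    VALUES_PER_PACKAGE * BYTES_PER_VALUE
}

/// Raw payload logged per hour in bytes.
pub const fn bytes_per_hour() -> u32 {
    package_bytes() * (3600 / SECONDS_PER_PACKAGE)
}
```

That gives 6144 bytes per package and 1,105,920 bytes, roughly 1.1 MB, per hour before any per-package headers or file-system overhead, which is in the same ballpark as the estimate above.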

It is not the end of the world if a couple of 20 s packages are corrupted, but I would like to avoid modifying stored packages so that I don't risk destroying already logged packages or the file system.

I think that my first version will just store the data to the SD card, and transmit what I can along with fresh data over the network. Then it can be extended with range requests from the server (as @ratmice suggested). These range requests can be served sequentially from one end with a single ID to keep track, so I don't need to deal with disjoint ranges.

Have you considered using FRAM instead of an SD card?

Is power loss a concern?

Do you have an idea of how often the buoy will be able to transmit data? In other words, do you know the longest time span that the buoy will not be able to transmit data?

I can't really help here other than to say I think it should be the server's problem to request missing data. Anything to keep the embedded thing simpler and let the server with the resources do the actual hard work...

Bit handwavy, I'm afraid, but that's what I can offer in terms of help.

But I'm also curious: what are you doing with those buoys? I have no idea why one would use those?

I have, briefly, but the electronics might be completely destroyed by water, and in that case the SD card might still be retrievable, while FRAM would require that the buoy comes back online.

Yes, it will eventually lose power, but as long as most of the data is intact it is OK. If a few packages at the end are lost or corrupted it does not matter.

That will depend a lot. It is using the cellular network and it is drifting in the ocean. It can be minutes, hours, or it could be blown offshore and end up in a different country a week later and come back online. In which case it will be difficult to retrieve the buoy, but it would be great to get the data.

All of these buoys are intended to be recovered (as opposed to much other oceanographic equipment), but there's always a significant risk that one will be lost.

These are being used at the Meteorological Institute in Norway for measuring waves and drift trajectories. Most incidents (oil spills, search & rescue, pollution, or drifting boats or ships) happen close to shore. The drift of an object in the sea depends on wind, ocean current and waves, as a function of the shape and size of the object. So a relatively cheap buoy is perfect for measuring currents and waves. We are also pretty interested in waves, and to be able to look at breaking waves we need really high frequencies (relative to typical ocean waves). This buoy is for close to the shore, while there are other related projects [0][1] that work in the open ocean over satellite, but they can only transmit a fraction of the data: a spectrum recorded over a period, at intervals of a few hours. Really useful for the open ocean, but not enough for the details we're trying to capture. This project is also useful for exploring what's possible with the cellular network if you are trying to measure or communicate other things in the ocean close to shore.

[0] GitHub - jerabaul29/OpenMetBuoy-v2021a: An easy to build, affordable, customizable, open source instrument for oceanographic measurements
[1] SWIFT

Hello, your problem sounds like something satellite experiments or astrophysics experiments would also face? I don't have a contact there though.