[geo-rust] What's a reasonable way to have geometries projection-aware?

I've just found a bug in my code, that in the output CSV, half of lines had geometries in WGS (4326), and another half in pseudo-Mercator (3857).

Initially, all of geoms came from files in 4326, but after a number of divergent manipulations, some of them were reprojected in 3857, some weren't, and then all were written into the file.

I think I'm going to face the problem more than once, and thought what if I could use type system to protect myself from pushing geometries in wrong CRS?

I wonder what's a strategy to have geometries projection-aware?

To start, I'll be fine with 2 CRS, 4326 and 3857.

Problem #1 is that I'll need to keep Proj around, but it seems that geoms are created in one place -- where I read files, so this should be manageable.

BTW, I wonder how expensive is the call Proj::new_known_crs? Is this an sqlite query?

Problem #2 is what and how do I wrap. Maybe a generic AwareGeom<T> with generic fn reproject(&self) method?

Problem #3 is how do I serialize and deserialize them, so that I can visualize the files with QGIS?

Any suggestions are welcome.

I've done some geometry stuff in the past, and I really liked how the euclid crate uses type tags to help you keep track of co-ordinate spaces.

In my case, I was often going between "model" space (where you construct your CAD model) and "screen" space (where things are rendered on the monitor) when rendering the entities in the CAD library I was building, and using a type tag meant it's impossible to accidentally mix a euclid::Point2D<ScreenSpace> and a euclid::Point2D<ModelSpace> without having a euclid::Transform2D<_, ScreenSpace, ModelSpace> to translate between them (in this case, the transformation matrix came from the camera component).

I'm not sure if you can reuse the euclid crate directly, but you might still be able to take inspiration from the way they use type parameters to tag things.


If doing it in the type system makes sense depends on how many coordinate systems you have to deal with, and how dynamic that selection is.

It might make sense to immediately convert all input into a standardised internal coordinate system (that is tagged), then convert it to the output/display coordinate system at the end. That way you don't end up with thousands of different tagged types (which you would if you support all geographic coordinate systems in use across the world).

It all depends on the specific application.

1 Like

That's what I did, and the bug is gone, and I have less &proj passed in the code. But this does not work in tests: where you submit the value into inner functions. I spent an hour yesterday figuring out why a test was failing, and turned out I gave it a point in CRS 4326, not in the inner 3857.

Well, if you write incorrect numeric values in the test, there isn't much you can do. But if you have something like (feel free to use names that suit you of course):

  • Coord<Input, SystemEnum>(system-by-runtime-value, value)
  • Coord<Internal, ()>((), value)
  • Coord<Display, ()>((), value)

I'm on my phone, so this is a rough sketch, should probably be struct, not tuple struct, implement a trait for code that is agnostic over coordinate system, etc. Type aliases should also be used for convenience of course.

But the general idea is that some coordinates are dynamic at runtime which system they are in (for input and maybe output) and these carry the information about the system with them, while the internal system is fixed at compile time, and doesn't have to carry that extra info. Maybe you can get away with just two types: InternalCoord and DynamicCoord.

You can implement conversion via From and/or TryFrom as needed of course.

Then when you write tests, you will either get a type error or need to add a call to into(). Of course if you just write the wrong values and claim it is in a different coordinate system than what it is, nothing can save you.

Now, carrying an extra field per coordinate may be too much overhead for you, I don't know. But lets say you have a vector of 100k coords. Why not make a newtype VecWithSystem(system, inner-vec-of-just-raw-coords). In this case, the various accessory functions could return the suitable typed coords when you actually accessed it. And for the internal type which is statically typed it could be made to not have overhead.

I should also note I don't work in this field. All I know about coordinate systems come from having to convert between left hand and right hand systems at work once, plus contributing changes to open street map.

1 Like

Or perhaps take a look at the pure Rust Geodesy crate?

1 Like