Perpetual n00b struggling with ownership (again)

I say "perpetual" because I am in the midst of my second or third attempt to learn Rust: the first time was in 2015, then I started again about 2 years ago, then ... mostly gave up for a while, then got back into it this summer.

Anyway, I keep running into the same or very similar issues: things seem to go well, and I seem to understand what I am doing, as long as I am working with practice material (I've been through part of the canonical Book; currently working with the Blandy & Orendorff book; did the Rustlings course last week, etc). But the moment I start trying to develop something practical (which is how I learn best), I immediately run into seemingly insoluble issues with ownership/lifetimes/borrowing, and when that happens, everything I do to try to solve the problem just makes it worse. Which is the whole reason I gave up before.

Anyway, right now I'm trying to create a smallish library (with command-line tool) that stores data in a SQLite database - for which I am using the rusqlite crate. It's not a database client per se - just happens to store data in a database, so none of the DB interaction will be exposed to the user. Here's what my code looks like right now:

use rusqlite::{params, Connection, Result};
use std::path::{Path, PathBuf};

#[derive(Debug)]
pub struct Database<'a> {
    path: &'a Path,
    conn: Option<Connection>,
}

impl<'a> Database<'a> {
    pub fn new(path: &'a Path, overwrite: bool) -> Result<Self, Box<dyn std::error::Error>> {
        if &path.exists() & !overwrite {
            panic!("DB file '{:?}' already exists.", &path);
        }
        let konn = Connection::open(&path)?;
        Ok(Database { path: &path, conn: Some(konn) })
    }

    pub fn open(&mut self) {
        if let Ok(konn) = Connection::open(self.path) {
            self.conn = Some(konn);
        }
    }

    pub fn close(&mut self) {
        if let Some(konn) = &self.conn {
            if let Ok(_) = konn.close() {  // NOPE!
                self.conn = None
            }
        }
    }
}

The rationale for wrapping the rusqlite::Connection in my own Database object is that I expect the application to repeatedly access a single DB file, and I want it to automatically open and close the connection as needed.

As indicated by the "NOPE!" comment, compilation fails where I try to close the DB connection, because konn.close() requires a move, which is not permitted. And that totally makes sense, but I don't have the faintest idea of what to do about it. I'm prepared to believe that I'm thinking about the problem in the wrong way, but if so, I have no idea of the right way.

Solutions for this specific situation would of course be helpful, but I think what would be even more helpful is higher-level advice on how to conceptualize this kind of issue - because as I mentioned above, this kind of thing keeps happening, and it's driving me crazy! :wink:

I strongly recommend changing the path field to be of type PathBuf so you don't need the lifetime. Storing a reference in the struct is not a good idea.

As for the close call, you need to take ownership of the connection to close it.

pub fn close(&mut self) {
    if let Some(konn) = self.conn.take() {
        konn.close();
    }
}

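This uses Option::take, which gives you an owned version of the option and replaces the self.conn field with None. It is essentially the following, except that the version below does not compile, because you can't split the operation into two steps: moving the value out of self.conn would leave the field uninitialized behind a &mut reference.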

pub fn close(&mut self) {
    let conn = self.conn; // rejected: cannot move out of `self.conn` behind `&mut self`
    self.conn = None;
    if let Some(konn) = conn {
        konn.close();
    }
}
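
To try the take() pattern without a database, here is a minimal stand-alone sketch. FakeConnection and Wrapper are hypothetical stand-ins, not rusqlite types; the only property assumed is that close consumes self, which is what makes the original code fail to compile.

```rust
/// Hypothetical stand-in for a connection whose `close` consumes `self`.
#[derive(Debug)]
struct FakeConnection;

impl FakeConnection {
    /// Takes `self` by value, so the caller must own the connection.
    fn close(self) -> Result<(), String> {
        Ok(())
    }
}

/// Hypothetical wrapper that may or may not hold an open connection.
#[derive(Debug)]
struct Wrapper {
    conn: Option<FakeConnection>,
}

impl Wrapper {
    fn new() -> Self {
        Wrapper { conn: Some(FakeConnection) }
    }

    fn is_open(&self) -> bool {
        self.conn.is_some()
    }

    /// Returns true if a connection was open and has now been closed.
    fn close(&mut self) -> bool {
        // `take` moves the value out and leaves `None` behind in one step,
        // so `self.conn` is never left uninitialized.
        match self.conn.take() {
            Some(conn) => conn.close().is_ok(),
            None => false,
        }
    }
}
```

Calling close on a fresh Wrapper returns true and leaves it closed; a second call finds None and returns false.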
4 Likes

Alice's solution with take() is good, but, more generally, the pattern is to take a value from the existing member and put a new one in it using std::mem::swap, when something like take() isn't available.
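
As a rough illustration of that pattern, the sketch below uses a hypothetical Logger struct whose field is a plain Vec rather than an Option, so Option::take does not apply. It shows both std::mem::replace and std::mem::swap; for types that implement Default, std::mem::take is a shorter alternative.

```rust
use std::mem;

/// Hypothetical struct whose field is not an `Option`.
struct Logger {
    buffer: Vec<String>,
}

impl Logger {
    /// Hands the accumulated lines to the caller and leaves an empty buffer behind.
    fn drain_with_replace(&mut self) -> Vec<String> {
        // `mem::replace` stores the new value and returns the old one.
        mem::replace(&mut self.buffer, Vec::new())
    }

    /// Same effect, written with `mem::swap` and a local variable.
    fn drain_with_swap(&mut self) -> Vec<String> {
        let mut old = Vec::new();
        mem::swap(&mut self.buffer, &mut old);
        old
    }
}
```

In both methods the field always holds a valid value, which is why the borrow checker accepts moving the old contents out through &mut self.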

3 Likes

You're not allowed to use references in structs until you think Rust is easy. They're the evil-hardmode of Rust that will ruin your day.

:wink:

Use Box or Arc to store things in structs "by reference". Temporary borrows don't do what you think they do.

23 Likes

Always give lifetimes descriptive names like 'path rather than an abstract name like 'a. It doesn't solve your problem, but it makes it much clearer where the problem is occurring. Check out this explanation; I hope it helps.

5 Likes

So true. I wish someone had told me this sooner.

2 Likes

I guess that means I have to spend about 5 years studying the language before I can even start to work on projects that interest me. Okay then. LOL.

I strongly recommend changing the path field to be of type PathBuf so you don't need the lifetime. Storing a reference in the struct is not a good idea.

Okay, I will try that. But, other than the obvious syntactic complications, can you explain why it's not a good idea?

I'd be curious to know why you think that having a reference in a struct is important to work on projects that interest you.

In this case, for example, it saves at most a single copy of a single string at the expense of permanently tying your Database struct to a single location in memory for as long as the Database struct is alive. I'm assuming you have other use cases in mind, so maybe there are use cases where it makes more sense. But as someone who initially wanted references all over the place because I thought they would be better and faster when learning Rust, I can say they really do complicate things a lot and should be used selectively once you have a better feel for them.

6 Likes

Other languages use the term "reference" for storing things "by reference" or just referencing any object anywhere in general. It's not like that in Rust.

What Rust calls a "reference" is a much more specific thing that is not so general-purpose. It has a narrower, more restrictive usage. Rust references are more like read-only or write-exclusive locks. They make their target unmovable and immutable for the entire duration of their existence. They can't exist on their own, only as a counterpart of an owned value.

References in structs also make the whole struct itself temporary, and everything that touches that struct becomes temporary and tied to the scope of the borrowed value that started it.

If these restrictions (the ones that cause you to fight with the borrow checker) aren't what you want to achieve, then you don't want temporary references.

99% of the time when you need to store something "by reference", Box (or Arc or String or PathBuf or Vec or some other owned type) is the right answer.

Note that &T and Box<T> have identical representation in memory — a pointer. They differ by ownership.
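
A small sketch of both points, using a hypothetical Owner struct: the size check confirms that &T and Box<T> are each one pointer wide for a sized T, and make_owner shows that a struct holding only owned types can be returned from the function that built it.

```rust
use std::mem::size_of;
use std::path::PathBuf;

/// Hypothetical struct that owns its data, so it needs no lifetime parameter.
struct Owner {
    path: PathBuf,
    value: Box<u64>,
}

/// An owning struct can be built inside a function and returned,
/// because nothing in it borrows from this function's stack frame.
fn make_owner() -> Owner {
    let local = String::from("data.db");
    Owner { path: PathBuf::from(local), value: Box::new(7) }
}

/// For a sized `T`, `&T` and `Box<T>` are both a single pointer.
fn same_size() -> bool {
    size_of::<&u64>() == size_of::<Box<u64>>()
        && size_of::<Box<u64>>() == size_of::<usize>()
}
```

A version of Owner with path: &Path could not be returned from make_owner, since the borrowed string would be dropped when the function ends.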

24 Likes

The above explanation is so clear, and so essential for every Rust nauplius, that it ought to get prominent placement in every introduction to programming in Rust. We probably have more than 100 threads a year in this forum about problems that newcomers have with lifetimes, most of which are caused by misunderstanding the significant constraints imposed by including a Rust & or &mut reference in a struct.

10 Likes

The upshot of it not being posted prominently enough is that @kornel has gotten very good at explaining the issue.

8 Likes

I'd be curious to know why you think that having a reference in a struct is important to work on projects that interest you.

I'm not sure it is - I was being a bit facetious. It's just that I have a couple of rather ambitious projects that I've been tossing around in my mind for a long time, and Rust seems like the best available language to implement them. And I'm impatient because I've been programming in various languages for ... a while, and usually it doesn't take me so long to learn a new language ... but every previous language I've seriously worked with has had garbage collection. Anyway, I'm trying to model my program based on what has worked in the past, and as I implied above, I don't really know how to "think in Rust" yet. But the responses here are very helpful.

1 Like

Okay, on further thought, your question touches on the One Big Problem that keeps hanging me up in learning Rust. There have been many specific instances, but I think there is really one overall pattern:

I want to define functions that take some value as an argument, and return a struct containing that value.

In this case, that function is Database::new() (which maybe should be called Database::from(), but right now, I'm just trying to get something working). To me this seems like an obvious and natural pattern (BTW, I've done a lot of functional programming, e.g. in OCaml & Scheme), but I can never, ever get it to work in Rust. And believe me, I've spent many, many hours trying, because I always try to solve problems myself before I ask questions - and I'm also very stubborn :wink:

Is this simply Not How Things Are Done [TM] in Rust?

If so, could somebody please show me a simple example of how things are done? Given that:

  • You want to work with a SQLite database ...
  • which may need to be created ...
  • in a user-supplied location in the filesystem ...
  • but the user accesses a high-level interface that does not expose any SQL queries or other details of database interaction ...

What would be an idiomatically Rusty way to implement these basic operations - creating, opening, and closing the database?

The quick answer is that the data will have to be copied into the struct. But if you think about it, that’s really the only option since the struct has a particular layout and the incoming data has a particular likely different layout. If it’s a lot of data like a Vec with a lot of values in it, then ideally you would move the data to the struct, which instead of copying all the data, would just copy the pointers to the heap. With rust there’s a lot of control over what exactly you want to copy and also whether you do want to move it (in which case you will lose access to the data using the original variable).
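
To make the move-versus-copy distinction concrete, here is a minimal sketch with a hypothetical Holder struct. It assumes nothing beyond the standard library: moving a Vec into the struct keeps the same heap buffer, while cloning allocates a new one.

```rust
/// Hypothetical struct that stores a vector by value.
struct Holder {
    items: Vec<u32>,
}

/// Moves the vector in: only the pointer, capacity, and length are copied.
/// The caller can no longer use the original variable afterwards.
fn hold_by_move(items: Vec<u32>) -> Holder {
    Holder { items }
}

/// Clones the data in: the elements are copied into a new heap allocation.
/// The caller keeps full access to the original.
fn hold_by_clone(items: &[u32]) -> Holder {
    Holder { items: items.to_vec() }
}
```

Comparing items.as_ptr() before and after each call shows the difference: the moved vector reports the same address, the cloned one a different address.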

I think what might be helpful is if you show some sample code (it can be short) with a concrete example of something you’re trying to do but can’t seem to figure out a good way to do it in rust.

EDIT: I glossed over the details of what it looks like to implement what I discussed, but there are several options available to you, which is why I think showing some code/pseudocode would be helpful to tie the discussion to something in particular rather than keeping it abstract.

Uh, well, the example in my original post was the sample code. That's all I've got right now ...

As for explaining the project further, that might or might not help, and I'm a bit reluctant because I tend to think way outside the box, and - based on many years' experience - people almost invariably misunderstand my ideas unless I either put a great deal of thought into the explanation, or produce a working product, and then I have to deal with people wanting to talk about how wrong my idea is ... :wink:

I do have, somewhere, some code from 2015 which I somehow managed to take a bit farther - though it still wasn't anywhere near a working program. I'll see if I can find that ...

That’s the normal way to use Rust; if you only take and store a reference to the value (anything that takes a lifetime parameter, really), your struct will be forever tied to the stack frame where that value resides, because you didn’t put the value in the struct. Instead of taking a reference to the database object, take the whole thing.

The next question that comes up is how to let multiple objects all have access to the same value, and that has lots of different options depending on the exact scenario:

  • References let you temporarily loan out an object, usually only for the duration of a function call
  • Cloning lets you duplicate an object so that everyone can have their own copy
  • Rc/Arc are references that keep their target alive as long as they exist

For long-lived structures, you almost always want one of the latter two options.
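
The three options can be sketched side by side. Config, OwnsCopy, and SharesRc are hypothetical names invented for this illustration; Rc is used for the single-threaded case, and Arc would take its place across threads.

```rust
use std::rc::Rc;

/// Hypothetical value that several objects need access to.
#[derive(Debug, Clone, PartialEq)]
struct Config {
    name: String,
}

/// Option 1: borrow only for the duration of a call.
fn name_len(cfg: &Config) -> usize {
    cfg.name.len()
}

/// Option 2: a long-lived holder with its own clone.
struct OwnsCopy {
    cfg: Config,
}

/// Option 3: holders share one value, which lives until the last `Rc` is dropped.
struct SharesRc {
    cfg: Rc<Config>,
}

fn build_holders(cfg: Config) -> (OwnsCopy, SharesRc, SharesRc) {
    let copy = OwnsCopy { cfg: cfg.clone() };
    let shared = Rc::new(cfg);
    // `Rc::clone` only bumps a reference count; the `Config` is not duplicated.
    let a = SharesRc { cfg: Rc::clone(&shared) };
    let b = SharesRc { cfg: shared };
    (copy, a, b)
}
```

None of the holder structs carries a lifetime parameter, so all three can be returned from build_holders and stored anywhere.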

7 Likes

No that’s fine. I forgot there was code in the original post. In this case, you just want to use PathBuf instead of Path. (PathBuf is the owned version of &Path) I would write it like this.

#[derive(Debug)]
pub struct Database {
    path: PathBuf,
    conn: Option<Connection>,
}

impl Database {
    pub fn new(path: impl Into<PathBuf>, overwrite: bool) -> Result<Self, Box<dyn std::error::Error>> {
        let path = path.into();
        // Refuse to reuse an existing file unless the caller asked for it.
        if path.exists() && !overwrite {
            panic!("DB file '{:?}' already exists.", &path);
        }
        let konn = Connection::open(&path)?;
        Ok(Database { path, conn: Some(konn) })
    }
...
}

The first obvious advantage is that there are no more lifetimes on the struct. This makes it much easier to use. Additionally, a path is likely to be a small amount of data, so copying the data if need be is not a big deal, and this will not be the bottleneck in the program so you might as well copy.

The change I made to the type signature using impl Into<PathBuf> makes the function generic. In this case in particular, it makes it generic over many different types, including &Path, PathBuf, String, and &str, which just makes the function easier to use. If you pass a &Path or &str, it will copy the path into new memory owned by Database. If you pass a PathBuf (or String), it will take ownership of the path in memory, avoiding a copy of the path data. (It will still copy the pointer and the length of the path.)
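
Here is a small database-free sketch of the same signature, using a hypothetical Location struct, to show the four argument types being accepted by one constructor.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical struct that owns its path (no database involved here).
struct Location {
    path: PathBuf,
}

impl Location {
    fn new(path: impl Into<PathBuf>) -> Self {
        Location { path: path.into() }
    }
}

/// Builds a `Location` from each of the four argument types.
fn all_forms() -> Vec<Location> {
    let owned_string = String::from("a.db");
    let owned_path = PathBuf::from("a.db");
    vec![
        Location::new("a.db"),            // &str: bytes are copied
        Location::new(Path::new("a.db")), // &Path: bytes are copied
        Location::new(owned_string),      // String: buffer is moved
        Location::new(owned_path),        // PathBuf: buffer is moved
    ]
}
```

After the last two calls, owned_string and owned_path can no longer be used, because ownership moved into the Location values.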

Hopefully that's helpful. Feel free to ask questions in this forum going forward (rather than just banging your head against the wall, which is also what I tend to do when I get stuck, against my better judgement).

5 Likes

I think that the short answer is that you can certainly do this in Rust too, but you have to do it by taking ownership. If you only store a reference to the value, you haven't really put the value in the struct. References only borrow the value from somewhere else, so if you have a reference, there must be some other variable that owns the value.

1 Like

I want to define functions that take some value as an argument, and return a struct containing that value.

As I understand ownership, that's exactly how rust works.

fn consumes_x(x: Type) {}   // any x passed into `consumes_x()` goes out of scope after here

Anything passed into a function whole (not as a reference) transfers ownership into the function. Then, for that variable (or value from it) to continue living past the braces of the function, it must be used to create something that is returned, otherwise it disappears forever.

The alternative:

fn uses_x(x: &Type) {}   // the caller's value remains in scope; the borrow x lives for the duration of the function braces

keeps x alive in any original scope calling uses_x(), while providing read-only access to it. This does also mean that anything produced by uses_x() needs to make sure its lifetime isn't the same lifetime as x, or else whatever uses_x() returns will only be able to live as long as x also does.

Rust lets you elide the lifetimes by default, but the end result is that what is returned will only be able to live as long as x does, unless something is changed:

fn uses_x<'x_life>(x: &'x_life Type) -> OutType<'x_life> { todo!() }

Once you're trying to work different lifetimes on the input and output, things get tougher...

fn uses_x<'x_life, 'new_life>(x: &'x_life Type) -> OutType<'new_life> { todo!() }

So you have to provide whatever functionality makes it so that the return value will have a new lifetime.
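
One common way to give the return value a lifetime independent of the input is to return an owned value. The sketch below contrasts the two cases; the function names are made up for illustration.

```rust
/// Returns a borrowed slice: the result cannot outlive `text`.
fn first_word<'text>(text: &'text str) -> &'text str {
    text.split_whitespace().next().unwrap_or("")
}

/// Returns an owned `String`: the result is independent of `text`.
fn first_word_owned(text: &str) -> String {
    first_word(text).to_string()
}

/// The owned result is still usable after the input has been dropped.
fn outlives_input() -> String {
    let input = String::from("hello world");
    let word = first_word_owned(&input);
    drop(input);
    word
}
```

Swapping first_word_owned for first_word inside outlives_input would be rejected by the compiler, because the returned slice would still borrow from input.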

Also, I'm in the same boat of trying to learn rust by working a project. So, if I've got anything wrong, I'd appreciate any corrections anyone has... Every now and then it feels like I get some clarity on an aspect of rust, but I'm constantly finding new and interesting things to screw up!