[Proposal - Arch Linux] Port `Aura` to Rust

If you decide to go for it, remember I do more than alpm bindings :wink:

See:

aur-fetch
aur-depends
srcinfo
pacmanconf
raur

Yes! I've noticed you've done quite a lot there and am very grateful :slight_smile: And actually I've been planning to contact you about collaboration:

  • my experimentation can probably help your alpm series move toward 1.0 confidence (did you have a date in mind to graduate from the rcs?)
  • I managed to get r2d2 working quite nicely, which we could probably publish as r2d2-alpm, if you think there's value in that.

  • my experimentation can probably help your alpm series move toward 1.0 confidence (did you have a date in mind to graduate from the rcs?)

Yesterday :stuck_out_tongue:

  • I managed to get r2d2 working quite nicely, which we could probably publish as r2d2-alpm, if you think there's value in that.

That's just for concurrent downloads, right? Alpm already has it in git so I don't see a need to reimplement it. Does it do anything else?

Hi there! I'm the author of rua; I'm not sure if you've encountered this project, but it's also (already written) in Rust and targets AUR: https://github.com/vn971/rua

Note that RUA only intends to complement pacman, not replace it, so if you aim to rewrite aura by also encapsulating all pacman operations, then that project is probably not the best start.

Yesterday? What timing! I had begun my experimentation 2 or 3 days ago, so I saw the rc and assumed it would stew for a while :slight_smile: Well, who knows, I might discover something here or there and can submit patches to it now.

As for the r2d2 bindings, they're not (just) for the concurrent downloads - they're to be able to use the Alpm type (and its results) with rayon. There are a bunch of places where Aura does certain checks concurrently, but this is expensive in the Haskell variant, since I'm making shell calls to Pacman. Having an alpm handle is going to speed this up considerably.

Hi! Thanks for joining the party :slight_smile: I have indeed heard of rua. In the end these various projects have different goals / preexisting interfaces, so I don't see them as in competition per se. But if we're all borrowing / working on the same underpinnings, I think we can all benefit.

How exactly does r2d2 allow concurrency?

I can see you using it in this code.

// Look up each package in the ALPM database.
let packages: Vec<(String, String, u64)> = s
    .packages
    .par_iter()
    .map_with(pool, |pul, pkg| {
        let conn: PooledConnection<AlpmManager> = pul.get().unwrap();
        let dbs = conn.syncdbs();
        dbs.iter()
            .filter_map(|db| db.pkg(pkg).ok())
            .filter_map(|p| {
                let total = p.download_size() as u64;
                package_url(&p)
                    .ok()
                    .map(|u| (p.name().to_string(), u, total))
            })
            .next()
    })
    .filter_map(|o| o)
    .collect();

But how does it work? Db, along with most other types, is neither Sync nor Send. Does it just block if there's another Db in scope? In that case, is it even actually concurrent?

You're right that since many of the bound types have raw pointers underneath, we don't get Sync instances for them. Critically, though, Alpm has a Send instance, which lets several Alpm handles be opened, shoved into an r2d2 pool, then passed to rayon functions like for_each_with. Now it's true that you can't map into a raw alpm type (like Package) and have that be the thing passed between threads, but at least for one stage in the iterator chain, you have full access to everything Alpm can give you.
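
To make the Send-but-not-Sync point concrete, here is a minimal std-only sketch. Everything in it is an assumption for illustration: `Handle` is a hypothetical stand-in for an Alpm handle (a `Cell` field makes it Send but not Sync), and `download_size` just fakes a lookup. It uses neither r2d2 nor rayon; it only shows that owned handles can be moved into worker threads even though a shared `&Handle` could not cross a thread boundary.

```rust
use std::cell::Cell;
use std::thread;

// Hypothetical stand-in for an `Alpm` handle. The `Cell` field makes it
// `Send` but not `Sync`, much like a type wrapping a raw pointer with a
// manual `unsafe impl Send`.
struct Handle {
    lookups: Cell<u32>,
}

impl Handle {
    fn new() -> Self {
        Handle { lookups: Cell::new(0) }
    }

    // Fake "database lookup": returns a made-up size derived from the name.
    fn download_size(&self, pkg: &str) -> u64 {
        self.lookups.set(self.lookups.get() + 1);
        pkg.len() as u64 * 1000
    }
}

// Give every worker thread its own handle and its own slice of packages.
// Moving `handle` into the thread only needs `Handle: Send`.
fn sizes_in_parallel(handles: Vec<Handle>, pkgs: &[&str]) -> Vec<(String, u64)> {
    assert!(!handles.is_empty(), "need at least one handle");
    // Ceiling division, at least 1 so `chunks` never gets a zero size.
    let chunk_len = ((pkgs.len() + handles.len() - 1) / handles.len()).max(1);
    thread::scope(|s| {
        let workers: Vec<_> = handles
            .into_iter()
            .zip(pkgs.chunks(chunk_len))
            .map(|(handle, chunk)| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|pkg| (pkg.to_string(), handle.download_size(pkg)))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order keeps the results in input order.
        workers.into_iter().flat_map(|w| w.join().unwrap()).collect()
    })
}
```

Calling `sizes_in_parallel(vec![Handle::new(), Handle::new()], &["pacman", "aura", "rua"])` runs two workers, each with exclusive use of one handle. A real pool adds check-out and check-in on top of this, but the ownership story is the same.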

Now here's the critical question: under what circumstances is the actual underlying db locked by libalpm? Would a simple package lookup from a Db do it? If no, then we're fully concurrent already. If yes, then oh well, at least later IO-based stages of the rayon iterator would still be concurrent.

Ah, you're making multiple alpm handles. I guess that works. The memory overhead is probably pretty high though.

Why not just iterate and map into URLs on one thread, then use multithreading on the vec of URLs?
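
A rough sketch of that two-stage shape, using only std so it stands alone. All names here are hypothetical: the `HashMap` stands in for a single alpm handle's database, `mirror.example` is a placeholder host, and `fake_download` stands in for the IO-bound work.

```rust
use std::collections::HashMap;
use std::thread;

// Stage 1: single-threaded. This is the only stage that would touch the
// (hypothetical) database, so no handle ever crosses a thread boundary.
fn resolve_urls(db: &HashMap<&str, &str>, pkgs: &[&str]) -> Vec<String> {
    pkgs.iter()
        .filter_map(|pkg| db.get(pkg)) // silently skip unknown packages
        .map(|file| format!("https://mirror.example/{file}"))
        .collect()
}

// Placeholder for the IO-bound work; here it just reports the URL length.
fn fake_download(url: &str) -> usize {
    url.len()
}

// Stage 2: multi-threaded over plain `String`s, which are `Send + Sync`.
fn fetch_all(urls: &[String], workers: usize) -> Vec<usize> {
    let workers = workers.max(1);
    // Ceiling division, at least 1 so `chunks` never gets a zero size.
    let chunk_len = ((urls.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = urls
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    chunk.iter().map(|u| fake_download(u)).collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order keeps the results in input order.
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}
```

The trade-off is that the lookups are serialized, so this only wins if they are cheap next to the downloads, which is something to measure rather than assume.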

Yes, also entirely possible. For the actual port I'm going to consider such strategies. Having a single handle might just be fast enough to outweigh the scheduling overhead of the concurrency. I will measure when I get to that stage.

I do think that a riir of alpm/pacman could be very productive. Even if it was designed to run alongside the official libalpm, it could be made very 'rust-y', for example integrating into the rust async world, and using standard rust types for things like shared references. You'd also get easily swappable awesome collections like hashbrown. I did at one point start working on this, but it's a lot of work and I got a job :wink: .

A sub-goal of it would be to provide the libalpm C interface, which would also be a test of correctness (as defined by identical behavior to the official implementation).

It's way too much work for it to be worth doing, for me at least. Reading packages and dbs is easy. However, implementing transactions is not something I wanna go near, nor something I trust someone else to do correctly either.