I am thinking of building a news feed in Rust; I think Rust is highly suitable for something like that. I want to build something like https://getstream.io/ . The only problem I see with getstream.io is that the data resides somewhere else; other than that it's great.
I have searched for such news feed projects in Rust, but found none.
My question is what crates should I use in order to speed up development? Should I use a web framework like Actix-web?
Actix-web is pretty nice, and comes with a bunch of convenient built-ins for parsing requests and building responses. It's a good choice for most websites and APIs.
Actix-web can be used with either one request per thread (good for short-lived or CPU-bound processing), or with futures for cooperative multitasking (good for long-lived I/O-bound processing). If you're new to Rust, I'd suggest avoiding futures, since the easy async/await interface isn't available yet, and the current one requires more Rust experience.
Have you implemented something like that before? When learning a new language it's best to start with a problem you know well. If you're building something new in a new language, that might be too many unknowns.
Thanks for the reply. Although I have not dealt with this particular problem before, it's worth a try, since I could learn a lot from it. Let's give actix-web a chance for the first round.
Yeah, the DB is the bottleneck in this case. When I asked getstream.io, they said they are using RocksDB, which is also used by LinkedIn. You can read more about how getstream does things here
I did find a Rust wrapper for RocksDB, but I do not think it is regularly maintained.
I was wondering why getstream does not use a graph DB for such a case. Maybe performance is the issue.
When building such a thing, I think it is good to decouple the DB so it can be changed later if needed.
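One way to picture that decoupling is a storage trait that the rest of the application codes against. This is only a minimal sketch using the standard library: the names `Activity`, `FeedStore`, `MemoryStore` and `post` are made up for illustration, and a real backend (Postgres, Redis, RocksDB, ...) would need its own implementation of the trait.

```rust
use std::collections::HashMap;

/// One entry in a feed. The fields are illustrative, not a real schema.
#[derive(Debug, Clone, PartialEq)]
pub struct Activity {
    pub actor: String,
    pub verb: String,
    pub object: String,
    pub timestamp: u64,
}

/// Hypothetical storage interface; application code depends only on this trait.
pub trait FeedStore {
    fn add_activity(&mut self, feed_id: &str, activity: Activity);
    /// Returns up to `limit` activities, newest first.
    fn latest(&self, feed_id: &str, limit: usize) -> Vec<Activity>;
}

/// Simplest possible backend: everything lives in a HashMap.
#[derive(Default)]
pub struct MemoryStore {
    feeds: HashMap<String, Vec<Activity>>,
}

impl FeedStore for MemoryStore {
    fn add_activity(&mut self, feed_id: &str, activity: Activity) {
        self.feeds
            .entry(feed_id.to_string())
            .or_default()
            .push(activity);
    }

    fn latest(&self, feed_id: &str, limit: usize) -> Vec<Activity> {
        let mut items = self.feeds.get(feed_id).cloned().unwrap_or_default();
        // Sort descending by timestamp so the newest activity comes first.
        items.sort_by(|a, b| b.timestamp.cmp(&a.timestamp));
        items.truncate(limit);
        items
    }
}

/// Application logic written against the trait, not a concrete database.
/// Here each actor's own name doubles as the feed id (an assumption).
pub fn post<S: FeedStore>(store: &mut S, actor: &str, object: &str, timestamp: u64) {
    let activity = Activity {
        actor: actor.to_string(),
        verb: "post".to_string(),
        object: object.to_string(),
        timestamp,
    };
    store.add_activity(actor, activity);
}
```

Swapping the DB later would then mean writing another `impl FeedStore for ...` and leaving functions like `post` untouched.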
Do you have any ideas on what to use for the DB and how to tackle this problem at the DB end?
Then you do what Facebook did (or better?): build an in-memory search server. The best architecture is to do what ScyllaDB does (the Seastar framework). Something like that isn't available in Rust, only in C++.
It's like in games: you first build the 3D engine, then you build the game. Do you want to build the 3D engine?
If you want it in Rust, see tantivy for the inverted index and TiKV for the distributed DB. You can merge those to have a distributed search engine. If you want it in C++, see ScyllaDB/Seastar + the Trinity inverted index.
Of course you can replicate Seastar/Trinity in Rust, but it will take time.
If you seriously take "highest possible scale", then you're not writing just a Rust program, you're writing an entire distributed system where the Rust part is not even a significant part of the problem.
Facebook has likely spent more than several person-years on the scalability of their feed. There are quicker ways to learn Rust than to spend 10 years on a database that supports a scale nobody else has.
So if you scale down requirements a bit, you can build something in a reasonable time with e.g. Redis, Postgres or even just an array in memory, hooked to an existing Rust framework.
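To give a feel for how small the "array in memory" version can be, here is a rough sketch using only the standard library. The `Post` and `Feed` types are invented for this example, insertion order stands in for time, and the timeline is computed at read time by scanning the whole array, which is obviously not how you'd do it at large scale.

```rust
use std::collections::{HashMap, HashSet};

/// A single post; illustrative fields only.
#[derive(Debug, Clone, PartialEq)]
pub struct Post {
    pub author: String,
    pub text: String,
}

/// Hypothetical in-memory feed: one array of posts plus a follow graph.
#[derive(Default)]
pub struct Feed {
    /// Append-only, oldest first.
    posts: Vec<Post>,
    /// user -> set of accounts that user follows.
    following: HashMap<String, HashSet<String>>,
}

impl Feed {
    pub fn follow(&mut self, user: &str, target: &str) {
        self.following
            .entry(user.to_string())
            .or_default()
            .insert(target.to_string());
    }

    pub fn publish(&mut self, author: &str, text: &str) {
        self.posts.push(Post {
            author: author.to_string(),
            text: text.to_string(),
        });
    }

    /// Newest-first timeline for `user`, built by scanning the array backwards
    /// and keeping only posts from followed authors.
    pub fn timeline(&self, user: &str, limit: usize) -> Vec<&Post> {
        let followed = match self.following.get(user) {
            Some(set) => set,
            None => return Vec::new(),
        };
        self.posts
            .iter()
            .rev()
            .filter(|p| followed.contains(&p.author))
            .take(limit)
            .collect()
    }
}
```

Something like this could sit behind a web framework handler while you work out the application logic, and be replaced by Redis or Postgres once you know what queries you actually need.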
@kornel IMHO using a language like Rust does matter even for a distributed system: you get deterministic performance due to the lack of a GC, plus memory safety, compared with using something else.
I am not trying to learn Rust by building a DB. I am currently reading the Rust book, and I am interested in the problem of building a feed that scales, since I also need a feed for an upcoming project.
Can you please tell me how much I need to scale down? To my knowledge, the Postgres and Redis drivers are the most stable Rust DB drivers, so they are also a good choice.
As @ddorian43 pointed out, TiKV is also a good choice.
When learning, you want to try and avoid juggling too much at once. You want to get something working, learn from it (including your mistakes), then iterate. For example, in your discussion of databases, you're trying to pick something based on several simultaneous criteria, including:
scalability
stability
performance
In the first instance, I wouldn't worry too much about any of these, except perhaps a little of "stability" in the sense of "maturity" just so that you can concentrate on your own bugs without complications from library bugs.
I'd concentrate much more on the shape of your problem, and the kind of data model you think you'll need to support: key-value vs relational, structured vs opaque (serialised blob) data, transactional vs eventual-consistency, etc. Redis, diesel->postgresql, and TiKV are all good data stores but very different choices along these criteria.
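As a rough illustration of how different those shapes are, the sketch below stores the same kind of data two ways using only the standard library. Everything here is hypothetical: `ActivityRow`, `rows_by_actor`, `KvFeed` and the `"<feed>/<zero-padded seq>"` key layout are assumptions made for the example, and a `BTreeMap` merely stands in for a store with ordered keys.

```rust
use std::collections::BTreeMap;

/// Relational-style row: every field is visible to the store and can be filtered on.
#[derive(Debug, Clone, PartialEq)]
pub struct ActivityRow {
    pub feed: String,
    pub seq: u64,
    pub actor: String,
    pub verb: String,
}

/// Roughly `SELECT * FROM activities WHERE actor = ?` over an in-memory table.
pub fn rows_by_actor<'a>(table: &'a [ActivityRow], actor: &str) -> Vec<&'a ActivityRow> {
    table.iter().filter(|row| row.actor == actor).collect()
}

/// Key-value style: the store sees only an ordered key and an opaque blob.
#[derive(Default)]
pub struct KvFeed {
    entries: BTreeMap<String, Vec<u8>>,
}

impl KvFeed {
    /// Assumed key layout: "<feed>/<zero-padded seq>", so keys sort by feed,
    /// then by sequence number.
    fn key(feed: &str, seq: u64) -> String {
        format!("{}/{:020}", feed, seq)
    }

    pub fn put(&mut self, feed: &str, seq: u64, blob: &[u8]) {
        self.entries.insert(Self::key(feed, seq), blob.to_vec());
    }

    /// Prefix scan over one feed, in sequence order. Other queries (for example
    /// "by actor") would need another key layout or a secondary index.
    pub fn scan_feed(&self, feed: &str) -> Vec<&[u8]> {
        let prefix = format!("{}/", feed);
        self.entries
            .range(prefix.clone()..)
            .take_while(|(k, _)| k.starts_with(&prefix))
            .map(|(_, v)| v.as_slice())
            .collect()
    }
}
```

In the row version the query is decided at read time; in the key-value version the key layout decides up front which reads are cheap, which is why it helps to know the shape of your problem before picking a store.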
Make a basic architecture choice, pick something (almost anything) that's a simple implementation of that model that you can start building with, and start building. Later, when you've learned more about your application logic, you can address infrastructure scaling and change from (say) sqlite to postgresql or from in-memory HashMaps to TiKV quite easily, and with a little more effort from (say) SQL-ish to KV-ish. You'll learn about different aspects at different times, rather than trying to make all your decisions at once.