How to build a news feed like in Facebook with Rust?

Hello, I am new to Rust,

I am thinking of building a news feed in Rust, and I think Rust is highly suitable for something like that. I want to build something like https://getstream.io/ . The only problem I see with getstream.io is that the data resides somewhere else; other than that it's great.

I have searched for news feed projects like this in Rust, but found none.
My question is: what crates should I use to speed up development? Should I use a web framework like Actix-web?

Please help.

Actix-web is pretty nice, and comes with a bunch of convenient built-ins for parsing requests and building responses. It's a good choice for most websites and APIs.

Actix-web can be used with either one request per thread (good for short lived or CPU-bound processing), or with futures for cooperative multitasking (good for long-lived I/O-bound processing). If you're new to Rust, I'd suggest avoiding futures, since the easy async/await interface isn't available yet, and the current one requires more Rust experience.

Have you implemented something like that before? When learning a new language it's best to start with a problem you know well. If you're building something new in a new language, that might be too many unknowns.

Thanks for the reply. Although I have not dealt with this particular problem before, it's worth giving it a try, since I could learn a lot from it. Let's give actix-web a chance for the first round :slightly_smiling_face:

The whole point is how you'll store/index the activities in the DB (use an existing DB or create your own). Do you have a plan here?

The web layer, on the other hand, can be written in whatever you like.

@ddorian43 thanks for the reply.

Yeah, the DB is the bottleneck in this case. When I asked getstream.io, they said they are using RocksDB, which is also used by LinkedIn. You can read more about how getstream does things here

I did find a Rust wrapper for RocksDB, but I do not think it is regularly maintained.

I was wondering why getstream does not use a graph DB for such a case. Maybe performance is the issue.

When building something like this, I think it is good to decouple the DB so it can be changed later if needed.

Do you have any ideas about what to use for the DB and how to tackle this problem at the DB end?
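One way to keep the DB decoupled is to put a trait between the application and the storage. This is only a minimal sketch using the standard library: the names `Activity`, `ActivityStore`, `MemoryStore` and `post` are made up for illustration and are not taken from getstream.io or any crate.

```rust
use std::collections::HashMap;

/// One entry in a feed. The fields are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Activity {
    pub actor: String,
    pub verb: String,
    pub timestamp: u64,
}

/// Storage interface that the rest of the application codes against.
pub trait ActivityStore {
    fn append(&mut self, feed_id: &str, activity: Activity);
    /// Up to `limit` activities for `feed_id`, newest first.
    fn recent(&self, feed_id: &str, limit: usize) -> Vec<Activity>;
}

/// In-memory backend, enough for a first iteration and for tests.
#[derive(Default)]
pub struct MemoryStore {
    feeds: HashMap<String, Vec<Activity>>,
}

impl ActivityStore for MemoryStore {
    fn append(&mut self, feed_id: &str, activity: Activity) {
        self.feeds
            .entry(feed_id.to_string())
            .or_default()
            .push(activity);
    }

    fn recent(&self, feed_id: &str, limit: usize) -> Vec<Activity> {
        let mut items = self.feeds.get(feed_id).cloned().unwrap_or_default();
        // Sort descending by timestamp so the newest activity comes first.
        items.sort_by(|a, b| b.timestamp.cmp(&a.timestamp));
        items.truncate(limit);
        items
    }
}

/// Application code depends only on the trait, not on a concrete backend.
pub fn post<S: ActivityStore>(
    store: &mut S,
    feed_id: &str,
    actor: &str,
    verb: &str,
    timestamp: u64,
) {
    store.append(
        feed_id,
        Activity {
            actor: actor.to_string(),
            verb: verb.to_string(),
            timestamp,
        },
    );
}
```

A backend for Postgres, Redis or RocksDB would then be another `impl ActivityStore`, and `post` would not need to change.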

Depends on what scale/speed you need.

Let's say the highest possible scale. Something like Facebook has around 21B monthly visits.

Then you do what Facebook did (or better?): build an in-memory search server. The best architecture is to do what ScyllaDB does (the Seastar framework). Something like that isn't available in Rust, only in C++.
It's like in games: you first build the 3D engine, then you build the game. Do you want to build the 3D engine?

If you want it in Rust, see tantivy for an inverted index and TiKV for a distributed DB. You can combine those to get a distributed search engine. If you want it in C++, see scylladb/seastar + the trinity inverted index.
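In case "inverted index" is a new term: it maps each word to the set of documents that contain it, so lookups go from word to documents instead of scanning every document. The toy below uses only the standard library to show the idea. It is not tantivy's API, and `InvertedIndex`, `add` and `search` are hypothetical names.

```rust
use std::collections::{BTreeSet, HashMap};

/// Maps each lowercased word to the ids of the documents containing it.
#[derive(Default)]
pub struct InvertedIndex {
    postings: HashMap<String, BTreeSet<u64>>,
}

impl InvertedIndex {
    /// Index `text` under `doc_id`, splitting on whitespace.
    pub fn add(&mut self, doc_id: u64, text: &str) {
        for word in text.split_whitespace() {
            self.postings
                .entry(word.to_lowercase())
                .or_default()
                .insert(doc_id);
        }
    }

    /// Ids of documents containing every word in `query`, ascending.
    /// An empty query matches nothing.
    pub fn search(&self, query: &str) -> Vec<u64> {
        let mut result: Option<BTreeSet<u64>> = None;
        for word in query.split_whitespace() {
            let ids = self
                .postings
                .get(&word.to_lowercase())
                .cloned()
                .unwrap_or_default();
            // Intersect the posting lists of all query words.
            result = Some(match result {
                None => ids,
                Some(acc) => acc.intersection(&ids).copied().collect(),
            });
        }
        result
            .map(|set| set.into_iter().collect::<Vec<u64>>())
            .unwrap_or_default()
    }
}
```

A real engine adds tokenization, ranking, compression and on-disk segments on top of this, which is where most of the work is.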

Of course you can replicate seastar/trinity in rust but it will take time.

I am not trying to compete with Facebook or anything like that. I just want to learn how to build something like that; I am just curious. :thinking:

tikv seems good.

Have you done something like this before?

If you seriously take "highest possible scale", then you're not writing just a Rust program, you're writing an entire distributed system where the Rust part is not even a significant part of the problem.

Facebook has likely spent more than several person-years on scalability of their feed. There are quicker ways to learn Rust than to spend 10 years on a database that supports a scale nobody else has :slight_smile:

So if you scale down requirements a bit, you can build something in a reasonable time with e.g. Redis, Postgres or even just an array in memory, hooked to an existing Rust framework.
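To give an idea of how small the "array in memory" version can be, here is a rough sketch that builds a timeline at read time from a `Vec` of posts and a follow map. It assumes a single process and no persistence, and all names (`Post`, `Feed`, `follow`, `publish`, `timeline`) are invented for this example.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, PartialEq)]
pub struct Post {
    pub author: String,
    pub body: String,
    pub timestamp: u64,
}

#[derive(Default)]
pub struct Feed {
    /// The "array in memory": every post, in insertion order.
    posts: Vec<Post>,
    /// user -> set of authors that user follows.
    following: HashMap<String, HashSet<String>>,
}

impl Feed {
    pub fn follow(&mut self, user: &str, author: &str) {
        self.following
            .entry(user.to_string())
            .or_default()
            .insert(author.to_string());
    }

    pub fn publish(&mut self, author: &str, body: &str, timestamp: u64) {
        self.posts.push(Post {
            author: author.to_string(),
            body: body.to_string(),
            timestamp,
        });
    }

    /// Newest-first posts from authors that `user` follows, at most `limit`.
    pub fn timeline(&self, user: &str, limit: usize) -> Vec<&Post> {
        let followed = match self.following.get(user) {
            Some(set) => set,
            None => return Vec::new(),
        };
        // Scan all posts and keep the ones written by followed authors.
        let mut items: Vec<&Post> = self
            .posts
            .iter()
            .filter(|post| followed.contains(&post.author))
            .collect();
        items.sort_by(|a, b| b.timestamp.cmp(&a.timestamp));
        items.truncate(limit);
        items
    }
}
```

The full scan in `timeline` is the part that stops scaling first; that is the point where an index or a real database starts to pay off.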

I've thought about it but I haven't.

Anyway, you have several answers now and you can make your choice, right?

@kornel IMHO using a language like Rust does matter even for a distributed system: you get more deterministic performance because there is no GC, plus memory safety, compared to using something else.

I am not trying to learn Rust by building a DB. I am currently reading the Rust book, I am interested in the problem of building a feed that scales, since I also need a feed for an upcoming project.

Can you please tell me how much I need to scale down? To my knowledge, the Postgres and Redis drivers are the most stable Rust DB drivers, so they are also a good choice.

As @ddorian43 pointed out tikv is also a good choice.

When learning, you want to try and avoid juggling too much at once. You want to get something working, learn from it (including your mistakes), then iterate. For example, in your discussions about database, you're trying to pick something based on several simultaneous criteria, including:

  • scalability
  • stability
  • performance

In the first instance, I wouldn't worry too much about any of these, except perhaps a little of "stability" in the sense of "maturity" just so that you can concentrate on your own bugs without complications from library bugs.

I'd concentrate much more on the shape of your problem, and the kind of data model you think you'll need to support: key-value vs relational, structured vs opaque (serialised blob) data, transactional vs eventual-consistency, etc. Redis, diesel->postgresql, and TiKV are all good data stores but very different choices along these criteria.

Make a basic architecture choice, pick something (almost anything) that's a simple implementation of that model that you can start building with, and start building. Later, when you've learned more about your application logic, you can address infrastructure scaling and change from (say) sqlite to postgresql or from in-memory HashMaps to TiKV quite easily, and with a little more effort from (say) SQL-ish to KV-ish. You'll learn about different aspects at different times, rather than trying to make all your decisions at once.

@dcarosone Really good advice. I was trying to come up with a perfect solution all at once.

One more question: can you please recommend a good hosting solution to test and deploy such an application with minimum configuration?

You can use Heroku if you want something simple from an ops perspective. For performance per dollar, go with dedicated servers.
