Should I use Rust as my backend server and search program for my college project?

Hi everyone,
I’m a 17-year-old first-year college student currently working on an AI-powered research assistant project. My idea is to use Rust (with Tantivy) as the backend server and search engine, while handling the AI/ML parts with Python (HuggingFace, LangChain, etc.).

Do you think Rust is a good choice for building the backend and search program in a project like this? Or should I stick to a full Python stack?

I want to balance performance (fast search, indexing) and simplicity (ease of development/deployment) since this is a college-level project but I’d also like it to stand out on my resume.

Any advice or suggestions from experienced Rust developers would be greatly appreciated!

Thanks!

Is creating your own full-text search engine a requirement of the course? Because if it is not, you might want to consider an industry-standard search engine like Elasticsearch instead. I've never written my own full-text search engine, but I expect it to be considerably more time consuming than integrating Elasticsearch into your Python project, especially given that an official Python client already exists.


My main question would be: what sort of work would the Rust code you're writing be doing? If it's just binding to Tantivy or running a set of baked queries, that's not super interesting, and this approach will mainly serve to add a bunch of technical guff to get the two languages working together.

I'd say it's well worth it, however, if you can show some actual processing/"thinking" on the Rust side: Rust's main benefit is ease of writing clear, safe logic without massive performance loss, so this sort of approach works well if you can move some "kernel" of the main Python logic into Rust, with that kernel's use of Tantivy being more of an implementation detail to the Python side.
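To make the "kernel" idea a bit more concrete, here is a minimal Python sketch of what that boundary might look like from the Python side. Everything in it is hypothetical: `SearchKernel`, `PurePythonKernel` and `retrieve_context` are names invented for illustration, and the pure-Python class is only a stand-in for a Rust extension (built with something like PyO3) that would use Tantivy internally.

```python
from collections import defaultdict
from typing import Protocol


class SearchKernel(Protocol):
    """Hypothetical boundary: the only surface the Python side sees."""

    def add_document(self, doc_id: str, text: str) -> None: ...
    def search(self, query: str, limit: int = 5) -> list[str]: ...


class PurePythonKernel:
    """Stand-in implementation; a Rust extension could replace it later."""

    def __init__(self) -> None:
        # term -> {doc_id: number of occurrences of the term}
        self._postings: dict[str, dict[str, int]] = defaultdict(dict)

    def add_document(self, doc_id: str, text: str) -> None:
        for term in text.lower().split():
            counts = self._postings[term]
            counts[doc_id] = counts.get(doc_id, 0) + 1

    def search(self, query: str, limit: int = 5) -> list[str]:
        scores: dict[str, int] = defaultdict(int)
        for term in query.lower().split():
            for doc_id, count in self._postings.get(term, {}).items():
                scores[doc_id] += count
        # Highest score first; ties broken by doc_id so results are stable.
        ranked = sorted(scores, key=lambda d: (-scores[d], d))
        return ranked[:limit]


def retrieve_context(kernel: SearchKernel, question: str) -> list[str]:
    """The AI side depends only on the interface, not on how search works."""
    return kernel.search(question, limit=3)


kernel = PurePythonKernel()
kernel.add_document("a", "rust is a systems programming language")
kernel.add_document("b", "python is popular for machine learning")
kernel.add_document("c", "tantivy is a search library written in rust")
print(retrieve_context(kernel, "rust search"))
```

If the Rust version later exposes the same two methods, the rest of the Python code should not need to change, which is what makes Tantivy an implementation detail.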

From reading the crate description, it looks like the main thing Elasticsearch adds over Tantivy is distributed processing. It's not clear to me how a research AI agent would use such a thing, since it's unlikely this will be getting a cloud deployment...


I think I would get the whole project basically working in Python first. Then add Rust if it still seems like a good idea.

You might not know yet what problem you should spend your time on. It seems premature to assume it's search speed. Maybe tantivy-py is fast enough. Maybe that time will be better spent on search quality, getting the model to fit in your computer's memory, AI compute speed, prompt engineering, your write-up—it could be anything.
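One way to find out where the time actually goes is to time each stage of the pipeline before rewriting anything. A rough sketch, assuming a two-stage pipeline: the stage functions (`search_stage`, `model_stage`) are made up, and the `time.sleep` calls are placeholders for real work, not measurements of anything.

```python
import time
from typing import Callable


def timed(stage: Callable[[str], str], arg: str) -> tuple[str, float]:
    """Run one stage and return its result plus the elapsed seconds."""
    start = time.perf_counter()
    result = stage(arg)
    return result, time.perf_counter() - start


# Hypothetical stages; the sleeps are placeholders for real work.
def search_stage(query: str) -> str:
    time.sleep(0.01)  # pretend: full-text lookup
    return f"documents for {query!r}"


def model_stage(context: str) -> str:
    time.sleep(0.05)  # pretend: model inference
    return f"answer based on {context}"


def profile_pipeline(query: str) -> dict[str, float]:
    """Time every stage once and report seconds per stage."""
    timings: dict[str, float] = {}
    context, timings["search"] = timed(search_stage, query)
    _answer, timings["model"] = timed(model_stage, context)
    return timings


timings = profile_pipeline("borrow checker")
slowest = max(timings, key=timings.get)
for name, seconds in timings.items():
    print(f"{name:>6}: {seconds * 1000:.1f} ms")
print("slowest stage:", slowest)
```

If search turns out to be a small slice of the total, rewriting it in Rust will not change much for the user.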


If learning Rust is a goal of your research, then why not? But if not, then stay away from Rust; you will waste a lot of time for nothing.


Besides horizontal scaling, which indeed is probably not particularly interesting for a PoC-style student project, I see these other benefits:

  • You get the whole ELK-stack, including data ingestion, monitoring and a visual database editor, which I like to use to prototype queries and indices
  • Elasticsearch offers more features than full-text search, like vector search, which might also be interesting for OP, given that their application is LLM-centric
  • You get an API (plus a client) that has evolved for years. Depending on how flexible OP's solution has to be and how the final project is being graded, writing your own API (be that a server or a library directly embedded into the Python project) might be considerably more time consuming for little benefit to the final grade
  • OP wants the project to stand out on the resume. I'm not a recruiter, but from my personal experience, plugging together different services and APIs is pretty much how a user-facing application is developed today. Combining this with the fact that DevOps is still quite en vogue, I think if OP can demonstrate how to use and deploy (explicitly mentioned as a requirement in the topic description) a service like the ELK-stack effectively, they'll stand out more
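
For anyone unfamiliar with the vector search mentioned in the second bullet, the core idea can be sketched in a few lines of plain Python: documents and the query are represented as vectors, and results are ranked by cosine similarity. The three-dimensional vectors and document names below are invented toy data. In a real setup the vectors would come from an embedding model, and a search engine would typically use an approximate nearest-neighbour index rather than this brute-force scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_search(
    query: list[float], docs: dict[str, list[float]], k: int = 2
) -> list[str]:
    """Brute force: score every document, return the k most similar names."""
    ranked = sorted(
        docs,
        key=lambda name: cosine_similarity(query, docs[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy data, made up for illustration only.
toy_embeddings = {
    "rust_intro": [0.9, 0.1, 0.0],
    "python_ml": [0.1, 0.9, 0.2],
    "search_engines": [0.6, 0.2, 0.7],
}
query_vector = [1.0, 0.0, 0.1]
print(vector_search(query_vector, toy_embeddings))
```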

Sorry if this is off topic, but since you're all here:

I'm working on an open source website, SuttaCentral. Our search engine is pretty rough, and uses ArangoDB's built-in indexing and querying. Short of doing a master's degree, can anyone recommend resources, academic or otherwise, for building search engines?
