Auto-scaling Web Architecture Strategy

I am new to Rust and am attempting to lay out a game plan for an architecture that could be described as:

"Auto scaling, Redundant, Cloud Infrastructure, with Failover RESTful API/WebSocket Architecture"

I am trying to lay out the binaries that need to be written and the crates I need to use/learn.

Any thoughts would be appreciated.

Environment Assumptions:

  1. All binaries are written in Rust
  2. Client applications are either web SPAs or mobile apps that communicate via WebSockets or RESTful APIs over HTTP/2 with TLS, i.e. my web servers do not need to serve static files.
  3. Binaries (processes) are run in either VMs or Containers running within Kubernetes or some other orchestration tool that allows autoscaling.
  4. The most critical bottleneck in the application framework is writing to the writable database. Reads may be served from caches and read-only replicas, and all other instances can scale.
  5. The number of connections to the databases needs to be constrained. As long as the architecture is configurable, future metrics can determine the proper maximum number of connections.
  6. The connectivity between the binaries should be considered insecure. As a result, all communication between components should occur over a single, long-lived TLS-secured socket, preferably using mutual TLS authentication that does not rely on root certificates installed on the host but instead checks the peers' public keys against a distributed whitelist.
  7. Databases may be relational, NoSQL, or a new type of database written to handle queries that don't map effectively onto these traditional database types.
  8. I want to use async/.await for all communication over sockets (except maybe the database connections; more on that below).
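
To make assumption 6 a bit more concrete, here is a minimal sketch of the whitelist lookup only, using nothing but the standard library. `KeyWhitelist` and its methods are hypothetical names, and storing raw public key bytes is an assumption made for brevity (a real system would more likely store a hash of each key). Hooking this check into the TLS handshake depends on the TLS crate chosen and is not shown.

```rust
use std::collections::HashSet;
use std::sync::{Arc, RwLock};

/// Hypothetical shared whitelist of peer public keys (raw key bytes).
/// Cloning is cheap: every clone points at the same underlying set.
#[derive(Clone, Default)]
pub struct KeyWhitelist {
    keys: Arc<RwLock<HashSet<Vec<u8>>>>,
}

impl KeyWhitelist {
    pub fn new() -> Self {
        Self::default()
    }

    /// Add a key, e.g. when a controller announces a new instance.
    pub fn allow(&self, public_key: &[u8]) {
        self.keys.write().unwrap().insert(public_key.to_vec());
    }

    /// Remove a key; returns true if it was present.
    pub fn revoke(&self, public_key: &[u8]) -> bool {
        self.keys.write().unwrap().remove(public_key)
    }

    /// The check a connection handler would run on the peer's key.
    pub fn is_allowed(&self, public_key: &[u8]) -> bool {
        self.keys.read().unwrap().contains(public_key)
    }
}

fn main() {
    let whitelist = KeyWhitelist::new();
    whitelist.allow(b"example-public-key-bytes");

    let peer_key = b"example-public-key-bytes";
    if whitelist.is_allowed(peer_key) {
        println!("peer accepted");
    } else {
        println!("peer rejected");
    }
}
```

The `RwLock` lets many connection handlers check keys concurrently while updates from the controller take a brief exclusive lock.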

Process/Executable Types:
The following binaries will likely be required:

  1. Web Server Instances (Autoscaling)
    A) Serves RESTful API
    B) Serves Web Sockets
  2. Redundant Controllers
    A) Monitors health of all instances
    B) Creates/distributes x509 certs
    C) Controls number of instances, failover
    D) Component Directory Services
  3. Reporting Aggregators
    A) Captures events
    B) Reports on conditions
  4. Redundant/Load balancing Application Database Server(s) Write/Read
  5. Application Servers
  6. Other specialty servers based on app needs.

Problems and Crates

I assume I can write against alpha code, since async/.await support is not prevalent in older versions of many libraries. (As I was writing this, I found that many crates had just exited alpha in the last 2 days.)

Web Server

From all of the reading I have done so far, it seems you can go with either the latest hyper (0.13.0-alpha.4) or actix-web (2.0.0-alpha.1), as of the time of writing.

If using hyper, it seems that you need something like hyper-rustls 0.19.0-alpha.3 to provide the TLS support.

It appears, although I don't know for sure, that you need to write a lot more code if you use hyper directly. That is speculation. The community code base is moving so fast that I can't tell how applicable anything I read still is without taking the time to learn each technology, and as soon as I move on to something else, that learning is out of date.

The tower crates seemed to get a lot of fanfare 6 months ago. But tower-web has not had a release since April 10, 2019, well before async/.await was close to completion. Is Tower still relevant?

Any thoughts on implementing truly non-blocking APIs in Tower versus Hyper versus Actix-Web?

Database Access

For relational database access, I found diesel and mysql_async. The latter just reached a non-alpha release, 0.21.0, on Nov 28 ("26 minutes ago" as I was writing this document). mysql_async has had only 22,861 downloads all time so far. Compared with diesel's 532,536, that makes me a bit nervous. But diesel's authors seem to be having some difficulty moving their library to async/.await because of some of the underlying technologies they are using (if I understood the issue correctly).

As I step back and look at using diesel with their ORM support, I wonder how bad this will actually be. With a database connection pool, you want to have a limited number of total connections. The objectives are different than serving 10,000 concurrent sockets. Also, since there is a very small number of database connections and those database connections are likely your most precious resource, is having real OS threads that are 1 to 1 with actual database connections necessarily a bad thing? Having the OS act as soon as data is ready rather than scheduling a polling callback when data is ready might actually be a good thing.

Assuming 1 OS thread per database connection is acceptable, or even optimal, for the reasons above: is it a bad thing to have an architecture where work to be performed on a database connection must be enclosed in a synchronous closure that is scheduled asynchronously with async/.await? You would spawn a limited number of processing threads, each owning a connection, and have them dequeue synchronous closures from a queue. The maximum number of connections would be configurable, driven by server load and by what your metrics show your system can support.
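
To illustrate what I mean, here is a minimal sketch using only the standard library. `Connection` is a stand-in for a real database connection, and `DbPool`, `submit`, and `max_connections` are hypothetical names, not part of diesel or any other crate. The result comes back over a plain `std::sync::mpsc` channel. In async code you would presumably swap that for a oneshot channel from your async runtime so the caller can `.await` the result instead of blocking.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Stand-in for a real database connection (hypothetical).
pub struct Connection {
    pub id: usize,
}

/// A unit of synchronous work that runs on whichever connection is free.
type Job = Box<dyn FnOnce(&mut Connection) + Send + 'static>;

pub struct DbPool {
    queue: Option<Sender<Job>>,
    workers: Vec<JoinHandle<()>>,
}

impl DbPool {
    /// Spawn exactly `max_connections` OS threads, each owning one connection.
    pub fn new(max_connections: usize) -> Self {
        let (tx, rx) = channel::<Job>();
        let rx = Arc::new(Mutex::new(rx));
        let workers = (0..max_connections)
            .map(|id| {
                let rx = Arc::clone(&rx);
                thread::spawn(move || {
                    let mut conn = Connection { id };
                    loop {
                        // The lock is held only while dequeuing, not while the job runs.
                        let next = rx.lock().unwrap().recv();
                        match next {
                            Ok(job) => job(&mut conn),
                            Err(_) => break, // queue closed: shut down
                        }
                    }
                })
            })
            .collect();
        DbPool { queue: Some(tx), workers }
    }

    /// Enqueue a synchronous closure; the returned receiver yields its result.
    pub fn submit<T, F>(&self, work: F) -> Receiver<T>
    where
        F: FnOnce(&mut Connection) -> T + Send + 'static,
        T: Send + 'static,
    {
        let (result_tx, result_rx) = channel();
        let job: Job = Box::new(move |conn: &mut Connection| {
            // The caller may have dropped the receiver; ignore that case.
            let _ = result_tx.send(work(conn));
        });
        self.queue
            .as_ref()
            .expect("pool is running")
            .send(job)
            .expect("worker threads are alive");
        result_rx
    }
}

impl Drop for DbPool {
    fn drop(&mut self) {
        // Closing the queue makes every worker's recv() fail, ending its loop.
        drop(self.queue.take());
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}

fn main() {
    let pool = DbPool::new(2);
    let result = pool.submit(|conn| format!("ran on connection {}", conn.id));
    println!("{}", result.recv().unwrap());
}
```

The queue is unbounded in this sketch. A bounded queue would add back-pressure when the database falls behind, which is probably what you want in production.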

Thoughts? Does the above functionality exist in the diesel crate? Another crate placed on top of diesel? Is it a bad idea?

Has anyone used mysql_async? How does it manage the number of connections it maintains, i.e. ramping up, the maximum, and ramping down?

Any other advice on database access in an async/.await world?

Interprocess Communication

This one seems easy. Tokio, which released non-alpha 0.2.1 yesterday, with TLS layered on top (tokio-tls 0.3.0 was released 2 days ago), seems like the ideal approach.

Of course some bidirectional asynchronous message passing infrastructure must be placed on top of this. Do any crates exist for this? It seems like rolling your own might just be easiest to do anyways.
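
If I roll my own, the wire format might look something like the sketch below: a length-prefixed frame carrying a correlation ID, so either side can send requests at any time and match replies to them. The `Frame` layout, the `MAX_PAYLOAD` limit, and the function names are all assumptions of mine, not taken from any crate. The code uses the blocking `std::io` traits to stay self-contained; the same byte layout should work with an async reader/writer.

```rust
use std::io::{self, Cursor, Read, Write};

/// Hypothetical frame: [u32 payload length][u64 correlation id][payload],
/// all integers big-endian.
#[derive(Debug, PartialEq)]
pub struct Frame {
    /// Lets a reply be matched to the request that caused it.
    pub correlation_id: u64,
    pub payload: Vec<u8>,
}

/// Arbitrary upper bound so a corrupt length cannot trigger a huge allocation.
const MAX_PAYLOAD: usize = 16 * 1024 * 1024;

pub fn write_frame<W: Write>(out: &mut W, frame: &Frame) -> io::Result<()> {
    if frame.payload.len() > MAX_PAYLOAD {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    let len = frame.payload.len() as u32; // safe: bounded by MAX_PAYLOAD
    out.write_all(&len.to_be_bytes())?;
    out.write_all(&frame.correlation_id.to_be_bytes())?;
    out.write_all(&frame.payload)
}

pub fn read_frame<R: Read>(input: &mut R) -> io::Result<Frame> {
    let mut len_buf = [0u8; 4];
    input.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_PAYLOAD {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }

    let mut id_buf = [0u8; 8];
    input.read_exact(&mut id_buf)?;

    let mut payload = vec![0u8; len];
    input.read_exact(&mut payload)?;

    Ok(Frame {
        correlation_id: u64::from_be_bytes(id_buf),
        payload,
    })
}

fn main() -> io::Result<()> {
    // Round-trip one frame through an in-memory buffer standing in for a socket.
    let mut wire = Vec::new();
    let request = Frame { correlation_id: 7, payload: b"ping".to_vec() };
    write_frame(&mut wire, &request)?;

    let decoded = read_frame(&mut Cursor::new(wire))?;
    println!("{:?}", decoded);
    Ok(())
}
```

On top of this, each side would keep a map from correlation ID to a pending reply so that responses can arrive out of order.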

Any thoughts on this?

If you made it to this point, thank you so much for indulging me. I look forward to any advice/discussion on any aspect of these goals, tentative designs, or tentative crate selection.
