Using Actix Web Client in CLI

I'm currently working on a template-rendering CLI which has a fact plugin system for discovering facts about the current system. Some plugins are simple: one just exports environment variables, and another checks the number of CPUs on the system via the num_cpus crate.

Presently, I'm working on providing EC2 metadata facts using the EC2 metadata service. I haven't gotten very far, but I'm trying to at least do a generic discovery task to determine that:

  1. the EC2 metadata service is present by sending a HEAD request to http://169.254.169.254/.
  2. the EC2 metadata service is, in fact, the EC2 metadata service by checking that the Server header is EC2ws.

The full code can be seen in my pull request, but I'll summarize it here:

lazy_static! {
    pub static ref EC2_METADATA_URL: String = format!("http://{}/", env::var("EC2_METADATA_HOST").unwrap_or("169.254.169.254".to_string()));
}

impl Ec2Plugin {
    /// Determine whether we are currently running in EC2.
    /// 
    /// At present, we send a HEAD request to the EC2 metadata service and examine the `Server` header to determine
    /// whether we're in EC2. If the `Server` header is `EC2ws`, we're in EC2.
    fn is_ec2(&self) -> bool {
        log::debug!("Checking the EC2 metadata server to detect whether we're in EC2...");

        let fut = Client::default().head(EC2_METADATA_URL.as_str())
            .header("User-Agent", "jinjer")
            .send();

        match fut.wait() {
            Ok(response) => {
                match response.status() {
                    StatusCode::OK => {
                        response.headers().get("Server").unwrap_or(&HeaderValue::from_static("Unknown")) == HeaderValue::from_static("EC2ws")
                    },
                    _ => {
                        log::debug!("Received non-200 response: {:?}", response);
                        false
                    }
                }
            },
            Err(e) => {
                log::debug!("Unable to query EC2 metadata service: {:?}", e);   
                false
            }
        }
    }
}

For some reason, this code fails immediately with an error:

Unable to query EC2 metadata service: Connect(Timeout)

According to the Actix docs, the timeout is set to 5 seconds, but I'm seeing it time out in less than a millisecond.

Outside of this block of code, I'm instantiating an actix_rt::System like so:

static CREATE_ACTIX_RUNTIME: Once = Once::new();

fn main() {
    CREATE_ACTIX_RUNTIME.call_once(|| {
        let _ = actix_rt::System::new("jinjer");
    });

    // execute code
}

It's strange to me that it's failing immediately.

One question I foresee being asked is "why use an asynchronous client at all?" My answer is that for this use-case, it's extremely important to keep execution time as small as possible, as there will be many fact providers executing at once, and I will need to launch a bunch of requests as I crawl each layer of the metadata service, as it mimics a filesystem of directories and files.

What am I missing or doing wrong?

EDIT: Adding log output:

2019-06-21T20:29:50.228+0000 DEBUG [main] jinjer::facts::plugins::ec2: Checking the EC2 metadata server to detect whether we're in EC2...
2019-06-21T20:29:50.229+0000 TRACE [main] actix_connect::connector: TCP connector - connecting to "169.254.169.254" port:80
2019-06-21T20:29:50.229+0000 TRACE [main] mio::poll: registering with poller
2019-06-21T20:29:50.229+0000 DEBUG [main] tokio_reactor: adding I/O source: 0
2019-06-21T20:29:50.229+0000 TRACE [main] mio::poll: registering with poller
2019-06-21T20:29:50.229+0000 DEBUG [main] tokio_reactor::registration: scheduling Write for: 0
2019-06-21T20:29:50.229+0000 TRACE [main] mio::poll: deregistering handle with poller
2019-06-21T20:29:50.229+0000 DEBUG [main] tokio_reactor: dropping I/O source: 0
2019-06-21T20:29:50.229+0000 DEBUG [main] jinjer::facts::plugins::ec2: Unable to query EC2 metadata service: Connect(Timeout)

On the same machine:

$ curl -iI http://169.254.169.254/
HTTP/1.0 200 OK
Content-Type: text/plain
Accept-Ranges: bytes
ETag: "3954205079"
Last-Modified: Fri, 21 Jun 2019 19:51:37 GMT
Content-Length: 230
Connection: close
Date: Fri, 21 Jun 2019 20:31:10 GMT
Server: EC2ws

IIUC, this appears to be a case of dropping the SystemRunner immediately upon creating it. You probably want to move ownership of that struct into the static, and call the run() method on it to execute an actor that uses the Client to send your request and wait for a response. The actor will have to stop the SystemRunner when it is finished.
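
To illustrate the drop behaviour I'm describing, here is a minimal std-only sketch. `Runner` is a hypothetical stand-in for a runtime handle like SystemRunner, not the real actix type; it only records when it gets dropped. The point is that `let _ = value;` discards the value at once, while a named binding such as `_runner` keeps it alive until the end of the scope.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a runtime handle such as `SystemRunner`.
/// It only records the moment it is dropped.
struct Runner {
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Runner {
    fn drop(&mut self) {
        self.log.borrow_mut().push("runner dropped");
    }
}

/// Mirrors `let _ = actix_rt::System::new(..)`: `_` binds nothing,
/// so the runner is dropped before any later work starts.
fn discard_immediately() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let _ = Runner { log: Rc::clone(&log) };
    log.borrow_mut().push("work started");
    let events = log.borrow().clone();
    events
}

/// A named binding keeps the runner alive until the function returns.
fn keep_alive() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let _runner = Runner { log: Rc::clone(&log) };
    log.borrow_mut().push("work started");
    let events = log.borrow().clone();
    events
}
```

In `discard_immediately` the "runner dropped" event is logged before "work started"; in `keep_alive` the runner is still alive when the snapshot is taken.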

This is a little convoluted, because you need the executor to run futures to completion. I don't think you need the added complexity, and a synchronous http client would work just fine in this case. Here are a few options that might work for you, if you'd like to avoid unnecessary complexity and just make a request that succeeds or fails: knock, attohttpc, mrq.
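
For a rough idea of how small the synchronous version can be, here is a sketch that uses only the standard library rather than any of the crates above. The function names `looks_like_ec2` and `is_ec2` and the explicit timeout parameter are my own assumptions for illustration, and the hand-written HTTP/1.0 request is only good enough for this one probe; a real client crate handles far more cases.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true when a raw HTTP response has a 200 status line and a
/// `Server: EC2ws` header (header names are compared case-insensitively).
fn looks_like_ec2(raw_response: &str) -> bool {
    let mut lines = raw_response.lines();
    let status_ok = lines
        .next()
        .and_then(|status| status.split_whitespace().nth(1))
        .map_or(false, |code| code == "200");
    let server_is_ec2 = lines
        .take_while(|line| !line.is_empty()) // stop at the end of the headers
        .filter_map(|line| line.split_once(':'))
        .any(|(name, value)| {
            name.trim().eq_ignore_ascii_case("server") && value.trim() == "EC2ws"
        });
    status_ok && server_is_ec2
}

/// Sends a blocking HEAD request to `addr` and checks the response.
/// Connect, read and write are all bounded by `timeout`.
fn is_ec2(addr: SocketAddr, timeout: Duration) -> std::io::Result<bool> {
    let mut stream = TcpStream::connect_timeout(&addr, timeout)?;
    stream.set_read_timeout(Some(timeout))?;
    stream.set_write_timeout(Some(timeout))?;
    let request = format!(
        "HEAD / HTTP/1.0\r\nHost: {}\r\nUser-Agent: jinjer\r\nConnection: close\r\n\r\n",
        addr.ip()
    );
    stream.write_all(request.as_bytes())?;
    let mut raw = Vec::new();
    stream.read_to_end(&mut raw)?;
    Ok(looks_like_ec2(&String::from_utf8_lossy(&raw)))
}
```

You would call it with something like `is_ec2("169.254.169.254:80".parse().unwrap(), Duration::from_secs(1))` and treat an `Err` the same as "not in EC2".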

Yeah, I'm familiar with other synchronous HTTP clients out there.

Is there a design pattern for building something like a web crawler, which needs to make asynchronous requests and will close itself after it has finished? This is my first foray into asynchronous Rust and I'd like to work with it closely so that I understand it well for future endeavours.

Do I need to somehow stash the SystemRunner as a thread_local! or something? The actix documentation appears outdated so I'm trying to figure out how to utilize everything properly.

Yes, if you are going to use actix-web, you will need a reference to the SystemRunner so you can execute the futures on it. Think of SystemRunner as an event loop, which is necessary to drive futures to completion. You can run the event loop by calling the run() or block_on() methods on the SystemRunner; see documentation linked above.

Note that calling either method will block the current thread, making it semantically the same as using a synchronous HTTP client.
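
To make the "event loop drives futures" idea concrete, here is a deliberately tiny sketch of what a block_on does, written against today's std::future API rather than the futures 0.1 API that actix is built on. This is not how SystemRunner is implemented; `ThreadWaker` and `FakeRequest` are hypothetical names, and the fake request just pretends to wait once before returning a status code.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Unparks the blocked thread when the future can make progress again.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// A single-future "event loop": poll, park until woken, poll again.
/// The calling thread is blocked until the future completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical stand-in for a request future. It counts how often it
/// is polled, reports "not ready" once, then yields a status code.
struct FakeRequest {
    polls: Arc<AtomicUsize>,
}

impl Future for FakeRequest {
    type Output = u16;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u16> {
        if self.polls.fetch_add(1, Ordering::SeqCst) == 0 {
            // Pretend the socket became ready right away.
            cx.waker().wake_by_ref();
            Poll::Pending
        } else {
            Poll::Ready(200)
        }
    }
}
```

Constructing a `FakeRequest` does nothing on its own; the poll counter only moves once something like `block_on` actually polls it, which is the role the SystemRunner plays for actix futures.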

The design pattern you are asking about, I guess, is building the application around a single global event loop. You can follow the Actix book to get a sense of how this is done in that ecosystem; your main() function typically hands control over to the event loop after setting up at least one actor that will kick off the application's business logic.

So the approach would be to make just about everything use futures and then in the outermost bit of code, run a master future which forks a bunch of sub futures? e.g. don't run futures in isolation, fully commit to them and run them from the top.

That's pretty much the gist of it. Also as a followup, I recently stumbled upon runtime which helps eliminate some of the boilerplate required to create and manage a main event loop for async tasks. Check out the examples.
