Actix web testing with BigQuery client

I inherited an actix web service that saves requests to BigQuery. I am trying to write functional tests that prove out the functionality but I am struggling to deal with the BigQuery dependency.

A simple version of it looks something like this:

use actix_web::{web, App, HttpRequest, HttpResponse, HttpServer};
use gcp_bigquery_client::Client;
use gcp_bigquery_client::model::table_data_insert_all_request::TableDataInsertAllRequest;

#[actix_web::main]
async fn main() -> anyhow::Result<()> {
    let client = gcp_bigquery_client::Client::with_workload_identity(false).await?;
    // Wrap the client once, outside the factory closure, and clone the handle
    // per worker; constructing it inside the closure would move `client` out
    // and make the closure FnOnce, which HttpServer::new rejects.
    let client_pool = web::Data::new(client);

    HttpServer::new(move || {
        App::new()
            .app_data(client_pool.clone())
            .route("/save", web::post().to(handle))
    })
    .bind(("0.0.0.0", 8080))?
    .run()
    .await?;

    Ok(())
}

async fn handle(req: HttpRequest, client: web::Data<Client>) -> Result<HttpResponse, WebError> {
    // pluck data from request
    let mut insert_request = TableDataInsertAllRequest::new();
    // add data to insert_request

    client
        .tabledata()
        .insert_all("project_id", "dataset_id", "my_table", insert_request)
        .await?;

    Ok(HttpResponse::Ok().body("OK"))
}

I want to write some tests, but I do not want to actually save the data to BigQuery. If I write something like this:

#[actix_web::test]
async fn test_save() {
    let client = gcp_bigquery_client::Client::with_workload_identity(false)
        .await
        .unwrap();
    let client_pool = web::Data::new(client);

    let app = test::init_service(
        App::new()
            .app_data(client_pool)
            .route("/save", web::post().to(handle)),
    )
    .await;

    let payload = "{ ... }";
    let peer_addr = "127.0.0.1:8080".parse().unwrap();

    let req = test::TestRequest::post()
        .uri("/save")
        .peer_addr(peer_addr)
        .set_payload(payload)
        .to_request();

    let resp = test::call_and_read_body(&app, req).await;
    assert_eq!(resp, web::Bytes::from_static(b"OK"));
}

The test panics because I am not specifying a real project id and dataset id. I do not see a way to mock the BigQuery client either.

I could set a test flag and only save the data to BigQuery if the test flag is false.

if !test {
    client
        .tabledata()
        .insert_all("project_id", "dataset_id", "my_table", insert_request)
        .await?;
}

Does anyone have any better ideas?

I usually set up completely separate staging and production environments, including dependencies like cloud services. Instead of a per-request flag saying whether a request carries testing or production payloads, I'd pass an environment variable to the program on startup and expose it through a config struct passed as app data to the endpoints that need it. Based on that, I'd initialize the BigQuery client to connect to either the staging or the production service.
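
A minimal sketch of that wiring, reusing the handle from above (the APP_ENV variable name and the Config shape are illustrative assumptions, not prescribed):

use actix_web::{web, App, HttpServer};

// Illustrative config struct, exposed to handlers as app data.
#[derive(Clone)]
struct Config {
    environment: String, // e.g. "staging" or "production"
}

#[actix_web::main]
async fn main() -> anyhow::Result<()> {
    // Read the environment once at startup; default to "staging".
    let config = Config {
        environment: std::env::var("APP_ENV").unwrap_or_else(|_| "staging".to_string()),
    };

    let client = gcp_bigquery_client::Client::with_workload_identity(false).await?;
    let client_pool = web::Data::new(client);
    let config_data = web::Data::new(config);

    HttpServer::new(move || {
        App::new()
            .app_data(client_pool.clone())
            .app_data(config_data.clone())
            .route("/save", web::post().to(handle))
    })
    .bind(("0.0.0.0", 8080))?
    .run()
    .await?;

    Ok(())
}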

If you don't want to test the BigQuery integration and don't want two BigQuery instances running, you could use the same environment flag to wrap the database calls and only run them in the production environment:

async fn handle(req: HttpRequest, client: web::Data<Client>, config: web::Data<Config>) -> Result<HttpResponse, WebError> {
    // pluck data from request
    let mut insert_request = TableDataInsertAllRequest::new();
    // add data to insert_request

    if config.environment == "production" {
        client
            .tabledata()
            .insert_all("project_id", "dataset_id", "my_table", insert_request)
            .await?;
    }

    Ok(HttpResponse::Ok().body("OK"))
}
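
In the test you'd then inject a non-production config, so the insert is skipped entirely. A sketch, assuming the Config from the previous snippet and that the client can still be constructed in your test environment (as it could in your original test):

use actix_web::{test, web, App};

#[actix_web::test]
async fn test_save_without_bigquery() {
    let client = gcp_bigquery_client::Client::with_workload_identity(false)
        .await
        .unwrap();
    let config = Config {
        environment: "test".to_string(),
    };

    let app = test::init_service(
        App::new()
            .app_data(web::Data::new(client))
            .app_data(web::Data::new(config))
            .route("/save", web::post().to(handle)),
    )
    .await;

    let req = test::TestRequest::post()
        .uri("/save")
        .set_payload("{ ... }")
        .to_request();

    let resp = test::call_and_read_body(&app, req).await;
    assert_eq!(resp, web::Bytes::from_static(b"OK"));
}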

It depends on how far you're willing or intending to test, but the usual recommendation is to create a façade everywhere you interact with an external service.


The BigQuery crate exposes no traits. I imagine the façade I'd build would basically abstract a conditional check of whether to call

client
    .tabledata()
    .insert_all("project_id", "dataset_id", "my_table", insert_request)
    .await?;

or not.

Are you imagining something different?

Well, that's what the façade is for. You create an abstraction for the operation(s) you are performing, so that your logic is decoupled from a particular vendor's API. Imagine how you would structure your code if, in the future, you decided to ditch BigQuery in favour of another vendor.
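
For example, here's a minimal sketch of such a façade, assuming the async-trait crate for a dyn-compatible async trait. The trait name RowSink, the NoopSink test double, and the From<anyhow::Error> conversion for WebError are all my inventions for illustration:

use std::sync::Arc;

use actix_web::{web, App, HttpRequest, HttpResponse};
use async_trait::async_trait;
use gcp_bigquery_client::model::table_data_insert_all_request::TableDataInsertAllRequest;

// The façade: the single operation the handler needs from a storage backend.
#[async_trait]
trait RowSink: Send + Sync {
    async fn insert_rows(&self, request: TableDataInsertAllRequest) -> anyhow::Result<()>;
}

// Production implementation, backed by the real BigQuery client.
struct BigQuerySink {
    client: gcp_bigquery_client::Client,
}

#[async_trait]
impl RowSink for BigQuerySink {
    async fn insert_rows(&self, request: TableDataInsertAllRequest) -> anyhow::Result<()> {
        self.client
            .tabledata()
            .insert_all("project_id", "dataset_id", "my_table", request)
            .await?;
        Ok(())
    }
}

// Test double: accepts everything and touches no network.
struct NoopSink;

#[async_trait]
impl RowSink for NoopSink {
    async fn insert_rows(&self, _request: TableDataInsertAllRequest) -> anyhow::Result<()> {
        Ok(())
    }
}

// The handler now depends only on the trait, not on gcp_bigquery_client.
// (Assumes WebError can be built from anyhow::Error.)
async fn handle(req: HttpRequest, sink: web::Data<dyn RowSink>) -> Result<HttpResponse, WebError> {
    // pluck data from request
    let mut insert_request = TableDataInsertAllRequest::new();
    // add data to insert_request
    sink.insert_rows(insert_request).await?;
    Ok(HttpResponse::Ok().body("OK"))
}

Wiring it up, the production binary registers the BigQuery-backed sink while tests register NoopSink; web::Data::from(Arc<dyn RowSink>) lets the extractor work on the trait object:

// In main():
let sink: Arc<dyn RowSink> = Arc::new(BigQuerySink { client });
let app = App::new()
    .app_data(web::Data::from(sink))
    .route("/save", web::post().to(handle));

// In the test, swap in the no-op sink:
let sink: Arc<dyn RowSink> = Arc::new(NoopSink);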

I consider this an anti-pattern. Databases are complex systems. BigQuery is different enough from Redshift that abstracting it away will almost certainly be leaky and/or a significant time investment with little to show for it. There is also value in keeping an application as simple as possible.

I respect that you may feel differently. I appreciate your thoughts and the time you spent responding to my question.

I get your point. As with everything, it depends™. If the abstraction would be leaky anyway, then it might be better to not abstract the API in the first place, as you said.
