A function takes far longer inside the runtime than in a standalone test

I have a function that takes about 50 ms when tested on its own. Once it runs inside the runtime it takes about 600-800 ms, and up to 2 s when the CPU is fully loaded.

#[test]
fn test_decode_msg() {
    use super::PoolNetwork;
    use super::ClientMessage::Notify;
    // Construct the block template.
    let block = PoolNetwork::genesis_block();
    let template = BlockTemplate::new(
        block.previous_block_hash(),
        block.height(),
        block.timestamp(),
        block.difficulty_target(),
        block.cumulative_weight(),
        block.previous_ledger_root(),
        block.transactions().clone(),
        block.to_coinbase_transaction().unwrap().to_records().next().unwrap(),
    );
    let bs = template.to_bytes_le().unwrap();

    // Time deserializing the block template on its own.
    let start = std::time::Instant::now();
    let _ = BlockTemplate::<PoolNetwork>::read_le(&bs[..]).unwrap();
    println!("read_le cost {:?}", start.elapsed());

    // Time decoding a full Notify message that wraps the same template.
    let message = Notify(template, u64::MAX / 2, "1679091c5a880faf6fb5e6087eb1b2dc".to_string());
    let result = encode_msg::<PoolNetwork>(message).unwrap();
    let start = std::time::Instant::now();
    let _msg = decode_msg::<PoolNetwork>(&result).unwrap();
    println!("decode_msg cost {:?}", start.elapsed());
}

Timings from the test:

read_le cost 44.190958ms
decode_msg cost 43.889791ms

But when I run this function inside task::block_in_place, it takes much longer, sometimes more than 2 or 3 s.
I tried running the method that contains this function on a separate thread with a separate runtime, but the problem persisted.
I've been trying for a few days and still can't solve it. Can someone help me solve this problem?
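
One way to narrow this kind of problem down is to time the same work once inline and once on a dedicated OS thread, using only the standard library. The sketch below is not the code from this post: `cpu_bound_work`, `time_inline` and `time_on_thread` are hypothetical names, and the workload is a stand-in for the real decoder. If the two timings are close for identical input, the extra cost likely comes from somewhere other than where the function runs.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a CPU-bound decode step (not the real decode_msg).
fn cpu_bound_work(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

/// Runs `f` on the current thread and returns its result with the elapsed time.
fn time_inline<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Runs `f` on a freshly spawned OS thread. The clock starts inside that
/// thread, so thread start-up cost is not included in the measurement.
fn time_on_thread<T: Send + 'static>(
    f: impl FnOnce() -> T + Send + 'static,
) -> (T, Duration) {
    thread::spawn(move || time_inline(f))
        .join()
        .expect("worker thread panicked")
}

fn main() {
    let data = vec![7u8; 1 << 20];
    let copy = data.clone();

    let (a, inline) = time_inline(|| cpu_bound_work(&data));
    let (b, threaded) = time_on_thread(move || cpu_bound_work(&copy));

    // Same input, same result; only the execution context differs.
    assert_eq!(a, b);
    println!("inline {:?}, dedicated thread {:?}", inline, threaded);
}
```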

decode_msg:

pub fn decode_msg<N: Network>(data: &[u8]) -> Result<ClientMessage<N>, anyhow::Error> {
    let start = std::time::Instant::now();
    let mut codec = ClientMessageCodec::<N>::default();
    let mut bytes_mut = BytesMut::new();
    bytes_mut.put(data);
    match codec.decode(&mut bytes_mut) {
        // A complete message was decoded.
        Ok(Some(msg)) => {
            trace!("decode msg {} from server cost {:?}", msg.name(), start.elapsed());
            Ok(msg)
        }
        // The codec returned no message for this input.
        Ok(None) => Err(anyhow::Error::msg("None message decoded")),
        Err(e) => Err(anyhow::Error::msg(format!("decode message failed with error: {}", e))),
    }
}

I also tested Thread-Priority, and it didn't help either.

It was my mistake: I didn't take the size of the data source into account.
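
A mismatch like this is easier to spot when every timing is logged together with the number of bytes that went in. The sketch below is a minimal illustration using only the standard library. `Sample`, `timed_with_len` and `checksum` are hypothetical names, and `checksum` merely stands in for a decoder whose cost grows with input length. It makes no claim about the actual sizes involved in this post.

```rust
use std::time::{Duration, Instant};

/// Timing record that keeps the input size next to the elapsed time.
#[derive(Debug)]
struct Sample {
    input_len: usize,
    elapsed: Duration,
}

/// Hypothetical helper: runs `f` over `data` and records how many bytes it was given.
fn timed_with_len<T>(data: &[u8], f: impl FnOnce(&[u8]) -> T) -> (T, Sample) {
    let start = Instant::now();
    let out = f(data);
    let sample = Sample {
        input_len: data.len(),
        elapsed: start.elapsed(),
    };
    (out, sample)
}

/// Stand-in for a decoder whose cost grows with the input length.
fn checksum(data: &[u8]) -> u64 {
    data.iter().map(|&b| b as u64).sum()
}

fn main() {
    let small = vec![1u8; 1_000];
    let large = vec![1u8; 10_000_000];

    for input in [&small, &large] {
        let (_, sample) = timed_with_len(input, checksum);
        // Printing the length makes it obvious when two timings are not comparable.
        println!("{} bytes took {:?}", sample.input_len, sample.elapsed);
    }
}
```

Logging the length at the call site inside the runtime, as well as in the test, shows right away whether the two paths are decoding input of the same size.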