Rust performance differences

Is there a significant difference in the performance of Rust code on machines with different configurations?
I have a feature that retrieves 4,000 rows from the database, then modifies and saves them according to different conditions. It completes in 2 minutes on my local machine (Mac Studio M1), but on the server (E5-2670 CPU, 2 cores) it had still not finished after 40 minutes.
Java code with the same functionality shows no difference between the local machine and the server: it completes in about 2 minutes on both.
Is it because some Rust configuration is not turned on? Thank you!
Rust code:

lazy_static! {
    static ref GLOBAL_RUNTIME: Runtime = tokio::runtime::Builder::new_multi_thread()
        .worker_threads(4)
        .enable_all()
        .build()
        .unwrap();
}

rayon::spawn(move || {
    GLOBAL_RUNTIME.block_on(async {
        match global_service::back_task(&app, cycle, item, ds.clone(), date_time).await {
            Ok(s) => println!("{}!{}", if s == 1 { "ok" } else { "fail" }, Local::now().naive_local()),
            Err(e) => println!("fail!{:?} {}", e, Local::now().naive_local()),
        }
    });
});

pub async fn back_task(app: &AppState, cycle: i32, item: i8, codes: Vec<String>, date_time: NaiveDateTime) -> Result<u8, AppError> {
    let mut status = 0;
    if let Ok(specialty) = codes[6].parse::<i32>() {
        let mut cons = Vec::new();
        let mut tops = Vec::new();
        let mut stas = Vec::new();
        let con = &mut app.redis.get().await.expect("redis fail");
        let specialty_attribute = mark_six_tool::code_attribute(codes[6].clone(), date_time);
        for one in m_topics_type::MASTER_TYPES {
            let cid = (item as i32 * 100) + one.0;
            let result = match one.0 {
                1 | 2 | 3 | 12 => specialty_attribute[1].clone(),
                4 | 5 | 17 => codes[6].clone(),
                6 => specialty_attribute[4].clone(),
                7 => if specialty == 49 { "A".to_string() } else { if specialty % 2 == 0 { "B".to_string() } else { "C".to_string() } },
                8 => if specialty == 49 { "A".to_string() } else { if specialty > 24 { "D".to_string() } else { "E".to_string() } },
                9 | 13 => codes[6].clone()[1..].to_string(),
                10 | 14 => codes[6].clone()[..1].to_string(),
                11 | 15 => specialty_attribute[3].clone(),
                16 => format!("{}{}", specialty_attribute[4].clone(), if specialty % 2 == 0 { "B" } else { "C" }),
                18 | 21 => format!("{},{},{},{},{},{},{}"
                                   , mark_six_tool::code_attribute(codes[0].clone(), date_time)[1]
                                   , mark_six_tool::code_attribute(codes[1].clone(), date_time)[1]
                                   , mark_six_tool::code_attribute(codes[2].clone(), date_time)[1]
                                   , mark_six_tool::code_attribute(codes[3].clone(), date_time)[1]
                                   , mark_six_tool::code_attribute(codes[4].clone(), date_time)[1]
                                   , mark_six_tool::code_attribute(codes[5].clone(), date_time)[1]
                                   , specialty_attribute[1]
                ),
                19 | 22 => format!("{},{},{},{},{},{},{}"
                                   , codes[0].clone()[1..].to_string()
                                   , codes[1].clone()[1..].to_string()
                                   , codes[2].clone()[1..].to_string()
                                   , codes[3].clone()[1..].to_string()
                                   , codes[4].clone()[1..].to_string()
                                   , codes[5].clone()[1..].to_string()
                                   , codes[6].clone()[1..].to_string()
                ),
                _ => codes.join(",") //20
            };
            if result.len() > 0 {
                let mut right = false;
                // match m_content_service::find_cid_cycle(&app, cid, cycle).await {
                match m_content_service::find_test(&app).await {
                    Ok(contents) => {
                        for mut content in contents {
                            match m_topics_service::find_one(&app, content.tid).await {
                                Ok(mut topics) => {
                                    match one.0 {
                                        1 | 2 | 3 | 4 | 5 | 6 | 9 | 10 | 11 => right = content.content.contains(&result),
                                        12 | 13 | 14 | 15 | 16 => right = result != content.content.clone(),
                                        18 | 19 => right = result.contains(&content.content),
                                        17 => right = !content.content.contains(&result),
                                        7 | 8 => right = result == "和" || content.content == result,
                                        20 => {
                                            if content.content.contains(",") {
                                                right = true;
                                                let res_arr: Vec<&str> = result.split(",").collect();
                                                let c_arr: Vec<&str> = content.content.split(",").collect();
                                                'outer: for r in res_arr {
                                                    for c in c_arr.clone() {
                                                        if c == r {
                                                            right = false;
                                                            break 'outer;
                                                        }
                                                    }
                                                }
                                            }
                                        }
                                        _ => { //  21|22=>{
                                            if content.content.contains(",") {
                                                let c_arr: Vec<&str> = content.content.split(",").collect();
                                                if c_arr.len() == 2 {
                                                    right = result.contains(&c_arr[0]) && result.contains(&c_arr[1]);
                                                }
                                            }
                                        }
                                    }
                                    content.result = result.clone();
                                    content.status = if right { 1 } else { -1 };
                                    // println!("zhong:{}:{}:{}:{}:{}:{}", one.1,content.tid, content.status, content.cid, content.result, content.content);
                                    topics.total = topics.total + 1;
                                    if right {
                                        topics.correct = topics.correct + 1;
                                        topics.continuous_hit = topics.continuous_hit + 1;
                                        topics.continuous_err = 0;
                                    } else {
                                        topics.continuous_hit = 0;
                                        topics.continuous_err = topics.continuous_err + 1;
                                    }
                                    if topics.continuous_hit > topics.long_continuous_hit {
                                        topics.long_continuous_hit = topics.continuous_hit;
                                    }
                                    if topics.continuous_err > topics.long_continuous_err {
                                        topics.long_continuous_err = topics.continuous_err;
                                    }
                                    let accuracy = if topics.total == 0 { 0.0 } else { ((topics.correct as f64 / topics.total as f64) * 10000.0).round() / 10000.0 };
                                    topics.accuracy = accuracy;
                                    cons.push(content);
                                    if topics.accuracy < one.2 { m_topics_service::update(&app, topics).await?; } else {
                                        tops.push(topics);
                                    }
                                    match m_statistic_service::find_cid_cycle(&app, cid, cycle).await {
                                        Ok(ms) => {
                                            for mut m in ms {
                                                m.is_reck = 1;
                                                m.hit = result.clone();
                                                stas.push(m);
                                            }
                                        }
                                        Err(e) => error!("ms:{:?}", e)
                                    }
                                    // let ms = m_statistic_service::find_cid_cycle(&app, cid, cycle).await?;

                                }
                                Err(e) => error!("topics:{:?}", e)
                            }
                            // let mut topics = m_topics_service::find_one(&app, content.tid).await?;

                        }
                    }
                    Err(e) => error!("contents:{:?}", e)
                }
                
            }
            con.del::<_, i32>(format!("Master{}m_praise", cid)).await?;
            con.del::<_, i32>(format!("Master{}all", cid)).await?;
        }
        // if cons.len() > 0 { m_content_service::batch_update(&app, cons).await?; }
        // if tops.len() > 0 { m_topics_service::batch_update(&app, tops).await?; }
        // if stas.len() > 0 { m_statistic_service::batch_update_lottery(&app, stas).await?; }
        status = 1;
    }
    println!(" status:{status} {}",Local::now());
    Ok(status)
}

Java code (Spring Boot 3.4):

@Async
    public void taskAsync(int cycle, LocalDateTime lotteryDateTime, String codeStr, int s) {
back_task(...){
...
}
}

What database software are you using? E.g. is it MySQL? SQLite? Or is the database also written in Rust?

Hello, it's MySQL, and I'm using SQLx to access it.

SQLx is an async library, so there might be performance issues related to task concurrency, or you may be using multiple transactions when you only need one. It's difficult to say for sure, though, without knowing what your code does.

Just to be sure, are you building with optimizations? That is, with cargo build --release or a customized profile.
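One quick way to check is to have the binary report how it was built. The sketch below relies on `cfg!(debug_assertions)`, which is on by default in the dev profile and off by default in the release profile. A customized profile can override it, so treat the answer as a hint rather than proof. The `build_mode` helper is a made-up name for illustration.

```rust
/// Reports "debug" when debug assertions are compiled in, "release" otherwise.
/// Assumption: the profile has not overridden the `debug-assertions` setting.
fn build_mode() -> &'static str {
    if cfg!(debug_assertions) {
        "debug"
    } else {
        "release"
    }
}

fn main() {
    // Log this once at startup on both the Mac and the server and compare.
    println!("build mode: {}", build_mode());
}
```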

Is it unsafe?

Can you show the Rust and Java code you tested, along with the database version and configuration?

Is the server (or the server database) under external load?

Well I wouldn't normally be too surprised that a 2012 CPU performed significantly worse than an M1, especially if you're only given 2 cores. But 20x is in the higher end of what would be reasonable (I'd expect more like 5x), and it doesn't explain why Java on the same workload would be faster.

Is the Java code using the same workload? If you're using a larger dataset for Rust to match the local execution time, it's possible you're running out of cache or even memory on the server.

I've attached the code. Please help, thank you.

OK, I've attached the code. Please help, thank you.

Yes. I've attached the code. Please help, thank you.

What are those awaited foo_service functions doing; e.g. are they making network requests?
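The reason this matters: if each awaited call is a network round trip and the nested loops await them one after another, the total wait grows with the number of round trips times the latency, regardless of CPU speed. Here is a rough back-of-the-envelope sketch. All inputs are hypothetical (22 types, 2 awaited queries per row, and the latencies shown); only the 4,000 rows come from the question.

```rust
use std::time::Duration;

/// Round trips made by a nested loop that awaits each query in turn:
/// one list query per type, then `queries_per_row` queries for every row.
fn sequential_round_trips(types: u32, rows: u32, queries_per_row: u32) -> u32 {
    types * (1 + rows * queries_per_row)
}

/// Time spent purely waiting on the network if no requests overlap.
fn estimated_wait(round_trips: u32, latency: Duration) -> Duration {
    latency * round_trips
}

fn main() {
    // Hypothetical inputs, not measurements.
    let trips = sequential_round_trips(22, 4000, 2);
    for latency_us in [100u64, 1_000] {
        let wait = estimated_wait(trips, Duration::from_micros(latency_us));
        println!("{trips} round trips at {latency_us} us each: {wait:?}");
    }
}
```

If the real numbers look anything like this, the latency between the application and the database on the server would dominate, and measuring a single query's round-trip time in both environments would be a cheap first check.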

Can you see any warnings with tokio-console?

FWIW, the code provided does a significant number of copies and allocations, many of which may be unnecessary.
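Two patterns from the posted function illustrate this: `codes[6].clone()[1..].to_string()` clones the whole string only to slice it, and `for c in c_arr.clone()` copies a vector on every outer iteration. A minimal sketch of allocation-lighter equivalents follows. The helper names `tail` and `shares_item` are invented here, and this is unlikely on its own to explain a 20x gap.

```rust
/// Everything after the first byte of `code`.
/// Same result as `code.clone()[1..].to_string()` without the throwaway clone.
/// Like the original, it panics on an empty string or a non-ASCII first char.
fn tail(code: &str) -> String {
    code[1..].to_string()
}

/// True if any comma-separated item of `result` also appears in `content`.
/// Replaces the labeled nested loop that cloned `c_arr` on each pass.
fn shares_item(result: &str, content: &str) -> bool {
    let items: Vec<&str> = content.split(',').collect();
    result.split(',').any(|r| items.contains(&r))
}

fn main() {
    let codes = vec!["07".to_string(), "12".to_string()];
    println!("tail: {}", tail(&codes[0]));

    // In the posted case 20, `right` ends up true only when nothing is shared.
    let content = "12,30";
    let right = content.contains(',') && !shares_item("07,12", content);
    println!("right: {right}");
}
```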

Other than hand-optimizing this code, you might be able to squeeze some performance out of the very old CPU with the help of the auto-vectorizer by enabling a compatible instruction set:

RUSTFLAGS='-C target-feature=+avx' cargo build --release

Some other target features might also help slightly. You'll have to cross-reference this list against what your CPU actually supports. At first glance, +aes should be supported; +bmi1 only arrived with Haswell (the E5-2670 v3), so check before enabling it on the original 2012 part. It's unlikely that any of these will make a drastic difference, though.
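To check what the server's CPU actually reports before enabling a feature, a small program using the standard library's `is_x86_feature_detected!` macro can help. This is only a sketch: `detected_features` is a made-up helper, the three features are the ones mentioned above, and on non-x86 machines such as the M1 it returns an empty list.

```rust
/// Runtime detection results for the features discussed above (x86 only).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn detected_features() -> Vec<(&'static str, bool)> {
    vec![
        ("avx", std::is_x86_feature_detected!("avx")),
        ("aes", std::is_x86_feature_detected!("aes")),
        ("bmi1", std::is_x86_feature_detected!("bmi1")),
    ]
}

/// On other architectures there is nothing to report.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn detected_features() -> Vec<(&'static str, bool)> {
    Vec::new()
}

fn main() {
    let features = detected_features();
    if features.is_empty() {
        println!("not an x86 CPU; nothing to check");
    }
    for (name, supported) in features {
        println!("{name}: {supported}");
    }
}
```

Enabling a feature the CPU lacks can make the binary crash with an illegal-instruction error, so it is worth running this on the server itself rather than on the build machine.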

Do you use cargo build --release on both the Mac and the server?