Restarting the accel project
Two years after the accel-0.1.0 release, I have released accel-0.3.0. The project was effectively dead for a long time after 0.1.0, and 0.3.0 is the first release since it was restarted! (0.2.0 has been yanked.)
In these two years, the Rust environment has improved in many ways:
- Procedural macros have been stabilized.
- The nvptx module has been added as core::arch::nvptx, though it is still unstable.
- Linker settings for the nvptx64-nvidia-cuda target have been added through the efforts of the rust-cuda working group and the denzp/rust-ptx-linker project.
accel-0.3.0 was started in order to migrate to these features, and now it is done!
What is the accel project?
accel is a crate for writing GPU kernels based on the CUDA APIs. Here is a vector addition example:
use accel::*;

#[kernel]
unsafe fn add(a: *const f32, b: *const f32, c: *mut f32, n: usize) {
    let i = accel_core::index();
    if (i as usize) < n {
        *c.offset(i) = *a.offset(i) + *b.offset(i);
    }
}

fn main() -> error::Result<()> {
    let device = Device::nth(0)?;
    let ctx = device.create_context();

    // Allocate memories on GPU
    let n = 32;
    let mut a = DeviceMemory::<f32>::zeros(ctx.clone(), n);
    let mut b = DeviceMemory::<f32>::zeros(ctx.clone(), n);
    let mut c = DeviceMemory::<f32>::zeros(ctx.clone(), n);

    // Accessible from CPU as usual Rust slice (though this will be slow)
    for i in 0..n {
        a[i] = i as f32;
        b[i] = 2.0 * i as f32;
    }
    println!("a = {:?}", a.as_slice());
    println!("b = {:?}", b.as_slice());

    // Launch kernel synchronously
    add(ctx,
        1 /* grid */,
        n /* block */,
        &(&a.as_ptr(), &b.as_ptr(), &c.as_mut_ptr(), &n)
    ).expect("Kernel call failed");

    println!("c = {:?}", c.as_slice());
    Ok(())
}
- Rust kernel code decorated with the #[kernel] procedural macro is compiled into a PTX (Parallel Thread Execution) string, the assembly language for NVIDIA GPUs, using the nvptx64-nvidia-cuda target.
- It is based on the CUDA Driver API and does not depend on the CUDA Runtime API or the nvcc compiler.
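For readers without a GPU at hand, the launch in the example above can be imitated on the CPU. The following is only a sketch under stated assumptions: it assumes accel_core::index() returns the usual flattened 1-D index (block index × block size + thread index), and the helpers add_one, launch_add, and vector_add_reference are hypothetical names made up for illustration; they are not part of accel.

```rust
/// CPU stand-in for the body of the `add` kernel: one call plays one GPU thread.
/// `i` takes the place of the value assumed to come from `accel_core::index()`.
/// (Hypothetical helper, not part of accel.)
fn add_one(i: usize, a: &[f32], b: &[f32], c: &mut [f32], n: usize) {
    // Same bounds check as in the kernel: threads past the end do nothing.
    if i < n {
        c[i] = a[i] + b[i];
    }
}

/// Imitates a 1-D launch by visiting every (block, thread) pair sequentially.
/// (Hypothetical helper, not part of accel.)
fn launch_add(grid: usize, block: usize, a: &[f32], b: &[f32], c: &mut [f32], n: usize) {
    for block_idx in 0..grid {
        for thread_idx in 0..block {
            // Assumed flattened index: block index * block size + thread index.
            let i = block_idx * block + thread_idx;
            add_one(i, a, b, c, n);
        }
    }
}

/// Builds the same inputs as the example (a[i] = i, b[i] = 2i) and runs
/// the imitated launch with 1 block of `n` threads.
fn vector_add_reference(n: usize) -> Vec<f32> {
    let a: Vec<f32> = (0..n).map(|i| i as f32).collect();
    let b: Vec<f32> = (0..n).map(|i| 2.0 * i as f32).collect();
    let mut c = vec![0.0_f32; n];
    launch_add(1, n, &a, &b, &mut c, n);
    c
}

fn main() {
    let c = vector_add_reference(32);
    println!("c = {:?}", c);
}
```

Each element of c should equal a[i] + b[i], so this can serve as a reference when checking the output of the real kernel. The bounds check matters as soon as grid × block exceeds n, which is why the kernel keeps it even though the example launches exactly n threads.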
Current status and Roadmap
This project is still at an early stage. It has several limitations:
- For the runtime on CPU
  - Windows and macOS are not supported
  - f64 and complex number support is missing
  - Texture/Surface object handling is missing
  - Async features based on CUDA Streams and Events are disabled until async/.await is supported
- For writing GPU kernel code
The 0.3.0 release focuses on establishing what this project can and cannot do, in order to reveal what remains to be done before it can be used for actual scientific studies. The goal of accel-1.0 is to bring all the features of CUDA/C++ to Rust, but there are several more realistic targets:
- async/.await API for CUDA Stream/Event handling
- Consistent integration to cuBLAS, cuRAND, and cuFFT
- libstd for GPU kernel
I will keep working on these targets as much as I can. If you are interested, please see the GitLab issues.
Links for related projects
- denzp/rust-ptx-builder: Another CUDA kernel builder from Rust crate
- bheisler/RustaCUDA: Another CUDA-based Rust framework