Vulkano vs. gfx-rs

Advantages and disadvantages:

vulkano

  • A bit simpler because it isn't backend-agnostic, and @tomaka probably has a lot more freedom.
  • Potentially faster because it isn't backend-agnostic? I'll have to hold out for benchmarks, though.
  • Isn't totally stabilized.

gfx-rs

  • On the off chance that Vulkan doesn't become the one standard to rule them all, it's easy to swap backends.
  • Is actually fairly complete-ish, at least compared to vulkano. Even though the Vulkan bindings aren't ready yet, I know that I can get started with OpenGL or DX11 right now and porting later will be fairly straightforward.
  • Has a much bigger community right now.

If only Vulkan support is important, which would be a better option to build libraries for / applications with?

In my opinion it really depends on whether you need to write low-level Vulkan code. Do you need any Vulkan-only feature? Personally I would choose gfx-rs over a library tied to a single graphics API.

Aren't they both high-level wrappers either way? My understanding was that, if I wanted to write low-level Vulkan code, I'd just use vulkan-sys, which I definitely don't plan on doing. :smile: Yeah, choosing gfx-rs does seem to be a better option right now, but I was wondering what benefits vulkano carries.

If I'm not mistaken, vulkano is still low-level compared to gfx-rs. Crates named xxx-sys are usually bare FFI bindings without any Rust idioms. Vulkano is an idiomatic wrapper around vulkano-sys that exposes the FFI in a more Rust-like way.
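To make the distinction concrete, here is a minimal sketch of the two layers. Nothing in it is real vulkano code: `fake_sys`, `create_fence`, `destroy_fence`, and `Fence` are hypothetical names, and the "native" calls are simulated in plain Rust so the example runs without a Vulkan driver.

```rust
use std::sync::atomic::Ordering;

/// Hypothetical stand-in for a `-sys` crate: raw handles, status codes, `unsafe`.
mod fake_sys {
    use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

    pub type RawFence = u64;
    pub const SUCCESS: i32 = 0;

    static NEXT_HANDLE: AtomicU64 = AtomicU64::new(1);
    /// Number of fences created but not yet destroyed.
    pub static LIVE_FENCES: AtomicI64 = AtomicI64::new(0);

    /// C-style creation: the handle comes back through an out-pointer.
    pub unsafe fn create_fence(out: *mut RawFence) -> i32 {
        unsafe { *out = NEXT_HANDLE.fetch_add(1, Ordering::SeqCst) };
        LIVE_FENCES.fetch_add(1, Ordering::SeqCst);
        SUCCESS
    }

    /// The caller must pass a live handle exactly once.
    pub unsafe fn destroy_fence(_fence: RawFence) {
        LIVE_FENCES.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Idiomatic layer: safe constructor, `Result` instead of status codes, RAII cleanup.
pub struct Fence {
    raw: fake_sys::RawFence,
}

#[derive(Debug)]
pub struct FenceError(i32);

impl Fence {
    pub fn new() -> Result<Fence, FenceError> {
        let mut raw: fake_sys::RawFence = 0;
        // SAFETY: `raw` is a valid, writable location for the handle.
        let status = unsafe { fake_sys::create_fence(&mut raw) };
        if status == fake_sys::SUCCESS {
            Ok(Fence { raw })
        } else {
            Err(FenceError(status))
        }
    }
}

impl Drop for Fence {
    fn drop(&mut self) {
        // SAFETY: the handle came from `create_fence` and is destroyed exactly once.
        unsafe { fake_sys::destroy_fence(self.raw) }
    }
}

/// Helper for observing the simulated driver state.
fn live_fences() -> i64 {
    fake_sys::LIVE_FENCES.load(Ordering::SeqCst)
}
```

The user of `Fence` never writes `unsafe` and cannot forget to destroy the handle, which is roughly what "idiomatic wrapper" means here.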

I don't really like participating in these debates, but when comparing gfx and vulkano or glium, people (especially on IRC) usually notice that one supports a single graphical API while the other supports multiple APIs. Then they stop there and decide that gfx is better. If I agreed with that I obviously wouldn't work on vulkano.

For me the idea of abstracting over multiple APIs is problematic per se. Since all APIs have differences, your library has to be higher-level than each of the graphical APIs that you abstract over. After all if you could simply expose the same API for all the underlying graphical APIs, then there would be no point in having multiple graphical APIs in the first place.

The consequence is that Vulkan-only features such as multiple subpasses, secondary command buffers, having multiple buffers/images that alias in memory, transient images, etc. can't be supported by a cross-API library.
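A rough way to picture the argument: a cross-API library can only expose unconditionally what every backend can do. The sketch below is an illustration, not code from gfx or vulkano; the `Backend` trait, the `Feature` enum, and the capability answers are assumptions chosen to match the features listed above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Feature {
    SecondaryCommandBuffers,
    TransientImages,
    MemoryAliasing,
}

/// Hypothetical backend interface of a cross-API library.
trait Backend {
    fn name(&self) -> &'static str;
    fn supports(&self, feature: Feature) -> bool;
}

struct VulkanBackend;
struct GlBackend;

impl Backend for VulkanBackend {
    fn name(&self) -> &'static str {
        "vulkan"
    }
    fn supports(&self, _feature: Feature) -> bool {
        true // assumption: all three listed features exist natively
    }
}

impl Backend for GlBackend {
    fn name(&self) -> &'static str {
        "gl"
    }
    fn supports(&self, _feature: Feature) -> bool {
        false // assumption: none of the three has a direct equivalent
    }
}

/// Features the library can expose without emulation: the intersection.
fn portable_features(backends: &[&dyn Backend]) -> Vec<Feature> {
    const ALL: [Feature; 3] = [
        Feature::SecondaryCommandBuffers,
        Feature::TransientImages,
        Feature::MemoryAliasing,
    ];
    ALL.iter()
        .copied()
        .filter(|feature| backends.iter().all(|b| b.supports(*feature)))
        .collect()
}

/// Names of the backends that would need emulation for `feature`.
fn missing_on(backends: &[&dyn Backend], feature: Feature) -> Vec<&'static str> {
    backends
        .iter()
        .filter(|b| !b.supports(feature))
        .map(|b| b.name())
        .collect()
}
```

With only the Vulkan backend in the list, all three features are portable; add the GL backend and the intersection is empty, so the library has to emulate or hide them.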

In addition to this, let's say that one day you read articles like this one that teach you that on AMD cards there are two CPU-visible memory heaps and that you should choose wisely between the two. How do you make use of that knowledge when you use a cross-API library that doesn't expose the concept of memory heaps? Sure, it uses memory heaps internally, but as a user you have absolutely no control over it.
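For readers who have not seen this kind of control, here is a minimal sketch of memory type selection when heaps are exposed. The flag values mirror Vulkan's `VkMemoryPropertyFlagBits`; `MemoryType`, `pick_memory_type`, and the layout returned by `example_types` are hypothetical and only loosely modeled on the "two CPU-visible heaps" situation described above.

```rust
// Bit values mirror Vulkan's VkMemoryPropertyFlagBits.
const DEVICE_LOCAL: u32 = 0x1;
const HOST_VISIBLE: u32 = 0x2;
const HOST_COHERENT: u32 = 0x4;
const HOST_CACHED: u32 = 0x8;

/// Simplified stand-in for one entry of the device's memory type list.
struct MemoryType {
    property_flags: u32,
    heap_index: usize,
}

/// First memory type allowed by `type_mask` that has every flag in `wanted`.
fn find_memory_type(types: &[MemoryType], type_mask: u32, wanted: u32) -> Option<usize> {
    for (index, ty) in types.iter().enumerate().take(32) {
        let allowed = (type_mask & (1u32 << index)) != 0;
        let has_flags = (ty.property_flags & wanted) == wanted;
        if allowed && has_flags {
            return Some(index);
        }
    }
    None
}

/// Try `required | preferred` first, then fall back to `required` alone.
fn pick_memory_type(
    types: &[MemoryType],
    type_mask: u32,
    required: u32,
    preferred: u32,
) -> Option<usize> {
    find_memory_type(types, type_mask, required | preferred)
        .or_else(|| find_memory_type(types, type_mask, required))
}

/// Hypothetical layout: heap 1 is a small CPU-visible slice of video memory,
/// heap 2 is ordinary system memory.
fn example_types() -> Vec<MemoryType> {
    vec![
        MemoryType { property_flags: DEVICE_LOCAL, heap_index: 0 },
        MemoryType { property_flags: DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT, heap_index: 1 },
        MemoryType { property_flags: HOST_VISIBLE | HOST_COHERENT, heap_index: 2 },
        MemoryType { property_flags: HOST_VISIBLE | HOST_COHERENT | HOST_CACHED, heap_index: 2 },
    ]
}
```

An application that streams per-frame data could ask for `HOST_VISIBLE` with a `DEVICE_LOCAL` preference, while one that reads results back could prefer `HOST_CACHED`. That choice is only available if the library lets you see the memory types at all.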

The same problem arises when using debuggers. I know that when I call Fence::alloc in vulkano it will always materialize as vkCreateFence in my debugger. Meanwhile in a higher-level library function calls don't necessarily map exactly to low-level functions, which makes following the flow of the program difficult.

Of course these problems don't show up when you draw a triangle. But they may show up when one day some user reports that your 20k-lines-of-code application slows down on some specific hardware vendor. Or that they compiled your code for Android but it runs at 3 frames per second because your application doesn't use transient images, since they aren't exposed in your wrapper's API.

And I'd like to point out that this is not about gfx in particular. These are problems related to writing an abstraction over multiple APIs. It is really hard to make people aware of these issues if they have never faced them, but supporting multiple APIs is not a strictly superior solution to supporting a single API.

Now to some specifics:

Where did you get that idea [that gfx is fairly complete compared to vulkano]? As far as I know there are tons of features that gfx should/will support but doesn't support yet, like compute shaders or indirect drawing.

That's probably not true either [that vulkano is potentially faster]. If you write the exact same code in vulkano and gfx, gfx will probably win right now on the CPU. Safety is the major goal of vulkano, and this causes some overhead. Safety is a non-goal for gfx.

For me safety is more important than performance, and performance can be gained by using Vulkan-specific features. I recently went from 8ms of CPU per frame to 2ms per frame by using secondary command buffers, which don't exist in OpenGL/DX11.
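The post does not say how the secondary command buffers were used. One common pattern is to record them on several threads and have the primary command buffer execute them in order. The sketch below models only that shape in plain Rust: `Command`, `SecondaryCommandBuffer`, and `record_frame` are hypothetical, and no Vulkan calls are made.

```rust
use std::thread;

#[derive(Clone, Debug, PartialEq)]
enum Command {
    Draw { object_id: u32 },
}

/// Stand-in for a secondary command buffer: a list recorded independently.
struct SecondaryCommandBuffer {
    commands: Vec<Command>,
}

fn record_secondary(objects: &[u32]) -> SecondaryCommandBuffer {
    SecondaryCommandBuffer {
        commands: objects
            .iter()
            .map(|&object_id| Command::Draw { object_id })
            .collect(),
    }
}

/// Split the objects into chunks, record each chunk on its own thread,
/// then let the "primary" buffer execute the secondaries in chunk order.
fn record_frame(objects: &[u32], workers: usize) -> Vec<Command> {
    let workers = workers.max(1);
    let chunk_len = ((objects.len() + workers - 1) / workers).max(1);

    let secondaries: Vec<SecondaryCommandBuffer> = thread::scope(|scope| {
        let handles: Vec<_> = objects
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || record_secondary(chunk)))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("recording thread panicked"))
            .collect()
    });

    secondaries.into_iter().flat_map(|s| s.commands).collect()
}
```

The submission order is unchanged; only the CPU work of recording is spread across threads, which is where savings of this kind can come from.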

That's not necessarily true [that porting is straightforward], but I'm not familiar enough with gfx to debate it. I know that you will have to rewrite all shaders if you want to switch between OpenGL, DirectX and Vulkan.

Whoops, didn't know that mentioning people notifies them!

Anyway, thanks!

As far as I know there are tons of features that gfx should/will support but doesn’t support yet, like compute shaders or indirect drawing.

I thought compute shaders and whatever the low-level stuff is were the only major features that were under active development, but I guess there's more to it! Anyway, isn't it safe to assume that it's mostly API-compatible with future versions? :slight_smile:

Safety is a non-goal for gfx.

But just going by the README it seems that safety is one of their primary motivations? And its API at least looks safe-ish at a glance. It even looks like they got rid of manual destruction and replaced it with reference-counting everywhere, like vulkano.
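For context, this is roughly what "reference counting instead of manual destruction" looks like in Rust. It is a sketch with hypothetical types (`Buffer`, `DrawCommand`, `record_draw`), not the API of gfx or vulkano; the counter stands in for the native destroy call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical GPU buffer; `destroy_calls` counts simulated native destroys.
struct Buffer {
    destroy_calls: Arc<AtomicUsize>,
}

impl Drop for Buffer {
    fn drop(&mut self) {
        // Runs only when the last `Arc<Buffer>` goes away.
        self.destroy_calls.fetch_add(1, Ordering::SeqCst);
    }
}

/// A recorded command keeps its resources alive by holding a reference.
struct DrawCommand {
    vertex_buffer: Arc<Buffer>,
}

fn record_draw(vertex_buffer: &Arc<Buffer>) -> DrawCommand {
    DrawCommand {
        vertex_buffer: Arc::clone(vertex_buffer),
    }
}
```

Dropping the user's handle while a command still references the buffer does not destroy it; the destroy happens when the command is dropped too.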

That’s not necessarily true [that porting is straightforward.]

I assumed that simpler ports was one of the main draws of a backend-agnostic framework. Isn't it just rewriting shaders and swapping libraries? :confused:

After all if you could simply expose the same API for all the underlying graphical APIs, then there would be no point in having multiple graphical APIs in the first place.

Again, I always just assumed that, at least in the case of Metal, the features are just about the same, but the API was created just to tie developers down to a specific platform. Not sure why libraries always have to offer unique functionality to coexist.

Sure, it's API-compatible. I was just responding to "gfx is more complete than vulkano", which for me is not the case.

Safety is much more than just adding reference counting.

It's for example about ensuring that your data conforms to the alignment requirements of the hardware (did you know that the NVIDIA Windows kernel driver freezes if your uniform buffers are misaligned?), ensuring that you can properly recover from an out of memory error, or ensuring that you cannot read signalling NaNs from a buffer after a lost device.
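As a small illustration of the alignment point, here is the kind of check a safe wrapper might perform before binding a uniform buffer at an offset. In Vulkan the required value comes from the device limit `minUniformBufferOffsetAlignment`; the functions and the error type below are hypothetical, and the alignment is assumed to be a power of two.

```rust
/// Round `offset` up to the next multiple of `alignment` (a power of two).
fn align_up(offset: u64, alignment: u64) -> u64 {
    assert!(alignment.is_power_of_two(), "alignment must be a power of two");
    (offset + alignment - 1) & !(alignment - 1)
}

#[derive(Debug, PartialEq)]
struct MisalignedOffset {
    offset: u64,
    required_alignment: u64,
}

/// Reject offsets the hardware would not accept instead of passing them on.
fn check_uniform_offset(offset: u64, min_alignment: u64) -> Result<u64, MisalignedOffset> {
    if align_up(offset, min_alignment) == offset {
        Ok(offset)
    } else {
        Err(MisalignedOffset {
            offset,
            required_alignment: min_alignment,
        })
    }
}
```

A sub-allocator would typically call `align_up` when placing each uniform block, and a safe API would run something like `check_uniform_offset` on every user-supplied offset.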

Vulkan has tons of "weird" rules such as:

If the variable multisample rate feature is not supported, pipeline is a graphics pipeline, the current subpass has no attachments, and this is not the first call to this function with a graphics pipeline after transitioning to the current subpass, then the sample count specified by this pipeline must match that set in the previous pipeline

Or:

If the border color [of a sampler] is one of the VK_BORDER_COLOR_*_OPAQUE_BLACK enums and the VkComponentSwizzle [of the image the sampler is used with] is not VK_COMPONENT_SWIZZLE_IDENTITY for all components (or the equivalent identity mapping), the value of the texel after swizzle is undefined.

Total safety is about checking these rules.
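To give an idea of what checking one of these rules involves, here is a sketch of the border color rule quoted above. The enums mirror the Vulkan names (`VK_BORDER_COLOR_*`, `VkComponentSwizzle`) but are local to the example, and `check_border_color` is a hypothetical function, not vulkano's implementation.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum BorderColor {
    FloatTransparentBlack,
    IntTransparentBlack,
    FloatOpaqueBlack,
    IntOpaqueBlack,
    FloatOpaqueWhite,
    IntOpaqueWhite,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Swizzle {
    Identity,
    Zero,
    One,
    R,
    G,
    B,
    A,
}

/// True if every component is `Identity` or maps to its own channel
/// (the "equivalent identity mapping" from the rule).
fn is_identity_mapping(components: [Swizzle; 4]) -> bool {
    let own_channel = [Swizzle::R, Swizzle::G, Swizzle::B, Swizzle::A];
    components
        .iter()
        .zip(own_channel.iter())
        .all(|(actual, same)| *actual == Swizzle::Identity || actual == same)
}

/// Refuse sampler/image combinations whose border texels would be undefined.
fn check_border_color(border: BorderColor, components: [Swizzle; 4]) -> Result<(), &'static str> {
    let opaque_black = matches!(
        border,
        BorderColor::FloatOpaqueBlack | BorderColor::IntOpaqueBlack
    );
    if opaque_black && !is_identity_mapping(components) {
        return Err("opaque black border color requires an identity swizzle");
    }
    Ok(())
}
```

Each individual rule is cheap to check; the cost of "total safety" comes from there being a very large number of them.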

I'm not familiar enough with Metal, but if API X is the same as API Y then you can write an implementation of API X (or Y) that uses Y (or X) under the hood. This will benefit more people than writing an abstraction over both X and Y.

if API X is the same as API Y then you can write an implementation of API X (or Y) that uses Y (or X) under the hood.

In the specific case of Metal, that's exactly what MoltenVK is. Unfortunately it's proprietary. :frowning:

Anyway, thanks for the information!

Wouldn't it be best to invite someone from the gfx-rs community into the discussion instead of guessing?

First of all, the comparison is written against pre-ll gfx-rs. This is important to state since it can't be projected into the future, where we changed everything, and this is what our master branch is now.

Essentially, the answer depends on whether you are good with vulkan-only or not.

@tomaka it's been a while, eh? :slight_smile:

If I agreed with that I obviously wouldn’t work on vulkano.

That's obviously false, since we do support multiple APIs, and you still work on vulkano :stuck_out_tongue:

For me the idea of abstracting over multiple APIs is problematic per se.

Both Khronos and W3C disagree with you.

Since all APIs have differences, your library has to be higher-level than each of the graphical APIs that you abstract over.

Also false. Vulkan is lower than Metal, but Vulkan -> Metal has been proven by MoltenVK to translate somewhat efficiently (used in production).

After all if you could simply expose the same API for all the underlying graphical APIs, then there would be no point in having multiple graphical APIs in the first place.

Reasons for multiple native APIs are many, and not all of them are technical (read: some of them are purely political). Technology can be seen as a tool here to solve political problems even :wink:

The consequence is that Vulkan-only features such as multiple subpasses, secondary command buffers, having multiple buffers/images that alias in memory, transient images, etc. can’t be supported by a cross-API library.

Not all of the Vulkan features can be efficiently emulated, but one can take a useful subset and make it portable. That's the goal of Vulkan Portability initiative.

How do you make use of that knowledge when you use a cross-API library that doesn’t expose the concept of memory heaps?

Interestingly, the memory types/heaps model of Vulkan makes it possible to encode some nasty foreign API differences nicely. For example, D3D12 tier-1 resource heaps are not allowed to mix buffers/images/targets. We enforce this by exposing multiple memory types and specifying them in memory requirements, so that the user can't mix those resources as long as they fulfill the requirements. Another example is GL, where exposing multiple Vulkan memory types/heaps lets us communicate to the driver what CPU access we need. In other words, yes, other APIs have different concepts, but they still come down to the same hardware, and so can often be mapped nicely.
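A minimal sketch of the D3D12 tier-1 idea, under the assumption that each exposed memory type is dedicated to one resource category. The names (`ResourceKind`, `MemoryType`, `memory_type_mask`, `bind_memory`) are hypothetical and this is not the actual gfx-hal code.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ResourceKind {
    Buffer,
    Image,
    RenderTarget,
}

/// One exposed memory type; on a tier-1 device each is backed by heaps
/// that may hold only one kind of resource.
struct MemoryType {
    holds: ResourceKind,
}

fn tier1_memory_types() -> Vec<MemoryType> {
    vec![
        MemoryType { holds: ResourceKind::Buffer },
        MemoryType { holds: ResourceKind::Image },
        MemoryType { holds: ResourceKind::RenderTarget },
    ]
}

/// The "memory requirements" bit mask reported for a resource of `kind`:
/// bit i is set if memory type i may back it.
fn memory_type_mask(types: &[MemoryType], kind: ResourceKind) -> u32 {
    types
        .iter()
        .enumerate()
        .take(32)
        .filter(|(_, ty)| ty.holds == kind)
        .fold(0, |mask, (index, _)| mask | (1u32 << index))
}

/// Binding succeeds only if the chosen memory type is in the reported mask.
fn bind_memory(
    types: &[MemoryType],
    kind: ResourceKind,
    memory_type_index: usize,
) -> Result<(), String> {
    let mask = memory_type_mask(types, kind);
    if memory_type_index < 32 && (mask & (1u32 << memory_type_index)) != 0 {
        Ok(())
    } else {
        Err(format!(
            "memory type {} cannot back a {:?}",
            memory_type_index, kind
        ))
    }
}
```

A user who follows the reported mask, as Vulkan already requires, can never place a buffer and a render target in the same heap, so the D3D12 restriction needs no extra API surface.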

Meanwhile in a higher-level library function calls don’t necessarily map exactly to low-level functions, which makes following the flow of the program difficult.

Sure thing. There is another side to the story, though: having multiple backends at your disposal opens up more opportunities to debug an application. Something might be caught by the Metal debug layer, something else is easier to debug with PIX for Windows, etc.

I know that you will have to rewrite all shaders if you want to switch between OpenGL, DirectX and Vulkan.

That was in pre-ll. Today though, we uniformly accept SPIRV code and generate the native platform shaders with SPIRV-Cross. The differences (e.g. coordinate systems) are taken into account during this translation, making shaders totally portable.
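As an example of the kind of coordinate difference involved: Vulkan clip space has Y pointing down and depth in [0, w], while OpenGL has Y pointing up and depth in [-w, w]. The sketch below shows the arithmetic for converting a clip-space position between the two conventions. Whether the shader translation in gfx-rs applies exactly this fixup is an assumption; the function names are hypothetical.

```rust
/// Convert a clip-space position from Vulkan conventions to OpenGL conventions:
/// flip Y and remap depth from [0, w] to [-w, w].
fn vulkan_to_gl_clip(position: [f32; 4]) -> [f32; 4] {
    let [x, y, z, w] = position;
    [x, -y, 2.0 * z - w, w]
}

/// Inverse mapping: flip Y and remap depth from [-w, w] to [0, w].
fn gl_to_vulkan_clip(position: [f32; 4]) -> [f32; 4] {
    let [x, y, z, w] = position;
    [x, -y, (z + w) * 0.5, w]
}
```

A translator can append a fixup like this to the end of each vertex shader, so the application writes its shaders against one convention only.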

TL;DR: I encourage you to learn more about the current (and future) gfx-rs. We aren't competitors, we are friends. After all, one of the goals is to link to vulkano :wink:

You're both competitors and collaborators, which is the optimal arrangement to encourage innovation.

And yet, many of them are technical, and not all of them can be papered over just because they're targeting the "same hardware" - they're not!

In my experience, a useful cross-API layer needs to be quite a bit higher level than gfx-rs, to the point that it becomes the component responsible for the types of quirks and differences tomaka referred to. Khronos and W3C cannot change that- all they can do is push the burden higher in the stack where it's much harder to deal with.

I'm even skeptical of Vulkan itself. It's a valiant effort and I hope it works out, but it's fighting an uphill battle against the differences between desktop, mobile, and console GPUs. Something like gfx-rs that has to cover D3D/GL/Metal on top of that is just not going to work as well as something higher level.

That said, perhaps it's the only pragmatic option for the web. But I wish more people would realize that that's precisely what it is- a compromise for pragmatic reasons, not an ideal solution for everything.

And yet, many of them are technical

Well, tell me about that :wink: Both Vulkan Portability and GPUweb groups are currently investigating the API differences and the ways to map from one to another, not without our (gfx-rs team) help. During the evolution of gfx-rs as well as early WebGPU prototyping my perspective on the topic was also getting refined. One conclusion I drew is higher != better.

As a Microsoft member, you could join both of these groups and see how far we've gone. Or I can just drop you some links in private :wink:

not all of them can be papered over just because they’re targeting the “same hardware” - they’re not!

Sure! Apple in particular has the liberty of targeting the narrowest HW range with Metal features.

I’m even skeptical of Vulkan itself. It’s a valiant effort and I hope it works out, but it’s fighting an uphill battle against the differences between desktop, mobile, and console GPUs.

Vulkan is not ideal, by far. I can list a few API aspects that don't map that well to particular IHVs. But Vulkan is the best of what we have now: it's the most well documented API, designed in the open, and focused on the largest selection of devices.

It was quite surprising to me to discover how well Vulkan features often map to the higher level APIs. For example, Metal doesn't have sub-passes? They are perfectly emulated via framebuffer fetches. No descriptor sets? Here come argument buffers. And so on.

Something like gfx-rs that has to cover D3D/GL/Metal on top of that is just not going to work as well as something higher level.

Obviously, the higher you go (think Game::new().play()) the more efficient you can get, but at the cost of reducing the use spectrum.

In my experience, a useful cross-API layer needs to be quite a bit higher level than gfx-rs, to the point that it becomes the component responsible for the types of quirks and differences tomaka referred to.

I didn't find those convincing w.r.t. abstraction itself. Care to pick any in particular and dive deeper?

Khronos and W3C cannot change that- all they can do is push the burden higher in the stack where it’s much harder to deal with.

Vulkan Portability so far looks way more successful than I'd expect. A lot of the complexity/burden can be effectively solved by the SPIRV->native shader translation, which we recently got.

But I wish more people would realize that that’s precisely what it is- a compromise for pragmatic reasons, not an ideal solution for everything.

It is one thing to develop an API and figure out which abstraction level covers the users' needs most efficiently. It is another to take into consideration the existing analysis, global awareness, branding, documentation, mental models, tools, ecosystem, and so forth, which suddenly makes the strategy of building on an already released API more appealing.

I mean, that's my entire point. If we want existing games and engines to start running on the web, then maybe a gfx-rs-like solution is the best we can do for them, but in no case are they just going to drop any of their other backends on other platforms. Which means maybe trying to abstract every API isn't actually all that beneficial.

In the same vein, smaller games and projects (that wouldn't have multiple backends to begin with) won't benefit from an abstraction over every single API either. They do better with higher level APIs, or if they're feeling ambitious, OpenGL or D3D11.

What I'm getting at is that the niche gfx-rs is trying to fill is already filled by existing APIs! Applications that benefit from going that low-level will be writing directly to D3D12/GNM/NVN/Metal/Vulkan regardless. Applications that don't can just use something higher level. Its best use case at the moment seems to be the web, but effort there would IMHO be better spent on fixing up an existing API (probably Vulkan), both securing it for untrusted clients and emulating it only on platforms that don't support it.

That way you have less overhead in the web API, less work designing and maintaining the web API, and less work for clients of the web API.

in no case are they just going to drop any of their other backends on other platforms. Which means maybe trying to abstract every API isn’t actually all that beneficial.

It depends. If our HAL layer works well, those applications may save on development/maintenance cost of these other backends. It's a matter of maturity for us.

In the same vein, smaller games and projects (that wouldn’t have multiple backends to begin with) won’t benefit from an abstraction over every single API either. They do better with higher level APIs, or if they’re feeling ambitious, OpenGL or D3D11.

Don't you think that building those higher level APIs on top of gfx-rs is significantly easier than trying to support multiple backends in the first place? Applications can benefit from every single API without talking directly to gfx-rs. Amethyst will provide a layer, three-rs will, ggez probably too, there is gfx-render, and so on.

What I’m getting at is that the niche gfx-rs is trying to fill is already filled by existing APIs! Applications that benefit from going that low-level will be writing directly to D3D12/GNM/NVN/Metal/Vulkan regardless.

Mmm... gfx-rs is not going to be the first in line for adoption by behemoths like Unreal Engine. This is totally fine. Our abstraction in principle is not going to be more performant than a hypothetically optimal use of the native APIs directly. But again, the feasibility of it depends on the quality of our work.

web, but effort there would IMHO be better spent on fixing up an existing API (probably Vulkan), both securing it for untrusted clients and emulating it only on platforms that don’t support it.

That's what we are doing :smiley: I wrote the original Obsidian API proposal. We are implementing the Vulkan Portability initiative (which is as close to "fixing up an existing API, emulating it on platforms that don't support it" as you can get), and I do my best at steering GPUWeb towards agreeing effectively on the Vulkan model.

No, I don't, and I've been saying so this whole time, as did tomaka. The sum total of work going into gfx-rs+three-rs is greater than simply building a three.js-level API directly on top of a handful of platform APIs. The moment a higher-level API has to do something slightly differently for a particular backend is the moment gfx-rs either gets in the way, or fails to abstract the differences it's hiding.

And no, the idea of several three-rs-level APIs built on top of gfx-rs does not change the equation here. Because gfx-rs is too low level to actually provide the platform-specific logic in question, all of these APIs will have to reimplement it themselves anyway. And because gfx-rs is too high level to let them do so directly, it will take more work to maintain than if they had written to the platform APIs to begin with.

The insistence on providing a layer at this level is what causes problems- no matter how mature gfx-rs gets it will not be a good replacement for the multiple backends of a higher layer. When these higher level APIs/engines/applications ever share code, it is in the form of libraries that don't try to own anything and that are easily bypassed. Some good examples are shader languages like Cg or SPIRV (with SPIRV-Cross), data processing tools like AssImp or NVTT or RAD's libraries, standardized formats like glTF, or highly focused APIs like Pathfinder or Scaleform.

This is, IMO, the biggest idea behind C, C++, and now Rust- the reason they succeed in places that Java, OCaml, D, Go, Swift, etc. fail is that they don't try to own a layer. They have no runtime to get in the way, which makes it almost trivial just to plug them in anywhere at any scale. Libraries and tools like those above have the same property; no comprehensive graphics API-hiding layer ever can.

I feel like there is a fundamental misunderstanding here, somewhere.

The sum total of work going into gfx-rs+three-rs is greater than simply building a three.js-level API directly on top of a handful of platform APIs

I don't understand this part. gfx-hal is basically Vulkan. So this work is three-rs + {Vulkan}, which is strictly less than three-rs + {Vulkan, D3D12, Metal, GL, etc}.

Because gfx-rs is too low level to actually provide the platform-specific logic in question, all of these APIs will have to reimplement it themselves anyway

I think a lot is implied here that doesn't come across well in the forum medium. What sort of logic do you have in mind? Why would three-rs have to use it?

no matter how mature gfx-rs gets it will not be a good replacement for the multiple backends of a higher layer

It will never be a perfect replacement for multiple backends, but if we aren't talking extremes, what's wrong with providing a good enough one?

Some good examples are shader languages like Cg or SPIRV

I don't see how the model of abstraction served by these is fundamentally different from the gfx-hal model (in the API space). Neither has Metal-specific features, for example.

the biggest idea behind C, C++, and now Rust- the reason they succeed in places that Java, OCaml, D, Go, Swift, etc. fail is that they don’t try to own a layer.

Rust, Swift, and C/C++ (to an extent) don't quite have to talk to the machine directly because they rely on LLVM to own that abstraction layer. So gfx-hal is more of an analogy to LLVM here, and that only strengthens the position it tries to take in my eyes.

No, I'm referring to the collective work it takes to create gfx-rs itself and to create three-rs on top- not just three-rs. First gfx-rs has to be made a general-purpose abstraction of every platform API, and then someone has to build three-rs on top. It would be less collective work to write three-rs directly to the platform APIs, because the abstraction could be less general-purpose and could be modified per-platform rather than worked around when it leaks.

(I highly recommend rereading my last post with this interpretation in mind, before you go charging ahead arguing that gfx-rs would allow work to be shared... because it wouldn't.)

Tomaka already described the sort of logic I'm referring to, and gave some examples. In general I'm talking about the stuff that gfx-rs tries to hide, but which must change across platform APIs and IHVs to get good performance, to work around driver bugs, or to take advantage of new, optional, or platform specific features. gfx-rs itself is too low-level to do this kind of stuff, so its clients will either a) be worse off than if they used something higher level that could take care of it for them, or b) do extra work to get around gfx-rs's leaky abstraction.

To quote myself, "it is in the form of libraries that don’t try to own anything and that are easily bypassed." Cg and SPIRV don't own the GPU- they simply give you one way of describing programs to run on it, and you can freely mix and match them with platform-specific shader languages, all within a single application- just like C/C++/Rust, and completely unlike gfx-hal.

The LLVM analogy does not apply either. Unlike gfx-hal, LLVM runs programs through SSA form, optimization passes, register allocation, and instruction scheduling. It has a vast amount of platform-specific knowledge and tuning to make this work. The true counterparts to LLVM in the GPU space are things like Frostbite's frame graph or Unreal's material system. (They would be even better analogies if they weren't tied to monolithic game engines. I would love to see more lightweight, interoperable libraries pop up to do things like that- Pathfinder and WebRender are pretty close to this ideal.)

The true counterpart to gfx-hal in the compiler space would be a single assembler attempting to target several architectures. As you claim, this would be "good enough" for a lot of purposes. But it would be a rather quixotic enterprise: the same effort would be better spent tuning LLVM or hand-writing SSE intrinsics.

It is particularly telling that WebAssembly, the closest thing to this hypothetical assembler, goes out of its way to make it fast and straightforward for the VM to run its own, platform-specific optimization passes on its input. It notably does not try to find a common intersection of features and then emulate the rest.