Nvidia Tilus: A Tile-Level GPU Kernel Programming Language

github.com

91 points by ashvardanian 7 days ago


lukax - 3 days ago

NVIDIA does not want CUDA development (e.g. flash attention) to move to Triton, because Triton also supports AMD, and if the ecosystem moves from pure CUDA to Triton, that's bad for NVIDIA's lock-in. That's why there is so much focus on CUDA Python (lower level) and Tilus (higher level, more similar to Triton).

skavi - 3 days ago

What’s the relationship between Tilus and cuTile (and maybe Warp)?

A slight tangent, but I really wish Nvidia would release more details on Tile IR. Specifically on what it enables vs PTX.

Is it just about moving towards more MLIR based infra? Maybe it’s higher level and thus can enable better codegen across generations?

pjmlp - 3 days ago

Nvidia's recent interest in making Python first class in the CUDA ecosystem makes me wonder how successful Mojo might become if they aren't faster than Nvidia with Python.

All the attempts to attack CUDA fail to understand why most researchers flock to it instead of enduring the pain of the competition's tooling. They tend to focus on a single aspect of CUDA, be it C++ or something else, but never the polyglot support, the libraries, the IDE integration, the graphical debuggers, or the compiler backends that let other developers target CUDA.

alwahi - 3 days ago

Okay, I am not a systems-level programmer, but I am currently learning C with the aim of doing some GPGPU programming using CUDA etc. What is a tile-level GPU kernel programming language, and how is it different from something like CUDA?

I know I can ask an LLM or search on Google, but I was hoping someone in the community could explain it in a way I could understand.

ddtaylor - 3 days ago

Nvidia lock-in attempt.

drdirk - 2 days ago

The linked paper in the GitHub repository doesn’t contain an NVIDIA email. There is an Amazon email and a bunch of university emails.

How did this paper become an NVIDIA project?

Paper: https://arxiv.org/pdf/2504.12984

Archit3ch - 3 days ago

I'm hoping that something like MoYe.jl moves from Nvidia-only to a vendor-agnostic tile DSL.

Keyframe - 3 days ago

So, SIMD but in Python and on GPU?

moralestapia - 3 days ago

New thing to experiment with.

Great :).

FilosofumRex - 3 days ago

Seriously!!! There is so much more Nvidia and its billionaire managers can do to improve the developer experience with CUDA/nvcc/PTX, instead of yet another barely functional, sparsely documented, rarely tested DSL.

camdroidw - 3 days ago

I'm a little out of the loop these days, but I thought WebGPU obviated the use of platform-specific languages (say, CUDA)?