CUDA Books
github.com | 84 points by dariubs 8 hours ago
Having read or at least skimmed most of those books, I think the best intro is 'CUDA Programming: A Developer's Guide to Parallel Computing with GPUs'.
'Programming Massively Parallel Processors: A Hands-on Approach' is not really good in my opinion: it has many small mistakes and confusing sentences (even when you know CUDA).
'CUDA by Example: An Introduction to General-Purpose GPU Programming' is too simple and abstracts away too much of the architecture.
Next year I'm planning to start writing a CUDA book that starts by engineering the hardware and works up to optimization on that hardware (which is basically an NVIDIA card), including all the main algorithms (except for graphs).
I'm already teaching the course in this way at uni, and it is quite successful among students.
Regarding the section on Python and high-level CUDA, anyone interested should maybe first take a peek at Warp, which I’m guessing is too new to have a book yet. Warp lets you write CUDA kernels directly in Python, and it’s a breeze to get started. https://github.com/nvidia/warp
"AI Systems Performance Engineering" might deserve a mention, even though it's not strictly CUDA.
I liked going through https://www.olcf.ornl.gov/cuda-training-series/ for an intro and some fundamentals.
First one I clicked on is 404: Programming Massively Parallel Processors: A Hands-on Approach (3rd Edition) https://www.cambridge.org/core/books/programming-in-parallel...
Increasingly (for instance, on the ADSP podcast [1]), those in NVIDIA's inner circle are advocating against writing your own CUDA kernels (unless that's your full-time job at NVIDIA, that is).
That would be cool, but NVIDIA released Blackwell and still has not released unbroken kernels for sm120. Sm120 is not the data center GPU, so it doesn't get any love. My point is that, unfortunately, we can't depend on NVIDIA to do the right thing.
It’s not about whether you work at Nvidia. Avoid writing CUDA kernels if there are higher level libraries that do what you need. Do write CUDA kernels if you want to learn how, or if you need the low level control, or to micro-optimize. Being able to fuse kernels to avoid memory traffic or get better specialization is also a reason to reach for raw CUDA. Just consider what’s the right tool for the job…
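To make the fusion point concrete, here's a rough toy model in plain Python (not real CUDA, and every name in it is made up for illustration). It counts simulated "global memory" accesses for two separate elementwise passes versus one fused pass, assuming each element load and store costs one access and that the intermediate value in the fused version lives in a register.

```python
# Toy model of kernel fusion: count simulated "global memory" accesses.
# All names here are hypothetical; this is plain Python, not CUDA.

class CountedArray:
    """A list that counts element loads and stores, standing in for GPU global memory."""

    def __init__(self, data):
        self.data = list(data)
        self.reads = 0
        self.writes = 0

    def load(self, i):
        self.reads += 1
        return self.data[i]

    def store(self, i, value):
        self.writes += 1
        self.data[i] = value


def scale_kernel(src, dst, a):
    # "Kernel" 1: dst[i] = a * src[i]
    for i in range(len(src.data)):
        dst.store(i, a * src.load(i))


def add_kernel(src, dst, b):
    # "Kernel" 2: dst[i] = src[i] + b
    for i in range(len(src.data)):
        dst.store(i, src.load(i) + b)


def fused_kernel(src, dst, a, b):
    # Fused version: the intermediate a * x stays in a local variable
    # (the stand-in for a register) and never touches "global memory".
    for i in range(len(src.data)):
        tmp = a * src.load(i)
        dst.store(i, tmp + b)


def traffic(*arrays):
    # Total simulated memory accesses across the given arrays.
    return sum(arr.reads + arr.writes for arr in arrays)


n = 1000

# Unfused: two passes with an intermediate array in between.
x = CountedArray(range(n))
tmp = CountedArray([0] * n)
out_unfused = CountedArray([0] * n)
scale_kernel(x, tmp, 2)
add_kernel(tmp, out_unfused, 1)
unfused_traffic = traffic(x, tmp, out_unfused)

# Fused: one pass, no intermediate array.
x2 = CountedArray(range(n))
out_fused = CountedArray([0] * n)
fused_kernel(x2, out_fused, 2, 1)
fused_traffic = traffic(x2, out_fused)

print(unfused_traffic, fused_traffic)  # prints "4000 2000"
```

Same result, half the simulated traffic, because the intermediate array is never written or read back. On real hardware the actual saving depends on caching, occupancy, and whether the kernel is memory-bound in the first place, so treat the 2x as a property of this toy model only.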
Any good MOOCs on parallel programming/NVIDIA?
In an age when your company mandates that you raise your productivity by hundreds of percent right now using LLMs, how do you find an excuse to sit down and read a book?
It feels like a dirty secret, doesn't it?
Yeah, corps don't want you to know how to code, they want you to be a prompter...