Forem

# cuda

Posts

Profiling GPU (CUDA) — What Is Actually Limiting Your Kernel?

1
Comments
4 min read
LLMs Can Now Write GPU Kernels That Beat torch.compile

Comments
7 min read
UltrafastSecp256k1 v3.14.0

3
Comments 1
1 min read
UltrafastSecp256k1 v3.14.0

3
Comments
1 min read
Profiling GPU (CUDA) — Introducing GPU Flight

2
Comments
3 min read
How I Run 6 AI Services Simultaneously on RTX 5090 + WSL2 + Docker (And You Can Too)

1
Comments
6 min read
A GPU-accelerated implementation of Forman-Ricci curvature-based graph clustering in CUDA.

Comments
9 min read
AI Engineering: Why the Environment Is the Most Ignored Long-Term Asset

Comments
5 min read
How to Read GPU Profiling Logs: A Ground-Up Guide

1
Comments
7 min read
Two Ways to Move Tensors Without Stopping: Inside vLLM's Async GPU Transfer Patterns

5
Comments 1
7 min read
Turkish Sieve Engine (TSE) V.1.0.0

Comments
5 min read
eBPF Tutorial: Tracing CUDA GPU Operations

Comments
12 min read
Getting started with GPU Programming on an EC2!

6
Comments
5 min read
When I Took Numba to the Dojo: A Battle Royale Against Rust and CUDA

Comments
5 min read
Using CuCollections Nvidia Data Structures Library

Comments
1 min read