Forem

# cuda

May 23

The Microsecond Lie: Why your Go timers are lying about the GPU

#ai #programming #go #cuda

3 min read

May 22

Profiling a CUDA Python Program with GPUFlight

#performance #python #cuda #gpu

10 min read

May 20

TensorRT `trt.Dims` SIGSEGV inside a GStreamer Python plugin — root cause and fix

#tensorrt #gstreamer #python #cuda

4 min read

May 16

Calling CUDA from Go without cgo

#ai #softwareengineering #go #cuda

2 min read

Alan West

May 12

Why CUDA kernels silently corrupt memory and how to catch the bug

#cuda #rust #debugging #gpu

5 min read

May 4

CUDA Out of Memory at 60% Utilization: Tracing PyTorch GPU Memory Fragmentation

#gpu #cuda #pytorch #debugging

4 min read

Anton

Apr 29

How I optimized a Solana vanity address grinder to 44M keys/sec on GPU

#cuda #solana #gpu #cryptocurrency

2 min read

aa24aa

Apr 22

From Black Magic to Science: The Evolution of the CUDA Optimization Skill

#cuda #agents #cutlass #triton

11 min read

cookie

Apr 22

Learning Resources Tech

#webdev #cuda #programming #beginners

1 min read

Apr 12

512MiB ≠ 512MB — the silent trtexec bug

#tensorrt #jetson #cuda #debugging

2 min read

Apr 9

Memory Coalescing: Same computation, 6x Performance Difference

#cuda #gpu #aiops #cpp

6 min read

Apr 6

Setting Up NVIDIA Drivers and CUDA for ML/DL on Ubuntu 22.04

#nvidia #cuda #ubuntu #machinelearning

3 min read

owly

Apr 7

Achieving Neuro‑Sama‑Tier Speech‑to‑Text for Your Local AI Companion (Whisper + CUDA + LivinGrimoire)

#whisper #designpatterns #python #cuda

5 min read

Apr 8

CUDA Graphs: The 8-Year Overnight Success and the Observability Gap

#cuda #gpu #ebpf #ai

9 min read

Apr 1

124x Slower: What PyTorch DataLoader Actually Does at the Kernel Level

#pytorch #gpu #python #cuda

5 min read
