#gpu
LLM Auto-Tunes llama.cpp, SASS Latency Analysis, DLSS Frame Gen for RTX 40
soy · Apr 14 · #gpu #nvidia #hardware · 3 min read

"Just Add More VRAM" Is Physically Wrong: What HBM, CXL, and Unified Memory Couldn't Capture
plasmon · Apr 14 · #llm #gpu #vram · 4 min read

llama.cpp Settings Can Change 8GB Performance by 5x: Finding Optimal Values for the Major Options
plasmon · Apr 14 · #llm #llamacpp #gpu · 4 min read

One Query, Four GPUs: Tracing a Distributed Training Stall Across Nodes
Ingero Team · Apr 13 · #gpu #ebpf #distributedcomputing · 7 min read

Task Manager Is Lying About Your GPU Temps: How to Read the Real Data in Python
Yaroslav Pristupa · Apr 13 · #ai #hardware #softwaredevelopment #gpu · 4 min read

AMD ML Complete Stack
compilersutra · Apr 12 · #gpu #cpu #ai #llm · 1 min read

RTX 5090 cuBLAS Bug, Neural Texture Compression, Multi-GPU vLLM Inference
soy · Apr 11 · #gpu #nvidia #hardware · 3 min read

CUDA SGEMM Bug on RTX 5090, Kernel-Fusing for SGEMV, and Radeon RX 9070 XT Price Surge
soy · Apr 10 · #gpu #nvidia #hardware · 4 min read

TGI (Text Generation Inference): Install, Config, Troubleshoot
Rost · Apr 10 · #docker #gpu #observability #selfhosting · 9 min read

Memory Coalescing: Same Computation, 6x Performance Difference
Myoungho Shin · Apr 9 · #cuda #gpu #aiops #cpp · 6 min read

LLM GPU Breakthroughs: RT Cores, Llama.cpp Parallelism, AMD Optimizations
soy · Apr 9 · #gpu #nvidia #hardware · 3 min read

How to Train a 100B+ Parameter Model When You Can't Afford a GPU Cluster
Alan West · Apr 9 · #machinelearning #deeplearning #python #gpu · 1 comment · 5 min read

How K-Means Clustering Works (Explained by Extracting Colors from Images)
Francesco Di Donato · Apr 9 · #webgl #machinelearning #javascript #gpu · 1 reaction · 3 min read

How I Stopped GGUF Models From Crashing My GPU: A Pre-flight VRAM Check
Dmytro Romanov · Apr 8 · #localllm #gpu #machinelearning #python · 4 min read

99.8% of LLM Inference Power Isn't Spent on Computation
plasmon · Apr 8 · #llm #gpu #hardware #ai · 7 min read