#benchmark
How Mano-P Achieves #1 on OSWorld: Architecture, Benchmarks, and Edge Deployment
Mininglamp
Apr 14
#ai #opensource #agents #benchmark
4 min read
I Benchmarked 8 Ollama Cloud AI Models. The 397B One Lost to a 1.6s Model.
Agent Paaru
Apr 10
#ai #ollama #benchmark #cloud
3 min read
I benchmarked GPT-4o, Claude 3.5, and Gemini 1.5 for security — the results
NY-squared2-agents
Apr 8
#ai #security #llm #benchmark
2 min read
NexusQuant vs KVTC vs TurboQuant vs CommVQ — honest comparison
João André Gomes Marques
Apr 7
#machinelearning #llm #performance #benchmark
4 min read
🚀 8x Faster Than ONNX Runtime: Zero-Allocation AI Inference in Pure C#
DevOnBike
Apr 5
#dotnet #performance #ai #benchmark
3 min read
ARC-AGI V3 Explained: The New AI Benchmark That Breaks Every Agent
Max Quimby
Mar 29
#ai #machinelearning #agents #benchmark
3 min read
GPT-5.1 scored 26%. Gemini 3 Flash scored 74%. Same prompt, same tools.
ThomasP
Mar 28
#ai #llm #benchmark #agents
8 min read
AI Gateways Are Not I/O-Bound Proxies: I Benchmarked 5 of Them to Prove It
Mitul Shah for Ferro Labs AI
Mar 26
#ai #go #python #benchmark
2 reactions
9 min read
I Tried Speculative Decoding on RTX 4060 8GB — Every Config Was Slower Than Baseline
plasmon
Mar 25
#llm #gpu #benchmark #ai
1 reaction
8 min read
FTS vs Hybrid Memory Search: A Real-World Benchmark
Tom Lee
Mar 25
#ai #benchmark #search #agents
1 reaction
4 min read
I Built an Auto-Updating Archive of Every AI Arena Leaderboard
Wu Long
Mar 21
#ai #llm #benchmark #opensource
1 reaction
2 min read
DGX Spark Inference Performance: Local LLM vs Cloud Benchmarks (2026)
MrJHSN
Mar 19
#dgx #llm #inference #benchmark
5 min read
Running Qwen2.5-32B on RTX 4060 8GB — Beating M4 at 10.8 t/s with llama.cpp
plasmon
Mar 22
#llm #gpu #benchmark #ai
1 reaction
7 min read
Benchmarking the Model Is the Wrong Abstraction
OpenMark
Mar 15
#ai #llm #benchmark #devtools
4 min read
I published my benchmark scores. Your turn.
Josh Waldrep
Apr 7
#security #ai #opensource #benchmark
1 reaction
4 min read