Forem

# mlsystems

Posts

MR‑GRPO in Practice: The Reward Mixer That Stops CLIP From Lying to Your Scene Compiler
8 min read
The Closed‑Loop Consistency Trick: Keeping Scene 12 Faithful to Scene 1 Without Global Memory
10 min read
Notification Adjudication in My Ops Intelligence Agent: Canonical Events, Cheap Arbitration, and a Sender That Refuses to Spam
9 min read
Phase 2 Calibration: Per‑Category OOD Thresholds + Group‑Relative Reward Normalization in My Scene Compiler
10 min read