Forem

Machine Learning

A branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy.

Posts

đź‘‹ Sign in for the ability to sort posts by relevant, latest, or top.
How do you know that? Teaching Generative Language Models to Reference Answers to Biomedical Questions
Cover image for How do you know that? Teaching Generative Language Models to Reference Answers to Biomedical Questions

How do you know that? Teaching Generative Language Models to Reference Answers to Biomedical Questions

Comments
4 min read
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Cover image for Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

2
Comments
3 min read
PaliGemma: A versatile 3B VLM for transfer
Cover image for PaliGemma: A versatile 3B VLM for transfer

PaliGemma: A versatile 3B VLM for transfer

Comments
4 min read
LoRA+: Efficient Low Rank Adaptation of Large Models

LoRA+: Efficient Low Rank Adaptation of Large Models

Comments
3 min read
Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows

Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows

Comments
5 min read
ColPali: Efficient Document Retrieval with Vision Language Models
Cover image for ColPali: Efficient Document Retrieval with Vision Language Models

ColPali: Efficient Document Retrieval with Vision Language Models

3
Comments
4 min read
Reasoning in Large Language Models: A Geometric Perspective
Cover image for Reasoning in Large Language Models: A Geometric Perspective

Reasoning in Large Language Models: A Geometric Perspective

Comments
5 min read
Volumetric Rendering with Baked Quadrature Fields

Volumetric Rendering with Baked Quadrature Fields

Comments
3 min read
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards
Cover image for When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

Comments
4 min read
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
Cover image for LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control

LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control

1
Comments
3 min read
OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training
Cover image for OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training

OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training

Comments
3 min read
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Cover image for A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models

A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models

Comments
4 min read
X-ray Made Simple: Radiology Report Generation and Evaluation with Layman's Terms
Cover image for X-ray Made Simple: Radiology Report Generation and Evaluation with Layman's Terms

X-ray Made Simple: Radiology Report Generation and Evaluation with Layman's Terms

Comments
4 min read
Mooncake: Kimi's KVCache-centric Architecture for LLM Serving
Cover image for Mooncake: Kimi's KVCache-centric Architecture for LLM Serving

Mooncake: Kimi's KVCache-centric Architecture for LLM Serving

1
Comments 1
4 min read
Testing AI on language comprehension tasks reveals insensitivity to underlying meaning

Testing AI on language comprehension tasks reveals insensitivity to underlying meaning

Comments
4 min read
đź‘‹ Sign in for the ability to sort posts by relevant, latest, or top.