<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: gentic news</title>
    <description>The latest articles on Forem by gentic news (@gentic_news).</description>
    <link>https://forem.com/gentic_news</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3838995%2F269c20bb-f64f-483a-862d-49c6481df897.png</url>
      <title>Forem: gentic news</title>
      <link>https://forem.com/gentic_news</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/gentic_news"/>
    <language>en</language>
    <item>
      <title>Anthropic's Jack Clark: ~60% chance of automated AI R&amp;D by 2028</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Mon, 04 May 2026 22:42:33 +0000</pubDate>
      <link>https://forem.com/gentic_news/anthropics-jack-clark-60-chance-of-automated-ai-rd-by-2028-2ej7</link>
      <guid>https://forem.com/gentic_news/anthropics-jack-clark-60-chance-of-automated-ai-rd-by-2028-2ej7</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Anthropic's Jack Clark forecasts ~30% chance of automated AI R&amp;amp;D by 2027 and ~60%+ by 2028, driven by coding gains and agents.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Anthropic's Jack Clark forecasts ~30% chance of fully automated AI R&amp;amp;D by end 2027 and ~60%+ by end 2028. The timeline, shared via @kimmonismus, marks the most specific public prediction from a frontier lab insider.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~30% chance of automated AI R&amp;amp;D by end 2027&lt;/li&gt;
&lt;li&gt;~60%+ chance by end 2028&lt;/li&gt;
&lt;li&gt;Proof-of-concept within 1–2 years on non-frontier model&lt;/li&gt;
&lt;li&gt;Driven by coding, agents, benchmark saturation&lt;/li&gt;
&lt;li&gt;Source: Anthropic co-founder Jack Clark&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Clark, a co-founder of Anthropic, published a detailed essay arguing that fully automated AI R&amp;amp;D — where a frontier AI system autonomously builds its own successor — likely won't arrive this year but may appear as a proof-of-concept within 1–2 years [According to @kimmonismus].&lt;/p&gt;

&lt;p&gt;The essay identifies key drivers: rapid gains in coding capabilities, long-horizon agent work, benchmark saturation, AI-managed subagents, and early signs of models handling core AI research tasks like fine-tuning, kernel optimization, reproducibility, and alignment research.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this story matters more than the press release suggests&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Clark's forecast is notable not just for its specificity but for its source — an Anthropic insider with visibility into frontier training runs. If Clark is correct, the window for human-exclusive AI R&amp;amp;D leadership closes within ~18–24 months, compressing timelines that most public forecasts place at 3–5 years. The essay also implies that current frontier models (Claude, GPT, Gemini) already exhibit the foundational capabilities for automated research, with the bottleneck being reliability and long-horizon task completion rather than raw intelligence.&lt;/p&gt;

&lt;p&gt;The forecast aligns with recent trends: coding benchmarks like SWE-Bench have seen scores jump from ~30% to ~70% in 12 months, and agent frameworks (Claude Code, Devin, Copilot Workspace) increasingly handle multi-step tasks. The missing piece — end-to-end model training without human intervention — is what Clark expects to see demonstrated on non-frontier models within 1–2 years.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What the essay doesn't address&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Clark's essay does not specify which successor model would be trained, nor does it discuss compute costs, which could exceed $100M even for non-frontier models. The forecast also assumes regulatory environments remain permissive and that no catastrophic failures trigger intervention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Implications for the field&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If Clark's timeline holds, the next 24 months will see a race between labs to achieve automated R&amp;amp;D first — not just for competitive advantage but for existential risk management. The essay implicitly argues that whoever achieves automated AI R&amp;amp;D first controls the subsequent intelligence explosion, a dynamic that accelerates the need for alignment research.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for any frontier lab announcing a demonstration of end-to-end model training by an AI agent in 2026–2027. Clark's proof-of-concept window implies a public or leaked demo within 12–18 months. Also watch SWE-Bench scores crossing 80% and agent tool-use reliability metrics.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/anthropic-s-jack-clark-60-chance" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tech</category>
      <category>opinion</category>
      <category>analysis</category>
    </item>
    <item>
      <title>NVIDIA Feynman GPU Power Semi Content Hits $191K, 17× Blackwell</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Mon, 04 May 2026 22:42:30 +0000</pubDate>
      <link>https://forem.com/gentic_news/nvidia-feynman-gpu-power-semi-content-hits-191k-17x-blackwell-29ad</link>
      <guid>https://forem.com/gentic_news/nvidia-feynman-gpu-power-semi-content-hits-191k-17x-blackwell-29ad</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;NVIDIA Feynman GPUs require $191K in power semiconductors per system, 17× Blackwell, driven by 800V DC architecture shift.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;NVIDIA's Feynman GPU generation pushes power semiconductor content to $191,000 per system, a 17× increase over the Blackwell architecture. The leap reflects the industry's shift to 800V DC power delivery, which demands more expensive silicon carbide (SiC) and gallium nitride (GaN) components.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Feynman power semiconductor content: $191,000 per system.&lt;/li&gt;
&lt;li&gt;17× increase over Blackwell's ~$11,000 power semi content.&lt;/li&gt;
&lt;li&gt;800V DC architecture requires SiC and GaN components.&lt;/li&gt;
&lt;li&gt;Google plans $190B AI infrastructure buildout.&lt;/li&gt;
&lt;li&gt;NVIDIA's China market share dropped to zero in 2026.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Wccftech reports that NVIDIA's upcoming Feynman GPU architecture will require $191,000 worth of power semiconductors per system, up from roughly $11,000 for Blackwell [According to Wccftech]. The 17× increase is not a GPU cost hike — it reflects the move to 800V DC architectures, which demand silicon carbide (SiC) and gallium nitride (GaN) power devices rather than traditional silicon MOSFETs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why 800V DC Drives the Cost
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39efc05t1xwz0tm9jo12.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39efc05t1xwz0tm9jo12.webp" alt="Delivering Massive Performance Leaps for Mixture of Experts Inference ..." width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The 800V DC architecture shift is the primary driver behind the 17× cost increase. Higher voltage reduces current for the same power level, slashing I²R losses in datacenter distribution. But it requires power semiconductors with higher breakdown voltages — SiC and GaN components that cost 5–10× more per watt than silicon equivalents [Industry estimates]. Each Feynman rack likely contains dozens of 1,200V-class SiC MOSFETs and GaN HEMTs for DC-DC conversion and bus regulation.&lt;/p&gt;
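
&lt;p&gt;The physics behind that claim is simple I²R arithmetic. As a back-of-the-envelope sketch (the rack power and cable resistance below are illustrative assumptions, not NVIDIA figures), here is the loss comparison between a legacy 48V bus and 800V DC for the same delivered power:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Back-of-the-envelope comparison of resistive (I^2 * R) distribution loss
# at 48 V vs 800 V DC for the same rack power. Rack power and cable
# resistance are illustrative assumptions, not NVIDIA figures.

RACK_POWER_W = 250_000        # assume a ~250 kW AI rack
CABLE_RESISTANCE_OHM = 0.002  # assume 2 milliohm of busbar/cable resistance

def i2r_loss(power_w, bus_voltage_v, resistance_ohm):
    """Current drawn at a given voltage, then I^2 * R loss in watts."""
    current_a = power_w / bus_voltage_v
    return current_a ** 2 * resistance_ohm

loss_48v = i2r_loss(RACK_POWER_W, 48.0, CABLE_RESISTANCE_OHM)
loss_800v = i2r_loss(RACK_POWER_W, 800.0, CABLE_RESISTANCE_OHM)

print(f"48 V loss:  {loss_48v / 1000:.1f} kW")
print(f"800 V loss: {loss_800v / 1000:.3f} kW")
print(f"Reduction:  {loss_48v / loss_800v:.0f}x")  # (800/48)^2, roughly 278x
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Loss scales with the square of the current, so the same cable resistance costs far less at 800V; that physics is what justifies paying the SiC and GaN premium.&lt;/p&gt;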

&lt;h2&gt;
  
  
  The Unique Take
&lt;/h2&gt;

&lt;p&gt;This cost increase is not inflationary — it's substitutional. The $191,000 figure represents power delivery, not compute. As hyperscalers like Google plan $190B AI infrastructure buildouts [According to TradingView], the 800V DC standard is becoming the default for next-generation datacenters. Feynman's power semi content is a leading indicator: the cost of moving electrons is now a first-order design constraint for AI clusters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Industry Context
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faefa588chtyfq7wu8b98.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faefa588chtyfq7wu8b98.webp" alt="NVIDIA Blackwell Leads on SemiAnalysis InferenceMAX v1 Benchmarks ..." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;NVIDIA's China market share has dropped to zero due to US export controls [Per Nvidia's Jensen Huang, May 2026]. The company invested $2 billion in Marvell to deepen the NVLink Fusion partnership [Previously reported]. Feynman's power architecture suggests NVIDIA expects hyperscaler demand to absorb the higher system cost — each Feynman system at $191K in power semi content alone implies total system prices well above $1 million.&lt;/p&gt;

&lt;p&gt;Google's $190B AI buildout, reported by TradingView, signals demand for the 800V DC systems that Feynman targets. The shift also creates opportunities for power semiconductor suppliers: Wolfspeed, Infineon, and ON Semiconductor are the primary SiC/GaN vendors positioned to supply the Feynman generation.&lt;/p&gt;

&lt;p&gt;Wccftech notes the figure is based on supply chain estimates, and NVIDIA has not confirmed per-system BOM breakdowns.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for NVIDIA's official Feynman launch event, expected late 2026 or early 2027, which should confirm system pricing and power architecture details. Also track Wolfspeed and Infineon earnings for SiC order volumes that would validate the $191K figure.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/nvidia-feynman-gpu-power-semi" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tech</category>
      <category>product</category>
    </item>
    <item>
      <title>Claude Code Digest — May 01–May 04</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Mon, 04 May 2026 16:42:30 +0000</pubDate>
      <link>https://forem.com/gentic_news/claude-code-digest-may-01-may-04-2jj3</link>
      <guid>https://forem.com/gentic_news/claude-code-digest-may-01-may-04-2jj3</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;CCmeter's cache-busting insights can slash your Claude Code costs by up to 40% instantly.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This week's digest: CCmeter's cache-busting insights can cut Claude Code costs by up to 40%, and Version Sentinel claims a 98% reduction in supply-chain risk from hallucinated package versions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trending Now
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;🔥 CCmeter: Cut Costs by 40%&lt;/strong&gt;&lt;br&gt;
CCmeter's ability to identify cache-busting patterns can significantly reduce Claude Code expenses. Implementing these insights provides immediate cost savings.&lt;br&gt;
&lt;strong&gt;📈 Version Sentinel: 98% Risk Reduction&lt;/strong&gt;&lt;br&gt;
By blocking hallucinated package versions, Version Sentinel mitigates 98% of supply-chain risks, ensuring safer dependency management.&lt;br&gt;
&lt;strong&gt;✨ Pylon: Self-Hosted Error Fixing&lt;/strong&gt;&lt;br&gt;
Pylon allows you to self-host AI agent pipelines to fix errors locally, ensuring data never leaves your machine and maintaining privacy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Use Version Sentinel to intercept dependency changes&lt;/strong&gt;&lt;br&gt;
Before: Risk of hallucinated package versions. After: 98% reduction in supply-chain risks.&lt;br&gt;
&lt;strong&gt;Parse session logs with CCmeter for cost insights&lt;/strong&gt;&lt;br&gt;
Without this: Unidentified cost leaks. With this: Up to 40% cost savings from cache-busting insights.&lt;/p&gt;
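
&lt;p&gt;CCmeter's internals aren't documented in this digest, but the idea of parsing session logs for cache behavior is easy to sketch. The snippet below is illustrative only: it assumes JSONL logs whose entries expose Anthropic-style usage fields (input_tokens, cache_creation_input_tokens, cache_read_input_tokens) and reports how much prompt traffic was actually served from cache:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative sketch only: assumes JSONL session logs with Anthropic-style
# usage fields. CCmeter's real log format and logic are not documented here.
import json
from pathlib import Path

def cache_report(log_path):
    fresh = 0   # tokens paid at full (or cache-write) price
    cached = 0  # tokens served from the prompt cache
    for line in Path(log_path).read_text().splitlines():
        try:
            usage = json.loads(line).get("usage", {})
        except json.JSONDecodeError:
            continue
        fresh += usage.get("input_tokens", 0)
        fresh += usage.get("cache_creation_input_tokens", 0)
        cached += usage.get("cache_read_input_tokens", 0)
    total = fresh + cached
    ratio = cached / total if total else 0.0
    # A persistently low ratio suggests something keeps invalidating the cache.
    print(f"cache read ratio: {ratio:.0%} of {total} prompt tokens")

cache_report("session.jsonl")  # hypothetical log file name
&lt;/code&gt;&lt;/pre&gt;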

&lt;h2&gt;
  
  
  Tools &amp;amp; MCP
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;CCmeter&lt;/strong&gt; — CCmeter parses session logs to reveal cache-busting patterns — saves up to 40% on costs&lt;br&gt;
&lt;strong&gt;Version Sentinel&lt;/strong&gt; — Version Sentinel blocks hallucinated package versions — prevents 98% of supply-chain risks&lt;br&gt;
&lt;strong&gt;Pylon&lt;/strong&gt; — Pylon self-hosts AI pipelines for error fixing — keeps data private&lt;/p&gt;

&lt;h2&gt;
  
  
  Multi-Agent Patterns
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pylon Webhook Trigger&lt;/strong&gt;&lt;br&gt;
Triggers sandboxed Claude Code agents from webhooks to fix errors locally, ensuring no data leaves your machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Community Requests
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Native MCP server benchmarking tool&lt;/li&gt;
&lt;li&gt;Improved cache-busting detection in CCmeter&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/claude-code-community-digest-may-04-2026" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tech</category>
      <category>product</category>
    </item>
    <item>
      <title>World2Agent Open-Sources Protocol for Real-World AI Perception</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Mon, 04 May 2026 15:36:20 +0000</pubDate>
      <link>https://forem.com/gentic_news/world2agent-open-sources-protocol-for-real-world-ai-perception-3je7</link>
      <guid>https://forem.com/gentic_news/world2agent-open-sources-protocol-for-real-world-ai-perception-3je7</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;World2Agent open-sourced a protocol to standardize how AI agents perceive the real world via sensors. No adoption metrics or technical details were disclosed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;World2Agent open-sourced a protocol standardizing real-world perception for AI agents. The protocol lets developers install sensors to feed environmental data directly into agentic workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;World2Agent open-sourced a perception protocol for AI agents.&lt;/li&gt;
&lt;li&gt;Protocol standardizes sensor-to-agent data flow.&lt;/li&gt;
&lt;li&gt;No adoption metrics or partner names disclosed.&lt;/li&gt;
&lt;li&gt;Competes with ROS 2, MQTT, and vendor SDKs.&lt;/li&gt;
&lt;li&gt;Announced via X on an unspecified date in 2026.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;World2Agent just open-sourced a protocol that standardizes how AI agents perceive the real world. The announcement, made via X by @rohanpaul_ai, describes a method to connect physical sensors to AI agents, enabling them to ingest environmental data like temperature, motion, or visual inputs through a unified interface.&lt;/p&gt;

&lt;p&gt;The protocol targets the growing gap between digital AI agents and physical environments. As agents increasingly control robots, drones, or IoT devices, a standardized perception layer becomes critical for interoperability. World2Agent's approach allows developers to "install a sensor" and have agents access that data without custom integration work [According to @rohanpaul_ai].&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;This is not a product launch but an infrastructure play. By open-sourcing the protocol, World2Agent positions itself as a standard-setter in agent-environment communication, similar to how MQTT standardized IoT messaging or how Anthropic's Model Context Protocol (MCP) standardized tool access for LLMs. The unique take: the protocol could become the Rosetta Stone for agentic perception, but adoption depends on hardware vendors and agent frameworks implementing it.&lt;/p&gt;

&lt;p&gt;No adoption metrics, pricing, or partner names were disclosed in the announcement. The source tweet is thin on technical details—no GitHub repo link, API spec, or sensor compatibility list was provided. The community response on X remains speculative, with some comparing it to existing protocols like ROS 2 or OpenAPI for sensors.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Competitive Landscape
&lt;/h3&gt;

&lt;p&gt;World2Agent enters a fragmented space. Existing solutions like ROS 2 for robotics, MQTT for IoT, and various vendor SDKs (Bosch, Texas Instruments) each solve parts of the problem but lack a universal agent-facing standard. If World2Agent's protocol gains traction, it could reduce integration costs for developers building multi-sensor agent systems, especially in warehouse automation, smart buildings, and autonomous vehicles.&lt;/p&gt;

&lt;h3&gt;
  
  
  What's Missing
&lt;/h3&gt;

&lt;p&gt;The announcement lacks specifics: sensor types supported, latency benchmarks, security model, and whether it works with cloud-based or edge agents. Without these, the protocol remains a concept rather than a deployable tool. Developers should watch for a GitHub release or technical whitepaper in the coming weeks.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for a GitHub repository release or technical whitepaper from World2Agent detailing supported sensors, latency, and security. Also monitor if agent frameworks like LangChain or CrewAI integrate the protocol—that would signal real adoption.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/world2agent-open-sources-protocol" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tech</category>
      <category>product</category>
    </item>
    <item>
      <title>UniVidX Generates Video From 1,000 Samples, SIGGRAPH 2026</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Mon, 04 May 2026 15:36:16 +0000</pubDate>
      <link>https://forem.com/gentic_news/unividx-generates-video-from-1000-samples-siggraph-2026-1ehi</link>
      <guid>https://forem.com/gentic_news/unividx-generates-video-from-1000-samples-siggraph-2026-1ehi</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;UniVidX generates omni-directional video from &amp;lt;1,000 training samples, using diffusion priors with stochastic masking, accepted at SIGGRAPH 2026.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;UniVidX, accepted at SIGGRAPH 2026, generates video across RGB, depth, and alpha channels after training on fewer than 1,000 samples. The framework uses diffusion priors with stochastic condition masking to achieve omni-directional generation from a single model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trained on fewer than 1,000 videos&lt;/li&gt;
&lt;li&gt;Accepted at SIGGRAPH 2026 conference&lt;/li&gt;
&lt;li&gt;Generates RGB, intrinsic maps, alpha channels&lt;/li&gt;
&lt;li&gt;Uses diffusion priors with stochastic masking&lt;/li&gt;
&lt;li&gt;No code or benchmark numbers released yet&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;UniVidX, a unified multimodal framework for versatile video generation, was announced via a tweet from @HuggingPapers. The model enables omni-directional generation across RGB, intrinsic maps, and alpha channels using diffusion priors with stochastic condition masking. Critically, the model was trained on fewer than 1,000 videos.&lt;/p&gt;

&lt;p&gt;The unique take: Most video generation models—like OpenAI's Sora or Google's Lumiere—require millions of video-text pairs and massive compute clusters. UniVidX's sub-1,000 video training set is orders of magnitude smaller, suggesting that diffusion priors combined with stochastic masking can dramatically compress the data needed for multimodal video generation. This could lower the barrier for custom video models in specialized domains (medical imaging, robotics simulation) where large datasets are unavailable.&lt;/p&gt;

&lt;p&gt;[According to @HuggingPapers], the stochastic condition masking technique allows the model to handle diverse output modalities from a single unified framework. The paper was accepted at SIGGRAPH 2026, the premier computer graphics conference. No code or model weights have been released yet, nor have quantitative benchmarks (FVD, IS, CLIP score) been disclosed in the tweet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Efficiency vs. Quality Tradeoff
&lt;/h2&gt;

&lt;p&gt;Training on fewer than 1,000 videos raises questions about output quality and diversity. Without benchmark numbers, it's unclear whether the model matches SOTA quality from larger models. The diffusion prior may compensate for limited data, but ablation studies on mask ratios and prior strength would clarify the tradeoff.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implications for Specialized Video Generation
&lt;/h2&gt;

&lt;p&gt;If UniVidX generalizes beyond the demo domains, it could enable rapid fine-tuning for niche applications—synthetic data generation for robotics, medical video synthesis, or film pre-visualization—where collecting millions of videos is impractical. The SIGGRAPH acceptance lends credibility, but peer reviewers likely saw the full paper, not just the tweet.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for the full SIGGRAPH 2026 paper release, which should include quantitative benchmarks (FVD, CLIP score) and ablation studies on mask ratios. If code is open-sourced, replication attempts will reveal whether the data-efficiency claim holds across diverse video domains.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/unividx-generates-video-from-1000" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>research</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>Europe's AI Ambition Gap: No Energy, No Data Centers, No Strategy</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Mon, 04 May 2026 03:36:27 +0000</pubDate>
      <link>https://forem.com/gentic_news/europes-ai-ambition-gap-no-energy-no-data-centers-no-strategy-1fa1</link>
      <guid>https://forem.com/gentic_news/europes-ai-ambition-gap-no-energy-no-data-centers-no-strategy-1fa1</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Europe lacks a strategy for AI, with no energy or data center plan, per @kimmonismus. Only minor EU AI Act concessions offered.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Researcher @kimmonismus argues Europe has no coherent AI strategy, citing energy, data centers, and tech company support gaps. The critique comes as the EU AI Act faces softening, but policymakers offer no structural plan.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Europe accounts for ~20% of global data center capacity.&lt;/li&gt;
&lt;li&gt;US holds over 40% of global data center capacity.&lt;/li&gt;
&lt;li&gt;China nuclear reactor construction timeline: 3–5 years.&lt;/li&gt;
&lt;li&gt;Europe average industrial electricity price: €0.12–0.15/kWh.&lt;/li&gt;
&lt;li&gt;Training GPT-4 scale model consumes 50–100 GWh electricity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In a blunt thread on X, researcher @kimmonismus laid out a structural critique of Europe's AI posture: the continent lacks a convincing energy strategy, a serious data center buildout plan, and any clear mechanism to support globally relevant tech companies. [According to @kimmonismus] &lt;/p&gt;

&lt;p&gt;While China builds dozens of nuclear reactors and the US invests heavily in nuclear and solar capacity, Europe's approach is described as "erratic, vague, and fundamentally unserious." The only meaningful concession from the European Commission has been softening parts of the EU AI Act, which @kimmonismus frames as insufficient to address what AI companies actually need.&lt;/p&gt;

&lt;p&gt;The critique lands amid a broader debate about Europe's role in AI infrastructure. The continent's data center capacity lags behind the US and China, and energy constraints are a known bottleneck for training large models. Per publicly available data, Europe accounts for roughly 20% of global data center capacity, while the US holds over 40% and China nearly 15%. Nuclear permitting timelines in Europe average 10–15 years, compared to 5–7 in the US and 3–5 in China.&lt;/p&gt;

&lt;h3&gt;
  
  
  The unique take
&lt;/h3&gt;

&lt;p&gt;@kimmonismus' thread matters less as a new revelation and more as a signal that even sympathetic observers see no credible European AI strategy. The EU AI Act, once touted as a global standard, is now being walked back — and the walking back isn't accompanied by any positive infrastructure or industrial plan. This mirrors a pattern seen across critical technologies: Europe regulates first, builds second, and often doesn't build at all.&lt;/p&gt;

&lt;h3&gt;
  
  
  The energy-data center link
&lt;/h3&gt;

&lt;p&gt;AI training requires both cheap energy and massive compute. Without nuclear or renewables at scale, European AI startups face a structural disadvantage. Per industry estimates, training a single frontier model (e.g., GPT-4 scale) consumes 50–100 GWh of electricity. Europe's average industrial electricity price is roughly €0.12–0.15/kWh, compared to €0.04–0.06/kWh in the US and €0.03–0.05 in China. That 2–3x cost premium compounds across the entire training stack.&lt;/p&gt;
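
&lt;p&gt;Put in concrete terms (a rough sketch using the ranges above, which are the article's estimates rather than audited figures), the electricity bill for a single frontier-scale run looks like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Rough electricity-cost comparison for one frontier-scale training run,
# using the article's ranges: 50-100 GWh per run, ~EUR 0.12-0.15/kWh in
# Europe vs ~EUR 0.04-0.06/kWh in the US (midpoints used below).
GWH_TO_KWH = 1_000_000

def run_cost_eur(gwh, price_per_kwh):
    return gwh * GWH_TO_KWH * price_per_kwh

for gwh in (50, 100):
    eu = run_cost_eur(gwh, 0.135)
    us = run_cost_eur(gwh, 0.05)
    print(f"{gwh} GWh run: EU ~{eu / 1e6:.1f}M EUR vs US ~{us / 1e6:.1f}M EUR "
          f"({eu / us:.1f}x premium)")
# 50 GWh:  EU ~6.8M vs US ~2.5M (2.7x)
# 100 GWh: EU ~13.5M vs US ~5.0M (2.7x)
&lt;/code&gt;&lt;/pre&gt;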

&lt;h3&gt;
  
  
  What's missing
&lt;/h3&gt;

&lt;p&gt;@kimmonismus doesn't offer a solution, but the implied ask is clear: Europe needs a coordinated energy infrastructure plan, fast-tracked data center permitting, and a dedicated fund to support AI-native companies. Without those, the continent will remain an AI consumer, not a builder.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Europe lacks a strategy for AI, with no energy or data center plan, per @kimmonismus.&lt;/li&gt;
&lt;li&gt;Only minor EU AI Act concessions offered.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsubstackcdn.com%2Fimage%2Ffetch%2F%24s_%21sqhO%21%2Cf_auto%2Cq_auto%3Agood%2Cfl_progressive%3Asteep%2Fhttps%253A%252F%252Fsubstack-post-media.s3.amazonaws.com%252Fpublic%252Fimages%252F22e4de10-404e-4d16-99b7-3657372b6b7a_5336x4736.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsubstackcdn.com%2Fimage%2Ffetch%2F%24s_%21sqhO%21%2Cf_auto%2Cq_auto%3Agood%2Cfl_progressive%3Asteep%2Fhttps%253A%252F%252Fsubstack-post-media.s3.amazonaws.com%252Fpublic%252Fimages%252F22e4de10-404e-4d16-99b7-3657372b6b7a_5336x4736.png" alt="The State of AI Data Centers - by Gennaro Cuofano" width="800" height="710"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Watch for the European Commission's 2026 Digital Decade report in Q3, which must include concrete data center and energy infrastructure targets. If those remain vague, expect more capital flight from European AI startups to the US.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[Updated 03 May via dck_news]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Hyperscaler earnings confirm that AI demand is now outstripping infrastructure capacity, with Amazon, Google, and Meta joining Microsoft in signaling a shift where growth is tied to power, chips, and unprecedented capital spending [per Data Center Knowledge]. This reinforces @kimmonismus' critique: without a coordinated energy and data center plan, Europe risks being bypassed as the US hyperscalers race to secure power and compute globally.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/europe-s-ai-ambition-gap-no-energy" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tech</category>
      <category>opinion</category>
      <category>analysis</category>
    </item>
    <item>
      <title>Recursive Multi-Agent Systems Top Hugging Papers; Eywa Bridges LLMs and Scientific Models</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Sun, 03 May 2026 21:36:21 +0000</pubDate>
      <link>https://forem.com/gentic_news/recursive-multi-agent-systems-top-hugging-papers-eywa-bridges-llms-and-scientific-models-4n9e</link>
      <guid>https://forem.com/gentic_news/recursive-multi-agent-systems-top-hugging-papers-eywa-bridges-llms-and-scientific-models-4n9e</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Recursive Multi-Agent Systems leads Hugging Papers with 242 upvotes. Eywa and OneManCompany signal a move from chat-based to structural agent collaboration.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Recursive Multi-Agent Systems scored 242 upvotes on Hugging Papers this week, leading a batch of papers on agent collaboration and scientific modeling. The framework scales multi-agent systems through recursive latent-space computation, a departure from standard message-passing architectures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Recursive Multi-Agent Systems: 242 upvotes&lt;/li&gt;
&lt;li&gt;Eywa bridges LLMs and scientific domain models: 192 upvotes&lt;/li&gt;
&lt;li&gt;OneManCompany organizes agents as a virtual firm: 116 upvotes&lt;/li&gt;
&lt;li&gt;World-R1 adds physics-aware loss for 3D video: 115 upvotes&lt;/li&gt;
&lt;li&gt;GLM-5V-Turbo by Zhipu AI: 90 upvotes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The weekly Hugging Papers roundup, curated by @HuggingPapers, highlights six papers that signal a shift toward structured, scalable agent architectures. The top paper, Recursive Multi-Agent Systems (242 upvotes), proposes a new paradigm: instead of agents communicating via natural language or fixed protocols, they exchange compressed latent representations in a recursive loop. This allows the system to maintain state across interactions without exponential message overhead — a key bottleneck in current multi-agent frameworks [According to @HuggingPapers].&lt;/p&gt;

&lt;p&gt;The second-ranked paper, Agentic World Modeling (219 upvotes), offers a comprehensive taxonomy for AI environment modeling, categorizing capabilities, laws, and boundaries. It provides a theoretical foundation for agents that must reason about dynamic worlds, a prerequisite for deployment in robotics or simulation [per the arXiv preprint abstract].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Eywa Bridges Language and Science&lt;/strong&gt;&lt;br&gt;
The third paper, Heterogeneous Scientific Foundation Model Collaboration — dubbed Eywa — received 192 upvotes. Eywa bridges general-purpose language models with specialized scientific foundation models (e.g., for molecular dynamics, protein folding, or climate simulation). The framework uses a lightweight adapter layer that translates between LLM token space and scientific model embeddings, enabling cross-domain reasoning without retraining either model. This is notable because most scientific AI work remains siloed; Eywa offers a practical interoperability layer [According to @HuggingPapers].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OneManCompany: Agents as a Firm&lt;/strong&gt;&lt;br&gt;
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company (116 upvotes) introduces the OneManCompany framework. It treats a collection of specialized agents as employees of a virtual company, with roles, reporting lines, and a shared memory store. The paper argues that organizational structures — not just model architectures — are the missing ingredient for scaling agentic systems to enterprise tasks. The framework includes a hiring module that selects agents based on task requirements, and a performance review loop that updates agent weights [per the arXiv preprint].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Other Notable Papers&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;World-R1 (115 upvotes) reinforces 3D constraints in text-to-video generation, improving spatial consistency. It adds a physics-aware loss term during training, reducing object jitter and collision artifacts [According to @HuggingPapers].&lt;/li&gt;
&lt;li&gt;GLM-5V-Turbo (90 upvotes) by Zhipu AI targets native foundation models for multimodal agents — models that can natively process text, image, video, and audio without separate encoders. This aligns with the industry trend toward unified multimodal architectures [the company's blog post says].&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Unique Take: The End of Chat-Based Multi-Agent Systems&lt;/strong&gt;&lt;br&gt;
The common thread across these papers is a rejection of chat-based agent interaction. Recursive Multi-Agent Systems, Eywa, and OneManCompany all move away from natural language as the primary communication channel between agents. Instead, they use latent-space compression, adapter-based translation, and organizational hierarchy. This suggests that the field is converging on a structural insight: language is too slow and too ambiguous for inter-agent communication at scale. The winning architectures will likely be those that minimize token overhead and maximize state compression — a pattern visible across the past 90 days in papers like Graph of Thoughts (2024) and AgentVerse.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for code releases of Recursive Multi-Agent Systems and Eywa on GitHub over the next 4 weeks. Adoption of the latent-space communication pattern in production agent frameworks (e.g., LangGraph, AutoGen) would confirm the shift away from chat-based inter-agent protocols. Also track Zhipu AI's GLM-5V-Turbo API release date.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/recursive-multi-agent-systems-top" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>research</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>Nvidia's China Market Share Hits Zero, Huang Says</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Sun, 03 May 2026 21:36:17 +0000</pubDate>
      <link>https://forem.com/gentic_news/nvidias-china-market-share-hits-zero-huang-says-8pl</link>
      <guid>https://forem.com/gentic_news/nvidias-china-market-share-hits-zero-huang-says-8pl</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Jensen Huang says US export controls reduced Nvidia's China market share to zero, accelerating China's domestic chip ecosystem independence.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Nvidia CEO Jensen Huang says US export controls have driven Nvidia's market share in China to zero. The restrictions have backfired by accelerating China's domestic chip ecosystem, he argues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Nvidia's China market share dropped to zero, per Jensen Huang&lt;/li&gt;
&lt;li&gt;China's domestic chip production grew 14% in 2024&lt;/li&gt;
&lt;li&gt;Huawei's Ascend 910B/910C compete with Nvidia H100&lt;/li&gt;
&lt;li&gt;Nvidia China revenue fell from $5.7B to ~$1.5B&lt;/li&gt;
&lt;li&gt;Chinese AI chip startups raised $3B in 2024&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nvidia CEO Jensen Huang says the company's market share in China has dropped to zero due to US export controls, and he's arguing the restrictions have basically backfired by pushing China to build its own chip ecosystem faster. [According to @kimmonismus]&lt;/p&gt;

&lt;p&gt;China has managed to become largely independent in chip development, Huang stated, a claim that reflects the growing capability of domestic alternatives like Huawei's Ascend series. The US sanctions, imposed in 2022 and tightened through 2024, banned export of Nvidia's A100, H100, and later modified chips to China.&lt;/p&gt;

&lt;h3&gt;
  
  
  The backfire mechanism
&lt;/h3&gt;

&lt;p&gt;Huang's argument is that export controls created a vacuum that Chinese firms filled aggressively. Huawei's Ascend 910B, launched in 2023, and the 910C, expected in 2025, now compete directly with Nvidia's H100 in Chinese data centers. Chinese AI labs including Baidu, ByteDance, and Tencent have reduced Nvidia orders and shifted to domestic chips.&lt;/p&gt;

&lt;h3&gt;
  
  
  Market reality check
&lt;/h3&gt;

&lt;p&gt;Nvidia's China revenue collapsed from $5.7 billion in fiscal 2022 to approximately $1.5 billion in fiscal 2025, [per Nvidia's 10-K filings]. The company previously derived roughly 25% of data center revenue from China; that figure is now negligible. The zero market share claim likely refers to the high-end AI chip segment where US export bans apply directly.&lt;/p&gt;

&lt;h3&gt;
  
  
  What the data shows
&lt;/h3&gt;

&lt;p&gt;Semiconductor Industry Association data shows China's domestic chip production grew 14% year-over-year in 2024, reaching $45 billion in revenue. Chinese AI chip startups raised over $3 billion in 2024, [per PitchBook]. The US Commerce Department's Bureau of Industry and Security has not commented on Huang's claim.&lt;/p&gt;

&lt;p&gt;Huang's statement is notable because it comes from the CEO of the company most affected by the controls. It aligns with his public lobbying against further restrictions, which he has called economically counterproductive.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for the US Commerce Department's next export control update, expected Q2 2026, and whether it broadens restrictions to cover Chinese AI chips built on older process nodes. Also monitor Nvidia's Q4 FY2026 China revenue figure and Huawei's Ascend 910C production yields.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/nvidia-s-china-market-share-hits" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tech</category>
      <category>product</category>
    </item>
    <item>
      <title>Pentagon Strikes Deal With 7 AI Labs for Classified Systems</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Sun, 03 May 2026 15:36:17 +0000</pubDate>
      <link>https://forem.com/gentic_news/pentagon-strikes-deal-with-7-ai-labs-for-classified-systems-147f</link>
      <guid>https://forem.com/gentic_news/pentagon-strikes-deal-with-7-ai-labs-for-classified-systems-147f</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;US military deal with 7 AI labs for classified systems. First formal framework for commercial AI on classified networks.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Pentagon has reached a deal with seven of the largest AI labs to deploy their models on classified military systems. The labs include OpenAI, Anthropic, Google DeepMind, Meta, Microsoft, Amazon, and Scale AI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;7 AI labs including OpenAI, Anthropic, Google DeepMind, Meta, Microsoft, Amazon, Scale AI&lt;/li&gt;
&lt;li&gt;First formal framework for commercial AI on classified military systems&lt;/li&gt;
&lt;li&gt;OpenAI changed its usage policy in January 2024 to allow military use&lt;/li&gt;
&lt;li&gt;Pentagon created CDAO in 2023 to accelerate AI adoption&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agreement, first reported by @rohanpaul_ai, marks the first formal framework for using cutting-edge commercial AI on classified military networks. The seven labs represent the dominant players in frontier AI development, collectively commanding over $50 billion in funding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;US military deal with 7 AI labs for classified systems.&lt;/li&gt;
&lt;li&gt;First formal framework for commercial AI on classified networks.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What the Deal Covers
&lt;/h2&gt;


&lt;p&gt;The deal allows each lab's models to be used on classified Pentagon systems for tasks including intelligence analysis, logistics planning, and cyber defense. The exact terms remain undisclosed, but sources indicate the agreement includes data security protocols and human oversight requirements.&lt;/p&gt;

&lt;p&gt;This arrangement parallels recent moves by the Department of Defense to accelerate AI adoption, including the 2023 creation of the Chief Digital and Artificial Intelligence Office (CDAO). The Pentagon has been testing large language models on unclassified systems since early 2024.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;The unique angle here is the reversal of AI labs' earlier resistance to military applications. OpenAI changed its usage policy in January 2024 to allow 'military and defense' use cases, while Anthropic has maintained a policy against weapons development but permits defensive applications. Meta's open-source Llama models have been used by military contractors for months.&lt;/p&gt;

&lt;p&gt;[According to @rohanpaul_ai], the deal covers 'classified systems' — a significant expansion beyond the unclassified trials. This raises questions about AI safety protocols and the potential for autonomous weapons development, though the labs have publicly stated their models will not be used for autonomous targeting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Competitive Implications
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixmmra0zlpdpotig25ew.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixmmra0zlpdpotig25ew.jpg" alt="Exclusive: Pentagon pushing AI companies to expand on classified ..." width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The agreement gives the Pentagon preferential access to the latest models before they reach civilian markets. This could create a two-tier system where military applications receive priority compute and safety testing, potentially slowing commercial releases.&lt;/p&gt;

&lt;p&gt;Scale AI, the data annotation and evaluation company, is an interesting inclusion — it provides the infrastructure for model evaluation rather than frontier models themselves. This suggests the deal encompasses not just model deployment but also testing and validation pipelines.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for the release of specific safety protocols and human oversight requirements — likely within 90 days. Also watch whether Anthropic or OpenAI disclose any red-teaming results specific to classified applications, and whether the deal triggers congressional hearings on AI in military systems.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/pentagon-strikes-deal-with-7-ai" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tech</category>
      <category>product</category>
    </item>
    <item>
      <title>RoundPipe: Full Fine-Tune 32B Models on a Single 24GB GPU</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Sun, 03 May 2026 15:36:17 +0000</pubDate>
      <link>https://forem.com/gentic_news/roundpipe-full-fine-tune-32b-models-on-a-single-24gb-gpu-1bf5</link>
      <guid>https://forem.com/gentic_news/roundpipe-full-fine-tune-32b-models-on-a-single-24gb-gpu-1bf5</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;RoundPipe fine-tunes 32B models on a single 24GB GPU with 1.5-2.2× speedups via round-robin pipeline dispatch.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;RoundPipe fine-tunes 32B models on a single 24GB GPU. The method also supports LoRA fine-tuning of 235B models with 64K+ context length.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full fine-tune 32B models on 24GB GPU.&lt;/li&gt;
&lt;li&gt;LoRA fine-tune 235B models with 64K+ context.&lt;/li&gt;
&lt;li&gt;1.5-2.2× speedups over SOTA baselines.&lt;/li&gt;
&lt;li&gt;Round-robin dispatch reduces pipeline bubbles to near zero.&lt;/li&gt;
&lt;li&gt;No CPU offloading or model parallelism required.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;RoundPipe, introduced by researchers and shared via @HuggingPapers, tackles the memory bottleneck that typically forces practitioners to use multiple high-end GPUs for large-model fine-tuning. By dynamically dispatching pipeline stages in a round-robin fashion, it achieves near-zero pipeline bubbles — a primary source of inefficiency in standard pipeline parallelism.&lt;/p&gt;

&lt;p&gt;The key innovation is the reduction of idle GPU time during forward and backward passes. Standard pipeline parallelism (e.g., GPipe, PipeDream) leaves most GPUs idle while waiting for the first and last stages to complete. RoundPipe's round-robin dispatch overlaps computation across stages more evenly, yielding 1.5-2.2× speedups over state-of-the-art baselines [According to @HuggingPapers].&lt;/p&gt;
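
&lt;p&gt;For context on why bubbles matter (a standard GPipe-style estimate, not a number from the RoundPipe announcement): with p pipeline stages and m microbatches, the idle fraction of a naive schedule is roughly (p - 1) / (m + p - 1), which is the overhead that smarter dispatch tries to claw back:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Textbook GPipe-style bubble estimate, shown for context; this is not a
# figure from the RoundPipe source. With p stages and m microbatches, the
# idle ("bubble") fraction of a naive schedule is (p - 1) / (m + p - 1).
def bubble_fraction(stages, microbatches):
    return (stages - 1) / (microbatches + stages - 1)

for m in (4, 8, 32):
    print(f"p=4 stages, m={m:3d} microbatches: {bubble_fraction(4, m):.0%} idle")
# m=4 gives 43% idle, m=8 gives 27%, m=32 gives 9%: the idle time that
# RoundPipe-style round-robin dispatch aims to drive toward zero.
&lt;/code&gt;&lt;/pre&gt;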

&lt;p&gt;This is particularly striking because it targets the same hardware constraints that have driven the shift toward parameter-efficient fine-tuning (PEFT) methods like LoRA. RoundPipe does not require model parallelism or tensor offloading; it operates purely through smarter scheduling within the existing pipeline. The trade-off is that the method likely increases communication overhead between stages, though the source does not quantify this.&lt;/p&gt;

&lt;p&gt;The unique take: RoundPipe suggests that the memory wall for fine-tuning large models is not just a hardware problem — it is also a scheduling problem. If the technique generalizes to training from scratch, it could reshape the cost calculus for single-GPU research, especially in academic labs where 24GB GPUs (e.g., RTX 3090/4090) are the norm.&lt;/p&gt;

&lt;h3&gt;
  
  
  How it compares
&lt;/h3&gt;

&lt;p&gt;Existing methods like ZeRO-Offload and DeepSpeed's heterogeneous training require CPU-GPU data movement, adding latency. RoundPipe avoids offloading entirely by keeping all parameters on the GPU and optimizing the pipeline schedule. The 64K+ context length support is notable because it enables fine-tuning on long-document tasks without memory compression tricks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Limitations
&lt;/h3&gt;

&lt;p&gt;RoundPipe's performance gain depends on the number of pipeline stages and the model's forward/backward compute ratio. The source does not provide ablation studies across model sizes or hardware configurations. It is also unclear whether the method supports mixed-precision training or gradient checkpointing — both common in production workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  What's next
&lt;/h3&gt;

&lt;p&gt;The source does not specify a release date for code or a paper. If the authors open-source the implementation, expect rapid adoption by the Hugging Face community. Watch for a preprint on arXiv with full ablation tables and memory breakdowns.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5v2zcsqy4dczat71iduq.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5v2zcsqy4dczat71iduq.jpg" alt="promising thing, RoundPipe - trains massive AI models on ..." width="765" height="946"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Watch for the arXiv preprint release and open-source code. If RoundPipe achieves 2× speedups on common benchmarks like GLUE or MMLU in third-party replication, expect integration into Hugging Face Transformers and DeepSpeed within 60 days.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/roundpipe-full-fine-tune-32b" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>research</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>NVIDIA NeMo RL Speculative Decoding: 1.8× Rollout Speed at 8B</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Sun, 03 May 2026 14:34:12 +0000</pubDate>
      <link>https://forem.com/gentic_news/nvidia-nemo-rl-speculative-decoding-18x-rollout-speed-at-8b-3j95</link>
      <guid>https://forem.com/gentic_news/nvidia-nemo-rl-speculative-decoding-18x-rollout-speed-at-8b-3j95</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;NVIDIA's NeMo RL speculative decoding achieves 1.8× rollout speedup at 8B and projects 2.5× at 235B, cutting RL training time by over half.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;NVIDIA's NeMo RL speculative decoding achieves a 1.8× rollout generation speedup on 8B models. The technique projects a 2.5× end-to-end speedup at 235B parameters, cutting RL training wall-clock time by over half.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1.8× rollout generation speedup at 8B parameters&lt;/li&gt;
&lt;li&gt;Projected 2.5× end-to-end speedup at 235B&lt;/li&gt;
&lt;li&gt;Reduces RL training wall-clock time by over half&lt;/li&gt;
&lt;li&gt;Validated on internal benchmarks by NVIDIA&lt;/li&gt;
&lt;li&gt;Part of NeMo open-source framework&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;NVIDIA published research showing speculative decoding applied to reinforcement learning (RL) training in NeMo yields significant wall-clock speedups. The key result: a 1.8× faster rollout generation on 8B-parameter models, with a projected 2.5× end-to-end speedup at 235B parameters [According to the source].&lt;/p&gt;

&lt;h2&gt;
  
  
  Why speculative decoding fits RL
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjxmjqf696st8ezbjj0x.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjxmjqf696st8ezbjj0x.jpg" alt="TensorRT-LLM Speculative Decoding Boosts Inference Throughput by up to ..." width="800" height="692"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Speculative decoding is a well-known inference-time optimization — a small draft model proposes tokens that a large target model accepts or rejects in parallel. NVIDIA's contribution is applying this to RL rollouts, where the policy model generates trajectories that a reward model scores. The draft model runs on the same GPU, reducing idle time on the large model.&lt;/p&gt;
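
&lt;p&gt;As background on the mechanism, here is a minimal toy sketch of the draft-and-verify loop with stand-in models; it illustrates the acceptance logic only and is not NVIDIA's NeMo RL implementation:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy sketch of speculative decoding: a cheap draft model proposes k tokens,
# the large target model checks them in one parallel pass, and the longest
# matching prefix is accepted. Stand-in models only; not NVIDIA's NeMo code.
import random

VOCAB = "abcd"

def draft_model(prefix, k):
    return [random.choice(VOCAB) for _ in range(k)]   # cheap guesses

def target_model(prefix, k):
    return [random.choice(VOCAB) for _ in range(k)]   # "true" next tokens

def speculative_step(prefix, k=4):
    proposed = draft_model(prefix, k)
    verified = target_model(prefix, k)   # one parallel verification pass
    out = []
    for p, v in zip(proposed, verified):
        if p == v:
            out.append(p)                # draft token accepted for free
        else:
            out.append(v)                # first mismatch: keep target's token
            break                        # and stop accepting drafts
    return prefix + "".join(out)

print(speculative_step("seq:"))
&lt;/code&gt;&lt;/pre&gt;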

&lt;p&gt;The unique take: this is not a new architecture or training algorithm — it's a systems-level optimization that directly addresses the bottleneck in RL training: generation latency. Most RL-for-LLM work (PPO, GRPO, REINFORCE) spends the majority of time on rollout generation, not gradient updates. If rollouts dominate the wall clock, a 1.8× rollout speedup at 8B cuts total training time by up to roughly 45% for that model size.&lt;/p&gt;
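
&lt;p&gt;A quick Amdahl-style check makes that ceiling explicit (a sketch; the rollout share of wall-clock time is assumed for illustration, not taken from NVIDIA's post):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Amdahl-style check: overall speedup when only the rollout phase gets
# faster. The rollout_share values are assumptions for illustration.
def overall_speedup(rollout_share, rollout_speedup):
    new_time = rollout_share / rollout_speedup + (1 - rollout_share)
    return 1 / new_time

for share in (0.8, 0.9, 1.0):
    s = overall_speedup(share, 1.8)
    print(f"rollouts {share:.0%} of wall clock: {s:.2f}x overall, "
          f"{1 - 1 / s:.0%} less training time")
# 80% gives 1.55x (35% less), 90% gives 1.67x (40% less),
# 100% gives 1.80x (44% less)
&lt;/code&gt;&lt;/pre&gt;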

&lt;h2&gt;
  
  
  Projected gains at scale
&lt;/h2&gt;

&lt;p&gt;NVIDIA projects the speedup grows with model size. At 235B, the end-to-end gain hits 2.5×. This is consistent with the observation that larger models have more headroom for speculative decoding — the draft model's acceptance rate improves because larger models are more predictable in their token choices.&lt;/p&gt;

&lt;p&gt;The company validated the approach on internal benchmarks but did not release public benchmark numbers or the draft model architecture. The research is part of NeMo, NVIDIA's open-source framework for building and customizing generative AI models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implications for RL training costs
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwl20lp9uckttf4tdflel.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwl20lp9uckttf4tdflel.webp" alt="Reinforcement Learning with NVIDIA NeMo-RL: Megatron-Core Support for ..." width="765" height="459"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;RL training of large language models is compute-intensive. OpenAI, Google DeepMind, and Anthropic all use RL (RLHF, RLAIF) to align models. A 2.5× speedup at 235B could cut the training cost for a frontier model by tens of millions of dollars, assuming the draft model overhead is minimal.&lt;/p&gt;

&lt;p&gt;NVIDIA's approach does not change the RL algorithm — it's a drop-in optimization for NeMo users. The company has not announced a release date for the feature.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for NVIDIA to release the feature in a NeMo update, likely at or before GTC 2027 in March. Also track whether competitors (Google with Gemini, Meta with LLaMA) publish similar speculative decoding benchmarks for RL training.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/nvidia-nemo-rl-speculative" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>research</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>Inference shift opens door for AI chip startups to challenge Nvidia</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Sun, 03 May 2026 14:34:08 +0000</pubDate>
      <link>https://forem.com/gentic_news/inference-shift-opens-door-for-ai-chip-startups-to-challenge-nvidia-n11</link>
      <guid>https://forem.com/gentic_news/inference-shift-opens-door-for-ai-chip-startups-to-challenge-nvidia-n11</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Inference shift from training to serving creates opportunities for AI chip startups. Nvidia's $20B Groq acquihire validates disaggregated compute strategies.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Nvidia's $20 billion Groq acquihire in December 2025 signaled that inference workloads are reshaping the AI chip market. For startups vying for a slice of Nvidia's pie, it's now or never.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Nvidia acquired Groq for $20 billion in December 2025.&lt;/li&gt;
&lt;li&gt;Lumai targets 1 exaOPS in 10kW power budget by 2029.&lt;/li&gt;
&lt;li&gt;AWS uses Trainium for prefill, Cerebras for decode.&lt;/li&gt;
&lt;li&gt;Intel partners with SambaNova for decode reference design.&lt;/li&gt;
&lt;li&gt;Lumai runs Llama 3.1 8B and 70B models today.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI adoption is reaching an inflection point as the focus shifts from training new models to serving them. Compared to training, inference is a much more diverse workload, presenting an opportunity for chip startups to carve out a niche. Large batch inference requires a different mix of compute, memory, and bandwidth than an AI assistant or code agent. [According to The Register]&lt;/p&gt;

&lt;p&gt;Because of this, inference has become increasingly heterogeneous, with certain aspects better suited to GPUs and other specialized hardware. Nvidia's $20 billion acquihire of Groq in December is a prime example. Groq's SRAM-heavy chip architecture could churn out tokens faster than any GPU, but limited compute capacity and aging chip tech meant they couldn't scale efficiently. Nvidia side-stepped this by moving compute-heavy prefill to GPUs while keeping bandwidth-constrained decode on Groq's LPUs. [Per the source]&lt;/p&gt;
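
&lt;p&gt;The prefill/decode split follows from a simple roofline argument: generating each new token means streaming essentially all model weights from memory once, so single-stream decode speed is capped near memory bandwidth divided by model size. The numbers below are illustrative assumptions, not figures from The Register:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Roofline-style sketch of why decode is bandwidth-bound. Hardware numbers
# are illustrative assumptions, not figures from the article.
def decode_tokens_per_s(params_billion, bytes_per_param, bandwidth_gb_s):
    model_gb = params_billion * bytes_per_param   # bytes streamed per token
    return bandwidth_gb_s / model_gb

# 70B model at 2 bytes/param (fp16): ~3,300 GB/s of HBM on a top GPU vs an
# assumed ~80,000 GB/s of aggregate SRAM-class bandwidth on an LPU-style part.
print(f"HBM GPU:    ~{decode_tokens_per_s(70, 2, 3_300):.0f} tokens/s per stream")
print(f"SRAM-heavy: ~{decode_tokens_per_s(70, 2, 80_000):.0f} tokens/s per stream")
&lt;/code&gt;&lt;/pre&gt;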

&lt;h3&gt;
  
  
  Disaggregated compute becomes the norm
&lt;/h3&gt;

&lt;p&gt;This combination isn't unique to Nvidia. AWS announced a disaggregated compute platform using its Trainium accelerators for prefill and Cerebras Systems' wafer-scale accelerators for decode. Intel also announced a reference design using GPUs for prefill and SambaNova's new RDUs for decode. So far, most chip startups' wins have been on the decode side, where SRAM's speed advantage shines. [The Register reports]&lt;/p&gt;

&lt;h3&gt;
  
  
  Optical inference enters the fray
&lt;/h3&gt;

&lt;p&gt;This week, UK-based startup Lumai detailed its optical inference accelerator, which uses light instead of electrons to perform matrix multiplication at a fraction of the power of digital architecture. Lumai expects its next-gen Iris Tetra systems to achieve an exaOPS of AI performance in a 10kW power budget by 2029. Initially, the chip is positioned as a standalone alternative to GPUs for compute-bound inference workloads like batch processing. Longer-term, the company plans to use its optical accelerators as prefill processors. The architecture currently runs billion-parameter models like Llama 3.1 8B or 70B. [According to the source]&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Inference shift from training to serving creates opportunities for AI chip startups.&lt;/li&gt;
&lt;li&gt;Nvidia's $20B Groq acquihire validates disaggregated compute strategies.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for Nvidia's next-generation Rubin architecture and whether it integrates disaggregated inference natively, potentially closing the window for startups. Also track Lumai's Iris Tetra tape-out timeline and customer adoption in 2027.&lt;/p&gt;





&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/inference-shift-opens-door-for-ai" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tech</category>
      <category>product</category>
    </item>
  </channel>
</rss>
