<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: kunpeng-ai-lab</title>
    <description>The latest articles on Forem by kunpeng-ai-lab (@kunpeng-ai-lab).</description>
    <link>https://forem.com/kunpeng-ai-lab</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3921113%2F2e2d4255-c982-4868-aa43-0cea63dee11a.jpg</url>
      <title>Forem: kunpeng-ai-lab</title>
      <link>https://forem.com/kunpeng-ai-lab</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/kunpeng-ai-lab"/>
    <language>en</language>
    <item>
      <title>A Practical GEO Case: How an AI System Started Recommending Our Blog</title>
      <dc:creator>kunpeng-ai-lab</dc:creator>
      <pubDate>Sat, 23 May 2026 14:17:21 +0000</pubDate>
      <link>https://forem.com/kunpeng-ai-lab/a-practical-geo-case-how-an-ai-system-started-recommending-our-blog-3cb4</link>
      <guid>https://forem.com/kunpeng-ai-lab/a-practical-geo-case-how-an-ai-system-started-recommending-our-blog-3cb4</guid>
      <description>&lt;p&gt;About one month after launching the Kunpeng AI Lab blog, I noticed a useful GEO case in the wild.&lt;/p&gt;

&lt;p&gt;I asked an AI system to recommend hands-on AI or AI Agent creators. Kunpeng AI Lab appeared as the first recommendation.&lt;/p&gt;

&lt;p&gt;This is not a post about bragging that "AI recommended us." The more useful engineering question is: what public signals made the brand understandable enough to be recommended?&lt;/p&gt;

&lt;h2&gt;
  
  
  GEO is not just SEO with a new name
&lt;/h2&gt;

&lt;p&gt;Traditional SEO focuses on being crawled, ranked, and displayed in search results.&lt;/p&gt;

&lt;p&gt;GEO, or Generative Engine Optimization, has a different problem space: how do AI systems understand your brand well enough to summarize it correctly and recommend it in the right context?&lt;/p&gt;

&lt;p&gt;For developer-facing brands, that context might be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;practical AI Agent workflows&lt;/li&gt;
&lt;li&gt;real debugging examples&lt;/li&gt;
&lt;li&gt;open-source tooling&lt;/li&gt;
&lt;li&gt;hands-on product reviews&lt;/li&gt;
&lt;li&gt;specific engineering tradeoffs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your public content is vague, AI has little to work with.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the AI appeared to recognize
&lt;/h2&gt;

&lt;p&gt;The AI did not describe Kunpeng AI Lab only as an "AI blog." It recognized a more specific pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;hands-on AI Agent practice&lt;/li&gt;
&lt;li&gt;real project notes&lt;/li&gt;
&lt;li&gt;debugging records&lt;/li&gt;
&lt;li&gt;PR and issue traces&lt;/li&gt;
&lt;li&gt;reusable skills and workflow templates&lt;/li&gt;
&lt;li&gt;concrete commands, tools, failures, and fixes&lt;/li&gt;
&lt;li&gt;little pure marketing language&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the important part.&lt;/p&gt;

&lt;p&gt;The recommendation was not based on a tagline. It was based on repeated evidence.&lt;/p&gt;

&lt;h2&gt;
  
  
  The practical GEO lesson
&lt;/h2&gt;

&lt;p&gt;If you want AI systems to understand and recommend your brand, publishing more is not enough. You need clearer signals.&lt;/p&gt;

&lt;p&gt;First, keep your positioning stable.&lt;/p&gt;

&lt;p&gt;If your core topic is AI Agent engineering, keep returning to that topic. You can explore adjacent ideas, but do not make your public identity change every week.&lt;/p&gt;

&lt;p&gt;Second, make the content verifiable.&lt;/p&gt;

&lt;p&gt;A debugging post with commands, screenshots, logs, and tradeoffs is easier to trust than a page full of abstract claims. Evidence helps people. It also helps AI systems classify the brand correctly.&lt;/p&gt;

&lt;p&gt;Third, repeat the signal across surfaces.&lt;/p&gt;

&lt;p&gt;Article titles, body text, project links, captions, GitHub discussions, and videos should all point to the same area of expertise. Consistency makes the brand easier to summarize.&lt;/p&gt;

&lt;h2&gt;
  
  
  Negative signals also matter
&lt;/h2&gt;

&lt;p&gt;One underrated part of GEO is negative labeling.&lt;/p&gt;

&lt;p&gt;If public content looks like thin marketing, AI may summarize it that way. If a brand only repeats hot topics without showing tests or artifacts, AI may treat it as a secondary commentary source. If low-quality copied pages or unresolved complaints dominate the public web, those signals may also shape the AI's view.&lt;/p&gt;

&lt;p&gt;So GEO is not only about "how do I get recommended?"&lt;/p&gt;

&lt;p&gt;It is also about "how do I avoid being misunderstood?"&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;AI search changes the audience for your content.&lt;/p&gt;

&lt;p&gt;Humans still matter most, but AI systems are now part of the discovery layer. They read, compress, summarize, and re-express what they find.&lt;/p&gt;

&lt;p&gt;If you want your brand to appear in the right answers, make it easy to verify:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;keep a stable niche&lt;/li&gt;
&lt;li&gt;publish real cases&lt;/li&gt;
&lt;li&gt;show process and artifacts&lt;/li&gt;
&lt;li&gt;repeat the same expertise signal&lt;/li&gt;
&lt;li&gt;reduce vague marketing language&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not a shortcut. It is basic brand hygiene for the generative search era.&lt;/p&gt;

&lt;p&gt;Originally published at Kunpeng AI Lab:&lt;br&gt;
&lt;a href="https://kunpeng-ai.com/en/blog/geo-brand-ai-recommendation/" rel="noopener noreferrer"&gt;https://kunpeng-ai.com/en/blog/geo-brand-ai-recommendation/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>writing</category>
      <category>ai</category>
      <category>devrel</category>
      <category>seo</category>
    </item>
    <item>
      <title>When DeepSeek Gets Stuck: How a Strong Mentor Model Finds the Real Root Cause</title>
      <dc:creator>kunpeng-ai-lab</dc:creator>
      <pubDate>Fri, 22 May 2026 04:00:05 +0000</pubDate>
      <link>https://forem.com/kunpeng-ai-lab/when-deepseek-gets-stuck-how-a-strong-mentor-model-finds-the-real-root-cause-3gl8</link>
      <guid>https://forem.com/kunpeng-ai-lab/when-deepseek-gets-stuck-how-a-strong-mentor-model-finds-the-real-root-cause-3gl8</guid>
      <description>&lt;p&gt;In the previous video, we talked about a pattern we call the strong mentor model: a stronger model handles decomposition, review, correction, and validation, while execution-oriented models such as DeepSeek move concrete tasks forward.&lt;/p&gt;

&lt;p&gt;This article goes one layer deeper.&lt;/p&gt;

&lt;p&gt;The interesting question is not simply "how do multiple models work together?" The practical question is what happens when the execution model gets stuck, reads the last error message, and returns a conclusion that sounds plausible but is not actually the root cause.&lt;/p&gt;

&lt;p&gt;Here is a real example from our workflow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8oubn9z26vhu3rf3sfry.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8oubn9z26vhu3rf3sfry.png" alt="Real multi-agent workspace" width="800" height="497"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem Was Not That DeepSeek Could Not Work
&lt;/h2&gt;

&lt;p&gt;DeepSeek TUI was working on a Rust project task. The implementation had already moved forward, and the formatting check had passed. The failure appeared during validation, when it ran:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo check &lt;span class="nt"&gt;--workspace&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the command failed, DeepSeek quickly summarized the situation as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;this shell is missing the MSVC linker.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;At first glance, this is not an unreasonable conclusion. On Windows, Rust builds that target MSVC depend on the MSVC linker. If &lt;code&gt;link.exe&lt;/code&gt; or the Visual Studio Build Tools environment is missing, builds can fail.&lt;/p&gt;

&lt;p&gt;But in this case, DeepSeek stopped at the surface symptom.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz6h5w9gghvjqpn0x7m6k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz6h5w9gghvjqpn0x7m6k.png" alt="DeepSeek reached a surface-level conclusion" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a common failure mode in long engineering tasks. An execution model can write code, run commands, and summarize status. But when the chain gets longer, it may anchor on the last visible error and treat it as the root cause.&lt;/p&gt;

&lt;p&gt;That is where the mentor model should step in.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Mentor Model Does Not Directly Patch the Result
&lt;/h2&gt;

&lt;p&gt;The first job of the mentor model is not to take over and rewrite everything.&lt;/p&gt;

&lt;p&gt;It should inspect the execution process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which commands did DeepSeek run?&lt;/li&gt;
&lt;li&gt;Where did the failure start?&lt;/li&gt;
&lt;li&gt;Which checks had already passed?&lt;/li&gt;
&lt;li&gt;Why did it conclude that the linker was missing?&lt;/li&gt;
&lt;li&gt;Was that conclusion independently verified?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this case, the stronger model checked the environment more carefully. The machine did have Visual Studio Build Tools installed. &lt;code&gt;link.exe&lt;/code&gt; existed. The actual problem was that the current shell had not loaded the Visual Studio compilation environment, so &lt;code&gt;link.exe&lt;/code&gt; was not visible on &lt;code&gt;PATH&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That is a very different diagnosis.&lt;/p&gt;

&lt;p&gt;The right conclusion was not "the user must install the linker." The right conclusion was "the current shell has not loaded &lt;code&gt;vcvars64.bat&lt;/code&gt;; initialize the VS build environment first, then rerun validation."&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvd9rd9mjb69y58nk0m66.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvd9rd9mjb69y58nk0m66.png" alt="Mentor model reviewing the execution process" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This distinction matters. If the system sends the user to reinstall Build Tools, it wastes time and may disturb an environment that is already correct. If it identifies the missing shell initialization, the fix is smaller, safer, and reusable.&lt;/p&gt;
&lt;h2&gt;
  
  
  Use a Shared Discussion Folder as the Handoff Layer
&lt;/h2&gt;

&lt;p&gt;In this workflow, the mentor model and DeepSeek do not collaborate only through chat.&lt;/p&gt;

&lt;p&gt;There is a shared &lt;code&gt;discussion&lt;/code&gt; folder. The mentor model writes a guidance file there with the debugging context:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the surface symptom;&lt;/li&gt;
&lt;li&gt;the actual root cause;&lt;/li&gt;
&lt;li&gt;the validation command;&lt;/li&gt;
&lt;li&gt;the repair steps;&lt;/li&gt;
&lt;li&gt;the lesson DeepSeek should reuse next time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flimi1cb0ujxqbrxgz337.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flimi1cb0ujxqbrxgz337.png" alt="Shared discussion guidance file" width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This makes the mentor's reasoning inspectable. It becomes an engineering artifact instead of a temporary message.&lt;/p&gt;
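
&lt;p&gt;As a minimal sketch of what such a guidance file could look like in code, the Python below models the five fields listed above and writes them into a shared folder. The class name &lt;code&gt;Guidance&lt;/code&gt;, the helper &lt;code&gt;write_guidance&lt;/code&gt;, the plain-text layout, and the file name are all hypothetical; they illustrate the idea rather than reproduce the real file format.&lt;/p&gt;

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Guidance:
    """One mentor note; the field names are illustrative assumptions."""
    surface_symptom: str
    root_cause: str
    validation_command: str
    repair_steps: list
    lesson: str

    def render(self):
        # Number the repair steps so the executor can follow them in order.
        steps = "\n".join(
            f"{number}. {step}"
            for number, step in enumerate(self.repair_steps, start=1)
        )
        return (
            f"Surface symptom: {self.surface_symptom}\n"
            f"Root cause: {self.root_cause}\n"
            f"Validation command: {self.validation_command}\n"
            f"Repair steps:\n{steps}\n"
            f"Lesson to reuse: {self.lesson}\n"
        )


def write_guidance(folder, file_name, guidance):
    """Write the rendered note into the shared folder and return its path."""
    target = Path(folder) / file_name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(guidance.render(), encoding="utf-8")
    return target
```

&lt;p&gt;Keeping the note as structured fields rather than free text makes it harder to skip the root cause or the validation command when writing it.&lt;/p&gt;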

&lt;p&gt;For this case, the guidance included a command pattern like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cmd /c &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;C:&lt;/span&gt;&lt;span class="se"&gt;\P&lt;/span&gt;&lt;span class="s2"&gt;rogram Files (x86)&lt;/span&gt;&lt;span class="se"&gt;\M&lt;/span&gt;&lt;span class="s2"&gt;icrosoft Visual Studio&lt;/span&gt;&lt;span class="se"&gt;\2&lt;/span&gt;&lt;span class="s2"&gt;022&lt;/span&gt;&lt;span class="se"&gt;\B&lt;/span&gt;&lt;span class="s2"&gt;uildTools&lt;/span&gt;&lt;span class="se"&gt;\V&lt;/span&gt;&lt;span class="s2"&gt;C&lt;/span&gt;&lt;span class="se"&gt;\A&lt;/span&gt;&lt;span class="s2"&gt;uxiliary&lt;/span&gt;&lt;span class="se"&gt;\B&lt;/span&gt;&lt;span class="s2"&gt;uild&lt;/span&gt;&lt;span class="se"&gt;\v&lt;/span&gt;&lt;span class="s2"&gt;cvars64.bat&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &amp;amp;&amp;amp; cd /d D:&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s2"&gt;herlock&lt;/span&gt;&lt;span class="se"&gt;\w&lt;/span&gt;&lt;span class="s2"&gt;orkspace&lt;/span&gt;&lt;span class="se"&gt;\c&lt;/span&gt;&lt;span class="s2"&gt;dx-workspace&lt;/span&gt;&lt;span class="se"&gt;\D&lt;/span&gt;&lt;span class="s2"&gt;eepSeek-TUI &amp;amp;&amp;amp; cargo check --workspace"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The exact command is less important than the principle:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;before asking the user to install a tool, first verify whether the tool exists, whether the shell has loaded the right environment, and whether the failure can be reproduced after initialization.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5awat79jke44zix4i78k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5awat79jke44zix4i78k.png" alt="Root cause and validation command in the guidance file" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;
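
&lt;p&gt;The first two checks in that principle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: &lt;code&gt;diagnose_tool&lt;/code&gt; is a hypothetical helper, not something from the guidance file, and the list of install directories has to be supplied by the caller. It uses the standard library's &lt;code&gt;shutil.which&lt;/code&gt; to ask whether the tool is visible on &lt;code&gt;PATH&lt;/code&gt;, then falls back to checking whether the file exists on disk.&lt;/p&gt;

```python
import shutil
from pathlib import Path


def diagnose_tool(tool, install_dirs, search_path=None):
    """Classify a "tool not found" failure into one of three layers.

    Returns "on_path", "installed_not_on_path", or "missing".
    search_path=None means "use the PATH of the current process".
    """
    # Layer 1: is the tool visible to the current shell environment?
    if shutil.which(tool, path=search_path) is not None:
        return "on_path"
    # Layer 2: does it exist on disk even though PATH does not expose it?
    for directory in install_dirs:
        if (Path(directory) / tool).is_file():
            return "installed_not_on_path"
    # Layer 3: only now is "install the tool" a defensible conclusion.
    return "missing"
```

&lt;p&gt;In the case above, a result of &lt;code&gt;installed_not_on_path&lt;/code&gt; for &lt;code&gt;link.exe&lt;/code&gt; would point toward initializing the shell environment rather than reinstalling Build Tools. The third check, reproducing the failure after initialization, still means rerunning the validation command.&lt;/p&gt;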

&lt;h2&gt;
  
  
  Send DeepSeek a Guidance Message, Not Just the Answer
&lt;/h2&gt;

&lt;p&gt;After writing the guidance file, the mentor model generates a short message that the user can send back to DeepSeek.&lt;/p&gt;

&lt;p&gt;That message does not simply give DeepSeek the final answer. It tells DeepSeek to read the guidance file, re-check its original conclusion, rerun validation, and correct its own path.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fusujw6f0w1f38o72bbbr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fusujw6f0w1f38o72bbbr.png" alt="Copyable guidance for DeepSeek" width="799" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This has two practical advantages.&lt;/p&gt;

&lt;p&gt;First, DeepSeek is not merely fed the result. It has to revisit the evidence and verify why the previous conclusion was incomplete.&lt;/p&gt;

&lt;p&gt;Second, the debugging path can be saved as experience. The next time the execution model sees a toolchain, credential, &lt;code&gt;PATH&lt;/code&gt;, or shell-environment failure, it should not stop at the last visible error. It should perform a layered check before drawing a conclusion.&lt;/p&gt;

&lt;p&gt;That is the difference between delegation and mentorship.&lt;/p&gt;

&lt;p&gt;Delegation means the stronger model finishes the task. Mentorship means the stronger model explains why the execution model got stuck, how to investigate the real cause, how to validate the fix, and how to turn the lesson into a reusable skill.&lt;/p&gt;

&lt;h2&gt;
  
  
  Turn the Workflow Into Skills
&lt;/h2&gt;

&lt;p&gt;If every case depends on a human reminder, the workflow is not stable enough.&lt;/p&gt;

&lt;p&gt;So we turn it into a standard collaboration mechanism.&lt;/p&gt;

&lt;p&gt;On the stronger model side, we install a mentor skill. Its job is to inspect logs, trace context, find the root cause, write a guidance file, and extract reusable lessons.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkyi558hq02g0sz098tk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkyi558hq02g0sz098tk.png" alt="Mentor skill file" width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the DeepSeek side, we install an executor skill. Its job is to move the task forward, preserve logs, expose its conclusion when stuck, read mentor guidance, re-validate, and update its experience base.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpjdjg5dr5c5orzyvpsl5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpjdjg5dr5c5orzyvpsl5.png" alt="DeepSeek executor skill" width="800" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is close to how we think about ACS as well: do not rely on one model being permanently correct. Standardize collaboration, review, correction, and experience capture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7m43n4avin0m34jgbd7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7m43n4avin0m34jgbd7w.png" alt="Skill library and accumulated experience" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Upgrade Is Recovery After Failure
&lt;/h2&gt;

&lt;p&gt;A single model always has a ceiling.&lt;/p&gt;

&lt;p&gt;In complex engineering work, the real question is often not whether the model can write code. The harder questions are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can it tell a surface symptom from a root cause?&lt;/li&gt;
&lt;li&gt;Can it inspect the full execution history?&lt;/li&gt;
&lt;li&gt;Can it turn a failed attempt into reusable knowledge?&lt;/li&gt;
&lt;li&gt;Can multiple models coordinate around the same evidence chain instead of producing disconnected guesses?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The strong mentor model pattern is useful because it addresses recovery.&lt;/p&gt;

&lt;p&gt;It is not a claim that DeepSeek becomes identical to Claude Code. It is not about dismissing any model either. It is a practical workflow for making execution models more reliable: when they get stuck, use a stronger mentor model to debug the reasoning path, write explicit guidance, force re-validation, and deposit the lesson into a skill library.&lt;/p&gt;

&lt;p&gt;If that loop keeps running, the execution model's runs get smoother over time.&lt;/p&gt;

&lt;p&gt;The gain does not come from one perfect model. It comes from a standardized collaboration system that turns mistakes into reusable process.&lt;/p&gt;




&lt;p&gt;Full canonical version with screenshots: &lt;a href="https://kunpeng-ai.com/en/blog/deepseek-mentor-model-root-cause-debugging/" rel="noopener noreferrer"&gt;https://kunpeng-ai.com/en/blog/deepseek-mentor-model-root-cause-debugging/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>deeplearning</category>
      <category>tooling</category>
      <category>rust</category>
    </item>
    <item>
      <title>How I Make DeepSeek Work Closer to Claude Code in Practice</title>
      <dc:creator>kunpeng-ai-lab</dc:creator>
      <pubDate>Mon, 18 May 2026 03:34:32 +0000</pubDate>
      <link>https://forem.com/kunpeng-ai-lab/how-i-make-deepseek-work-closer-to-claude-code-in-practice-5dn5</link>
      <guid>https://forem.com/kunpeng-ai-lab/how-i-make-deepseek-work-closer-to-claude-code-in-practice-5dn5</guid>
      <description>&lt;p&gt;People have been asking me how I make DeepSeek feel closer to Claude Code in real work.&lt;/p&gt;

&lt;p&gt;My answer is not a magic prompt. It is a mentor model workflow.&lt;/p&gt;

&lt;p&gt;I use a stronger model to plan, supervise, debug, and review. Then I let smaller or cheaper models handle bounded execution tasks in parallel.&lt;/p&gt;

&lt;p&gt;Important caveat: I am not claiming DeepSeek is equivalent to Claude Code as a single model/tool. The comparison is about the practical workflow effect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic34humlk5zwioebyv6j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic34humlk5zwioebyv6j.png" alt="Multi-model mentor workflow" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. The mentor model creates the task boundary
&lt;/h2&gt;

&lt;p&gt;Before assigning work, the stronger model defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the small task units&lt;/li&gt;
&lt;li&gt;the files or outputs each executor may touch&lt;/li&gt;
&lt;li&gt;the acceptance checks&lt;/li&gt;
&lt;li&gt;the things that must not change&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That alone makes weaker models much more reliable. They no longer have to infer the whole strategy.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F95tytnnj9blru58qdd2p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F95tytnnj9blru58qdd2p.png" alt="Mentor model planning" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;
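
&lt;p&gt;One way to make such a boundary checkable is to write it down as data. The Python below is a rough sketch, not the format I actually use: &lt;code&gt;TaskBoundary&lt;/code&gt; and its field names are hypothetical, and paths are matched with simple glob patterns from the standard &lt;code&gt;fnmatch&lt;/code&gt; module.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class TaskBoundary:
    """A single task unit handed to an executor (illustrative fields)."""
    task: str
    allowed: list  # glob patterns the executor may touch
    protected: list = field(default_factory=list)  # patterns that must not change
    acceptance_checks: list = field(default_factory=list)  # commands to run afterwards

    def violations(self, changed_files):
        """Return the changed files that fall outside the boundary."""
        outside = []
        for path in changed_files:
            is_protected = any(fnmatch(path, pattern) for pattern in self.protected)
            is_allowed = any(fnmatch(path, pattern) for pattern in self.allowed)
            # A protected file is a violation even if an allowed pattern matches it.
            if is_protected or not is_allowed:
                outside.append(path)
        return outside
```

&lt;p&gt;The mentor can then compare the list of changed files against the boundary before running the acceptance checks, instead of trusting the executor's own summary.&lt;/p&gt;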

&lt;h2&gt;
  
  
  2. Smaller models execute narrow tasks
&lt;/h2&gt;

&lt;p&gt;DeepSeek becomes useful when I give it work like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;inspect this log and summarize the failure&lt;/li&gt;
&lt;li&gt;draft this section using the existing outline&lt;/li&gt;
&lt;li&gt;analyze this recording and list usable timestamps&lt;/li&gt;
&lt;li&gt;convert this article into a platform version&lt;/li&gt;
&lt;li&gt;modify this specific module without touching unrelated files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I avoid giving smaller models vague ownership of the whole project.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4ldulin19fkbgm7fata.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4ldulin19fkbgm7fata.png" alt="Parallel model execution" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  3. The mentor model reads the process, not only the result
&lt;/h2&gt;

&lt;p&gt;This is the part that matters most.&lt;/p&gt;

&lt;p&gt;The mentor checks command output, logs, stuck points, test failures, render errors, and mismatched assumptions. It does not just ask "does the file exist?"&lt;/p&gt;

&lt;p&gt;For a video segment, it checks resolution, audio behavior, subtitles, and template consistency.&lt;/p&gt;

&lt;p&gt;For article assets, it checks image template usage, manifest records, alt text, and platform rules.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq6yl48fc1wfgl8gyra6n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq6yl48fc1wfgl8gyra6n.png" alt="Feedback loop" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Failures become reusable skills
&lt;/h2&gt;

&lt;p&gt;After a model gets stuck, I want the lesson saved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what triggered the failure&lt;/li&gt;
&lt;li&gt;which check should happen earlier next time&lt;/li&gt;
&lt;li&gt;which platform rule matters&lt;/li&gt;
&lt;li&gt;which command or template is reliable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those lessons become project skills and handoff notes. This is how later runs get smoother.&lt;/p&gt;

&lt;h2&gt;
  
  
  The short version
&lt;/h2&gt;

&lt;p&gt;DeepSeek works much better for me when it is not asked to be the entire coding agent.&lt;/p&gt;

&lt;p&gt;It becomes much more useful when a stronger model acts as mentor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;plan the task&lt;/li&gt;
&lt;li&gt;define the boundary&lt;/li&gt;
&lt;li&gt;assign narrow execution&lt;/li&gt;
&lt;li&gt;inspect logs and errors&lt;/li&gt;
&lt;li&gt;correct the process&lt;/li&gt;
&lt;li&gt;turn lessons into reusable memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the real pattern. Not "DeepSeek replaces Claude Code", but "DeepSeek performs better inside a mentor-led agent workflow."&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>opensource</category>
      <category>github</category>
    </item>
    <item>
      <title>Desktop GUI vs Terminal TUI: how I choose the right interface for AI coding agents</title>
      <dc:creator>kunpeng-ai-lab</dc:creator>
      <pubDate>Sat, 16 May 2026 17:26:58 +0000</pubDate>
      <link>https://forem.com/kunpeng-ai-lab/desktop-gui-vs-terminal-tui-how-i-choose-the-right-interface-for-ai-coding-agents-gke</link>
      <guid>https://forem.com/kunpeng-ai-lab/desktop-gui-vs-terminal-tui-how-i-choose-the-right-interface-for-ai-coding-agents-gke</guid>
      <description>&lt;p&gt;A viewer recently asked a very fair question: if desktop AI coding tools are powerful and convenient, why bother with a terminal TUI at all?&lt;/p&gt;

&lt;p&gt;I do not think this is a replacement story.&lt;/p&gt;

&lt;p&gt;Desktop GUI and terminal TUI workflows solve different kinds of friction. A GUI is better when the human needs to stay close to the work: reading code, checking documents, copying context, dropping screenshots, or supervising browser actions. A TUI is better when the work can be split into small, independent tasks and left running with lower overhead.&lt;/p&gt;

&lt;h2&gt;
  
  
  My short rule
&lt;/h2&gt;

&lt;p&gt;Use a desktop GUI when the task needs visual context, frequent human steering, screenshots, web pages, or browser state.&lt;/p&gt;

&lt;p&gt;Use a terminal TUI when the task is already scoped and can run as one of several small parallel jobs.&lt;/p&gt;

&lt;p&gt;Switch back to a GUI when the task happens inside a browser: dashboards, forms, image uploads, publishing previews, and final state checks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyaf983t797wl8duka6rp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyaf983t797wl8duka6rp.png" alt="The more context you inspect, the more a GUI helps" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;
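
&lt;p&gt;Written as code, the short rule is a small decision function. This is a deliberately simple Python sketch; the function name and the three flags are my own labels for illustration, and real tasks rarely reduce to three booleans.&lt;/p&gt;

```python
def choose_interface(needs_visual_context=False, runs_in_browser=False,
                     scoped_parallel_job=False):
    """Return "gui" or "tui" for a task, following the short rule."""
    # Browser work and visually steered work stay in a GUI.
    if runs_in_browser or needs_visual_context:
        return "gui"
    # Already-scoped jobs that can run side by side go to a TUI.
    if scoped_parallel_job:
        return "tui"
    # Assumption: anything not yet scoped starts in a GUI for exploration.
    return "gui"
```

&lt;p&gt;Note the ordering: a task that is scoped but happens inside a browser still goes to the GUI, which matches the third part of the rule.&lt;/p&gt;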

&lt;h2&gt;
  
  
  Large projects usually benefit from a GUI
&lt;/h2&gt;

&lt;p&gt;Large project work is rarely just command execution.&lt;/p&gt;

&lt;p&gt;You read files. You compare docs. You inspect a web page. You copy terminal output. You may need to give the agent a screenshot or a product state that is difficult to describe in text.&lt;/p&gt;

&lt;p&gt;In that situation, the human has not left the loop. The human is still observing, correcting, and deciding whether the agent is moving in the right direction.&lt;/p&gt;

&lt;p&gt;That is where a desktop GUI helps. It keeps the workspace visible and makes the shared working surface easier to inspect.&lt;/p&gt;

&lt;h2&gt;
  
  
  Parallel agents are often better in a TUI
&lt;/h2&gt;

&lt;p&gt;There is another kind of work: small, scoped, parallel tasks.&lt;/p&gt;

&lt;p&gt;One agent edits a module. Another reads logs. A third runs tests and summarizes the failure. These tasks do not need constant visual supervision. They need clear boundaries, stable execution, and low overhead.&lt;/p&gt;

&lt;p&gt;Opening a separate desktop window for every agent can quickly make the machine feel heavy. This is where a terminal TUI earns its place.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe66b012yd6erh3gr5c9a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe66b012yd6erh3gr5c9a.png" alt="For many small tasks, the terminal stays lighter" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The value of a TUI is not that it looks more technical. The value is that it stays light when several small jobs need to run at the same time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Browser work is usually easier to supervise in a GUI
&lt;/h2&gt;

&lt;p&gt;Some tasks naturally belong in a browser.&lt;/p&gt;

&lt;p&gt;Opening an admin dashboard, filling a form, uploading images, checking a preview, or confirming whether a page was saved are all visual tasks.&lt;/p&gt;

&lt;p&gt;For that kind of work, I prefer a GUI. The agent can see the page change, and I can take over when needed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fetjbhdsphzp5ml2jrpzs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fetjbhdsphzp5ml2jrpzs.png" alt="If the task happens on a web page, a GUI is easier to supervise" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There is still an important boundary here. Login, CAPTCHA, payment, security prompts, and final publish actions should remain human-confirmed.&lt;/p&gt;

&lt;h2&gt;
  
  
  My current rule
&lt;/h2&gt;

&lt;p&gt;I usually mix both.&lt;/p&gt;

&lt;p&gt;For exploration and context-heavy work, I start in a GUI. For scoped parallel execution, logs, tests, and long-running small tasks, I use a TUI. For browser operations and publishing flows, I return to a GUI.&lt;/p&gt;
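
&lt;p&gt;As a rough sketch, that rule can be written as a small lookup. The category names and the &lt;code&gt;choose_interface&lt;/code&gt; function are hypothetical labels for illustration, not part of any agent tool, and sending unknown work to the GUI is an assumption.&lt;/p&gt;

```python
# Hypothetical task categories; the names are illustrative only.
GUI_TASKS = {"exploration", "context-heavy", "browser", "publishing"}
TUI_TASKS = {"parallel", "logs", "tests", "long-running"}


def choose_interface(task_kind):
    """Return "gui" or "tui" for a task category, following the rule above."""
    if task_kind in GUI_TASKS:
        return "gui"
    if task_kind in TUI_TASKS:
        return "tui"
    # Assumption: unknown work starts in the GUI so a human can watch it first.
    return "gui"
```

&lt;p&gt;For example, &lt;code&gt;choose_interface("tests")&lt;/code&gt; returns &lt;code&gt;"tui"&lt;/code&gt;, while &lt;code&gt;choose_interface("browser")&lt;/code&gt; returns &lt;code&gt;"gui"&lt;/code&gt;.&lt;/p&gt;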

&lt;p&gt;A TUI is not old-fashioned, and a GUI is not a beginner mode. Each is useful when the task matches the interface.&lt;/p&gt;

&lt;p&gt;Read the task first, then choose the interface.&lt;/p&gt;

&lt;p&gt;Originally published at &lt;a href="https://kunpeng-ai.com/en/blog/gui-vs-tui-ai-coding-agent-workflow/" rel="noopener noreferrer"&gt;https://kunpeng-ai.com/en/blog/gui-vs-tui-ai-coding-agent-workflow/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>opensource</category>
      <category>github</category>
    </item>
    <item>
      <title>Green Tests Are Evidence, Not Approval</title>
      <dc:creator>kunpeng-ai-lab</dc:creator>
      <pubDate>Sat, 09 May 2026 05:20:23 +0000</pubDate>
      <link>https://forem.com/kunpeng-ai-lab/green-tests-are-evidence-not-approval-39od</link>
      <guid>https://forem.com/kunpeng-ai-lab/green-tests-are-evidence-not-approval-39od</guid>
      <description>&lt;p&gt;Many teams are starting to use more than one AI coding agent.&lt;/p&gt;

&lt;p&gt;One agent writes code. Another agent reviews. A human owner makes the final call.&lt;/p&gt;

&lt;p&gt;That sounds reasonable, but without a shared process it can become unreliable very quickly.&lt;/p&gt;

&lt;p&gt;The Executor may test its own work. The Reviewer may only check that tests are green. The Owner may receive a confident summary without durable evidence.&lt;/p&gt;

&lt;p&gt;That is the problem ACS (Agent Collaboration SOP) tries to solve.&lt;/p&gt;

&lt;p&gt;ACS is a vendor-neutral, file-first workflow for multi-agent engineering collaboration.&lt;/p&gt;

&lt;p&gt;The core principle is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Green tests are evidence, not approval.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Passing tests matter. But they do not prove that scope was respected, UI was inspected, docs match the actual files, public output was redacted, or the change is safe to release.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Green Tests Are Not Enough
&lt;/h2&gt;

&lt;p&gt;Tests answer specific questions. Approval answers a broader question: should this change move forward?&lt;/p&gt;

&lt;p&gt;Green tests do not automatically prove that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the requested scope was respected;&lt;/li&gt;
&lt;li&gt;the UI was opened and visually inspected;&lt;/li&gt;
&lt;li&gt;screenshots exist where visual evidence is needed;&lt;/li&gt;
&lt;li&gt;documentation and handoff notes match the actual files;&lt;/li&gt;
&lt;li&gt;the implementation did not introduce architecture drift;&lt;/li&gt;
&lt;li&gt;public output has been redacted;&lt;/li&gt;
&lt;li&gt;the change is safe to release, merge upstream, or share publicly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If a human teammate submitted a change with no clear handoff, no review evidence, no scope notes, and no release-risk assessment, most engineering teams would not treat "the tests passed" as enough.&lt;/p&gt;

&lt;p&gt;AI-agent work should not get a weaker standard just because the summary sounds confident.&lt;/p&gt;

&lt;h2&gt;
  
  
  Owner, Executor, Reviewer
&lt;/h2&gt;

&lt;p&gt;ACS separates three roles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Owner: the human decision-maker responsible for goals, scope, release decisions, upstream PR boundaries, and business constraints.&lt;/li&gt;
&lt;li&gt;Executor Agent: responsible for implementation, self-testing, evidence collection, and handoff.&lt;/li&gt;
&lt;li&gt;Reviewer Agent: responsible for independent review across scope, architecture, tests, screenshots, evidence, redaction, and release risk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key rule is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The executor does not approve itself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;An Executor can and should run tests, summarize what it changed, and collect evidence.&lt;/p&gt;

&lt;p&gt;But approval requires an independent check and a human decision.&lt;/p&gt;
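
&lt;p&gt;One way to picture that separation is a small approval gate. This is a minimal sketch, not ACS's actual implementation: the &lt;code&gt;ChangeRecord&lt;/code&gt; fields and the &lt;code&gt;approval_blockers&lt;/code&gt; function are hypothetical names chosen for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class ChangeRecord:
    """Hypothetical record of one change; field names are illustrative."""
    executor: str         # agent that implemented the change
    reviewer: str         # agent that reviewed it
    tests_green: bool     # evidence collected by the Executor
    review_passed: bool   # outcome of the independent review
    owner_approved: bool  # explicit human decision


def approval_blockers(change):
    """Return the reasons a change cannot move forward (empty list means none)."""
    blockers = []
    if not change.tests_green:
        blockers.append("tests are not green")
    if change.reviewer == change.executor:
        blockers.append("executor cannot review its own work")
    if not change.review_passed:
        blockers.append("independent review has not passed")
    if not change.owner_approved:
        blockers.append("owner has not approved")
    return blockers
```

&lt;p&gt;Green tests clear only the first check. A change whose Executor and Reviewer are the same agent, or that has no Owner decision, still comes back with blockers.&lt;/p&gt;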

&lt;h2&gt;
  
  
  From Chat Logs to Durable Files
&lt;/h2&gt;

&lt;p&gt;Chat is useful while work is happening, but it makes a weak long-term engineering record.&lt;/p&gt;

&lt;p&gt;Chat threads can be compressed. They can lose context. They can be separated from the exact repository state they were discussing. They can be hard for a later agent to inspect.&lt;/p&gt;

&lt;p&gt;ACS prefers file-first handoff.&lt;/p&gt;

&lt;p&gt;Typical ACS artifacts include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Executor handoff&lt;/li&gt;
&lt;li&gt;Reviewer report&lt;/li&gt;
&lt;li&gt;Evidence ledger&lt;/li&gt;
&lt;li&gt;Owner consensus report&lt;/li&gt;
&lt;li&gt;Redacted case study&lt;/li&gt;
&lt;li&gt;Anti-pattern review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Keeping these artifacts as files makes the workflow easier to resume after context compression, model changes, machine changes, or handoff to another agent.&lt;/p&gt;
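
&lt;p&gt;Because the artifacts are plain files, a later agent or a script can check whether a handoff is complete before resuming. The sketch below assumes one directory per handoff. The file names are hypothetical, and ACS may name or organize its artifacts differently.&lt;/p&gt;

```python
from pathlib import Path

# Hypothetical file names; the real project layout may differ.
EXPECTED_ARTIFACTS = [
    "executor-handoff.md",
    "reviewer-report.md",
    "evidence-ledger.md",
    "owner-consensus-report.md",
]


def missing_artifacts(handoff_dir):
    """List expected artifact files that are absent from handoff_dir."""
    root = Path(handoff_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (root / name).is_file()]
```

&lt;p&gt;An empty result means every expected file is present. It says nothing about whether the contents are accurate, which is still the Reviewer's job.&lt;/p&gt;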

&lt;h2&gt;
  
  
  Case Studies and Anti-Patterns
&lt;/h2&gt;

&lt;p&gt;ACS keeps two long-term memory areas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;case-studies/&lt;/code&gt; captures redacted examples of real collaboration.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;anti-patterns/&lt;/code&gt; captures recurring failure modes and prevention checklists.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Anti-patterns worth recording include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the Executor approves its own work;&lt;/li&gt;
&lt;li&gt;the Reviewer only checks whether tests are green;&lt;/li&gt;
&lt;li&gt;evidence exists only in chat;&lt;/li&gt;
&lt;li&gt;UI review happens without screenshots;&lt;/li&gt;
&lt;li&gt;handoff notes drift away from the actual files;&lt;/li&gt;
&lt;li&gt;public materials are shared without redaction.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is to turn repeated mistakes into reusable team memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  Public Sharing Needs a Redaction Gate
&lt;/h2&gt;

&lt;p&gt;Public examples are useful, but they must be safe.&lt;/p&gt;

&lt;p&gt;AI agents can accidentally include sensitive details in handoffs, reports, issues, PR descriptions, blog drafts, and case studies.&lt;/p&gt;

&lt;p&gt;Before publishing a case study, remove:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;customer names;&lt;/li&gt;
&lt;li&gt;private repository URLs;&lt;/li&gt;
&lt;li&gt;local absolute paths;&lt;/li&gt;
&lt;li&gt;tokens, cookies, API keys, and webhooks;&lt;/li&gt;
&lt;li&gt;private chat logs;&lt;/li&gt;
&lt;li&gt;unpublished business information.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The point is not to hide the engineering lesson. The point is to preserve the lesson without leaking what should remain private.&lt;/p&gt;
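
&lt;p&gt;Part of that gate can be automated as a first pass. The patterns below are illustrative heuristics, not ACS rules: they flag a few common shapes (local absolute paths, strings with token-like prefixes, one webhook host) and will miss many real secrets. Customer names, private chat logs, and unpublished business information still need a human read.&lt;/p&gt;

```python
import re

# Illustrative heuristics only; these patterns are assumptions for this sketch.
PATTERNS = {
    "local absolute path": re.compile(r"(?:/Users/|/home/|[A-Za-z]:\\)\S+"),
    "token-like string": re.compile(r"\b(?:sk|ghp|xoxb)[-_][A-Za-z0-9_-]{16,}"),
    "webhook URL": re.compile(r"https://hooks\.slack\.com/\S+"),
}


def redaction_findings(text):
    """Return (label, matched_text) pairs for anything that looks sensitive."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group(0)))
    return findings
```

&lt;p&gt;A draft goes public only when the findings list is empty and a human has confirmed the redaction.&lt;/p&gt;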

&lt;h2&gt;
  
  
  Open Source
&lt;/h2&gt;

&lt;p&gt;ACS is open source, and practical contributions are welcome:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;redacted case studies;&lt;/li&gt;
&lt;li&gt;anti-pattern examples;&lt;/li&gt;
&lt;li&gt;reviewer report improvements;&lt;/li&gt;
&lt;li&gt;evidence ledger refinements;&lt;/li&gt;
&lt;li&gt;examples from different agent tools and team setups.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;GitHub:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/kunpeng-ai-lab/agent-collaboration-sop" rel="noopener noreferrer"&gt;https://github.com/kunpeng-ai-lab/agent-collaboration-sop&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Full article:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://kunpeng-ai.com/en/blog/agent-collaboration-sop-acs-case-library/?utm_source=blog_referral&amp;amp;utm_medium=referral&amp;amp;utm_campaign=acs-case-library-202605&amp;amp;utm_content=ending_cta" rel="noopener noreferrer"&gt;https://kunpeng-ai.com/en/blog/agent-collaboration-sop-acs-case-library/?utm_source=blog_referral&amp;amp;utm_medium=referral&amp;amp;utm_campaign=acs-case-library-202605&amp;amp;utm_content=ending_cta&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Multi-agent engineering does not become reliable just because more agents are involved.&lt;/p&gt;

&lt;p&gt;It becomes reliable when execution, review, evidence, and human approval are separated clearly enough to be inspected.&lt;/p&gt;

</description>
      <category>ai</category>
    </item>
  </channel>
</rss>
