<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: BAOFUFAN</title>
    <description>The latest articles on Forem by BAOFUFAN (@_eb7f2a654e97a60ae9f96e).</description>
    <link>https://forem.com/_eb7f2a654e97a60ae9f96e</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3903614%2F88f4214a-aed8-4e71-a7f1-a6aca8cfe579.jpg</url>
      <title>Forem: BAOFUFAN</title>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/_eb7f2a654e97a60ae9f96e"/>
    <language>en</language>
    <item>
      <title>Building RAG with LangChain &amp; Chroma: Two Hidden Pitfalls That Cost Me 6 Hours</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Wed, 06 May 2026 01:08:16 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/building-rag-with-langchain-chroma-two-hidden-pitfalls-that-cost-me-6-hours-1flc</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/building-rag-with-langchain-chroma-two-hidden-pitfalls-that-cost-me-6-hours-1flc</guid>
      <description>&lt;p&gt;At 4 PM, my product manager dropped 200 PDFs in my lap: “We need to demo an internal knowledge base Q&amp;amp;A for the boss tomorrow morning—super urgent.” I thought, “RAG? I know this; LangChain plus a vector database, done in minutes.” I started coding right away and barely finished by 10 PM—not because the pipeline didn’t run, but because two subtle traps dragged the accuracy below 40% and had me debugging for six straight hours. In this article, I’ll walk you through the full RAG system build and pull the two pitfalls out by their roots.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why you can’t just dump documents into GPT
&lt;/h2&gt;

&lt;p&gt;The simplest idea for a system that answers questions like “What is the company holiday policy?” or “What were the conclusions of project X’s retrospective?” is to concatenate all the documents into one giant prompt and send it to GPT. Reality hits fast: 200 PDFs add up to over 800,000 characters, roughly 200,000 tokens at the usual four characters per token. Even GPT-4’s 128K context window chokes, and the per‑call cost will make your finance team come after you. Fine‑tuning is even less realistic—the documents change daily, and you’re not going to burn thousands of dollars every time they do.&lt;/p&gt;

&lt;p&gt;That leaves retrieval‑augmented generation (RAG) as the only viable path: split documents into small chunks, embed each chunk with an embedding model and store the vectors in a vector database. At query time, retrieve the most relevant chunks, stuff them into the prompt as context, and let the LLM generate an answer. The pattern looks simple, but every step—“how to split,” “how to store,” “how to search”—has its own sharp edges. The two that wrecked me were buried deep in the interaction between LangChain and Chroma.&lt;/p&gt;

&lt;h2&gt;
  
  
  Design choices: why Chroma over Pinecone or FAISS
&lt;/h2&gt;

&lt;p&gt;Before picking a vector store, I asked myself three questions: does it cost money, does it support metadata filtering, and can I start/stop it locally with a single command?&lt;/p&gt;

&lt;p&gt;Pinecone costs money and requires data to go to the cloud—immediately ruled out for internal documents. Weaviate is powerful, but deploying it means at least 30 minutes of Docker tinkering, a non‑starter when the demo is “tomorrow morning.” FAISS is blazing fast, but it doesn’t support metadata filtering (like filtering by document type or date range)—a feature we’d need as soon as the business side piles on more requirements. I landed on Chroma: it runs locally, installs with a single &lt;code&gt;pip install chromadb&lt;/code&gt;, and has persistence, metadata filtering, and similarity search built right in. It also integrates with LangChain more smoothly than any other option.&lt;/p&gt;
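
&lt;p&gt;For a sense of what that filtering looks like, here is a minimal sketch against the &lt;code&gt;vectordb&lt;/code&gt; built in Script 1 below; the &lt;code&gt;source&lt;/code&gt; filename is illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative sketch: Chroma-side metadata filtering, the feature FAISS lacks.
# PyPDFDirectoryLoader stamps each chunk's metadata with a "source" path.
hits = vectordb.similarity_search(
    "annual leave policy",
    k=4,
    filter={"source": "docs/hr_handbook.pdf"}   # only search chunks from this file
)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;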

&lt;p&gt;The overall architecture is straightforward: &lt;strong&gt;load documents → split text → generate embeddings → write to Chroma → when a user asks, retrieve top‑k chunks → stuff into a prompt → LLM generates an answer&lt;/strong&gt;. LangChain chains these steps together; you just need to manage the parameters and edge cases for each stage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core implementation: two scripts to run the full RAG pipeline
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Script 1’s job&lt;/strong&gt;: turn scattered PDFs into searchable vector chunks and persist them in Chroma so you don’t have to re‑index everything on the next run.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.document_loaders&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PyPDFDirectoryLoader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.text_splitter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.vectorstores&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Load PDF directory
&lt;/span&gt;&lt;span class="n"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PyPDFDirectoryLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# Auto-scan all PDFs
&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Loaded documents: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; pages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Split: chunk_size and overlap are the source of two major pitfalls, detailed later
&lt;/span&gt;&lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;         &lt;span class="c1"&gt;# Max characters per chunk
&lt;/span&gt;    &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# Overlap to avoid cutting key info across boundaries
&lt;/span&gt;    &lt;span class="n"&gt;separators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;。&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;，&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Priority: paragraphs first
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Total chunks: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 3. Generate embeddings and store in Chroma (auto-persist to local dir)
&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;                  &lt;span class="c1"&gt;# Default: text-embedding-ada-002
&lt;/span&gt;&lt;span class="n"&gt;vectordb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;persist_directory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./chroma_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;              &lt;span class="c1"&gt;# Reusable after restart, saves re-embedding cost
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;vectordb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;persist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Vector store built and persisted to ./chroma_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Script 2’s job&lt;/strong&gt;: using the stored vector store, build the full “ask → retrieve → generate” chain and force the LLM to answer strictly from the provided documents—no hallucinations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PromptTemplate&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Custom prompt: force LLM to base answers only on the given context
&lt;/span&gt;&lt;span class="n"&gt;prompt_template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You are a rigorous internal knowledge base assistant. Answer the question strictly based on the context below.
If the answer cannot be found in the context, simply say &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No relevant information found.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; Do not make anything up.

Context:
{context}

Question: {question}
Answer:&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PromptTemplate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt_template&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;input_variables&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Load the persisted vector store, connecting to the same embedding model
&lt;/span&gt;&lt;span class="n"&gt;vectordb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Chroma&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;persist_directory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./chroma_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding_function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 3. Create the QA chain: retrieves top-4 chunks by default, using our custom prompt
&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# temperature=0 for deterministic results
&lt;/span&gt;&lt;span class="n"&gt;qa_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_chain_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;chain_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stuff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                               &lt;span class="c1"&gt;# Stuffs retrieved chunks directly into the prompt
&lt;/span&gt;    &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vectordb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;as_r&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>How I Cut LLM Memory Bug Diagnosis from 2 Hours to 5 Minutes with Playwright &amp; Allure</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Tue, 05 May 2026 12:17:50 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/how-i-cut-llm-memory-bug-diagnosis-from-2-hours-to-5-minutes-with-playwright-allure-2162</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/how-i-cut-llm-memory-bug-diagnosis-from-2-hours-to-5-minutes-with-playwright-allure-2162</guid>
      <description>&lt;p&gt;At 2 a.m., the customer group chat erupted: "Your AI assistant completely forgot the client background I provided last week and even fabricated new details!" I rolled out of bed, opened the logs, and stared at tens of thousands of conversation lines—like finding a needle in a haystack. That night, I spent nearly two hours tracking down the cause: when conversations exceeded 24 turns, our history truncation logic silently dropped the system prompts in the middle, causing memory to break completely. The next day, I decided this torture had to end with automation. If you're building LLM applications and are tormented by "occasional amnesia" or "intermittent hallucinations," this article will hand you a ready-to-use end-to-end memory testing solution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem Breakdown: LLM "Amnesia" Is Scarier Than Hallucination
&lt;/h2&gt;

&lt;p&gt;Once an LLM application is live, users assume it will remember conversation history. The reality is: any link in the chain involving context window management, RAG retrieval, or multi-agent state passing can cause "amnesia"—needed information fails to reach the model, or gets truncated and overwritten along the way. The risk with such bugs is that they don't fail every time; they typically surface only under specific conversation lengths or message sequences, making manual reproduction extremely hard.&lt;/p&gt;

&lt;p&gt;Conventional testing methods fall into two buckets: (1) unit tests against the model API, which verify single-turn input/output and never touch memory logic; (2) manually clicking through dozens of conversation turns in the UI, which is time-consuming, exhausting, and impossible to replicate precisely. One forgotten step by the tester and the bug slips away. What we truly lack is an automated approach that can converse with the system like a real user and clearly record what it remembers and at which turn it starts forgetting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution Design: Why Playwright + Allure?
&lt;/h2&gt;

&lt;p&gt;Facing this requirement, I evaluated several paths:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Direct API calls:&lt;/strong&gt; Fast, but bypass front-end logic like message assembly, timestamps, and user identity injection, missing the full real-world chain.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Selenium:&lt;/strong&gt; Mature ecosystem, but async waiting and Shadow DOM handling in modern front-end frameworks are painful, and the community has clearly shifted toward Playwright.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cypress:&lt;/strong&gt; Limited to JS/TS ecosystem. Our backend and model services are Python-based, making stack unification too costly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The reasons for ultimately choosing Playwright are straightforward: native sync and async APIs, an auto-wait mechanism that drastically reduces flaky tests, and full control over Chromium for capturing screenshots at every critical step. Allure was chosen because its reports come with built-in step trees, attachment display, and failure highlighting, letting non-technical stakeholders (like product managers) instantly see at which turn the memory broke.&lt;/p&gt;

&lt;p&gt;The architecture is simple: &lt;code&gt;pytest + playwright + allure-pytest&lt;/code&gt;. Test cases use Playwright to operate the live front-end, execute multiple conversation turns, and assert each turn that key information is still remembered by the model. Allure packages the conversation content, model responses, assertion results, and page screenshots for each round into a single HTML report, making the reproduction path crystal clear.&lt;/p&gt;
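
&lt;p&gt;To make that concrete, here is a sketch of one multi-turn memory case. The URL, selectors, and the &lt;code&gt;send()&lt;/code&gt; helper are illustrative, and &lt;code&gt;page&lt;/code&gt; is assumed to come from a Playwright pytest fixture like the ones shown below:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch of one multi-turn memory case (URL, selectors, and send() are illustrative)
import allure

FACT = "Client is ACME Corp, budget is 2 million"  # planted in turn 1, checked at the end

def send(page, text):
    """Hypothetical helper: submit a message and return the latest reply text."""
    page.fill("#chat-input", text)
    page.click("#send-btn")
    page.wait_for_selector(".reply:last-child")
    return page.inner_text(".reply:last-child")

@allure.title("Planted fact survives 30 conversation turns")
def test_long_conversation_memory(page):
    page.goto("https://chat.example.internal")     # illustrative URL
    send(page, f"Please remember: {FACT}")
    for turn in range(1, 31):
        with allure.step(f"Turn {turn}: filler small talk"):
            send(page, f"Filler question #{turn}")
            allure.attach(page.screenshot(), name=f"turn-{turn}",
                          attachment_type=allure.attachment_type.PNG)
    with allure.step("Recall check"):
        reply = send(page, "Which client did I mention at the start?")
        allure.attach(reply, name="model reply")
        assert "ACME" in reply, "memory broke somewhere before turn 30"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;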

&lt;h2&gt;
  
  
  Core Implementation: From Opening the Browser to Automated Memory Assertions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;First Code Block: Solving Browser Reuse and Login State&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Tests often slow to a crawl because of login flows: scanning QR codes or entering verification codes from scratch on every run. Instead, we save the logged-in &lt;code&gt;storage_state&lt;/code&gt; once and reuse it.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
# conftest.py - 复用登录态，避免每次测试都登录
import pytest
from playwright.sync_api import sync_playwright
from pathlib import Path

STATE_FILE = Path(__file__).parent / "auth_state.json"

@pytest.fixture(scope="session")
def browser():
    with sync_playwright() as p:
        # 如果想看到执行过程可设为 False
        browser = p.chromium.launch(headless=True, slow_m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>A 2 AM Serialization Bug in LangChain Memory — And How pytest Stopped It Forever</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Tue, 05 May 2026 01:07:40 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/a-2-am-serialization-bug-in-langchain-memory-and-how-pytest-stopped-it-forever-jgj</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/a-2-am-serialization-bug-in-langchain-memory-and-how-pytest-stopped-it-forever-jgj</guid>
      <description>&lt;p&gt;At 2:17 AM, my monitoring alert yanked me out of sleep: the customer service bot had suddenly lost its memory. Users were asking “Where is my order?” three times in a row, and it kept asking for their phone number as if they were complete strangers. I opened the logs and saw that &lt;code&gt;ConversationBufferMemory&lt;/code&gt; was loading empty message lists. The key was still there in Redis, but somehow deserialization had silently swallowed the data. I rolled back the code from my bed and spent three hours tracing the root cause — &lt;strong&gt;a LangChain upgrade had introduced a pickle deserialization incompatibility that dropped entire conversation histories.&lt;/strong&gt; Manual testing had never covered version upgrade scenarios. The next morning I made a decision: automate the integrity and performance checks for memory storage with pytest, and never let a serialization regression slip through again. Since then, regressions that used to take 30 minutes to verify now finish in 8 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Breakdown
&lt;/h2&gt;

&lt;p&gt;Our architecture was straightforward: &lt;code&gt;ConversationBufferMemory&lt;/code&gt; + &lt;code&gt;RedisChatMessageHistory&lt;/code&gt; persisting user sessions to Redis. Under the hood, LangChain used &lt;code&gt;pickle&lt;/code&gt; to dump the message list into bytes and stored them under a &lt;code&gt;{session_id}&lt;/code&gt; key — reloading it later with a simple load.&lt;/p&gt;
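
&lt;p&gt;A simplified sketch of that write/read path (not LangChain’s exact internals; the key format and stand-in messages are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch of the storage pattern described above (illustrative, not LangChain internals)
import pickle
import redis

session_id = "user-42"
messages = [("human", "Where is my order?"), ("ai", "Let me check.")]  # stand-in objects

r = redis.Redis.from_url("redis://localhost:6379")
r.set(f"message_store:{session_id}", pickle.dumps(messages))      # write: dump list to bytes
restored = pickle.loads(r.get(f"message_store:{session_id}"))     # read: loads() raises if the
                                                                  # pickled class paths moved
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;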

&lt;p&gt;The problem hit during a version upgrade: we moved from &lt;code&gt;langchain==0.0.352&lt;/code&gt; to &lt;code&gt;0.1.0&lt;/code&gt;, and the fully qualified class names of &lt;code&gt;HumanMessage&lt;/code&gt; and &lt;code&gt;AIMessage&lt;/code&gt; changed. When the old pickle payload was loaded, it threw an &lt;code&gt;AttributeError&lt;/code&gt;. Even worse, the &lt;code&gt;messages&lt;/code&gt; property of &lt;code&gt;RedisChatMessageHistory&lt;/code&gt; was catching that exception and silently returning an empty list — making it look like an innocent empty conversation with no errors anywhere. These kinds of bugs have two nasty traits:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Delayed impact&lt;/strong&gt;: the blow-up doesn’t happen at upgrade time, but only when a user actually reads or writes memory again — monitoring can barely spot it in the first place.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Untestable by hand&lt;/strong&gt;: before an upgrade, QA only validates “can we store and read” with the current version; nobody intentionally seeds old serialized data to check backward compatibility.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Conventional click-and-hope manual testing stands no chance against regressions like this. What we needed was an automated, repeatable integration suite covering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Write → restart → read integrity&lt;/li&gt;
&lt;li&gt;Multi-version serialized data compatibility&lt;/li&gt;
&lt;li&gt;Performance under large message volumes&lt;/li&gt;
&lt;li&gt;Correctness under concurrent reads and writes&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Plan
&lt;/h2&gt;

&lt;p&gt;I chose &lt;strong&gt;pytest&lt;/strong&gt; as the test framework. It wasn’t that &lt;code&gt;unittest&lt;/code&gt; couldn’t do the job — but the things I needed were just too painful there:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Isolation&lt;/strong&gt;: I wanted a painless Redis substitute. &lt;code&gt;fakeredis&lt;/code&gt; perfectly simulates Redis commands, and combined with pytest fixtures it gives zero external dependencies during testing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parametrization&lt;/strong&gt;: &lt;code&gt;@pytest.mark.parametrize&lt;/code&gt; can cover 10, 100, or 1000 messages in a single line — no manual loops required.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance benchmarks&lt;/strong&gt;: the &lt;code&gt;pytest-benchmark&lt;/code&gt; plugin directly measures average and max latency, much more reliable than me sprinkling &lt;code&gt;time.perf_counter()&lt;/code&gt; around.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrency simulation&lt;/strong&gt;: writing fixtures with &lt;code&gt;threading&lt;/code&gt; or &lt;code&gt;asyncio&lt;/code&gt; is far more intuitive than &lt;code&gt;unittest&lt;/code&gt;’s &lt;code&gt;subTest&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The overall approach: define a &lt;code&gt;FakeRedis&lt;/code&gt; fixture in &lt;code&gt;conftest.py&lt;/code&gt; and monkeypatch &lt;code&gt;redis.Redis.from_url&lt;/code&gt; so that every LangChain Redis call hits the in-memory implementation transparently. Then split the tests into modules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;test_integrity.py&lt;/code&gt;: verify store / retrieve consistency and cross-instance loading&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;test_compatibility.py&lt;/code&gt;: simulate old serialized payloads and test migration / downgrade logic&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;test_performance.py&lt;/code&gt;: use pytest-benchmark to measure read/write ceilings&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;test_concurrency.py&lt;/code&gt;: multiple threads appending to the same memory, checking for data loss&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why not use a real Redis? A real instance is essential for CI smoke tests, but hitting one on every push during development is slow and messy. FakeRedis lets the whole suite run in just a few hundred milliseconds — and that &lt;strong&gt;zero friction&lt;/strong&gt; is exactly what makes the team actually want to write tests.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. conftest: Hijacking LangChain’s Redis connection with FakeRedis
&lt;/h3&gt;

&lt;p&gt;The central idea is: every test shares one in-memory Redis, completely transparent to LangChain. We monkeypatch both &lt;code&gt;redis.Redis.from_url&lt;/code&gt; and the direct constructor, so no matter how &lt;code&gt;RedisChatMessageHistory&lt;/code&gt; creates a client, it always lands on the same FakeRedis instance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# conftest.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fakeredis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FakeRedis&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fake_redis&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;每个测试函数独立的 FakeRedis 实例&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;FakeRedis&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;autouse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;patch_redis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;monkeypatch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fake_redis&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;将所有对 Redis 的调用劫持到 FakeRedis，实现零外部依赖&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# 劫持 from_url 方法，LangChain 内部用这个创建连接
&lt;/span&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_fake_from_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fake_redis&lt;/span&gt;

    &lt;span class="n"&gt;monkeypatch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_fake_from_url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# 如果有地方直接 redis.Redis(...)，也一并拦截
&lt;/span&gt;    &lt;span class="n"&gt;monkeypatch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Redis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kw&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fake_redis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fake_redis&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this in place, any test that uses the &lt;code&gt;patch_redis&lt;/code&gt; fixture automatically forces LangChain to read and write my isolated FakeRedis — the database is always pristine.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Testing integrity: write → reload must match bit for bit
&lt;/h3&gt;

&lt;p&gt;Below is &lt;code&gt;test_integrity.py&lt;/code&gt;. It verifies the most fundamental contract: whatever I store for a session must be returned exactly the same when loaded later. I parametrized it to cover single messages, medium-sized conversations, and massive message batches.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# test_integrity.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.memory&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ConversationBufferMemory&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.memory.chat_message_histories&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RedisChatMessageHistory&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.schema&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AIMessage&lt;/span&gt;

&lt;span class="c1"&gt;# ... test functions covering write/read integrity
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
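
&lt;p&gt;As a concrete example, here is what one such parametrized case can look like. This sketch leans on the autouse &lt;code&gt;patch_redis&lt;/code&gt; fixture above; the message counts and session id are illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# sketch: round-trip integrity, parametrized over conversation size
import pytest
from langchain.memory.chat_message_histories import RedisChatMessageHistory

@pytest.mark.parametrize("n", [1, 100, 1000])
def test_roundtrip_matches(n):
    history = RedisChatMessageHistory(session_id="s-42")
    for i in range(n):
        history.add_user_message(f"question {i}")
        history.add_ai_message(f"answer {i}")

    # A brand-new instance on the same session must see identical data
    reloaded = RedisChatMessageHistory(session_id="s-42")
    assert len(reloaded.messages) == 2 * n
    assert reloaded.messages[0].content == "question 0"
    assert reloaded.messages[-1].content == f"answer {n - 1}"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;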


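&lt;h3&gt;
  
  
  3. Testing compatibility: old payloads must fail loudly, never silently
&lt;/h3&gt;

&lt;p&gt;The compatibility module is where that 2 AM bug gets pinned down. Here is a sketch of the core idea: build a pickle whose recorded class path no longer resolves (mimicking a library relocating its classes between versions), then assert that the failure surfaces instead of degrading into an empty list. The module name and key format are illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# test_compatibility.py - sketch of the stale-payload scenario (illustrative names)
import pickle
import sys
import types

import pytest

# Register a throwaway module, pickle an instance, then remove the class:
# to the unpickler this looks exactly like a class that moved between versions.
_legacy = types.ModuleType("legacy_langchain_schema")
class _HumanMessage:
    def __init__(self, content):
        self.content = content
_HumanMessage.__module__ = "legacy_langchain_schema"
_HumanMessage.__qualname__ = "HumanMessage"
_legacy.HumanMessage = _HumanMessage
sys.modules["legacy_langchain_schema"] = _legacy

STALE_PAYLOAD = pickle.dumps(_HumanMessage("Where is my order?"))
del _legacy.HumanMessage  # the class "moved away" in the new version

def test_stale_payload_fails_loudly(fake_redis):
    fake_redis.set("message_store:session_1", STALE_PAYLOAD)
    # The regression we guard against is this error being swallowed into []
    with pytest.raises(AttributeError):
        pickle.loads(fake_redis.get("message_store:session_1"))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;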

&lt;p&gt;The full suite (including the compatibility, performance, and concurrency modules) now lives in our CI pipeline. FakeRedis lets us run everything instantly, and the moment anyone bumps a LangChain version we catch serialization regressions before they ever reach production. Since that 2 AM wake-up call, we haven’t lost a single conversation to a silent pickle bug again.&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>"How a Refresh Wiped Out 237 Drafts — and How We Used Playwright to Stop It Forever"</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Mon, 04 May 2026 12:08:35 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/how-a-refresh-wiped-out-237-drafts-and-how-we-used-playwright-to-stop-it-forever-1ncm</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/how-a-refresh-wiped-out-237-drafts-and-how-we-used-playwright-to-stop-it-forever-1ncm</guid>
      <description>&lt;p&gt;At 2 AM, I was jolted awake by a call from operations. Our user community was on fire: someone had spent half an hour filling out a complex form, accidentally hit refresh, and all their drafts vanished. A backend check showed 237 drafts reduced to just three. The backend wasn't to blame — the database never received a single request. The culprit was our frontend memory storage, and we had &lt;strong&gt;zero automated tests covering it&lt;/strong&gt;. Later, we added automated tests for localStorage and IndexedDB persistence using Playwright, and the same kind of incident never happened again. In this article, I'll walk you through the complete "memory storage health check" blueprint — code included.&lt;/p&gt;




&lt;h2&gt;
  
  
  Breaking Down the Problem: Why Memory Storage Fails Silently
&lt;/h2&gt;

&lt;p&gt;Frontend memory storage broadly refers to using &lt;code&gt;localStorage&lt;/code&gt;, &lt;code&gt;sessionStorage&lt;/code&gt;, &lt;code&gt;IndexedDB&lt;/code&gt;, or state management persistence (like Pinia persist plugins) to cache user input so data isn't lost when the page is refreshed or closed. It’s what lets you hit F5 by accident and still see what you were typing — a baseline experience for any modern web app.&lt;/p&gt;

&lt;p&gt;But its fragility is often underestimated. In our incident, the root cause was simple: a code refactor changed the serialization key for drafts. The old key was no longer read, so after a refresh the app assumed “no draft exists” and wrote an empty state, wiping everything. Conventional manual testing never covered this path because testers always fill forms from scratch — they don't deliberately refresh a half-filled form and then check for restoration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why common approaches fell short:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unit tests&lt;/strong&gt;: mocking &lt;code&gt;localStorage&lt;/code&gt; can’t replicate real browser storage behavior, storage quotas, or serialization quirks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;E2E tests (Cypress/Playwright)&lt;/strong&gt;: they typically follow the “happy path” — fill from a blank state and submit — without intentionally triggering refresh or crash-recovery scenarios.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual verification&lt;/strong&gt;: you can’t guarantee someone clears storage, fills half a form, refreshes, and validates restoration on every regression cycle. It’s too costly and easy to miss.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We needed an &lt;strong&gt;automated, repeatable test suite that could assert storage contents&lt;/strong&gt; — something purpose-built to guard memory storage reliability.&lt;/p&gt;




&lt;h2&gt;
  
  
  Designing the Solution: Why Playwright for Memory Storage Testing
&lt;/h2&gt;

&lt;p&gt;We evaluated three approaches:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Browser extensions + script injection&lt;/strong&gt;: too hacky, impossible to integrate into CI, and can’t accurately simulate real user journeys.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cypress APIs like &lt;code&gt;cy.clearLocalStorage()&lt;/code&gt;&lt;/strong&gt;: they can manipulate storage, but IndexedDB support is weaker, and the execution model doesn’t produce a truly “native” refresh scenario.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Playwright&lt;/strong&gt;: native support for multiple browsers, isolated contexts, &lt;code&gt;page.evaluate()&lt;/code&gt; to read &lt;code&gt;localStorage&lt;/code&gt;/&lt;code&gt;IndexedDB&lt;/code&gt; directly, and &lt;code&gt;page.reload()&lt;/code&gt; that triggers a true page refresh. Plus, the test scripts are plain Node.js, seamlessly pluggable into existing CI pipelines.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Our final decision: &lt;strong&gt;write dedicated “memory storage regression cases” with Playwright, simulating the core path “fill → refresh → verify restoration” and running them automatically on every test pass&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The architecture is straightforward:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each test case creates an isolated browser context (&lt;code&gt;browser.newContext()&lt;/code&gt;) to guarantee a clean storage environment.
&lt;/li&gt;
&lt;li&gt;Each case follows three steps: &lt;strong&gt;write (simulate user input triggering auto-save) → refresh → read (assert that stored drafts match the form values after reload)&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;We designed reusable assertion helpers for both &lt;code&gt;localStorage&lt;/code&gt; and &lt;code&gt;IndexedDB&lt;/code&gt; (sketched right after this list).&lt;/li&gt;
&lt;/ul&gt;
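
&lt;p&gt;A sketch of what those helpers can look like (file and function names are illustrative; both read storage from inside the page via &lt;code&gt;page.evaluate()&lt;/code&gt;):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// helpers/storage.ts - reusable assertion helpers (names are illustrative)
import { Page, expect } from '@playwright/test';

// Assert that a localStorage key exists and its parsed JSON equals `expected`
export async function expectLocalStorage(page: Page, key: string, expected: unknown) {
  const raw = await page.evaluate((k) =&amp;gt; localStorage.getItem(k), key);
  expect(raw, `localStorage["${key}"] should exist`).not.toBeNull();
  expect(JSON.parse(raw as string)).toEqual(expected);
}

// Read a single record out of IndexedDB from inside the page context
export function readIndexedDb(page: Page, db: string, store: string, key: string) {
  return page.evaluate(
    ([d, s, k]) =&amp;gt;
      new Promise((resolve, reject) =&amp;gt; {
        const req = indexedDB.open(d);
        req.onerror = () =&amp;gt; reject(req.error);
        req.onsuccess = () =&amp;gt; {
          const get = req.result.transaction(s).objectStore(s).get(k);
          get.onsuccess = () =&amp;gt; resolve(get.result);
          get.onerror = () =&amp;gt; reject(get.error);
        };
      }),
    [db, store, key]
  );
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;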




&lt;h2&gt;
  
  
  Core Implementation: A Full-Body Checkup for Memory Storage
&lt;/h2&gt;

&lt;p&gt;Let’s build a runnable memory storage test suite with Playwright. &lt;strong&gt;The core problem it solves: verifying that after a user fills out a form and it’s auto-saved to localStorage, the draft survives a page refresh intact.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Simulated Page Logic
&lt;/h3&gt;

&lt;p&gt;To make the test runnable, here’s a minimal HTML page with auto-save to &lt;code&gt;localStorage&lt;/code&gt; and restoration on load. Save it as &lt;code&gt;test-app/index.html&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;!DOCTYPE html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;body&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;form&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"draftForm"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"title"&lt;/span&gt; &lt;span class="na"&gt;placeholder=&lt;/span&gt;&lt;span class="s"&gt;"标题"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;textarea&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"content"&lt;/span&gt; &lt;span class="na"&gt;placeholder=&lt;/span&gt;&lt;span class="s"&gt;"内容"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/textarea&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/form&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;script&amp;gt;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;form&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;draftForm&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;title&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;content&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;STORAGE_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;draft_v1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// 页面加载时恢复草稿&lt;/span&gt;
    &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;restore&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;saved&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;localStorage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getItem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;STORAGE_KEY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;saved&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;draft&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;saved&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;draft&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;draft&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// 输入变化自动保存&lt;/span&gt;
    &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;autoSave&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;draft&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="nx"&gt;localStorage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setItem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;STORAGE_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;draft&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;input&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;autoSave&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;input&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;autoSave&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nf"&gt;restore&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// 立刻恢复&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/body&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/html&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Playwright Test: Verifying Drafts Survive a Refresh
&lt;/h3&gt;

&lt;p&gt;Install Playwright: &lt;code&gt;npm i -D playwright @playwright/test&lt;/code&gt;&lt;br&gt;&lt;br&gt;
Create a &lt;code&gt;playwright.config.ts&lt;/code&gt; that points to the test directory and the base URL.&lt;/p&gt;

&lt;p&gt;The test below directly verifies refresh recovery — &lt;strong&gt;if this case fails, your live memory storage is broken&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// tests/memory-storage.spec.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@playwright/test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;DRAFT_TEXT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;这是一段重要的草稿内容，不能丢&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;记忆存储 - 草稿恢复&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;beforeEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>I Spent 6 Hours Fixing LangChain's ConversationBufferMemory — Here's the Automated Test You Need</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Mon, 04 May 2026 01:07:14 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/i-spent-6-hours-fixing-langchains-conversationbuffermemory-heres-the-automated-test-you-need-16m1</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/i-spent-6-hours-fixing-langchains-conversationbuffermemory-heres-the-automated-test-you-need-16m1</guid>
      <description>&lt;p&gt;At 4:59 PM on a Friday, I was about to close my laptop and sneak out when the QA colleague's icon flashed on DingTalk: "Come check this out. The support bot remembers I'm Zhang San, but when I ask for my order number, it insists it belongs to Li Si." I pulled up the logs and saw LangChain's &lt;code&gt;ConversationBufferMemory&lt;/code&gt; behaving like it had severe amnesia — Session A was mixing up chat history from Session B. In that moment, I knew that unless I built an automated test suite to lock down the accuracy and consistency of memory storage, the next blow-up would definitely happen at 2 AM.&lt;/p&gt;

&lt;h2&gt;
  
  
  Breaking Down the Problem
&lt;/h2&gt;

&lt;p&gt;In LLM-powered chat products, the memory module is responsible for remembering context across multiple turns — so that when the user said earlier "I live in Beijing," the weather query later can automatically include "Beijing." Sounds easy, but things get messy once you land in LangChain: &lt;code&gt;ConversationBufferMemory&lt;/code&gt; stores all conversations in plain text. It works fine as long as the memory fits in RAM, but switch to Redis or a database for persistence, and a whole bunch of issues bubble up — serialization/deserialization, concurrent reads/writes, and trimming old messages.&lt;/p&gt;

&lt;p&gt;In our production scenario, a customer service bot handled hundreds of concurrent users. Each user session was independent but they all shared a common Redis instance. When we first launched, QA manually tested a dozen typical conversation paths and found no cross-session memory leaks at all. That was no comfort: manual testing simply can't cover race conditions under high concurrency, nor reproduce edge cases where &lt;code&gt;trim_messages&lt;/code&gt; mixes up adjacent sessions when a Redis connection blinks out. Once real traffic hit, bugs popped up like whack-a-mole — you fixed one, another sprang out. We desperately needed a set of regression tests that could directly verify memory read/write accuracy and cross-session isolation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Designing the Solution
&lt;/h2&gt;

&lt;p&gt;The goal was clear: run through the core logic of the memory module right in our local CI, without a real LLM or a real Redis instance, and catch issues before any code landed.&lt;/p&gt;

&lt;p&gt;Framework choice was a no-brainer — Pytest. Its fixture capabilities are perfect for assembling different memory instances. LangChain's memory abstraction is fairly clean: &lt;code&gt;BaseChatMemory&lt;/code&gt; provides uniform &lt;code&gt;save_context&lt;/code&gt; and &lt;code&gt;load_memory_variables&lt;/code&gt; interfaces, so we could write the same set of tests against different memory backends. A real Redis is too heavy, so we chose &lt;code&gt;fakeredis&lt;/code&gt; to simulate a Redis instance in memory — quick to spin up and zero side effects. All LLM calls were banished with &lt;code&gt;unittest.mock&lt;/code&gt;, because we were testing memory, not the LLM.&lt;/p&gt;

&lt;p&gt;Why not use the built-in &lt;code&gt;langchain.tests&lt;/code&gt;? They only cover the shallowest of interfaces, none of the hard-won scenarios like message type conversion or multi-session isolation. We also didn't want to run Redis in a Docker container — our CI resources are already stretched thin; adding one more container would jam the build queue by an extra 3 minutes.&lt;/p&gt;

&lt;p&gt;The overall architecture: define a &lt;code&gt;fake_redis_memory&lt;/code&gt; fixture inside Pytest's &lt;code&gt;conftest.py&lt;/code&gt;, use it to construct different Memory subclasses (&lt;code&gt;ConversationBufferMemory&lt;/code&gt;, &lt;code&gt;ConversationSummaryMemory&lt;/code&gt;), simulate multi-turn conversations with helper functions, and then assert that the history returned by &lt;code&gt;load_memory_variables&lt;/code&gt; is both complete and free of cross-session contamination.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Building a Zero-Dependency Test Harness
&lt;/h3&gt;

&lt;p&gt;This snippet packages fakeredis, mock LLM, and Memory instantiation into a fixture. All subsequent test cases run on top of it. The non-negotiable requirement: zero network requests, and any single test completes in under 0.3 seconds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# conftest.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;unittest.mock&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MagicMock&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.memory&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ConversationBufferMemory&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.chat_message_histories&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RedisChatMessageHistory&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fakeredis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FakeRedis&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fake_redis_memory&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# 用 fakeredis 构建一个假 Redis 客户端
&lt;/span&gt;    &lt;span class="n"&gt;fake_redis_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FakeRedis&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_create_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# 注入伪造的 Redis，保证每次测试的 session 隔离
&lt;/span&gt;        &lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RedisChatMessageHistory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fake_redis_client&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# ConversationBufferMemory 默认 return_messages=True 时，会返回 Message 对象
&lt;/span&gt;        &lt;span class="n"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ConversationBufferMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;chat_memory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;return_messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;  &lt;span class="c1"&gt;# 关键：确保拿到结构化消息，方便断言
&lt;/span&gt;        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_create_memory&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Testing Accuracy: Every Message Written Must Come Back
&lt;/h3&gt;

&lt;p&gt;This test simulates two rounds of conversation and verifies that the history returned by &lt;code&gt;load_memory_variables&lt;/code&gt; has the exact length and content we expect. It puts an end to the mysterious "I stored two lines but only got one back" bug.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
# test_memory_accuracy.py
from langchain.schema import HumanMessage, AIMessage

def test_buffer_memory_keeps_all_messages(fake_redis_memory):
    memory = fake_redis_memory("session_1202")

    # 模拟第一轮对话
    memory.save_context(
        {"input": "我叫张三"},
        {"output": "你好张三"}
    )
    # 模拟第二轮对话
    memory.save_context(
        {"input": "我的订单号是多少"},
        {"output": "你的订单号是 #1123"}
    )

    variables = memory.load_memory_variables({})
    history = variables.get("history", [])

    # 断言：总共应该有 4 条消息（两问两答）
    assert len(history) == 4
    assert isin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
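
&lt;p&gt;The other hard-won scenario from earlier, multi-session isolation, gets its own case. A minimal sketch; the session IDs are illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# test_memory_isolation.py
def test_sessions_do_not_leak(fake_redis_memory):
    mem_a = fake_redis_memory("session_a")
    mem_b = fake_redis_memory("session_b")

    # Write only into session A
    mem_a.save_context({"input": "A's secret"}, {"output": "noted"})

    # Session B must come back empty: no cross-session contamination
    history_b = mem_b.load_memory_variables({}).get("history", [])
    assert history_b == []
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;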

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>Scaling Rate Limiting from Single‑Node to a Distributed Go+Redis Token Bucket — 10x Throughput Under Load (with Degradation Strategy)</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Sun, 03 May 2026 12:08:09 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/scaling-rate-limiting-from-single-node-to-a-distributed-goredis-token-bucket-10x-throughput-ffg</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/scaling-rate-limiting-from-single-node-to-a-distributed-goredis-token-bucket-10x-throughput-ffg</guid>
      <description>&lt;p&gt;At 2 AM, an alert pulled me out of bed — the database connection pool of our order service was exhausted, and most requests were returning 504. It turned out a marketing campaign was driving triple the usual traffic. Our in‑memory per‑instance token bucket rate limiter, deployed across three replicas, operated in isolation; global rate limiting was effectively non‑existent. That moment I realized: &lt;strong&gt;if the state is not shared, rate limiting is just an illusion.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Breaking Down the Problem
&lt;/h2&gt;

&lt;p&gt;This is distressingly common in microservices. To protect downstream services, teams often set a limit like “max 200 QPS per instance”. Deploy three instances, and you might assume global traffic will be capped at 600 QPS. In reality, load balancing is rarely perfectly even: one instance exhausts its 200 QPS quota and starts rejecting legitimate traffic while the other two still have headroom. Worse, token buckets permit bursts up to bucket capacity, and those bursts stack across replicas, so the instantaneous peak hitting the downstream can easily exceed 900 QPS. This is the fatal flaw of per‑instance rate limiting at scale: &lt;strong&gt;the limiting logic is chopped up by instance boundaries, becoming “paper‑only” rate limiting from a global perspective.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The root cause is simple: the token bucket’s &lt;strong&gt;current token count and last refill timestamp&lt;/strong&gt; live purely in memory and are not shared across instances. A typical Redis fixed‑window counter (INCR + EXPIRE) can share state, but it suffers from boundary spikes — the last 100 ms of one second and the first 100 ms of the next can overlap to produce a burst of twice the allowed rate, still dangerous for downstream systems. We needed a solution that shares state &lt;em&gt;and&lt;/em&gt; smooths traffic — a distributed token bucket.&lt;/p&gt;
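
&lt;p&gt;To make that boundary spike concrete, here is a minimal, dependency-free simulation (sketched in Python for brevity; the production implementation below is Go):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# fixed_window_demo.py -- why INCR-style fixed windows burst at boundaries
LIMIT = 100  # allowed requests per 1-second window

def allowed(counters, now_ms):
    window = now_ms // 1000  # window key = the current second
    counters[window] = counters.get(window, 0) + 1
    return counters[window] &lt;= LIMIT

counters = {}
# 100 requests in the last 100 ms of second 0, 100 more in the first 100 ms of second 1
burst = [900] * 100 + [1100] * 100
passed = sum(allowed(counters, t) for t in burst)
print(passed)  # 200: twice the per-second limit squeezed into ~200 ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;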

&lt;h2&gt;
  
  
  Design
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Choice: Go + Redis + Lua script for a distributed token bucket.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Why not the other options?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nginx/gateway‑level rate limiting&lt;/strong&gt;: Adds a proxy hop and sits outside the business logic, making fine‑grained controls (e.g., mixed limiting by user and API) hard to implement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pure Redis sliding window&lt;/strong&gt;: Doable with sorted sets, but you must constantly evict expired members, incurring memory and CPU overhead, and the algorithmic complexity often introduces performance bottlenecks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Go distributed rate‑limiting libraries&lt;/strong&gt;: Many are unmaintained or only support simple fixed‑window counters, lacking the flexibility of a token bucket.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The final architecture is straightforward: move the token bucket’s core state (&lt;code&gt;tokens&lt;/code&gt;, &lt;code&gt;last_refill_time&lt;/code&gt;) into Redis, and use a Lua script to atomically calculate and update them. Thanks to Redis’s single‑threaded execution model, concurrent requests are serialized safely no matter how many instances issue them. The application side wraps this in a &lt;code&gt;DistributedTokenBucket&lt;/code&gt; struct that integrates a built‑in degradation strategy: &lt;strong&gt;when Redis is unavailable (timeout, disconnection), it automatically falls back to a local &lt;code&gt;golang.org/x/time/rate&lt;/code&gt; token bucket.&lt;/strong&gt; Even if Redis completely goes down, downstream services are not overwhelmed: we degrade to single‑instance rate limiting, preserving the fundamental protection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Implementation
&lt;/h2&gt;

&lt;p&gt;The following Lua script handles the atomic “token generation + consumption check” step. It accepts the timestamp as an argument to avoid relying on potentially inconsistent system clocks across instances. (Using &lt;code&gt;redis.call('TIME')&lt;/code&gt; is also an option, depending on your consistency paranoia.)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// 这段代码解决：如何用一段 Lua 保证“计算新增令牌 -&amp;gt; 判断是否足够 -&amp;gt; 扣减”的原子性&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;tokenBucketLua&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;`
local key       = KEYS[1]              -- token bucket key
local rate      = tonumber(ARGV[1])    -- tokens minted per second
local capacity  = tonumber(ARGV[2])    -- bucket capacity
local now       = tonumber(ARGV[3])    -- current timestamp (milliseconds)
local requested = tonumber(ARGV[4])    -- tokens requested

local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(bucket[1])
local last_refill = tonumber(bucket[2])

if tokens == nil then
    -- first access: initialize the token bucket
    tokens = capacity
    last_refill = now
end

-- compute elapsed time and newly minted tokens
local delta = math.max(0, now - last_refill)
local new_tokens = math.floor(delta * rate / 1000)
tokens = math.min(capacity, tokens + new_tokens)

local allowed = 0
if tokens &amp;gt;= requested then
    tokens = tokens - requested
    allowed = 1
end

-- update the state in Redis and set a sane TTL so cold keys don't linger
redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now)
redis.call('EXPIRE', key, 60)

return {allowed, tokens}
`&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, the Go struct and the core &lt;code&gt;Take&lt;/code&gt; method. Its responsibility is to execute the Lua script, handle Redis errors, and trigger the fallback path when Redis is not healthy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// 这段代码解决：封装 Redis 调用，提供限流入口，并在 Redis 不可用时降级到本地令牌桶&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"context"&lt;/span&gt;
    &lt;span class="s"&gt;"errors"&lt;/span&gt;
    &lt;span class="s"&gt;"time"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/redis/go-redis/v9"&lt;/span&gt;
    &lt;span class="s"&gt;"golang.org/x/time/rate"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;DistributedTokenBucket&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;rdb&lt;/span&gt;        &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;
    &lt;span class="n"&gt;script&lt;/span&gt;     &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Script&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt;        &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;rate&lt;/span&gt;       &lt;span class="kt"&gt;float64&lt;/span&gt; &lt;span class="c"&gt;// 令牌/秒&lt;/span&gt;
    &lt;span class="n"&gt;capacity&lt;/span&gt;   &lt;span class="kt"&gt;int&lt;/span&gt;     &lt;span class="c"&gt;// 桶容量&lt;/span&gt;
    &lt;span class="n"&gt;fallback&lt;/span&gt;   &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;rate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Limiter&lt;/span&gt; &lt;span class="c"&gt;// 本地降级限流器&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;NewDistributedTokenBucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rdb&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ratePerSec&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;capacity&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;DistributedTokenBucket&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// 本地降级器：容量和速率取全局值的一部分，保护下游&lt;/span&gt;
    &lt;span class="n"&gt;fallbackLimiter&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;rate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratePerSec&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;capacity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;DistributedTokenBucket&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;rdb&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="n"&gt;rdb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;script&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewScript&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokenBucketLua&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;rate&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;     &lt;span class="n"&gt;ratePerSec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;capacity&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;capacity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;fallback&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fallbackLimiter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;DistributedTokenBucket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Take&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnixMilli&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;script&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rdb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;capacity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// Redis 不可用时，降级为本地令牌桶&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fallback&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Allow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fallback&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Allow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fallback&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Allow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;allowed&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This design keeps the happy path fully distributed and cooperative, while the unhappy path keeps the system alive: a Redis outage degrades us to per-instance limiting, but it never strips away &lt;em&gt;all&lt;/em&gt; protection.&lt;/p&gt;

&lt;p&gt;In our load tests, replacing the old per‑instance token bucket with this distributed implementation allowed us to safely absorb a 10x increase in global QPS without crashing the downstream. The fallback kicked in seamlessly during Redis failover, proving that the “paper‑only” days were over.&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>Slash Multi-Level Cache Debugging Time by 90% with Pytest Parametrization</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Sun, 03 May 2026 01:08:38 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/slash-multi-level-cache-debugging-time-by-90-with-pytest-parametrization-28kh</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/slash-multi-level-cache-debugging-time-by-90-with-pytest-parametrization-28kh</guid>
      <description>&lt;p&gt;The winter in Hangzhou is miserably damp. At 1:47 AM, I was jolted awake by an alert SMS — “User profile page returning mixed values, user A is seeing user B’s orders.” My gut told me it was cache corruption again. After digging around for a while, I found that the invalidation logic between the local &lt;code&gt;lru_cache&lt;/code&gt; and Redis had missed a single &lt;code&gt;delete&lt;/code&gt; in one branch. I had to manually run dozens of test cases just to reproduce it. The next day, I refactored these tests using Pytest parameterization, turning “manual brain exhaustion” into “automated machine exhaustion.” I’ve never lost sleep over this issue since. This article is about &lt;strong&gt;how to use Pytest parameterization to achieve zero-blind-spot testing for multi‑level cache (local + Redis) consistency verification&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Manually Testing Multi‑Level Caches Is a Bottomless Pit
&lt;/h2&gt;

&lt;p&gt;Multi‑level caching is a common pattern: read requests first check a local store (e.g., &lt;code&gt;lru_cache&lt;/code&gt; or &lt;code&gt;cachetools&lt;/code&gt;); on a miss, they hit Redis and then backfill the local cache. Writes update Redis and &lt;strong&gt;selectively invalidate&lt;/strong&gt; the local cache. That "selective invalidation" is a hotbed for bugs — you often skip clearing the local cache on certain update paths for performance reasons, and then a path you thought was safe suddenly breaks.&lt;/p&gt;

&lt;p&gt;For example, an endpoint that changes a username deletes only the Redis key &lt;code&gt;user:{id}&lt;/code&gt;, but the local cache key happens to be &lt;code&gt;user_profile:{id}&lt;/code&gt;, so the delete never touches the local entry. More subtly, the local TTL is very short: during peak hours, high QPS constantly rebuilds the cache and masks the inconsistency; it’s only exposed late at night when traffic drops. Behavior in the test environment and in production looks completely different.&lt;/p&gt;

&lt;p&gt;Typical manual testing needs to cover: multi‑key mappings, reads after concurrent updates, backfill on cache miss, TTL expiration boundaries, in‑process mutual exclusion, and more. A human brain can enumerate maybe 20 combinations and still often falls short. Pytest parametrization automates this entire process, and &lt;strong&gt;test cases double as documentation&lt;/strong&gt;, so even newcomers understand them in seconds.&lt;/p&gt;




&lt;h2&gt;
  
  
  Design: Use &lt;code&gt;@pytest.mark.parametrize&lt;/code&gt; to Build a “Scenario Matrix”
&lt;/h2&gt;

&lt;p&gt;My goal wasn’t to test the caching middleware itself but to &lt;strong&gt;verify that the business logic’s composition is correct&lt;/strong&gt;. So I adopted a layered testing approach:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fake Redis&lt;/strong&gt; (using the &lt;code&gt;fakeredis&lt;/code&gt; library) to eliminate external dependencies and let tests run directly in CI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The system under test is a &lt;code&gt;CacheManager&lt;/code&gt; class&lt;/strong&gt; that encapsulates the strategy: “local read → Redis read → local backfill” as well as “write Redis + local cleanup.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test cases are generated via parameterization&lt;/strong&gt;, covering: whether a key hits in local store, whether it hits Redis, whether backfill occurs, whether the local cache is correctly deleted after a write, and whether dirty reads happen under concurrent access.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Why not use integration tests against a real Redis? &lt;strong&gt;Speed&lt;/strong&gt;. These parameterized cases will eventually cover hundreds of combinations; a unit test must complete in milliseconds, otherwise nobody will run them frequently. And no Docker dependency means what‑you‑see‑is‑what‑you‑get.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core Implementation: Multi‑Level Cache Class + Pytest Parameterized Tests
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The &lt;code&gt;CacheManager&lt;/code&gt; under test (ready to run)
&lt;/h3&gt;

&lt;p&gt;This code implements the read path (“local first, then remote”) and the write path (“remote first, then clear local”).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# cache_manager.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;functools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;lru_cache&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;redis_lib&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CacheManager&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;本地(LRU) + Redis 两级缓存管理器&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;redis_lib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;local_ttl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis_client&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;local_ttl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;local_ttl&lt;/span&gt;
        &lt;span class="c1"&gt;# 本地缓存，最多存 128 个 key，用于实际业务限制内存
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_local_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_local_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;从本地字典读，并检查过期时间&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_local_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;local_ttl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;del&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_local_store&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_local_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_local_store&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_local_delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_local_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# 1. 先查本地
&lt;/span&gt;        &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_local_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;

        &lt;span class="c1"&gt;# 2. 再查 Redis
&lt;/span&gt;        &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# 3. 回填本地缓存，注意解码
&lt;/span&gt;            &lt;span class="n"&gt;decoded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_local_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decoded&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decoded&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# 先写远程，再清本地，保证下次本地读强一致
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# 这里故意只清本地，依赖下次 get 回填
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_local_delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Pytest parameterized tests – covering read‑write combinations
&lt;/h3&gt;

&lt;p&gt;The code below solves the problem of &lt;strong&gt;exhaustively iterating all permutations: “local hit/miss × Redis hit/miss × read‑after‑write,”&lt;/strong&gt; verifying both the correctness of the returned values and the backfill logic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# test_cache_consistency.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;redis_lib&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fakeredis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FakeRedis&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;cache_manager&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CacheManager&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fake_redis&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
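
&lt;p&gt;Continuing &lt;code&gt;test_cache_consistency.py&lt;/code&gt;, a condensed sketch of the read-path matrix described above (the scenario values are illustrative, not the full suite):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;@pytest.mark.parametrize(
    "preload_local, preload_redis, expected",
    [
        (True,  True,  "local_value"),   # local hit wins; Redis never consulted
        (False, True,  "redis_value"),   # local miss -&gt; Redis hit -&gt; backfill
        (False, False, None),            # miss everywhere
    ],
)
def test_read_path(fake_redis, preload_local, preload_redis, expected):
    mgr = CacheManager(fake_redis)
    if preload_redis:
        fake_redis.set("user:1", "redis_value")
    if preload_local:
        mgr._local_set("user:1", "local_value")

    assert mgr.get("user:1") == expected
    if not preload_local and preload_redis:
        # Backfill happened: the second read must be served locally
        assert mgr._local_get("user:1") == "redis_value"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;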



</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>From 800 Lines of Shell to 30 Lines of Pytest: 10x Redis Persistence Testing Efficiency</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Sat, 02 May 2026 12:08:06 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/from-800-lines-of-shell-to-30-lines-of-pytest-10x-redis-persistence-testing-efficiency-5e9k</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/from-800-lines-of-shell-to-30-lines-of-pytest-10x-redis-persistence-testing-efficiency-5e9k</guid>
      <description>&lt;p&gt;It was 2 a.m. when I got jolted awake by an alerting call—all user points data had rolled back by three hours. After digging for ages, I found that ops had tweaked the &lt;code&gt;save&lt;/code&gt; parameter in &lt;code&gt;redis.conf&lt;/code&gt;, changing the RDB snapshot interval from 5 minutes to 3 hours. When the node restarted, a massive amount of hot data simply evaporated. What made it worse: this configuration change had been “tested manually”. A colleague restarted Redis, saw that the keys were still there, and called it good. I cursed at the screen: “What’s the point of testing if you test like this?”&lt;/p&gt;

&lt;p&gt;The next day, I tore down the entire persistence verification setup and rebuilt it with &lt;strong&gt;pytest + Docker&lt;/strong&gt; as an automated test suite. &lt;strong&gt;What used to take 800 lines of Shell and 2 hours of environment tweaking now runs in a few minutes with 30 lines of pytest.&lt;/strong&gt; Best of all, any reckless change to the persistence configuration can be proven within 10 seconds—did we lose data or not?&lt;/p&gt;

&lt;h2&gt;
  
  
  Breaking down the problem: why manual Shell/Docker persistence tests are basically useless
&lt;/h2&gt;

&lt;p&gt;Redis persistence comes in three flavors: RDB, AOF, and a mix of both, plus a jungle of parameters like &lt;code&gt;save&lt;/code&gt;, &lt;code&gt;appendfsync&lt;/code&gt;, &lt;code&gt;aof-use-rdb-preamble&lt;/code&gt;, and many more—combinatorial explosion. Most teams verify persistence in one of two ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Manually starting and stopping Docker containers&lt;/strong&gt;, writing a few items with &lt;code&gt;redis-cli&lt;/code&gt;, doing &lt;code&gt;docker restart&lt;/code&gt;, then running &lt;code&gt;KEYS *&lt;/code&gt;—which only proves “it can start”, not “how many seconds of data disappeared”.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Writing a pile of Shell scripts&lt;/strong&gt; that use &lt;code&gt;docker exec&lt;/code&gt; to drive &lt;code&gt;redis-cli&lt;/code&gt; and then &lt;code&gt;diff&lt;/code&gt; the data—scripts that get bloated and are brittle because the environment changes every time: &lt;code&gt;docker stop&lt;/code&gt; wait time, file cleanup policies—even a minor change makes results unpredictable.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The root cause is clear: &lt;strong&gt;Redis persistence is the product of a time window, system signals, and filesystem flushing. Manual operation simply can’t control these precisely.&lt;/strong&gt; For example, &lt;code&gt;docker stop&lt;/code&gt; sends SIGTERM to the container by default; when Redis receives it, it tries to perform an RDB save. But how long does that save take? Will it be cut off by SIGKILL? A Shell script has no ability to simulate fault scenarios like “how much data is lost at the moment of a crash.” Even more importantly, &lt;strong&gt;consistency verification lacks repeatable assertions&lt;/strong&gt;—manual testing only gives you a gut feeling that “probably nothing was lost.” That’s a landmine for production.&lt;/p&gt;
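
&lt;p&gt;This is exactly the control that &lt;code&gt;docker-py&lt;/code&gt; gives you programmatically. A minimal sketch (the container name is hypothetical): SIGKILL drops the process with no chance to write a final RDB, which is the crash we want to simulate, whereas &lt;code&gt;docker stop&lt;/code&gt;'s SIGTERM gives Redis time to save.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import docker

client = docker.from_env()
container = client.containers.get("redis-rdb-test")  # hypothetical name

# SIGKILL: the process dies instantly; no shutdown-time RDB save happens
container.kill(signal="SIGKILL")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;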

&lt;h2&gt;
  
  
  Solution design: why pytest + Docker, not Testcontainers or a K8s Job?
&lt;/h2&gt;

&lt;p&gt;I wanted a &lt;strong&gt;programmable, assertable, reproducible&lt;/strong&gt; test framework with these core requirements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Precisely control Redis startup parameters and persistence configuration&lt;/li&gt;
&lt;li&gt;Simulate real-world failures: &lt;code&gt;kill -9&lt;/code&gt;, power-off-style shutdown, AOF file truncation, etc.&lt;/li&gt;
&lt;li&gt;Automatically clean up the environment after a run—no leftover garbage&lt;/li&gt;
&lt;li&gt;Run in CI/CD, but also instantly on a dev machine&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Technology comparison:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Solution&lt;/th&gt;
&lt;th&gt;Pros&lt;/th&gt;
&lt;th&gt;Verdict&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Shell + docker-compose&lt;/td&gt;
&lt;td&gt;Team familiarity&lt;/td&gt;
&lt;td&gt;Weak assertions, unable to precisely control restarts and signals, shell script maintenance nightmare&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testcontainers (Python)&lt;/td&gt;
&lt;td&gt;Native pytest integration, good lifecycle management&lt;/td&gt;
&lt;td&gt;Parameters can only be tweaked through &lt;code&gt;redis-cli&lt;/code&gt; after startup, so dynamic config changes (e.g., toggling AOF) need yet another wrapper; the extra abstraction layer also drives up debugging cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kubernetes Job&lt;/td&gt;
&lt;td&gt;Production-grade&lt;/td&gt;
&lt;td&gt;Too heavy, can’t run locally, CI needs a K8s cluster – using a sledgehammer to crack a nut&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;docker-py + pytest&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lightweight, programmable container lifecycle control, native Python assertions&lt;/td&gt;
&lt;td&gt;This is the one I chose. Use the &lt;code&gt;docker&lt;/code&gt; SDK to start/stop containers and manage volumes, &lt;code&gt;redis-py&lt;/code&gt; for data read/write, pytest fixtures for environment injection. The whole solution is under 500 lines of Python, and on CI it only depends on a Docker daemon.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Architecturally, I split the tests into three layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure layer&lt;/strong&gt;: &lt;code&gt;docker-py&lt;/code&gt; creates Redis containers, mounts temporary volumes for RDB/AOF files.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operation layer&lt;/strong&gt;: &lt;code&gt;redis-py&lt;/code&gt; writes, reads, issues &lt;code&gt;CONFIG SET&lt;/code&gt;, &lt;code&gt;BGSAVE&lt;/code&gt;, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Assertion layer&lt;/strong&gt;: pytest asserts whether data exists, whether files were created, whether the AOF contains the last write.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This layering lets test cases focus only on “write data → how it dies → is the data correct after restart,” without caring about how the container starts or what mount paths are used.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core implementation: ready-to-run test code
&lt;/h2&gt;

&lt;p&gt;The following code addresses one problem: &lt;strong&gt;verify that after a Redis process is killed with &lt;code&gt;kill -9&lt;/code&gt;, all data written after the last BGSAVE is lost as expected—and no extra loss occurs&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. conftest.py: managing the Redis container lifecycle with a fixture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
# conftest.py
import pytest
import docker
import redis
import time
import os

REDIS_IMAGE = "redis:7.2"  # 固定版本，避免 CI 上拉取 latest 导致不一致

@pytest.fixture(scope="function")
def rdb_container(tmp_path):
    """
    启动一个配置了 RDB 持久化的 Redis 容器，数据文件写入临时目录。
    tmp_path 是 pytest 提供的临时路径，每个测试函数独立，互不干扰。
    """
    client = docker.from_env()
    data_dir = tmp_path / "data"
    data_dir.mkdir()

    container = client.containers.run(
        image=REDIS_IMAGE,
        name=f"redis-rdb-test-{os.getpid()}",  # 避免容器重名
        command=[
            "redis-server",
            "--save 900 1",        # 900秒内至少1次修改则保存，这里故意设大，手动控制BGSAVE
            "--save 300 10",
            "--save 60 10000",
            "--dir /data",
            "--dbfilename dump.rdb"
        ],
        volumes={str(data_dir): {"bind": "/data", "mode": "rw"}},
        ports={"6379/tcp": None},  # 让 Docker 分配随机端口
        detach=True,
        remov
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
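
&lt;p&gt;With that fixture in place, the &lt;code&gt;kill -9&lt;/code&gt; case stays small. A hedged sketch: it assumes the fixture yields &lt;code&gt;(container, conn, data_dir)&lt;/code&gt; as completed above, and the readiness wait is deliberately crude:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# test_rdb_kill9.py
import time
import docker
import redis

def test_kill9_loses_only_post_bgsave_writes(rdb_container):
    container, conn, data_dir = rdb_container

    conn.set("before", "1")
    conn.bgsave()  # snapshot now: "before" lands in dump.rdb
    while conn.info("persistence")["rdb_bgsave_in_progress"]:
        time.sleep(0.05)

    conn.set("after", "1")            # lives only in memory, newer than the snapshot
    container.kill(signal="SIGKILL")  # crash with no shutdown-time save

    # Bring Redis back up on the same /data volume
    client = docker.from_env()
    revived = client.containers.run(
        "redis:7.2", command=["redis-server", "--dir", "/data"],
        volumes={str(data_dir): {"bind": "/data", "mode": "rw"}},
        ports={"6379/tcp": None}, detach=True, remove=True,
    )
    revived.reload()
    conn2 = redis.Redis(port=int(revived.ports["6379/tcp"][0]["HostPort"]))
    time.sleep(0.5)  # crude readiness wait, fine for a sketch

    assert conn2.get("before") == b"1"  # snapshotted data survives
    assert conn2.get("after") is None   # post-BGSAVE write is lost, as expected
    revived.kill()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;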

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>Uncovering 8% IndexedDB Data Loss After Browser Crashes with Playwright</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Sat, 02 May 2026 01:07:53 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/uncovering-8-indexeddb-data-loss-after-browser-crashes-with-playwright-3j2m</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/uncovering-8-indexeddb-data-loss-after-browser-crashes-with-playwright-3j2m</guid>
      <description>&lt;p&gt;At 2 a.m., our user group exploded — people were saying data had just vanished, as if the browser had “eaten” it. Our frontend stores application state in IndexedDB, which is supposed to be far more reliable than localStorage. How could it disappear without a trace? I spent two hours digging through logs and backend records before zeroing in on a dark secret of browser storage: when disk space gets tight, Chrome will silently delete IndexedDB data without any notification. Worse, you can’t reproduce it by hand because you’re not running on the “chosen” hard drive. I decided to write an automated test with Playwright that simulates browser crashes and storage pressure — and expose IndexedDB’s real behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  Breaking down the problem
&lt;/h2&gt;

&lt;p&gt;IndexedDB was designed to be a client-side persistent storage, and the W3C spec even says “data should be kept as long as possible”. But a spec is one thing; what browser vendors actually implement is another. Chrome has a mechanism called &lt;strong&gt;“Storage Pressure Eviction”&lt;/strong&gt;: when the user’s disk space drops below a certain threshold, the browser evicts data from less “important” origins using an LRU policy. By default, IndexedDB does not request a persistent-storage permission (&lt;code&gt;navigator.storage.persist()&lt;/code&gt;), so it’s very easy to get kicked out. If you haven’t applied for persistent storage permission in a PWA, your database is about as sturdy as a camping tent.&lt;/p&gt;
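
&lt;p&gt;The first line of defense is simply asking for durability. With a Playwright &lt;code&gt;page&lt;/code&gt; in hand, requesting it is a one-liner (a sketch; Chromium may still refuse based on its own heuristics):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Ask the browser not to evict this origin's storage under disk pressure
granted = page.evaluate("() =&gt; navigator.storage.persist()")
print("persistent storage granted:", granted)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;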

&lt;p&gt;Why don’t normal testing approaches work? Because manual testing only covers “normal reads and writes” — it can’t simulate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A sudden browser process crash (kill, power loss)&lt;/li&gt;
&lt;li&gt;The context being unexpectedly destroyed and then restarted (user closing a tab and reopening it)&lt;/li&gt;
&lt;li&gt;The internal cleanup triggered by a disk-space warning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These scenarios require a controlled environment where you can repeatedly run a fast write → destroy → rebuild → verify loop automatically. That’s exactly what Playwright’s Browser Context isolation and its rich CDP (Chrome DevTools Protocol) capabilities are built for.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution design
&lt;/h2&gt;

&lt;p&gt;I didn’t choose Selenium because it’s too heavy and context management feels unnatural. I skipped Puppeteer because Playwright natively supports multiple browsers and multiple contexts with a more modern API. Most importantly, each context created by Playwright’s &lt;code&gt;browser.new_context()&lt;/code&gt; has its own independent storage sandbox — closing that context is equivalent to destroying the entire session’s IndexedDB, perfectly simulating the “user closes browser / tab” action.&lt;/p&gt;

&lt;p&gt;The architecture is a straightforward “brutal loop validation”:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use Playwright to create a persistent context (so it won’t be automatically cleaned up).&lt;/li&gt;
&lt;li&gt;Open the page and inject a script that writes a record with a unique ID and a checksum into IndexedDB, then explicitly call &lt;code&gt;navigator.storage.persist()&lt;/code&gt; to request persistence.&lt;/li&gt;
&lt;li&gt;Actively close that context to simulate a browser close or crash.&lt;/li&gt;
&lt;li&gt;Create a new context, open the same page, read from IndexedDB, and check both data integrity and the number of records.&lt;/li&gt;
&lt;li&gt;Repeat N times, each time writing data of random sizes and occasionally using CDP commands to simulate storage-pressure events.&lt;/li&gt;
&lt;li&gt;Count the number of data-loss events and inconsistencies, then generate a report.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Why not use incognito mode for this? Because IndexedDB in incognito is designed to be wiped on close — testing persistence there would be pure performance art.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core implementation
&lt;/h2&gt;

&lt;p&gt;First, install Playwright and pytest. Then you can run the following three pieces of code directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code 1: IndexedDB utility functions — solving “how to reliably write and make sure it’s actually flushed to disk”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the foundation. Inside &lt;code&gt;page.evaluate()&lt;/code&gt; we wrap the entire IndexedDB transaction lifecycle in a Promise, ensuring the data is committed before returning.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# idb_helpers.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;playwright.sync_api&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Page&lt;/span&gt;

&lt;span class="n"&gt;IDB_WRITE_SCRIPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
async ({ dbName, storeName, key, value }) =&amp;gt; {
    return new Promise((resolve, reject) =&amp;gt; {
        const request = indexedDB.open(dbName, 1);
        request.onupgradeneeded = (event) =&amp;gt; {
            const db = event.target.result;
            if (!db.objectStoreNames.contains(storeName)) {
                db.createObjectStore(storeName, { keyPath: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; });
            }
        };
        request.onsuccess = (event) =&amp;gt; {
            const db = event.target.result;
            // The transaction scope must include storeName, otherwise the write won&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t go through
            const tx = db.transaction(storeName, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;readwrite&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;);
            const store = tx.objectStore(storeName);
            // Store a CRC field inside value to verify consistency later
            store.put({ id: key, data: value, checksum: simpleChecksum(value) });
            tx.oncomplete = () =&amp;gt; resolve(true);
            tx.onerror = (e) =&amp;gt; reject(e);
        };
        request.onerror = (e) =&amp;gt; reject(e);

        function simpleChecksum(str) {
            let hash = 0;
            for (let i = 0; i &amp;lt; str.length; i++) {
                hash = ((hash &amp;lt;&amp;lt; 5) - hash) + str.charCodeAt(i);
                hash |= 0; // Convert to 32bit integer
            }
            return hash;
        }
    });
}
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_indexeddb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;store_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IDB_WRITE_SCRIPT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;store_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why did we add a checksum here? Because we need to know not just that a record still exists after a crash, but that its contents came back intact. The checksum lets the read-back step detect corruption, which is an even nastier failure mode than plain disappearance.&lt;/p&gt;
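
&lt;p&gt;The read side mirrors the write helper. A sketch of the counterpart (&lt;code&gt;IDB_READ_SCRIPT&lt;/code&gt; and &lt;code&gt;read_indexeddb&lt;/code&gt; are names introduced here, following the same single-argument &lt;code&gt;evaluate&lt;/code&gt; convention):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# idb_helpers.py (continued): hypothetical read-back counterpart
IDB_READ_SCRIPT = """
async ({ dbName, storeName, key }) =&gt; {
    return new Promise((resolve, reject) =&gt; {
        const request = indexedDB.open(dbName, 1);
        request.onupgradeneeded = (event) =&gt; {
            // Landing here means the DB was evicted; recreate the store so
            // the transaction below doesn't throw and the read returns null
            event.target.result.createObjectStore(storeName, { keyPath: 'id' });
        };
        request.onsuccess = (event) =&gt; {
            const db = event.target.result;
            const tx = db.transaction(storeName, 'readonly');
            const getReq = tx.objectStore(storeName).get(key);
            getReq.onsuccess = () =&gt; resolve(getReq.result || null);
            getReq.onerror = (e) =&gt; reject(e);
        };
        request.onerror = (e) =&gt; reject(e);
    });
}
"""

def read_indexeddb(page, db_name, store_name, key):
    # Returns the stored record (with its checksum) or None if evicted/lost
    return page.evaluate(IDB_READ_SCRIPT, {"dbName": db_name, "storeName": store_name, "key": key})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;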

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>3 Asyncio Pitfalls That Took Me 3 Hours to Debug and Almost Crashed Production</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Fri, 01 May 2026 20:25:25 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/3-asyncio-pitfalls-that-took-me-3-hours-to-debug-and-almost-crashed-production-1fdm</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/3-asyncio-pitfalls-that-took-me-3-hours-to-debug-and-almost-crashed-production-1fdm</guid>
      <description>&lt;p&gt;Here’s the story: last week my lead asked me to optimize a data aggregation service that calls 20 downstream APIs. The serial version took around 18 seconds — users were ready to throw their keyboards. Obvious IO-bound job, right? I thought I’d slap on asyncio, ship it in half a day, and look like a hero. Instead, I spent three hours falling into every rabbit hole asyncio had to offer, and nearly took down production. This post walks through the three biggest pitfalls I hit and how to write async code that actually works in the real world.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Your Concepts Straight First
&lt;/h2&gt;

&lt;p&gt;At its core, asyncio is a &lt;strong&gt;single-threaded event loop&lt;/strong&gt; — a master scheduler that lines up coroutines. When one coroutine is waiting on IO, the loop politely tells it to step aside and runs whichever coroutine is ready instead. You only need two keywords: &lt;code&gt;async def&lt;/code&gt; to define a coroutine function, and &lt;code&gt;await&lt;/code&gt; to yield control, telling the event loop “I’ll be waiting here, go do something else.”&lt;/p&gt;

&lt;p&gt;Most tutorials show you this perfect‑world example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# simulate network IO
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data from &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean, elegant, 5 requests in 1 second. But the moment you drop this into a real project, things get messy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pitfall 1: &lt;code&gt;await&lt;/code&gt; Inside a Sync Function — And Boom, Errors
&lt;/h2&gt;

&lt;p&gt;I naively added &lt;code&gt;await fetch()&lt;/code&gt; right inside an existing Flask route function. Immediate &lt;code&gt;SyntaxError: 'await' outside async function&lt;/code&gt;. Alright, I’ll just change the route to &lt;code&gt;async def&lt;/code&gt;. Request comes in — &lt;code&gt;RuntimeError: There is no current event loop in thread 'Thread-1'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here’s why: Flask handles requests on a pool of worker threads, and a worker thread has no event loop of its own, which is exactly where the “no current event loop” error comes from. And once a loop &lt;em&gt;is&lt;/em&gt; running in a thread, you can’t start another one on top of it. My view ended up calling &lt;code&gt;asyncio.run(main())&lt;/code&gt; from inside a running loop and triggered a cascade of “event loop already running” errors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What you should do:&lt;/strong&gt; If you can, switch to an async‑native framework like Quart or FastAPI. If you’re stuck with Flask, create a global event loop at startup and schedule work with &lt;code&gt;loop.run_until_complete()&lt;/code&gt;. Or, even simpler: spin up a background asyncio thread and communicate with the web thread via a queue.&lt;/p&gt;
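
&lt;p&gt;Here’s a minimal sketch of that last option, assuming a long-running Flask app; &lt;code&gt;aggregate()&lt;/code&gt; stands in for whatever async entry point you have:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import asyncio
import threading

# one long-lived event loop in a daemon thread, started once at app startup
background_loop = asyncio.new_event_loop()
threading.Thread(target=background_loop.run_forever, daemon=True).start()

def view():
    # hand the coroutine to the background loop; only this worker thread blocks
    future = asyncio.run_coroutine_threadsafe(aggregate(), background_loop)
    return future.result(timeout=30)  # timeout is a placeholder, tune to taste
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
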

&lt;h2&gt;
  
  
  Pitfall 2: Blocking Calls Inside a Coroutine — Performance Tanks
&lt;/h2&gt;

&lt;p&gt;Feeling clever, I wrote:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;call_api_blocking&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Total time? Still ~18 seconds. Logging showed each task finishing one after another, no concurrency at all. The culprit: &lt;code&gt;call_api_blocking&lt;/code&gt; used &lt;code&gt;requests.get()&lt;/code&gt;, a synchronous blocking call. &lt;code&gt;await&lt;/code&gt; is useless here — while the first &lt;code&gt;requests.get&lt;/code&gt; sits there, the whole thread is frozen and no other coroutine gets a chance to run.&lt;/p&gt;

&lt;p&gt;Asyncio only plays nice with its own async IO primitives. When you have a blocking call, &lt;strong&gt;you must ship it to a thread pool&lt;/strong&gt; with &lt;code&gt;loop.run_in_executor()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call_api_async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;loop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_running_loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;loop&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_in_executor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the blocking happens in a separate thread and the event loop can immediately switch to another coroutine. Later I replaced &lt;code&gt;requests&lt;/code&gt; with &lt;code&gt;aiohttp&lt;/code&gt; entirely, and performance really took off. &lt;strong&gt;The golden rule: async is all-or-nothing. Don’t mix in blocking calls that hijack your thread.&lt;/strong&gt;&lt;/p&gt;
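
&lt;p&gt;On Python 3.9+, &lt;code&gt;asyncio.to_thread()&lt;/code&gt; wraps the same executor dance in a single call; a sketch of the equivalent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import asyncio
import requests

async def call_api_async(url):
    # runs requests.get in the default thread pool, keeping the event loop free
    return await asyncio.to_thread(requests.get, url)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
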

&lt;h2&gt;
  
  
  Pitfall 3: Orphaned Tasks — Memory Climbs, Then OOM
&lt;/h2&gt;

&lt;p&gt;After performance looked good, I rolled it out. Two days later, the pod was OOMKilled. Memory kept growing slowly, and the GC wasn’t collecting objects. After digging, I found the culprit. To “flexibly control concurrency” I had written something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks fine, right? But inside &lt;code&gt;process(url)&lt;/code&gt; some branches returned early, and a few exceptions weren’t handled properly. This left tasks stuck in &lt;code&gt;PENDING&lt;/code&gt; or &lt;code&gt;CANCELLED&lt;/code&gt; state while still referenced by the &lt;code&gt;tasks&lt;/code&gt; list. Those tasks held onto large response payloads, and since the references were never released, the GC could never reclaim them: a classic memory leak.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Use &lt;code&gt;asyncio.TaskGroup&lt;/code&gt; (Python 3.11+) to manage lifetimes automatically. If any task fails, all others are cancelled and resources are cleaned up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;TaskGroup&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;tg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you’re on an older Python version, be diligent about cancelling pending tasks in a &lt;code&gt;finally&lt;/code&gt; block and clearing references.&lt;/p&gt;
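
&lt;p&gt;Roughly like this, assuming the same &lt;code&gt;process()&lt;/code&gt; coroutine as above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;tasks = [asyncio.create_task(process(url)) for url in urls]
try:
    results = await asyncio.gather(*tasks, return_exceptions=True)
finally:
    for t in tasks:
        if not t.done():
            t.cancel()    # stop anything still pending
    tasks.clear()         # drop the references so the GC can reclaim the payloads
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
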

&lt;h2&gt;
  
  
  The Production‑Ready Version
&lt;/h2&gt;

&lt;p&gt;Here’s the core skeleton I ended up with — concurrency controlled via semaphore, a reused &lt;code&gt;aiohttp&lt;/code&gt; session, isolated exceptions, and timeouts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AsyncFetcher&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;concurrency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Semaphore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;concurrency&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# limit concurrency to avoid hammering downstream
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ClientTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sem&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>python</category>
      <category>async programming</category>
      <category>asyncio</category>
      <category>performance optimization</category>
    </item>
    <item>
      <title>I Rewrote Our Crawler with asyncio and Got a 15x Performance Boost</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Fri, 01 May 2026 02:00:40 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/i-rewrote-our-crawler-with-asyncio-and-got-a-15x-performance-boost-199j</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/i-rewrote-our-crawler-with-asyncio-and-got-a-15x-performance-boost-199j</guid>
      <description>&lt;p&gt;Last week, I finally snapped. Our “legacy” news aggregator was crawling 200 sites in &lt;strong&gt;8 minutes&lt;/strong&gt;, with two database timeouts along the way. Ops complained it was “slower than a tortoise,” the product manager asked, “Can we get it under 1 minute?” I said: give me half a day, and I’ll rewrite it with asyncio.&lt;/p&gt;

&lt;p&gt;The result? &lt;strong&gt;Total time dropped from 487 seconds to 32 seconds — a 15x speedup.&lt;/strong&gt; My boss walked past my desk, glanced at the screen, and literally said, “Whoa, now &lt;em&gt;that’s&lt;/em&gt; the speed it should be.” Today I’ll walk you through that refactor — no textbook fluff, just real, battle‑tested takeaways.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why asyncio, not threading?
&lt;/h2&gt;

&lt;p&gt;When faced with I/O‑bound tasks, many folks reach for &lt;code&gt;concurrent.futures&lt;/code&gt; and thread pools. But threads come with GIL overhead, context‑switching costs, and let’s be honest — a crawler spends 99% of its time waiting for network responses. Using OS threads to “wait for I/O” is like hiring a fleet of drivers just to have them sit in their cars.&lt;/p&gt;

&lt;p&gt;asyncio takes a different approach: &lt;strong&gt;single thread + event loop&lt;/strong&gt;. When a coroutine is waiting for a network response, it voluntarily yields control (&lt;code&gt;await&lt;/code&gt;), and the event loop immediately switches to another coroutine that’s ready to run. No thread‑switching overhead, no lock contention, minimal memory footprint.&lt;/p&gt;

&lt;p&gt;Three core ingredients:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Event loop&lt;/strong&gt; – the scheduler; it runs whatever is ready.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coroutines&lt;/strong&gt; – &lt;code&gt;async def&lt;/code&gt; functions that suspend with &lt;code&gt;await&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Futures/Tasks&lt;/strong&gt; – wrappers around coroutines that let you wait for results.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s a completely different mindset from synchronous code — you have to get comfortable thinking concurrently.&lt;/p&gt;
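
&lt;p&gt;To make the three pieces concrete, here’s a minimal sketch: &lt;code&gt;work()&lt;/code&gt; is the coroutine, &lt;code&gt;create_task()&lt;/code&gt; wraps it into a Task, and &lt;code&gt;asyncio.run()&lt;/code&gt; spins up the event loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import asyncio

async def work(n):
    await asyncio.sleep(0.1)  # suspend; the loop runs whatever else is ready
    return n * 2

async def main():
    tasks = [asyncio.create_task(work(i)) for i in range(3)]  # Tasks are scheduled immediately
    return await asyncio.gather(*tasks)                       # wait for every result

print(asyncio.run(main()))  # the event loop drives it all; prints [0, 2, 4]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
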

&lt;h2&gt;
  
  
  The refactor: from blocking sync to async concurrency
&lt;/h2&gt;

&lt;p&gt;Let’s start with the synchronous crawler I inherited (simplified core logic):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;URLS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://httpbin.org/delay/1?id=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_sync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# each request blocks for 1 second (simulating network I/O)
&lt;/span&gt;    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;fetch_sync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;URLS&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;同步耗时: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s, 结果数: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Output: 同步耗时: 10.12s, 结果数: 10
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ten requests, each taking 1 second, executed one after another — naturally that’s 10 seconds. Who can put up with that?&lt;/p&gt;

&lt;p&gt;Converting to asyncio boils down to two steps: swap the I/O function for its async counterpart, then schedule everything concurrently.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="n"&gt;URLS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://httpbin.org/delay/1?id=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# aiohttp async request — await yields control
&lt;/span&gt;    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;fetch_async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;URLS&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# run all coroutines concurrently
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;

&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;异步耗时: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s, 结果数: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Output: 异步耗时: 1.05s, 结果数: 10
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;asyncio.gather()&lt;/code&gt; fires off all 10 coroutines at once, so the total time is roughly that of the slowest request (1 second) instead of the sum. That’s the magic of the event loop: while coroutine 1 is waiting on I/O, the loop is already running coroutine 2, 3, … until a response arrives and control is handed back.&lt;/p&gt;

&lt;h2&gt;
  
  
  Going deeper: semaphores and error handling — don’t let async become chaos
&lt;/h2&gt;

&lt;p&gt;If you think the snippet above is production‑ready, you’re probably in for a rude awakening. The first pitfall I hit was &lt;strong&gt;unlimited concurrency&lt;/strong&gt;. When the URL list grew from 10 to 2,000, the target server instantly banned my IP — because I had opened 2,000 TCP connections at once.&lt;/p&gt;

&lt;p&gt;The fix: &lt;code&gt;asyncio.Semaphore&lt;/code&gt;, to cap the number of simultaneous coroutines.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_with_limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sem&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;sem&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# semaphore controls how many coroutines run at once
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HTTP &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;retries&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;请求失败: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, 错误: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
                &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# exponential backoff
&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main_with_limit&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;sem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Semaphore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# max 50 concurrent requests
&lt;/span&gt;    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;fetch_with_limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sem&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;fo&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>python</category>
      <category>async programming</category>
      <category>asyncio</category>
      <category>web crawling in practice</category>
    </item>
    <item>
      <title>asyncio Pitfalls: The Mistake That Cost Me 3 Hours</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Fri, 01 May 2026 01:58:14 +0000</pubDate>
      <link>https://forem.com/_eb7f2a654e97a60ae9f96e/asyncio-pitfalls-the-mistake-that-cost-me-3-hours-4o58</link>
      <guid>https://forem.com/_eb7f2a654e97a60ae9f96e/asyncio-pitfalls-the-mistake-that-cost-me-3-hours-4o58</guid>
      <description>&lt;p&gt;Here’s the story: last week my boss threw a “simple” task at me — pull data from 120 internal APIs simultaneously and compile a report. I thought, “This is just I/O-bound work. I know asyncio like the back of my hand.” So I cranked out the first version in 10 minutes. To my disbelief, it ran even slower than a serial approach, and some endpoints never returned any data. That afternoon, I stared at the terminal output, tweaking and cursing for three full hours — until I spotted one innocuous function call. Then it all clicked.&lt;/p&gt;

&lt;p&gt;If you’re doing concurrency with asyncio, the following pitfalls might make you question your life choices.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. The Culprit: Synchronous Blocking Call Inside a Coroutine
&lt;/h2&gt;

&lt;p&gt;Here’s my first naive implementation — can you spot the problem right away?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;  &lt;span class="c1"&gt;# 注意：经典的同步库
&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_api&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;协程函数：获取 API 数据&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Starting &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# 模拟获取数据 —— 这里埋了一颗大雷
&lt;/span&gt;    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 同步阻塞调用！
&lt;/span&gt;    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Finished &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;urls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://httpbin.org/delay/1?req=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;fetch_api&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10 请求耗时: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result left me dumbfounded: 10 requests took over 10 seconds — exactly like a serial run. The reason is painfully simple: &lt;code&gt;requests.get()&lt;/code&gt; is a &lt;strong&gt;synchronous blocking&lt;/strong&gt; call. While waiting for the network, it completely holds the thread hostage, so the event loop never gets a chance to switch to another coroutine. Mixing synchronous code into an &lt;code&gt;async def&lt;/code&gt; is like stuffing a tractor engine into a sports car. The golden rule of asyncio is: &lt;strong&gt;every I/O operation must be asynchronous&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Two ways to fix it: swap to an async HTTP library (like &lt;code&gt;aiohttp&lt;/code&gt;), or offload the blocking call with &lt;code&gt;loop.run_in_executor&lt;/code&gt;. I recommend the former:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_api&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aiohttp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ClientTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;urls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://httpbin.org/delay/1?req=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;fetch_api&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10 请求耗时: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the switch, 10 requests finished in about 1.5 seconds. My boss’s frown finally relaxed.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Forgetting to &lt;code&gt;await&lt;/code&gt; — The Coroutine That Never Ran
&lt;/h2&gt;

&lt;p&gt;This trap has bitten me more times than I’d like to admit. Check out this classic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;say_hello&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# 事故现场：创建协程对象，但忘了 await
&lt;/span&gt;    &lt;span class="nf"&gt;say_hello&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;          &lt;span class="c1"&gt;# 只会返回一个 coroutine object，不会执行
&lt;/span&gt;    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;End&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run it, the terminal only prints &lt;code&gt;End&lt;/code&gt;. The &lt;code&gt;Hello&lt;/code&gt; never appears. Python doesn’t raise an error; at most it emits a &lt;code&gt;RuntimeWarning: coroutine 'say_hello' was never awaited&lt;/code&gt; when the orphaned coroutine object is garbage-collected. The correct approach is &lt;code&gt;await say_hello()&lt;/code&gt;, or wrap it with &lt;code&gt;asyncio.create_task(say_hello())&lt;/code&gt; so the event loop manages it. My personal habit: &lt;strong&gt;whenever I call an &lt;code&gt;async def&lt;/code&gt; function, I either put &lt;code&gt;await&lt;/code&gt; in front of it or wrap it with &lt;code&gt;create_task&lt;/code&gt;. I never leave a coroutine naked.&lt;/strong&gt;&lt;/p&gt;
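
&lt;p&gt;For completeness, the corrected &lt;code&gt;main()&lt;/code&gt; would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;async def main():
    task = asyncio.create_task(say_hello())  # scheduled on the event loop right away
    await asyncio.sleep(2)
    await task     # make sure it actually finished before moving on
    print("End")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
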




&lt;h2&gt;
  
  
  3. Exception Handling in &lt;code&gt;gather&lt;/code&gt; — One Rotten Task Spoils the Whole Bunch
&lt;/h2&gt;

&lt;p&gt;When I took on that 120‑endpoint task, a few APIs occasionally timed out or returned 500. I used &lt;code&gt;asyncio.gather&lt;/code&gt; and quickly learned that with the default &lt;code&gt;return_exceptions=False&lt;/code&gt;, &lt;strong&gt;the first exception propagates immediately and the results of every other task are thrown away&lt;/strong&gt; (the remaining tasks keep running, but you never see their return values), leaving me with zero usable data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# 错误示范：一个炸，全家炸
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;bad_request&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;接口挂了&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;good_request&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;正常数据&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;bad_request&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nf"&gt;good_request&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;捕获异常，但 good_request 的结果也丢了&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fix is simple — add &lt;code&gt;return_exceptions=True&lt;/code&gt; to &lt;code&gt;gather&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;task1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...,&lt;/span&gt;
    &lt;span class="n"&gt;return_exceptions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;log_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# 单独处理异常
&lt;/span&gt;    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this pattern, you can gracefully handle partial failures — log the errors and still process all the valid responses. No more wasting 3 hours staring at the terminal!&lt;/p&gt;




&lt;p&gt;&lt;em&gt;These pitfalls are sneaky, but once you understand the underlying mechanics, asyncio becomes a powerful ally. Hope this saves you from the same debugging rabbit hole I fell into.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>asyncio</category>
      <category>web crawler</category>
      <category>performance optimization</category>
    </item>
  </channel>
</rss>
