<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Sabita kumari</title>
    <description>The latest articles on Forem by Sabita kumari (@sabitak).</description>
    <link>https://forem.com/sabitak</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3866686%2Febada04b-906e-4a43-9d5b-8a6f35c42ef2.png</url>
      <title>Forem: Sabita kumari</title>
      <link>https://forem.com/sabitak</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/sabitak"/>
    <language>en</language>
    <item>
      <title>Large Language Models, Explained Like You're a Curious Human</title>
      <dc:creator>Sabita kumari</dc:creator>
      <pubDate>Fri, 10 Apr 2026 18:38:39 +0000</pubDate>
      <link>https://forem.com/sabitak/large-language-models-explained-like-youre-a-curious-human-51ac</link>
      <guid>https://forem.com/sabitak/large-language-models-explained-like-youre-a-curious-human-51ac</guid>
      <description>&lt;p&gt;Everything you need to know about how ChatGPT-style AI actually works.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Actually &lt;em&gt;Is&lt;/em&gt; a Large Language Model?
&lt;/h2&gt;

&lt;p&gt;Strip away the hype and an LLM is surprisingly simple in structure. It boils down to &lt;strong&gt;two files&lt;/strong&gt; sitting on a hard drive:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A very large file of numbers&lt;/strong&gt; — these are the "parameters" (or weights) of the neural network. Think of them as billions of tiny dials that have been carefully tuned.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A small file of code&lt;/strong&gt; — this is the algorithm that reads those numbers and actually produces text. It can be as short as ~500 lines of C code.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. Meta's Llama 2 70B model, for example, is a 140 GB parameter file plus a tiny run script. Together, they can run on a regular MacBook — no internet needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────┐     ┌─────────────────────────┐
│     📦 Parameters File      │     │      ⚙️ Run Code         │
│                             │  +  │                         │
│    140 GB of numbers        │     │   ~500 lines of C       │
│  Billions of tiny "dials"   │     │  The "engine" that      │
│  encoding world knowledge   │     │  reads the dials        │
└─────────────────────────────┘     └─────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
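&lt;p&gt;To make the two-file idea concrete, here is a toy sketch in Python. It is not a real LLM (real ones use transformers and billions of weights), but the shape is the same: a table of numbers, plus a small loop that reads them to pick the next word.&lt;/p&gt;

```python
import random

random.seed(0)
vocab = ["the", "cat", "sat", "on", "mat"]

# "Parameters file" in miniature: a table of numbers, one score per word pair.
W = [[random.random() for _ in vocab] for _ in vocab]

def next_token(token):
    # The "run code": read the dials for this word, pick the best next word.
    scores = W[vocab.index(token)]
    best = max(range(len(vocab)), key=scores.__getitem__)
    return vocab[best]

tokens = ["the"]
for _ in range(4):          # generate by repeatedly predicting the next word
    tokens.append(next_token(tokens[-1]))
print(" ".join(tokens))
```

&lt;p&gt;Everything interesting lives in the numbers; the code just reads them. That is why a 140 GB file plus a few hundred lines is enough.&lt;/p&gt;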



&lt;blockquote&gt;
&lt;p&gt;🧠 &lt;strong&gt;Everyday Analogy:&lt;/strong&gt; Imagine a piano with 70 billion keys. The &lt;strong&gt;parameters file&lt;/strong&gt; is a sheet of music that tells you exactly how hard to press each key. The &lt;strong&gt;run code&lt;/strong&gt; is the pianist who reads the sheet and plays. Together, they produce language.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  How Is an LLM "Trained"?
&lt;/h2&gt;

&lt;p&gt;Training is the expensive, one-time process of figuring out the right value for every single parameter. Here's the recipe for a model like Llama 2 70B:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Training Recipe:&lt;/strong&gt;&lt;br&gt;
📚 ~10 TB of internet text (books, articles, code, forums…)&lt;br&gt;
🖥️ A cluster of ~6,000 GPUs running for ~12 days&lt;br&gt;
💰 Roughly $2 million in compute costs&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;During training, the model is given a sentence with one word missing and asked: &lt;em&gt;"What comes next?"&lt;/em&gt; It guesses, checks the real answer, and adjusts its billions of dials slightly. Repeat this trillions of times, and the model becomes remarkably good at predicting the next word — and in doing so, it absorbs enormous amounts of factual knowledge, grammar, reasoning patterns, and even a bit of common sense.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────┐        ┌──────────────────┐        ┌─────────────────┐
│   🌐 Internet │ -----&amp;gt; │   🔥 Training     │ -----&amp;gt; │  🧠 Trained      │
│              │        │                  │        │     Model       │
│  ~10 TB text │        │  6,000 GPUs      │        │  140 GB params  │
│              │        │  12 days · $2M   │        │  Compressed     │
│              │        │  "Predict next   │        │  knowledge      │
│              │        │   word"          │        │                 │
└──────────────┘        └──────────────────┘        └─────────────────┘

        Lossy compression: 10 TB of knowledge → 140 GB of parameters
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
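&lt;p&gt;Here is that "guess, check, adjust" loop in miniature. This is a sketch, not real training: actual models backpropagate through a transformer, but the update below is the same softmax-and-nudge idea at toy scale.&lt;/p&gt;

```python
import math

vocab = ["the", "cat", "sat"]
text = ["the", "cat", "sat", "the", "cat", "sat"]   # our tiny "internet"
W = [[0.0] * 3 for _ in vocab]                      # the dials, all untuned

for epoch in range(100):                            # repeat many times
    for prev, nxt in zip(text, text[1:]):
        i, j = vocab.index(prev), vocab.index(nxt)
        exps = [math.exp(w) for w in W[i]]
        probs = [e / sum(exps) for e in exps]       # the model's guess
        for k in range(3):                          # nudge toward the real answer
            target = 1.0 if k == j else 0.0
            W[i][k] += 0.1 * (target - probs[k])

row = W[vocab.index("the")]
print(vocab[max(range(3), key=row.__getitem__)])    # after training: "cat"
```

&lt;p&gt;Scale the vocabulary to tens of thousands of tokens, the dials to billions, and the text to the internet, and you have the $2M recipe above.&lt;/p&gt;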



&lt;blockquote&gt;
&lt;p&gt;🧠 &lt;strong&gt;Everyday Analogy:&lt;/strong&gt; Think of training like a student reading the entire internet and taking the world's longest fill-in-the-blank exam. By forcing itself to predict missing words, it absorbs facts, writing styles, logic, and languages. The process is a &lt;strong&gt;lossy compression&lt;/strong&gt; — like squeezing a library into a zip file. Most of the knowledge is kept, but some details get lost or garbled.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Three Stages of Building an AI Assistant
&lt;/h2&gt;

&lt;p&gt;A freshly trained model isn't ready to be a helpful chatbot. It goes through up to three stages to become the assistant you know from products like ChatGPT or Claude.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 1 — Pre-training
&lt;/h3&gt;

&lt;p&gt;The model reads a massive chunk of the internet and learns to predict the next word. At this point it's like a very well-read parrot: it can generate text that &lt;em&gt;sounds&lt;/em&gt; like the internet, but it doesn't know how to hold a conversation. Ask it a question and it might just generate more questions, or make up a fake Wikipedia article. This is what people call &lt;strong&gt;"hallucination"&lt;/strong&gt; — the model is dreaming plausible-sounding text rather than answering you.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 2 — Fine-tuning (Alignment)
&lt;/h3&gt;

&lt;p&gt;Human labelers write thousands of ideal question-and-answer pairs. The model is then trained on this curated dataset, teaching it to &lt;em&gt;behave&lt;/em&gt; like a helpful assistant: answer directly, refuse harmful requests, and follow instructions. Think of it as finishing school for the parrot — it learns manners and format.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 3 — RLHF (Optional Polish)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Reinforcement Learning from Human Feedback.&lt;/strong&gt; Humans are shown two or more model answers and asked "which is better?" These preferences are used to further nudge the model toward responses people actually prefer. It's like letting a restaurant taste-tester rank dishes so the chef improves over time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────┐       ┌─────────────────┐       ┌─────────────────┐
│  1️⃣ Pre-training │ ----&amp;gt; │  2️⃣ Fine-tuning  │ ----&amp;gt; │  3️⃣ RLHF         │
│                 │       │                 │       │                 │
│ Reads the       │       │ Learns Q&amp;amp;A      │       │ Human           │
│ internet        │       │ format          │       │ preferences     │
│ → Base model    │       │ → Assistant     │       │ → Polished      │
│                 │       │   model         │       │   assistant     │
└─────────────────┘       └─────────────────┘       └─────────────────┘

              Each stage builds on the previous one
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Scaling Laws: Bigger = Smarter (Predictably)
&lt;/h2&gt;

&lt;p&gt;One of the most surprising discoveries in AI is that LLM performance follows predictable &lt;strong&gt;scaling laws&lt;/strong&gt;. There are two main knobs you can turn:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;N&lt;/strong&gt; — the number of parameters (model size)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;D&lt;/strong&gt; — the amount of training data&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Crank either one up, and the model's ability to predict the next word improves in a smooth, predictable curve. And because next-word prediction accuracy correlates with reasoning ability, the model gets better at all sorts of tasks — math, coding, history, common sense — almost "for free," without being specifically taught those skills.&lt;/p&gt;
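&lt;p&gt;The trend can be written as a tiny formula. The coefficients below are hypothetical stand-ins, shaped like the power-law fits reported in the scaling-law literature, purely to show the curve:&lt;/p&gt;

```python
def predicted_loss(N, D, E=1.7, A=400.0, B=410.0, alpha=0.34, beta=0.28):
    # Loss falls as a smooth power law in parameters N and training tokens D.
    # E, A, B, alpha, beta here are illustrative, not real fitted values.
    return E + A / N**alpha + B / D**beta

small = predicted_loss(7e9, 2e12)    # a 7B-parameter model
large = predicted_loss(70e9, 2e12)   # a 70B-parameter model, same data
print(small, large)                  # the bigger model has lower (better) loss
```

&lt;p&gt;Turn either knob up and the predicted loss drops smoothly, which is exactly why labs can budget a training run's performance in advance.&lt;/p&gt;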

&lt;blockquote&gt;
&lt;p&gt;🧠 &lt;strong&gt;Everyday Analogy:&lt;/strong&gt; It's like a student who reads more books and has a bigger brain — they get better at &lt;em&gt;everything&lt;/em&gt;, not just one subject. Double the books and brain size, and you can predict roughly how much smarter they'll get.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Tool Use &amp;amp; Multimodality: LLMs Learn to Use Tools
&lt;/h2&gt;

&lt;p&gt;Modern LLMs aren't limited to text-in, text-out. They're gaining abilities that make them feel more like capable assistants:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🌐 &lt;strong&gt;Web browsing&lt;/strong&gt; — searching for up-to-date information&lt;/li&gt;
&lt;li&gt;🧮 &lt;strong&gt;Calculator / code interpreter&lt;/strong&gt; — running Python to crunch numbers or make charts&lt;/li&gt;
&lt;li&gt;👁️ &lt;strong&gt;Vision&lt;/strong&gt; — understanding images, screenshots, diagrams&lt;/li&gt;
&lt;li&gt;🎤 &lt;strong&gt;Audio&lt;/strong&gt; — hearing speech and speaking back&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means the model can look at a photo of a hand-drawn wireframe, write the HTML code for it, search the web for a library it needs, and run the code to show you a working preview — all in one conversation.&lt;/p&gt;




&lt;h2&gt;
  
  
  The "LLM Operating System" Vision
&lt;/h2&gt;

&lt;p&gt;Here's a powerful way to think about where this is all heading. Instead of viewing an LLM as a chatbot, think of it as the &lt;strong&gt;kernel&lt;/strong&gt; (the core brain) of a new kind of operating system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                        ┌──────────────────┐
                        │   🧠 Memory       │
                        │  Context window   │
                        └────────┬─────────┘
                                 │
  ┌──────────────────┐   ┌──────┴───────┐   ┌──────────────────┐
  │  📁 Local Files   │───│  LLM KERNEL  │───│   🔧 Tools       │
  │  Documents, data  │   │  Coordinates │   │  Browser, calc,  │
  └──────────────────┘   │  everything  │   │  code            │
                         │  like a CPU  │
                         └──────┬───────┘
                                │
            ┌───────────────────┼───────────────────┐
            │                                       │
   ┌────────┴─────────┐                 ┌───────────┴────────┐
   │  🌐 Internet      │                 │  👁️🎤 Senses       │
   │  Search, APIs     │                 │  Vision, audio,    │
   └──────────────────┘                  │  speech            │
                                         └────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Just as Windows or macOS coordinates your screen, keyboard, files, and apps, the LLM OS coordinates memory, tools, files, and senses to solve whatever problem you throw at it. The conversation window is its RAM; the internet is its hard drive; the code interpreter is its app store.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Dark Side: Security Challenges
&lt;/h2&gt;

&lt;p&gt;This new paradigm is powerful — but it also opens up entirely new categories of attacks. Here are the four biggest threats researchers are racing to solve:&lt;/p&gt;

&lt;h3&gt;
  
  
  🎭 Jailbreak Attacks
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Tricking the model into ignoring its safety rules. For example, asking it to roleplay as a character who "happens" to reveal dangerous information, or encoding a harmful question in Base64 so the filter doesn't catch it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world analogy:&lt;/strong&gt; Convincing a security guard to let you in by wearing a costume.&lt;/p&gt;

&lt;h3&gt;
  
  
  🧬 Adversarial Attacks
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Specially crafted "gibberish" text suffixes or invisible noise patterns in images that exploit mathematical weaknesses in the neural network, forcing it to produce harmful output.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world analogy:&lt;/strong&gt; A dog whistle — sounds like nothing to humans, but the model "hears" a command.&lt;/p&gt;

&lt;h3&gt;
  
  
  💉 Prompt Injection
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Hiding secret instructions in web pages or documents (e.g., in white text on a white background) that the model reads and obeys when it browses or processes files. The model can't easily tell "user instructions" from "content instructions."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world analogy:&lt;/strong&gt; Slipping a forged memo into someone's inbox so they follow fake orders.&lt;/p&gt;

&lt;h3&gt;
  
  
  ☠️ Data Poisoning
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; An attacker publishes carefully crafted text on the internet. When that text gets swept into the model's training data, it plants a hidden "backdoor" — a trigger phrase that makes the model misbehave in a specific way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world analogy:&lt;/strong&gt; Contaminating ingredients at the factory so every product made later has a hidden flaw.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The bottom line:&lt;/strong&gt; AI security is an active cat-and-mouse game. Researchers discover attacks, build defenses, and then attackers find new workarounds. These models are empirical artifacts — they work remarkably well, but we don't yet have mathematical proofs of their safety.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;If you remember just five things from this post, let them be these:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;An LLM is two files&lt;/strong&gt; — a huge parameter file and a tiny run-code file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Training = compression.&lt;/strong&gt; The model squeezes the internet's knowledge into its weights by learning to predict the next word.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Three stages&lt;/strong&gt; turn a raw model into a polished assistant: pre-training, fine-tuning, and RLHF.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaling laws&lt;/strong&gt; mean that bigger models + more data = predictably better performance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security is unsolved.&lt;/strong&gt; Jailbreaks, adversarial attacks, prompt injection, and data poisoning are active open problems.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We're at the beginning of something genuinely new — a technology that compresses human knowledge into a portable, runnable format and can coordinate tools, senses, and memory to solve problems. The potential is enormous, and so are the challenges. Understanding how it works is the first step to using it well and thinking clearly about where it's headed.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Thanks for reading! If you found this helpful, drop a ❤️ and follow for more AI explainers.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>machinelearning</category>
      <category>beginners</category>
    </item>
    <item>
      <title>System Design From Scratch: The Components That Actually Run Production Systems</title>
      <dc:creator>Sabita kumari</dc:creator>
      <pubDate>Thu, 09 Apr 2026 20:53:10 +0000</pubDate>
      <link>https://forem.com/sabitak/system-design-from-scratch-the-components-that-actually-run-production-systems-422l</link>
      <guid>https://forem.com/sabitak/system-design-from-scratch-the-components-that-actually-run-production-systems-422l</guid>
      <description>&lt;p&gt;You open amazon.com. A product page loads in under a second. Behind that single page load, your request hit a DNS server, bounced through a CDN edge node, passed a rate limiter, got distributed by a load balancer, routed by an API gateway, processed by a microservice, checked a Redis cache, and maybe — maybe — touched an actual database.&lt;/p&gt;

&lt;p&gt;That's system design. Not theory. Not whiteboard boxes. The actual machinery that keeps websites alive when millions of people use them at the same time.&lt;/p&gt;

&lt;p&gt;Here's how each piece works, why it exists, and when you need it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Client-Server Relationship and DNS
&lt;/h2&gt;

&lt;p&gt;Everything starts with two things: a client and a server.&lt;/p&gt;

&lt;p&gt;The client is whatever device makes the request — your phone, laptop, a smart fridge, doesn't matter. The server is a machine that runs 24/7 with a public IP address, sitting in a data center somewhere, waiting for requests.&lt;/p&gt;

&lt;p&gt;The problem is that IP addresses look like &lt;code&gt;203.0.113.5&lt;/code&gt;. Nobody remembers that. So we have DNS — the Domain Name System — which is basically a global phone book. You type &lt;code&gt;amazon.com&lt;/code&gt;, your browser asks a DNS server "what's the IP for this?", and the DNS server responds with &lt;code&gt;203.0.113.5&lt;/code&gt;. Your browser then connects directly to that IP.&lt;/p&gt;

&lt;p&gt;That lookup process is called DNS resolution. It happens before anything else, every single time.&lt;/p&gt;
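&lt;p&gt;You can watch that lookup happen from Python. One hedge: the example resolves &lt;code&gt;localhost&lt;/code&gt; so it works offline; swap in a real hostname to query actual DNS.&lt;/p&gt;

```python
import socket

# The same "phone book" lookup the browser does before anything else.
ip = socket.gethostbyname("localhost")   # use "amazon.com" for a real lookup
print(ip)                                # typically 127.0.0.1
```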

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4w339q4jbb3nsduyyhdl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4w339q4jbb3nsduyyhdl.png" alt=" " width="727" height="274"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Vertical vs. Horizontal Scaling
&lt;/h2&gt;

&lt;p&gt;Your server has 2 CPUs and 4 GB of RAM. Traffic grows. The machine starts choking. What do you do?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vertical scaling (scale up):&lt;/strong&gt; Upgrade the machine. Add more RAM, more CPU cores, faster disks. The problem? You usually need to restart the machine to do this. That means downtime. For a hobby project, fine. For Amazon during Black Friday, absolutely not. There's also a hard ceiling — you can only make a single machine so powerful before physics says no.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Horizontal scaling (scale out):&lt;/strong&gt; Add more machines. Instead of one beefy server, run three identical servers in parallel. If one goes down, the other two keep serving traffic. No restart needed. No ceiling — just add another machine.&lt;/p&gt;

&lt;p&gt;This is why every serious production system uses horizontal scaling. You get zero-downtime deployments, redundancy if a server dies, and linear capacity growth.&lt;/p&gt;

&lt;p&gt;But horizontal scaling creates a new problem: if you have three servers, how does the client know which one to talk to?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ykkzptm6q4c6skwa3z3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ykkzptm6q4c6skwa3z3.png" alt=" " width="652" height="349"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Load Balancers
&lt;/h2&gt;

&lt;p&gt;A load balancer sits in front of your servers and distributes incoming traffic across them. The client never talks to the servers directly — it talks to the load balancer, and the load balancer decides which server handles each request.&lt;/p&gt;

&lt;p&gt;The simplest distribution algorithm is Round Robin: request 1 goes to server A, request 2 to server B, request 3 to server C, then back to A. More sophisticated load balancers also run health checks — they periodically ping each server, and if one stops responding, they stop sending it traffic until it recovers.&lt;/p&gt;

&lt;p&gt;In AWS, this is the Elastic Load Balancer (ELB). Most teams don't build their own. Managed load balancers handle SSL termination, sticky sessions, and connection draining — so your team can focus on the application.&lt;/p&gt;
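&lt;p&gt;Round Robin itself fits in a few lines (a sketch with made-up backend IPs, not a production balancer):&lt;/p&gt;

```python
from itertools import cycle

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]   # hypothetical backend IPs
pick = cycle(servers)                            # A, B, C, A, B, C, ...

assignments = [next(pick) for _ in range(6)]     # six incoming requests
print(assignments)
```

&lt;p&gt;A real balancer layers health checks on top: a server that fails its ping is simply removed from the rotation until it recovers.&lt;/p&gt;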




&lt;h2&gt;
  
  
  API Gateways and Microservices
&lt;/h2&gt;

&lt;p&gt;As your application grows, you stop running everything in one monolithic codebase. Authentication becomes its own service. Orders become their own service. Payments get their own service. This is microservice architecture — each business function runs independently, with its own database, its own deployment pipeline, and its own team.&lt;/p&gt;

&lt;p&gt;The question becomes: how does the client know which service to call? It doesn't. That's what the API gateway handles.&lt;/p&gt;

&lt;p&gt;An API gateway is a single entry point that routes requests based on the URL path. A request to &lt;code&gt;/auth&lt;/code&gt; goes to the authentication service. A request to &lt;code&gt;/orders&lt;/code&gt; goes to the order service. A request to &lt;code&gt;/payments&lt;/code&gt; goes to the payment service. The client only knows about one URL — the gateway handles the rest.&lt;/p&gt;

&lt;p&gt;It also acts as a reverse proxy, meaning the internal services are never exposed to the public internet. The gateway is the only thing with a public IP. Everything behind it is internal.&lt;/p&gt;
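&lt;p&gt;At its core, the gateway's routing table is a prefix map. The service names and ports below are hypothetical:&lt;/p&gt;

```python
ROUTES = {
    "/auth": "http://auth-service:8001",        # internal addresses,
    "/orders": "http://order-service:8002",     # never exposed publicly
    "/payments": "http://payment-service:8003",
}

def route(path):
    # Forward the request to whichever service owns the URL prefix.
    for prefix, backend in ROUTES.items():
        if path.startswith(prefix):
            return backend
    raise LookupError("no service for " + path)

print(route("/orders/123"))   # http://order-service:8002
```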

&lt;p&gt;The load balancer, API gateway, and microservices flow:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdexculidzjbpe7pmbc1v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdexculidzjbpe7pmbc1v.png" alt=" " width="687" height="451"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Asynchronous Communication and Queues
&lt;/h2&gt;

&lt;p&gt;Some tasks don't need to happen in real time. If a user places an order and the system needs to send a confirmation email, that email doesn't need to go out in the same millisecond. It can happen 2 seconds later. Or 10 seconds later. The user won't notice.&lt;/p&gt;

&lt;p&gt;This is where asynchronous communication comes in. Instead of the main server sending the email itself (and blocking until it's done), it pushes a task into a queue — a first-in, first-out list of jobs waiting to be processed. Background workers pull tasks from the queue at their own pace.&lt;/p&gt;

&lt;p&gt;AWS SQS is the most common managed queue. The pattern is simple: producer pushes a message, consumer pulls it, processes it, and acknowledges it. If the consumer crashes before acknowledging, the message goes back into the queue for another worker to pick up.&lt;/p&gt;

&lt;p&gt;This matters when the task is heavy. Imagine sending a million promotional emails. If the main server tried to send them synchronously, it would be stuck for hours. With a queue and 10 background workers, each worker handles 100,000 emails in parallel. The main server moved on the instant it pushed the tasks.&lt;/p&gt;
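&lt;p&gt;The whole pattern fits in a short sketch. Threads stand in for separate worker machines here, and a plain list stands in for the email provider:&lt;/p&gt;

```python
import queue
import threading

tasks = queue.Queue()     # the FIFO job list
sent = []                 # stands in for "emails actually delivered"

def worker():
    while True:
        job = tasks.get()
        if job is None:                     # shutdown signal
            break
        sent.append("emailed " + job)       # the slow work, off the hot path
        tasks.task_done()                   # acknowledge the message

t = threading.Thread(target=worker)
t.start()

for user in ["alice", "bob", "carol"]:
    tasks.put(user)       # the "producer" returns immediately after pushing

tasks.join()              # wait until every task is acknowledged
tasks.put(None)
t.join()
print(sent)
```

&lt;p&gt;Swap &lt;code&gt;queue.Queue&lt;/code&gt; for SQS and the thread for a fleet of worker machines, and the design is the same.&lt;/p&gt;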




&lt;h2&gt;
  
  
  Event-Driven and Fan-Out Architecture
&lt;/h2&gt;

&lt;p&gt;Here's a common scenario: a payment succeeds, and you need to send an email confirmation, an SMS, and a WhatsApp message. Three actions from one event.&lt;/p&gt;

&lt;p&gt;You could have the payment service call each notification system directly. But that creates tight coupling — if the SMS service is slow, it blocks the payment response. If someone adds a push notification later, you have to modify the payment service code.&lt;/p&gt;

&lt;p&gt;The better approach is pub-sub (publish-subscribe). The payment service publishes a "payment succeeded" event to a topic (AWS SNS, for example). Three separate queues are subscribed to that topic — one for email, one for SMS, one for WhatsApp. Each queue has its own worker.&lt;/p&gt;

&lt;p&gt;This is a fan-out architecture. One event fans out to multiple independent channels. The critical benefit: if the SMS worker crashes, it retries on its own. The email and WhatsApp workers don't know or care. No cascading failures. Each channel is fully independent.&lt;/p&gt;
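&lt;p&gt;A pub-sub fan-out is just "copy the event into every subscribed queue". Plain lists stand in for SQS queues in this sketch:&lt;/p&gt;

```python
from collections import defaultdict

subscribers = defaultdict(list)       # topic name mapped to subscribed queues

def subscribe(topic, q):
    subscribers[topic].append(q)

def publish(topic, event):
    for q in subscribers[topic]:      # each channel gets its own copy
        q.append(event)

email_q, sms_q, whatsapp_q = [], [], []
for q in (email_q, sms_q, whatsapp_q):
    subscribe("payment.succeeded", q)

publish("payment.succeeded", {"order_id": 42})
print(len(email_q), len(sms_q), len(whatsapp_q))   # 1 1 1
```

&lt;p&gt;Because every channel owns its own copy of the event, a crashed SMS worker retries from its own queue without touching email or WhatsApp.&lt;/p&gt;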

&lt;p&gt;The async processing and fan-out architecture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flvv030oiwa17jdm2r2md.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flvv030oiwa17jdm2r2md.png" alt=" " width="679" height="482"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Rate Limiting
&lt;/h2&gt;

&lt;p&gt;Without rate limiting, a single bad actor (or a botnet) can flood your servers with millions of requests and take your system down. This is a denial-of-service attack (a DDoS when the traffic comes from a distributed botnet), and it happens constantly.&lt;/p&gt;

&lt;p&gt;Rate limiting caps the number of requests a user or IP can make within a time window. Two common algorithms:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Token bucket:&lt;/strong&gt; Each user has a bucket that fills with tokens at a fixed rate (say, 5 per second). Each request costs one token. If the bucket is empty, the request is rejected. This allows short bursts — if a user hasn't made requests in a while, their bucket is full and they can fire several at once.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Leaky bucket:&lt;/strong&gt; Requests enter a queue that drains at a fixed rate. Excess requests overflow and get dropped. This produces a perfectly steady output regardless of input burstiness.&lt;/p&gt;
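&lt;p&gt;The token bucket is only a few lines of state. In this sketch, time is passed in explicitly and tokens are whole numbers, so the behavior is easy to follow:&lt;/p&gt;

```python
class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate              # tokens added per second
        self.capacity = capacity
        self.tokens = capacity        # a full bucket allows an initial burst
        self.last = 0.0

    def allow(self, now):
        refill = int((now - self.last) * self.rate)
        if refill:
            self.tokens = min(self.capacity, self.tokens + refill)
            self.last = now
        if self.tokens:               # spend one token per request
            self.tokens -= 1
            return True
        return False                  # empty bucket: reject

bucket = TokenBucket(rate=5, capacity=5)
burst = [bucket.allow(0.0) for _ in range(7)]
print(burst)   # five True (the burst), then False, False
```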

&lt;p&gt;Most production systems implement rate limiting at the load balancer or API gateway level, before requests even reach your services.&lt;/p&gt;




&lt;h2&gt;
  
  
  Database Scaling: Read Replicas
&lt;/h2&gt;

&lt;p&gt;Your database is a single machine. Most web applications read far more than they write — a product page might get viewed 10,000 times for every one inventory update. So the database bottleneck is usually reads, not writes.&lt;/p&gt;

&lt;p&gt;The fix is read replicas. You keep one primary node that handles all write operations. Every write gets replicated to one or more read replicas. Your application sends reads to replicas and writes to the primary. This spreads the load across multiple machines.&lt;/p&gt;

&lt;p&gt;The tradeoff is replication lag — there's a small delay (usually milliseconds) between a write hitting the primary and propagating to replicas. For most applications, this is fine. For financial transactions where you need to read your own write immediately, you route that specific read to the primary.&lt;/p&gt;
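&lt;p&gt;In application code this split usually looks like a small router. The addresses below are hypothetical; real database drivers and proxies offer the same read/write routing:&lt;/p&gt;

```python
from itertools import cycle

PRIMARY = "db-primary:5432"                        # handles every write
replicas = cycle(["db-replica-1:5432", "db-replica-2:5432"])

def route(query, read_your_own_write=False):
    is_read = query.lstrip().lower().startswith("select")
    if not is_read or read_your_own_write:
        return PRIMARY                             # writes and critical reads
    return next(replicas)                          # ordinary reads spread out

print(route("INSERT INTO orders VALUES (1)"))      # primary
print(route("SELECT name FROM products"))          # a replica
print(route("SELECT balance FROM accounts", read_your_own_write=True))
```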




&lt;h2&gt;
  
  
  Caching with Redis
&lt;/h2&gt;

&lt;p&gt;Even with read replicas, database queries take time. A cache sits between your application and the database, storing the results of frequent queries in memory.&lt;/p&gt;

&lt;p&gt;Redis is the standard. It's an in-memory key-value store. When your application needs data, it checks Redis first. Cache hit? Return the result instantly — no database query needed. Cache miss? Query the database, store the result in Redis for next time, and return it.&lt;/p&gt;

&lt;p&gt;For a product page that gets 50,000 views per hour, this can mean as few as one database query and 49,999 cache hits. The database barely notices.&lt;/p&gt;

&lt;p&gt;The hard part of caching is invalidation — knowing when to throw away stale data. If a product's price changes, the cached version is wrong until it expires or gets manually evicted. Most teams use a TTL (time-to-live) of 30 seconds to a few minutes, depending on how stale the data can be.&lt;/p&gt;
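&lt;p&gt;The check-then-fill pattern is called cache-aside. A dict stands in for Redis in this sketch, and &lt;code&gt;db_query&lt;/code&gt; is a hypothetical stand-in for a real SQL call:&lt;/p&gt;

```python
cache = {}            # stands in for Redis
db_calls = 0

def db_query(product_id):
    global db_calls
    db_calls += 1     # count how often the database actually gets hit
    return {"id": product_id, "price": 999}

def get_product(product_id):
    if product_id in cache:
        return cache[product_id]          # cache hit: no database work
    result = db_query(product_id)         # cache miss: query the database...
    cache[product_id] = result            # ...and fill the cache for next time
    return result

for _ in range(50_000):
    get_product(123)
print(db_calls)   # 1
```

&lt;p&gt;With real Redis you would also set a TTL when filling the cache, which is what eventually evicts the stale price.&lt;/p&gt;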




&lt;h2&gt;
  
  
  CDNs and Global Optimization
&lt;/h2&gt;

&lt;p&gt;Your servers are in Virginia. A user in Mumbai is 13,000 km away. Even at the speed of light, that round trip adds latency. For static content — images, CSS files, JavaScript bundles, product photos — there's no reason to fetch them from Virginia every time.&lt;/p&gt;

&lt;p&gt;A CDN (Content Delivery Network) copies your static content to edge locations around the world. Amazon CloudFront has edge nodes in Mumbai, London, São Paulo, Tokyo, and dozens of other cities. When the Mumbai user requests a product photo, the CDN serves it from the Mumbai edge — no round trip to Virginia.&lt;/p&gt;

&lt;p&gt;CDNs use anycast routing: the same IP address is announced from many physical locations at once, and the network automatically routes each user to the closest edge node.&lt;/p&gt;

&lt;p&gt;If the content is already cached at the edge, it's returned immediately. If not, the edge fetches it from the origin server, caches it, and serves it. Future requests from that region hit the cache instead of the origin.&lt;/p&gt;

&lt;p&gt;For a global e-commerce site, CDNs cut page load times from seconds to milliseconds for users far from the data center. They also reduce bandwidth costs on the origin server, because most requests never reach it.&lt;/p&gt;

&lt;p&gt;Database scaling, caching, and CDN:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpec4js33p0b7ignjcpw5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpec4js33p0b7ignjcpw5.png" alt=" " width="687" height="536"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Putting It All Together
&lt;/h2&gt;

&lt;p&gt;Here's the complete request flow when someone opens a product page:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;DNS&lt;/strong&gt; resolves &lt;code&gt;amazon.com&lt;/code&gt; to an IP address&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CDN&lt;/strong&gt; serves static assets (images, CSS, JS) from the nearest edge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate limiter&lt;/strong&gt; checks if the user has exceeded their request quota&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load balancer&lt;/strong&gt; picks a healthy server and forwards the request&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API gateway&lt;/strong&gt; routes &lt;code&gt;/products/123&lt;/code&gt; to the product service&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Product service&lt;/strong&gt; checks &lt;strong&gt;Redis cache&lt;/strong&gt; for the product data&lt;/li&gt;
&lt;li&gt;Cache miss → query a &lt;strong&gt;read replica&lt;/strong&gt; database&lt;/li&gt;
&lt;li&gt;If a purchase happens → &lt;strong&gt;payment service&lt;/strong&gt; publishes an event&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pub-sub&lt;/strong&gt; fans out to email, SMS, WhatsApp &lt;strong&gt;queues&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Background workers&lt;/strong&gt; process each notification independently&lt;/li&gt;
&lt;/ol&gt;
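
&lt;p&gt;The cache and fan-out steps above (6 through 9) can be sketched in a few lines of Python. This is a toy sketch with in-memory dictionaries standing in for Redis, the read replica, and the message broker; none of the names come from a real API:&lt;/p&gt;

```python
# Toy in-memory stand-ins for Redis, a read replica, and a pub-sub broker.
cache = {}
replica_db = {"123": {"name": "Mechanical Keyboard", "price": 79.99}}
queues = {"email": [], "sms": [], "whatsapp": []}

def get_product(product_id):
    """Cache-aside read: try the cache first, fall back to a read replica."""
    key = f"product:{product_id}"
    if key in cache:                      # step 6: cache hit
        return cache[key], "cache"
    row = replica_db[product_id]          # step 7: cache miss, query the replica
    cache[key] = row                      # populate the cache for next time
    return row, "replica"

def publish_purchase(product_id):
    """Steps 8-9: publish one event, fan it out to every notification queue."""
    event = {"type": "purchase", "product_id": product_id}
    for queue in queues.values():         # pub-sub fan-out
        queue.append(event)               # background workers drain these independently

_, source1 = get_product("123")   # miss: hits the replica
_, source2 = get_product("123")   # hit: served from cache
publish_purchase("123")
print(source1, source2, len(queues["email"]))  # replica cache 1
```

&lt;p&gt;In production the cache write would carry a TTL and the queues would live in a real broker, but the control flow is the same.&lt;/p&gt;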

&lt;p&gt;Every component exists because a single server running everything stops working at scale. DNS gives you human-friendly addresses. Horizontal scaling gives you redundancy. Load balancers distribute traffic. API gateways route to services. Queues decouple heavy tasks. Caching reduces database load. CDNs cut latency. Rate limiting protects the system.&lt;/p&gt;

&lt;p&gt;None of this is optional at scale. It's the reason the page loads in under a second.&lt;/p&gt;
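
&lt;p&gt;As a concrete illustration of the rate-limiting piece, here is a minimal token-bucket limiter in Python. The token bucket is one common algorithm (fixed-window and sliding-window counters are others), so treat this as a sketch of the idea rather than what any particular site runs:&lt;/p&gt;

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: allows bursts up to `capacity`,
    refills at `rate` tokens per second."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over quota: the caller would return HTTP 429

bucket = TokenBucket(rate=5, capacity=3)
results = [bucket.allow() for _ in range(4)]
print(results)  # first 3 requests allowed, 4th rejected (bucket drained)
```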

</description>
      <category>systemdesign</category>
      <category>webdev</category>
      <category>backend</category>
      <category>architecture</category>
    </item>
    <item>
      <title>50 Claude Code Best Practices Every AI Engineer Should Know</title>
      <dc:creator>Sabita kumari</dc:creator>
      <pubDate>Wed, 08 Apr 2026 03:02:33 +0000</pubDate>
      <link>https://forem.com/sabitak/50-claude-code-best-practices-every-ai-engineer-should-know-2025-edition-3p79</link>
      <guid>https://forem.com/sabitak/50-claude-code-best-practices-every-ai-engineer-should-know-2025-edition-3p79</guid>
      <description>&lt;p&gt;50 Claude Code tips to help you build with Claude that nobody talks about.&lt;/p&gt;

&lt;p&gt;Over the past 24 hours, I read the new Claude Code best practices document so you don't have to.&lt;/p&gt;

&lt;h2&gt;
  
  
  New Best Practices for Claude Code
&lt;/h2&gt;

&lt;p&gt;I've extracted all the best practices + added some of my own from personal experience to compile the ultimate list of Claude Code best practices.&lt;/p&gt;

&lt;p&gt;This list also includes various Claude Code tools + learning resources.&lt;/p&gt;

&lt;p&gt;Rapid-fire style - let's go.&lt;/p&gt;




&lt;h2&gt;
  
  
  Foundational Tips
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;50. Clear Task Framing&lt;/strong&gt; - Before anything else, state exactly what you want Claude to do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;49. Front Load Instructions&lt;/strong&gt; - Always put the most important instruction at the very top of the prompt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;48. Give Claude a way to verify its work&lt;/strong&gt; - Include tests, screenshots, or expected outputs so Claude can check itself. This is the single highest-leverage thing you can do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;47. Prompt Structure Tip&lt;/strong&gt; - To make the last few tips practical, I like this prompting structure:&lt;br&gt;
&lt;code&gt;[Role] + [Task] + [Context]&lt;/code&gt;&lt;/p&gt;
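
&lt;p&gt;A hypothetical helper makes the ordering concrete (the function and the example strings are mine, not part of Claude Code):&lt;/p&gt;

```python
def build_prompt(role, task, context):
    """Assemble a prompt as [Role] + [Task] + [Context], most important first."""
    return f"{role}\n\n{task}\n\n{context}"

prompt = build_prompt(
    role="You are a senior Python reviewer.",
    task="Review the diff below for correctness and style.",
    context="Diff:\n- return x\n+ return x or default",
)
print(prompt.splitlines()[0])  # You are a senior Python reviewer.
```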

&lt;p&gt;&lt;strong&gt;46. Chrome Extension Tip&lt;/strong&gt; - UI changes can be verified using the Claude Chrome extension. It opens a browser, tests the UI, and iterates until the code works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;45. Explore first, then plan, then code&lt;/strong&gt; - Research (this process can include other LLMs), then enter Plan Mode, then switch back to normal mode to execute code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;44. Provide specific context in your prompts&lt;/strong&gt; - The more precise your instructions, the better the results; anything you leave out, Claude has to guess.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;43. Assume Zero Context&lt;/strong&gt; - Assume Claude knows nothing about your project. Tell it everything it needs to know.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;42. Rich Context&lt;/strong&gt; - Use &lt;code&gt;@&lt;/code&gt; to link files, data, and images.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;41. CLAUDE.md Tip&lt;/strong&gt; - Run &lt;code&gt;/init&lt;/code&gt; to generate a starter &lt;code&gt;CLAUDE.md&lt;/code&gt; file for your current project.&lt;/p&gt;
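
&lt;p&gt;A generated &lt;code&gt;CLAUDE.md&lt;/code&gt; typically captures build commands and conventions; a hand-trimmed one might look like this (the commands and rules here are invented for illustration):&lt;/p&gt;

```markdown
# CLAUDE.md

## Commands
- Build: `npm run build`
- Test a single file: `npm test -- path/to/file.test.ts`

## Conventions
- TypeScript strict mode; avoid `any`.
- Prefer small, pure functions; co-locate tests with source.
```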




&lt;h2&gt;
  
  
  Using Projects &amp;amp; Skills
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;40. Project Instructions&lt;/strong&gt; - Use project-level instructions to define long-term behavior instead of repeating prompts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;39. Project Memory&lt;/strong&gt; - Edit the "Memory" tab to control exactly what Claude should retain or ignore over time (this works in projects as well).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;38. Claude Skills&lt;/strong&gt; - Turn repeatable workflows into Skills instead of re-prompting the same steps.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;37. Skill From Examples&lt;/strong&gt; - Paste a great output and ask Claude to turn it into a reusable Skill. You can even upload screenshots, ask Claude to replicate them, and then turn the result into a Skill (an easy way to create elite Skills).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;36. Skill Versioning&lt;/strong&gt; - Duplicate and version Skills as you refine workflows instead of editing live ones.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;35. Project Hygiene&lt;/strong&gt; - Regularly prune memory, files, and instructions to avoid drift.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;34. Project Context Bleed&lt;/strong&gt; - Use separate projects for unrelated workstreams to prevent context bleed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;33. Claude Skills Repo&lt;/strong&gt; - A library of 80,000+ Claude Skills: &lt;a href="https://skillsmp.com/" rel="noopener noreferrer"&gt;https://skillsmp.com/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;32. Claude Skills Library&lt;/strong&gt; - A cool website with plug-and-play Skills and more: &lt;a href="https://mcpservers.org/claude-skills" rel="noopener noreferrer"&gt;https://mcpservers.org/claude-skills&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;31. Project Memory Location&lt;/strong&gt; - Project memory can be stored in either &lt;code&gt;./CLAUDE.md&lt;/code&gt; or &lt;code&gt;./.claude/CLAUDE.md&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Underrated Mini Tips (most people don't know about these)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;30. Model Stacking&lt;/strong&gt; - Use other LLMs to plan your projects and generate advanced mega prompts before ever opening Claude Code — this strategy also saves tokens from Plan Mode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;29. Create custom subagents&lt;/strong&gt; - Define specialized assistants in &lt;code&gt;.claude/agents/&lt;/code&gt; that Claude can delegate to for isolated tasks.&lt;/p&gt;
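
&lt;p&gt;A subagent is a Markdown file with YAML frontmatter, something like the following (field names per my reading of the subagents docs, so double-check them):&lt;/p&gt;

```markdown
---
name: test-runner
description: Runs the test suite and summarizes failures. Use proactively after code changes.
tools: Bash, Read
---

You are a test-running specialist. Run the project's test suite,
then report only the failing tests, each with a one-line likely cause.
```

&lt;p&gt;Saved as &lt;code&gt;.claude/agents/test-runner.md&lt;/code&gt;, this lets Claude delegate test runs without polluting the main context.&lt;/p&gt;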

&lt;p&gt;&lt;strong&gt;28. Output Scoring&lt;/strong&gt; - Ask Claude to score its answer against your pre-defined success criteria.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;27. Install Plug-ins&lt;/strong&gt; - Run &lt;code&gt;/plugin&lt;/code&gt; to browse the marketplace. Plugins add skills, tools, and integrations without any configuration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;26. Claude Code taught IN Claude Code&lt;/strong&gt; - A course that teaches you Claude Code directly IN Claude Code: &lt;a href="https://ccforeveryone.com/" rel="noopener noreferrer"&gt;https://ccforeveryone.com/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;25. Claude Interviews&lt;/strong&gt; - For larger projects, have Claude interview you first. Start with a minimal prompt and ask Claude to interview you using the &lt;code&gt;AskUserQuestion&lt;/code&gt; tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;24. Correct Often&lt;/strong&gt; - Course-correct Claude often. The moment it starts going off track, stop (&lt;code&gt;ESC&lt;/code&gt; to stop Claude mid-action).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;23. Clear&lt;/strong&gt; - Run &lt;code&gt;/clear&lt;/code&gt; to start a clean session.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;22. Rewind&lt;/strong&gt; - Double-tap &lt;code&gt;ESC&lt;/code&gt; or run &lt;code&gt;/rewind&lt;/code&gt; to open the checkpoint menu.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;21. Run Multiple Sessions&lt;/strong&gt; - There are two main ways to run parallel sessions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Desktop:&lt;/strong&gt; Manage multiple local sessions visually. Each session gets its own isolated worktree.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Web:&lt;/strong&gt; Run on Anthropic's secure cloud infrastructure in isolated VMs.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Debugging, Error Handling, Common Failure Patterns
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;20. Step Isolation&lt;/strong&gt; - Re-run only the broken step instead of regenerating everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;19. Error Reproduction&lt;/strong&gt; - Ask Claude to intentionally reproduce the failure to understand it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;18. Rollback Prompts&lt;/strong&gt; - Revert to the last known good prompt and reapply changes one at a time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;17. Over-Specified CLAUDE.md&lt;/strong&gt; - If your &lt;code&gt;CLAUDE.md&lt;/code&gt; is too long, Claude ignores half of it because important rules get lost in the noise.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Fix: Ruthlessly prune. If Claude already does something correctly without the instruction, delete it or convert it to a hook.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;16. Don't make this mistake&lt;/strong&gt; - You start with one task, then ask Claude something unrelated, then go back to the first task. Context is full of irrelevant information.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Fix: &lt;code&gt;/clear&lt;/code&gt; between unrelated tasks.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;15. Over-Correcting&lt;/strong&gt; - Claude does something wrong, you correct it, it's still wrong, you correct again. Context is polluted with failed approaches.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Fix: After two failed corrections, &lt;code&gt;/clear&lt;/code&gt; and write a better initial prompt incorporating what you learned.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;14. Step-by-Step Replay&lt;/strong&gt; - Have Claude walk through how it generated the answer line by line.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;13. The Infinite Exploration&lt;/strong&gt; - You ask Claude to "investigate" something without scoping it. Claude reads hundreds of files, filling the context.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Fix: Scope investigations narrowly or use subagents so the exploration doesn't consume your main context.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;12. Debugging Project&lt;/strong&gt; - Create an AI project dedicated to debugging code (Grok 4 Heavy is good at debugging).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;11. Context Window Management&lt;/strong&gt; - Claude's context window fills up fast. As this happens, Claude may start forgetting earlier instructions. This page will help you eliminate that problem: &lt;a href="https://code.claude.com/docs/en/costs#reduce-token-usage" rel="noopener noreferrer"&gt;https://code.claude.com/docs/en/costs#reduce-token-usage&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Tips
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;10. Notion Database&lt;/strong&gt; - Connect your Notion database to Claude to store your best &amp;amp; most commonly used prompts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;9. Learn Claude Code in Action&lt;/strong&gt; - Anthropic's learning resources: &lt;a href="https://www.anthropic.com/learn" rel="noopener noreferrer"&gt;https://www.anthropic.com/learn&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8. Claude Courses&lt;/strong&gt; - Courses from Coursera: &lt;a href="https://www.anthropic.com/learn" rel="noopener noreferrer"&gt;https://www.anthropic.com/learn&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Boris's Setup&lt;/strong&gt; - How the creator of Claude Code gets the most out of it: Boris's Claude Code Setup Cheatsheet&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Claude Code Best Practices (DOC)&lt;/strong&gt; - Link to the latest doc: &lt;a href="https://code.claude.com/docs/en/best-practices" rel="noopener noreferrer"&gt;https://code.claude.com/docs/en/best-practices&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Safe Autonomous Mode&lt;/strong&gt; - Use &lt;code&gt;claude --dangerously-skip-permissions&lt;/code&gt; to bypass all permission checks and let Claude work uninterrupted. This works well for contained workflows like fixing lint errors or generating boilerplate code, but since it removes every safety prompt, only run it in an isolated environment (such as a container without internet access).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Slow &amp;amp; Steady&lt;/strong&gt; - Take your time. Especially if building a serious workflow. Plan. Plan. Plan. THEN, execute.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Claude Superpowers&lt;/strong&gt; - A GitHub Repo of Claude Code superpowers: &lt;a href="https://github.com/obra/superpowers" rel="noopener noreferrer"&gt;https://github.com/obra/superpowers&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Hooks&lt;/strong&gt; - Best for actions that must happen every time with zero exceptions.&lt;/p&gt;
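
&lt;p&gt;For example, a &lt;code&gt;PostToolUse&lt;/code&gt; hook in &lt;code&gt;.claude/settings.json&lt;/code&gt; can run a formatter after every file edit. The shape below follows the hooks schema as I understand it; verify the field names against the hooks documentation before relying on it:&lt;/p&gt;

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "prettier --write ." }
        ]
      }
    ]
  }
}
```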

&lt;p&gt;&lt;strong&gt;1. How to Extend Claude Code&lt;/strong&gt; - Anthropic's Guide: &lt;a href="https://code.claude.com/docs/en/features-overview" rel="noopener noreferrer"&gt;https://code.claude.com/docs/en/features-overview&lt;/a&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
