<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Donia Shaban</title>
    <description>The latest articles on Forem by Donia Shaban (@donia_shaban_757cda160187).</description>
    <link>https://forem.com/donia_shaban_757cda160187</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2882021%2F9d90e811-132f-4f11-90cc-8aca36ccda35.jpg</url>
      <title>Forem: Donia Shaban</title>
      <link>https://forem.com/donia_shaban_757cda160187</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/donia_shaban_757cda160187"/>
    <language>en</language>
    <item>
      <title>Rate Limiting in ASP.NET Core</title>
      <dc:creator>Donia Shaban</dc:creator>
      <pubDate>Sat, 23 May 2026 05:59:50 +0000</pubDate>
      <link>https://forem.com/donia_shaban_757cda160187/rate-limiting-in-aspnet-core-5elf</link>
      <guid>https://forem.com/donia_shaban_757cda160187/rate-limiting-in-aspnet-core-5elf</guid>
      <description>&lt;h3&gt;
  
  
  What is Rate Limiting?
&lt;/h3&gt;

&lt;p&gt;Rate limiting controls how many requests a client can make to your API within a specific time window. ASP.NET Core 7+ ships with built-in rate limiting middleware, so you don't need any third-party packages. It protects your API from abuse (DoS attacks), ensures fair usage among clients, controls infrastructure costs, and keeps the service stable under load.&lt;/p&gt;




&lt;h3&gt;
  
  
  Setup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;System.Threading.RateLimiting&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.AspNetCore.RateLimiting&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These two namespaces are essential. &lt;code&gt;System.Threading.RateLimiting&lt;/code&gt; contains the core algorithms and options classes. &lt;code&gt;Microsoft.AspNetCore.RateLimiting&lt;/code&gt; contains the middleware and the &lt;code&gt;[EnableRateLimiting]&lt;/code&gt; / &lt;code&gt;[DisableRateLimiting]&lt;/code&gt; attributes.&lt;/p&gt;




&lt;h3&gt;
  
  
  Registering the Middleware
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddRateLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This registers the rate limiter service with the DI container. Inside the lambda you define one or more &lt;strong&gt;policies&lt;/strong&gt; — each policy is a named rule that you can apply to endpoints later.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseRateLimiter&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



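&lt;p&gt;This plugs the middleware into the pipeline. Call it &lt;strong&gt;before&lt;/strong&gt; &lt;code&gt;app.MapControllers()&lt;/code&gt;, and if you call &lt;code&gt;app.UseRouting()&lt;/code&gt; explicitly, place it &lt;strong&gt;after&lt;/strong&gt; that call so that endpoint-specific policies can be resolved for the matched endpoint.&lt;/p&gt;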
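
&lt;p&gt;Putting the two calls together, a minimal &lt;code&gt;Program.cs&lt;/code&gt; might look like the sketch below. It assumes the default web project template with implicit usings enabled; the &lt;code&gt;/ping&lt;/code&gt; endpoint is a hypothetical placeholder, and the policy shown is only a stub for the policies described in the following sections.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// 1. Register the rate limiter service and its named policies.
builder.Services.AddRateLimiter(options =&amp;gt;
{
    // The middleware rejects with 503 by default; opt in to 429 explicitly.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddFixedWindowLimiter("DefaultPolicy", limiterOptions =&amp;gt;
    {
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.PermitLimit = 100;
    });
});

var app = builder.Build();

// 2. Add the middleware before mapping endpoints.
app.UseRateLimiter();

// Hypothetical endpoint, used only to show a policy being attached.
app.MapGet("/ping", () =&amp;gt; "pong").RequireRateLimiting("DefaultPolicy");

app.Run();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;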




&lt;h3&gt;
  
  
  Policy 1 — Fixed Window Limiter (&lt;code&gt;"DefaultPolicy"&lt;/code&gt;)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddFixedWindowLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"DefaultPolicy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limiterOptions&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Window&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PermitLimit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;QueueProcessingOrder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;QueueProcessingOrder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OldestFirst&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;QueueLimit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The idea:&lt;/strong&gt; Time is divided into fixed, non-overlapping windows (e.g. 0:00–1:00, 1:00–2:00, ...). Each window gets a fresh counter. The moment the window ends, the counter resets completely regardless of when requests arrived inside it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Line by line:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Window = TimeSpan.FromMinutes(1)&lt;/code&gt; — each time window lasts 1 minute. When the minute ends, the counter resets to zero.&lt;/li&gt;
&lt;li&gt;
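&lt;code&gt;PermitLimit = 100&lt;/code&gt; means at most 100 requests are allowed within each 1-minute window. Request #101 is queued if the queue has room (see &lt;code&gt;QueueLimit&lt;/code&gt; below); otherwise it is rejected. Note that the middleware's default rejection status is 503 Service Unavailable, so set &lt;code&gt;options.RejectionStatusCode = StatusCodes.Status429TooManyRequests&lt;/code&gt; if you want HTTP 429 Too Many Requests instead.&lt;/li&gt;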
&lt;li&gt;
&lt;code&gt;QueueProcessingOrder = QueueProcessingOrder.OldestFirst&lt;/code&gt; — when the limit is hit, excess requests can be queued. This says: serve the oldest waiting request first (FIFO). The alternative is &lt;code&gt;NewestFirst&lt;/code&gt; (LIFO).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;QueueLimit = 10&lt;/code&gt; — at most 10 requests can wait in the queue. If the queue is also full, the request is immediately rejected without waiting.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The problem with Fixed Window:&lt;/strong&gt; If 100 requests arrive in the last 5 seconds of minute 1 and another 100 arrive in the first 5 seconds of minute 2, all 200 are accepted within a 10-second span, twice the intended rate, because the counter resets completely at the window boundary. This is the burst problem that Sliding Window is meant to reduce.&lt;/p&gt;
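
&lt;p&gt;The boundary burst can be reproduced with a few lines of plain C#. The sketch below is a deliberately simplified, hypothetical counter (&lt;code&gt;FixedWindowCounter&lt;/code&gt; is not a library type and is not how &lt;code&gt;FixedWindowRateLimiter&lt;/code&gt; is implemented); it takes the current time as a parameter so the result is deterministic.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

var limiter = new FixedWindowCounter(permitLimit: 100, windowSeconds: 60);
int accepted = 0;

// 100 requests at second 55, near the end of the first window...
for (int i = 0; i &amp;lt; 100; i++)
    if (limiter.TryAcquire(nowSeconds: 55)) accepted++;

// ...and 100 more at second 65, just after the counter has reset.
for (int i = 0; i &amp;lt; 100; i++)
    if (limiter.TryAcquire(nowSeconds: 65)) accepted++;

Console.WriteLine(accepted); // 200 accepted within a 10-second span

// Hypothetical illustration type: one counter that resets at each window boundary.
class FixedWindowCounter
{
    private readonly int _permitLimit;
    private readonly int _windowSeconds;
    private long _currentWindow = -1;
    private int _count;

    public FixedWindowCounter(int permitLimit, int windowSeconds)
    {
        _permitLimit = permitLimit;
        _windowSeconds = windowSeconds;
    }

    public bool TryAcquire(long nowSeconds)
    {
        long window = nowSeconds / _windowSeconds;
        if (window != _currentWindow)
        {
            // A new window has started: forget everything counted so far.
            _currentWindow = window;
            _count = 0;
        }

        if (_count &amp;gt;= _permitLimit) return false;
        _count++;
        return true;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;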




&lt;h3&gt;
  
  
  Policy 2 — Sliding Window Limiter (&lt;code&gt;"SlidingWindow"&lt;/code&gt;)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddSlidingWindowLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"SlidingWindow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limiterOptions&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Window&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PermitLimit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;QueueProcessingOrder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;QueueProcessingOrder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OldestFirst&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;QueueLimit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SegmentsPerWindow&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AutoReplenishment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The idea:&lt;/strong&gt; Instead of one big window, the window is split into smaller &lt;strong&gt;segments&lt;/strong&gt;. As each segment expires, the requests that were counted in that segment become available again. The window "slides" forward segment by segment, so limits are enforced more smoothly and bursts at window boundaries are prevented.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Line by line:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Window = TimeSpan.FromMinutes(1)&lt;/code&gt; — the total window is still 1 minute.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PermitLimit = 100&lt;/code&gt; — still 100 requests per window overall.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SegmentsPerWindow = 6&lt;/code&gt; — the 1-minute window is split into 6 segments of 10 seconds each. Every 10 seconds, the oldest segment "falls off" and its request count is freed up.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AutoReplenishment = true&lt;/code&gt; — the replenishment (freeing up expired segments) happens automatically in the background. If you set this to &lt;code&gt;false&lt;/code&gt;, you'd have to call &lt;code&gt;TryReplenish()&lt;/code&gt; manually, which is rare.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;QueueProcessingOrder&lt;/code&gt; and &lt;code&gt;QueueLimit&lt;/code&gt; — same meaning as Fixed Window above.&lt;/li&gt;
&lt;/ul&gt;

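&lt;p&gt;&lt;strong&gt;Concrete example:&lt;/strong&gt; At second 0 you send 100 requests, so all 100 are counted in the first 10-second segment. Nothing is freed at second 10: those permits only come back when that segment slides out of the window, one full window (about 60 seconds) later. If the same 100 requests had instead been spread out, say 40 in seconds 0 to 10 and 60 in seconds 10 to 20, then 40 permits would return at around second 60 and the other 60 at around second 70. Capacity comes back segment by segment rather than all at once at a fixed boundary.&lt;/p&gt;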

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7wlrzuw0hnxdgrvpxvhz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7wlrzuw0hnxdgrvpxvhz.png" alt="interactive" width="799" height="443"&gt;&lt;/a&gt;&lt;/p&gt;
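
&lt;p&gt;To make the segment bookkeeping concrete, here is a small simulation of how permits are recycled. It is a simplified model written for this article's numbers (100 permits, 6 segments), not the library's actual implementation, and the variable names are illustrative. Permits taken in a segment return only when that segment's slot is reused, one full window later.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

const int permitLimit = 100;
const int segmentsPerWindow = 6;

// Permits taken in each segment of the current window.
var takenPerSegment = new int[segmentsPerWindow];
int current = 0;             // index of the segment we are in
int available = permitLimit;

// Second 0: 100 requests arrive, all in the first segment.
takenPerSegment[current] = 100;
available -= 100;

// Advance one 10-second segment at a time.
for (int second = 10; second &amp;lt;= 60; second += 10)
{
    current = (current + 1) % segmentsPerWindow;

    // The slot being reused holds the segment from one window ago:
    // its permits are recycled, then the slot starts counting afresh.
    available += takenPerSegment[current];
    takenPerSegment[current] = 0;

    Console.WriteLine($"t={second}s available={available}");
}
// Prints available=0 for t=10s through t=50s, then available=100 at t=60s.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;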




&lt;h3&gt;
  
  
  Policy 3 — Concurrency Limiter (&lt;code&gt;"Concurrency"&lt;/code&gt;)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddConcurrencyLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Concurrency"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limiterOptions&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PermitLimit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;QueueProcessingOrder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;QueueProcessingOrder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OldestFirst&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;limiterOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;QueueLimit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The idea:&lt;/strong&gt; This doesn't care about time windows at all. It limits how many requests are being &lt;strong&gt;processed simultaneously&lt;/strong&gt; at any given moment. Think of it like a semaphore — a permit is acquired when a request enters and released when it finishes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Line by line:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PermitLimit = 50&lt;/code&gt; — at most 50 requests can be actively running at the same time. If a 51st request comes in while all 50 slots are busy, it either queues or gets rejected.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;QueueLimit = 100&lt;/code&gt; — up to 100 requests can wait in line for a slot to free up.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;QueueProcessingOrder = QueueProcessingOrder.OldestFirst&lt;/code&gt; — the oldest queued request gets the next freed slot.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to use it:&lt;/strong&gt; Ideal for protecting expensive operations like DB-heavy endpoints or file processing, where you care about CPU/connection pool exhaustion rather than request rate over time.&lt;/p&gt;
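
&lt;p&gt;The semaphore-like behavior can be observed outside ASP.NET Core by using &lt;code&gt;ConcurrencyLimiter&lt;/code&gt; from &lt;code&gt;System.Threading.RateLimiting&lt;/code&gt; directly in a console app (assuming the package is referenced). The sketch uses a limit of 2 and no queue so the output is easy to follow; those numbers are illustrative, not the policy above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Threading.RateLimiting;

using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 2,
    QueueLimit = 0, // no queue: excess requests fail immediately
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
});

RateLimitLease first = limiter.AttemptAcquire();
RateLimitLease second = limiter.AttemptAcquire();
RateLimitLease third = limiter.AttemptAcquire();

Console.WriteLine(first.IsAcquired);  // True
Console.WriteLine(second.IsAcquired); // True
Console.WriteLine(third.IsAcquired);  // False: both permits are in use

// "Request finished": disposing the lease returns its permit.
first.Dispose();

RateLimitLease fourth = limiter.AttemptAcquire();
Console.WriteLine(fourth.IsAcquired); // True
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Unlike the window-based limiters, nothing here depends on a clock: capacity comes back only when a lease is disposed.&lt;/p&gt;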




&lt;h3&gt;
  
  
  Policy 4 — Per-User Policy (&lt;code&gt;"ApiUserPolicy"&lt;/code&gt;)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ApiUserPolicy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;httpContext&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="n"&gt;RateLimitPartition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetFixedWindowLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;partitionKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;httpContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Identity&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="s"&gt;"anonymous"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;factory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;FixedWindowRateLimiterOptions&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Window&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;PermitLimit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;AutoReplenishment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;
        &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The idea:&lt;/strong&gt; This is a &lt;strong&gt;partitioned&lt;/strong&gt; policy — meaning each user gets their own independent rate limit counter. User A's requests don't affect User B's quota. This is how real-world APIs (like GitHub's API) work: each authenticated user has their own limit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Line by line:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;RateLimitPartition.GetFixedWindowLimiter(...)&lt;/code&gt; — creates a Fixed Window limiter, but partitioned per key rather than global.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;partitionKey: httpContext.User.Identity?.Name ?? "anonymous"&lt;/code&gt; — the partition key is the &lt;strong&gt;authenticated username&lt;/strong&gt;. If the user is not authenticated, all unauthenticated users share the key &lt;code&gt;"anonymous"&lt;/code&gt; (meaning they share one limit together — a deliberate design to pressure them into authenticating).&lt;/li&gt;
&lt;li&gt;
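&lt;code&gt;PermitLimit = 1000&lt;/code&gt; gives each partition 1000 requests/minute: every authenticated user gets that quota individually, which suits API consumers with tokens, while all anonymous callers share a single 1000-request quota between them.&lt;/li&gt;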
&lt;li&gt;
&lt;code&gt;AutoReplenishment = true&lt;/code&gt; — the window resets automatically without manual intervention.&lt;/li&gt;
&lt;/ul&gt;
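
&lt;p&gt;Partitioning can be tried in isolation with &lt;code&gt;PartitionedRateLimiter.Create&lt;/code&gt;, which is the same building block without the HTTP layer. In this console sketch the "resource" is just a username string, the names &lt;code&gt;alice&lt;/code&gt; and &lt;code&gt;bob&lt;/code&gt; are made up, and the limit is lowered to 2 so the effect is visible immediately.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Threading.RateLimiting;

// One independent fixed-window counter per username.
using var limiter = PartitionedRateLimiter.Create&amp;lt;string, string&amp;gt;(userName =&amp;gt;
    RateLimitPartition.GetFixedWindowLimiter(
        partitionKey: userName,
        factory: _ =&amp;gt; new FixedWindowRateLimiterOptions
        {
            Window = TimeSpan.FromMinutes(1),
            PermitLimit = 2
        }));

Console.WriteLine(limiter.AttemptAcquire("alice").IsAcquired); // True
Console.WriteLine(limiter.AttemptAcquire("alice").IsAcquired); // True
Console.WriteLine(limiter.AttemptAcquire("alice").IsAcquired); // False: alice's quota is used up
Console.WriteLine(limiter.AttemptAcquire("bob").IsAcquired);   // True: bob has his own counter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;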




&lt;h3&gt;
  
  
  Policy 5 — Per-IP Policy (&lt;code&gt;"IpPolicy"&lt;/code&gt;)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"IpPolicy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;httpContext&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="n"&gt;RateLimitPartition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetSlidingWindowLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;partitionKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;httpContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RemoteIpAddress&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="s"&gt;"unknown"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;factory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;SlidingWindowRateLimiterOptions&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Window&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;PermitLimit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;SegmentsPerWindow&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;AutoReplenishment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;
        &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The idea:&lt;/strong&gt; Same partitioned concept as above, but the partition key is the &lt;strong&gt;client's IP address&lt;/strong&gt; instead of the username. This is useful for public endpoints where users aren't authenticated — you limit by IP to prevent one machine from hammering your API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Line by line:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown"&lt;/code&gt; — extracts the caller's IP address as a string (e.g. &lt;code&gt;"192.168.1.5"&lt;/code&gt;). If somehow the IP can't be determined, falls back to &lt;code&gt;"unknown"&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Uses a Sliding Window internally (same as Policy 2) — smoother enforcement, no burst problem at boundaries.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PermitLimit = 100&lt;/code&gt; with &lt;code&gt;SegmentsPerWindow = 6&lt;/code&gt; — each IP gets 100 requests/minute, enforced per 10-second segments.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Applying Policies to Endpoints
&lt;/h3&gt;

&lt;p&gt;There are two ways:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Via attribute on a Controller action:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;HttpGet&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;EnableRateLimiting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policyName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"DefaultPolicy"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IActionResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;[EnableRateLimiting]&lt;/code&gt; attribute on the &lt;code&gt;Get&lt;/code&gt; action tells the middleware: apply &lt;code&gt;"DefaultPolicy"&lt;/code&gt; to this specific endpoint. Other actions in the same controller (like &lt;code&gt;GetById&lt;/code&gt;, &lt;code&gt;Post&lt;/code&gt;, &lt;code&gt;Put&lt;/code&gt;, &lt;code&gt;Delete&lt;/code&gt;) have no attribute, so they are &lt;strong&gt;not rate-limited&lt;/strong&gt; by default.&lt;/p&gt;
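
&lt;p&gt;The attribute can also be placed on the controller class, with &lt;code&gt;[DisableRateLimiting]&lt;/code&gt; used to opt individual actions out. The controller below is a hypothetical example (&lt;code&gt;ReportsController&lt;/code&gt; and its routes are not from the original project) and assumes a policy named &lt;code&gt;"DefaultPolicy"&lt;/code&gt; has been registered.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("DefaultPolicy")] // applies to every action in this controller
public class ReportsController : ControllerBase
{
    [HttpGet]
    public IActionResult Get() =&amp;gt; Ok("limited by DefaultPolicy");

    [HttpGet("health")]
    [DisableRateLimiting] // opts this one action out of rate limiting
    public IActionResult Health() =&amp;gt; Ok("not rate limited");
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;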

&lt;p&gt;&lt;strong&gt;Via Minimal API:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/api/products-mn"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(...)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;RequireRateLimiting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"DefaultPolicy"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Minimal APIs, you chain &lt;code&gt;.RequireRateLimiting("policyName")&lt;/code&gt; on the endpoint definition. The second Minimal API endpoint (&lt;code&gt;/api/products-mn/{productId:int}&lt;/code&gt;) has no &lt;code&gt;.RequireRateLimiting()&lt;/code&gt; call, so it's unrestricted.&lt;/p&gt;




&lt;h3&gt;
  
  
  What Happens When the Limit is Exceeded?
&lt;/h3&gt;

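&lt;p&gt;When a request is rejected (limit hit + queue full), the middleware short-circuits the request and responds with the status code in &lt;code&gt;options.RejectionStatusCode&lt;/code&gt;. That defaults to &lt;strong&gt;503 Service Unavailable&lt;/strong&gt;, so to return &lt;strong&gt;HTTP 429 Too Many Requests&lt;/strong&gt; you either set &lt;code&gt;options.RejectionStatusCode = StatusCodes.Status429TooManyRequests&lt;/code&gt; or set the status yourself in a global &lt;code&gt;options.OnRejected&lt;/code&gt; callback:&lt;br&gt;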
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnRejected&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HttpContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StatusCode&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;429&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HttpContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Too many requests. Please slow down."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
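
&lt;p&gt;A rejection handler is also a natural place to tell clients when to retry. The sketch below is one possible shape, assuming the default web project template; the &lt;code&gt;/ping&lt;/code&gt; endpoint is a hypothetical placeholder. It reads the optional &lt;code&gt;RetryAfter&lt;/code&gt; metadata from the rejected lease and, when the limiter supplies it, copies it into the &lt;code&gt;Retry-After&lt;/code&gt; response header.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Globalization;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =&amp;gt;
{
    options.AddFixedWindowLimiter("DefaultPolicy", limiterOptions =&amp;gt;
    {
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.PermitLimit = 100;
    });

    options.OnRejected = async (context, cancellationToken) =&amp;gt;
    {
        // Not every limiter provides a retry hint, so treat it as optional.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString(NumberFormatInfo.InvariantInfo);
        }

        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Please slow down.", cancellationToken);
    };
});

var app = builder.Build();
app.UseRateLimiter();

// Hypothetical endpoint used only to exercise the policy.
app.MapGet("/ping", () =&amp;gt; "pong").RequireRateLimiting("DefaultPolicy");

app.Run();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;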






&lt;h3&gt;
  
  
  Summary of All Policies
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Policy&lt;/th&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Limit&lt;/th&gt;
&lt;th&gt;Partition By&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DefaultPolicy&lt;/td&gt;
&lt;td&gt;Fixed Window&lt;/td&gt;
&lt;td&gt;100 req/min&lt;/td&gt;
&lt;td&gt;Global&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SlidingWindow&lt;/td&gt;
&lt;td&gt;Sliding Window&lt;/td&gt;
&lt;td&gt;100 req/min&lt;/td&gt;
&lt;td&gt;Global&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Concurrency&lt;/td&gt;
&lt;td&gt;Concurrency&lt;/td&gt;
&lt;td&gt;50 simultaneous&lt;/td&gt;
&lt;td&gt;Global&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ApiUserPolicy&lt;/td&gt;
&lt;td&gt;Fixed Window&lt;/td&gt;
&lt;td&gt;1000 req/min&lt;/td&gt;
&lt;td&gt;Per username&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IpPolicy&lt;/td&gt;
&lt;td&gt;Sliding Window&lt;/td&gt;
&lt;td&gt;100 req/min&lt;/td&gt;
&lt;td&gt;Per IP address&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

</description>
      <category>api</category>
      <category>csharp</category>
      <category>dotnet</category>
      <category>security</category>
    </item>
    <item>
      <title>Caching in ASP.NET Core</title>
      <dc:creator>Donia Shaban</dc:creator>
      <pubDate>Wed, 20 May 2026 03:35:57 +0000</pubDate>
      <link>https://forem.com/donia_shaban_757cda160187/caching-in-aspnet-core-29e</link>
      <guid>https://forem.com/donia_shaban_757cda160187/caching-in-aspnet-core-29e</guid>
      <description>&lt;h2&gt;
  
  
  Caching in ASP.NET Core — A Complete Guide
&lt;/h2&gt;

&lt;p&gt;Caching is the single most impactful performance optimization you can apply to a web app. The idea is simple: instead of recomputing or re-fetching the same data on every request, you store it somewhere fast and serve it from there. ASP.NET Core gives you three distinct caching mechanisms, each operating at a different layer of your app. Let's break each one down.&lt;/p&gt;




&lt;h3&gt;
  
  
  Why do we actually use caching?
&lt;/h3&gt;

&lt;p&gt;An often-quoted industry figure is that a &lt;strong&gt;100ms delay causes 7% fewer conversions&lt;/strong&gt;, meaning that if your checkout page takes even one tenth of a second longer, you statistically lose customers. Amazon is frequently credited with a similar finding about latency and sales. Another widely repeated figure is that a &lt;strong&gt;3-second load time loses 40% of users&lt;/strong&gt; entirely, because they leave before the page finishes.&lt;/p&gt;

&lt;p&gt;The reason caching is the go-to fix is the bottleneck breakdown the PDF shows: &lt;strong&gt;60% of slowness comes from the database&lt;/strong&gt;, 25% from slow API calls, and only 15% from memory/other issues. Caching directly attacks the biggest problem — the DB. Instead of hitting SQL Server 1000 times for the same product data, you hit it once and serve the cached result 999 times. That's where the "80–90% DB load reduction" figure comes from.&lt;/p&gt;

&lt;p&gt;The Netflix example (70% startup time reduction) is real — they heavily cache user profiles, recommendation lists, and content metadata in Redis so that when you open the app, almost nothing needs a live DB query.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where do the actual problems come from?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Database (60% of problems)&lt;/strong&gt; — This is the classic N+1 query problem, missing indexes, fetching entire tables when you need 3 rows, and hitting the DB on every single request for data that barely changes (like a list of product categories). Caching fixes this directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;APIs (25% of problems)&lt;/strong&gt; — Calling external services (payment gateways, weather APIs, third-party data) on every request. If you call an exchange rate API on every page load, you're adding 200–500ms of network latency every time. Cache that response for 5 minutes and the latency disappears.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory (15% of problems)&lt;/strong&gt; — This is actually caused by bad caching, not a lack of it. When you cache without expiration policies or cache huge objects carelessly, you put pressure on the GC (Garbage Collector). The server starts spending CPU time collecting memory instead of serving requests. This is why the "set expiration policies" rule matters so much — cache is not free RAM, it's borrowed RAM.&lt;/p&gt;
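
&lt;p&gt;As a rough illustration of "borrowed RAM", the console sketch below bounds a cache in two ways: a size limit on the cache itself and expiration on each entry. It assumes the &lt;code&gt;Microsoft.Extensions.Caching.Memory&lt;/code&gt; package; the key, the sample data, and the specific limits are made up for the example, and size units are whatever you decide they mean.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using Microsoft.Extensions.Caching.Memory;

// Cap the cache at 100 size units so it cannot grow without bound.
using var cache = new MemoryCache(new MemoryCacheOptions { SizeLimit = 100 });

var entryOptions = new MemoryCacheEntryOptions
{
    // Hard upper bound on how stale the entry can get.
    AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5),
    // Evict sooner if nobody reads the entry for a minute.
    SlidingExpiration = TimeSpan.FromMinutes(1),
    // Every entry must declare a size once SizeLimit is set.
    Size = 1
};

cache.Set("categories", new[] { "Books", "Games" }, entryOptions);

if (cache.TryGetValue("categories", out string[]? categories))
{
    Console.WriteLine(string.Join(", ", categories!)); // Books, Games
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;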

&lt;p&gt;The performance metrics from the PDF are also worth internalizing for your own apps: target average response time under 200ms, keep CPU below 70%, memory below 80%, and HTTP 5xx errors below 0.1%. Those are the numbers you'd put on a production monitoring dashboard.&lt;/p&gt;




&lt;h3&gt;
  
  
  1. In-Memory Cache (&lt;code&gt;IMemoryCache&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Data is stored directly in the server process's RAM. It's the fastest cache available — a dictionary lookup with no network round-trip.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it lives:&lt;/strong&gt; Inside your application process. If you restart the server, the cache is gone. If you have multiple servers behind a load balancer, each server has its own isolated cache.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Single-server apps, frequently read but rarely changed data (e.g., lookup tables, config values, user roles).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvwkeua049c5vdfwbj4gj.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvwkeua049c5vdfwbj4gj.jpg" alt="in_memory_cache_flow" width="800" height="306"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to use it:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Program.cs&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddMemoryCache&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// In your service&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProductService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IMemoryCache&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AppDbContext&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Product&lt;/span&gt;&lt;span class="p"&gt;?&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GetProductAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;TryGetValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;$"product:&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;out&lt;/span&gt; &lt;span class="n"&gt;Product&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Products&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FindAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;MemoryCacheEntryOptions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SetAbsoluteExpiration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SetSlidingExpiration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

        &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;$"product:&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key concepts to know:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Absolute expiration&lt;/strong&gt; — the item is always removed after X time, no matter how much it's accessed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sliding expiration&lt;/strong&gt; — the timer resets every time the item is accessed; it's evicted only if nobody touches it for X time.&lt;/li&gt;
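&lt;li&gt;
&lt;strong&gt;Eviction policies&lt;/strong&gt;: &lt;code&gt;IMemoryCache&lt;/code&gt; does not watch available RAM by default. If you set a &lt;code&gt;SizeLimit&lt;/code&gt; (and a size on each entry), the cache compacts when an insert would exceed the limit, dropping expired entries first, then lower-priority entries, then the least recently used ones. Set &lt;code&gt;CacheItemPriority&lt;/code&gt; to protect critical entries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache Stampede&lt;/strong&gt;: if 100 requests arrive simultaneously on a cache miss, they all hit the DB at once. &lt;code&gt;GetOrCreateAsync&lt;/code&gt; alone does not prevent this, because its factory can run concurrently for the same key. Guard the load with a &lt;code&gt;SemaphoreSlim&lt;/code&gt; lock, or use a cache with built-in stampede protection such as &lt;code&gt;HybridCache&lt;/code&gt; (.NET 9+).&lt;/li&gt;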
&lt;/ul&gt;
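
&lt;p&gt;The cache stampede item above can be handled with a per-key lock and a second lookup once the lock is held. This is a minimal sketch, assuming a hypothetical &lt;code&gt;StampedeGuard&lt;/code&gt; helper; it never removes unused locks, which a production version would probably want to do.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Collections.Concurrent;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical helper: lets only one caller per key run the expensive load.
public sealed class StampedeGuard(IMemoryCache cache)
{
    private readonly ConcurrentDictionary&amp;lt;string, SemaphoreSlim&amp;gt; _locks = new();

    public async Task&amp;lt;T?&amp;gt; GetOrLoadAsync&amp;lt;T&amp;gt;(string key, Func&amp;lt;Task&amp;lt;T?&amp;gt;&amp;gt; load, TimeSpan ttl)
    {
        // Fast path: no locking when the value is already cached.
        if (cache.TryGetValue(key, out T? hit))
            return hit;

        var gate = _locks.GetOrAdd(key, _ =&amp;gt; new SemaphoreSlim(1, 1));
        await gate.WaitAsync();
        try
        {
            // Re-check: another caller may have filled the entry while we waited.
            if (cache.TryGetValue(key, out hit))
                return hit;

            var value = await load();
            cache.Set(key, value, ttl);
            return value;
        }
        finally
        {
            gate.Release();
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A call might look like &lt;code&gt;await guard.GetOrLoadAsync&amp;lt;Product&amp;gt;($"product:{id}", async () =&amp;gt; await db.Products.FindAsync(id), TimeSpan.FromMinutes(10))&lt;/code&gt;. Of 100 simultaneous misses for the same key, one runs the query and the other 99 read the cached result.&lt;/p&gt;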




&lt;h3&gt;
  
  
  2. Distributed Cache (&lt;code&gt;IDistributedCache&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Data is stored in an external shared store (Redis being the most common choice) that all your servers can reach. When you scale out to 3 servers, they all read from and write to the same cache.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it lives:&lt;/strong&gt; Outside your app process, usually Redis or SQL Server.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Multi-server deployments, session data, anything that must be consistent across servers, large-scale apps.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbv1olfknb12x2yjlx74l.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbv1olfknb12x2yjlx74l.jpg" alt="distributed_cache_flow" width="800" height="353"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to use it (with Redis):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Program.cs (requires the Microsoft.Extensions.Caching.StackExchangeRedis NuGet package)&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddStackExchangeRedisCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Configuration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"localhost:6379"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InstanceName&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"MyApp:"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// In your service (requires: using System.Text.Json; using Microsoft.Extensions.Caching.Distributed;)&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProductService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IDistributedCache&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AppDbContext&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Product&lt;/span&gt;&lt;span class="p"&gt;?&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GetProductAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;$"product:&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;cached&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetStringAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;JsonSerializer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Deserialize&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Product&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Products&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FindAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;DistributedCacheEntryOptions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SetAbsoluteExpirationRelativeToNow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SetStringAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;JsonSerializer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Serialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that unlike &lt;code&gt;IMemoryCache&lt;/code&gt;, the distributed cache stores &lt;strong&gt;bytes/strings&lt;/strong&gt;, so you serialize/deserialize yourself. This is the price of network storage — but the payoff is consistency across all your servers.&lt;/p&gt;
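
&lt;p&gt;Because the serialize/deserialize steps repeat for every cached type, they can be pulled into one helper. The extension method below is a hypothetical sketch, not part of the framework; it assumes JSON via &lt;code&gt;System.Text.Json&lt;/code&gt; and chooses not to cache &lt;code&gt;null&lt;/code&gt; results.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;

// Hypothetical extension method; the name and shape are illustrative.
public static class DistributedCacheJsonExtensions
{
    public static async Task&amp;lt;T?&amp;gt; GetOrSetJsonAsync&amp;lt;T&amp;gt;(
        this IDistributedCache cache,
        string key,
        Func&amp;lt;Task&amp;lt;T?&amp;gt;&amp;gt; load,
        TimeSpan ttl,
        CancellationToken ct = default) where T : class
    {
        // Cache hit: turn the stored JSON string back into an object.
        var json = await cache.GetStringAsync(key, ct);
        if (json is not null)
            return JsonSerializer.Deserialize&amp;lt;T&amp;gt;(json);

        var value = await load();
        if (value is null)
            return null; // assumption: misses are not cached

        var options = new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = ttl
        };
        await cache.SetStringAsync(key, JsonSerializer.Serialize(value), options, ct);
        return value;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With this in place, the body of &lt;code&gt;GetProductAsync&lt;/code&gt; could shrink to a single call such as &lt;code&gt;cache.GetOrSetJsonAsync&amp;lt;Product&amp;gt;($"product:{id}", async () =&amp;gt; await db.Products.FindAsync(id), TimeSpan.FromMinutes(10))&lt;/code&gt;.&lt;/p&gt;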




&lt;h3&gt;
  
  
  3. Output Caching &amp;amp; Response Caching
&lt;/h3&gt;

&lt;p&gt;These two are often confused. They both cache the HTTP response, but they work at different layers.&lt;/p&gt;

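&lt;p&gt;&lt;strong&gt;Response Caching&lt;/strong&gt;: the &lt;code&gt;[ResponseCache]&lt;/code&gt; attribute sets HTTP cache headers (such as &lt;code&gt;Cache-Control&lt;/code&gt;) that tell the &lt;strong&gt;client or proxy&lt;/strong&gt; to cache the response. The attribute itself stores nothing on the server. The optional Response Caching Middleware (&lt;code&gt;UseResponseCaching&lt;/code&gt;) can also serve cached copies from server memory, but it obeys the request's own cache headers, so a client sending &lt;code&gt;Cache-Control: no-cache&lt;/code&gt; bypasses it.&lt;/p&gt;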

&lt;p&gt;&lt;strong&gt;Output Caching&lt;/strong&gt; (ASP.NET Core 7+) — the &lt;strong&gt;server&lt;/strong&gt; caches the full HTTP response and serves it directly on repeat requests, before your controller action even executes. No business logic, no DB query.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzcg1w8y59fsjdlc5s9b5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzcg1w8y59fsjdlc5s9b5.jpg" alt="output_vs_response_cache_fixed" width="800" height="964"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output Caching setup:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Program.cs&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddOutputCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddBasePolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Cache&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Cache&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Expire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Tag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"products-tag"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseOutputCache&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// In your controller&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;HttpGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;OutputCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PolicyName&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Products"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IActionResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GetProducts&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Invalidation: purge a specific tag (_cache is an injected IOutputCacheStore)&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EvictByTagAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"products-tag"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
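
&lt;p&gt;The invalidation call above needs an &lt;code&gt;IOutputCacheStore&lt;/code&gt;, which &lt;code&gt;AddOutputCache&lt;/code&gt; registers for dependency injection. This is a minimal sketch of where that call might live, assuming a hypothetical admin controller and route:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.OutputCaching;

[ApiController]
[Route("api")]
public class ProductsAdminController(IOutputCacheStore store) : ControllerBase
{
    // Hypothetical write endpoint: after product data changes, drop every
    // cached response that was tagged "products-tag".
    [HttpPost("products/refresh")]
    public async Task&amp;lt;IActionResult&amp;gt; Refresh(CancellationToken token)
    {
        await store.EvictByTagAsync("products-tag", token);
        return NoContent();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The next &lt;code&gt;GET&lt;/code&gt; to a tagged endpoint then runs the action again and repopulates the cache.&lt;/p&gt;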



&lt;p&gt;&lt;strong&gt;Response Caching setup:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Program.cs&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddResponseCaching&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseResponseCaching&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// In your controller&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;HttpGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;ResponseCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VaryByQueryKeys&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s"&gt;"category"&lt;/span&gt; &lt;span class="p"&gt;})]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IActionResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GetProducts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



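&lt;p&gt;The &lt;code&gt;ResponseCache&lt;/code&gt; attribute tells the framework to emit &lt;code&gt;Cache-Control: public, max-age=300&lt;/code&gt; headers, and the browser or CDN does the actual caching. &lt;code&gt;VaryByQueryKeys&lt;/code&gt; is the exception: it is handled by the Response Caching Middleware, so it only works when &lt;code&gt;UseResponseCaching()&lt;/code&gt; is registered.&lt;/p&gt;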
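
&lt;p&gt;The same attribute can also restrict or disable caching, which matters for per-user responses. The controller, routes, and payloads below are hypothetical; &lt;code&gt;Location&lt;/code&gt; and &lt;code&gt;NoStore&lt;/code&gt; are properties of &lt;code&gt;ResponseCacheAttribute&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/account")]
public class AccountController : ControllerBase
{
    // Per-user data: let the browser cache it, but not shared proxies or CDNs.
    [HttpGet("preferences")]
    [ResponseCache(Duration = 60, Location = ResponseCacheLocation.Client)]
    public IActionResult GetPreferences() =&amp;gt; Ok(new { theme = "dark" });

    // Sensitive data: ask every cache not to store the response at all.
    [HttpGet("balance")]
    [ResponseCache(Location = ResponseCacheLocation.None, NoStore = true)]
    public IActionResult GetBalance() =&amp;gt; Ok(new { amount = 0m });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;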




&lt;h3&gt;
  
  
  Quick Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Feature&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;In-Memory&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Distributed&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Output Cache&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Response Cache&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where stored&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Server RAM&lt;/td&gt;
&lt;td&gt;Redis / SQL Server&lt;/td&gt;
&lt;td&gt;Server (middleware)&lt;/td&gt;
&lt;td&gt;Client / Proxy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Survives restart?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No (unless backed by Redis)&lt;/td&gt;
&lt;td&gt;Yes (in browser)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-server safe?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Only with a shared store (e.g. Redis)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Caches&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Any object&lt;/td&gt;
&lt;td&gt;Bytes / strings&lt;/td&gt;
&lt;td&gt;Full HTTP response&lt;/td&gt;
&lt;td&gt;Full HTTP response&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Granularity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fine (per key)&lt;/td&gt;
&lt;td&gt;Fine (per key)&lt;/td&gt;
&lt;td&gt;Per endpoint/route&lt;/td&gt;
&lt;td&gt;Per URL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fast lookups, small data&lt;/td&gt;
&lt;td&gt;Sessions, scaled apps&lt;/td&gt;
&lt;td&gt;Read-heavy API endpoints&lt;/td&gt;
&lt;td&gt;Public, static-ish content&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




</description>
      <category>backend</category>
      <category>dotnet</category>
      <category>performance</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
