<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ilya Masliev</title>
    <description>The latest articles on Forem by Ilya Masliev (@ilya_masliev).</description>
    <link>https://forem.com/ilya_masliev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3458741%2Ffd5993f8-adf5-4e74-889e-4c9252f5447d.jpeg</url>
      <title>Forem: Ilya Masliev</title>
      <link>https://forem.com/ilya_masliev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ilya_masliev"/>
    <language>en</language>
    <item>
      <title>Building a Resilience Engine in Python: Internals of LimitPal (Part 2)</title>
      <dc:creator>Ilya Masliev</dc:creator>
      <pubDate>Fri, 06 Feb 2026 11:26:06 +0000</pubDate>
      <link>https://forem.com/ilya_masliev/building-a-resilience-engine-in-python-internals-of-limitpal-part-2-hm1</link>
      <guid>https://forem.com/ilya_masliev/building-a-resilience-engine-in-python-internals-of-limitpal-part-2-hm1</guid>
      <description>&lt;p&gt;&lt;em&gt;How the executor pipeline, clock abstraction, and circuit breaker architecture actually work.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If you haven’t read Part 1, the short version:&lt;/p&gt;

&lt;p&gt;Resilience shouldn’t be a pile of decorators.&lt;br&gt;
It should be a system.&lt;/p&gt;

&lt;p&gt;Part 1 explained the motivation.&lt;/p&gt;

&lt;p&gt;This post is about &lt;strong&gt;how the system is built&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  The core design constraint
&lt;/h2&gt;

&lt;p&gt;I started with one rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Every resilience feature must compose cleanly with others.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Most libraries solve &lt;em&gt;a single concern&lt;/em&gt; well.&lt;/p&gt;

&lt;p&gt;But composition is where systems break.&lt;/p&gt;

&lt;p&gt;Retry + rate limiting + circuit breaker is not additive.&lt;br&gt;
It’s architectural.&lt;/p&gt;

&lt;p&gt;So LimitPal is built around one idea:&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;A single execution pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Everything plugs into it.&lt;/p&gt;


&lt;h2&gt;
  
  
  The executor pipeline
&lt;/h2&gt;

&lt;p&gt;Every call flows through the same stages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Circuit breaker → Rate limiter → Retry loop → Result recording
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ordering is deliberate, not arbitrary.&lt;/p&gt;
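&lt;p&gt;In sketch form (illustrative only; names like &lt;code&gt;allows()&lt;/code&gt; and &lt;code&gt;try_acquire()&lt;/code&gt; are assumptions, not LimitPal’s actual API), the pipeline composes roughly like this:&lt;/p&gt;

```python
class PipelineRejected(Exception):
    """Raised when the breaker or limiter rejects the call up front."""
    pass

def run_pipeline(breaker, limiter, max_attempts, fn):
    # Stage 1: circuit breaker first -- fail fast, consume nothing.
    if not breaker.allows():
        raise PipelineRejected("circuit open")
    last_err = None
    for _attempt in range(max_attempts):
        # Stage 2: rate limiter -- every attempt, retries included, pays.
        if not limiter.try_acquire():
            raise PipelineRejected("rate limited")
        try:
            result = fn()              # Stage 3: the actual call
            breaker.record_success()   # Stage 4: close the feedback loop
            return result
        except Exception as err:
            breaker.record_failure()
            last_err = err
    raise last_err
```

&lt;p&gt;The point is structural: there is exactly one place where the breaker, the limiter, and the retry loop meet.&lt;/p&gt;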

&lt;h3&gt;
  
  
  Step 1: Circuit breaker first
&lt;/h3&gt;

&lt;p&gt;Fail fast.&lt;/p&gt;

&lt;p&gt;If the upstream service is already down,&lt;br&gt;
don’t waste tokens,&lt;br&gt;
don’t trigger retries,&lt;br&gt;
don’t create load.&lt;/p&gt;

&lt;p&gt;This protects &lt;em&gt;your own&lt;/em&gt; system.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 2: Rate limiter
&lt;/h3&gt;

&lt;p&gt;Only after we know execution is allowed&lt;br&gt;
do we consume capacity.&lt;/p&gt;

&lt;p&gt;This ensures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;breaker failures don’t eat quota&lt;/li&gt;
&lt;li&gt;retries still respect rate limits&lt;/li&gt;
&lt;li&gt;burst behavior stays predictable&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Step 3: Retry loop
&lt;/h3&gt;

&lt;p&gt;Retry lives &lt;strong&gt;inside&lt;/strong&gt; the limiter window.&lt;/p&gt;

&lt;p&gt;Not outside.&lt;/p&gt;

&lt;p&gt;This is important.&lt;/p&gt;

&lt;p&gt;If retry lived outside,&lt;br&gt;
one logical call could consume infinite capacity.&lt;/p&gt;

&lt;p&gt;Inside the window:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A call is a budgeted operation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That constraint keeps systems stable under stress.&lt;/p&gt;
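&lt;p&gt;A toy demonstration of the budget (hypothetical names; not LimitPal’s classes): with a fixed token budget, a failing call stops retrying once the budget is spent, no matter how many attempts it was allowed.&lt;/p&gt;

```python
class CountingLimiter:
    """Toy limiter with a fixed token budget (illustrative only)."""
    def __init__(self, tokens: int):
        self.tokens = tokens

    def try_acquire(self) -> bool:
        if self.tokens == 0:
            return False
        self.tokens -= 1
        return True

def call_with_budget(limiter, max_attempts: int, fn):
    """Retry inside the limiter window: every attempt must acquire a
    token first, so one logical call can never exceed its budget."""
    for _ in range(max_attempts):
        if not limiter.try_acquire():
            return None  # budget exhausted: stop, don't retry forever
        try:
            return fn()
        except Exception:
            continue
    return None
```

&lt;p&gt;Five attempts allowed, two tokens available: the function runs exactly twice.&lt;/p&gt;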
&lt;h3&gt;
  
  
  Step 4: Result recording
&lt;/h3&gt;

&lt;p&gt;Success/failure feedback feeds the breaker.&lt;/p&gt;

&lt;p&gt;This closes the loop.&lt;/p&gt;

&lt;p&gt;The executor isn’t just running code —&lt;br&gt;
it’s adapting to system health.&lt;/p&gt;


&lt;h2&gt;
  
  
  Why decorators fail here
&lt;/h2&gt;

&lt;p&gt;Decorators look composable.&lt;/p&gt;

&lt;p&gt;They aren’t.&lt;/p&gt;

&lt;p&gt;Each decorator:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;owns its own time model&lt;/li&gt;
&lt;li&gt;owns its own retry logic&lt;/li&gt;
&lt;li&gt;owns its own failure semantics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stack them and you get:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;emergent behavior you didn’t design&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The executor forces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a shared clock&lt;/li&gt;
&lt;li&gt;a shared failure model&lt;/li&gt;
&lt;li&gt;a shared execution lifecycle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s what makes the system predictable.&lt;/p&gt;


&lt;h2&gt;
  
  
  The clock abstraction (the hidden hero)
&lt;/h2&gt;

&lt;p&gt;Time is the hardest dependency in resilience systems.&lt;/p&gt;

&lt;p&gt;Retries depend on time.&lt;br&gt;
Rate limiting depends on time.&lt;br&gt;
Circuit breakers depend on time.&lt;/p&gt;

&lt;p&gt;If every component calls &lt;code&gt;time.time()&lt;/code&gt; directly:&lt;/p&gt;

&lt;p&gt;You lose control.&lt;/p&gt;

&lt;p&gt;LimitPal introduces a pluggable clock:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Clock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Protocol&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sleep_async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Everything uses this.&lt;/p&gt;

&lt;p&gt;Not system time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Production clock
&lt;/h3&gt;

&lt;p&gt;Uses monotonic time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;immune to system clock jumps&lt;/li&gt;
&lt;li&gt;safe under NTP sync&lt;/li&gt;
&lt;li&gt;stable under container migrations&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  MockClock
&lt;/h3&gt;

&lt;p&gt;Tests become deterministic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;clock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;advance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No waiting.&lt;br&gt;
No flakiness.&lt;br&gt;
No race conditions.&lt;/p&gt;

&lt;p&gt;You can simulate minutes of retry behavior instantly.&lt;/p&gt;

&lt;p&gt;This isn’t a testing trick.&lt;/p&gt;

&lt;p&gt;It’s architectural control over time.&lt;/p&gt;
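&lt;p&gt;A minimal sketch of both clocks satisfying the protocol above (the sync half only; the real protocol also has &lt;code&gt;sleep_async&lt;/code&gt;, and the &lt;code&gt;advance&lt;/code&gt; semantics here are an assumption):&lt;/p&gt;

```python
import time

class MonotonicClock:
    """Production clock: monotonic, immune to wall-clock jumps."""
    def now(self) -> float:
        return time.monotonic()

    def sleep(self, seconds: float) -> None:
        time.sleep(seconds)

class MockClock:
    """Test clock: time only moves when you tell it to."""
    def __init__(self) -> None:
        self._now = 0.0

    def now(self) -> float:
        return self._now

    def sleep(self, seconds: float) -> None:
        # "Sleeping" just moves virtual time forward -- instant in real time.
        self._now += seconds

    def advance(self, seconds: float) -> None:
        self._now += seconds
```

&lt;p&gt;Because every component asks the clock instead of the OS, swapping &lt;code&gt;MonotonicClock&lt;/code&gt; for &lt;code&gt;MockClock&lt;/code&gt; changes nothing about the logic, only about who owns time.&lt;/p&gt;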


&lt;h2&gt;
  
  
  Circuit breaker architecture
&lt;/h2&gt;

&lt;p&gt;The breaker is a state machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLOSED → OPEN → HALF_OPEN → CLOSED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But the tricky part isn’t the states.&lt;/p&gt;

&lt;p&gt;It’s &lt;strong&gt;transition discipline&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  CLOSED
&lt;/h3&gt;

&lt;p&gt;Normal operation.&lt;/p&gt;

&lt;p&gt;Failures increment a counter.&lt;br&gt;
Success resets it.&lt;/p&gt;

&lt;p&gt;When the threshold is reached → OPEN.&lt;/p&gt;
&lt;h3&gt;
  
  
  OPEN
&lt;/h3&gt;

&lt;p&gt;All calls fail immediately.&lt;/p&gt;

&lt;p&gt;No retry.&lt;br&gt;
No limiter usage.&lt;/p&gt;

&lt;p&gt;Just fast rejection.&lt;/p&gt;

&lt;p&gt;After recovery timeout → HALF_OPEN.&lt;/p&gt;
&lt;h3&gt;
  
  
  HALF_OPEN
&lt;/h3&gt;

&lt;p&gt;Limited probing phase.&lt;/p&gt;

&lt;p&gt;We allow a small number of calls.&lt;/p&gt;

&lt;p&gt;If they succeed → CLOSED.&lt;br&gt;
If they fail → back to OPEN.&lt;/p&gt;

&lt;p&gt;This prevents retry storms after recovery.&lt;/p&gt;

&lt;p&gt;The breaker is not just protection.&lt;/p&gt;

&lt;p&gt;It’s a &lt;em&gt;stability regulator&lt;/em&gt;.&lt;/p&gt;
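&lt;p&gt;The transition discipline fits in one small class. This is an illustrative sketch, not LimitPal’s internals: parameter names like &lt;code&gt;failure_threshold&lt;/code&gt; and &lt;code&gt;recovery_timeout&lt;/code&gt; follow the post, but &lt;code&gt;probes&lt;/code&gt; and the method names are assumptions. Note it takes a clock, not &lt;code&gt;time.time()&lt;/code&gt;.&lt;/p&gt;

```python
CLOSED, OPEN, HALF_OPEN = "CLOSED", "OPEN", "HALF_OPEN"

class Breaker:
    def __init__(self, clock, failure_threshold=5,
                 recovery_timeout=30.0, probes=1):
        self.clock = clock
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.probes = probes
        self.state = CLOSED
        self.failures = 0
        self.opened_at = 0.0
        self.probe_successes = 0

    def allows(self) -> bool:
        if self.state == OPEN:
            # After the recovery timeout, allow limited probing.
            if self.clock.now() - self.opened_at >= self.recovery_timeout:
                self.state = HALF_OPEN
                self.probe_successes = 0
                return True
            return False  # fast rejection while OPEN
        return True

    def record_success(self) -> None:
        if self.state == HALF_OPEN:
            self.probe_successes += 1
            if self.probe_successes >= self.probes:
                self.state = CLOSED
                self.failures = 0
        else:
            self.failures = 0  # success resets the counter

    def record_failure(self) -> None:
        if self.state == HALF_OPEN:
            self._trip()  # a failed probe goes straight back to OPEN
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self._trip()

    def _trip(self) -> None:
        self.state = OPEN
        self.opened_at = self.clock.now()
```

&lt;p&gt;Every transition is driven by recorded results plus the clock, which is why the executor’s Step 4 matters.&lt;/p&gt;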


&lt;h2&gt;
  
  
  Why retry must be jittered
&lt;/h2&gt;

&lt;p&gt;Exponential backoff without jitter is dangerous.&lt;/p&gt;

&lt;p&gt;If 1,000 clients retry at the same time:&lt;/p&gt;

&lt;p&gt;You get a synchronized spike.&lt;/p&gt;

&lt;p&gt;You kill the service again.&lt;/p&gt;

&lt;p&gt;Jitter spreads retries across time.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;all retry at t=1s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You get:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;retry in [0.9s, 1.1s]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Small randomness → large stability gain.&lt;/p&gt;

&lt;p&gt;This is one of those details that separates toy resilience&lt;br&gt;
from production resilience.&lt;/p&gt;
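&lt;p&gt;The math is tiny. A sketch of exponential backoff with multiplicative jitter (parameter names are illustrative, not LimitPal’s &lt;code&gt;RetryPolicy&lt;/code&gt; signature):&lt;/p&gt;

```python
import random

def backoff_delay(attempt: int, base: float = 1.0,
                  jitter: float = 0.1) -> float:
    """Exponential backoff with multiplicative jitter.

    attempt 0 -> ~base, attempt 1 -> ~2*base, attempt 2 -> ~4*base, ...
    Jitter spreads each delay over [d*(1-jitter), d*(1+jitter)] so a
    fleet of clients doesn't retry in lockstep.
    """
    delay = base * (2 ** attempt)
    return delay * random.uniform(1.0 - jitter, 1.0 + jitter)
```

&lt;p&gt;With &lt;code&gt;base=1.0&lt;/code&gt; and &lt;code&gt;jitter=0.1&lt;/code&gt;, the first retry lands somewhere in [0.9s, 1.1s] rather than exactly at t=1s for everyone.&lt;/p&gt;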


&lt;h2&gt;
  
  
  Key-based isolation
&lt;/h2&gt;

&lt;p&gt;Limiters operate per key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;user:123
tenant:acme
ip:10.0.0.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each key gets its own bucket.&lt;/p&gt;

&lt;p&gt;This prevents one bad actor&lt;br&gt;
from starving everyone else.&lt;/p&gt;

&lt;p&gt;Internally this means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;dynamic bucket allocation&lt;/li&gt;
&lt;li&gt;TTL eviction&lt;/li&gt;
&lt;li&gt;bounded memory&lt;/li&gt;
&lt;li&gt;optional LRU trimming&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this,&lt;br&gt;
rate limiting becomes a memory leak.&lt;/p&gt;
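&lt;p&gt;A sketch of the bookkeeping (hypothetical class and method names; LimitPal’s actual eviction strategy may differ): buckets are created on demand and evicted once a key has been idle past its TTL, so memory stays bounded.&lt;/p&gt;

```python
class KeyedLimiter:
    """Per-key buckets with TTL eviction (illustrative sketch)."""
    def __init__(self, clock, make_bucket, ttl: float = 300.0):
        self.clock = clock
        self.make_bucket = make_bucket  # factory for a fresh bucket
        self.ttl = ttl
        self._buckets = {}              # key -> [bucket, last_used]

    def bucket_for(self, key):
        self._evict()
        now = self.clock.now()
        if key not in self._buckets:
            self._buckets[key] = [self.make_bucket(), now]  # dynamic allocation
        entry = self._buckets[key]
        entry[1] = now                  # refresh the TTL on every use
        return entry[0]

    def _evict(self):
        now = self.clock.now()
        stale = [k for k, (_, last) in self._buckets.items()
                 if now - last >= self.ttl]
        for k in stale:
            del self._buckets[k]
```

&lt;p&gt;An optional LRU cap would trim the oldest entries on top of this, but TTL alone already keeps idle keys from accumulating.&lt;/p&gt;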


&lt;h2&gt;
  
  
  Sync + async parity
&lt;/h2&gt;

&lt;p&gt;Most Python libraries choose:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sync OR async&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;LimitPal enforces parity.&lt;/p&gt;

&lt;p&gt;Same API.&lt;br&gt;
Different executor.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No hidden behavior differences.&lt;/p&gt;

&lt;p&gt;This matters when codebases mix:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;background workers&lt;/li&gt;
&lt;li&gt;HTTP servers&lt;/li&gt;
&lt;li&gt;CLI tools&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One mental model everywhere.&lt;/p&gt;




&lt;h2&gt;
  
  
  The real goal
&lt;/h2&gt;

&lt;p&gt;LimitPal isn’t about rate limiting.&lt;/p&gt;

&lt;p&gt;Or retry.&lt;/p&gt;

&lt;p&gt;Or circuit breakers.&lt;/p&gt;

&lt;p&gt;It’s about:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;making failure behavior explicit and composable&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Resilience stops being ad-hoc glue&lt;br&gt;
and becomes architecture.&lt;/p&gt;

&lt;p&gt;That’s the difference between:&lt;/p&gt;

&lt;p&gt;“I added retry”&lt;/p&gt;

&lt;p&gt;and&lt;/p&gt;

&lt;p&gt;“I designed a failure strategy.”&lt;/p&gt;




&lt;h2&gt;
  
  
  What’s next
&lt;/h2&gt;

&lt;p&gt;Planned work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;observability hooks&lt;/li&gt;
&lt;li&gt;adaptive rate limiting&lt;/li&gt;
&lt;li&gt;Redis backend&lt;/li&gt;
&lt;li&gt;bulkhead pattern&lt;/li&gt;
&lt;li&gt;framework integrations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because resilience doesn’t end at execution.&lt;br&gt;
It extends into operations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing thought
&lt;/h2&gt;

&lt;p&gt;Distributed systems fail.&lt;/p&gt;

&lt;p&gt;That’s not optional.&lt;/p&gt;

&lt;p&gt;What’s optional is whether failure behavior is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;accidental&lt;/li&gt;
&lt;li&gt;or engineered&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;LimitPal is an attempt to engineer it.&lt;/p&gt;

&lt;p&gt;Docs:&lt;br&gt;
&lt;a href="https://limitpal.readthedocs.io/" rel="noopener noreferrer"&gt;https://limitpal.readthedocs.io/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Repo:&lt;br&gt;
&lt;a href="https://github.com/Guli-vali/limitpal" rel="noopener noreferrer"&gt;https://github.com/Guli-vali/limitpal&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you like deep infrastructure tools — feedback welcome.&lt;/p&gt;

</description>
      <category>python</category>
      <category>internals</category>
      <category>performance</category>
      <category>microservices</category>
    </item>
    <item>
      <title>I Felt Like a Clown Wiring 5 Libraries Just to Build a Resilient API Client</title>
      <dc:creator>Ilya Masliev</dc:creator>
      <pubDate>Thu, 05 Feb 2026 13:11:28 +0000</pubDate>
      <link>https://forem.com/ilya_masliev/i-felt-like-a-clown-wiring-5-libraries-just-to-build-a-resilient-api-client-270k</link>
      <guid>https://forem.com/ilya_masliev/i-felt-like-a-clown-wiring-5-libraries-just-to-build-a-resilient-api-client-270k</guid>
      <description>&lt;p&gt;&lt;em&gt;So I wrote one that unifies everything.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The moment it broke me
&lt;/h2&gt;

&lt;p&gt;I just wanted a simple API client.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.example.com/users/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That lasted about 5 minutes.&lt;/p&gt;

&lt;p&gt;Because real APIs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;rate limit you&lt;/li&gt;
&lt;li&gt;timeout&lt;/li&gt;
&lt;li&gt;return 503&lt;/li&gt;
&lt;li&gt;sometimes completely die&lt;/li&gt;
&lt;li&gt;and retries can DDoS your own service&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I did what every Python dev does.&lt;/p&gt;

&lt;p&gt;I started stacking libraries.&lt;/p&gt;




&lt;h2&gt;
  
  
  The decorator tower of doom
&lt;/h2&gt;

&lt;p&gt;First: rate limiting.&lt;/p&gt;

&lt;p&gt;Then: retry.&lt;/p&gt;

&lt;p&gt;Then: circuit breaker.&lt;/p&gt;

&lt;p&gt;And suddenly my function looked like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@breaker&lt;/span&gt;
&lt;span class="nd"&gt;@retry&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="nd"&gt;@sleep_and_retry&lt;/span&gt;
&lt;span class="nd"&gt;@limits&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_user&lt;/span&gt;&lt;span class="p"&gt;(...):&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And I hated it.&lt;/p&gt;

&lt;p&gt;Not because it didn’t work — but because it didn’t scale.&lt;/p&gt;

&lt;p&gt;Problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;3+ libraries&lt;/li&gt;
&lt;li&gt;fragile decorator ordering&lt;/li&gt;
&lt;li&gt;conflicting abstractions&lt;/li&gt;
&lt;li&gt;async quirks&lt;/li&gt;
&lt;li&gt;painful testing&lt;/li&gt;
&lt;li&gt;scattered observability&lt;/li&gt;
&lt;li&gt;dependency sprawl&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And this was for &lt;strong&gt;one function&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Now imagine 10 APIs. Per-user limits. Background jobs. Webhooks.&lt;/p&gt;

&lt;p&gt;You’re no longer writing business logic.&lt;/p&gt;

&lt;p&gt;You’re babysitting resilience glue code.&lt;/p&gt;




&lt;h2&gt;
  
  
  The idea: resilience as a pipeline
&lt;/h2&gt;

&lt;p&gt;What if resilience wasn’t decorator soup?&lt;/p&gt;

&lt;p&gt;What if every call flowed through a &lt;strong&gt;single orchestrator&lt;/strong&gt;?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;limitpal&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AsyncResilientExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AsyncTokenBucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;RetryPolicy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CircuitBreaker&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;executor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AsyncResilientExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;limiter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;AsyncTokenBucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;capacity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;refill_rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;retry_policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;RetryPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;circuit_breaker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;CircuitBreaker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;failure_threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user:123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_call&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No decorators.&lt;br&gt;
No stacking libraries.&lt;br&gt;
No fragile glue.&lt;/p&gt;

&lt;p&gt;One execution pipeline.&lt;/p&gt;

&lt;p&gt;That’s what &lt;strong&gt;LimitPal&lt;/strong&gt; is.&lt;/p&gt;


&lt;h2&gt;
  
  
  What LimitPal actually gives you
&lt;/h2&gt;

&lt;p&gt;LimitPal is a toolkit for building resilient Python clients and services.&lt;/p&gt;

&lt;p&gt;It combines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;rate limiting (Token / Leaky bucket)&lt;/li&gt;
&lt;li&gt;retry with exponential backoff + jitter&lt;/li&gt;
&lt;li&gt;circuit breaker&lt;/li&gt;
&lt;li&gt;composable limiters&lt;/li&gt;
&lt;li&gt;a resilience executor that orchestrates everything&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;full async + sync support&lt;/li&gt;
&lt;li&gt;zero dependencies&lt;/li&gt;
&lt;li&gt;thread-safe by default&lt;/li&gt;
&lt;li&gt;deterministic time control for tests&lt;/li&gt;
&lt;li&gt;key-based isolation (per-user / per-IP / per-tenant)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal isn’t more features.&lt;/p&gt;

&lt;p&gt;The goal is &lt;strong&gt;fewer moving parts&lt;/strong&gt;.&lt;/p&gt;
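&lt;p&gt;To make the token-bucket numbers in the snippets concrete: &lt;code&gt;capacity=10, refill_rate=100/60&lt;/code&gt; means bursts of up to 10 calls, with a sustained budget of roughly 100 calls per minute. A minimal sketch of the algorithm (illustrative only, not LimitPal’s actual class; note it takes a clock):&lt;/p&gt;

```python
class TokenBucket:
    """Token bucket sketch: continuous refill, hard burst cap."""
    def __init__(self, clock, capacity: float, refill_rate: float):
        self.clock = clock
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens per second
        self.tokens = capacity          # start full: burst allowed
        self.last = clock.now()

    def try_acquire(self, n: float = 1.0) -> bool:
        now = self.clock.now()
        elapsed = now - self.last
        self.last = now
        # Refill continuously, but never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_rate)
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

&lt;p&gt;After a full burst of 10, the 11th call is rejected until refill catches up.&lt;/p&gt;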


&lt;h2&gt;
  
  
  The resilience pipeline (this is the key idea)
&lt;/h2&gt;

&lt;p&gt;Every call goes through:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Circuit breaker check
→ Rate limiter
→ Execute + retry loop
→ Record result
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ordering matters.&lt;/p&gt;

&lt;p&gt;You’re not just “adding retry”.&lt;/p&gt;

&lt;p&gt;You’re designing failure behavior as a system.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;breaker stops cascading failures&lt;/li&gt;
&lt;li&gt;limiter protects infrastructure&lt;/li&gt;
&lt;li&gt;retry handles temporary issues&lt;/li&gt;
&lt;li&gt;executor keeps it coherent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One mental model instead of five.&lt;/p&gt;




&lt;h2&gt;
  
  
  The testing problem nobody talks about
&lt;/h2&gt;

&lt;p&gt;Time-based logic is brutal to test.&lt;/p&gt;

&lt;p&gt;Traditional approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Slow. Flaky. Non-deterministic.&lt;/p&gt;

&lt;p&gt;LimitPal uses a pluggable clock.&lt;/p&gt;

&lt;p&gt;So tests become:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;clock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;advance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instant. Deterministic.&lt;/p&gt;

&lt;p&gt;You can simulate minutes of retries in milliseconds.&lt;/p&gt;

&lt;p&gt;For teams that care about reliability, this is a game changer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real example
&lt;/h2&gt;

&lt;p&gt;A resilient HTTP client in ~10 lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;executor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AsyncResilientExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;limiter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;AsyncTokenBucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;capacity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;refill_rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;retry_policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;RetryPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base_delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;circuit_breaker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;CircuitBreaker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;failure_threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You automatically get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;burst control&lt;/li&gt;
&lt;li&gt;exponential retry&lt;/li&gt;
&lt;li&gt;cascading failure protection&lt;/li&gt;
&lt;li&gt;clean async semantics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No decorator tower.&lt;/p&gt;




&lt;h2&gt;
  
  
  When should you use this?
&lt;/h2&gt;

&lt;p&gt;Use LimitPal if you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;build API clients&lt;/li&gt;
&lt;li&gt;call flaky third-party services&lt;/li&gt;
&lt;li&gt;run background jobs&lt;/li&gt;
&lt;li&gt;need per-user limits&lt;/li&gt;
&lt;li&gt;care about deterministic tests&lt;/li&gt;
&lt;li&gt;want clean async support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you only need retry — smaller libs are fine.&lt;/p&gt;

&lt;p&gt;If you need &lt;strong&gt;composition&lt;/strong&gt;, that’s the niche.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 2: internals
&lt;/h2&gt;

&lt;p&gt;This post is about the idea.&lt;/p&gt;

&lt;p&gt;In Part 2 I’ll go deep into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how the executor pipeline works&lt;/li&gt;
&lt;li&gt;circuit breaker state machine&lt;/li&gt;
&lt;li&gt;clock abstraction design&lt;/li&gt;
&lt;li&gt;composite limiter architecture&lt;/li&gt;
&lt;li&gt;failure modeling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because resilience isn’t magic.&lt;/p&gt;

&lt;p&gt;It’s architecture.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install limitpal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Docs:&lt;br&gt;
&lt;a href="https://limitpal.readthedocs.io/" rel="noopener noreferrer"&gt;https://limitpal.readthedocs.io/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Repo:&lt;br&gt;
&lt;a href="https://github.com/Guli-vali/limitpal" rel="noopener noreferrer"&gt;https://github.com/Guli-vali/limitpal&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If it saves you pain — stars are welcome ⭐&lt;/p&gt;

</description>
      <category>python</category>
      <category>performance</category>
      <category>architecture</category>
      <category>microservices</category>
    </item>
  </channel>
</rss>
