<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Aivars Kalvāns</title>
    <description>The latest articles on Forem by Aivars Kalvāns (@aivarsk).</description>
    <link>https://forem.com/aivarsk</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F367555%2Fb61510de-ac39-41f9-8c25-0d8de79e920e.jpg</url>
      <title>Forem: Aivars Kalvāns</title>
      <link>https://forem.com/aivarsk</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/aivarsk"/>
    <language>en</language>
    <item>
      <title>TigerBeetle as a file storage</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Sun, 07 Dec 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/tigerbeetle-as-a-file-storage-540p</link>
      <guid>https://forem.com/aivarsk/tigerbeetle-as-a-file-storage-540p</guid>
      <description>&lt;p&gt;&lt;em&gt;Could not keep it under the rug until April Fool’s Day&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;TigerBeetle is a reliable, fast, and highly available database for financial accounting. It tracks financial transactions &lt;strong&gt;or anything else that can be expressed as double-entry bookkeeping&lt;/strong&gt;, providing three orders of magnitude more performance and &lt;strong&gt;guaranteeing durability even in the face of network, machine, and storage faults.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgrzk9remn36sqkfhvelx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgrzk9remn36sqkfhvelx.png" alt="Challenge accepted" width="225" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Continuing my &lt;a href="https://aivarsk.com/2025/12/06/tigerbeetle-without-olgp-database1/" rel="noopener noreferrer"&gt;if all you have is a hammer, everything looks like a nail&lt;/a&gt; journey, I wanted to store arbitrary binary blobs in TigerBeetle to protect them from storage faults. If I can do that, I can store anything.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;id&lt;/code&gt; field of my Accounts will contain the filename (16-byte limit). I will store the total file size in the &lt;code&gt;user_data_64&lt;/code&gt; field and the filename length in the &lt;code&gt;user_data_32&lt;/code&gt; field (to simplify decoding). And my Accounts will have the nice property that &lt;code&gt;credits_posted&lt;/code&gt; will contain the actual number of bytes written, so I can detect failed uploads and resume them (a future TODO).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
def create_a_file(filename, size):
    # Check the encoded length: the id field holds at most 16 bytes
    if len(filename.encode()) &amp;gt; 16:
        raise ValueError("Invalid filename, more than 16 bytes")
    account = tb.Account(
        id=int.from_bytes(filename.encode(), "big"),
        user_data_64=size,
        user_data_32=len(filename),
        ledger=FILE,
        code=FILE,
    )
    errors = client.create_accounts([account])
    if errors:
        raise ValueError(errors[0])
    return account

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
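&lt;p&gt;Reading the file back later means recovering the filename from the &lt;code&gt;id&lt;/code&gt;. Here is a minimal plain-Python sketch of the packing and its inverse, no TigerBeetle client needed; the explicit &lt;code&gt;"big"&lt;/code&gt; byte order is an assumption that matches the &lt;code&gt;int.from_bytes&lt;/code&gt; default in Python 3.11+ and keeps the code working on older versions:&lt;/p&gt;

```python
def encode_filename(filename: str) -> tuple[int, int]:
    """Pack a filename into a 128-bit Account id plus its byte length."""
    data = filename.encode()
    if len(data) > 16:
        raise ValueError("Invalid filename, more than 16 bytes")
    return int.from_bytes(data, "big"), len(data)


def decode_filename(account_id: int, name_len: int) -> str:
    """Recover the filename from the id using the length in user_data_32."""
    return account_id.to_bytes(name_len, "big").decode()
```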



&lt;p&gt;What makes me unhappy is that I have not found a good use for &lt;code&gt;user_data_128&lt;/code&gt; on the Account record. Such a waste of resources!&lt;/p&gt;

&lt;p&gt;I will store the actual bytes in the Transfer &lt;code&gt;user_data_128&lt;/code&gt;, &lt;code&gt;user_data_64&lt;/code&gt;, and &lt;code&gt;user_data_32&lt;/code&gt; fields. That gives a total of 28 bytes per Transfer, and the Transfer &lt;code&gt;amount&lt;/code&gt; will contain the number of bytes used in the Transfer: 28 for all Transfers except the last one, which holds the remaining bytes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
            transfers.append(
                tb.Transfer(
                    id=tb.id(),
                    debit_account_id=system_id,
                    credit_account_id=file_id,
                    amount=len(block),
                    user_data_128=int.from_bytes(block[:16], "big"),
                    user_data_64=int.from_bytes(block[16:24], "big"),
                    user_data_32=int.from_bytes(block[24:], "big"),
                    ledger=FILE,
                    code=FILE,
                )
            )

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
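&lt;p&gt;The subtle part of this 28-byte packing is the short final block: decoding has to use the segment lengths derived from &lt;code&gt;amount&lt;/code&gt;, or a partial block comes back left-padded with zeros. A plain-Python sketch of the scheme and its inverse (no TigerBeetle client; &lt;code&gt;"big"&lt;/code&gt; byte order assumed, as above):&lt;/p&gt;

```python
def pack_block(block: bytes) -> tuple[int, int, int, int]:
    """Mirror the Transfer fields: user_data_128, user_data_64, user_data_32, amount."""
    if len(block) > 28:
        raise ValueError("a Transfer carries at most 28 bytes")
    return (
        int.from_bytes(block[:16], "big"),
        int.from_bytes(block[16:24], "big"),
        int.from_bytes(block[24:], "big"),
        len(block),
    )


def unpack_block(u128: int, u64: int, u32: int, amount: int) -> bytes:
    """Rebuild the original bytes; segment lengths come from amount so a
    short final block is not left-padded with zeros."""
    n1 = min(amount, 16)
    n2 = min(max(amount - 16, 0), 8)
    n3 = min(max(amount - 24, 0), 4)
    return u128.to_bytes(n1, "big") + u64.to_bytes(n2, "big") + u32.to_bytes(n3, "big")
```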



&lt;p&gt;Because TigerBeetle uses double-entry bookkeeping, I will transfer all bytes from a system file “.” (debit side) to the desired file (credit side). This is extremely useful for audit purposes: &lt;code&gt;debits_posted&lt;/code&gt; on the system file Account must equal the sum of &lt;code&gt;credits_posted&lt;/code&gt; across all file Account records.&lt;/p&gt;

&lt;p&gt;As for getting data out of TigerBeetle, I can retrieve all credit Transfers for a specific Account. They are always correctly ordered by the &lt;code&gt;timestamp&lt;/code&gt; field, as &lt;a href="https://docs.tigerbeetle.com/coding/time/#timestamps-are-totally-ordered" rel="noopener noreferrer"&gt;guaranteed by TigerBeetle&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
        timestamp_min = 0

        while True:
            transfers = client.get_account_transfers(
                tb.AccountFilter(
                    account_id=file_id, flags=tb.AccountFilterFlags.CREDITS, limit=BULK, timestamp_min=timestamp_min
                )
            )
            for transfer in transfers:
                timestamp_min = transfer.timestamp
                ...

            if len(transfers) &amp;lt; BULK:
                break
            timestamp_min += 1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With all that put together, I ran tests on some of the most valuable files I never wanted to lose:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
(venv)  du -b ~/Downloads/homework.mp4
104718755 /home/aivarsk/Downloads/homework.mp4
(venv)  time ./tbcp ~/Downloads/homework.mp4 tb:backup.mp4

real 2m3.697s
user 1m4.408s
sys 0m1.568s

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So, you can store your files durably at speeds of roughly 850 kB/s (104,718,755 bytes in 123.7 seconds). Now, let’s retrieve the file and store it on the disk:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
(venv)  time ./tbcp tb:backup.mp4 copy.mp4

real 0m47.588s
user 0m27.027s
sys 0m0.553s

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Downloading is considerably faster at roughly 2,200 kB/s! And of course, I verified that not a single bit was lost during the round trip:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
(venv)  sha256sum ~/Downloads/homework.mp4
4ee75486c7c65a5c158f7f6b2ca6458195aa25b155b0688173b4b52583ce4cac /home/aivarsk/Downloads/homework.mp4
(venv)  sha256sum copy.mp4
4ee75486c7c65a5c158f7f6b2ca6458195aa25b155b0688173b4b52583ce4cac copy.mp4
(venv) 

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to store your valuable files while guaranteeing durability even in the face of network, machine, and storage faults, &lt;a href="https://gist.github.com/aivarsk/2b26854c956e36fdfd73349586f2b168" rel="noopener noreferrer"&gt;here is the full source code&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>tigerbeetle</category>
      <category>gl</category>
      <category>accounting</category>
    </item>
    <item>
      <title>Running TigerBeetle without a control plane database. Part one.</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Sat, 06 Dec 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/running-tigerbeetle-without-a-control-plane-database-part-one-16h4</link>
      <guid>https://forem.com/aivarsk/running-tigerbeetle-without-a-control-plane-database-part-one-16h4</guid>
      <description>&lt;p&gt;TigerBeetle is a database built for financial accounting, and the only record types available are &lt;a href="https://docs.tigerbeetle.com/reference/account/" rel="noopener noreferrer"&gt;Accounts&lt;/a&gt; and &lt;a href="https://docs.tigerbeetle.com/reference/transfer/" rel="noopener noreferrer"&gt;Transfers&lt;/a&gt;. That might be enough for the simplest accounting setup, but not for any realistic financial product.&lt;/p&gt;

&lt;p&gt;The way &lt;a href="https://docs.tigerbeetle.com/coding/system-architecture/" rel="noopener noreferrer"&gt;TigerBeetle solves that&lt;/a&gt; is by requiring an Online General Purpose (OLGP) database in the control plane that stores metadata and the mapping between TigerBeetle’s identifiers and the identifiers used by the rest of your systems. This can be done, and the documentation does a really good job of guiding you, but… what about the dual-write problem?&lt;/p&gt;

&lt;p&gt;Here’s an idea: what about running TigerBeetle without the control plane database? I am not saying you should do it, but I wanted to find out whether it is possible and what the best ways to do it are. This is a work in progress.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenge
&lt;/h3&gt;

&lt;p&gt;It depends on the banking and payment card system, but often your account or card is not just a single physical account record. It is a “product” and a “product agreement” that link together multiple accounts, conditions, metadata, and other types of records, just to tell you what the current balance is. Years ago, I worked with something we called “analytical accounting.” We had separate accounts for purchases, cashout, refunds, credit/debit transfers, interest, and different kinds of fees. It was similar to what you can achieve with analytical systems, except it ran inside the accounting system with the same consistency guarantees.&lt;/p&gt;

&lt;p&gt;Accounts and Transfers in TigerBeetle are immutable. You can change credit and debit amounts by posting transactions. You can also update Account &lt;code&gt;flags&lt;/code&gt; and set or unset &lt;code&gt;AccountFlags.CLOSED&lt;/code&gt; by &lt;a href="https://docs.tigerbeetle.com/coding/recipes/close-account/" rel="noopener noreferrer"&gt;posting a pending transfer or voiding it&lt;/a&gt;. In practice, you often have more statuses for accounts to accept credits while rejecting debit operations, etc.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tools
&lt;/h3&gt;

&lt;p&gt;TigerBeetle gives us three fields where we can store arbitrary information for both Accounts and Transfers: &lt;code&gt;user_data_128&lt;/code&gt;, &lt;code&gt;user_data_64&lt;/code&gt;, and &lt;code&gt;user_data_32&lt;/code&gt;, holding 16, 8, and 4 bytes of information. There are other fields, like &lt;code&gt;ledger&lt;/code&gt; and &lt;code&gt;code&lt;/code&gt;, that could be repurposed, but generally &lt;code&gt;ledger&lt;/code&gt; should represent the ledger (and the currency) and &lt;code&gt;code&lt;/code&gt; should distinguish different account and transfer types. And there is one more field: the &lt;code&gt;id&lt;/code&gt; field itself (128 bits or 16 bytes), which gives us uniqueness checks out of the box.&lt;/p&gt;

&lt;p&gt;When your systems use integer IDs, it is a straightforward task. Most likely, you will have UUIDs, and it is easy to convert them to integers and back using Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; user_data_128 = uuid.uuid4().int
&amp;gt;&amp;gt;&amp;gt; uuid.UUID(int=user_data_128)
UUID('a1833b6a-a185-47aa-90ac-2f78979df3be')

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you have textual codes and identifiers, you can even store those with some encoding scheme:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; user_data_32 = int.from_bytes("EUR".encode())
&amp;gt;&amp;gt;&amp;gt; user_data_32.to_bytes(3).decode()
'EUR'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Further, TigerBeetle provides &lt;a href="https://docs.tigerbeetle.com/reference/requests/lookup_accounts/" rel="noopener noreferrer"&gt;&lt;code&gt;lookup_accounts&lt;/code&gt;&lt;/a&gt; and &lt;a href="https://docs.tigerbeetle.com/reference/requests/lookup_transfers/" rel="noopener noreferrer"&gt;&lt;code&gt;lookup_transfers&lt;/code&gt;&lt;/a&gt; to retrieve Accounts and Transfers by the &lt;code&gt;id&lt;/code&gt; field. And there are &lt;a href="https://docs.tigerbeetle.com/reference/requests/query_accounts/" rel="noopener noreferrer"&gt;&lt;code&gt;query_accounts&lt;/code&gt;&lt;/a&gt; and &lt;a href="https://docs.tigerbeetle.com/reference/requests/query_transfers/" rel="noopener noreferrer"&gt;&lt;code&gt;query_transfers&lt;/code&gt;&lt;/a&gt; to query Accounts and Transfers by a combination of &lt;code&gt;user_data_128&lt;/code&gt;, &lt;code&gt;user_data_64&lt;/code&gt;, &lt;code&gt;user_data_32&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution
&lt;/h3&gt;

&lt;p&gt;Dealing with 1:1 relations is easy: just put our UUID in the &lt;code&gt;id&lt;/code&gt; field.&lt;/p&gt;

&lt;p&gt;1:n relations are harder. First, let’s use &lt;code&gt;user_data_128&lt;/code&gt; to store our UUID and link together multiple accounts by having the same value in that field. Second, there might be multiple cards and accounts for the same client UUID. For that, you can store the counter per client in the &lt;code&gt;user_data_32&lt;/code&gt; field.&lt;/p&gt;

&lt;p&gt;You can find all cards and accounts of a client by running a query on the &lt;code&gt;user_data_128&lt;/code&gt; field. After choosing the desired card/account, you can retrieve the whole TigerBeetle account set for the card/account product agreement by running a query on &lt;code&gt;user_data_128&lt;/code&gt; and &lt;code&gt;user_data_32&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
import uuid
from enum import IntEnum

import tigerbeetle as tb

class Code(IntEnum):
    MAIN_ACCOUNT = 1
    LIMIT_ACCOUNT = 2
    EUR = 978
    STATUS = 9000

def register_new_account_agreement(client_id: uuid.UUID):
    with tb.ClientSync(cluster_id=0, replica_addresses=os.getenv("TB_ADDRESS", "3000")) as client:
        existing = client.query_accounts(
            tb.QueryFilter(user_data_128=client_id.int, ledger=Code.EUR, code=Code.MAIN_ACCOUNT, limit=100)
        )

        seq = (max(account.user_data_32 for account in existing) + 1) if existing else 1

        accounts = [
            tb.Account(
                id=tb.id(),
                user_data_128=client_id.int,
                user_data_32=seq,
                ledger=Code.EUR,
                code=Code.MAIN_ACCOUNT,
                flags=tb.AccountFlags.LINKED,
            ),
            tb.Account(
                id=tb.id(),
                user_data_128=client_id.int,
                user_data_32=seq,
                ledger=Code.EUR,
                code=Code.LIMIT_ACCOUNT,
            ),
        ]
        account_errors = client.create_accounts(accounts)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is a race condition between finding the highest value of the sequence number and assigning it to an account. But that can be solved by running account creation from a single thread or by serialisation through locks in Redis or somewhere else.&lt;/p&gt;
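&lt;p&gt;As a minimal illustration of the single-writer idea, here is a process-local lock around sequence allocation. It is only a stand-in for a distributed lock in Redis, and the class and method names are made up for this sketch:&lt;/p&gt;

```python
import threading


class SequenceAllocator:
    """Serializes per-client sequence numbers so two concurrent
    registrations cannot pick the same seq (toy, single-process version)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next = {}

    def allocate(self, client_id) -> int:
        # Holding the lock closes the gap between reading the highest
        # sequence number and assigning the next one.
        with self._lock:
            seq = self._next.get(client_id, 0) + 1
            self._next[client_id] = seq
            return seq
```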

&lt;p&gt;This solves the creation of agreements/account sets. But what about updates?&lt;/p&gt;

&lt;h3&gt;
  
  
  If all you have is a hammer, everything looks like a nail
&lt;/h3&gt;

&lt;p&gt;You can’t modify any of the Account fields, but you can post a Transfer with an &lt;code&gt;amount&lt;/code&gt; of &lt;code&gt;0&lt;/code&gt;, which has no financial impact, and store information in any of the Transfer’s user data fields. And to read back the current value, you can ask for the last Transfer of a specific type (&lt;code&gt;code&lt;/code&gt; field) by using &lt;code&gt;limit=1&lt;/code&gt; and the &lt;code&gt;tb.AccountFilterFlags.REVERSED&lt;/code&gt; flag. Like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        main, limit = accounts[0], accounts[1]
        transfer_errors = client.create_transfers(
            [
                tb.Transfer(
                    id=tb.id(),
                    debit_account_id=limit.id,
                    credit_account_id=main.id,
                    amount=0,
                    ledger=main.ledger,
                    user_data_128=0b1001001,
                    code=Code.STATUS,
                )
            ]
        )
        ...

        transfers = client.get_account_transfers(
            tb.AccountFilter(
                account_id=main.id,
                limit=1,
                code=Code.STATUS,
                flags=tb.AccountFilterFlags.CREDITS | tb.AccountFilterFlags.REVERSED,
            ),
        )
        print(transfers[0].user_data_128)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
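&lt;p&gt;The &lt;code&gt;0b1001001&lt;/code&gt; payload above is just an integer, so a bit-flag enum decodes it nicely. A sketch with made-up status bits — the names are assumptions for illustration, not part of TigerBeetle:&lt;/p&gt;

```python
from enum import IntFlag


class AccountStatus(IntFlag):
    # Hypothetical status bits packed into user_data_128 of a 0-amount Transfer
    ACTIVE = 0b0000001
    CREDITS_ALLOWED = 0b0001000
    DEBITS_BLOCKED = 0b1000000


# Decode the value posted in the example above
status = AccountStatus(0b1001001)
```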



&lt;p&gt;When one Transfer has too few fields, you can always post two, three, or more and retrieve the same number of Transfers to reconstruct the information. Not that you should do it, but if two extra fields are the only reason to introduce an OLGP database, you might choose to abuse TigerBeetle to achieve the same.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;To be continued about building a transaction out of multiple Transfers.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>tigerbeetle</category>
      <category>accounting</category>
      <category>gl</category>
    </item>
    <item>
      <title>The lost art of semaphores</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Thu, 09 Oct 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/the-lost-art-of-semaphores-gl8</link>
      <guid>https://forem.com/aivarsk/the-lost-art-of-semaphores-gl8</guid>
      <description>&lt;p&gt;I am a huge fan of &lt;a href="https://man7.org/linux/man-pages/man7/svipc.7.html" rel="noopener noreferrer"&gt;System V Inter Process Communication primitives&lt;/a&gt;. There is some rawness and UNIX spirit to them. There is a newer and kinda “improved” version of those primitives named POSIX IPC. While there are a few things in POSIX IPC that can’t be done with System V IPC, most of the time it’s the other way around. Primarily due to the rawness of System V IPC. Let’s check the &lt;a href="https://man7.org/linux/man-pages/man7/sem_overview.7.html" rel="noopener noreferrer"&gt;POSIX semaphores&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;sem_post&lt;/code&gt; can be used to release a semaphore (increment by 1)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sem_wait&lt;/code&gt; can be used to acquire a semaphore (and decrement by 1)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;System V IPC has a single call for that: &lt;a href="https://man7.org/linux/man-pages/man2/semop.2.html" rel="noopener noreferrer"&gt;&lt;code&gt;semop&lt;/code&gt;&lt;/a&gt;. It can increment or decrement a semaphore by an arbitrary value. It also has the operation flags for each operation. And there, within flags, you can find one of the pearls of System V IPC - the &lt;code&gt;SEM_UNDO&lt;/code&gt; flag.&lt;/p&gt;

&lt;p&gt;What the &lt;code&gt;SEM_UNDO&lt;/code&gt; flag does is add the operation to an &lt;a href="https://github.com/torvalds/linux/blob/50c19e20ed2ef359cf155a39c8462b0a6351b9fa/ipc/sem.c#L2415" rel="noopener noreferrer"&gt;“undo list” within the kernel&lt;/a&gt;. Whenever the process terminates, whether from natural causes or brutally killed by &lt;code&gt;SIGTERM&lt;/code&gt;, &lt;code&gt;SIGKILL&lt;/code&gt;, the out-of-memory killer, or anything else, the kernel will revert the semaphore operation. Think about it: if your process acquires a semaphore and gets killed while holding it, it prevents other processes from ever acquiring the semaphore. With &lt;code&gt;SEM_UNDO&lt;/code&gt;, you choose what happens: if you used the semaphore as a counting semaphore, you can ask the kernel to release it automatically; if you acquired the semaphore to modify some shared resource, you can leave the semaphore stuck. It’s all up to you.&lt;/p&gt;

&lt;p&gt;Which brings me back to a previous topic of &lt;a href="https://aivarsk.com/2025/08/26/gunicorn-busy-workers/" rel="noopener noreferrer"&gt;tracking Gunicorn’s busy worker count&lt;/a&gt;. I used a semaphore there as a “reverse counting semaphore”: I released the semaphore (increment by 1) every time a process started and acquired the semaphore (decrement by 1) every time a process stopped. But Python’s &lt;code&gt;multiprocessing.Semaphore&lt;/code&gt; is a POSIX semaphore. When a worker gets killed by the OOM killer or dies, the semaphore is not decremented, and the worker count is incorrect.&lt;/p&gt;

&lt;p&gt;So I decided to build my own Python wrappers around &lt;a href="https://github.com/aivarsk/sysvipc-python" rel="noopener noreferrer"&gt;System V IPC&lt;/a&gt; to fix this issue and also make my other System V IPC projects more enjoyable. It’s more fun to use Python for quick tests than C++ code. With the library, here’s how you count Gunicorn’s busy workers.&lt;/p&gt;

&lt;p&gt;First, we have to create a new semaphore. It’s just a number that can be shared with others. The downside: you have to clean up manually by scheduling the removal of the semaphore when the main process terminates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import atexit
import sysvipc

sem = sysvipc.semget(sysvipc.IPC_PRIVATE, 1, 0o600)
atexit.register(lambda: sysvipc.semctl(sem, 0, sysvipc.IPC_RMID))

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can read the current value of the semaphore at any time with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curval = sysvipc.semctl(sem, 0, sysvipc.GETVAL)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then each process can increment the semaphore without confusing the reader with strange names like “unlock” or “post”. I also specify the &lt;code&gt;SEM_UNDO&lt;/code&gt; flag, and the kernel will apply &lt;code&gt;-1&lt;/code&gt; to the semaphore when the process terminates for any reason:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sysvipc.semop(sem, [(0, 1, sysvipc.SEM_UNDO)])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the process is done with the work, I decrement the semaphore. Again - no confusing names like “lock” or “wait”. The &lt;code&gt;SEM_UNDO&lt;/code&gt; will add &lt;code&gt;+1&lt;/code&gt; to the kernel’s semaphore adjustments and make the total adjustment &lt;code&gt;0&lt;/code&gt;. Past this point, when a process terminates, nothing will be subtracted from the semaphore value, and it will correctly represent the number of active workers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sysvipc.semop(sem, [(0, -1, sysvipc.SEM_UNDO)])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
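&lt;p&gt;To make the adjustment arithmetic concrete, here is a toy model of the kernel’s undo bookkeeping, in plain Python rather than real System V IPC: every &lt;code&gt;SEM_UNDO&lt;/code&gt; operation records the opposite delta per process, and the accumulated total is applied when the process dies.&lt;/p&gt;

```python
class UndoSemaphore:
    """Toy model of a System V semaphore with SEM_UNDO bookkeeping."""

    def __init__(self):
        self.value = 0
        self.semadj = {}  # per-process adjustment kept by the "kernel"

    def semop(self, pid, delta, undo=False):
        self.value += delta
        if undo:
            # SEM_UNDO: remember the opposite of the operation
            self.semadj[pid] = self.semadj.get(pid, 0) - delta

    def terminate(self, pid):
        # On process exit the kernel applies the accumulated adjustment
        self.value += self.semadj.pop(pid, 0)
```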



&lt;p&gt;And this is just the beginning: I need to write more &lt;a href="http://pybind11.com/" rel="noopener noreferrer"&gt;pybind11&lt;/a&gt; wrappers for System V IPC to unlock more goodies in Python.&lt;/p&gt;

</description>
      <category>python</category>
      <category>unix</category>
      <category>semaphore</category>
      <category>gunicorn</category>
    </item>
    <item>
      <title>Talking to payment cards over NFC</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Sun, 28 Sep 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/talking-to-payment-cards-over-nfc-56i7</link>
      <guid>https://forem.com/aivarsk/talking-to-payment-cards-over-nfc-56i7</guid>
      <description>&lt;p&gt;I had a great experience speaking about contactless payment cards at &lt;a href="https://bsideskrakow.pl/" rel="noopener noreferrer"&gt;BSides Krakow&lt;/a&gt;. For those who want to get their hands dirty:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://drive.google.com/file/d/1sgm-PzaBobW9gHJ9kOBxmA0oexW6LaxW/view?usp=sharing" rel="noopener noreferrer"&gt;Slides are here&lt;/a&gt; and &lt;a href="https://gist.github.com/aivarsk/4eb5d1756b36989cde2c38ac4b95c050" rel="noopener noreferrer"&gt;here are the code snippets&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>cards</category>
      <category>nfc</category>
      <category>emv</category>
      <category>hce</category>
    </item>
    <item>
      <title>Ring buffer in the database</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Tue, 23 Sep 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/ring-buffer-in-the-database-4n81</link>
      <guid>https://forem.com/aivarsk/ring-buffer-in-the-database-4n81</guid>
      <description>&lt;p&gt;We had a requirement to display the last N transactions on the ATM screen (”mini-statement”). The simplest solution is to keep the list of transactions, order them by date, and take the newest N transactions. But it gets tricky once you realize there are active customers making several transactions per day and inactive ones who use the card occasionally and might make a transaction every couple of months. Which means you have to preserve transaction records for a long period, and queries run more slowly. For a larger customer base, even half a year of data might be challenging to query in real time.&lt;/p&gt;

&lt;p&gt;Now we start thinking of keeping only the last N transactions per payment card and doing an &lt;code&gt;INSERT&lt;/code&gt; followed by a clever &lt;code&gt;DELETE&lt;/code&gt; that discards all extra transactions. In a regular programming language, a better solution would be to have a &lt;a href="https://en.wikipedia.org/wiki/Circular_buffer" rel="noopener noreferrer"&gt;ring buffer&lt;/a&gt; with a fixed capacity and a “write pointer” that points to where the newest record should be stored, overwriting the oldest one. To implement something similar in the database, we would need locking to prevent concurrent updates between retrieving the “write pointer”, storing the record, and advancing the “write pointer”.&lt;/p&gt;

&lt;p&gt;Years ago, I came up with a solution that I still have mixed feelings about. It is nice because it can be done with a single SQL statement, requires no explicit locking, and it works (in production as well). On the other hand, it is a bit fugly, my internal DBA demon complains about the size of the WAL records, and it’s weird. Old colleagues called that “shifter” because it works like the bit-shifting operations &lt;code&gt;&amp;lt;&amp;lt;&lt;/code&gt; and &lt;code&gt;&amp;gt;&amp;gt;&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;First, you create a model with as many message/transaction fields as you want to keep:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class History(models.Model):
    ...
    message1 = models.TextField(blank=True, null=True)
    message2 = models.TextField(blank=True, null=True)
    message3 = models.TextField(blank=True, null=True)
    message4 = models.TextField(blank=True, null=True)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every time you want to store a new message, you assign it to the field &lt;code&gt;message1&lt;/code&gt;. Whatever value was in &lt;code&gt;message1&lt;/code&gt; you store in &lt;code&gt;message2&lt;/code&gt;, whatever value was in &lt;code&gt;message2&lt;/code&gt; you store in &lt;code&gt;message3&lt;/code&gt;, etc. The value of the last field, &lt;code&gt;message4&lt;/code&gt; in this case, is forgotten.&lt;/p&gt;
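&lt;p&gt;The shift works as a single &lt;code&gt;UPDATE&lt;/code&gt; because, in standard SQL, every right-hand side reads the row’s pre-update values. A self-contained sketch with SQLite, with table and column names mirroring the model above:&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (id INTEGER PRIMARY KEY, "
    "message1 TEXT, message2 TEXT, message3 TEXT, message4 TEXT)"
)
conn.execute("INSERT INTO history (id) VALUES (1)")


def push(conn, msg):
    # All assignments see the old row values, so this shifts atomically
    # with no explicit locking.
    conn.execute(
        "UPDATE history SET message4 = message3, message3 = message2, "
        "message2 = message1, message1 = ? WHERE id = 1",
        (msg,),
    )


for m in ["first", "second", "third", "fourth", "fifth"]:
    push(conn, m)

row = conn.execute(
    "SELECT message1, message2, message3, message4 FROM history WHERE id = 1"
).fetchone()
# "first" has been shifted out of the four-slot buffer
```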

&lt;p&gt;And here’s how it works. We start with an empty ring buffer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; History.objects.get(pk=1). __dict__
{'_state': &amp;lt;django.db.models.base.ModelState object at 0x721f90e8bf50&amp;gt;, 'id': 1, 'message1': None, 'message2': None, 'message3': None, 'message4': None}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We add the first message and verify it is stored correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; History.objects.filter(pk=1).update(message4=F("message3"), message3=F("message2"), message2=F("message1"), message1="first message")
&amp;gt;&amp;gt;&amp;gt; History.objects.get(pk=1).__dict__
{'_state': &amp;lt;django.db.models.base.ModelState object at 0x721f90e8bfe0&amp;gt;, 'id': 1, 'message1': 'first message', 'message2': None, 'message3': None, 'message4': None}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We add the second message and verify it is stored correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; History.objects.filter(pk=1).update(message4=F("message3"), message3=F("message2"), message2=F("message1"), message1="second message")
&amp;gt;&amp;gt;&amp;gt; History.objects.get(pk=1).__dict__
{'_state': &amp;lt;django.db.models.base.ModelState object at 0x721f90e8bfb0&amp;gt;, 'id': 1, 'message1': 'second message', 'message2': 'first message', 'message3': None, 'message4': None}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We add the third message and verify it is stored correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; History.objects.filter(pk=1).update(message4=F("message3"), message3=F("message2"), message2=F("message1"), message1="third message")
&amp;gt;&amp;gt;&amp;gt; History.objects.get(pk=1).__dict__
{'_state': &amp;lt;django.db.models.base.ModelState object at 0x721f90ea8110&amp;gt;, 'id': 1, 'message1': 'third message', 'message2': 'second message', 'message3': 'first message', 'message4': None}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We add the fourth message and verify it is stored correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; History.objects.filter(pk=1).update(message4=F("message3"), message3=F("message2"), message2=F("message1"), message1="fourth message")
&amp;gt;&amp;gt;&amp;gt; History.objects.get(pk=1).__dict__
{'_state': &amp;lt;django.db.models.base.ModelState object at 0x721f90ea81a0&amp;gt;, 'id': 1, 'message1': 'fourth message', 'message2': 'third message', 'message3': 'second message', 'message4': 'first message'}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We add the fifth message and verify it is stored correctly and the first message has been discarded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; History.objects.filter(pk=1).update(message4=F("message3"), message3=F("message2"), message2=F("message1"), message1="fifth message")
&amp;gt;&amp;gt;&amp;gt; History.objects.get(pk=1).__dict__
{'_state': &amp;lt;django.db.models.base.ModelState object at 0x721f90ea82f0&amp;gt;, 'id': 1, 'message1': 'fifth message', 'message2': 'fourth message', 'message3': 'third message', 'message4': 'second message'}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
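&lt;p&gt;The single-statement rotation works because, on most databases (MySQL is a notable exception that evaluates &lt;code&gt;SET&lt;/code&gt; clauses left to right), every &lt;code&gt;SET&lt;/code&gt; expression is evaluated against the pre-update row, so all the &lt;code&gt;F()&lt;/code&gt; references read the old values. A plain-Python sketch of those semantics (the &lt;code&gt;shift_history&lt;/code&gt; helper is illustrative, not part of the model above):&lt;/p&gt;

```python
def shift_history(row, new_message):
    """Emulate the single UPDATE: every right-hand side reads the
    pre-update row, so the columns rotate instead of clobbering."""
    old = dict(row)  # snapshot, like SQL evaluating all SET clauses against the old row
    return {
        "message1": new_message,
        "message2": old["message1"],
        "message3": old["message2"],
        "message4": old["message3"],  # the old message4 falls off the end
    }

row = {"message1": "fourth message", "message2": "third message",
       "message3": "second message", "message4": "first message"}
shifted = shift_history(row, "fifth message")
print(shifted["message1"], "/", shifted["message4"])
```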



&lt;p&gt;So? Is this stupid or smart? After all these years I still have mixed feelings about it.&lt;/p&gt;

</description>
      <category>orm</category>
      <category>django</category>
      <category>database</category>
    </item>
    <item>
      <title>Tracking Gunicorn's busy worker count</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Tue, 26 Aug 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/tracking-gunicorns-busy-worker-count-3g0i</link>
      <guid>https://forem.com/aivarsk/tracking-gunicorns-busy-worker-count-3g0i</guid>
      <description>&lt;p&gt;I was investigating performance issues of a Django application running with Gunicorn behind a Nginx server. First, I added more timing information to Nginx access.log:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;log_format timing '$remote_addr - $remote_user [$time_local] '
                  '"$request" $status $body_bytes_sent '
                  '"$http_referer" "$http_user_agent" rt="$request_time" uct="$upstream_connect_time" uht="$upstream_header_time" urt="$upstream_response_time"';

access_log /var/log/nginx/access.log timing;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the Nginx reload, it started to report the total request time and the time spent waiting for a response from Gunicorn. I also checked the timing in Chrome developer tools. All the times matched, which meant neither network latency nor Nginx was to blame.&lt;/p&gt;

&lt;p&gt;However, for a specific URL, response times ranged from 3 to 22 seconds. The Gunicorn access log already contained the time spent processing each request, thanks to &lt;a href="https://docs.gunicorn.org/en/stable/settings.html#access-log-format" rel="noopener noreferrer"&gt;a custom access log format string&lt;/a&gt;. And within Gunicorn, those requests took less than a second. It was clear that requests get buffered between Nginx and Gunicorn in the &lt;a href="https://docs.gunicorn.org/en/stable/settings.html#backlog" rel="noopener noreferrer"&gt;connection backlog&lt;/a&gt;, and Gunicorn needs &lt;a href="https://docs.gunicorn.org/en/stable/settings.html#worker-processes" rel="noopener noreferrer"&gt;more workers to process the requests&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But how do you find out how many Gunicorn workers are being used, how many are idle, and how do you monitor that? I did not find a good answer. However, Gunicorn has hooks that are called &lt;a href="https://docs.gunicorn.org/en/stable/settings.html#pre-request" rel="noopener noreferrer"&gt;before a request is processed by the worker&lt;/a&gt; and &lt;a href="https://docs.gunicorn.org/en/stable/settings.html#post-request" rel="noopener noreferrer"&gt;after it has been processed&lt;/a&gt;. I could use those to maintain the total number of active requests. But how to do that across multiple processes and avoid lost updates?&lt;/p&gt;

&lt;p&gt;Meet the &lt;a href="https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Semaphore" rel="noopener noreferrer"&gt;multiprocessing Semaphore&lt;/a&gt;! It is a number living in OS kernel memory: acquiring a semaphore decrements its value, releasing a semaphore increments it, and a semaphore can’t become negative (acquiring will block). Normally it is used for synchronization, but I will use it as an atomic gauge: release to increase it and acquire to decrease it.&lt;/p&gt;
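&lt;p&gt;The gauge trick in isolation looks like this (a minimal sketch; note that &lt;code&gt;get_value()&lt;/code&gt; is not implemented on macOS):&lt;/p&gt;

```python
from multiprocessing import Semaphore

busy = Semaphore(0)           # gauge starts at zero busy workers

busy.release()                # a request was picked up -> gauge goes to 1
busy.release()                # another request         -> gauge goes to 2
in_flight = busy.get_value()  # atomic read of the gauge (not implemented on macOS)
busy.acquire()                # a request finished      -> gauge back to 1

print(in_flight, busy.get_value())
```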

&lt;p&gt;Another trick I discovered: the Gunicorn access log can log request headers. So instead of adding custom logs, I added a new HTTP header to the request and stored the counter value in it. Here is the complete solution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from multiprocessing import Semaphore

accesslog = "-"
access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s" rt=%(L)s busy=%({x-busy}i)s'

busy = Semaphore(0)

def pre_request(worker, req):
    busy.release()
    req.headers.append(("x-busy", str(busy.get_value())))

def post_request(worker, req, environ, resp):
    busy.acquire()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>gunicorn</category>
      <category>django</category>
    </item>
    <item>
      <title>Monolith First</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Tue, 08 Jul 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/monolith-first-15ph</link>
      <guid>https://forem.com/aivarsk/monolith-first-15ph</guid>
      <description>&lt;p&gt;Many go to Martin Fowler for microservice architecture, distributed systems, micro frontends, event sourcing, and other fancy ideas about architecture, but &lt;a href="https://martinfowler.com/bliki/MonolithFirst.html" rel="noopener noreferrer"&gt;a few have noticed the advice to do a monolith first&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;As I hear stories about teams using a microservices architecture, I’ve noticed a common pattern.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Almost all the successful microservice stories have started with a monolith that got too big and was broken up&lt;/li&gt;
&lt;li&gt;Almost all the cases where I’ve heard of a system that was built as a microservice system from scratch, it has ended up in serious trouble.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This pattern has led many of my colleagues to argue that &lt;strong&gt;you shouldn’t start a new project with microservices, even if you’re sure your application will be big enough to make it worthwhile.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>monolith</category>
      <category>microservices</category>
    </item>
    <item>
      <title>Transactional task outbox in Django with django-taskq</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Tue, 01 Jul 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/transactional-task-outbox-in-django-with-django-taskq-4801</link>
      <guid>https://forem.com/aivarsk/transactional-task-outbox-in-django-with-django-taskq-4801</guid>
      <description>&lt;p&gt;We have given up on distributed transactions (2PC) but have not given up working with multiple resources like the database, message brokers, and queues. Instead, everybody tries to build their own atomic operations over multiple resources. Some do code&amp;amp;pray, and others try to have a database as a source of truth and with &lt;a href="https://microservices.io/patterns/data/transactional-outbox.html" rel="noopener noreferrer"&gt;the transactional outbox pattern&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pypi.org/project/django-taskq/" rel="noopener noreferrer"&gt;django-taskq&lt;/a&gt; uses the Django database as the one and only backend for storing tasks (function calls with parameters). Because it uses the Django ORM under the hood, it also obeys Django transactions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django_taskq.celery import shared_task

@shared_task(queue="kafka-events", autoretry_for=(Exception,))
def something_happened(*, key: str, value: str):
    ...

with transaction.atomic():
    model1.save()
    model2.save()
    something_happened.delay(key=str(uuid.uud4()), value=payload.model_dump_json())

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Either all models are updated and a new task is scheduled, or none of it happens. And when the task runs, the model changes are already in the database. I guess all of us have a Celery story about similar code with tasks executing before the changes were committed and visible in the database. At that point everybody starts using &lt;code&gt;transaction.on_commit&lt;/code&gt;, which works most of the time, as long as the broker keeps running, the network to the broker is reliable, and neither Redis nor the application gets killed by the OOM killer.&lt;/p&gt;

&lt;p&gt;It still does not prevent failures while the task is executing, nor does it make the task idempotent. But at least the model changes and the task are recorded atomically, and the task’s failure or success will be recorded in the database.&lt;/p&gt;
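&lt;p&gt;The atomicity here is nothing more than a database transaction. A toy sketch with sqlite3 standing in for the Django database (the &lt;code&gt;orders&lt;/code&gt; and &lt;code&gt;outbox&lt;/code&gt; tables are made up for illustration):&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, task TEXT)")

try:
    with conn:  # one transaction: the model change and the task row commit together
        conn.execute("INSERT INTO orders (payload) VALUES (?)", ("order-1",))
        conn.execute("INSERT INTO outbox (task) VALUES (?)", ("something_happened",))
        raise RuntimeError("crash before commit")
except RuntimeError:
    pass

# The crash rolled back both inserts: no order without a task, no task without an order.
orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
tasks = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
print(orders, tasks)
```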

</description>
    </item>
    <item>
      <title>A Philosophy of Software Design</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Thu, 26 Jun 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/a-philosophy-of-software-design-2o5p</link>
      <guid>https://forem.com/aivarsk/a-philosophy-of-software-design-2o5p</guid>
      <description>&lt;p&gt;In a world full of Uncles Bob &lt;a href="https://www.youtube.com/watch?v=lz451zUlF-k" rel="noopener noreferrer"&gt;be John Ousterhout&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/johnousterhout/aposd-vs-clean-code" rel="noopener noreferrer"&gt;More of method length, comment and TDD discussion&lt;/a&gt; and &lt;em&gt;A Philosophy of Software Design&lt;/em&gt; should be on every bookshelf. It has &lt;a href="https://www.youtube.com/watch?v=4xqkI953K6Y" rel="noopener noreferrer"&gt;some critcism&lt;/a&gt; but so does everything.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Serializable isolation level and transaction processing</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Wed, 25 Jun 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/serializable-isolation-level-and-transaction-processing-4pm</link>
      <guid>https://forem.com/aivarsk/serializable-isolation-level-and-transaction-processing-4pm</guid>
      <description>&lt;p&gt;While on the topic of &lt;a href="https://www.tpc.org/tpcc/results/tpcc_results5.asp" rel="noopener noreferrer"&gt;On-Line Transaction Processing Benchmarks&lt;/a&gt;, it’s interesting to observe the strategies companies employ to achieve optimal results. All code for both the transaction monitor and database is available in the PDF report. Let’s look at the &lt;a href="https://www.tpc.org/results/fdr/tpcc/Oracle_SPARC_SuperCluster_with_T3-4s_TPC-C_FDR_120210.pdf" rel="noopener noreferrer"&gt;Oracle one that uses Oracle Tuxedo and the Oracle database&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There’s a lot of cryptic C code using the OCI interface and PL/SQL code blocks. But you won’t find any signs of pessimistic (&lt;code&gt;SELECT ... FOR UPDATE&lt;/code&gt;) or optimistic (&lt;code&gt;UPDATE ... WHERE version=:version&lt;/code&gt;) locking. How so? How can they achieve correctness without explicit locking?&lt;/p&gt;

&lt;p&gt;Oracle decided they could achieve the best results by using a serializable transaction isolation level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ALTER SESSION SET ISOLATION_LEVEL = SERIALIZABLE

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/data-concurrency-and-consistency.html#GUID-8DA9A191-4CA3-4B1A-995F-4B17471C2738" rel="noopener noreferrer"&gt;Serializable isolation level&lt;/a&gt; in Oracle database means database will detect concurrent changes to table rows and return “ORA-08177: Cannot serialize access for this transaction”. This is a bit similar to optimistic locking except the database doing it for you. And then you can add retries in your code that repeats all updates until success. Like this payment code from TPC-C:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DECLARE /* payz */
    not_serializable EXCEPTION;
    PRAGMA EXCEPTION_INIT(not_serializable,-8177);
    deadlock EXCEPTION;
    PRAGMA EXCEPTION_INIT(deadlock,-60);
    snapshot_too_old EXCEPTION;
    PRAGMA EXCEPTION_INIT(snapshot_too_old,-1555);
BEGIN
    LOOP
        BEGIN
            UPDATE cust
                SET c_balance = c_balance - :h_amount,
                c_ytd_payment = c_ytd_payment + :h_amount,
                c_payment_cnt = c_payment_cnt + 1
                WHERE ...;
            UPDATE dist
                SET d_ytd = d_ytd + :h_amount
                WHERE ...;
            EXIT;
        EXCEPTION WHEN not_serializable OR deadlock OR snapshot_too_old THEN
            ROLLBACK;
            :retry := :retry + 1;
        END;
    END LOOP;
END;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So I used this approach in an early version of an accounting system. But performance tests on hot accounts were so bad that I had to give up. The reasons are similar to what &lt;a href="https://youtu.be/GEkeOHw87Sg?si=urUjkVKqeoaIscu7&amp;amp;t=944" rel="noopener noreferrer"&gt;Cliff Click shared about his experience with Hardware Transactional Memory&lt;/a&gt; and “perf counters” / “mod counters” leading to transaction retries.&lt;/p&gt;

&lt;p&gt;With a serializable isolation level, the database is ignorant of what kind of changes your application makes. All updates are equally important to the database. But from the application’s point of view, many changes are commutative, and you don’t care in what order they happen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;incrementing transaction count&lt;/li&gt;
&lt;li&gt;increasing account balance&lt;/li&gt;
&lt;li&gt;decreasing account balance as long as it does not become negative&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, the database is not aware of that. When your transaction starts with a count of 42 and some other transaction manages to increment it before you, you get &lt;code&gt;ORA-08177&lt;/code&gt; and have to retry. The database sees that some bits have changed, and that’s all it cares about. But I don’t care whether the transaction count is 43, 44, or 89 after I increment it, as long as the current (whatever) value is incremented by 1.&lt;/p&gt;

&lt;p&gt;The serializable transaction isolation level can only be faster in TPC-C tests when contention is relatively low. I settled on relative updates, and sometimes optimistic locking for accounts where the balance was decreased or was used to calculate fees and interest based on the exact current value. The system was doing several entries per authorization, along with cryptography, card checks, and network messages, at &lt;a href="https://blogs.oracle.com/solaris/post/tieto-card-suite-optimized-on-oracle-supercluster" rel="noopener noreferrer"&gt;5,000 authorizations per second&lt;/a&gt;.&lt;/p&gt;
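&lt;p&gt;A relative update pushes the arithmetic into the database, so the current value never has to be read first. A sketch with sqlite3 for illustration (the real system used Oracle; table and column names are invented):&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, tx_count INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100, 0)")

# Commutative updates: increment/decrement relative to whatever the current
# value is, with a guard so the balance cannot go negative.
ok = conn.execute(
    "UPDATE account SET balance = balance - ?, tx_count = tx_count + 1 "
    "WHERE id = ? AND balance >= ?", (30, 1, 30)).rowcount
rejected = conn.execute(
    "UPDATE account SET balance = balance - ?, tx_count = tx_count + 1 "
    "WHERE id = ? AND balance >= ?", (200, 1, 200)).rowcount

balance, tx_count = conn.execute(
    "SELECT balance, tx_count FROM account WHERE id = 1").fetchone()
print(ok, rejected, balance, tx_count)  # 1 0 70 1
```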

</description>
    </item>
    <item>
      <title>A broker-less distributed messaging system from the previous century</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Sun, 22 Jun 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/a-broker-less-distributed-messaging-system-from-the-previous-century-2ebg</link>
      <guid>https://forem.com/aivarsk/a-broker-less-distributed-messaging-system-from-the-previous-century-2ebg</guid>
      <description>&lt;p&gt;When examining the &lt;a href="https://www.tpc.org/tpcc/results/tpcc_results5.asp" rel="noopener noreferrer"&gt;On-Line Transaction Processing Benchmark&lt;/a&gt;, most people focus on the performance numbers and the database software. But there is another column named “TP Monitor” that lists the transaction monitor software. Before cloud-scale systems took over, the best performance numbers were achieved with Oracle Tuxedo (or BEA Tuxedo, before Oracle acquired it). The good results of Oracle database and Tuxedo led my previous company to choose them as the basis for payment card software in the late 1990s and early 2000s.&lt;/p&gt;

&lt;p&gt;While Oracle Tuxedo is proprietary software, &lt;a href="https://pubs.opengroup.org/onlinepubs/009649399/toc.pdf" rel="noopener noreferrer"&gt;The XATMI specification&lt;/a&gt; is public. The main building block is an RPC call: you call a service by its name (&lt;code&gt;svc&lt;/code&gt;), pass some data to it (&lt;code&gt;idata&lt;/code&gt;, &lt;code&gt;ilen&lt;/code&gt;), and receive data back (&lt;code&gt;odata&lt;/code&gt;, &lt;code&gt;olen&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;int tpcall(char *svc, char *idata, long ilen, char **odata, long *olen, long flags);

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Under the hood, it’s all just a wrapper around two API calls: one for sending a request to the service (&lt;code&gt;tpacall&lt;/code&gt;) and the other one for waiting for a response (&lt;code&gt;tpgetrply&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;int tpacall(char *svc, char *data, long len, long flags);
int tpgetrply(int *cd, char **data, long *len, long flags);

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And yes - the creators decided to skip the letter ‘e’ in &lt;code&gt;tpgetrply&lt;/code&gt; while still having longer API names like &lt;code&gt;tpadvertise&lt;/code&gt;, &lt;code&gt;tpconnect&lt;/code&gt; and &lt;code&gt;tpunadvertise&lt;/code&gt;. But unlike later specifications such as &lt;a href="https://en.wikipedia.org/wiki/Common_Object_Request_Broker_Architecture" rel="noopener noreferrer"&gt;CORBA&lt;/a&gt;, it made it very clear which calls were RPC, and it forced you to handle failures and timeouts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But&lt;/strong&gt;, the API itself is not that interesting. You can implement the API but still have a shitty XATMI implementation. Or you can do it like Tuxedo did and have something simple and elegant. So let’s look into that, and maybe you can take some design lessons out of it.&lt;/p&gt;

&lt;p&gt;Tuxedo was developed by AT&amp;amp;T along with UNIX, so it used &lt;a href="https://en.wikipedia.org/wiki/UNIX_System_V#SVR1" rel="noopener noreferrer"&gt;System V inter-process message queues, semaphores, and shared memory&lt;/a&gt;. Message queues are the foundation of Tuxedo, and the most important function calls are &lt;code&gt;msgsnd&lt;/code&gt;, which just copies data into kernel space, and &lt;code&gt;msgrcv&lt;/code&gt;, which copies it back. &lt;a href="https://www.tuhs.org/cgi-bin/utree.pl?file=pdp11v/usr/src/uts/pdp11/os/msg.c" rel="noopener noreferrer"&gt;It’s really that simple&lt;/a&gt;. But because the message queue lives in the kernel, messages live there as long as the kernel is running. Senders and receivers can come and go, and the messages will stay. There is no “broker” process, as we expect nowadays, that has to keep running or persist messages to storage. The kernel is the broker.&lt;/p&gt;

&lt;p&gt;A &lt;a href="https://docs.nats.io/nats-concepts/core-nats/reqreply" rel="noopener noreferrer"&gt;request-reply pattern&lt;/a&gt; is built by the caller having its own response queue and the callee having a request queue. Each request message includes the response queue identifier where the response should be stored. Tuxedo implements timeouts waiting for the response, handles (ignores) late responses, and does other housekeeping.&lt;/p&gt;

&lt;p&gt;Now, queues are identified by a number, but Tuxedo works with nice service names. To map service names to queue identifiers, Tuxedo uses shared memory. Again, the shared memory is kept alive by the kernel and outlives all processes, while every process can access it to look up a service name’s queue identifier. Like a serverless name server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffduukeaop69lou9wbr4n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffduukeaop69lou9wbr4n.png" alt="Local Tuxedo" width="659" height="321"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To put it all together: the caller, process#1, looks up the service name “FOO”. When “FOO” is not found or the queue does not exist, you get an error. When the queue is full, you can either wait or fail, depending on the call mode. Once the request message is added to the queue, process#1 proceeds to wait for a response. When the response is not received within the timeout, you get an error. On the callee side, process#2 polls requests from queue#3. Once a request is received, it does the work and puts the response into the reply queue named in the request (queue#4).&lt;/p&gt;

&lt;p&gt;Now what about the “distributed” part? Instead of adding a new transport for the service call, Tuxedo introduces gateways that connect multiple machines. On the caller machine, the gateway says it provides the “FOO” service. Once it receives the request, it forwards it using whatever transport protocol to the gateway on the other machine. On the other machine, the gateway acts as a caller and calls process#2.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhamv5c7uvcdmn9jdtx2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhamv5c7uvcdmn9jdtx2.png" alt="Distributed Tuxedo" width="800" height="185"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Simple and nice, isn’t it?&lt;/p&gt;

</description>
    </item>
    <item>
      <title>My Friday's "old man yells at cloud" moment</title>
      <dc:creator>Aivars Kalvāns</dc:creator>
      <pubDate>Fri, 06 Jun 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/aivarsk/my-fridays-old-man-yells-at-cloud-moment-4joe</link>
      <guid>https://forem.com/aivarsk/my-fridays-old-man-yells-at-cloud-moment-4joe</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=0WYgKc00J8s&amp;amp;t=2096s" rel="noopener noreferrer"&gt;Casey Muratori had this to say&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You should never take library design advice from anyone who hasn’t had to make a living selling a library in a competitive arena.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I will rephrase it slightly and put it on the wall:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;“You should never take software design advice from anyone who hasn’t had to make a living selling software in a competitive arena.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;“Never take” might be too strong, but at least be skeptical. Too much advice comes from people who get paid hourly, or whose project takes as long as it takes and costs as much as it does. Or who move on to the next shiny project as soon as it’s done and never look back. In a way, their assumptions about development, maintenance, costs, and extensibility never get challenged or measured. But but but the “developer performance”… compared to what?&lt;/p&gt;

&lt;p&gt;Selling software or running it for years is the reality check. Do your methodology and “tech stack” leave room for a profit margin? Do your state machines, event logs, and databases allow you to recover from bugs and incidents? Why did extensive testing miss those bugs? How often do you change the database, the XML/JSON parser, or the (G)UI? Or the things you put into configuration files because you might want to modify them later? How easy is it to modify and implement new features in the clean, unit-tested, OCP, micro-service, multi-AZ code?&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
