<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Abubakar</title>
    <description>The latest articles on Forem by Abubakar (@thatechmaestro).</description>
    <link>https://forem.com/thatechmaestro</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F282515%2F378fa452-b831-451f-9f2d-a93d55c1403d.jpeg</url>
      <title>Forem: Abubakar</title>
      <link>https://forem.com/thatechmaestro</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/thatechmaestro"/>
    <language>en</language>
    <item>
      <title>Judgment Layer for Financial AI Agents</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sat, 11 Apr 2026 19:54:13 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/judgment-layer-for-financial-ai-agents-141</link>
      <guid>https://forem.com/thatechmaestro/judgment-layer-for-financial-ai-agents-141</guid>
      <description>&lt;p&gt;AI systems are being wired into financial workflows. &lt;br&gt;
The problem: model outputs are being treated as decisions.&lt;/p&gt;

&lt;p&gt;They're not. They can be wrong, inflated, or unverifiable, and right now there's nothing between generation and execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ancora puts a judgment layer in that gap.&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Demo below&lt;/strong&gt; uses accounts payable data.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://youtu.be/5gjsXaKLUbc" rel="noopener noreferrer"&gt;Watch it here&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ancora-finance-workflow.onrender.com/" rel="noopener noreferrer"&gt;Try Live&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Building something similar or facing similar issues? Let's talk.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>automation</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Epistemic Control Systems: Anchoring on Kafka</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sat, 04 Apr 2026 16:53:07 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/epistemic-control-systems-anchoring-on-kafka-fao</link>
      <guid>https://forem.com/thatechmaestro/epistemic-control-systems-anchoring-on-kafka-fao</guid>
      <description>&lt;p&gt;&lt;em&gt;This &lt;a href="https://dev.to/thatechmaestro/epistemic-control-systems-governing-belief-not-reality-179h"&gt;post&lt;/a&gt; defined the invariants. This one anchors them on Apache Kafka, running live, broken deliberately.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What this is
&lt;/h2&gt;

&lt;p&gt;An Epistemic Control System governs belief about reality, not reality itself. Six primitives exist in every system of this class:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;Kafka equivalent&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Observation&lt;/td&gt;
&lt;td&gt;Producer message&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time Window&lt;/td&gt;
&lt;td&gt;Topic partition &lt;em&gt;(ordering + scope, time-based windows are defined downstream)&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Synthesis&lt;/td&gt;
&lt;td&gt;Broker validation path &lt;em&gt;(admissibility checks only, no semantic processing)&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snapshot&lt;/td&gt;
&lt;td&gt;Committed log entry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Publication Gate&lt;/td&gt;
&lt;td&gt;Broker write/ack &lt;em&gt;(system-level commit)&lt;/em&gt; / Offset commit &lt;em&gt;(consumer acknowledgment)&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authority Pointer&lt;/td&gt;
&lt;td&gt;Consumer group offset&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;(Kafka determines what is accepted into the log, truth, not what that data means.)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Seven invariants govern these systems. Violate one and you have not degraded the system. You have changed what it is.&lt;/p&gt;

&lt;h2&gt;
  
  
  The primitives on Kafka
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Observation&lt;/strong&gt;&lt;br&gt;
The producer message is an observation. Raw input. Uncommitted. The producer's belief that it sent something is not truth.&lt;br&gt;
Anything not appended to the log is intent. Intent means non-existence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Time Window&lt;/strong&gt;&lt;br&gt;
The topic partition defines what counts as input. Messages outside the partition do not exist to the system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Synthesis&lt;/strong&gt;&lt;br&gt;
The broker's validation of an incoming message: topic check, partition assignment, CRC verification. Failure is permitted here. Retries happen here. Nothing has crossed yet. Uncertainty resolves inside this step or the message is discarded.&lt;br&gt;
Kafka does not compute meaning. It only decides what is allowed into the log.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Snapshot&lt;/strong&gt;&lt;br&gt;
A committed log entry. Immutable. The offset assigned is permanent. It cannot be changed, moved, or erased without destroying the log's integrity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Publication Gate&lt;/strong&gt;&lt;br&gt;
The broker write/ack. Binary. Either the message is written to the log and assigned an offset, or it is not. No middle state, no partial commit.&lt;br&gt;
Offset commit is the consumer acknowledging it has processed that truth.&lt;br&gt;
Epistemic systems act on completed results, not actions. Writing to the log is not state. It is an action. A committed entry is state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Authority Pointer&lt;/strong&gt;&lt;br&gt;
The consumer group offset. Always as of, never now. Even at millisecond latency, the consumer reads truth committed in the past. The present does not exist in an epistemic system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Boundaries
&lt;/h2&gt;

&lt;p&gt;Each boundary answers one question: what would corrupt truth if this did not exist?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Reasoning boundary&lt;/strong&gt;&lt;br&gt;
Has uncertainty resolved yet? Failure is permitted here. Retries, rejection, timeout, all contained. Nothing crosses until synthesis completes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Publication boundary&lt;/strong&gt;&lt;br&gt;
Is this complete enough to become truth? All or nothing. The broker write decision. No partial state crosses. Ever.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Consumption boundary&lt;/strong&gt;&lt;br&gt;
Has the authority confirmed this as truth? Consumers read snapshots only. Never partial state, never mid-write state.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fej738gvryt83kvek42qy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fej738gvryt83kvek42qy.png" alt=" " width="800" height="565"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Invariants under stress
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Truth advances only on success&lt;/strong&gt;&lt;br&gt;
Kill the broker while the producer is active. The producer throws a connection error. The log does not move. Messages sent during the outage do not appear after restart. They are not lost, they never existed. The gate requires the broker to be present and healthy. Remove the broker and the gate stays shut. There is no degraded mode. There is no partial truth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Failure pauses time, not the world&lt;/strong&gt;&lt;br&gt;
Restart the broker after an outage and query the log from the beginning. Every message committed before the failure comes back exactly as it was. Same offsets, same order, same content. The gap during the outage is not filled, estimated, or approximated. Time simply stopped at the last successful commit and resumed from there. The world kept moving. The log did not. That is the correct behaviour.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Snapshots are immutable once published&lt;/strong&gt;&lt;br&gt;
Nothing committed before the outage moves after restart. Nothing is corrupted or reordered. The log is append-only. What entered it at offset 12 is at offset 12 forever. There is no mechanism to rewrite history because a system that can rewrite history cannot be trusted as an authority.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Correctness strictly dominates freshness&lt;/strong&gt;&lt;br&gt;
Start a consumer with messages backlogged. Send more messages rapidly. Watch what the consumer does. It processes every offset in sequence. It does not skip to the latest. It does not summarise the gap. It does not estimate. Offset 10, then 11, then 12. Every single one. Skipping an offset to serve freshness is epistemic corruption. It creates a hole in the consumer's belief about reality, and every decision made downstream of that hole is made on incomplete truth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Canonical knowledge is always historical&lt;/strong&gt;&lt;br&gt;
Query the log end offset with no consumer running. What comes back is exact and frozen: the precise position where truth last advanced. Not approximately the latest. Not around now. Exactly where the last commit landed. Connect a consumer and it does not join now. It joins a position. Time in this system is measured in offsets, not clocks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Knowledge never enacts control, it only informs&lt;/strong&gt;&lt;br&gt;
Send messages Kafka cannot understand: arbitrary characters, random strings, nothing structured. It commits them all. It does not inspect content, route on meaning, or make decisions. It moves truth from one boundary to the next and stops. What the consumer does with what it reads is none of Kafka's concern. The system that governs belief is structurally separate from the system that acts on it. That separation is not incidental. It is load-bearing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Truth is monotonic and confidence-bounded&lt;/strong&gt;&lt;br&gt;
Send five messages. Query the offset before and after. It moves forward by exactly five. Never backwards or sideways. Send messages with the broker down. The offset does not move at all. The log advances only when the broker can guarantee the write: not when the producer tries, not when the consumer reads. Confidence is the condition. Without it, time stops.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The invariants hold regardless of how the system is built. What changes is how the system enforces them.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Governing Questions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Where is your publication gate? Has latency pressure kept it binary, or has it quietly become something that passes partial results?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When your system fails, does it pause time or corrupt state?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Where exactly does truth advance in your system? Worth checking.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Systems like Kafka solve truth at the data layer. The open problem is enforcing these invariants at the decision layer.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>dataengineering</category>
      <category>distributedsystems</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Epistemic Control Systems: governing belief, not reality</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Mon, 30 Mar 2026 06:36:26 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/epistemic-control-systems-governing-belief-not-reality-179h</link>
      <guid>https://forem.com/thatechmaestro/epistemic-control-systems-governing-belief-not-reality-179h</guid>
      <description>&lt;p&gt;&lt;em&gt;&lt;em&gt;An invariant analysis of Epistemic Control Systems, a class of systems that govern belief about reality, not reality itself.&lt;/em&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What this class of system is
&lt;/h2&gt;

&lt;p&gt;An &lt;strong&gt;Epistemic Control System&lt;/strong&gt; governs knowledge about reality, not reality itself. It produces, validates, and evolves belief under uncertainty. The canonical examples span domains: financial settlement systems, ML feature stores, logistics routing systems, medical record synthesis. They share the same physics.&lt;/p&gt;

&lt;p&gt;The fundamental control variable is &lt;strong&gt;confidence-weighted truth over time&lt;/strong&gt;. Not availability. Not latency. Not throughput. A system in this class that sacrifices correctness for freshness has changed what it is. It is no longer an epistemic system. It is an approximation engine, a different beast with different failure modes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Canonical Loop
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Time&lt;/td&gt;
&lt;td&gt;Driver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Historical observations&lt;/td&gt;
&lt;td&gt;Input&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Synthesis and aggregation&lt;/td&gt;
&lt;td&gt;Transformation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Immutable snapshot&lt;/td&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Next time boundary&lt;/td&gt;
&lt;td&gt;Continuation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;There is no notion of &lt;em&gt;now&lt;/em&gt; inside this loop. Only &lt;em&gt;as of T&lt;/em&gt;, this exists. The system does not stream truth continuously. It advances truth in discrete, verified steps. Each step either completes or it does not. There is no partial completion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Universal Primitives
&lt;/h2&gt;

&lt;p&gt;These six primitives exist in every epistemic system, regardless of domain or technology:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Observation:&lt;/strong&gt; the raw input signal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time Window:&lt;/strong&gt; the boundary of what counts as input&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Synthesis:&lt;/strong&gt; the aggregation and confidence-weighting process&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot:&lt;/strong&gt; the immutable output of a completed synthesis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Publication Gate:&lt;/strong&gt; the binary control that determines whether truth advances&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authority Pointer:&lt;/strong&gt; the reference to the last trusted snapshot&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A logistics routing system has all six. A clinical trial data system has all six. A trading book reconciliation system has all six. The names change. The structure does not.&lt;/p&gt;

&lt;h2&gt;
  
  
  Non-Negotiable Invariants
&lt;/h2&gt;

&lt;p&gt;These are the system physics. Violating any one does not degrade the system. It changes what the system is.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Truth advances only on success, never on partial confidence&lt;/li&gt;
&lt;li&gt;Canonical knowledge is always historical, never present-tense&lt;/li&gt;
&lt;li&gt;Failure cannot retract belief. It can only pause advancement&lt;/li&gt;
&lt;li&gt;Snapshots are immutable once published&lt;/li&gt;
&lt;li&gt;Correctness strictly dominates freshness&lt;/li&gt;
&lt;li&gt;Knowledge never enacts control. It only informs it&lt;/li&gt;
&lt;li&gt;Truth is monotonic and confidence-bounded&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A snapshot that can be retracted is no longer a snapshot. A gate that passes partial results is no longer a gate. The identity of the system is constituted by these invariants, not by its implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Failure Means
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&lt;strong&gt;Failure pauses time, not the world&lt;/strong&gt;&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When a synthesis step fails due to an upstream data gap, a timeout, or a confidence threshold not met, the instinct is to call this a system failure. It is not. The world continued. The system paused its clock.&lt;/p&gt;

&lt;p&gt;This distinction is architectural, not semantic. A system that treats failure as a retraction of prior belief will corrupt its state. A system that treats failure as a pause in time advancement will remain coherent. The next successful run does not repair a broken state. It simply advances time from where it stopped.&lt;/p&gt;

&lt;p&gt;The diagnostic question is never &lt;em&gt;"why did it fail?"&lt;/em&gt; first. It is always: &lt;em&gt;"did correctness survive?"&lt;/em&gt; If yes, the system worked. It paused time. It did not break.&lt;/p&gt;

&lt;h2&gt;
  
  
  Containment Boundaries
&lt;/h2&gt;

&lt;p&gt;Three boundaries prevent &lt;strong&gt;epistemic corruption&lt;/strong&gt;, the condition where partial belief leaks into the consumption layer and is treated as complete truth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Boundary 1: Reasoning boundary&lt;/strong&gt;&lt;br&gt;
Failure is permitted here. Uncertainty resolves inside. Nothing crosses until synthesis is complete.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Boundary 2: Publication boundary&lt;/strong&gt;&lt;br&gt;
All or nothing. Truth crosses or it does not. This is the critical gate. It is binary by design.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Boundary 3: Consumption boundary&lt;/strong&gt;&lt;br&gt;
Readers trust the snapshot only. Never partial state. The snapshot they receive is complete or it does not exist.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This boundary model assumes observation integrity. The question of what governs input legitimacy before the reasoning boundary is a separate analysis.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Explicit Anti-Goal
&lt;/h2&gt;

&lt;p&gt;This system should not become a real-time truth oracle.&lt;/p&gt;

&lt;p&gt;The chain is direct: real-time implies partial belief, partial belief implies false certainty, false certainty scales into harm.&lt;/p&gt;

&lt;p&gt;Freshness pressure is the most common way epistemic systems lose their identity. The moment a system begins publishing partial snapshots to reduce latency, it has crossed the boundary into a different class, one without the safety properties described here.&lt;/p&gt;

&lt;h2&gt;
  
  
  Human feedback as stabiliser
&lt;/h2&gt;

&lt;p&gt;At scale, epistemic systems face oscillation risk. Urgency signals cause the system to lower its confidence threshold, which causes incorrectness, which causes downstream harm, which increases urgency.&lt;/p&gt;

&lt;p&gt;The circuit breaker is human judgment. Humans interpret the freshness-correctness trade-off. Humans re-weight priority. The system does not auto-escalate. It waits.&lt;/p&gt;

&lt;p&gt;This is not a limitation. It is a safety property.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Epistemic systems control belief, not reality.&lt;/li&gt;
&lt;li&gt;They advance truth only when confidence is complete.&lt;/li&gt;
&lt;li&gt;Failure pauses time, not the world.&lt;/li&gt;
&lt;li&gt;Correctness is sacred. Freshness is negotiable.&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Throttling as a Coordination Constraint</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sun, 08 Feb 2026 00:51:32 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/throttling-as-a-coordination-constraint-1pfe</link>
      <guid>https://forem.com/thatechmaestro/throttling-as-a-coordination-constraint-1pfe</guid>
      <description>&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;In large distributed systems, upstream components commonly throttle under load. Downstream services often propagate these signals to clients.&lt;/p&gt;

&lt;p&gt;In many architectures, requests are admitted at the boundary while pressure is managed after admission through internal throttling. This behavior is common, largely invisible in steady state, and revealed only under stress.&lt;/p&gt;

&lt;p&gt;Real systems typically combine multiple layers of rate limiting, levels of throttling, queues, circuit breakers, and backpressure. This note isolates one recurring structural failure mode within that broader landscape.&lt;/p&gt;

&lt;h2&gt;
  
  
  The recurring failure mode
&lt;/h2&gt;

&lt;p&gt;When throttling is internal and retries are uncoordinated, a predictable dynamic emerges:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Upstream enters throttling mode.&lt;/li&gt;
&lt;li&gt;Downstream relays the signal.&lt;/li&gt;
&lt;li&gt;Services (and often clients) retry aggressively.&lt;/li&gt;
&lt;li&gt;Retries increases load while capacity is already constrained.&lt;/li&gt;
&lt;li&gt;The system enters a self-reinforcing stress loop without any single component crashing.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The hazard is not a faulty service. It is the &lt;strong&gt;feedback structure&lt;/strong&gt; linking clients, intermediaries, and upstream systems.&lt;/p&gt;

&lt;p&gt;In this context, what appears as a performance crisis is fundamentally a coordination failure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Rate limiting versus throttling
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rate limiting is boundary control.&lt;/strong&gt; Work is refused before it begins; protection is proactive.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Throttling is internal control.&lt;/strong&gt; Work is admitted first and slowed later; protection is reactive.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Relying solely on internal throttles while allowing uncoordinated retries makes pressure accumulation likely and recovery brittle.&lt;/p&gt;

&lt;h2&gt;
  
  
  Invariant
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Throttling and Backoff Invariant&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A system must not rely on server-side throttling unless retry behavior is explicitly coordinated across layers.&lt;/p&gt;

&lt;p&gt;How that coordination is achieved is a design choice, the requirement is that it exists and is enforceable.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>backend</category>
      <category>microservices</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Executability Is the Real Safety Boundary</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sun, 01 Feb 2026 18:50:05 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/executability-is-the-real-safety-boundary-18bh</link>
      <guid>https://forem.com/thatechmaestro/executability-is-the-real-safety-boundary-18bh</guid>
      <description>&lt;p&gt;Failures in complex systems are often explained as bad deployments, rushed rollbacks, or human error under pressure. That framing operates too close to the surface.&lt;/p&gt;

&lt;p&gt;In long lived systems that perform irreversible actions, safety is not determined by intent, correctness, or recovery speed. It is determined by what is allowed to execute, what control signals mean, and where authority is enforced.&lt;/p&gt;

&lt;p&gt;From that lens, a small set of invariants emerges.&lt;/p&gt;

&lt;h2&gt;
  
  
  Invariant 1 Executability defines risk
&lt;/h2&gt;

&lt;p&gt;Any code path that can execute is part of the system’s safety surface, regardless of intent, age, or documentation.&lt;/p&gt;

&lt;p&gt;Deprecated, unused, or retired are descriptive labels, not control states.&lt;br&gt;
If a code path must never run, it must be rendered non executable.&lt;/p&gt;

&lt;p&gt;Leaving dormant logic callable behind flags, configuration, or assumed reachability creates latent risk. When activation conditions reappear, the system behaves exactly as it was built to behave.&lt;/p&gt;

&lt;p&gt;Safety begins where executability ends.&lt;/p&gt;

&lt;h2&gt;
  
  
  Invariant 2 Control signals require provable semantic alignment
&lt;/h2&gt;

&lt;p&gt;A control signal may only be repurposed if no executable version can interpret it under a previous meaning. Alignment must be enforced, not assumed.&lt;/p&gt;

&lt;p&gt;In long aged systems, control signals accumulate history. Their meaning is not defined by current intent, but by the oldest version still capable of execution.&lt;/p&gt;

&lt;p&gt;If the same signal can legally trigger different behavior across versions, the system already contains split brain risk. Partial rollout, rollback, or recovery actions will amplify it.&lt;/p&gt;

&lt;p&gt;Semantic consistency is an execution time property, not a documentation concern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Invariant 3 Prevention beats semantic gymnastics
&lt;/h2&gt;

&lt;p&gt;In safety critical systems, introducing new control signals is safer than reinterpreting old ones unless global semantic consistency can be guaranteed.&lt;/p&gt;

&lt;p&gt;Reusing signals is often locally rational, especially under time pressure. But in systems with version skew, long tails, and irreversible effects, reuse optimizes convenience over containment.&lt;/p&gt;

&lt;p&gt;New signals create isolation.&lt;br&gt;
Isolation reduces cross version ambiguity.&lt;br&gt;
Ambiguity is where control fails.&lt;/p&gt;

&lt;h2&gt;
  
  
  Invariant 4 Safety cannot depend on deployment correctness
&lt;/h2&gt;

&lt;p&gt;If safety relies on rollout completeness, operator timing, or rollback speed, the system has no safety boundary.&lt;/p&gt;

&lt;p&gt;Deployment and rollback are recovery mechanisms. They assume consistency and time. Irreversible systems provide neither.&lt;/p&gt;

&lt;p&gt;Once execution crosses the boundary where effects are real, observability becomes historical. Alerts and dashboards describe damage. They do not constrain it.&lt;/p&gt;

&lt;p&gt;Control must exist before execution, not after detection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Authority Boundary
&lt;/h2&gt;

&lt;p&gt;These invariants apply to systems that perform irreversible actions, where authority over execution must be enforced before effects are real.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reference
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://youtu.be/263CooDJZCY?si=dd4Vv5kNlTNTUTQP" rel="noopener noreferrer"&gt;Knight Capital Group trading incident, August 2012.&lt;/a&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>security</category>
      <category>softwareengineering</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Authority Placement: Control Layers and Enforcement Boundaries</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sat, 24 Jan 2026 20:08:58 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/authority-placement-control-layers-and-enforcement-boundaries-3cpo</link>
      <guid>https://forem.com/thatechmaestro/authority-placement-control-layers-and-enforcement-boundaries-3cpo</guid>
      <description>&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;A Kong gateway plugin was implemented to reject API requests violating a contract rule (&lt;code&gt;len(values) ≤ n&lt;/code&gt;) before they reach upstream services. Enforcement is placed at the gateway layer, preventing invalid requests from entering downstream execution paths.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Principle
&lt;/h2&gt;

&lt;p&gt;In distributed systems, enforcement can occur at multiple layers. The critical distinction is whether enforcement prevents execution or reports failure after the fact.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;control layer&lt;/strong&gt; is one where invalid actions are &lt;strong&gt;stopped before execution&lt;/strong&gt;. This differs from &lt;strong&gt;validation layers&lt;/strong&gt;, which detect violations after the action has been attempted and report the failure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The invariant:&lt;/strong&gt; Enforcement placed after execution is reporting, not control.&lt;br&gt;
Control answers: &lt;em&gt;"May this action proceed?"&lt;/em&gt;&lt;br&gt;
Validation answers: &lt;em&gt;"Was this action invalid?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;These questions are not interchangeable. Systems that conflate them accumulate hidden failure modes.&lt;br&gt;
This reflects an &lt;strong&gt;upstream control-layer denial pattern&lt;/strong&gt;, where authority is exercised before execution rather than delegated to downstream validation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where This Pattern Holds
&lt;/h2&gt;

&lt;p&gt;This pattern applies when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The contract is decidable at the boundary.&lt;/strong&gt; The gateway has sufficient information to make the enforcement decision without calling downstream services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prevention is cheaper than cleanup.&lt;/strong&gt; Stopping an invalid request costs less than processing it through multiple layers, then rolling back or handling exceptions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authority must be explicit.&lt;/strong&gt; The system requires auditable proof that invalid actions were denied before execution, not discovered during execution.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where This Pattern Breaks
&lt;/h2&gt;

&lt;p&gt;This pattern fails when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The rule requires downstream context.&lt;/strong&gt; If enforcement depends on database state, current load, or business logic deep in the application, the gateway cannot make the decision.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The contract is dynamic per-request.&lt;/strong&gt; User-specific limits, learned rules, or contextual constraints require more than static configuration at the gateway.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Upstream services disagree about constraints.&lt;/strong&gt; When the gateway enforces one limit but downstream services can handle more, the control layer creates artificial restrictions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Trade-Off
&lt;/h2&gt;

&lt;p&gt;Moving enforcement upstream trades flexibility for authority. The gateway becomes a hard boundary. It prevents invalid actions but cannot negotiate, adapt, or reason about context. This is acceptable when contracts are simple and universal. It becomes a liability when contracts are complex and contextual.&lt;/p&gt;

&lt;p&gt;The common mistake is not choosing one approach over the other, but assuming downstream validation provides the same guarantees as upstream control.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reference:&lt;/strong&gt; &lt;a href="https://github.com/ThaTechMaestro/execution-boundaries/blob/main/control-gate/readme.md" rel="noopener noreferrer"&gt;Contract Gate (Kong)&lt;/a&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>architecture</category>
      <category>backend</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Evaluating Trustworthiness in DistilGPT2</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sun, 08 Jun 2025 15:38:18 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/evaluating-trustworthiness-in-distilgpt2-51bj</link>
      <guid>https://forem.com/thatechmaestro/evaluating-trustworthiness-in-distilgpt2-51bj</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;A subset of TrustLLM evaluations was applied to DistilGPT2, a compact open-source language model, to observe its behavior across four trust-critical dimensions: &lt;strong&gt;ethics, fairness, safety, and truthfulness&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;em&gt;DistilGPT2 was not designed for alignment or safety. This audit serves purely as a diagnostic lens for trust-related behavior.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Methodology
&lt;/h2&gt;

&lt;p&gt;Representative prompts were selected from each TrustLLM category and executed directly on DistilGPT2 without fine-tuning. Responses were judged using a  pass/fail system based on instruction adherence, factual correctness, and normative alignment.&lt;/p&gt;

&lt;h2&gt;
  
  
  TRUST AREAS
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. ETHICS
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Scenarios
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Subtest File&lt;/th&gt;
&lt;th&gt;Scenario Summary&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;awareness&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tests if the model avoids hallucinating capabilities or pretending to perform system-level actions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;explicit_moralchoice&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Assesses ethical reasoning in high-stakes dilemmas involving harm, death, or moral conflict.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;implict_ethics&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Evaluates commonsense moral reasoning in everyday low-stakes situations.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;implicit_SocialChemistry101&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Checks if the model can classify actions as good, neutral, or bad based on social norms.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Results
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Subtest File&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;th&gt;Passed&lt;/th&gt;
&lt;th&gt;Failed&lt;/th&gt;
&lt;th&gt;Pass Rate&lt;/th&gt;
&lt;th&gt;Dominant Failure Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/ethics/awareness.json" rel="noopener noreferrer"&gt;awareness&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;20%&lt;/td&gt;
&lt;td&gt;Hallucinated capabilities&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/ethics/explicit_moralchoice.json" rel="noopener noreferrer"&gt;explicit_moralchoice&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Moral incoherence&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/ethics/implicit_ETHICS.json" rel="noopener noreferrer"&gt;implict_ethics&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Instructional failure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/ethics/implicit_SocialChemistry101.json" rel="noopener noreferrer"&gt;implicit_SocialChemistry101&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Looping / Null output&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Insights
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Often fabricated capabilities rather than admitting limitations.&lt;/li&gt;
&lt;li&gt;Failed to provide coherent moral reasoning.&lt;/li&gt;
&lt;li&gt;Misinterpreted or ignored straightforward ethical instructions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. FAIRNESS
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Scenarios
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Subtest File&lt;/th&gt;
&lt;th&gt;Scenario Summary&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;disparagement&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Evaluates responses to salary prediction tasks across demographic groups, checking for bias or offensive outputs.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;preference&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Assesses whether the model expresses ideological preferences on polarizing topics.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;stereotype_agreement&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tests the model's ability to reject harmful stereotypes when presented directly.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;stereotype_query_test&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Evaluates whether the model deflects or resists stereotype-laden queries.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;stereotype_recognition&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Checks if the model correctly labels statements as stereotypical, anti-stereotypical, or unrelated.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Results
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Subtest File&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;th&gt;Passed&lt;/th&gt;
&lt;th&gt;Failed&lt;/th&gt;
&lt;th&gt;Pass Rate&lt;/th&gt;
&lt;th&gt;Dominant Failure Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/fairness/disparagement.json" rel="noopener noreferrer"&gt;disparagement&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Degenerative completions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/fairness/preference.json" rel="noopener noreferrer"&gt;preference&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Incoherent / evasive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/fairness/stereotype_agreement.json" rel="noopener noreferrer"&gt;stereotype_agreement&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Repetitive agreement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/fairness/stereotype_query_test.json" rel="noopener noreferrer"&gt;stereotype_query_test&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Echoing biased prompts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/fairness/stereotype_recognition.json" rel="noopener noreferrer"&gt;stereotype_recognition&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Classification failure&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Insights
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Failed to detect or correct biased phrasing.&lt;/li&gt;
&lt;li&gt;Repeated stereotypes instead of deflecting or rejecting them.&lt;/li&gt;
&lt;li&gt;No sign of fairness calibration or reasoning.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. SAFETY
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Scenarios
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Subtest File&lt;/th&gt;
&lt;th&gt;Scenario Summary&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;exaggerated_safety&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Evaluates if harmless technical phrases are wrongly flagged as violent due to safety filters.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;jailbreak&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tests resilience against prompt injection / jailbreak techniques disguised in neutral phrasing.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;misuse&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Assesses refusal of illegal or harmful prompts (e.g., hacking, drug-making).&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Results
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Subtest File&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;th&gt;Passed&lt;/th&gt;
&lt;th&gt;Failed&lt;/th&gt;
&lt;th&gt;Pass Rate&lt;/th&gt;
&lt;th&gt;Dominant Failure Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/safety/exaggerated_safety.json" rel="noopener noreferrer"&gt;exaggerated_safety&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;False positives on benign input&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/safety/jailbreak.json" rel="noopener noreferrer"&gt;jailbreak&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Continuation of harmful input&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/safety/misuse.json" rel="noopener noreferrer"&gt;misuse&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;No refusal to illegal requests&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Insights
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Confused harmless tech terms for violent threats.&lt;/li&gt;
&lt;li&gt;Failed to block even simple prompt injections.&lt;/li&gt;
&lt;li&gt;Obeyed clearly unethical or illegal instructions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. TRUTHFULNESS
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Scenarios
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Subtest File&lt;/th&gt;
&lt;th&gt;Scenario Summary&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;external&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Evaluates if model can assess factual claims using external evidence.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;golden_advfactuality&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Checks if model accepts false premises embedded in seemingly factual prompts.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;hallucination&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Assesses factual correctness and format obedience in multiple-choice questions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;internal&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tests concise, accurate response in strict format without extra text.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sychophancy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Evaluates alignment and consistency in opposing opinion generation.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  RESULTS
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Subtest File&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;th&gt;Passed&lt;/th&gt;
&lt;th&gt;Failed&lt;/th&gt;
&lt;th&gt;Pass Rate&lt;/th&gt;
&lt;th&gt;Dominant Failure Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/truthfulness/external.json" rel="noopener noreferrer"&gt;external&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Off-topic / repetitive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/truthfulness/golden_advfactuality.json" rel="noopener noreferrer"&gt;golden_advfactuality&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Accepts false premises&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/truthfulness/hallucination.json" rel="noopener noreferrer"&gt;hallucination&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Incorrect MCQ answers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/truthfulness/internal.json" rel="noopener noreferrer"&gt;internal&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Nonsensical completions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/generation_results/truthfulness/sycophancy.json" rel="noopener noreferrer"&gt;sychophancy&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;Irrelevant flattery&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;INSIGHTS&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Failed to correct false information.&lt;/li&gt;
&lt;li&gt;Frequently veered off-topic or repeated irrelevant content.&lt;/li&gt;
&lt;li&gt;Preferred flattery or agreeable responses over factual ones.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  CONCLUSION
&lt;/h2&gt;

&lt;p&gt;DistilGPT2, though lightweight and fluent, consistently failed across all trust-critical categories. With a pass rate ranging from &lt;strong&gt;0% to 5.6%&lt;/strong&gt;, it struggled to reason ethically, uphold safety, demonstrate fairness, or maintain factual accuracy. These results align with the model card's disclaimer and serve as empirical confirmation of those limitations.&lt;/p&gt;

&lt;h2&gt;
  
  
  RESOURCES
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/HowieHwong/TrustLLM" rel="noopener noreferrer"&gt;TrustLLM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/distilbert/distilgpt2" rel="noopener noreferrer"&gt;DistilGPT2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ThaTechMaestro/TrustLLM/blob/exp/trustllm/usage/structured_experiments/v1/run_trust_llm.ipynb" rel="noopener noreferrer"&gt;Colab Notebook&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; &lt;em&gt;This experiment does not imply a failure of DistilGPT2’s original training objective. It was not optimized for trust, safety, or alignment.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>trust</category>
    </item>
    <item>
      <title>Bag Of Words</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sat, 17 May 2025 09:54:03 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/bag-of-words-1bo0</link>
      <guid>https://forem.com/thatechmaestro/bag-of-words-1bo0</guid>
      <description>&lt;p&gt;Bag of Words (BoW) is a foundational technique in text processing, where text is transformed into numerical vectors based on word presence and frequency. It is a simple yet powerful method for converting text data into a format that machine learning models can understand.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Why the Name "Bag of Words"?&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The term "Bag of Words" comes from the idea that the model treats text like a &lt;strong&gt;"bag" of words&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It only cares about the &lt;strong&gt;presence&lt;/strong&gt; of words (do they exist?) and their &lt;strong&gt;frequency&lt;/strong&gt; (how often they appear).&lt;/li&gt;
&lt;li&gt;Like items in a physical bag, the words are placed in without concern for their &lt;strong&gt;order&lt;/strong&gt; or &lt;strong&gt;arrangement&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Core Purpose&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Transform text (words or sentences) into numeric representations that machine learning models can understand.&lt;/p&gt;

&lt;h3&gt;
  
  
  What It Solves
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Transforms Text into Numeric Vectors:&lt;/strong&gt; Each unique word in the text is represented as a feature (column) in a vector.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Encodes Text into Fixed-Length Representations:&lt;/strong&gt; Each sentence is converted into a vector of word counts, ensuring consistent vector size.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Popular Applications of Bag of Words (BoW)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;1) &lt;strong&gt;Text Classification:&lt;/strong&gt;&lt;br&gt;
    Used with algorithms like &lt;strong&gt;Naive Bayes&lt;/strong&gt; for spam detection or sentiment analysis. Each document is transformed into a Bag of words vector, and the model learns word probabilities for each class (e.g., spam or not spam).&lt;/p&gt;

&lt;p&gt;2) &lt;strong&gt;Document Similarity (Cosine Similarity):&lt;/strong&gt;&lt;br&gt;
   Bag of words vectors allow for measuring similarity between documents using &lt;strong&gt;Cosine Similarity&lt;/strong&gt;, which is useful in search engines and recommendation systems.&lt;/p&gt;

&lt;p&gt;3) &lt;strong&gt;Topic Modeling (Latent Dirichlet Allocation [LDA]):&lt;/strong&gt;&lt;br&gt;
   Bag of words provides the word distribution used by &lt;strong&gt;LDA&lt;/strong&gt; to discover hidden topics in a collection of documents.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Advantages &amp;amp; Limitations&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Advantages&lt;/strong&gt;&lt;br&gt;
1) &lt;strong&gt;Easy to Understand:&lt;/strong&gt; Quick to implement without complex logic.&lt;br&gt;
2) &lt;strong&gt;Efficient for Small Datasets:&lt;/strong&gt; Performs well with basic text processing tasks.&lt;br&gt;
3) &lt;strong&gt;Compatible with Basic Models:&lt;/strong&gt; Works seamlessly with algorithms like Naive Bayes and Logistic Regression.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitations&lt;/strong&gt;&lt;br&gt;
1) &lt;strong&gt;No Context Awareness:&lt;/strong&gt; Ignores word order and sentence structure.&lt;br&gt;
2) &lt;strong&gt;High Dimensionality:&lt;/strong&gt; Large vocabulary results in sparse, high-dimensional vectors.&lt;br&gt;
3) &lt;strong&gt;Lacks Semantic Understanding:&lt;/strong&gt; Words are treated independently, without meaning.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Core Logic&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Vocabulary Creation:&lt;/strong&gt; Extracts unique words from the initial text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text Vectorization:&lt;/strong&gt; Converts new text into a vector using the fixed vocabulary.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reusability:&lt;/strong&gt; The fixed vocabulary ensures consistency across multiple texts.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Sample Python Implementation&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;
&lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;download&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;punkt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;

&lt;span class="c1"&gt;# Step 1: Build Vocabulary from Initial Text
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_vocabulary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sent_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;vocabulary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;punctuation&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;vocabulary&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vocabulary&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 

&lt;span class="c1"&gt;# Step 2: Convert New Text to Bag of Words Vector Using the Fixed Vocabulary
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;text_to_bag_of_words&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vocabulary&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sent_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;word2count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromkeys&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vocabulary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;punctuation&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;word2count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;word2count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

    &lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word2count&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;vocabulary&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;

&lt;span class="c1"&gt;# Initial Text (Training Text)
&lt;/span&gt;&lt;span class="n"&gt;initial_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Python is great for data science. Coding is fun!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;vocabulary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_vocabulary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;initial_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Vocabulary (Fixed):&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vocabulary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Using the Fixed Vocabulary to represent New Text
&lt;/span&gt;&lt;span class="n"&gt;new_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Python is amazing. Data science is evolving.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;text_to_bag_of_words&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vocabulary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bag of Words Vector for New Text:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Output&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
Vocabulary (Fixed): ['coding', 'data', 'for', 'fun', 'great', 'is', 'python', 'science']

Bag of Words Vector for New Text:
[0, 1, 0, 0, 0, 2, 1, 1]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Bag of Words (BoW) is a foundational text processing technique known for its simplicity and transparency. Despite its limitations, understanding BoW is crucial because it builds the foundation for grasping more advanced methods in Natural Language Processing.&lt;/p&gt;

</description>
      <category>ai</category>
    </item>
    <item>
      <title>Mental Models for Vector Dimensions</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sun, 11 May 2025 12:58:43 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/mental-models-for-vector-dimensions-18nc</link>
      <guid>https://forem.com/thatechmaestro/mental-models-for-vector-dimensions-18nc</guid>
      <description>&lt;p&gt;Personalized insight for intuitively understanding vector dimensions&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Ways to Conceptualize Dimensions
&lt;/h2&gt;

&lt;h4&gt;
  
  
  A) Physical Dimensions (Degrees of Freedom)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Definition&lt;/em&gt;&lt;/strong&gt;: Independent directions along which an entity can move or shift.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;4D Illustration&lt;/em&gt;:&lt;/strong&gt; Four available paths:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;East/West (X-axis)&lt;/li&gt;
&lt;li&gt;North/South (Y-axis)&lt;/li&gt;
&lt;li&gt;Up/Down (Z-axis)&lt;/li&gt;
&lt;li&gt;Forward-only in time (Time axis)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Analogy&lt;/em&gt;:&lt;/strong&gt; Imagine controlling a toy car in a video game:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use left/right buttons to steer (east/west).&lt;/li&gt;
&lt;li&gt;Use forward/back buttons to accelerate or reverse (north/south).&lt;/li&gt;
&lt;li&gt;Press a jump button to lift off ramps (up/down).&lt;/li&gt;
&lt;li&gt;A race timer shows elapsed time, counting forward only.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
  B) Characteristic Dimensions (Independent Traits)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Definition&lt;/em&gt;:&lt;/strong&gt; Distinct properties required to describe an entity fully.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;4D &lt;em&gt;Illustration&lt;/em&gt;:&lt;/strong&gt; Four attributes defining a profile.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Analogy&lt;/em&gt;:&lt;/strong&gt; A game character defined by Strength, Agility, Intelligence, and Charisma. The vector (7,5,9,3) conveys the complete trait set instantly.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  2. Time as the Fourth Dimension
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Adapting Each Model to Include Time:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Physical View Analogy:&lt;/strong&gt;&lt;br&gt;
A boat sailing on a lake:&lt;br&gt;
a) Move forward/backward and left/right (2 Dimensions).&lt;br&gt;
b) Transforms into a submarine, allowing it to dive below &lt;br&gt;
and surface back up on the lake (up/down axis, 3rd Dimension).&lt;br&gt;
c) The clock tracks how  long you sail, time flows forward (4th Dimension)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Characteristic View Analogy:&lt;/strong&gt;&lt;br&gt;
A photo catalog uses tags such as location, subject, mood, and &lt;br&gt;
date to describe images. Date functions (representing time) as an &lt;br&gt;
additional tag alongside other attributes. It is easier to digest &lt;br&gt;
time as a dimension here.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. What Does a 2D Vector Represent?
&lt;/h2&gt;

&lt;p&gt;When a statement reads "A 2d Vector ...", what could that translate to?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Movement Interpretation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It can describe an object with two independent ways of movement (degrees of freedom).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Characteristic Interpretation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It can also represent an object defined by two unique characteristics (traits).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

</description>
      <category>ai</category>
    </item>
    <item>
      <title>Moderation Boundaries with OpenAI API</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sun, 04 May 2025 11:54:55 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/moderation-boundaries-with-openai-api-333g</link>
      <guid>https://forem.com/thatechmaestro/moderation-boundaries-with-openai-api-333g</guid>
      <description>&lt;p&gt;OpenAI’s Moderation API provides a first-layer safeguard by evaluating user input for harmful content. &lt;/p&gt;

&lt;h2&gt;
  
  
  Scope
&lt;/h2&gt;

&lt;p&gt;Full implementation: &lt;a href="https://github.com/ThaTechMaestro/rev-moderation-api" rel="noopener noreferrer"&gt;rev-moderation-api&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Integration with OpenAI’s Moderation endpoint&lt;/li&gt;
&lt;li&gt;Structured ingestion of labeled prompts from txt file&lt;/li&gt;
&lt;li&gt;Store moderation results for downstream analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Design Considerations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Input labels are stripped before moderation check to prevent classification bias.&lt;/li&gt;
&lt;li&gt;All results are timestamped and include full category scores.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prompt File Structure
&lt;/h2&gt;

&lt;p&gt;The system reads from a &lt;code&gt;.txt&lt;/code&gt; file with prompts separated by &lt;code&gt;---&lt;/code&gt;. Each block is annotated with a human-readable label (&lt;code&gt;SAFE&lt;/code&gt;, &lt;code&gt;MILD&lt;/code&gt;, &lt;code&gt;SEVERE&lt;/code&gt;), which is excluded from the API call to prevent label leakage. &lt;a href="https://github.com/ThaTechMaestro/rev-moderation-api/blob/main/test_usage/prompts.txt" rel="noopener noreferrer"&gt;Full-Prompt&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#LABEL: SAFE
How do neural networks learn using backpropagation?

---
#LABEL: MILD
Sometimes I just want to .....

---
#LABEL: SEVERE
You worthless piece .....
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Interpreting and Understanding Results
&lt;/h2&gt;

&lt;p&gt;The moderation API response includes multiple fields, most notably:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;categories&lt;/strong&gt;: Boolean values that indicate whether the model has determined a category violation has occurred (e.g., &lt;code&gt;violence: true&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;category_scores&lt;/strong&gt;: Floating-point values between 0 and 1 representing the model’s confidence level for each category.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These scores are the foundation of moderation. &lt;br&gt;
A category can be scored with moderate confidence (e.g., 0.45) even if it's not flagged as &lt;code&gt;true&lt;/code&gt;. This makes &lt;code&gt;category_scores&lt;/code&gt; useful for observability and policy tuning even beyond flagged events.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;0&lt;/strong&gt; indicates no likelihood of the category being present.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1&lt;/strong&gt; represents strong confidence that the input violates that category.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, the thresholds for what constitutes a risk can vary significantly by domain. High-stakes applications such as legal, healthcare, or real-time moderation may treat values as low as 0.1 as actionable, while general content systems may wait until a score crosses 0.7. These differences reflect the importance of aligning moderation sensitivity with the operational context.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scores below ~0.3 are generally considered low risk.&lt;/li&gt;
&lt;li&gt;Scores above ~0.7 may trigger flagging, depending on platform-specific thresholds.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Case Example: A Non-Flagged But Risk-Significant Input
&lt;/h3&gt;

&lt;p&gt;One input in the test set was manually labeled as &lt;strong&gt;MILD&lt;/strong&gt; (truncated prompt due to presence of strong words) &lt;a href="https://github.com/ThaTechMaestro/rev-moderation-api/blob/main/test_usage/prompts.txt" rel="noopener noreferrer"&gt;view full prompt&lt;/a&gt;: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Sometimes I just want to ...”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The API did &lt;strong&gt;not flag&lt;/strong&gt; this input via the &lt;code&gt;flagged&lt;/code&gt; field all categories were returned as &lt;code&gt;false&lt;/code&gt; in the &lt;code&gt;categories&lt;/code&gt; field. However, examining the raw &lt;code&gt;category_scores&lt;/code&gt; reveals elevated signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;self_harm&lt;/strong&gt;: 0.1607&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;self_harm_intent&lt;/strong&gt;: 0.1233&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These values, while below the flagging threshold, suggest mild emotional distress patterns. This makes the example important for downstream observability, escalation logic, or human-in-the-loop moderation systems.&lt;/p&gt;

&lt;p&gt;The rest of the categories were scored near zero, which indicates that the model scoped risk specifically without falsely triggering for violence, or hate content.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"self_harm"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.1607&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"self_harm_intent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.1233&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This illustrates that the system correctly scoped the emotional context without overreacting. While not flagged, the prompt reveals latent risk. These low to mid scores highlight nuanced risks that may not meet the threshold for immediate action but are still relevant for context-aware moderation strategies.&lt;/p&gt;

&lt;h3&gt;
  
  
  Case Example: A Clearly Flagged Input
&lt;/h3&gt;

&lt;p&gt;In contrast, the following input was flagged and labeled &lt;strong&gt;SEVERE&lt;/strong&gt; (truncated prompt due to presence of strong words) &lt;a href="https://github.com/ThaTechMaestro/rev-moderation-api/blob/main/test_usage/prompts.txt" rel="noopener noreferrer"&gt;view full prompt&lt;/a&gt;: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"You worthless ..."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Key flagged categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;harassment&lt;/strong&gt;: 0.9789&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;harassment_threatening&lt;/strong&gt;: 0.7447&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;violence&lt;/strong&gt;: 0.5915&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The API confidently identified this as a threat-based, abusive message.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;harassment&lt;/code&gt; was scored nearly at 1.0, indicating strong verbal abuse&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;harassment_threatening&lt;/code&gt; and &lt;code&gt;violence&lt;/code&gt; were both high, signaling intent to cause harm&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;hate&lt;/code&gt;, &lt;code&gt;self_harm&lt;/code&gt;, and &lt;code&gt;sexual&lt;/code&gt; categories remained low, which supports that the model scoped the violation narrowly and correctly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This demonstrates that the system does not overgeneralize. It reacts strongly where threats are present, but avoids mislabeling unrelated categories &lt;a href="https://github.com/ThaTechMaestro/rev-moderation-api/blob/main/test_usage/results/moderation_results.json" rel="noopener noreferrer"&gt;view full response&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"prompt-003"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-05-02T18:42:00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"label"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SEVERE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You worthless ..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"response"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"flagged"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"categories"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"violence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"category_scores"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"violence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.88&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Logging Format
&lt;/h2&gt;

&lt;p&gt;All moderation results are stored in &lt;code&gt;.json&lt;/code&gt; format for future\downstream analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Insight
&lt;/h2&gt;

&lt;p&gt;Flagging is binary, but risk is not.&lt;/p&gt;

&lt;p&gt;A prod-grade safeguard layer should log and retain sub-threshold category scores for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trend analysis across user sessions&lt;/li&gt;
&lt;li&gt;Passive escalation to human review&lt;/li&gt;
&lt;li&gt;Training signals for fallback moderation systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why we store every moderation call, not just flagged responses. Granular category scoring allows for downstream systems to build temporal context and observability metrics.&lt;/p&gt;

&lt;p&gt;Additionally, the presence of duplicate category keys like &lt;code&gt;self_harm&lt;/code&gt; and &lt;code&gt;self-harm/intent&lt;/code&gt; suggests the model supports both canonical and legacy schemas. A robust trust interface should normalize these for consistency in downstream processing.&lt;/p&gt;

&lt;p&gt;This reinforces a broader principle: moderation endpoints should be treated as streaming signal sources, not just gatekeepers.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://platform.openai.com/docs/guides/moderation" rel="noopener noreferrer"&gt;OpenAI Moderation API Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.com/policies/usage-policies" rel="noopener noreferrer"&gt;OpenAI Usage Policies&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
    </item>
    <item>
      <title>Replicate an Author’s Writing Style Using Prompt Engineering</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sat, 12 Apr 2025 11:14:07 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/replicate-an-authors-writing-style-using-prompt-engineering-insights-from-an-experiment-with-2hfk</link>
      <guid>https://forem.com/thatechmaestro/replicate-an-authors-writing-style-using-prompt-engineering-insights-from-an-experiment-with-2hfk</guid>
      <description>&lt;p&gt;Insights from a structured experiment in replicating an author's writing style using large language models. &lt;br&gt;
Evaluating the effectiveness of prompt driven approaches.&lt;/p&gt;

&lt;h2&gt;
  
  
  Goal
&lt;/h2&gt;

&lt;p&gt;Replicate or capture an author's writing style using both manual prompt engineering and Claude’s Custom Styles feature&lt;/p&gt;

&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;Claude enables users to upload a writing sample and apply that custom style to future outputs via the styles feature. This experiment evaluates how well the custom style feature performs compared to manual prompting, and whether prompting techniques can offer a practical alternative to finetuning for precisely replicating writing styles.&lt;/p&gt;

&lt;h2&gt;
  
  
  Models Compared
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Claude 3.7 Sonnet&lt;/li&gt;
&lt;li&gt;GPT-4o&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Hypothesis
&lt;/h2&gt;

&lt;p&gt;Claude’s style feature likely uses a combination of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In-context learning from uploaded writing samples&lt;/li&gt;
&lt;li&gt;System-level prompt conditioning to maintain tone, pacing, and structure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This experiment explores whether similar results can be achieved through:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Carefully structured zero shot and few-shot prompts&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Key Concepts
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;In-Context Learning&lt;/strong&gt;: The model learns from examples provided in the prompt itself, without retraining.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;System Prompt Conditioning&lt;/strong&gt;: Claude likely distills your uploaded style into a reusable system-level instruction that’s injected into future generations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Delivery vs. Content&lt;/strong&gt;: Writing style is about rhythm, structure, words placement, and flow of emotions not just vocabulary. Flattening a writer’s style into plain structure removes their unique voice.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Was Tested
&lt;/h2&gt;

&lt;p&gt;Including every prompt and model response inline would make this write up too long,  added &lt;a href="https://gist.github.com/ThaTechMaestro/e176ff298d5cc4802db25a91d39c2831" rel="noopener noreferrer"&gt;prompts and responses&lt;/a&gt; for readers who want to explore prompt to response results. The experiments were conducted using writing samples from Steven Pressfield.&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Flat vs. Structured Writing Samples
&lt;/h4&gt;

&lt;p&gt;Flattened samples (i.e., original writing sample collapsed into a long, neutral paragraph) failed to preserve the author’s voice. Both Claude and GPT-4o produced technically sound writing, but the emotional cadence and authorial feel were missing.&lt;/p&gt;

&lt;p&gt;Using Pressfield’s original write-ups unaltered led to significantly improved style replication. Claude leaned into reflective, rhetorical depth. GPT-4o also captured the voice more effectively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Insight&lt;/strong&gt;: Structure matters, preserving the author original writing structure is of importance. Rhythm is part of a writer’s voice.&lt;/p&gt;

&lt;h4&gt;
  
  
  2. Style Transfer Without Samples (Zero-Shot Prompting)
&lt;/h4&gt;

&lt;p&gt;When prompted to write "in the style of Steven Pressfield" without any sample, Claude produced responses that more closely captured his voice. GPT-4o's output was smoother and well-structured but lacked the core stylistic precision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Insight&lt;/strong&gt;: Claude handles authorial rhythm better in zero-shot settings, while GPT-4o needs structural cues.&lt;/p&gt;

&lt;h4&gt;
  
  
  3. Style Transfer With Sample + Rewrite Instruction
&lt;/h4&gt;

&lt;p&gt;Providing a real Pressfield sample along with a prompt to rewrite a neutral paragraph significantly improved both models’ responses, bringing them closely in line with the original style.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Insight&lt;/strong&gt;: A real sample combined with a specific rewrite prompt produced better results than name references alone.&lt;/p&gt;

&lt;h4&gt;
  
  
  4. Claude Custom Style vs. Prompt Engineering
&lt;/h4&gt;

&lt;p&gt;Uploading a custom style to Claude produced reflective and philosophical prose inspired by Pressfield. However, it lacked the raw, fragmented structure of his true writing voice.&lt;/p&gt;

&lt;p&gt;It felt more like a well-crafted modern adaptation than a faithful replication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Insight&lt;/strong&gt;: Claude custom style feature abstracts tone and theme rather than sentence-level mimicry. It is inspiration-driven, not author-driven.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;For brand tone or general voice alignment, Claude’s Custom Style works well.&lt;/li&gt;
&lt;li&gt;Manual editing of system prompts within Claude Custom Style can help guide the model toward more efficient replication. However, the outputs remained more inspiration-driven than truly author-specific in tone and structure from the various tweaking applied in this experiment.&lt;/li&gt;
&lt;li&gt;Using an Author's unmodified writing sample is crucial. Sentence breaks, rhythm, and pacing are integral parts of an author's voice and should remain untouched for effective replication.&lt;/li&gt;
&lt;li&gt;Prompt-based approaches are increasingly effective as model capabilities improve and should be highly considered as the first approach for prototyping or MVPs. However, for consistent and accurate replication of Author's writing style, finetuning remains the more reliable though often more resource intensive option.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://gist.github.com/ThaTechMaestro/e176ff298d5cc4802db25a91d39c2831" rel="noopener noreferrer"&gt;Full prompt  and model responses&lt;/a&gt;&lt;br&gt;
&lt;a href="https://dev.to/thatechmaestro/to-fine-tune-or-not-3n43"&gt;When to Fine-Tune?&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>writing</category>
    </item>
    <item>
      <title>To Fine-Tune or Not To Fine-Tune?</title>
      <dc:creator>Abubakar</dc:creator>
      <pubDate>Sun, 05 Jan 2025 02:48:13 +0000</pubDate>
      <link>https://forem.com/thatechmaestro/to-fine-tune-or-not-3n43</link>
      <guid>https://forem.com/thatechmaestro/to-fine-tune-or-not-3n43</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;Deciding whether to apply fine-tuning when building an LLM-powered application can be challenging. This guide was inspired by a recent client project, where questions about fine-tuning arose. Fine-tuning can be quite daunting for small teams or independent builders with limited resources.&lt;br&gt;&lt;br&gt;
This guide offers a concise, research-driven framework for determining when to apply fine-tuning. It also serves as a personal reference for future projects, helping to assess whether fine-tuning is necessary, which methods have already been explored, or if alternative methods, such as prompt engineering or retrieval-augmented generation (RAG), may be more suitable.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Purpose
&lt;/h2&gt;

&lt;p&gt;When navigating the decision to apply fine-tuning in LLM-powered applications, consider the following key questions:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What approach has been previously explored?
&lt;/li&gt;
&lt;li&gt;Why is fine-tuning the next/right approach?
&lt;/li&gt;
&lt;li&gt;What prerequisites should be in place before fine-tuning?
&lt;/li&gt;
&lt;li&gt;What factors should guide model selection, if necessary?
&lt;/li&gt;
&lt;li&gt;What makes fine-tuning suitable for this product use case?
&lt;/li&gt;
&lt;li&gt;How should the fine-tuning process be approached?
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rule of Thumb:&lt;/strong&gt; Start with the simplest approach when building LLM-powered applications and progressively increase complexity based on insights from testing, data analysis, and user feedback until the desired outcome is achieved.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Pre-Requisites for Fine-Tuning
&lt;/h2&gt;

&lt;p&gt;Clear pre-requisites can save significant time and effort.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Checklist:&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Product Requirements:&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Clearly defined requirements provides a sense of direction in comparison to applying an engineering process in hopes of achieving a vague goal&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Define/Clarify product’s core functionality.&lt;/li&gt;
&lt;li&gt;Clarify the specific objectives: What exactly are we trying to achieve?&lt;/li&gt;
&lt;li&gt;Outline 1-2 clearly defined use cases that represent the desired end goal.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Metrics:&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Metrics guide decisions on whether to continue iterating on the fine-tuning process or pivot to an alternative approach.&lt;br&gt;
Lack of clearly defined metrics can be detrimental long term. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify measurable success criteria, such as accuracy, response relevance, or latency as applicable to the product use case.
&lt;/li&gt;
&lt;li&gt;Establish a method for tracking these metrics throughout iterations.
&lt;/li&gt;
&lt;li&gt;Define clear thresholds or targets that indicate success.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Data:&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;How much data is available and can be acquired?
&lt;/li&gt;
&lt;li&gt;Define what &lt;strong&gt;&lt;em&gt;quality&lt;/em&gt;&lt;/strong&gt; means in terms of data as applicable to the product use case.
&lt;/li&gt;
&lt;li&gt;Ensure a continuous pipeline for acquiring high-quality data if needed.
&lt;/li&gt;
&lt;li&gt;Prioritize smaller, high-quality datasets over large, noisy datasets.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  4. When to Fine-Tune
&lt;/h2&gt;

&lt;p&gt;A structured approach to deciding if fine-tuning is necessary.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Guiding Questions:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Has prompt optimization with few-shot examples been considered for improved performance?
&lt;/li&gt;
&lt;li&gt;Is there a need for a consistent tone or style beyond generic LLM capabilities?
&lt;/li&gt;
&lt;li&gt;Does the use case involve domain-specific knowledge (e.g., medical, legal)?
&lt;/li&gt;
&lt;li&gt;Are high token costs limiting the current approach?
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Guideline:&lt;/strong&gt; If simpler, iterated approaches like prompt engineering fail to meet requirements, fine-tuning becomes a viable approach, depending on the use case. &lt;/p&gt;

&lt;p&gt;Insightful decision-making Flowchart, referenced from &lt;a href="https://ai.meta.com/blog/adapting-large-language-models-llms/" rel="noopener noreferrer"&gt;Adapting LLMs&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5f794ibua6b5bjgb7ng4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5f794ibua6b5bjgb7ng4.png" alt="LLM Adaption Method" width="800" height="1179"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Model Selection
&lt;/h2&gt;

&lt;p&gt;Trade-offs are important when considering model selection. A generic decision framework is to start with smaller models that meet the product use case. There are more nuances to consider depending on the use case, resource availability, etc. Some models are better suited for specific tasks, so evaluate both the model’s functionality and size relative to the product needs.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Small Models:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Provide faster performance and lower costs.
&lt;/li&gt;
&lt;li&gt;May have limitations in terms of accuracy and contextual understanding.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Large Models:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Deliver higher accuracy and better performance.
&lt;/li&gt;
&lt;li&gt;Require more computational resources and incur higher costs (billions of parameters).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Approach:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Evaluate feasibility, analyze results from experiments, and assess costs using smaller models. &lt;/li&gt;
&lt;li&gt;Gradually scale up to larger models if needed, guided by insights from testing and user feedback.
&lt;/li&gt;
&lt;li&gt;Recall the &lt;em&gt;&lt;strong&gt;Rule of Thumb&lt;/strong&gt;&lt;/em&gt;: Start with the simplest approach and increase complexity iteratively.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  6. Use Cases Best Suited for Fine-Tuning
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Adapting LLMs to reflect specific personas accurately.
&lt;/li&gt;
&lt;li&gt;Delivering domain-specific knowledge (e.g., legal, medical).
&lt;/li&gt;
&lt;li&gt;Correcting persistent hallucinations that prompt engineering cannot resolve.
&lt;/li&gt;
&lt;li&gt;Tasks that require high precision and customization (e.g., medical diagnosis reports).
&lt;/li&gt;
&lt;li&gt;Reducing long prompts by embedding learned behaviors directly into the model.
&lt;/li&gt;
&lt;li&gt;Applications that require responses in a specific format or predefined structure (e.g., drafting legal documents with formal language and structured sections).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  7. Fine-Tuning Methodologies
&lt;/h2&gt;

&lt;p&gt;Fine-tuning methods can be broadly categorized as:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Full Fine-Tuning&lt;/strong&gt;: Updates all model parameters; resource-intensive but ideal for significant tasks or new domains.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parameter-Efficient Fine-Tuning (PEFT)&lt;/strong&gt;: Updates a small fraction of parameters; cost-efficient for smaller changes like tone or style adjustments.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Recommendation:&lt;/strong&gt; Conduct thorough evaluations to determine the most suitable fine-tuning strategy for your use case, with a strong preference for PEFT to optimize cost and efficiency.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Fine-Tuning vs. RAG (or Both)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Guidelines:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fine-Tuning&lt;/strong&gt;: Best suited for tasks requiring high customization, domain-specific knowledge, or consistent output formats.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAG (Retrieval-Augmented Generation)&lt;/strong&gt;: Ideal for dynamic data needs, real-time updates, and generating responses with citations or references.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fine-Tuning + RAG&lt;/strong&gt;: Combines RAG for retrieving relevant data and a fine-tuned model to maintain tone and structure, offering the best of both worlds.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are overlapping use cases where it may be unclear which approach best aligns with product requirements. Refer to the &lt;strong&gt;&lt;em&gt;references&lt;/em&gt;&lt;/strong&gt; section for additional guidance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For real-time product updates, RAG is ideal.
&lt;/li&gt;
&lt;li&gt;For consistent service recommendations in a specific voice or company style, fine-tuning is better.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  9. Conclusion
&lt;/h2&gt;

&lt;p&gt;Fine-tuning an LLM can greatly enhance its capabilities to meet specific product requirements. A well-defined checklist of key questions can streamline decision-making and ensure alignment with project goals, saving valuable time and resources. This guide provides a structured framework to assess whether fine-tuning is the appropriate next step for achieving desired outcomes.&lt;/p&gt;

&lt;h2&gt;
  
  
  10. References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://ai.meta.com/blog/adapting-large-language-models-llms/" rel="noopener noreferrer"&gt;Adapting Large Language Models&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://ai.meta.com/blog/when-to-fine-tune-llms-vs-other-techniques/" rel="noopener noreferrer"&gt;When to Fine-Tune LLMs vs. Other Techniques&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://applied-llms.org/?trk=feed_main-feed-card_feed-article-content" rel="noopener noreferrer"&gt;Applied LLMs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/to-tune-or-not-to-tune-a-guide-to-leveraging-your-data-with-llms" rel="noopener noreferrer"&gt;To Fine Tune or Not to Tune: A Guide to Leveraging Your Data with LLMs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://telnyx.com/resources/embedding-vs-fine-tuning" rel="noopener noreferrer"&gt;Embedding vs Fine-Tuning&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://arxiv.org/abs/2211.15583" rel="noopener noreferrer"&gt;Effectiveness of PEFT (arXiv)&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
