<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Zach A. Thomas</title>
    <description>The latest articles on Forem by Zach A. Thomas (@dysmento).</description>
    <link>https://forem.com/dysmento</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F808902%2F42e1c400-3bcd-4b4e-bff0-f3ff1b968069.jpeg</url>
      <title>Forem: Zach A. Thomas</title>
      <link>https://forem.com/dysmento</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dysmento"/>
    <language>en</language>
    <item>
      <title>First Look at Amazon DataZone</title>
      <dc:creator>Zach A. Thomas</dc:creator>
      <pubDate>Thu, 30 Mar 2023 14:05:41 +0000</pubDate>
      <link>https://forem.com/aws-builders/first-look-at-amazon-datazone-35d5</link>
      <guid>https://forem.com/aws-builders/first-look-at-amazon-datazone-35d5</guid>
      <description>&lt;p&gt;AWS has released a public preview of an impressive suite of capabilities called &lt;a href="https://aws.amazon.com/datazone/"&gt;Amazon DataZone.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;DataZone targets a thorny problem in tapping the potential of business data: data is often trapped within the corporate silo where it was created, e.g. Sales, Operations, Marketing, etc. Even when we succeed in breaking through a silo to get access to some other group's data, we face a host of additional problems: cleaning the data, normalizing data definitions, questions of compliance (encryption, the handling of personally identifying information, retention policy, etc.), and making the data easy to use.&lt;/p&gt;

&lt;p&gt;Businesses have tried to solve this problem by centralizing data access in a new group, sometimes called Data Science. This group is in charge of the data lake, and of cleaning and describing all the data sources. This is &lt;em&gt;almost&lt;/em&gt; a good idea, but it creates an unfortunate bottleneck: the agility of the business is hampered by the need to get each data source tidied and blessed by an overworked group of people who are not domain experts in any of the data sources they're made responsible for. The capabilities of the Data Science team are stretched thin, and they don't scale with the appetite for business data.&lt;/p&gt;

&lt;p&gt;A new, decentralized approach to these problems was described by Zhamak Dehghani from ThoughtWorks. It's called &lt;a href="https://martinfowler.com/articles/data-mesh-principles.html"&gt;Data Mesh,&lt;/a&gt; and it's based on the idea that each data source should be treated as a data &lt;em&gt;product&lt;/em&gt;, and the owners of the data source are &lt;em&gt;also&lt;/em&gt; the owners of the data product. You can think of a data source as a kind of API, and many groups can contribute to the shared catalog, including providing the documentation and access rules. The final piece of the Data Mesh puzzle is that governance should be federated rather than centralized.&lt;/p&gt;

&lt;p&gt;Amazon appears to be taking this approach seriously, and DataZone is a system of systems for connecting producers of data with consumers through a hub that is self-service for both kinds of participants. From their product page, the key features are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Catalog: search for published data, request access, and start working with your data in days instead of weeks.&lt;/li&gt;
&lt;li&gt;Projects: collaborate with teams through data assets, and manage and monitor data assets across projects.&lt;/li&gt;
&lt;li&gt;Portal: access analytics with a personalized view of data assets through a web-based application or API.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's clear you still need the data specialists, but they can be the ones empowering disparate groups with these new capabilities so they can begin to be autonomous. It's consistent with the concept of a platform team from &lt;a href="https://teamtopologies.com/"&gt;Team Topologies.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I look forward to getting my hands dirty with Amazon's new offering. I don't know how good their implementation is yet, but I have become convinced that this architecture is a source of significant competitive advantage.&lt;/p&gt;

&lt;p&gt;(image credit Rhk111, CC BY-SA 4.0 &lt;a href="https://creativecommons.org/licenses/by-sa/4.0"&gt;https://creativecommons.org/licenses/by-sa/4.0&lt;/a&gt;, via Wikimedia Commons)&lt;/p&gt;

</description>
      <category>aws</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Why Microservices Are Not Just Another Fad</title>
      <dc:creator>Zach A. Thomas</dc:creator>
      <pubDate>Sat, 11 Feb 2023 21:32:09 +0000</pubDate>
      <link>https://forem.com/dysmento/why-microservices-are-not-just-another-fad-1n1c</link>
      <guid>https://forem.com/dysmento/why-microservices-are-not-just-another-fad-1n1c</guid>
      <description>&lt;p&gt;I don't like chasing technology fads. It's important to maintain a healthy level of skepticism, so you don't waste your time or even make some expensive mistakes.&lt;/p&gt;

&lt;p&gt;If your skepticism is too well-developed, however, you might overlook some solutions to your thorny problems.&lt;/p&gt;

&lt;p&gt;The debate about microservices is important, insofar as they are &lt;em&gt;not&lt;/em&gt; a silver bullet, and you could wind up in trouble if you think you can just "sprinkle some microservices on it" and then lean back and watch the magic happen. See &lt;a href="https://martinfowler.com/bliki/MicroservicePrerequisites.html" rel="noopener noreferrer"&gt;this post&lt;/a&gt; by Martin Fowler about what kind of processes you should have in place before it makes sense to try to introduce microservices to your architecture.&lt;/p&gt;

&lt;p&gt;But sometimes our debates are unproductive, because we get hung up on less relevant details, like how many lines of code makes it "micro," or whether serverless makes microservices obsolete. For me, that's like arguing over what color the paint job should be when you really should be deciding whether you're getting a car, a truck, or a city bus.&lt;/p&gt;

&lt;p&gt;The real power unlocked by microservices is not about the particulars of the technology choices you make. What it's really about is enabling a workflow whereby a small team can deliver at whatever cadence they like best with &lt;em&gt;no handoffs&lt;/em&gt; required to any other group. I call this partitioning the value streams. In waterfall workflows, we throw things over the wall and wait. We might get some feedback &lt;em&gt;days&lt;/em&gt; later and then start the cycle again. One of the great paradoxes of software development is that we can deliver faster and reduce risk &lt;em&gt;at the same time&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Imagine a company that has grown quickly, but is still not large. They have a monolithic application that is released once a week, and is feature-frozen a week before that. They're up to about two dozen full-time software engineers, so they're starting to feel the coordination overhead of working on a single large component. Here's a partial list of problems routinely faced by this group that are also solved by partitioning the value streams:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;someone writes an inefficient database query, and the whole business goes down&lt;/li&gt;
&lt;li&gt;the CI/CD build takes ten minutes to run, and if any change is made to &lt;code&gt;main&lt;/code&gt; while that happens, it has to start over&lt;/li&gt;
&lt;li&gt;people race to get their changes in before the feature freeze. If you make the cutoff, your changes will go live in a week. If you &lt;em&gt;miss&lt;/em&gt; the cutoff, it takes two weeks to ship&lt;/li&gt;
&lt;li&gt;a bug in a minor feature can take the whole business down, so a change to anything necessitates performing hours of regression testing on &lt;em&gt;everything&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;some engineers want to be consulted in case something they wrote is going to be changed&lt;/li&gt;
&lt;li&gt;the monolith can scale horizontally, but it is not possible to scale high-demand APIs independently of low-demand APIs&lt;/li&gt;
&lt;li&gt;innovating is so risky that experiments are discouraged and the default answer is "no"&lt;/li&gt;
&lt;li&gt;people responding to incidents are unlikely to have worked on the part of the application that is misbehaving&lt;/li&gt;
&lt;li&gt;availability of the system is only as good as the least available subsystem&lt;/li&gt;
&lt;li&gt;the dependency graph of the code is unreadable&lt;/li&gt;
&lt;li&gt;the codebase suffers from high coupling and low cohesion&lt;/li&gt;
&lt;li&gt;since deploys are not from trunk, it's necessary to forward port patches from the release candidate to main&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As long as there are organizations addressing these problems (and others) with microservices, the argument that it's a technology fad does not hold any water.  &lt;/p&gt;

</description>
      <category>gratitude</category>
    </item>
    <item>
      <title>AWS Provisioning Three Ways</title>
      <dc:creator>Zach A. Thomas</dc:creator>
      <pubDate>Sun, 10 Apr 2022 19:50:19 +0000</pubDate>
      <link>https://forem.com/aws-builders/aws-provisioning-three-ways-5c8b</link>
      <guid>https://forem.com/aws-builders/aws-provisioning-three-ways-5c8b</guid>
      <description>&lt;p&gt;I just read the news about &lt;a href="https://aws.amazon.com/blogs/aws/announcing-aws-lambda-function-urls-built-in-https-endpoints-for-single-function-microservices/"&gt;AWS Lambda Function URLs&lt;/a&gt; and I wanted to try them out ASAP. In a nutshell, a Function URL gives any lambda you've got an instant public HTTPS endpoint. This was always possible by configuring an API Gateway, but that is quite complex and provides features that can be overkill if you just want a webhook or something simple.&lt;/p&gt;

&lt;p&gt;Most AWS tutorials I find show you how to do things step-by-step in the console. While that's helpful for starting out, what you want in the long run is infrastructure-as-code, which means all the resources in your application should be described in files that can be checked into version control.&lt;/p&gt;

&lt;p&gt;I stood up a quick demo in &lt;a href="https://github.com/dysmento/the-ways-of-aws-infra-as-code"&gt;this GitHub repo&lt;/a&gt; that shows setting up a simple serverless application (with Function URL!) in three different ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pure AWS CLI commands. In this method, you write a shell script that creates resources using commands such as &lt;code&gt;aws iam create-role&lt;/code&gt; and &lt;code&gt;aws lambda create-function&lt;/code&gt;. This is the "no-frills" method.&lt;/li&gt;
&lt;li&gt;Serverless Framework. There's a lot to like about the &lt;a href="https://serverless.com"&gt;Serverless Framework&lt;/a&gt;. It provides commands for deploying to different stages (think dev/test/prod) as well as invoking functions locally for testing. This method is a front end for &lt;a href="https://aws.amazon.com/cloudformation/"&gt;AWS CloudFormation&lt;/a&gt;, a pioneer of the infrastructure-as-code wave. One downside is that provisioning takes a bit longer, so the feedback loop when making code updates is not as quick as I'd like.&lt;/li&gt;
&lt;li&gt;Terraform. &lt;a href="https://terraform.io"&gt;Terraform from HashiCorp&lt;/a&gt; is a big deal. They have pulled off an amazing feat: unifying all the clouds, and over 1,000 different providers, into a single provisioning language. One of my favorite things about Terraform is that it shows you the differences between your application as configured and how it &lt;em&gt;really&lt;/em&gt; is in the real world. Configuration drift is real, and it is important to stay on top of it.&lt;/li&gt;
&lt;/ol&gt;
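&lt;p&gt;As a sketch of the first method, the core CLI calls might look like this (the role name, function name, account ID, and artifact path are all hypothetical placeholders):&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch of the "no-frills" CLI method; names, paths, and the account ID are placeholders.
set -euo pipefail

# 1. Create an execution role that the Lambda service can assume
aws iam create-role \
  --role-name demo-lambda-role \
  --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"lambda.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

# Give the role basic CloudWatch logging permissions
aws iam attach-role-policy \
  --role-name demo-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

# 2. Create the function from a zipped artifact
aws lambda create-function \
  --function-name demo-fn \
  --runtime nodejs18.x \
  --handler index.handler \
  --zip-file fileb://function.zip \
  --role arn:aws:iam::123456789012:role/demo-lambda-role

# 3. Attach a public Function URL (no API Gateway required)
aws lambda create-function-url-config \
  --function-name demo-fn \
  --auth-type NONE

# 4. Allow unauthenticated invocation through the URL
aws lambda add-permission \
  --function-name demo-fn \
  --action lambda:InvokeFunctionUrl \
  --principal "*" \
  --function-url-auth-type NONE \
  --statement-id FunctionURLAllowPublicAccess
```

&lt;p&gt;The repo linked above has the full working script; the point here is just how few moving parts the CLI route involves.&lt;/p&gt;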

&lt;p&gt;So far, I haven't even mentioned that the demo code I wrote is in &lt;a href="https://clojure.org"&gt;Clojure&lt;/a&gt;, my favorite language. Usually, it's quite a bit of effort to deploy Clojure, because you need a compiler and build pipeline to get your artifacts ready to run. Thanks to a lightweight interpreter called &lt;a href="https://github.com/babashka/nbb"&gt;nbb&lt;/a&gt;, you can interpret Clojure in a lambda as easily as you can interpret JavaScript. If you're not into Clojure, it's not a problem. These three forms of provisioning lambdas work with any of the lambda runtimes (node, python, java, go, ruby, and .NET).&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
    </item>
    <item>
      <title>Kafka Connect Plugins With Clojure</title>
      <dc:creator>Zach A. Thomas</dc:creator>
      <pubDate>Wed, 06 Apr 2022 17:02:29 +0000</pubDate>
      <link>https://forem.com/dysmento/kafka-connect-plugins-with-clojure-44ij</link>
      <guid>https://forem.com/dysmento/kafka-connect-plugins-with-clojure-44ij</guid>
      <description>&lt;p&gt;&lt;a href="https://docs.confluent.io/platform/current/connect/index.html"&gt;Kafka Connect&lt;/a&gt; has an extension mechanism based on plugins. By implementing certain interfaces, or extending certain classes, you can create one of eight types of extensions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Connector&lt;/code&gt; for piping data into or out of Kafka via external systems&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Converter&lt;/code&gt; for translating between Kafka Connect's runtime data format and &lt;code&gt;byte[]&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;HeaderConverter&lt;/code&gt; same, but for headers&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Transformation&lt;/code&gt; for modifying messages as they move through a connector&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Predicate&lt;/code&gt; used to conditionally apply a &lt;code&gt;Transformation&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ConfigProvider&lt;/code&gt; for integrating a source of key/value properties used in configuration&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ConnectRestExtension&lt;/code&gt; for adding your own JAX-RS resources (filters, REST endpoints, etc.) to the Kafka Connect API&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ConnectorClientConfigOverridePolicy&lt;/code&gt; for enforcing a policy on overriding of client configs via the connector configs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some of these, like the client config override, are pretty obscure, but connectors and transformations are the meat and potatoes of Kafka Connect, and a custom config provider is like magic for supplying secrets (e.g. passwords) to your configuration.&lt;/p&gt;

&lt;h2&gt;Adding Clojure to the Mix&lt;/h2&gt;

&lt;p&gt;The Kafka Connect extension mechanism is pretty great, but if your language of choice is Clojure, can you still create plugins for Kafka Connect? You can! There are a couple of hoops to jump through, which I will describe.&lt;/p&gt;

&lt;h2&gt;Example Code: Shouting Transform&lt;/h2&gt;

&lt;p&gt;This example is not something you would really use (unless you LOVE ALL CAPS), but it shows all the parts and configuration for a Kafka Connect transformer (also referred to as a single message transform, or SMT).&lt;/p&gt;

&lt;p&gt;This example will examine the value of string messages, and convert certain words to all caps, whichever words you specify in the configuration to the plugin.&lt;/p&gt;

&lt;p&gt;e.g., if you set the &lt;code&gt;shouting-words&lt;/code&gt; property to "love,nice,shouting" then this message:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hello, nice world! Do you love shouting?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Will be transformed into:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hello, NICE world! Do you LOVE SHOUTING?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;Use &lt;code&gt;gen-class&lt;/code&gt;&lt;/h2&gt;

&lt;p&gt;Clojure has other ways to implement a Java interface, like &lt;code&gt;reify&lt;/code&gt; and &lt;code&gt;proxy&lt;/code&gt;, but we need &lt;code&gt;gen-class&lt;/code&gt; because Kafka Connect configuration expects us to supply it with a class name for it to instantiate. &lt;code&gt;gen-class&lt;/code&gt; allows us to create a class with the name of our choice. Here's the full code of the shouting transformer:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight clojure"&gt;&lt;code&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;ns&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;fizzy.plugins.shouting&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:require&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;clojure.string&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;:refer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;replace&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;upper-case&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;org.apache.kafka.common.config&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ConfigDef&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ConfigDef$Type&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ConfigDef$Importance&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;org.apache.kafka.connect.data&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Schema$Type&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:gen-class&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="no"&gt;:name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;fizzy.plugins.ShoutingTransform&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="no"&gt;:extends&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;fizzy.plugins.ClassLoaderImposition&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="no"&gt;:implements&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;org.apache.kafka.connect.transforms.Transformation&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;atom&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}))&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;-configure&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;swap!&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;assoc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;-config&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_this&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;field-name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"shouting-words"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;value-type&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ConfigDef$Type/LIST&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;importance&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ConfigDef$Importance/HIGH&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;default-value&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;docstring&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"Provides a list of words that SHOULD BE SHOUTED"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.define&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;ConfigDef.&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;field-name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;value-type&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;default-value&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;importance&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;docstring&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;shout&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s"&gt;"Takes a string value and a string to match and makes the match string upper case"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;replace&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;re-pattern&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"(?i)"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;upper-case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;-apply&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.topic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;partition&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.kafkaPartition&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;key-schema&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.keySchema&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nb"&gt;key&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.key&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;value-schema&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.valueSchema&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.value&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;shout-value&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;string?&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;
                       &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;reduce&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;shout&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"shouting-words"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;
                        &lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.timestamp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.headers&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.newRecord&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;partition&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;key-schema&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;key&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;value-schema&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;shout-value&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;-close&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_this&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="n"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;A Tour of the Code&lt;/h2&gt;

&lt;p&gt;A transformation plugin is any class that implements &lt;code&gt;org.apache.kafka.connect.transforms.Transformation&lt;/code&gt;. The methods we need are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;void configure(Map&amp;lt;String, ?&amp;gt; configs)&lt;/code&gt; accept a map of configuration keys/values&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ConfigDef config()&lt;/code&gt; returns an object which describes the expected configs&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ConnectRecord apply(ConnectRecord record)&lt;/code&gt; the actual transformation to perform&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;void close()&lt;/code&gt; called on shutdown, for any resources that need to be cleaned up&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our Clojure code stores the configuration map in an atom, with a separate config stored for each instance of the class, since you could have multiple simultaneous instances of the plugin running. The only configuration option we use in this example is "shouting-words", a list of strings that we will convert to all caps in any message where we find them.&lt;/p&gt;

&lt;p&gt;The apply function passes all the fields of the &lt;code&gt;ConnectRecord&lt;/code&gt; along unmodified except the value, which is transformed by the &lt;code&gt;shout&lt;/code&gt; function, but only if the value is a string (a message coming through a connector could be virtually anything).&lt;/p&gt;

&lt;p&gt;Our &lt;code&gt;close&lt;/code&gt; function is a no-op, because we don't have anything to shut down.&lt;/p&gt;

&lt;h2&gt;A Class Loading Issue&lt;/h2&gt;

&lt;p&gt;Kafka Connect provides a dedicated class loader for each plugin, so that all the plugins can have isolated classes and not interfere with each other. This turns out to be a bit of a problem for Clojure, which loads itself using the context class loader rather than the plugin class loader. Furthermore, it uses static initializer blocks to load, so there's only a narrow window of opportunity to intervene. See &lt;a href="https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/RT.java#L2176-L2182"&gt;the Clojure implementation details here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If your eyes are very sharp, you may have noticed that our &lt;code&gt;gen-class&lt;/code&gt; extends another class, &lt;code&gt;fizzy.plugins.ClassLoaderImposition&lt;/code&gt;. The &lt;em&gt;sole&lt;/em&gt; purpose of this class is to set the context class loader when the class loads. Here is the implementation of that class. Putting the code in a static initializer block allows us to set the context class loader before Clojure needs to access it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;  &lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;fizzy.plugins&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ClassLoaderImposition&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
          &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentThread&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;setContextClassLoader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ClassLoaderImposition&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getClassLoader&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To put it bluntly, this is a hack to get Clojure to load in the plugin context, but it works!&lt;/p&gt;

&lt;h2&gt;
  
  
  Packaging It Up
&lt;/h2&gt;

&lt;p&gt;Kafka Connect has a (configurable) directory where plugins are loaded from. If you're using a local installation of Confluent Platform, the default location is &lt;code&gt;$CONFLUENT_HOME/share/java&lt;/code&gt;. In the official Confluent Docker container, it's &lt;code&gt;/usr/share/confluent-hub-components&lt;/code&gt;. You can either make an uberjar with your code in it (plus all of its dependencies), or create a subdirectory of the plugins directory with all the jars in it. Note that you should not include jars that ship with Kafka Connect, such as &lt;code&gt;kafka-clients&lt;/code&gt;, &lt;code&gt;slf4j-api&lt;/code&gt;, and &lt;code&gt;snappy&lt;/code&gt;. Creating jars from your project is beyond the scope of this article, but you can use &lt;a href="https://clojure.org/guides/tools_build"&gt;tools.build&lt;/a&gt; or &lt;a href="https://github.com/technomancy/leiningen/blob/master/doc/TUTORIAL.md#uberjar"&gt;Leiningen&lt;/a&gt; to do it. Remember that once you have copied your plugin files to Connect, you must stop and restart the server. It's a good idea to tail the logs so you can see any errors that pop up if something goes wrong with loading your plugin.&lt;/p&gt;
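&lt;p&gt;As a concrete sketch (the plugin directory, fallback path, and jar name here are all hypothetical; adjust them for your own build), installing an uberjar into its own plugin subdirectory might look like this:&lt;/p&gt;

```shell
# Hypothetical paths and jar name -- adjust CONFLUENT_HOME and the
# artifact name to match your own tools.build or Leiningen output.
PLUGIN_DIR="${CONFLUENT_HOME:-/tmp/confluent}/share/java/shouting-transform"
mkdir -p "$PLUGIN_DIR"
# Copy the uberjar produced by your build tool into the plugin directory:
# cp target/shouting-transform-standalone.jar "$PLUGIN_DIR/"
# Then restart the Connect worker and tail its logs for loading errors.
```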

&lt;h2&gt;
  
  
  Adding Your Transformer to a Connector
&lt;/h2&gt;

&lt;p&gt;Once you have a plugin installed in a running Kafka Connect cluster, you use it by specifying its configuration in the connector properties. Here is how our example would be enabled within connector properties:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"my-connector"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;more&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;config&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;redacted&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"transforms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"shouting"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"transforms.shouting.type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"fizzy.plugins.ShoutingTransform"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"transforms.shouting.shouting-words"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"hello,world,love,shouting"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;   
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Making a &lt;code&gt;ConfigProvider&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;We were very excited to build a custom config provider, because it meant we could get our secrets from Vault instead of passing them around in files or the environment. It is very similar to writing a transformer, with a couple of wrinkles. The interface consists of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ConfigData get(String path)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ConfigData get(String path, Set&amp;lt;String&amp;gt; keys)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;void configure(Map&amp;lt;String, ?&amp;gt; configs)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;void close()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And there are optional methods for subscribing to changes to a given key.&lt;/p&gt;

&lt;p&gt;One of the differences with the &lt;code&gt;ConfigProvider&lt;/code&gt; (also with &lt;code&gt;ConnectRestExtension&lt;/code&gt; if you decide to create one) is that plugins of this type are found using the &lt;code&gt;java.util.ServiceLoader&lt;/code&gt; mechanism. This means that your jar file must have a &lt;code&gt;META-INF/services/org.apache.kafka.common.config.provider.ConfigProvider&lt;/code&gt; file. The content of this file is a single line with the fully qualified name of your class that implements &lt;code&gt;ConfigProvider&lt;/code&gt;.&lt;/p&gt;
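&lt;p&gt;As a sketch (the resources path and class name match the hypothetical provider used below), you could generate the registration file as part of your build. Note that the directory is &lt;code&gt;META-INF/services&lt;/code&gt;, plural:&lt;/p&gt;

```shell
# Register the implementation for java.util.ServiceLoader: the file is
# named after the interface, and its single line of content is the fully
# qualified name of the implementing class.
mkdir -p resources/META-INF/services
echo "fizzy.plugins.SecretsConfigProvider" \
  > resources/META-INF/services/org.apache.kafka.common.config.provider.ConfigProvider
```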

&lt;p&gt;When your config provider is installed, you will be able to add properties to Kafka Connect that reference your provider to supply a value. For example, your environment could include the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;CONNECT_CONFIG_PROVIDERS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;fizzysecrets
&lt;span class="nv"&gt;CONNECT_CONFIG_PROVIDERS_FIZZYSECRETS_CLASS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;fizzy.plugins.SecretsConfigProvider
&lt;span class="nv"&gt;CONNECT_SASL_JAAS_CONFIG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"org.apache.kafka.common.security.plain.PlainLoginModule required username='&lt;/span&gt;&lt;span class="se"&gt;\$&lt;/span&gt;&lt;span class="s2"&gt;{fizzysecrets:SASL_JAAS_CONFIG_USER}' password='&lt;/span&gt;&lt;span class="se"&gt;\$&lt;/span&gt;&lt;span class="s2"&gt;{fizzysecrets:SASL_JAAS_CONFIG_PASS}';"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  One More Hurdle
&lt;/h2&gt;

&lt;p&gt;Getting secrets from a config provider is great, but there is one final hoop to jump through if you're running Kafka Connect with the Confluent Docker image. Part of the startup sequence is running &lt;a href="https://github.com/confluentinc/confluent-docker-utils/blob/master/confluent/docker_utils/cub.py"&gt;a tool called cub&lt;/a&gt; to ensure Kafka is ready before starting Kafka Connect. Because the tool also needs access to your config providers, you must copy your config provider code (along with dependencies) to &lt;code&gt;/usr/share/java/cp-base-new&lt;/code&gt;. That way, any secrets you need to reach Kafka will also be available to cub.&lt;/p&gt;
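&lt;p&gt;In a Dockerfile, that extra copy might look like this (the base image tag and jar name are hypothetical):&lt;/p&gt;

```dockerfile
FROM confluentinc/cp-kafka-connect:7.4.0
# Install the config provider as a normal plugin...
COPY target/secrets-config-provider.jar /usr/share/confluent-hub-components/secrets-provider/
# ...and copy it again where cub's classpath can see it:
COPY target/secrets-config-provider.jar /usr/share/java/cp-base-new/
```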

&lt;h2&gt;
  
  
  Clojure + Kafka Connect
&lt;/h2&gt;

&lt;p&gt;Clojure is a fun and productive way to extend Kafka Connect. One thing we haven't tried yet is writing our own connector, but that should be a breeze. Good luck!&lt;/p&gt;

</description>
      <category>clojure</category>
      <category>kafka</category>
    </item>
    <item>
      <title>The Single Biggest Beginner Mistake with DynamoDB</title>
      <dc:creator>Zach A. Thomas</dc:creator>
      <pubDate>Sun, 03 Apr 2022 19:18:51 +0000</pubDate>
      <link>https://forem.com/aws-builders/beginner-mistakes-with-dynamodb-2ofn</link>
      <guid>https://forem.com/aws-builders/beginner-mistakes-with-dynamodb-2ofn</guid>
      <description>&lt;p&gt;&lt;a href="https://aws.amazon.com/dynamodb/"&gt;DynamoDB&lt;/a&gt; is an amazingly powerful and performant database, best known for its low latency and elastic scaling characteristics. But there is one trap it is super easy to fall into, &lt;em&gt;especially&lt;/em&gt; if you have any background at all in more traditional relational database systems (think MySQL, Postgres, Oracle, and SQL Server).&lt;/p&gt;

&lt;h2&gt;
  
  
  That's Not Normal, Man!
&lt;/h2&gt;

&lt;p&gt;In the relational database world, the dependable best practice is to &lt;a href="https://www.guru99.com/database-normalization.html"&gt;normalize&lt;/a&gt; your data model. It's a fairly academic topic, but the short version is that every piece of data should have &lt;em&gt;one&lt;/em&gt; home (i.e., one table which is its canonical location), and any references to that data in another table will be in the form of a foreign key, which is a pointer to its "true" location.&lt;/p&gt;

&lt;p&gt;In this way of storing data, an entity can be assembled from all the different rows in all the different tables by asking for a &lt;code&gt;JOIN&lt;/code&gt; operation.&lt;/p&gt;

&lt;p&gt;If you've ever done a data modeling exercise in this paradigm, you might have started with a table-per-entity. It might &lt;em&gt;feel&lt;/em&gt; natural to follow this same process with DynamoDB. Stop!&lt;/p&gt;

&lt;h2&gt;
  
  
  Don't Be a Joiner
&lt;/h2&gt;

&lt;p&gt;If you have taken DynamoDB for a spin, you may have noticed that there aren't any &lt;code&gt;JOIN&lt;/code&gt; operations. This is a feature, not a bug! Let's talk a bit about why &lt;code&gt;JOIN&lt;/code&gt; was invented in the first place. At the dawn of SQL databases (let's go back to 1979), storage was scarce and &lt;em&gt;expensive&lt;/em&gt;. A database join saves storage at the expense of computation, a tradeoff which made sense for a long time, but doesn't anymore. In the present day, the cost equation is completely flipped: storage is millions of times cheaper, and while computation has also improved (a lot), it hasn't improved by the same orders of magnitude as storage, which means computation is now the bottleneck for cost and performance. DynamoDB achieves remarkable performance by not incurring the computation cost of doing all those joins.&lt;/p&gt;

&lt;p&gt;If you make this common mistake (and I did, a bunch!) and continue modeling entities the old way, entity-per-table, you will end up doing all the joins yourself, in your application code. This is the worst of both worlds, because you're giving up the expressive flexibility of SQL while &lt;em&gt;still&lt;/em&gt; paying the cost of joins.&lt;/p&gt;

&lt;h2&gt;
  
  
  So Now What?
&lt;/h2&gt;

&lt;p&gt;What do you do instead? This problem is solvable with a little planning up-front. The best way I know to describe the new way of working is to imagine the query result you want, and store the rows in &lt;em&gt;that&lt;/em&gt; form, completely denormalized. There's an old wisdom about performance: the less it has to do, the faster it can be. With denormalized rows in your DynamoDB table, the only work is fetching by your chosen partition key (and optionally a &lt;em&gt;little bit&lt;/em&gt; extra to apply attribute constraints before returning).&lt;/p&gt;
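&lt;p&gt;As an illustration (the table layout and attribute names are invented for this example), instead of separate customer and order tables, you might store items already shaped like the query "give me this customer and their orders":&lt;/p&gt;

```json
[
  { "PK": "CUSTOMER#42", "SK": "PROFILE",          "name": "Pat Doe", "tier": "gold" },
  { "PK": "CUSTOMER#42", "SK": "ORDER#2022-03-01", "total": 1999, "status": "shipped" },
  { "PK": "CUSTOMER#42", "SK": "ORDER#2022-03-15", "total": 525,  "status": "pending" }
]
```

&lt;p&gt;A single &lt;code&gt;Query&lt;/code&gt; on the partition key &lt;code&gt;CUSTOMER#42&lt;/code&gt; returns the profile and every order in one request, no joins required; a sort key condition like &lt;code&gt;begins_with(SK, "ORDER#")&lt;/code&gt; narrows it to just the orders.&lt;/p&gt;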

&lt;h2&gt;
  
  
  The Next Level
&lt;/h2&gt;

&lt;p&gt;When you start modeling in this way, the surprising result is that most applications can be implemented &lt;em&gt;with a single database table&lt;/em&gt;. This is like waking up from living in the Matrix. When you're ready for more, start with &lt;a href="https://www.alexdebrie.com/posts/dynamodb-single-table/"&gt;this post&lt;/a&gt; by Alex Debrie, and then read everything else he has written! Absorb all of &lt;a href="https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/best-practices.html"&gt;this documentation&lt;/a&gt;. And when you're done with that and you're ready to level up again, go find everything by Rick Houlihan, like &lt;a href="https://youtu.be/MF9a1UNOAQo"&gt;this talk&lt;/a&gt; from re:Invent entitled "Amazon DynamoDB advanced design patterns."&lt;/p&gt;

&lt;h2&gt;
  
  
  Go With the Flow
&lt;/h2&gt;

&lt;p&gt;When you start using DynamoDB the way it was designed, it will blow your mind. Have fun on the journey!&lt;/p&gt;

</description>
      <category>aws</category>
      <category>database</category>
    </item>
  </channel>
</rss>
