<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mike Shumkov</title>
    <description>The latest articles on Forem by Mike Shumkov (@shooma).</description>
    <link>https://forem.com/shooma</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F189329%2F35d96e3f-e5d9-42c0-ac81-d740e96490cb.jpg</url>
      <title>Forem: Mike Shumkov</title>
      <link>https://forem.com/shooma</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/shooma"/>
    <language>en</language>
    <item>
      <title>How to cancel Debezium Incremental Snapshot</title>
      <dc:creator>Mike Shumkov</dc:creator>
      <pubDate>Mon, 24 Jun 2024 08:55:44 +0000</pubDate>
      <link>https://forem.com/shooma/how-to-cancel-debezium-incremental-snapshot-3c5j</link>
      <guid>https://forem.com/shooma/how-to-cancel-debezium-incremental-snapshot-3c5j</guid>
      <description>&lt;h3&gt;
  
  
  TL;DR:
&lt;/h3&gt;

&lt;p&gt;To cancel a running Incremental Snapshot, you can manually push a hand-crafted message to the internal Kafka Connect ...-offsets topic, with &lt;strong&gt;value.incremental_snapshot_primary_key&lt;/strong&gt; set equal to &lt;strong&gt;value.incremental_snapshot_maximum_key&lt;/strong&gt; from the latest "offset" message&lt;/p&gt;

&lt;h3&gt;Long story:&lt;/h3&gt;

&lt;p&gt;Sometimes you might need to snapshot some already tracked tables once again, and Debezium has the &lt;a href="https://debezium.io/blog/2021/10/07/incremental-snapshots/"&gt;Incremental Snapshots&lt;/a&gt; feature exactly for that purpose. You send a "signal" (write a new row into the signaling DB table) that instructs Debezium to re-read a table. But what if you want to cancel an already running Incremental Snapshot?&lt;/p&gt;
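&lt;p&gt;For context, starting such a snapshot boils down to inserting a signal row whose &lt;em&gt;data&lt;/em&gt; column carries a small JSON payload. A minimal Python sketch of building that payload, assuming the table name from the example below and the &lt;em&gt;execute-snapshot&lt;/em&gt; signal type described in the Debezium signaling docs:&lt;/p&gt;

```python
import json

# Sketch of the signal payload that starts an incremental snapshot
# (per the Debezium signaling docs); the collection name here is the
# one from the example in this post. You would INSERT a row with
# type 'execute-snapshot' and this JSON in the signaling table's
# 'data' column.
signal_data = {
    "data-collections": ["prod.dbo.InvoiceLines"],  # tables to re-read
    "type": "incremental",
}

payload = json.dumps(signal_data)
print(payload)
```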

&lt;p&gt;We faced a situation where an Incremental Snapshot on a huge table was started, but the additional filtering conditions were not applied! So instead of re-reading 30k rows, Debezium started to read all 20 million records. We didn't want that much data to be produced, because it would flood the data topic, and the latest changes (which we actually needed snapshotted) wouldn't be pushed for hours. So we needed to stop this snapshot.&lt;/p&gt;

&lt;p&gt;As far as I found, Debezium had no way to stop an already running snapshot via some sort of signal. Kafka Connect restarts also have no effect on the snapshot process: it continues from the last processed offset. So I dug into the internal Kafka Connect topics, especially the "...-offsets" one, and there it was: Debezium stores its own running snapshot offsets there. An example message for a running snapshot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"dbz_prod"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"mssql"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"transaction_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"event_serial_no"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"incremental_snapshot_maximum_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"6e2166716a6c4b5310027575000decac616e672e4f626a6563743b90ce589f1073296c020000787000000001737200116a6176612e6c616e672e496e746567657212e2a0a4f781873802000149000576616c7565787200106a6176612e6c616e672e4e756d62657286ac951d0b94e08b02000078700142f017"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"commit_lsn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"0006593e:000287c8:0003"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"change_lsn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"0006593e:000287c8:0002"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"incremental_snapshot_collections"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"prod.dbo.InvoiceLines"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"incremental_snapshot_primary_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"6e2166716a6c4b5310027575000decac616e672e4f626a6563743b90ce589f1073296c020000787000000001737200116a6176612e6c616e672e496e746567657212e2a0a4f781873802000149000576616c7565787200106a6176612e6c616e672e4e756d62657286ac951d0b94e08b020000787000016862"&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"headers"&lt;/span&gt;&lt;span class="p"&gt;:[],&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"exceededFields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we see two valuable keys:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;incremental_snapshot_maximum_key&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;incremental_snapshot_primary_key&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It seems the snapshot is considered finished when the current snapshot offset (&lt;em&gt;incremental_snapshot_primary_key&lt;/em&gt;) becomes equal to the maximum primary key (&lt;em&gt;incremental_snapshot_maximum_key&lt;/em&gt;, the highest key the table contained when the snapshot was started). You can see that these keys differ only in their last 7 characters, and those characters are the hexadecimal values of the current offset and the max primary key (&lt;strong&gt;142f017&lt;/strong&gt; in decimal equals &lt;strong&gt;21,164,055&lt;/strong&gt;).&lt;/p&gt;
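&lt;p&gt;The decoding can be checked with a couple of lines of Python (reading the trailing hex digits of each serialized blob as a plain integer is my interpretation of the format):&lt;/p&gt;

```python
# The key blobs above are Java-serialized objects; only their trailing
# hex digits encode the actual integer key value. Decoding the tails
# shows how far the snapshot has progressed.
maximum_key_hex = "142f017"  # tail of incremental_snapshot_maximum_key
current_key_hex = "16862"    # tail of incremental_snapshot_primary_key

maximum_key = int(maximum_key_hex, 16)
current_key = int(current_key_hex, 16)

print(maximum_key)  # 21164055, the highest key to snapshot
print(current_key)  # 92258, the key the snapshot has reached so far
```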

&lt;p&gt;So I tried to push the same message to the ...-offsets topic with &lt;em&gt;incremental_snapshot_primary_key&lt;/em&gt; set equal to &lt;em&gt;incremental_snapshot_maximum_key&lt;/em&gt;. And it worked for me: the snapshot was marked as "finished" and the data flood stopped.&lt;/p&gt;

&lt;p&gt;"Finished" message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"dbz_prod"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"mssql"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"transaction_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"event_serial_no"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"incremental_snapshot_maximum_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"6e2166716a6c4b5310027575000decac616e672e4f626a6563743b90ce589f1073296c020000787000000001737200116a6176612e6c616e672e496e746567657212e2a0a4f781873802000149000576616c7565787200106a6176612e6c616e672e4e756d62657286ac951d0b94e08b02000078700142f017"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"commit_lsn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"0006593e:000287c8:0003"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"change_lsn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"0006593e:000287c8:0002"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"incremental_snapshot_collections"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"prod.dbo.InvoiceLines"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"incremental_snapshot_primary_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"6e2166716a6c4b5310027575000decac616e672e4f626a6563743b90ce589f1073296c020000787000000001737200116a6176612e6c616e672e496e746567657212e2a0a4f781873802000149000576616c7565787200106a6176612e6c616e672e4e756d62657286ac951d0b94e08b02000078700142f017"&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"headers"&lt;/span&gt;&lt;span class="p"&gt;:[],&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"exceededFields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
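&lt;p&gt;In code, the edit amounts to copying one field over the other before producing the message back to the topic. A minimal Python sketch, with the serialized key blobs abbreviated for readability and the actual Kafka produce step left out:&lt;/p&gt;

```python
import json

# Take the latest offset record (the "running" example above), copy the
# maximum key into the primary-key field, and the snapshot is considered
# complete. The blobs are shortened here; in practice you would copy the
# full values from your own ...-offsets topic.
running_value = {
    "transaction_id": None,
    "event_serial_no": 2,
    "incremental_snapshot_maximum_key": "...0142f017",   # full blob in reality
    "commit_lsn": "0006593e:000287c8:0003",
    "change_lsn": "0006593e:000287c8:0002",
    "incremental_snapshot_collections": "prod.dbo.InvoiceLines",
    "incremental_snapshot_primary_key": "...00016862",   # full blob in reality
}

finished_value = dict(running_value)
finished_value["incremental_snapshot_primary_key"] = (
    finished_value["incremental_snapshot_maximum_key"]
)

# Serialize for producing back to the ...-offsets topic; the message key
# stays the same as in the original offset message.
print(json.dumps(finished_value))
```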



&lt;p&gt;Just in case, I stopped Kafka Connect before pushing the custom "finish" message to the topic, though I don't think it was necessary.&lt;/p&gt;

</description>
      <category>debezium</category>
      <category>kafka</category>
      <category>webdev</category>
      <category>database</category>
    </item>
  </channel>
</rss>
