<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: 周明</title>
    <description>The latest articles on Forem by 周明 (@_4766ad6499d6063cc36ad7).</description>
    <link>https://forem.com/_4766ad6499d6063cc36ad7</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3937814%2Fd8cd6d90-144c-4f18-84b8-be1dab4989fc.png</url>
      <title>Forem: 周明</title>
      <link>https://forem.com/_4766ad6499d6063cc36ad7</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/_4766ad6499d6063cc36ad7"/>
    <language>en</language>
    <item>
      <title>Recovering Corrupted ROS 2 Bags Without a ROS 2 Runtime</title>
      <dc:creator>周明</dc:creator>
      <pubDate>Mon, 18 May 2026 15:36:59 +0000</pubDate>
      <link>https://forem.com/_4766ad6499d6063cc36ad7/recovering-corrupted-ros-2-bags-without-a-ros-2-runtime-4lc4</link>
      <guid>https://forem.com/_4766ad6499d6063cc36ad7/recovering-corrupted-ros-2-bags-without-a-ros-2-runtime-4lc4</guid>
      <description>&lt;p&gt;Robots do not always fail gracefully.&lt;/p&gt;

&lt;p&gt;A delivery robot can hit an obstacle. A lawn-mowing robot can lose power. An inspection robot can emergency-stop in the middle of a mission. In many of these cases, the most valuable data is the last few seconds before the incident: camera frames, IMU samples, odometry, localization, control commands, and diagnostics.&lt;/p&gt;

&lt;p&gt;For ROS 2 systems, that data is often stored in &lt;code&gt;rosbag2&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;But there is a painful failure mode: if the process exits abnormally or the device loses power, the underlying MCAP or SQLite3 file may not be closed cleanly. The data file can be missing a footer, summary, index, WAL checkpoint, or part of the final page. Official tools may refuse to open it, even though many complete messages are still inside.&lt;/p&gt;

&lt;p&gt;That is the problem I started working on with &lt;strong&gt;LibRobotBagFix&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/Adiao1973/LibRobotBagFix" rel="noopener noreferrer"&gt;https://github.com/Adiao1973/LibRobotBagFix&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;What is LibRobotBagFix?&lt;/h2&gt;

&lt;p&gt;LibRobotBagFix is an open-source C++ SDK and command-line tool for inspecting, repairing, and reading ROS 2 &lt;code&gt;rosbag2&lt;/code&gt; black-box data without requiring a ROS 2 runtime.&lt;/p&gt;

&lt;p&gt;The project focuses on commercial robot field diagnostics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;autonomous delivery vehicles&lt;/li&gt;
&lt;li&gt;lawn-mowing robots&lt;/li&gt;
&lt;li&gt;inspection robots&lt;/li&gt;
&lt;li&gt;mobile robot fleets&lt;/li&gt;
&lt;li&gt;robotics support tools&lt;/li&gt;
&lt;li&gt;accident analysis workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is not to replace ROS 2 or implement a full rosbag player. The goal is more focused:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Recover as many complete messages as possible from damaged MCAP or SQLite3 rosbag2 files, then make the result readable again.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;Why this matters&lt;/h2&gt;

&lt;p&gt;In real robot incidents, the most important data is usually near the end of the recording.&lt;/p&gt;

&lt;p&gt;Unfortunately, the end of the file is also the part most likely to be damaged during power loss or a crash.&lt;/p&gt;

&lt;p&gt;Common symptoms include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ros2 bag info&lt;/code&gt; cannot open the bag&lt;/li&gt;
&lt;li&gt;an MCAP file is missing its summary or footer&lt;/li&gt;
&lt;li&gt;a SQLite3 &lt;code&gt;.db3&lt;/code&gt; file has uncheckpointed WAL data&lt;/li&gt;
&lt;li&gt;the last SQLite page is partially written&lt;/li&gt;
&lt;li&gt;the file contains complete earlier messages, but normal tools cannot reach them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For field teams, there is another practical issue: they may not have a full ROS 2 environment available. They may be using macOS, Windows, a tablet, or a lightweight diagnostic tool.&lt;/p&gt;

&lt;p&gt;That is why LibRobotBagFix is designed as a small embeddable SDK with a CLI, not as a ROS 2 node.&lt;/p&gt;

&lt;h2&gt;Current features&lt;/h2&gt;

&lt;p&gt;The CLI currently provides two main commands, &lt;code&gt;inspect&lt;/code&gt; and &lt;code&gt;repair&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Inspect a bag directory or single data file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/robotbagfix inspect path/to/bag_or_file &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repair a damaged MCAP file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/robotbagfix repair damaged.mcap &lt;span class="nt"&gt;-o&lt;/span&gt; repaired.mcap &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repair a damaged SQLite3 rosbag2 database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/robotbagfix repair damaged.db3 &lt;span class="nt"&gt;-o&lt;/span&gt; repaired.db3 &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The project currently supports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;rosbag2 directory detection&lt;/li&gt;
&lt;li&gt;MCAP detection and record scanning&lt;/li&gt;
&lt;li&gt;SQLite3 &lt;code&gt;.db3&lt;/code&gt; detection&lt;/li&gt;
&lt;li&gt;SQLite WAL sidecar discovery&lt;/li&gt;
&lt;li&gt;MCAP tail repair&lt;/li&gt;
&lt;li&gt;SQLite fast-path recovery&lt;/li&gt;
&lt;li&gt;SQLite page-level salvage fallback&lt;/li&gt;
&lt;li&gt;C++ SDK message iteration&lt;/li&gt;
&lt;li&gt;stable C ABI for FFI&lt;/li&gt;
&lt;li&gt;lightweight CDR helpers&lt;/li&gt;
&lt;li&gt;optional Qt desktop demo&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;How MCAP repair works&lt;/h2&gt;

&lt;p&gt;MCAP is a structured binary container format. A valid file consists of leading magic bytes, a header, a data section, optional summary sections, a footer, and trailing magic bytes.&lt;/p&gt;

&lt;p&gt;A common corruption pattern is tail truncation: the file contains many complete records, but the final summary or footer was never written.&lt;/p&gt;

&lt;p&gt;LibRobotBagFix scans the MCAP file sequentially, keeps complete records, discards the partial tail record if one exists, and rebuilds the required trailing structure.&lt;/p&gt;

&lt;p&gt;The simplified idea looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;damaged.mcap
  -&amp;gt; scan complete records
  -&amp;gt; stop at the first incomplete record
  -&amp;gt; discard damaged tail bytes
  -&amp;gt; rebuild Data End / Summary / Footer
  -&amp;gt; write repaired.mcap
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The source file is never modified in place.&lt;/p&gt;
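
&lt;p&gt;The scan step can be sketched in a few lines of Python. This is a minimal illustration, not LibRobotBagFix's actual code. It relies only on the MCAP framing rule that each record is a 1-byte opcode followed by an 8-byte little-endian length and that many content bytes. The name &lt;code&gt;scan_complete_records&lt;/code&gt; is hypothetical, and rebuilding Data End, Summary, and Footer is deliberately left out.&lt;/p&gt;

```python
MCAP_MAGIC = b"\x89MCAP0\r\n"  # 8 magic bytes at the start of every MCAP file


def scan_complete_records(data):
    """Return (records, end), where end is the offset just past the last complete record.

    Each entry in records is (opcode, offset, content_length).
    Hypothetical helper for illustration only.
    """
    if data[:8] != MCAP_MAGIC:
        raise ValueError("missing MCAP magic bytes")
    pos = len(MCAP_MAGIC)
    records = []
    while True:
        # Record prefix: 1-byte opcode + 8-byte little-endian content length.
        prefix = data[pos:pos + 9]
        if len(prefix) != 9:
            break  # fewer than 9 bytes left: end of file or a cut-off prefix
        opcode = prefix[0]
        length = int.from_bytes(prefix[1:], "little")
        body = data[pos + 9:pos + 9 + length]
        if len(body) != length:
            break  # content is truncated: this is the damaged tail
        records.append((opcode, pos, length))
        pos += 9 + length
    return records, pos
```

&lt;p&gt;Under these assumptions, &lt;code&gt;data[:end]&lt;/code&gt; is the portion a repair would keep before appending freshly built trailing records. A real scanner would also need to validate opcodes and look inside chunk records, which this sketch skips.&lt;/p&gt;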

&lt;h2&gt;How SQLite3 repair works&lt;/h2&gt;

&lt;p&gt;The SQLite3 backend is more complex because rosbag2 data is stored inside database tables.&lt;/p&gt;

&lt;p&gt;The first strategy is to let SQLite do what SQLite is good at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;open the database safely&lt;/li&gt;
&lt;li&gt;apply WAL recovery when possible&lt;/li&gt;
&lt;li&gt;run integrity checks&lt;/li&gt;
&lt;li&gt;export complete rows&lt;/li&gt;
&lt;li&gt;rebuild a clean &lt;code&gt;.db3&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
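
&lt;p&gt;A rough sketch of that fast path, using Python's standard &lt;code&gt;sqlite3&lt;/code&gt; module, might look like the following. It is an illustration under stated assumptions, not the SDK's implementation: the function name &lt;code&gt;fast_path_copy&lt;/code&gt; is hypothetical, and the reduced &lt;code&gt;topics&lt;/code&gt; / &lt;code&gt;messages&lt;/code&gt; column set is an assumed subset of the rosbag2 SQLite3 layout.&lt;/p&gt;

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path

# Assumed, reduced subset of the rosbag2 sqlite3 schema (illustration only).
SCHEMA = """
CREATE TABLE topics(id INTEGER PRIMARY KEY, name TEXT, type TEXT, serialization_format TEXT);
CREATE TABLE messages(id INTEGER PRIMARY KEY, topic_id INTEGER, timestamp INTEGER, data BLOB);
"""


def fast_path_copy(src, dst):
    """Export readable rows from src into a fresh database at dst.

    Returns (integrity_status, message_count). The source file is only read.
    """
    src = Path(src)
    with tempfile.TemporaryDirectory() as work:
        # Work on a private copy so the original file is never modified.
        work_db = Path(work) / src.name
        shutil.copy2(src, work_db)
        wal = Path(str(src) + "-wal")
        if wal.exists():
            # With the WAL sidecar next to the copy, SQLite replays it on open.
            shutil.copy2(wal, str(work_db) + "-wal")
        conn = sqlite3.connect(work_db)
        try:
            status = conn.execute("PRAGMA integrity_check").fetchone()[0]
            topics = conn.execute(
                "SELECT id, name, type, serialization_format FROM topics").fetchall()
            messages = conn.execute(
                "SELECT id, topic_id, timestamp, data FROM messages ORDER BY id").fetchall()
        finally:
            conn.close()

    # Rebuild a clean database that contains only rows SQLite could read back.
    out = sqlite3.connect(dst)
    try:
        out.executescript(SCHEMA)
        out.executemany("INSERT INTO topics VALUES (?, ?, ?, ?)", topics)
        out.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", messages)
        out.commit()
    finally:
        out.close()
    return status, len(messages)
```

&lt;p&gt;In this sketch, a &lt;code&gt;sqlite3.DatabaseError&lt;/code&gt; raised while opening or querying the copy would be the signal that the fast path has failed and a lower-level strategy is needed.&lt;/p&gt;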

&lt;p&gt;If that fails, LibRobotBagFix falls back to page-level salvage. It scans SQLite pages, tries to recover &lt;code&gt;sqlite_schema&lt;/code&gt;, locates rosbag2 tables such as &lt;code&gt;topics&lt;/code&gt; and &lt;code&gt;messages&lt;/code&gt;, and writes only rows that can be proven complete.&lt;/p&gt;

&lt;p&gt;Uncertain data is reported, not silently written into the output database.&lt;/p&gt;
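
&lt;p&gt;To give a feel for what page-level scanning involves, here is a small Python sketch that reads the page size from the SQLite file header, classifies each full page by its b-tree type byte, and reports how many bytes of a partially written final page are left over. It is a heuristic illustration only, not the salvage logic used by LibRobotBagFix, and &lt;code&gt;classify_pages&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of every SQLite 3 file
VALID_PAGE_SIZES = {512, 1024, 2048, 4096, 8192, 16384, 32768, 65536}
PAGE_KINDS = {
    0x02: "interior index",
    0x05: "interior table",
    0x0A: "leaf index",
    0x0D: "leaf table",
}


def classify_pages(data):
    """Return (page_size, kinds, tail_bytes) for a raw SQLite file image.

    kinds has one label per full page; tail_bytes counts leftover bytes
    that do not fill a whole page (for example a partially written last page).
    """
    if data[:16] != SQLITE_MAGIC:
        raise ValueError("not a SQLite 3 database header")
    # Page size is a 2-byte big-endian field at offset 16; the value 1 means 65536.
    page_size = int.from_bytes(data[16:18], "big")
    if page_size == 1:
        page_size = 65536
    if page_size not in VALID_PAGE_SIZES:
        raise ValueError("implausible page size in header")
    full_pages, tail_bytes = divmod(len(data), page_size)
    kinds = []
    for index in range(full_pages):
        # Page 1 carries the 100-byte database header before its b-tree header.
        offset = index * page_size + (100 if index == 0 else 0)
        # Overflow and freelist pages have no type byte, so "other" is only a guess.
        kinds.append(PAGE_KINDS.get(data[offset], "other"))
    return page_size, kinds, tail_bytes
```

&lt;p&gt;Leaf table pages are where row data lives, so a salvage pass would concentrate on those and would still have to parse and validate every cell before treating a row as complete.&lt;/p&gt;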

&lt;h2&gt;Design principles&lt;/h2&gt;

&lt;p&gt;A few principles guide the project:&lt;/p&gt;

&lt;h3&gt;No ROS 2 runtime dependency&lt;/h3&gt;

&lt;p&gt;The SDK should be usable in diagnostic tools, desktop applications, mobile apps, and support workflows without installing a full ROS 2 environment.&lt;/p&gt;

&lt;h3&gt;Repair by container, not by sensor&lt;/h3&gt;

&lt;p&gt;At the repair layer, an image, an IMU sample, an odometry message, and a point cloud are all just opaque payload bytes inside MCAP or SQLite3.&lt;/p&gt;

&lt;p&gt;The repair engine focuses on the container format. Sensor-specific interpretation belongs in a higher-level parser.&lt;/p&gt;

&lt;h3&gt;Do not keep half-written messages&lt;/h3&gt;

&lt;p&gt;If a message payload itself is incomplete, LibRobotBagFix does not pretend it is valid. It keeps complete messages and reports discarded bytes, records, pages, or uncertain rows.&lt;/p&gt;

&lt;h3&gt;Keep the SDK embeddable&lt;/h3&gt;

&lt;p&gt;The project provides a C++ API and a stable C ABI. That makes it easier to integrate with tools written in Swift, Kotlin/JNI, C#, Python FFI, Qt, or other application layers.&lt;/p&gt;

&lt;h2&gt;Quick start&lt;/h2&gt;

&lt;p&gt;Build the default SDK and CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Adiao1973/LibRobotBagFix.git
&lt;span class="nb"&gt;cd &lt;/span&gt;LibRobotBagFix
scripts/build.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check the version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/robotbagfix &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run tests:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ctest &lt;span class="nt"&gt;--test-dir&lt;/span&gt; build &lt;span class="nt"&gt;--output-on-failure&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inspect a file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/robotbagfix inspect path/to/bag_or_file &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repair a file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/robotbagfix repair damaged.mcap &lt;span class="nt"&gt;-o&lt;/span&gt; repaired.mcap &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read repaired output through the SDK examples (shown here with a repaired &lt;code&gt;.db3&lt;/code&gt; file):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/rbf_read_bag_cpp repaired.db3
./build/rbf_read_bag_c repaired.db3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Who this is for&lt;/h2&gt;

&lt;p&gt;This project may be useful if you work on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;robotics incident analysis&lt;/li&gt;
&lt;li&gt;autonomous robot field support&lt;/li&gt;
&lt;li&gt;ROS 2 data tooling&lt;/li&gt;
&lt;li&gt;fleet diagnostics&lt;/li&gt;
&lt;li&gt;lightweight robot data viewers&lt;/li&gt;
&lt;li&gt;C++ SDKs for robotics products&lt;/li&gt;
&lt;li&gt;mobile or desktop tools that need to read robot black-box data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is especially relevant when you need to inspect or recover robot data outside a full ROS 2 workstation.&lt;/p&gt;

&lt;h2&gt;Current boundaries&lt;/h2&gt;

&lt;p&gt;LibRobotBagFix is deliberately narrow in scope.&lt;/p&gt;

&lt;p&gt;Current non-goals include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;modifying damaged bags in place&lt;/li&gt;
&lt;li&gt;recovering half-written image or point-cloud payloads&lt;/li&gt;
&lt;li&gt;implementing a full rosbag player&lt;/li&gt;
&lt;li&gt;publishing or subscribing to DDS topics&lt;/li&gt;
&lt;li&gt;making ROS 2 a runtime dependency&lt;/li&gt;
&lt;li&gt;making Qt or mobile UI frameworks part of the core SDK&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;What I am looking for&lt;/h2&gt;

&lt;p&gt;This project is open source, and I would appreciate feedback from people who work with ROS 2 bags in real systems.&lt;/p&gt;

&lt;p&gt;Useful feedback includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;damaged bag patterns you have seen in production&lt;/li&gt;
&lt;li&gt;MCAP or SQLite3 edge cases&lt;/li&gt;
&lt;li&gt;cross-platform build feedback&lt;/li&gt;
&lt;li&gt;SDK API suggestions&lt;/li&gt;
&lt;li&gt;ideas for minimal fixtures and regression tests&lt;/li&gt;
&lt;li&gt;field diagnostic workflows where this could help&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;GitHub repo:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Adiao1973/LibRobotBagFix" rel="noopener noreferrer"&gt;https://github.com/Adiao1973/LibRobotBagFix&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you work with ROS 2 robot logs and have ever lost the last seconds of data after a crash or power failure, I would be interested to hear what failure modes you have seen.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>cpp</category>
      <category>robotics</category>
      <category>ros2</category>
    </item>
  </channel>
</rss>
