<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Jared Ablon</title>
    <description>The latest articles on Forem by Jared Ablon (@jared_ablon_f27e6e2896913).</description>
    <link>https://forem.com/jared_ablon_f27e6e2896913</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3930408%2Fb2dc5b12-5588-4948-a330-59c20c1a8e02.png</url>
      <title>Forem: Jared Ablon</title>
      <link>https://forem.com/jared_ablon_f27e6e2896913</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/jared_ablon_f27e6e2896913"/>
    <language>en</language>
    <item>
      <title>I parsed 4,251 SEC 8-K filings — 7.3% have buried material events nobody surfaces</title>
      <dc:creator>Jared Ablon</dc:creator>
      <pubDate>Thu, 14 May 2026 04:40:13 +0000</pubDate>
      <link>https://forem.com/jared_ablon_f27e6e2896913/i-parsed-4251-sec-8-k-filings-73-have-buried-material-events-nobody-surfaces-5c8o</link>
      <guid>https://forem.com/jared_ablon_f27e6e2896913/i-parsed-4251-sec-8-k-filings-73-have-buried-material-events-nobody-surfaces-5c8o</guid>
      <description>&lt;h2&gt;
  
  
  The wedge
&lt;/h2&gt;

&lt;p&gt;US public companies file Form 8-K to disclose material events between quarterly reports. Each 8-K has one or more &lt;strong&gt;item codes&lt;/strong&gt; indicating &lt;em&gt;what&lt;/em&gt; the event is — Item 1.05 for a material cybersecurity incident, Item 5.02 for a director or officer departure, Item 1.01 for a definitive material agreement, and so on.&lt;/p&gt;

&lt;p&gt;There's also a catch-all: &lt;strong&gt;Item 8.01 ("Other Events")&lt;/strong&gt;. The SEC's instructions explicitly say that if a more specific item applies, you should use that instead. In practice, filers chronically use Item 8.01 to disclose events that fit a more specific item — sometimes innocently, sometimes to bury the disclosure.&lt;/p&gt;

&lt;p&gt;Most SEC data APIs trust the filer-reported item codes. &lt;strong&gt;We re-classify from the body text and surface the discrepancies.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://filingfirehose.com" rel="noopener noreferrer"&gt;FilingFirehose&lt;/a&gt; (a productized SEC EDGAR JSON API) and ran the body-text classifier over 4,251 8-K filings filed in a 21-business-day window. Findings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;7.3%&lt;/strong&gt; of Item 8.01 filings contain body language strongly suggesting a more specific item code&lt;/li&gt;
&lt;li&gt;Top suspected misclassifications: Item 1.01 (material agreement, 27 cases), Item 3.01 (notice of delisting, 17), Item 1.05 (cyber, 5), Item 5.02 (officer departure, 3), Item 2.01 (acquisition, 3), Item 1.03 (bankruptcy, 3)&lt;/li&gt;
&lt;li&gt;Item 7.01 (Reg FD) shows a similar pattern: 4.0% of those have body language suggesting another item should have been used&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;For three audiences specifically:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compliance teams.&lt;/strong&gt; The SEC's 2023 cybersecurity disclosure rule requires a material cybersecurity incident to be disclosed under Item 1.05, generally within four business days of the company determining that the incident is material. A material breach filed under 8.01 instead is therefore a compliance problem. Recent enforcement actions cite this exact pattern (a company filing a cyber breach under 8.01 when 1.05 was required). If you're monitoring competitor or watchlist filings, knowing which 8.01 filings actually contain cyber language gives you a real-time risk signal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quant analysts running event-driven strategies.&lt;/strong&gt; Classification noise on filer-reported items is a known problem. The body-text classification is itself a free signal — companies that bury events under 8.01 may be doing so to delay market reaction, which means the price-impact lag is longer than naive item-code monitoring would suggest. Strategies that key off the &lt;strong&gt;discrepancy&lt;/strong&gt; between filer-reported items and body-detected items have an alpha source most consumers miss entirely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fintech journalists.&lt;/strong&gt; Item 8.01 filings are where the buried news lives. Manually reading every 8-K's body for cyber, officer, or material-agreement language is a Sunday-afternoon job. We do it automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pipeline
&lt;/h2&gt;

&lt;p&gt;The whole thing is ~2k LOC of Python. Stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;EDGAR poller&lt;/strong&gt;: standard SEC &lt;code&gt;getcurrent&lt;/code&gt; atom feed at 2-second cadence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filing fetch&lt;/strong&gt;: per-filing HTML body via &lt;code&gt;requests&lt;/code&gt; (cached aggressively — EDGAR rate-limits at 10 req/s)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Body extraction&lt;/strong&gt;: BeautifulSoup with custom rules for the SEC's idiosyncratic HTML&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Classification&lt;/strong&gt;: deterministic regex + keyword + co-occurrence rules. No transformer. Domain phrases are specific enough that rule-based beats LLM accuracy on this task, with no per-filing inference cost.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage&lt;/strong&gt;: Polars LazyFrames over a Parquet store partitioned by form_type and date. The classifier output is stored as a JSON column alongside the parsed metadata.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Re-runs nightly&lt;/strong&gt;: full body-text re-parse with current rules; results overwrite the JSON column.&lt;/li&gt;
&lt;/ul&gt;
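
&lt;p&gt;To make the "cached aggressively" fetch step concrete, here is a minimal sketch of a cache that also spaces out cache misses to stay under a request cap. This is not the production fetcher: &lt;code&gt;ThrottledCache&lt;/code&gt; and its parameters are hypothetical names, the cache is in-memory only, and the fetch function is injected so the throttle logic can be exercised without touching EDGAR.&lt;/p&gt;

```python
import time
from typing import Callable, Dict, Optional


class ThrottledCache:
    """Cache fetch results and space out cache misses to respect a request cap.

    Hypothetical helper for illustration; the 10 req/s default mirrors the
    EDGAR limit mentioned above.
    """

    def __init__(
        self,
        fetch: Callable[[str], str],
        max_per_second: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch
        self._min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None  # time of the most recent real fetch
        self._cache: Dict[str, str] = {}

    def get(self, url: str) -> str:
        if url in self._cache:
            return self._cache[url]  # cache hit: no request, no waiting
        if self._last_call is not None:
            # Wait out whatever remains of the minimum gap between real requests.
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        body = self._fetch(url)
        self._cache[url] = body
        return body
```

&lt;p&gt;In real use the injected &lt;code&gt;fetch&lt;/code&gt; would wrap a &lt;code&gt;requests&lt;/code&gt; call. Injecting the clock and sleep functions is a design choice that keeps the throttle testable with a fake clock.&lt;/p&gt;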

&lt;p&gt;Sample classifier rule for Item 1.05 (cyber):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_item_105&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Detect material cybersecurity incident language.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;body_lower&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;has_breach_lang&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;body_lower&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;material cybersecurity incident&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;unauthorized access&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data exfiltration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ransomware&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cyber attack&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cybersecurity event&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;has_temporal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;body_lower&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;on [date]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;we became aware&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;discovered on&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;identified on&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;has_materiality&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;body_lower&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;material&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;significant impact&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;operations were affected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;has_breach_lang&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;has_temporal&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;has_materiality&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(The real implementation has more rules plus a scoring step. On a 50-filing manual audit, the false-positive rate was about 12%. On a sample of 100 unflagged filings, the false-negative rate was 1%.)&lt;/p&gt;
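
&lt;p&gt;Audit samples of 50 and 100 filings leave a lot of uncertainty around those rates. As a rough way to quantify it, the sketch below computes a Wilson score interval for each. The counts are assumptions backed out from the percentages above (about 6 of 50 and 1 of 100), and &lt;code&gt;wilson_interval&lt;/code&gt; is a helper written for this illustration, not part of the pipeline.&lt;/p&gt;

```python
import math
from typing import Tuple


def wilson_interval(errors: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        raise ValueError("n must be positive")
    p = errors / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Assumed counts: ~12% of 50 audited filings is about 6; 1% of 100 unflagged is 1.
fp_low, fp_high = wilson_interval(6, 50)
fn_low, fn_high = wilson_interval(1, 100)
print(f"false-positive rate: {fp_low:.1%} to {fp_high:.1%}")
print(f"false-negative rate: {fn_low:.1%} to {fn_high:.1%}")
```

&lt;p&gt;With samples this small the false-positive interval is wide (roughly 6% to 24% under these assumed counts), which matters more for a trading signal than for a screening tool.&lt;/p&gt;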

&lt;h2&gt;
  
  
  The output
&lt;/h2&gt;

&lt;p&gt;Every 8-K record in our API includes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"accession_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0001234567-26-001234"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"company_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Example Corp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"filed_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-05-10T16:00:00-04:00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"filer_reported_items"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"8.01"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"detected_items"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"8.01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.05"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"discrepancy_items"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"1.05"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"suspected_buried_events"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"reported as Item 8.01 but body suggests Item 1.05"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;discrepancy_items&lt;/code&gt; field is the set difference between the two lists: items detected in the body text that the filer did not report. The &lt;code&gt;suspected_buried_events&lt;/code&gt; field is a list of human-readable strings, one per flag, explaining each discrepancy.&lt;/p&gt;
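
&lt;p&gt;A minimal sketch of how those two fields could be derived from the reported and detected lists. The function name and the choice to treat only 8.01 and 7.01 as catch-all items are assumptions for illustration; the production logic may differ.&lt;/p&gt;

```python
from typing import Dict, List

# Assumption: these catch-all items are the ones an event can be "buried" under.
CATCH_ALL_ITEMS = ("8.01", "7.01")


def build_discrepancy_fields(reported: List[str], detected: List[str]) -> Dict[str, List[str]]:
    """Derive discrepancy fields from filer-reported and body-detected item codes."""
    reported_set = set(reported)
    # Items the body text suggests but the filer did not report, in detection order.
    discrepancy = [item for item in detected if item not in reported_set]
    catch_alls = [item for item in reported if item in CATCH_ALL_ITEMS]
    # One human-readable message per (catch-all item, missing item) pair.
    messages = [
        f"reported as Item {source} but body suggests Item {item}"
        for source in catch_alls
        for item in discrepancy
    ]
    return {"discrepancy_items": discrepancy, "suspected_buried_events": messages}


fields = build_discrepancy_fields(reported=["8.01"], detected=["8.01", "1.05"])
print(fields)
```

&lt;p&gt;For the sample record above, this yields &lt;code&gt;["1.05"]&lt;/code&gt; as the discrepancy and the same message string shown in the JSON.&lt;/p&gt;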

&lt;h2&gt;
  
  
  How to query it
&lt;/h2&gt;

&lt;p&gt;The free public tier covers the past 72 hours, no API key required:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# All 8-Ks with suspected buried events from the last 72h&lt;/span&gt;
curl &lt;span class="s2"&gt;"https://filingfirehose.com/v1/public/8k?suspected_buried_only=true"&lt;/span&gt;

&lt;span class="c"&gt;# Filter by item code&lt;/span&gt;
curl &lt;span class="s2"&gt;"https://filingfirehose.com/v1/public/8k?items=1.05"&lt;/span&gt;

&lt;span class="c"&gt;# Recent activist 13D filings (separate endpoint)&lt;/span&gt;
curl &lt;span class="s2"&gt;"https://filingfirehose.com/v1/public/13d?activist=Saba"&lt;/span&gt;

&lt;span class="c"&gt;# Recent ATM offerings on S-3/424B5&lt;/span&gt;
curl &lt;span class="s2"&gt;"https://filingfirehose.com/v1/public/atm?min_shelf_million_usd=100"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
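
&lt;p&gt;The same public-tier queries can be issued from Python. The sketch below uses only the standard library; the endpoint paths and query parameters come from the curl examples above, while the helper names and the assumption that responses are JSON are mine. The live call is left commented out so the snippet runs offline.&lt;/p&gt;

```python
import json
from typing import Any
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://filingfirehose.com/v1/public"


def build_url(endpoint: str, **params: Any) -> str:
    """Build a public-tier URL, for example .../8k?items=1.05."""
    # Booleans are lowercased so True becomes "true", matching the curl examples.
    query = urlencode(
        {key: str(value).lower() if isinstance(value, bool) else value
         for key, value in params.items()}
    )
    return f"{BASE_URL}/{endpoint}?{query}" if query else f"{BASE_URL}/{endpoint}"


def fetch_json(url: str, timeout: float = 10.0) -> Any:
    """GET the URL and decode the body as JSON (the response shape is assumed)."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    url = build_url("8k", suspected_buried_only=True)
    print(url)
    # data = fetch_json(url)  # uncomment to query the live API
```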



&lt;p&gt;For the full archive (10+ years), webhook delivery, and a 10× rate limit, there's a &lt;a href="https://filingfirehose.com/trial" rel="noopener noreferrer"&gt;free 14-day Pro trial&lt;/a&gt;, no card required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Live data
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Today's flagged filings&lt;/strong&gt;: &lt;a href="https://filingfirehose.com/today" rel="noopener noreferrer"&gt;filingfirehose.com/today&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Public leaderboard&lt;/strong&gt; of companies ranked by buried-event rate: &lt;a href="https://filingfirehose.com/leaderboard" rel="noopener noreferrer"&gt;filingfirehose.com/leaderboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-ticker pages&lt;/strong&gt;: filingfirehose.com/sec/{TICKER} for any listed US company&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Methodology piece&lt;/strong&gt;: &lt;a href="https://filingfirehose.com/research/buried-events" rel="noopener noreferrer"&gt;filingfirehose.com/research/buried-events&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP server&lt;/strong&gt; for Claude Desktop / Cursor / Windsurf: add &lt;code&gt;https://filingfirehose.com/mcp&lt;/code&gt; to your config&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ChatGPT custom GPT&lt;/strong&gt;: &lt;a href="https://chatgpt.com/g/g-6a00c51251c08191ad99bf16dc80f4b3-filingfirehose" rel="noopener noreferrer"&gt;chatgpt.com/g/g-6a00c51251c08191ad99bf16dc80f4b3-filingfirehose&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I'd love feedback on
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Other body-text patterns worth testing.&lt;/strong&gt; Going-concern language buried under generic 8-K items is on my list. What else?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The classifier's false-positive rate.&lt;/strong&gt; 12% feels acceptable for a screening tool but may be too noisy for a trading signal. Curious if anyone here has experience tuning rule-based classifiers in this regime.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Other forms worth body-text-parsing.&lt;/strong&gt; S-1 amendments, proxy statements, going-concern 10-Q sections — what would compound the most?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you're building anything in this space, happy to chat. The source on the parsing rules isn't fully open (it's the moat) but I'll share the methodology in detail with anyone serious.&lt;/p&gt;

&lt;p&gt;— Jared&lt;/p&gt;

</description>
      <category>finance</category>
      <category>python</category>
      <category>api</category>
      <category>fintech</category>
    </item>
  </channel>
</rss>
