<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Nuttee</title>
    <description>The latest articles on Forem by Nuttee (@nuttee_c5712c306d78369997).</description>
    <link>https://forem.com/nuttee_c5712c306d78369997</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1649298%2Ffa014097-d091-4d43-b7c5-738d48bece02.jpg</url>
      <title>Forem: Nuttee</title>
      <link>https://forem.com/nuttee_c5712c306d78369997</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/nuttee_c5712c306d78369997"/>
    <language>en</language>
    <item>
      <title>From Locked PDFs to Structured Data: Agent Skills to Extract Thailand's Election Policies using Gemini 3 Pro and Gemini CLI</title>
      <dc:creator>Nuttee</dc:creator>
      <pubDate>Sat, 07 Feb 2026 04:20:09 +0000</pubDate>
      <link>https://forem.com/nuttee_c5712c306d78369997/from-locked-pdfs-to-structured-data-agent-skills-to-extract-thailands-election-policies-using-2l0i</link>
      <guid>https://forem.com/nuttee_c5712c306d78369997/from-locked-pdfs-to-structured-data-agent-skills-to-extract-thailands-election-policies-using-2l0i</guid>
      <description>&lt;p&gt;&lt;em&gt;8 min read&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Built an AI skill that extracts structured data from 51 scanned Thai election PDFs (587 policies) in under 2 hours with 100% valid JSON output. Uses Gemini 3 Pro's native OCR + structured output + Pydantic schemas for guaranteed data quality.&lt;/p&gt;

&lt;p&gt;Package workflows as AI "skills" that agents execute via natural language instead of manual scripts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt; Most PDFs processed in 1-2 minutes, zero manual intervention, and support for skill-capable agents such as gemini-cli, claude-code, and cursor.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/nuttea/thailand-election-skills" rel="noopener noreferrer"&gt;GitHub Repo →&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: A Facebook Post and 51 Locked PDFs
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fik1ietzzhrg6fdr9nyjm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fik1ietzzhrg6fdr9nyjm.png" alt="Facebook Post About Election Data" width="800" height="1092"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;The post that started it all: "Election Commission released 51 party policies as scanned PDFs that can't be used. When asked for CSV: 'If you want it, write it yourself'"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I was scrolling through Facebook when I saw this post from the Thai developer community. Thailand's Election Commission (กกต.) had released policy data for the 2026 election—all 587 policies from 51 political parties—but in the worst possible format: &lt;strong&gt;scanned image PDFs&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Not text PDFs you could copy from. &lt;strong&gt;Scanned photos&lt;/strong&gt; of printed documents. Just images of tables filled with Thai text, numbers, and complex formatting.&lt;/p&gt;

&lt;p&gt;When developers asked for a usable format like CSV or JSON, the response was essentially: &lt;em&gt;"Do it yourself."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The challenge was enormous:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📄 51 scanned PDF files (no text layer)&lt;/li&gt;
&lt;li&gt;🔤 Thai language requiring OCR&lt;/li&gt;
&lt;li&gt;📊 Complex table structures with inconsistent formatting&lt;/li&gt;
&lt;li&gt;💰 Thai numerical units needing normalization (ล้าน, พันล้าน)&lt;/li&gt;
&lt;li&gt;🏷️ Unstructured categorization&lt;/li&gt;
&lt;li&gt;⏱️ Tight timeline before election discussion season&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I had three options:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Manual typing:&lt;/strong&gt; 3+ weeks of soul-crushing work with inevitable errors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Traditional OCR:&lt;/strong&gt; Days of cleanup fixing Thai character errors, then manual structuring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build an AI solution:&lt;/strong&gt; Handle OCR + extraction + validation in one automated workflow&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I chose option 3—and built something reusable for anyone facing similar challenges.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The result?&lt;/strong&gt; ✅ 51 parties extracted, 587 policies structured, 100% valid JSON—in &lt;strong&gt;under 2 hours&lt;/strong&gt; of automated processing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1qviw01tep580exbwtw.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1qviw01tep580exbwtw.jpeg" alt="Thailand Election Skills" width="800" height="597"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The Problem&lt;/li&gt;
&lt;li&gt;Skills vs Scripts&lt;/li&gt;
&lt;li&gt;Technical Deep Dive&lt;/li&gt;
&lt;li&gt;Real Results&lt;/li&gt;
&lt;li&gt;Real-World Impact&lt;/li&gt;
&lt;li&gt;Installation&lt;/li&gt;
&lt;li&gt;Challenges &amp;amp; Improvements&lt;/li&gt;
&lt;li&gt;What I Learned&lt;/li&gt;
&lt;li&gt;Try It Yourself&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  What Makes This Different: Skills, Not Scripts
&lt;/h2&gt;

&lt;p&gt;Most people would write a Python script and call it done. But I took a different approach: I packaged this as a &lt;strong&gt;skill&lt;/strong&gt; for AI agents (Gemini CLI and Claude Code).&lt;/p&gt;
&lt;h3&gt;
  
  
  What's a Skill?
&lt;/h3&gt;

&lt;p&gt;Think of a skill as a &lt;strong&gt;recipe that teaches an AI agent a complete workflow&lt;/strong&gt;. Instead of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Running commands manually&lt;/li&gt;
&lt;li&gt;Remembering script parameters&lt;/li&gt;
&lt;li&gt;Setting up environments each time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You simply tell the AI in natural language:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Extract policies from all Thai election PDFs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent handles everything—setup, execution, error recovery, validation.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt; We're shifting from "writing scripts for humans to run" to "building tools for AI agents to use." This unlocks new levels of automation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Two Parts of a Skill
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Instructions (&lt;code&gt;SKILL.md&lt;/code&gt;)&lt;/strong&gt; - The "user manual" for the AI&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Agent Instructions&lt;/span&gt;
BEFORE running scripts, execute these commands:
&lt;span class="p"&gt;1.&lt;/span&gt; Navigate to skill directory
&lt;span class="p"&gt;2.&lt;/span&gt; Create virtual environment
&lt;span class="p"&gt;3.&lt;/span&gt; Activate environment
&lt;span class="p"&gt;4.&lt;/span&gt; Install dependencies
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Tools (&lt;code&gt;scripts/&lt;/code&gt;)&lt;/strong&gt; - The actual code that does the work&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;extract_policy.py&lt;/code&gt; - Single PDF extraction with OCR + validation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;batch_extract_all.sh&lt;/code&gt; - Batch processing with retry logic&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;json_to_csv.py&lt;/code&gt; - Format conversion utilities&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;send_to_datadog.py&lt;/code&gt; - Monitoring and observability integration&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Secret Sauce: Native Thai OCR + Structured Output
&lt;/h2&gt;

&lt;p&gt;The real breakthrough came from combining two powerful capabilities:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Gemini's Native Vision for Thai OCR
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The challenge:&lt;/strong&gt; These PDFs are scanned images with no selectable text. Traditional OCR tools struggle with Thai script, including Thai numerals (๐-๙), and with complex table layouts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt; Gemini 3 Pro's native vision capabilities handle Thai OCR seamlessly. No preprocessing, no separate OCR pipeline, no error cleanup. It just works.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Structured Output with Pydantic
&lt;/h3&gt;

&lt;p&gt;Instead of hoping the LLM returns valid JSON, you define a Pydantic schema, pass it to Gemini as the response schema so the output is constrained to that shape, and then validate the result against the same schema.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before: Raw Scanned Image&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;๑. ระบบรางความเร็วสูง | ๓.๕ แสนล้าน | งบประมาณ, PPP, พันธบัตร
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;(Thai numerals, Thai units, inconsistent formatting)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After: Clean, Validated JSON&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"policy_seq"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"policy_category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"โครงสร้างพื้นฐาน"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"policy_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ระบบรางความเร็วสูง"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"budget_baht"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;350000000000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"funding_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"๑) งบประมาณแผ่นดิน&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;๒) PPP&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;๓) พันธบัตร"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"benefits"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"๑) เพิ่มการเชื่อมต่อ&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;๒) กระตุ้นเศรษฐกิจ"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;(Converted numerals, normalized budget, structured data)&lt;/em&gt;&lt;/p&gt;
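&lt;p&gt;In this pipeline Gemini performs the normalization itself. To illustrate the arithmetic involved, and as one possible deterministic cross-check on the model's answer, here is a minimal sketch. The helper &lt;code&gt;thai_budget_to_baht&lt;/code&gt; and its unit table are assumptions made for this example and are not taken from the repository.&lt;/p&gt;

```python
from decimal import Decimal

# Hypothetical helper, not part of the repository: a deterministic
# cross-check for the budget figure the model returns.
THAI_DIGITS = str.maketrans("๐๑๒๓๔๕๖๗๘๙", "0123456789")

# Thai magnitude words and their multipliers. Every compound ends in
# "ล้าน" (million), so the bare word must be checked last.
UNITS = [
    ("ล้านล้าน", 10**12),   # trillion
    ("แสนล้าน", 10**11),    # hundred billion
    ("หมื่นล้าน", 10**10),   # ten billion
    ("พันล้าน", 10**9),     # billion
    ("ร้อยล้าน", 10**8),    # hundred million
    ("สิบล้าน", 10**7),     # ten million
    ("ล้าน", 10**6),        # million
]


def thai_budget_to_baht(text):
    """Convert a string such as '๓.๕ แสนล้าน' to an integer number of Baht."""
    cleaned = text.translate(THAI_DIGITS).replace(",", "").strip()
    for word, multiplier in UNITS:
        if cleaned.endswith(word):
            number = cleaned[: -len(word)].strip()
            # Decimal avoids float rounding on values like 3.5 * 10**11
            return int(Decimal(number) * multiplier)
    return int(Decimal(cleaned))  # no unit word: treat as plain Baht


print(thai_budget_to_baht("๓.๕ แสนล้าน"))  # 350000000000
```

&lt;p&gt;Comparing this value with the &lt;code&gt;budget_baht&lt;/code&gt; field the model returns would flag unit mistakes, which are the costliest kind of error in this dataset.&lt;/p&gt;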

&lt;h3&gt;
  
  
  The Code: Pydantic + Gemini Structured Output
&lt;/h3&gt;

&lt;p&gt;Here's the core implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;google.generativeai&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="c1"&gt;# Define exact schema you want - this becomes your data contract
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Policy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Convert Thai numerals (๑,๒,๓) to Arabic (1,2,3) for sequence
&lt;/span&gt;    &lt;span class="n"&gt;policy_seq&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Policy sequence (Thai numerals → Arabic)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# AI categorizes into 15 predefined buckets based on content
&lt;/span&gt;    &lt;span class="n"&gt;policy_category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Category from predefined list&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Extract word-by-word for accuracy, preserving Thai formatting
&lt;/span&gt;    &lt;span class="n"&gt;policy_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Policy name extracted word-by-word&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Normalize: ๓.๕ แสนล้าน → 350000000000 (pure Baht integer)
&lt;/span&gt;    &lt;span class="n"&gt;budget_baht&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Budget in Baht (0 if none)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Preserve Thai numerals in lists (๑) ๒) ๓))
&lt;/span&gt;    &lt;span class="n"&gt;funding_source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Funding details, preserve Thai formatting&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cost_effectiveness&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;benefits&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;impacts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;risks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PoliticalPartyPolicies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Policy&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Configure Gemini with your schema - this is the magic
&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateContentConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Low temp for consistency
&lt;/span&gt;    &lt;span class="n"&gt;response_mime_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;response_schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PoliticalPartyPolicies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_json_schema&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;  &lt;span class="c1"&gt;# 🎯 Guaranteed structure
&lt;/span&gt;    &lt;span class="n"&gt;thinking_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ThinkingConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;thinking_level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;low&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Process scanned PDF with native OCR + structured extraction
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content_stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3-pro-preview&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="c1"&gt;# Your extraction instructions
&lt;/span&gt;        &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;extraction_instructions&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="c1"&gt;# The scanned PDF - Gemini handles OCR natively
&lt;/span&gt;        &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_bytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pdf_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;mime_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;  &lt;span class="c1"&gt;# Pass the structured output config
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Validate output - will raise error if schema doesn't match
&lt;/span&gt;&lt;span class="n"&gt;policies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PoliticalPartyPolicies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_validate_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Validated: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; policies extracted&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;Native Thai OCR&lt;/strong&gt; - Handles scanned images without preprocessing&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;No manual validation&lt;/strong&gt; - Pydantic handles schema validation automatically&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Type safety&lt;/strong&gt; - Budget is always an integer, never a string&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Thai numeral conversion&lt;/strong&gt; - The LLM converts Thai numerals (๐-๙) into machine-readable integers and floats&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Context-aware&lt;/strong&gt; - Understands Thai units and preserves formatting where needed&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Clear errors&lt;/strong&gt; - Know exactly what failed and where&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Documentation&lt;/strong&gt; - Schema serves as specification&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Usage: As Simple As Asking
&lt;/h2&gt;

&lt;p&gt;Once installed, using the skill is incredibly intuitive:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Extract a single PDF:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Extract policies from "เบอร์ 9 พรรคเพื่อไทย.pdf" using the Thailand election skill
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Batch process all PDFs:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Batch extract all Thai election PDFs in the assets folder
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Convert to CSV:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Convert party_9_policies.json to CSV format with pipe delimiter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Analyze results:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Show me all policies with budgets over 100 billion Baht
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI agent handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Environment setup (Python venv, dependencies)&lt;/li&gt;
&lt;li&gt;✅ Script execution with correct parameters&lt;/li&gt;
&lt;li&gt;✅ Error handling and automatic retries&lt;/li&gt;
&lt;li&gt;✅ Progress monitoring&lt;/li&gt;
&lt;li&gt;✅ Output validation&lt;/li&gt;
&lt;/ul&gt;
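&lt;p&gt;For a sense of what two of the prompts above boil down to, here is a minimal sketch of a pipe-delimited CSV export and a budget filter in plain Python. The function names and the second sample record are hypothetical; the repository's &lt;code&gt;json_to_csv.py&lt;/code&gt; may work differently.&lt;/p&gt;

```python
import csv
import io

# Records shaped like the JSON output shown earlier. The first row reuses
# the article's example; the second row is made up for illustration.
policies = [
    {"policy_seq": 1, "policy_name": "ระบบรางความเร็วสูง", "budget_baht": 350000000000},
    {"policy_seq": 2, "policy_name": "Example policy", "budget_baht": 5000000000},
]


def to_pipe_csv(rows):
    """Serialize a list of policy dicts as pipe-delimited CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0]), delimiter="|", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def over_budget(rows, threshold_baht):
    """Return the policies whose budget exceeds threshold_baht."""
    return [row for row in rows if row["budget_baht"] > threshold_baht]


print(to_pipe_csv(policies))
print([p["policy_name"] for p in over_budget(policies, 100 * 10**9)])
```

&lt;p&gt;A pipe delimiter is a sensible choice here because the Thai free-text fields often contain commas, and the &lt;code&gt;csv&lt;/code&gt; module still quotes any field that contains a newline.&lt;/p&gt;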




&lt;h2&gt;
  
  
  Real Results: Battle-Tested at Scale
&lt;/h2&gt;

&lt;p&gt;This isn't a toy project. Here are the real metrics from extracting Thailand's 2026 election data:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total parties&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;51 (100% success)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total policies&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;587 extracted&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data quality&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;100% valid JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Processing time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Under 2 hours total&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Typical time per PDF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1-2 minutes (59% of files)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Automatic retries&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Handled seamlessly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Manual intervention&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Zero&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Processing Time Distribution
&lt;/h3&gt;

&lt;p&gt;All 51 PDFs were processed in under 5 minutes each. The breakdown below accounts for 37 files, and its percentages are relative to that total:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Time Range&lt;/th&gt;
&lt;th&gt;Files&lt;/th&gt;
&lt;th&gt;Avg Time&lt;/th&gt;
&lt;th&gt;Percentage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0-1 min&lt;/td&gt;
&lt;td&gt;4 files&lt;/td&gt;
&lt;td&gt;47 sec&lt;/td&gt;
&lt;td&gt;11%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1-2 min&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;22 files&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;86 sec&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;59%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2-3 min&lt;/td&gt;
&lt;td&gt;7 files&lt;/td&gt;
&lt;td&gt;138 sec&lt;/td&gt;
&lt;td&gt;19%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3-4 min&lt;/td&gt;
&lt;td&gt;2 files&lt;/td&gt;
&lt;td&gt;192 sec&lt;/td&gt;
&lt;td&gt;5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4-5 min&lt;/td&gt;
&lt;td&gt;2 files&lt;/td&gt;
&lt;td&gt;282 sec&lt;/td&gt;
&lt;td&gt;5%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Most common processing time:&lt;/strong&gt; 1-2 minutes (59% of files)&lt;/p&gt;
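&lt;p&gt;The figures in the table can be checked with a few lines of arithmetic. This sketch only reuses the numbers shown above; the variable names are my own.&lt;/p&gt;

```python
# (time range, files, average seconds) copied from the table above
buckets = [
    ("0-1 min", 4, 47),
    ("1-2 min", 22, 86),
    ("2-3 min", 7, 138),
    ("3-4 min", 2, 192),
    ("4-5 min", 2, 282),
]

total_files = sum(files for _, files, _ in buckets)
total_seconds = sum(files * avg for _, files, avg in buckets)
shares = [round(100 * files / total_files) for _, files, _ in buckets]
mean_seconds = total_seconds / total_files

print(total_files)          # 37
print(shares)               # [11, 59, 19, 5, 5]
print(round(mean_seconds))  # 108
```

&lt;p&gt;The weighted mean of about 108 seconds sits above the 86-second average of the most common bucket, because the handful of slower files pull it upward.&lt;/p&gt;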

&lt;blockquote&gt;
&lt;p&gt;"Most PDFs processed in 90 seconds. That's faster than I could even open the file and copy-paste a single table manually."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This efficiency makes the solution practical for real-world use—you can extract data from dozens of documents in the time it takes to get a coffee.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Special Challenges Solved:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔍 &lt;strong&gt;Thai OCR from scanned images&lt;/strong&gt; - 100% success rate, no preprocessing needed&lt;/li&gt;
&lt;li&gt;📄 &lt;strong&gt;Thai numerals&lt;/strong&gt; - Auto-convert ๐-๙ to 0-9 in sequence fields, preserve in content&lt;/li&gt;
&lt;li&gt;💰 &lt;strong&gt;Budget normalization&lt;/strong&gt; - ๓.๕ แสนล้าน (3.5 hundred billion) → 350,000,000,000 Baht&lt;/li&gt;
&lt;li&gt;🏷️ &lt;strong&gt;Smart categorization&lt;/strong&gt; - 587 policies across 15 categories with context awareness&lt;/li&gt;
&lt;li&gt;🔄 &lt;strong&gt;Retry logic&lt;/strong&gt; - Automatically handles incomplete responses (up to 3 attempts)&lt;/li&gt;
&lt;li&gt;⏱️ &lt;strong&gt;Stream timeout detection&lt;/strong&gt; - Monitors chunk timing, auto-retries on stalls&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;CSV consolidation&lt;/strong&gt; - Automatic merging across all parties with pipe delimiter&lt;/li&gt;
&lt;/ul&gt;
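&lt;p&gt;The retry behaviour can be sketched in a few lines. This is an illustrative Python version under my own assumptions, not the logic in &lt;code&gt;batch_extract_all.sh&lt;/code&gt;: &lt;code&gt;extract_with_retry&lt;/code&gt; and its parameters are hypothetical, and the fake responses stand in for real model calls.&lt;/p&gt;

```python
import json
import time


def extract_with_retry(extract, max_attempts=3, delay_seconds=0.0):
    """Call extract() until it returns parseable JSON, up to max_attempts times."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return json.loads(extract())
        except json.JSONDecodeError as error:  # truncated or incomplete response
            last_error = error
            time.sleep(delay_seconds * attempt)  # simple linear backoff
    raise RuntimeError(f"no valid JSON after {max_attempts} attempts") from last_error


# Simulate one truncated response followed by a complete one.
responses = iter(['{"policies": [', '{"policies": []}'])
result = extract_with_retry(lambda: next(responses))
print(result)  # {'policies': []}
```

&lt;p&gt;Treating a JSON parse failure as the retry signal works well with streamed output, since a stalled or cut-off stream almost always leaves the JSON unterminated.&lt;/p&gt;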




&lt;h2&gt;
  
  
  Extracted Data in Action: Wevis Policy Comparison
&lt;/h2&gt;

&lt;p&gt;The extracted data isn't just sitting in JSON files—it's already powering real civic tech applications. &lt;a href="https://election69.wevis.info/promisedeconstructed/" rel="noopener noreferrer"&gt;Wevis&lt;/a&gt;, a Thai civic tech organization, used this structured data to build an interactive policy comparison tool for the 2026 election.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqo7tu1klsos9ndyz1ezf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqo7tu1klsos9ndyz1ezf.png" alt="Wevis Policy Comparison" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Wevis's interactive policy comparison website using the extracted election data&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Wevis Built:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📊 Interactive comparison across all 51 political parties&lt;/li&gt;
&lt;li&gt;🔍 Select and filter by policy categories&lt;/li&gt;
&lt;li&gt;💰 Budget analysis and visualization&lt;/li&gt;
&lt;li&gt;📱 Mobile-friendly interface for voters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why This Matters:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This demonstrates the real-world impact of making locked government data accessible. What started as 51 scanned PDFs and a "write it yourself" response is now an interactive tool that helps millions of Thai voters make informed decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check it out:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🌐 &lt;a href="https://election69.wevis.info/promisedeconstructed/" rel="noopener noreferrer"&gt;Wevis Policy Comparison Tool&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📱 &lt;a href="https://www.facebook.com/share/p/18KN47rJ7A/" rel="noopener noreferrer"&gt;Facebook Announcement&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;"From locked PDFs to civic engagement tools—this is why open data automation matters."&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  How to Install and Use
&lt;/h2&gt;

&lt;p&gt;The skill is open source and ready to use:&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 1: For Existing Agent Users (Recommended)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add nuttea/thailand-election-skills &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--skill&lt;/span&gt; extract-thailand-election-policies &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--agent&lt;/span&gt; gemini-cli &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--agent&lt;/span&gt; claude-code &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--yes&lt;/span&gt;

&lt;span class="c"&gt;# Edit .env and add your GEMINI_API_KEY&lt;/span&gt;
&lt;span class="c"&gt;# Start your agent, e.g. 'gemini' or 'claude'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option 2: Clone and Use Locally
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/nuttea/thailand-election-skills.git
&lt;span class="nb"&gt;cd &lt;/span&gt;thailand-election-skills

&lt;span class="c"&gt;# Set up Gemini API key&lt;/span&gt;
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
&lt;span class="c"&gt;# Edit .env and add your GEMINI_API_KEY&lt;/span&gt;

&lt;span class="c"&gt;# Use with Claude or Gemini CLI&lt;/span&gt;
&lt;span class="c"&gt;# The agent handles the rest!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;That's it!&lt;/strong&gt; The AI agent will handle Python environment setup, dependency installation, and script execution.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. AI Skills &amp;gt; Scripts&lt;/strong&gt;&lt;br&gt;
Package your workflows as reusable skills that agents can execute with natural language.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Structured Output is Essential&lt;/strong&gt;&lt;br&gt;
Use Pydantic schemas with Gemini's structured output for guaranteed valid data.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The difference between hoping for valid JSON and guaranteeing it is the difference between a prototype and production."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;3. Agent-First Development&lt;/strong&gt;&lt;br&gt;
Design tools for AI agents to use, not just for manual execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Automation at Scale&lt;/strong&gt;&lt;br&gt;
What would take weeks manually can be done in hours with proper AI tooling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Make It Reusable&lt;/strong&gt;&lt;br&gt;
This skill pattern works for invoices, research papers, financial reports, contracts—any structured document extraction.&lt;/p&gt;


&lt;h2&gt;
  
  
  Adapting This for Your Use Case
&lt;/h2&gt;

&lt;p&gt;This approach isn't limited to election policies. You can adapt it for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📄 &lt;strong&gt;Invoice Processing&lt;/strong&gt; - Extract line items, totals, dates, vendor info&lt;/li&gt;
&lt;li&gt;📚 &lt;strong&gt;Research Papers&lt;/strong&gt; - Extract abstracts, citations, methodology, results&lt;/li&gt;
&lt;li&gt;💼 &lt;strong&gt;Contracts&lt;/strong&gt; - Extract clauses, dates, parties, obligations, terms&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;Financial Reports&lt;/strong&gt; - Extract metrics, tables, summaries, KPIs&lt;/li&gt;
&lt;li&gt;🏥 &lt;strong&gt;Medical Records&lt;/strong&gt; - Extract diagnoses, prescriptions, vitals (with proper HIPAA compliance)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The pattern is the same:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Define your Pydantic schema (your data contract)&lt;/li&gt;
&lt;li&gt;Configure Gemini with structured output&lt;/li&gt;
&lt;li&gt;Package as a skill with clear agent instructions&lt;/li&gt;
&lt;li&gt;Let agents do the work via natural language&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  Challenges and Future Improvements
&lt;/h2&gt;

&lt;p&gt;While this solution produced schema-valid output for all 51 parties, valid structure does not guarantee accurate content, so it's important to acknowledge the real-world challenges and areas for improvement:&lt;/p&gt;
&lt;h3&gt;
  
  
  Current Limitations
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Non-Deterministic Extraction&lt;/strong&gt;&lt;br&gt;
LLM extraction is not deterministic. Running the same PDF twice can produce slightly different results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wording variations in descriptions&lt;/li&gt;
&lt;li&gt;Occasional budget calculation differences&lt;/li&gt;
&lt;li&gt;Inconsistent categorization on edge cases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Quality Depends on Source Documents&lt;/strong&gt;&lt;br&gt;
Low-quality or inconsistent-resolution scanned documents present challenges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Blurry text can lead to OCR errors&lt;/li&gt;
&lt;li&gt;Inconsistent table formatting requires manual verification&lt;/li&gt;
&lt;li&gt;Handwritten annotations may be misinterpreted&lt;/li&gt;
&lt;li&gt;Some PDFs required careful spot-checking&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Recommended Next Steps
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Build a Test Dataset for Automated Evaluation&lt;/strong&gt;&lt;br&gt;
Create a ground-truth dataset with known values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;test_cases&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;party_9&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected_policies&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected_total_budget&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;450000000000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sample_policy_names&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ระบบรางความเร็วสูง&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This enables automated validation: does the extraction match the expected policy count and total budget, and does it include the sample policy names?&lt;/p&gt;
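&lt;p&gt;As a minimal sketch of that check, the function below compares one party's extracted policies against its ground-truth entry. The &lt;code&gt;Policy&lt;/code&gt; dataclass is only a stand-in for the real Pydantic model, and &lt;code&gt;evaluate_extraction&lt;/code&gt; and its 1% &lt;code&gt;budget_tolerance&lt;/code&gt; are assumptions made for illustration, not part of the published skill.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Stand-in for the extraction model; only the fields used here."""
    policy_name: str
    budget_baht: int


def evaluate_extraction(policies, expected, budget_tolerance=0.01):
    """Compare one party's extracted policies with its ground-truth entry."""
    total_budget = sum(p.budget_baht for p in policies)
    extracted_names = {p.policy_name for p in policies}
    expected_budget = expected["expected_total_budget"]
    # Allow a small relative gap so rounding in the source PDF does not fail the check.
    budget_gap = abs(total_budget - expected_budget)
    return {
        "policy_count_matches": len(policies) == expected["expected_policies"],
        "total_budget_matches": budget_tolerance * expected_budget >= budget_gap,
        "missing_sample_names": sorted(
            set(expected["sample_policy_names"]) - extracted_names
        ),
    }


# Hypothetical ground truth and extraction result for a single party.
expected = {
    "expected_policies": 2,
    "expected_total_budget": 300,
    "sample_policy_names": ["ระบบรางความเร็วสูง"],
}
extracted = [Policy("ระบบรางความเร็วสูง", 100), Policy("Policy B", 200)]
report = evaluate_extraction(extracted, expected)
```

&lt;p&gt;Returning a dictionary of named results, rather than a single pass/fail flag, makes it easier to see which expectation a given PDF missed.&lt;/p&gt;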

&lt;p&gt;&lt;strong&gt;2. Run Systematic Experiments&lt;/strong&gt;&lt;br&gt;
Compare different approaches with measurable metrics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Models&lt;/strong&gt;: Gemini 3 Pro vs. Gemini 4 vs. GPT-4 Vision&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parameters&lt;/strong&gt;: Temperature (0.3 vs 0.5 vs 0.7), thinking levels, max tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompts&lt;/strong&gt;: Different instruction styles, few-shot examples, chain-of-thought&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Track Key Metrics&lt;/strong&gt;&lt;br&gt;
Build dashboards to monitor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost&lt;/strong&gt;: Token usage per PDF, per policy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt;: Processing time per page, per table row&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accuracy&lt;/strong&gt;: Match rate against ground truth test set&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt;: Variance across multiple runs of the same PDF&lt;/li&gt;
&lt;/ul&gt;
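&lt;p&gt;The consistency metric is the easiest of these to prototype. The sketch below assumes you have already run the same PDF several times and recorded one number per run (for example the policy count or the total budget); &lt;code&gt;consistency_report&lt;/code&gt; is a hypothetical helper, not something shipped with the skill.&lt;/p&gt;

```python
from statistics import mean, pstdev


def consistency_report(run_values):
    """Summarize how much one numeric result varies across repeated runs of a PDF."""
    avg = mean(run_values)
    spread = pstdev(run_values)
    return {
        "runs": len(run_values),
        "mean": avg,
        "stdev": spread,
        # Coefficient of variation: 0.0 means every run agreed exactly.
        "cv": spread / avg if avg else 0.0,
        "all_identical": len(set(run_values)) == 1,
    }


# Hypothetical policy counts from three runs of the same PDF.
stable = consistency_report([12, 12, 12])
unstable = consistency_report([10, 12, 14])
```

&lt;p&gt;A PDF whose runs disagree is a natural candidate for manual review, even if each individual run passes schema validation.&lt;/p&gt;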

&lt;p&gt;&lt;strong&gt;4. Implement Confidence Scores&lt;/strong&gt;&lt;br&gt;
Add validation checks to flag suspicious extractions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_extraction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pdf_metadata&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Return confidence checks for extracted data&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;checks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;policy_count_reasonable&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;budgets_in_range&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;budget_baht&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mf"&gt;1e13&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;has_required_fields&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;policy_name&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;policy_category&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;no_duplicate_sequences&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;policy_seq&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;checks&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
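&lt;p&gt;The function above returns a dictionary of pass/fail checks rather than a numeric score. One simple way to turn it into a score, sketched here as an assumption and not as the skill's actual behavior, is to take the fraction of checks that passed. &lt;code&gt;confidence_score&lt;/code&gt; and &lt;code&gt;failed_checks&lt;/code&gt; are hypothetical helper names.&lt;/p&gt;

```python
def confidence_score(checks):
    """Fraction of validation checks that passed, from 0.0 to 1.0."""
    if not checks:
        return 0.0
    return sum(1 for passed in checks.values() if passed) / len(checks)


def failed_checks(checks):
    """Names of the checks that did not pass, useful for a review log."""
    return [name for name, passed in checks.items() if not passed]


# Example result in the same shape that validate_extraction returns.
checks = {
    "policy_count_reasonable": True,
    "budgets_in_range": True,
    "has_required_fields": False,
    "no_duplicate_sequences": True,
}
score = confidence_score(checks)
problems = failed_checks(checks)
```

&lt;p&gt;Equal weighting is the simplest choice. If some checks matter more than others (a missing required field is probably worse than an unusual policy count), a weighted sum would be a reasonable next step.&lt;/p&gt;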



&lt;p&gt;&lt;strong&gt;5. Create a Human-in-the-Loop Review Process&lt;/strong&gt;&lt;br&gt;
For production use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Flag extractions with low confidence scores for review&lt;/li&gt;
&lt;li&gt;Sample random extractions (e.g., 10%) for spot-checking&lt;/li&gt;
&lt;li&gt;Track and learn from manual corrections&lt;/li&gt;
&lt;li&gt;Build a feedback loop to improve prompts over time&lt;/li&gt;
&lt;/ul&gt;
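&lt;p&gt;The first two bullets can be combined into one selection step. This sketch assumes each party already has a confidence score between 0 and 1; the function name, the 0.75 threshold, and the party IDs are illustrative assumptions, and the 10% sample rate follows the figure suggested in the list.&lt;/p&gt;

```python
import random


def select_for_review(scores, threshold=0.75, sample_rate=0.10, seed=None):
    """Pick IDs for human review: every low scorer plus a random sample of the rest."""
    rng = random.Random(seed)
    flagged = sorted(pid for pid, score in scores.items() if threshold > score)
    remaining = sorted(pid for pid in scores if pid not in flagged)
    sample_size = 0
    if remaining:
        # Always spot-check at least one extraction that passed every check.
        sample_size = min(len(remaining), max(1, round(len(remaining) * sample_rate)))
    sampled = sorted(rng.sample(remaining, sample_size))
    return {"flagged": flagged, "sampled": sampled}


# Hypothetical scores for 20 parties, one of which failed half of its checks.
scores = {f"party_{i}": 1.0 for i in range(1, 21)}
scores["party_3"] = 0.5
queue = select_for_review(scores, seed=0)
```

&lt;p&gt;Sampling from the extractions that passed matters because the automated checks only catch structural problems. A policy with a plausible but wrong budget would pass every check and only surface in a spot-check.&lt;/p&gt;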

&lt;p&gt;&lt;strong&gt;6. Implement LLM Observability&lt;/strong&gt;&lt;br&gt;
Monitor and optimize your extraction pipeline in production:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key questions to answer:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Quality&lt;/strong&gt;: How accurate are extractions across different document types?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost&lt;/strong&gt;: What's the token usage per PDF? Per policy extracted?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt;: Why do some PDFs take 5 minutes while others take 1 minute?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failures&lt;/strong&gt;: What patterns lead to extraction errors?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Observability Tools:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Datadog LLMObs&lt;/strong&gt;: Track latency, costs, and quality metrics per extraction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom Dashboards&lt;/strong&gt;: Visualize processing time distribution and error rates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A/B Testing&lt;/strong&gt;: Compare different models (Gemini 3 vs 4, GPT-4 Vision)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Analysis&lt;/strong&gt;: Monitor token usage trends and optimize prompts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frwngkr9ewi9yz6neogfp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frwngkr9ewi9yz6neogfp.png" alt="LLMObs LLM Traces" width="800" height="488"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59a3rravm3syickfw6ua.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59a3rravm3syickfw6ua.png" alt="LLMObs Experiments" width="800" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real Example:&lt;/strong&gt;&lt;br&gt;
In my next post, I'll show how Datadog LLMObs revealed that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PDFs with complex tables took 3-5 minutes vs. simple ones at 1-2 minutes&lt;/li&gt;
&lt;li&gt;Certain table layouts caused 30% more retries&lt;/li&gt;
&lt;li&gt;Optimizing prompts reduced average processing time by 40%&lt;/li&gt;
&lt;li&gt;Token costs varied 10x between smallest and largest documents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This data-driven approach helps you make informed decisions about model selection, prompt engineering, and infrastructure costs.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Reality Check
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;This solution is production-ready for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Rapid prototyping and initial data extraction&lt;/li&gt;
&lt;li&gt;✅ Cases where 95-98% accuracy is acceptable&lt;/li&gt;
&lt;li&gt;✅ Projects with some budget for spot-checking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;It needs more work for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;⚠️ Mission-critical financial calculations requiring 100% accuracy&lt;/li&gt;
&lt;li&gt;⚠️ Legal document extraction with no tolerance for errors&lt;/li&gt;
&lt;li&gt;⚠️ High-volume production without human review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key is knowing your accuracy requirements and building appropriate validation layers.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Gemini's Native Vision is Underrated&lt;/strong&gt;&lt;br&gt;
No need for a separate OCR pipeline: Gemini handles scanned Thai documents natively. This eliminated an entire preprocessing step and a potential source of errors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Structured Output Changes Everything&lt;/strong&gt;&lt;br&gt;
Going from "hope the JSON is valid" to "guaranteed valid JSON" transforms a prototype into production-ready code. Pydantic + Gemini structured output is a game-changer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Agent-First Design is the Future&lt;/strong&gt;&lt;br&gt;
Building for AI agents to use (not just humans) unlocks new automation possibilities. The same skill works across Gemini CLI, Claude Code, and any agent that understands the pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Observability is Non-Negotiable&lt;/strong&gt;&lt;br&gt;
You can't optimize what you don't measure. Tracking metrics revealed 40% efficiency gains and identified which PDFs needed manual review.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Start Small, Validate Early&lt;/strong&gt;&lt;br&gt;
I processed the first 5 PDFs manually to spot-check before automating all 51. This caught prompt issues early and saved hours of rework.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GitHub Repository:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://github.com/nuttea/thailand-election-skills" rel="noopener noreferrer"&gt;https://github.com/nuttea/thailand-election-skills&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The repo includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Complete skill implementation&lt;/li&gt;
&lt;li&gt;✅ Pydantic schemas and extraction logic&lt;/li&gt;
&lt;li&gt;✅ Batch processing scripts with retry logic&lt;/li&gt;
&lt;li&gt;✅ CSV conversion utilities&lt;/li&gt;
&lt;li&gt;✅ Datadog integration (for monitoring)&lt;/li&gt;
&lt;li&gt;✅ Real example outputs from 51 parties&lt;/li&gt;
&lt;li&gt;✅ Comprehensive documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Questions or feedback?&lt;/strong&gt; Open an issue on GitHub.&lt;/p&gt;

&lt;p&gt;Happy automating! 🚀&lt;/p&gt;




</description>
      <category>gemini</category>
      <category>agentskills</category>
    </item>
  </channel>
</rss>
