<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Christian</title>
    <description>The latest articles on Forem by Christian (@zeekrey).</description>
    <link>https://forem.com/zeekrey</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3664525%2F7fbeb83d-c45d-4566-9b55-23553cd4aab3.png</url>
      <title>Forem: Christian</title>
      <link>https://forem.com/zeekrey</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/zeekrey"/>
    <language>en</language>
    <item>
      <title>Building a "Text-to-GIS" Engine with SvelteKit, PostGIS and Open-Source LLMs</title>
      <dc:creator>Christian</dc:creator>
      <pubDate>Thu, 18 Dec 2025 09:12:13 +0000</pubDate>
      <link>https://forem.com/zeekrey/building-a-text-to-gis-engine-with-sveltekit-postgis-and-open-source-llms-57bo</link>
      <guid>https://forem.com/zeekrey/building-a-text-to-gis-engine-with-sveltekit-postgis-and-open-source-llms-57bo</guid>
      <description>&lt;p&gt;A few days ago, I was grabbing coffee with a data scientist friend who works at a local energy provider here in Germany. He looked exhausted.&lt;/p&gt;

&lt;p&gt;When I asked what was going on, he said:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I’m basically a human SQL converter. Marketing, Sales, Strategy… they all send vague questions like ‘Where are our high-consumption customers?’ and I spend all day writing complex PostGIS queries to get them CSVs.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I asked the obvious follow-up:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Why don’t you just build a chatbot so they can ask the database directly?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;He laughed.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Legal would shut it down immediately. We can’t send our internal schema or customer data to some hyperscaler. And IT won’t give us budget for a massive on-prem GPU cluster just to run a prototype.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That stuck with me.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsezcc2956jlzb15sgirc.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsezcc2956jlzb15sgirc.gif" alt="Thinking Math GIF" width="480" height="206"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The key idea
&lt;/h2&gt;

&lt;p&gt;You don’t need a huge commercial LLM to write SQL.&lt;/p&gt;

&lt;p&gt;And you definitely don’t need to ship your schema (or your data) to the US to get value from “AI in the business”.&lt;/p&gt;

&lt;p&gt;So I built a small &lt;strong&gt;Text-to-SQL + PostGIS&lt;/strong&gt; demo using &lt;strong&gt;open data&lt;/strong&gt; from my hometown (Leipzig) to prove the concept end-to-end.&lt;/p&gt;

&lt;p&gt;The goal: a user types something like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Show me all parks in Leipzig larger than 5 hectares.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;…and the system returns a &lt;strong&gt;GeoJSON layer&lt;/strong&gt; that can be rendered on a map.&lt;/p&gt;

&lt;p&gt;What the system needs to do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify the right dataset (“parks”) → &lt;em&gt;table/layer relevance&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Understand location context (“Leipzig”) → &lt;em&gt;spatial context&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Translate constraints (“&amp;gt; 5 hectares”) → &lt;em&gt;math + filtering&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Output something we can draw → &lt;em&gt;GeoJSON&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;
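
&lt;p&gt;To make that concrete, here is the kind of query the system ultimately has to produce for the parks question. (This is an illustrative sketch of mine, not the model's verbatim output; the layer name and area threshold are assumptions, and 5 hectares = 50,000 m².)&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT properties, ST_AsGeoJSON(geometry) as geojson
FROM city_objects
WHERE layer_name = 'SK_Vegetationsmerkmal_f'
  AND ST_Area(geometry::geography) &amp;gt; 50000;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;ST_Area&lt;/code&gt; on a &lt;code&gt;geography&lt;/code&gt; cast returns square meters, so "larger than 5 hectares" becomes a plain numeric filter.&lt;/p&gt;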

&lt;h2&gt;
  
  
  Tech stack
&lt;/h2&gt;

&lt;p&gt;Here’s what I used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SvelteKit&lt;/strong&gt; for the web app
&lt;a href="https://svelte.dev/docs/kit/introduction" rel="noopener noreferrer"&gt;https://svelte.dev/docs/kit/introduction&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL + PostGIS&lt;/strong&gt; for spatial queries
&lt;a href="https://postgis.net/" rel="noopener noreferrer"&gt;https://postgis.net/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;A hosted &lt;strong&gt;open-weights code model&lt;/strong&gt;: &lt;code&gt;qwen3-coder:30b&lt;/code&gt;
&lt;a href="https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct" rel="noopener noreferrer"&gt;https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vercel AI SDK&lt;/strong&gt; to wire LLM calls in SvelteKit
&lt;a href="https://ai-sdk.dev/docs/getting-started/svelte" rel="noopener noreferrer"&gt;https://ai-sdk.dev/docs/getting-started/svelte&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MapLibre GL&lt;/strong&gt; to render the result on a map
&lt;a href="https://maplibre.org/maplibre-gl-js/docs/" rel="noopener noreferrer"&gt;https://maplibre.org/maplibre-gl-js/docs/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why &lt;code&gt;qwen3-coder:30b&lt;/code&gt;? For Text-to-SQL you want a model that’s extremely good at syntax and structured output. This one performs very well in coding tasks &lt;em&gt;and&lt;/em&gt; it’s open weights—so you can run/host it in Germany without relying on a US API.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real “secret”: schema context (and rules)
&lt;/h2&gt;

&lt;p&gt;The hardest part wasn’t the code. It was the prompt.&lt;/p&gt;

&lt;p&gt;If you just ask an LLM “write SQL”, it will confidently hallucinate tables and columns that don’t exist. The fix is to inject &lt;strong&gt;real schema context&lt;/strong&gt; and enforce strict rules.&lt;/p&gt;

&lt;p&gt;Here’s the system prompt I ended up with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are a PostGIS expert. Your goal is to convert natural language queries into SQL queries for a PostGIS database.

The database has a single table named "city_objects" with the following schema:
-   id: serial primary key
-   layer_name: text (The name of the dataset)
-   properties: jsonb (Attributes of the object. Keys vary by layer.)
-   geometry: geometry(Geometry, 4326)

${SCHEMA_CONTEXT}

Rules:
1. Always return a valid SQL query.
2. The query should select the GeoJSON representation of the geometry using ST_AsGeoJSON(geometry) as "geojson", and the properties column.
3. Example: SELECT properties, ST_AsGeoJSON(geometry) as geojson FROM city_objects WHERE layer_name = 'SK_Vegetationsmerkmal_f' LIMIT 10;
4. Use the "Dataset Definitions" above to identify the correct 'layer_name' and property keys (e.g., 'd_FKT', 'd_BWS') for the user's request.
5. CRITICAL: Do not guess column names. If the definition says 'd_BWF', use 'd_BWF'. Do not use 'd_FKT' unless listed for that layer.
6. CRITICAL: Use ILIKE with wildcards for text filtering to match partial strings. Example: properties-&amp;gt;&amp;gt;'d_FKT' ILIKE '%Schwimmbad%'. Do not use = for text descriptions.
7. If the user asks for a specific location or spatial relationship, use PostGIS functions like ST_Contains, ST_DWithin, etc.
8. Do not include markdown formatting (e.g. a ```sql fence) in the output, just the raw SQL.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why &lt;code&gt;${SCHEMA_CONTEXT}&lt;/code&gt; matters
&lt;/h3&gt;

&lt;p&gt;Instead of hardcoding dataset definitions in the prompt, I generate them automatically.&lt;/p&gt;

&lt;p&gt;At build time, a script crawls the database, discovers available layers + JSON property keys, and injects them into &lt;code&gt;${SCHEMA_CONTEXT}&lt;/code&gt;. That single change massively improved correctness because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The model stops inventing columns&lt;/li&gt;
&lt;li&gt;It reliably picks the correct &lt;code&gt;layer_name&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Filtering uses real JSON keys that exist for that layer&lt;/li&gt;
&lt;/ul&gt;
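
&lt;p&gt;The crawler itself boils down to one query (&lt;code&gt;layer_name&lt;/code&gt; plus &lt;code&gt;jsonb_object_keys(properties)&lt;/code&gt;, grouped by layer) and a formatting step. A minimal sketch of the formatting step (the function and row shape are my own names, not the repo's):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Sketch: turn crawled rows of { layer_name, keys } into the
// ${SCHEMA_CONTEXT} block injected into the system prompt.
// The rows would come from something like:
//   SELECT layer_name, array_agg(DISTINCT k) AS keys
//   FROM city_objects, LATERAL jsonb_object_keys(properties) AS k
//   GROUP BY layer_name;
function buildSchemaContext(rows) {
  const lines = rows.map(function (row) {
    return `- ${row.layer_name}: property keys ${row.keys.join(", ")}`;
  });
  return `Dataset Definitions:\n${lines.join("\n")}`;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;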

&lt;p&gt;Also: explicit rules help. I keep updating them as I see failure modes in real prompts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model choice: small vs. “big enough”
&lt;/h2&gt;

&lt;p&gt;I started with smaller models like &lt;code&gt;qwen3:4b&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Sometimes it worked—but often the SQL was wrong, incomplete, or the model just didn’t return anything usable. In the end, &lt;code&gt;qwen3-coder:30b&lt;/code&gt; was the sweet spot for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;consistent SQL syntax&lt;/li&gt;
&lt;li&gt;fewer hallucinations (with schema context)&lt;/li&gt;
&lt;li&gt;better handling of multi-step spatial logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m currently building &lt;a href="https://supa.works/?utm_source=devto&amp;amp;utm_medium=tutorial" rel="noopener noreferrer"&gt;SUPA&lt;/a&gt;, so I used a hosted setup for inference. (There are plenty of affordable hosted options out there if you don’t want to operate GPUs yourself.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation notes (SvelteKit + AI SDK)
&lt;/h2&gt;

&lt;p&gt;To bootstrap the app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx sv create my-ai-app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then follow the SvelteKit guide in the AI SDK docs to wire up an API route that calls your model provider:&lt;br&gt;
&lt;a href="https://ai-sdk.dev/docs/getting-started/svelte" rel="noopener noreferrer"&gt;https://ai-sdk.dev/docs/getting-started/svelte&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Provider options
&lt;/h3&gt;

&lt;p&gt;If you want to run locally with a small model, you can use Ollama via the community provider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://ai-sdk.dev/providers/community-providers/ollama" rel="noopener noreferrer"&gt;https://ai-sdk.dev/providers/community-providers/ollama&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to call a hosted endpoint (SUPA in my case) using the OpenAI-compatible provider, you can set a custom base URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createOpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SUPA_API_TOKEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;BASE_HREF&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Flow-wise it’s simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User prompt → LLM returns SQL (raw SQL only)&lt;/li&gt;
&lt;li&gt;Backend executes SQL against PostGIS&lt;/li&gt;
&lt;li&gt;Response returns &lt;code&gt;properties&lt;/code&gt; + &lt;code&gt;geojson&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Frontend renders it with MapLibre&lt;/li&gt;
&lt;/ul&gt;
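
&lt;p&gt;One caveat on the "backend executes SQL" step: even with rule 8, models occasionally wrap the output in markdown fences, and running generated SQL verbatim is risky. A defensive sketch (my own addition, not the repo's code) that strips fences and only lets plain &lt;code&gt;SELECT&lt;/code&gt; statements through:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Defensive cleanup before handing the model's output to PostGIS.
// Sketch only: strips accidental ```sql fences and rejects anything
// that is not a plain read-only SELECT statement.
function sanitizeGeneratedSql(raw) {
  let sql = raw.replace(/```(?:sql)?/gi, "").trim();
  sql = sql.replace(/;\s*$/, "");
  if (!/^select\b/i.test(sql)) {
    throw new Error("Refusing to run a non-SELECT statement");
  }
  return sql;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;(Executing the query under a read-only database role is the sturdier guard; the string check is just a seatbelt.)&lt;/p&gt;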

&lt;p&gt;I won’t go deep into MapLibre rendering here, but the GeoJSON output makes it straightforward.&lt;/p&gt;
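
&lt;p&gt;The only glue the frontend needs is assembling the per-row &lt;code&gt;geojson&lt;/code&gt; strings into a FeatureCollection that MapLibre's &lt;code&gt;addSource&lt;/code&gt; accepts. A sketch (the function name is mine):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Each row holds properties (an object) and geojson (a string from ST_AsGeoJSON).
function toFeatureCollection(rows) {
  const features = rows.map(function (row) {
    return {
      type: "Feature",
      geometry: JSON.parse(row.geojson),
      properties: row.properties,
    };
  });
  return { type: "FeatureCollection", features: features };
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The result can be passed straight to &lt;code&gt;map.addSource("results", { type: "geojson", data: toFeatureCollection(rows) })&lt;/code&gt; plus a fill layer.&lt;/p&gt;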

&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;p&gt;For a lot of business problems, you don’t need a frontier “foundation model” to ship something useful.&lt;/p&gt;

&lt;p&gt;Foundation models are amazing, and they keep getting better every week—but for workflows like “turn vague questions into database queries”, you can get very far with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a capable coding model&lt;/li&gt;
&lt;li&gt;strict output rules&lt;/li&gt;
&lt;li&gt;real schema injection&lt;/li&gt;
&lt;li&gt;and a setup that keeps data where it must stay (e.g., Germany/EU)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvgoa7qf5xk4s22sq7bil.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvgoa7qf5xk4s22sq7bil.gif" alt="So Close Atp GIF by Tennis TV" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Live demo: &lt;a href="https://gis.examples.supa.works" rel="noopener noreferrer"&gt;https://gis.examples.supa.works&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/supaworks/field-guide/tree/main/examples/natural-language-gis" rel="noopener noreferrer"&gt;https://github.com/supaworks/field-guide/tree/main/examples/natural-language-gis&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>gis</category>
      <category>svelte</category>
    </item>
  </channel>
</rss>
