DEV Community

Shrijith Venkatramana
Shrijith Venkatramana

Posted on

3 1 1 1 1

Unleashing AI to Hunt Down Database Code Leaks in Go

Hi there! I'm Shrijith Venkatrama, founder of Hexmos. Right now, I’m building LiveAPI, a first of its kind tool for helping you automatically index API endpoints across all your repositories. LiveAPI helps you discover, understand and use APIs in large tech infrastructures with ease.

Database connection leaks in Go can silently kill your app’s performance. Open connections pile up, resources get hogged, and suddenly your app is choking under the weight of its own database calls. I built db_leaks.py, a Python script that uses AI to scan Go codebases for PostgreSQL connection issues. It’s not perfect, but it’s a solid tool to catch problems like unclosed connections, idle connection overuse, or redundant sql.Open() calls. Let’s dive into how it works, why it’s useful, and how you can use it yourself.

Why Database Leaks Matter in Go

Go’s database/sql package is powerful but unforgiving. If you forget to close a connection or mishandle a connection pool, you’re asking for trouble. Leaks can lead to memory bloat, connection timeouts, or even crashes in production. Manually reviewing code for these issues is tedious, especially in large codebases. That’s where db_leaks.py comes in—it automates the process by leveraging AI (specifically Google’s Gemini model) to analyze your Go code for common database connection pitfalls.

Key problems it targets:

  • Connections left open without defer db.Close()
  • Rows not closed with rows.Close()
  • Overuse of idle connections in pools
  • Repeated sql.Open() calls that could be consolidated

This script is designed for developers who want to catch these issues early without combing through thousands of lines of code.

How db_leaks.py Works

The script recursively scans a Go codebase, identifies .go files, and sends their contents to the Gemini API for analysis. It uses a SQLite database to track which files have been analyzed, so you don’t waste time reprocessing the same files. The results are written to a Markdown report (report.md) with detailed findings and suggestions for each file.

Here’s the high-level flow:

  1. Initialize a SQLite database to track analyzed files.
  2. Scan the codebase for .go files.
  3. For each unanalyzed file, send its code to Gemini with a tailored prompt.
  4. Append the AI’s analysis to a Markdown report.
  5. Mark the file as analyzed in the database.

The script is built to be resilient with retry logic for API calls and error handling for file reading or network issues.

Setting Up the Script

To use db_leaks.py, you need a Google Gemini API key. You can get one from Google’s AI Studio. Once you have the key, replace <GEMINI_KEY> in the script with your actual key.

Here’s what you need to run it:

  • Python 3.8+
  • Dependencies: requests, sqlite3 (built-in), pathlib
  • A Go codebase to analyze
  • Gemini API key

Install the required package:

pip install requests
Enter fullscreen mode Exit fullscreen mode

Run the script by passing the path to your codebase:

python db_leaks.py /path/to/your/codebase
Enter fullscreen mode Exit fullscreen mode

Output: A report.md file with analysis results and an analysis_state.db file to track progress.

The Prompt That Powers the Analysis

The script sends a carefully crafted prompt to Gemini to ensure it focuses on database connection issues. The prompt is specific to PostgreSQL and Go’s database/sql package, asking for:

  • A file-level summary
  • Problematic code snippets with line numbers (if possible)
  • Concrete improvement suggestions

Here’s the prompt used in the script:

# You are a Go code reviewer focused on identifying PostgreSQL connection mismanagement issues.
#
# Review the code below and look for:
# - Connections left open or idle
# - Missing defer db.Close() or rows.Close()
# - Excessive idle connections in connection pools
# - Unnecessary repeated sql.Open() calls
#
# Respond in Markdown format with:
# - File-level summary
# - Line numbers and problematic code (if possible)
# - Concrete improvement suggestions
#
# The code:
#
# {code}
Enter fullscreen mode Exit fullscreen mode

This prompt ensures Gemini provides actionable feedback in a consistent format. You can tweak it to focus on other database libraries or issues if needed.

Example: Catching a Connection Leak

Let’s look at a sample Go file and how db_leaks.py analyzes it.

Sample Go code (db_example.go):

package main

import (
    "database/sql"
    _ "github.com/lib/pq"
)

func main() {
    db, err := sql.Open("postgres", "user=admin dbname=test sslmode=disable")
    if err != nil {
        panic(err)
    }
    rows, err := db.Query("SELECT * FROM users")
    if err != nil {
        panic(err)
    }
    // No rows.Close() or db.Close()
}
Enter fullscreen mode Exit fullscreen mode

Script execution:

python db_leaks.py ./codebase
Enter fullscreen mode Exit fullscreen mode

Output in report.md:

## ./codebase/db_example.go

### Summary
The code opens a PostgreSQL connection and executes a query but does not close the database connection or query rows, leading to potential resource leaks.

### Issues
- **Line 10**: `db, err := sql.Open("postgres", "user=admin dbname=test sslmode=disable")`
  - The database connection is not closed.
  - **Suggestion**: Add `defer db.Close()` after the connection is opened.
- **Line 13**: `rows, err := db.Query("SELECT * FROM users")`
  - The query rows are not closed.
  - **Suggestion**: Add `defer rows.Close()` after the query.

### Recommendations
- Add `defer db.Close()` immediately after `sql.Open()` to ensure the connection is closed when the function exits.
- Add `defer rows.Close()` after `db.Query()` to release resources associated with the query result.
Enter fullscreen mode Exit fullscreen mode

This example shows how the script catches missing Close() calls, which are common culprits of leaks.

Tracking Progress with SQLite

The script uses a SQLite database (analysis_state.db) to avoid re-analyzing files. The database has a simple schema:

Column Type Description
path TEXT File path (primary key)
analyzed INTEGER 1 if analyzed, 0 if not
last_checked TIMESTAMP When the file was last analyzed

Code for schema initialization:

conn = sqlite3.connect(DB_FILE)
conn.execute("""
    CREATE TABLE IF NOT EXISTS files (
        path TEXT PRIMARY KEY,
        analyzed INTEGER NOT NULL DEFAULT 0,
        last_checked TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
""")
conn.commit()
conn.close()
Enter fullscreen mode Exit fullscreen mode

This ensures you can pause and resume analysis without duplicating work. If you modify a file, you’ll need to delete its entry from the database to re-analyze it.

Handling API Failures Gracefully

The script includes retry logic for Gemini API calls to handle network issues or rate limits. It uses exponential backoff (starting at 1 second, doubling each retry) and attempts up to 5 retries.

Retry logic code:

def send_prompt(prompt: str, retries: int = 5, backoff: float = 1.0) -> str:
    for attempt in range(retries):
        try:
            resp = requests.post(GEMINI_URL, json={
                "contents": [
                    {"parts": [{"text": prompt}]}
                ]
            }, timeout=30)
            if resp.ok:
                return resp.json()["candidates"][0]["content"]["parts"][0]["text"]
            else:
                print(f"Gemini error: {resp.status_code}{resp.text}")
        except Exception as e:
            print(f"Exception during Gemini call: {e}")
        time.sleep(backoff * (2 ** attempt))
    raise RuntimeError("Max retries exceeded for Gemini prompt.")
Enter fullscreen mode Exit fullscreen mode

Output on failure:

Gemini error: 429 — Too Many Requests
Exception during Gemini call: Request timed out
Enter fullscreen mode Exit fullscreen mode

This makes the script robust for production use, even with flaky network conditions.

Limitations and What’s Next

The script is powerful but has limitations:

  • It relies on Gemini’s accuracy, which isn’t perfect. False positives or missed issues can happen.
  • It only scans .go files and focuses on PostgreSQL with database/sql.
  • Large codebases may hit API rate limits, requiring pauses or a higher-tier API key.

Future improvements:

  • Support other databases (e.g., MySQL, SQLite) by tweaking the prompt.
  • Add parallel processing for faster scans.
  • Integrate with CI/CD pipelines to run automatically on code commits.

To extend the script, you could modify the prompt to include other libraries like GORM or add filters for specific file patterns.

Final Notes

db_leaks.py is a practical tool for catching database connection issues in Go codebases. It’s not a silver bullet, but it saves time and catches mistakes you might miss during manual reviews. By combining AI analysis with a lightweight SQLite tracker, it’s both efficient and developer-friendly. Try it on your codebase, tweak the prompt for your needs, and let it do the heavy lifting. If you hit issues or want to share improvements, drop a comment on Dev.to or ping me on X.

Full Code

You can find the full source code for db_leaks.py in this gist

Top comments (0)