<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Chandler</title>
    <description>The latest articles on Forem by Chandler (@chand1012).</description>
    <link>https://forem.com/chand1012</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F409814%2Ff4814a1f-6a35-45ab-b2c6-968707298848.jpeg</url>
      <title>Forem: Chandler</title>
      <link>https://forem.com/chand1012</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/chand1012"/>
    <language>en</language>
    <item>
      <title>I built a baby tracker app from my wife’s hospital room</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Mon, 16 Feb 2026 22:40:20 +0000</pubDate>
      <link>https://forem.com/chand1012/i-built-a-baby-tracker-app-from-my-wifes-hospital-room-3i24</link>
      <guid>https://forem.com/chand1012/i-built-a-baby-tracker-app-from-my-wifes-hospital-room-3i24</guid>
      <description>&lt;p&gt;“I think my water broke!” Said my wife the day before her due date. Hastily, I threw all the pre-packed bags, a pillow and blanket for myself, and the baby’s carseat into the backseat of my truck. I took her to the hospital, which to our lack of surprise, she was right! Her water did break.&lt;/p&gt;

&lt;p&gt;It was a long day, especially for her. After 18 hours of labor and then an emergency c-section at 3am, our son William was born. The day after he was born, we slept, visited with family who had made the drive to the hospital, and asked the nurses lots and lots of questions.&lt;/p&gt;

&lt;p&gt;One of the nurses asked, “Do you know what time he ate last?”&lt;/p&gt;

&lt;p&gt;We did not remember. It was quite late on his birthday at this point; she was in pain and we were both exhausted. Neither of us remembered at all.&lt;/p&gt;

&lt;p&gt;The nurse said that we should try to keep track of it. Most people keep it in their phone’s notes app, but I figured surely there was a better solution for this.&lt;/p&gt;

&lt;p&gt;But my developer brain didn’t want to search the internet or ask an LLM for options. I wanted to build.&lt;/p&gt;

&lt;p&gt;After we both slept (the nurses graciously took him for a few hours so we could catch up on sleep), the next day I got to work.&lt;/p&gt;

&lt;p&gt;Except that I really couldn’t. I didn’t bring my laptop, and while I have coded on my phone before, it’s a bit of a chore, and I wanted something working that we could both use.&lt;/p&gt;

&lt;p&gt;Then I remembered: I had set up OpenClaw the previous weekend.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Claw
&lt;/h2&gt;

&lt;p&gt;OpenClaw, for those of you who don’t know, is a self-hosted AI assistant that’s meant to be like ChatGPT or Claude, but with full access to the Linux system (or your laptop) it’s running on. It includes a vast skill library and the ability to connect basically any tool that can run on a Linux box or via an MCP server. I’ve been using it as a general assistant, replacing ChatGPT and Claude, since it has access to more tools that I find useful.&lt;/p&gt;

&lt;p&gt;So I have this AI agent that I can talk to via Telegram (and I mean literally talk - it can interpret Telegram voice messages), that has full access to a Linux server, and that I can give access to some of my cloud provider accounts. What could go wrong?&lt;/p&gt;

&lt;h2&gt;
  
  
  Surprisingly, not much
&lt;/h2&gt;

&lt;p&gt;I started from nothing - no template, no repo, only a basic idea of what tooling I wanted to use.&lt;/p&gt;

&lt;p&gt;I followed proper agentic coding best practice: planning first. I have an iPhone, but my wife has a Samsung Galaxy, so I wanted this app to be web accessible, ideally a PWA that &lt;em&gt;acts&lt;/em&gt; like a native app. I had just discovered &lt;a href="https://www.retroui.dev/" rel="noopener noreferrer"&gt;Retro UI&lt;/a&gt; and quite liked the brutalist aesthetic for an app that (for the time being) has just my wife and me as its only users. I’m also quite fond of the Cloudflare Workers and Pages platform, whose ease of development and use should make it fast for the agent to develop against and deploy to. That’s the only stack guidance I gave it: I wanted Retro UI (React implied) and I wanted the entire thing built on Cloudflare Workers and Pages. I told it to come up with a plan.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Plan
&lt;/h2&gt;

&lt;p&gt;It decided that a monorepo with a separate frontend and backend was best, tying the projects together using pnpm. The frontend uses Vite and React Router (a framework I quite like) and the aforementioned Retro UI. The backend uses Hono, which is quickly becoming the preferred backend framework for TypeScript projects, since it’s compatible with a wide range of JS/TS runtimes (Deno, Node, Bun, Cloudflare, and more). For the database it chose Cloudflare D1 since it’s available on the platform, which is &lt;em&gt;fine&lt;/em&gt;. Given the tools I told it to use, that makes perfect sense, but &lt;em&gt;how&lt;/em&gt; it executed this integration is where we have some problems. I’ll get to that later.&lt;/p&gt;
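&lt;p&gt;For a concrete picture, the proposed structure amounts to something like this (a hypothetical sketch on my part; the actual folder names may differ):&lt;/p&gt;

```
babylog/
├── frontend/              # Vite + React Router + Retro UI (Cloudflare Pages)
├── backend/               # Hono API on Cloudflare Workers, backed by D1
├── package.json
└── pnpm-workspace.yaml    # ties the two packages together
```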

&lt;h2&gt;
  
  
  Execution
&lt;/h2&gt;

&lt;p&gt;Once OpenClaw gave me a definitive plan, I gave it an API key with the permissions it would need to deploy the application, and told it to execute. After some back and forth getting it to deploy, it shipped an MVP. A surprisingly functional MVP. I created an account, added a child, and tested it out by logging the diaper we had just changed. It worked on the first try. Feedings worked on the first try as well, though I later had to ask it to allow switching between measurement units (the hospital used milliliters; most Americans use fluid ounces). I told it to store the amounts as integers in the smaller unit: it stores milliliters, converts on the fly based on the user’s preference, and rounds the ounces to the nearest tenth. There was some other back and forth for little bugs and feature requests I found here and there, but overall it was an incredibly good first attempt.&lt;/p&gt;
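&lt;p&gt;The unit handling boils down to very little code. Here’s a hedged sketch of the approach as described, not Babylog’s actual source; the function names and the 29.5735 ml-per-fluid-ounce constant are my assumptions:&lt;/p&gt;

```typescript
// Sketch of the unit handling described above (hypothetical names, not the
// app's real code): amounts are stored as integer milliliters, and fluid
// ounces are derived only for display.
const ML_PER_FL_OZ = 29.5735;

// Convert a stored milliliter amount to fluid ounces, rounded to the nearest tenth.
function mlToDisplayOz(ml: number): number {
  return Math.round((ml / ML_PER_FL_OZ) * 10) / 10;
}

// Convert a user-entered fluid ounce amount to integer milliliters for storage.
function ozToStoredMl(oz: number): number {
  return Math.round(oz * ML_PER_FL_OZ);
}
```

&lt;p&gt;Keeping the database in the smaller integer unit means no floating point drift at rest; rounding only ever happens at display time.&lt;/p&gt;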

&lt;h2&gt;
  
  
  Try it!
&lt;/h2&gt;

&lt;p&gt;I implore you to go try it out yourself, &lt;a href="https://babylog.chand1012.dev/" rel="noopener noreferrer"&gt;link here&lt;/a&gt;. It’s super simple, and my wife loves how easy it is to use.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problems
&lt;/h2&gt;

&lt;p&gt;The first big issue came up when I wanted to add naptime tracking. It’s not something I’m doing with my, at the time of writing, 12-day-old child, but it could be super useful in the future when he’s a bit older. OpenClaw noted that a bunch of the migrations were simply missing from the local files, and that we really should have a migration system. Since I didn’t have the desire to fix it, I told it to work around the problem (which it did a great job at), but the lack of a migration system will definitely be an issue in the future when we want to add more features and update existing tables without breaking anything. On top of that, while the app is written in TypeScript, there isn’t particularly strong typing for the database, due to the lack of an ORM or migration system. This isn’t an issue in all cases, but I generally prefer strong typing in my projects.&lt;/p&gt;
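&lt;p&gt;For context, the bookkeeping a migration system provides is tiny: record which migrations have already run, and only run the rest, in order. A minimal sketch (all names hypothetical, not Babylog’s schema):&lt;/p&gt;

```typescript
// Minimal migration bookkeeping sketch. In a real app the applied ids would
// live in a tracking table in D1; here they're simply passed in.
interface Migration {
  id: string;  // e.g. "0003_add_naps" - ids sort in application order
  sql: string; // the DDL to run exactly once
}

// Return the migrations that have not been applied yet, in order.
function pendingMigrations(all: Migration[], appliedIds: string[]): Migration[] {
  return all.filter((m) => !appliedIds.includes(m.id));
}
```

&lt;p&gt;An ORM like Drizzle handles this bookkeeping for you and gives you typed table access on top, which is exactly the pair of things the app was missing.&lt;/p&gt;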

&lt;p&gt;Getting the initial deployment out was also quite annoying, thanks to context poisoning. OpenClaw kept hitting the wrong API endpoint to check whether we had a valid API key, and kept refusing to use the key with the other API endpoints until I finally handed it the key alongside an example curl request, at which point it started working. This process took me well over an hour and was incredibly frustrating, since I had it consistently deploying and working in a different thread. This is what pushed me to set up proper CI/CD (technically, I just told OpenClaw to do it) so that it didn’t have to deploy anything itself, just handle builds and push to the GitHub repo that I also had it create.&lt;/p&gt;
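&lt;p&gt;For reference, a deploy-from-main pipeline along these lines can be quite small. This is a hypothetical GitHub Actions sketch, not the workflow OpenClaw actually wrote; the secret name and build steps are assumptions:&lt;/p&gt;

```yaml
# Hypothetical sketch: build on every push to main, then deploy via Wrangler.
name: deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
      - run: pnpm install --frozen-lockfile
      - run: pnpm build
      - uses: cloudflare/wrangler-action@v3
        with:
          apiToken: ${{ secrets.CLOUDFLARE_API_TOKEN }}
```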

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important note&lt;/strong&gt;: I did not follow proper security practice when developing this; I went whole-hog, no cares in the world, just get it done ASAP. Under most circumstances, you should manually set up CI and deployment yourself and &lt;strong&gt;never, ever give an LLM your API keys.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;While this absolute zero-to-one development worked surprisingly well, I still think I’m going to scaffold the template first, along with CI/CD pipelines, before I set my AI agent loose on a project. The code was fine, but between the lack of a migration system and the annoyance of getting the app properly deployed, I’d rather get that stuff working first so that development time can be even lower, spending less time on getting things working and more time iterating on features. Here’s a quick step-by-step of the Chandler Agentic Development flow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chandler’s Agentic Prototype Flow
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Think of a good problem to solve&lt;/li&gt;
&lt;li&gt;What kind of app do you want?

&lt;ol&gt;
&lt;li&gt;Hint: it’s probably a web app

&lt;ol&gt;
&lt;li&gt;You can make it a mobile app later, if you get more than yourself as a user.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;If it’s not, maybe a command line tool

&lt;ol&gt;
&lt;li&gt;This can evolve later into a local GUI based app&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;/li&gt;

&lt;li&gt;What frameworks?

&lt;ol&gt;
&lt;li&gt;If it’s a web app

&lt;ol&gt;
&lt;li&gt;If cloud hosted: React Router + Backend (&lt;a href="https://www.convex.dev/" rel="noopener noreferrer"&gt;Convex&lt;/a&gt;, &lt;a href="https://workers.cloudflare.com/" rel="noopener noreferrer"&gt;CF Workers&lt;/a&gt; + &lt;a href="https://developers.cloudflare.com/d1/" rel="noopener noreferrer"&gt;D1&lt;/a&gt; with &lt;a href="https://orm.drizzle.team/docs/get-started/d1-new" rel="noopener noreferrer"&gt;Drizzle&lt;/a&gt;, or Supabase)

&lt;ol&gt;
&lt;li&gt;If using Convex or Workers, use &lt;a href="https://clerk.dev/" rel="noopener noreferrer"&gt;Clerk&lt;/a&gt; for auth.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;If self hosted: NextJS App Router, Auth.js, SQLite via &lt;a href="https://orm.drizzle.team/docs/get-started/sqlite-new" rel="noopener noreferrer"&gt;Drizzle&lt;/a&gt;
&lt;/li&gt;

&lt;li&gt;For both: &lt;strong&gt;always use shadcn or a similar compatible component library&lt;/strong&gt;.&lt;/li&gt;

&lt;/ol&gt;

&lt;/li&gt;

&lt;li&gt;If it’s a command line app

&lt;ol&gt;
&lt;li&gt;Python for an MVP

&lt;ol&gt;
&lt;li&gt;Google’s Fire is great for a simple MVP&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;Go for a fast, easy-to-maintain CLI tool

&lt;ol&gt;
&lt;li&gt;Always use Cobra for CLI tools&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;/li&gt;

&lt;li&gt;If it’s a desktop GUI application

&lt;ol&gt;
&lt;li&gt;Tauri

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;always use shadcn or a similar compatible component library&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;/li&gt;

&lt;li&gt;If it’s a mobile app

&lt;ol&gt;
&lt;li&gt;React Native

&lt;ol&gt;
&lt;li&gt;Use Native-looking components with dark mode&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;Capacitor

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;always use shadcn or a similar compatible component library&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;/li&gt;

&lt;/ol&gt;

&lt;/li&gt;

&lt;li&gt;Scaffold the repo

&lt;ol&gt;
&lt;li&gt;Need CI/CD when deploying to main

&lt;ol&gt;
&lt;li&gt;Main is your release branch. All type checks, linting, and builds should pass.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;Need linters and type checkers via pre-commit

&lt;ol&gt;
&lt;li&gt;Type checking for &lt;strong&gt;all languages&lt;/strong&gt;, &lt;a href="https://docs.astral.sh/ty/" rel="noopener noreferrer"&gt;Python included&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;README and &lt;code&gt;CLAUDE.md&lt;/code&gt; should be well thought out and well written.

&lt;ol&gt;
&lt;li&gt;Follow &lt;a href="https://www.humanlayer.dev/blog/writing-a-good-claude-md" rel="noopener noreferrer"&gt;HumanLayer’s Claude best practices&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;For example, mention that it should follow best practices, but you don’t need to give it examples.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;A proper &lt;code&gt;.gitignore&lt;/code&gt; with as much as you may need now or later added.

&lt;ol&gt;
&lt;li&gt;My hot take: I’d rather have too much in my ignore file than not enough.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;Add initial code

&lt;ol&gt;
&lt;li&gt;Simple “hello world” page for web apps with a hello world backend&lt;/li&gt;
&lt;li&gt;Set up the ORM properly&lt;/li&gt;
&lt;li&gt;Add scaffolding needed for shadcn components&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;Initial test deployment

&lt;ol&gt;
&lt;li&gt;Do an initial test deployment to make sure that CI/CD works&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;/li&gt;

&lt;li&gt;Now let the agent do its thing

&lt;ol&gt;
&lt;li&gt;Once you have the scaffolding complete, it’s much easier for the AI agent to quickly iterate on features&lt;/li&gt;
&lt;li&gt;This saves time and token costs, especially when using high quality LLMs like Opus 4.6.&lt;/li&gt;
&lt;li&gt;Since all the hard stuff is done, you can do parallel features via apps like &lt;a href="https://www.conductor.build/" rel="noopener noreferrer"&gt;Conductor&lt;/a&gt;, &lt;a href="https://cursor.com/docs/cloud-agent" rel="noopener noreferrer"&gt;Cursor Cloud Agents,&lt;/a&gt; or &lt;a href="https://github.com/features/copilot/agents" rel="noopener noreferrer"&gt;Copilot Agents on GitHub&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;If there’s a headline here, it’s that agentic coding is already good enough to be useful in the messiest, highest-stakes moments of real life, when you’re sleep deprived, context switching constantly, and you just need something that works.&lt;/p&gt;

&lt;p&gt;OpenClaw got me from idea to a working baby tracker in a day, from a hospital room, without the usual ceremony. That’s a big deal. The MVP was solid, the iteration loop was fast, and the app is now something we actually use.&lt;/p&gt;

&lt;p&gt;But it also made the tradeoffs impossible to ignore. When you skip fundamentals like migrations, typed data access, and a repeatable deploy pipeline, you can ship quickly, but you’re borrowing friction from your future self. The hour lost to a context-poisoned deployment and the awkward database setup were reminders that agents are powerful, but they still need guardrails.&lt;/p&gt;

&lt;p&gt;So my current takeaway is simple: let the agent sprint, but give it a track to run on. Scaffold the repo. Lock down CI/CD. Put migrations and typing in place. Then hand off the “build the feature” work to the agent and spend your own time making product calls, testing, and prioritizing.&lt;/p&gt;

&lt;p&gt;Babylog was a tiny project, but it changed how I think about prototyping. The bar for “I can just build this” has dropped dramatically, and with the right scaffolding, it’s only going to drop further.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>webdev</category>
    </item>
    <item>
      <title>The best way to do agentic development in 2026</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Fri, 23 Jan 2026 21:25:36 +0000</pubDate>
      <link>https://forem.com/chand1012/the-best-way-to-do-agentic-development-in-2026-14mn</link>
      <guid>https://forem.com/chand1012/the-best-way-to-do-agentic-development-in-2026-14mn</guid>
      <description>&lt;p&gt;TLDR; I tried Claude Code, switched to OpenCode + Oh-My-Code, then finally switched to Conductor + Claude Code + Lots of Plugins and Skills.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa5ke7ih1q23l9txwmhmo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa5ke7ih1q23l9txwmhmo.png" alt="Opinionated meme" width="500" height="701"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Claude Code is good
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://code.claude.com/docs/en/overview" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; by itself is an amazing tool, hell my coworkers and colleagues use it constantly. I was always a little more hesitant, preferring Cursor in most cases because I wanted the freedom to edit some code myself while the AI agent was doing its thing, while also keeping close watch on the agent to make sure that the code it was producing worked well and was up to my standards (it usually was, if it passes my strict CI its probably fine).&lt;/p&gt;

&lt;p&gt;The problem I always had with Claude Code wasn't the lack of editing: for a while (and still quite often now) I would run Claude Code in my VSCode/Cursor terminal while editing other files. It was price-to-performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  It was &lt;em&gt;just okay&lt;/em&gt;.
&lt;/h3&gt;

&lt;p&gt;While it always seemed to &lt;em&gt;work&lt;/em&gt;, it didn't always work &lt;em&gt;well&lt;/em&gt;, and it rarely seemed to work on the first try.&lt;/p&gt;

&lt;p&gt;Some of this was remedied when moving from Sonnet 3.x to the Sonnet 4 series, and even more when moving to the 4.5 series of models, especially Opus 4.5. However, it was never perfect.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenCode + Oh-My-OpenCode
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqgliwyh0cl4ir93n5ndx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqgliwyh0cl4ir93n5ndx.png" alt="Me a few weeks ago" width="500" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A friend of mine introduced me to &lt;a href="https://github.com/code-yeongyu/oh-my-opencode" rel="noopener noreferrer"&gt;Oh-My-OpenCode&lt;/a&gt; (and I've known about OpenCode but always preferred Claude Code) at the beginning of the year, and &lt;strong&gt;IT. IS. WONDERFUL!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It feels like an actual developer that lives on your machine. It's basically a plugin that significantly improves the performance of OpenCode with subagents, plus an "ultrawork" mode that delegates background tasks and subtasks to other agents more efficiently.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftd2k8ucji5konomr1ftn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftd2k8ucji5konomr1ftn.png" alt="Their example screenshot of it working" width="800" height="509"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Plus, unlike Claude Code, OpenCode supports many AI providers (the main two they talk about are Codex and Claude Code).&lt;/p&gt;

&lt;p&gt;Now that all seems amazing, what's the catch? &lt;strong&gt;Cost.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Anthropic is kinda mad.
&lt;/h3&gt;

&lt;p&gt;When I first started using Oh-My-OpenCode, one of its selling points was that it allowed you to use your &lt;a href="https://claude.com/pricing" rel="noopener noreferrer"&gt;Claude Pro or Max&lt;/a&gt; subscription with OpenCode, which made it &lt;em&gt;significantly&lt;/em&gt; more cost effective. Since then, &lt;a href="https://github.com/code-yeongyu/oh-my-opencode?tab=readme-ov-file#claude-oauth-access-notice" rel="noopener noreferrer"&gt;Anthropic has blocked this&lt;/a&gt;, as they claim it was outside their ToS.&lt;/p&gt;

&lt;p&gt;I knew this might become an issue, and I had tons of Anthropic grant credits that were expiring and burning a hole in my pocket, so I opted to use API keys from the start.&lt;/p&gt;

&lt;p&gt;So what's my problem then? While OpenCode + Oh-My-OpenCode has impressive performance, there's a lot of dead time where nothing is happening. I'm just sitting there watching it code. Now I could do what I normally do and complete other non-coding tasks that are related to my work, or just procrastinate by doing non-work tasks in the background, but I'm a productivity nerd. I want to get as much done as possible in the least amount of time possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conductor - trains or music?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feut8avqcoc1v58efjj5p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feut8avqcoc1v58efjj5p.png" alt="Example from their website" width="800" height="550"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.conductor.build/" rel="noopener noreferrer"&gt;Conductor&lt;/a&gt; is a unique tool. Its currently a macOS-only (Windows and Linux coming soon) developer tool that uses Claude Code (or OpenAI Codex) to orchestrate many git worktrees of a repo to accomplish many tasks at the same time. It has &lt;em&gt;tight&lt;/em&gt; integration with Linear and GitHub (Linear is especially fitting since they're one of the larger companies using it), allowing you to inject Linear (and GitHub) issues as context, provide feedback as it developers the app, run multiple Claude Code instances in a single or across multiple worktrees, handoff plans to other agents, review code, and open GitHub Pull Requests right from within the app. I don't leave the app unless I need to request a review on my PR (I should ask them for that feature...).&lt;/p&gt;

&lt;p&gt;It also handles GitHub PRs amazingly. One of your CI checks fails? No problem, click the "fix" button in the top right to automatically fix it. Merge conflicts? No problem, one click fixes those too!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9qlktjoanv1tdxjlxekn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9qlktjoanv1tdxjlxekn.png" alt="Fix CI errors with a single click!" width="800" height="596"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, this alone wouldn't have gotten me to move over from Oh-My-OpenCode, since the code from plain Claude Code was worse; however, there's one more piece to this agentic puzzle that made the switch worth it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Superpowers (and friends)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/obra/superpowers" rel="noopener noreferrer"&gt;Superpowers&lt;/a&gt; is a plugin for Claude Code that just seems to make it smarter, especially when running it in planning mode (which you absolutely should be doing). It asks you more intelligent questions as it goes, fleshing out implementation details to make its actual implementation better. It'll spawn subagents (just like Oh-My-OpenCode) for gathering context from all your files in your repo or other MCP servers. It can even use your testing suites to make sure its new code is passing linting, formatting, and other CI/CD. It accomplishes this by providing a number of skills (basically just fancy prompts injected into the system prompt with hyper specific instructions for tasks) which allow it to perform within spitting distance of Oh-My-OpenCode, while still being able to use your existing Claude Code Max subscription.&lt;/p&gt;

&lt;p&gt;The planning part of Superpowers is definitely the reason it performs so much better.&lt;/p&gt;
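&lt;p&gt;To make "skills" concrete: a skill is just a markdown file with a short frontmatter header telling Claude when to use it, plus the instructions themselves. Here's a made-up example (not one of the actual Superpowers skills):&lt;/p&gt;

```markdown
---
name: changelog-updater
description: Use when the user asks to update the changelog before a release.
---

When updating the changelog:

1. Read the commit history since the last release tag.
2. Group the changes into Added / Changed / Fixed sections.
3. Never rewrite entries for versions that have already shipped.
```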

&lt;h3&gt;
  
  
  Other Plugins
&lt;/h3&gt;

&lt;p&gt;I don't just use the Superpowers plugin; I also have a few others that I consider essential to making Claude Code perform at its best.&lt;/p&gt;

&lt;p&gt;An example of this is the Context7 plugin. &lt;a href="https://context7.com/" rel="noopener noreferrer"&gt;Context7&lt;/a&gt; is a documentation provider specifically designed to let AI agents find and read documentation. This allows Claude Code to search for the latest documentation and use APIs properly the first time, minimizing required iteration.&lt;/p&gt;

&lt;p&gt;Another plugin I use that isn't listed in the Anthropic plugin store is &lt;a href="https://docs.tavily.com/documentation/claude-code#npx-recommended" rel="noopener noreferrer"&gt;Tavily&lt;/a&gt;. I use it to give Claude Code improved web search capabilities and to integrate research abilities into the repository. This has been especially useful when trying to find information about documentation that isn't yet in Context7, or for doing a research task before tackling a complex feature, like finding out how something is usually implemented in other projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  How do I get this set up?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Claude Code + Plugins
&lt;/h3&gt;

&lt;p&gt;First step is to &lt;a href="https://code.claude.com/docs/en/quickstart" rel="noopener noreferrer"&gt;install Claude Code&lt;/a&gt;. If you're on macOS or Linux, this can be done with the official script.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://claude.ai/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once that's done, you should log into Claude by running &lt;code&gt;claude&lt;/code&gt; in your terminal then running &lt;code&gt;/login&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You can leave Claude for now; we need to install all the good plugins. I prefer a user-level install for most things, so that they're available across all my projects, which can be done from the command line. Let's start with the easy ones.&lt;/p&gt;

&lt;p&gt;You don't need to install all of these, but I am going to list all plugins I have installed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude plugin marketplace add obra/superpowers-marketplace
claude plugin &lt;span class="nb"&gt;install &lt;/span&gt;superpowers@superpowers-marketplace
claude plugin &lt;span class="nb"&gt;install &lt;/span&gt;context7@claude-plugins-official
claude plugin &lt;span class="nb"&gt;install &lt;/span&gt;playwright@claude-plugins-official
claude plugin &lt;span class="nb"&gt;install &lt;/span&gt;frontend-design@claude-plugins-official
claude plugin &lt;span class="nb"&gt;install &lt;/span&gt;feature-dev@claude-plugins-official
claude plugin &lt;span class="nb"&gt;install &lt;/span&gt;commit-commands@claude-plugins-official
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Tavily MCP Server
&lt;/h3&gt;

&lt;p&gt;Unlike many MCP servers, you don't need to run this one locally. Start by making an account on &lt;a href="https://www.tavily.com/" rel="noopener noreferrer"&gt;Tavily&lt;/a&gt;, then follow their official skill instructions. Here's a TLDR if you don't want to use their docs.&lt;/p&gt;

&lt;p&gt;Open your Claude settings, which is at &lt;code&gt;~/.claude/settings.json&lt;/code&gt;, using your preferred text editor.&lt;/p&gt;

&lt;p&gt;Once that's done, add a new section called &lt;code&gt;env&lt;/code&gt; containing your Tavily API key (this is why you need an account). It should look something like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"TAVILY_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tvly-YOUR_API_KEY"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace &lt;code&gt;tvly-YOUR_API_KEY&lt;/code&gt; with your actual API key. Once that's done, go to your terminal and run the following command, following the instructions to install the skill.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add https://github.com/tavily-ai/skills
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once that's done, you can call those commands you added from anywhere, or you can tell Claude to use Tavily to perform searches to research anything you want!&lt;/p&gt;

&lt;h3&gt;
  
  
  Conductor
&lt;/h3&gt;

&lt;p&gt;Now that we have Claude Code set up for maximal productivity, we need the program that ties it all together: &lt;a href="https://www.conductor.build/" rel="noopener noreferrer"&gt;Conductor&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Install it, create your account, then (ideally) link your GitHub and Linear accounts to your Conductor account. This will let you quickly add context by injecting issues.&lt;/p&gt;

&lt;p&gt;Once that's all done, you're ready to rip with Conductor! Utilize the Superpowers skills, research topics, link Linear and GitHub issues, &lt;strong&gt;always&lt;/strong&gt; use planning mode, and dump as much context as you can into it so your issues go as smoothly as possible! I highly recommend using the &lt;code&gt;/research&lt;/code&gt; command, or telling it about its new skills when prompting, to make it especially intelligent when accomplishing tasks!&lt;/p&gt;

&lt;p&gt;The real power of Conductor, though, is that it's not just a pretty UI: it's a force multiplier. You can open many projects and many repos, and work using git worktrees (parallel workspaces) to knock out many issues quickly. I'll run parallel agents on many issues across multiple projects by dumping in as much context as possible. Since I started at &lt;a href="https://www.saphira.ai/" rel="noopener noreferrer"&gt;a new company&lt;/a&gt; I've needed to gather lots of context very quickly from existing documents, and the fastest way to do this (due to their extensive use of Google Drive) was to use &lt;a href="https://notebooklm.google/" rel="noopener noreferrer"&gt;NotebookLM&lt;/a&gt; to gather context on projects and information on clients and what they (both my colleagues and the clients) would want. Since we also switched to Linear for issue and project management, I can dump additional context from there into Conductor and build in parallel extremely fast.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbbvdgd0xuyh5dhitk9yp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbbvdgd0xuyh5dhitk9yp.png" alt="Many issues" width="441" height="390"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The only issue I've run into is that when you run slash commands with Conductor (&lt;code&gt;/commit&lt;/code&gt; and the like), it tends to ignore any attached issues. The workaround is to attach them anyway and tell it to use the Linear/GitHub issue to complete your task; when it can't find the issue, it will ask you to attach it again. Attach it to the new message, and it will find it and use it.&lt;/p&gt;

&lt;p&gt;Any questions? Leave a comment below!&lt;/p&gt;

</description>
      <category>software</category>
      <category>tooling</category>
      <category>coding</category>
      <category>development</category>
    </item>
    <item>
      <title>Using LLMs in 3 lines of Python</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Mon, 30 Jun 2025 19:26:33 +0000</pubDate>
      <link>https://forem.com/timesurgelabs/using-llms-in-3-lines-of-python-gm1</link>
      <guid>https://forem.com/timesurgelabs/using-llms-in-3-lines-of-python-gm1</guid>
      <description>&lt;p&gt;When working with LLMs, the first thing people generally install is the &lt;code&gt;openai&lt;/code&gt; or &lt;code&gt;anthropic&lt;/code&gt; packages, if you’re a little more adventurous with your LLM choice it may be &lt;code&gt;litellm&lt;/code&gt; or &lt;code&gt;ollama&lt;/code&gt;. The issue is that all of these require a bit of code to get your started. For example, assuming you have an API key in your environment like I do, you’ll need at least this code to make an LLM call with OpenAI (also assuming you’re using the older Chat Completions endpoint).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="c1"&gt;# retrieve API key from environment
&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# initialize client
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# send a chat request
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Say something concise.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# print assistant's answer
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And if you want to wrap your API call with a function so you can call it repeatedly, that’s even more lines!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chat_with_openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chat_with_openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Say something concise.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And that is simply unacceptable!&lt;/p&gt;

&lt;h2&gt;
  
  
  Do you really care?
&lt;/h2&gt;

&lt;p&gt;No, I’m being facetious. For most LLM projects, consistency of output trumps everything else. Sometimes, though, it’s nice to have a super simple way to add LLMs to my one-off Python scripts and tools without all the boilerplate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Magentic
&lt;/h2&gt;

&lt;p&gt;Magentic is a Python package that lets you create functions that call LLMs in 3 lines of code. No, really! Here’s an example ripped straight from &lt;a href="https://magentic.dev/#usage" rel="noopener noreferrer"&gt;their docs&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;

&lt;span class="nd"&gt;@prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Add more &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ness to: {phrase}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dudeify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;  &lt;span class="c1"&gt;# No function body as this is never executed
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Thanks to some black-box dark magic that I don’t feel like learning about, this is a completely valid Python function that’s callable anywhere in the script, assuming you have an OpenAI API key in your environment variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;dudeify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# "Hey, dude! What's up? How's it going, my man?"
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
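&lt;p&gt;For the curious, the dark magic is mostly ordinary decorator machinery. Here’s a toy sketch (not Magentic’s actual implementation) of the idea: capture the template at decoration time, bind the call’s arguments to the function signature, format the prompt, and hand it to a model, which is stubbed out here so the example is self-contained.&lt;/p&gt;

```python
# A toy sketch (NOT Magentic's real implementation) of how a @prompt-style
# decorator can work: capture the template at decoration time, bind the
# call's arguments to the function signature, format the prompt, and send
# it to a model. The model here is a stub so the example is self-contained.
import inspect
from functools import wraps


def fake_llm(prompt_text: str) -> str:
    """Stand-in for a real LLM API call."""
    return f"LLM response to: {prompt_text}"


def prompt(template: str):
    def decorator(func):
        sig = inspect.signature(func)

        @wraps(func)
        def wrapper(*args, **kwargs):
            # Map positional/keyword arguments onto the declared parameters,
            # then fill the template's {placeholders} with them.
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            return fake_llm(template.format(**bound.arguments))

        return wrapper
    return decorator


@prompt('Add more "dude"ness to: {phrase}')
def dudeify(phrase: str) -> str: ...  # body never executes


print(dudeify("Hello, how are you?"))
# LLM response to: Add more "dude"ness to: Hello, how are you?
```

&lt;p&gt;The real package does a lot more (typed and structured return values, streaming, multiple backends), but the core trick is just signature binding plus string formatting.&lt;/p&gt;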



&lt;h2&gt;
  
  
  A Note On Package Management
&lt;/h2&gt;

&lt;p&gt;I’m going to be using the &lt;a href="https://peps.python.org/pep-0723/" rel="noopener noreferrer"&gt;PEP 723&lt;/a&gt; standard at the top of all my scripts for the rest of this post. This allows you to use &lt;a href="https://docs.astral.sh/uv/" rel="noopener noreferrer"&gt;uv&lt;/a&gt;, the best package manager for Python, to run a script without first creating a virtual environment and installing packages yourself; uv collapses all three steps into a single command.&lt;/p&gt;

&lt;p&gt;Here’s the above script with the added metadata and some slight modifications. This assumes you have &lt;a href="https://docs.astral.sh/uv/#installation" rel="noopener noreferrer"&gt;uv installed&lt;/a&gt; and the &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; env var set.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env -S uv run --script
# /// script
# requires-python = "&amp;gt;=3.10"
# dependencies = [
#     "fire",
#     "magentic"
# ]
# ///
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fire&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;

&lt;span class="nd"&gt;@prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Add more &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ness to: {phrase}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dudeify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;  &lt;span class="c1"&gt;# No function body as this is never executed
&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;fire&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Fire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dudeify&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This script can now be downloaded and run like an executable. I’ve uploaded it to &lt;a href="https://gist.github.com/chand1012/218372f3e1101dfa7f915dc35c0e66d8" rel="noopener noreferrer"&gt;a gist&lt;/a&gt; for easy download.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;wget &lt;span class="nt"&gt;-O&lt;/span&gt; dudeify https://gist.githubusercontent.com/chand1012/218372f3e1101dfa7f915dc35c0e66d8/raw/363f720d21fa8ebe2e6a484f6b389496c3452064/dudeify.py
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x dudeify
./dudeify &lt;span class="s2"&gt;"Hello how are you"&lt;/span&gt;
&lt;span class="c"&gt;# Installed 23 packages in 45ms&lt;/span&gt;
&lt;span class="c"&gt;# Yo dude, how's it hangin'?&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first time you run the script, uv builds a cached virtual environment that’s reused on subsequent runs. For more information on how this works, you can check out the &lt;a href="https://docs.astral.sh/uv/guides/scripts/#using-a-shebang-to-create-an-executable-file" rel="noopener noreferrer"&gt;uv docs&lt;/a&gt; and the &lt;a href="https://www.cottongeeks.com/articles/2025-06-24-fun-with-uv-and-pep-723" rel="noopener noreferrer"&gt;blog post&lt;/a&gt; that inspired my constant use of this feature.&lt;/p&gt;

&lt;h2&gt;
  
  
  Structured Outputs
&lt;/h2&gt;

&lt;p&gt;If you want structured outputs, for example for an API response, or just to make the data easier to parse and use in your scripts, you can use a &lt;a href="https://docs.pydantic.dev/latest/concepts/models/" rel="noopener noreferrer"&gt;Pydantic model&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env -S uv run --script
# /// script
# requires-python = "&amp;gt;=3.10"
# dependencies = [
#     "fire",
#     "magentic",
#     "pydantic",
# ]
# ///
&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fire&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Fire&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Animal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;species&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;legs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;latin_species&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;predators&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;prey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nd"&gt;@prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Give me information on the animal {animal_name}.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;animal_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;animal_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Animal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nc"&gt;Fire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;animal_info&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here’s an example of that method being run.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb6fsnfs8f52p75cm1mtg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb6fsnfs8f52p75cm1mtg.png" alt="Example output" width="800" height="172"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompting and Function Calls
&lt;/h2&gt;

&lt;p&gt;There are two ways to prompt the LLM with Magentic. The first is the &lt;code&gt;@prompt&lt;/code&gt; decorator, which I’ve been using; it’s the simplest and fastest way to create LLM methods. The second is &lt;code&gt;@chatprompt&lt;/code&gt;, which lets you pass a list of chat messages to the LLM. This is especially useful for few-shot prompting, where you give the LLM examples of the output you want. After all, LLMs &lt;em&gt;are&lt;/em&gt; just fancy pattern-matching black boxes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env -S uv run --script
# /// script
# requires-python = "&amp;gt;=3.10"
# dependencies = [
#     "fire",
#     "magentic",
#     "pydantic",
# ]
# ///
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fire&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Fire&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;chatprompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AssistantMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;UserMessage&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;

&lt;span class="c1"&gt;# this is a modified version of magentic's example chatprompt code
# https://magentic.dev/#chatprompt
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;character&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="nd"&gt;@chatprompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a movie buff.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;UserMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is your favorite quote from Harry Potter?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;AssistantMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nc"&gt;Quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;quote&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;It does not do to dwell on dreams and forget to live.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;character&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Albus Dumbledore&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;UserMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is your favorite quote from {movie}?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_movie_quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Quote&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nc"&gt;Fire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;get_movie_quote&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also pass &lt;a href="https://magentic.dev/#functioncall" rel="noopener noreferrer"&gt;function calls to LLMs&lt;/a&gt;, allowing them to return a Python callable that you can invoke later. Related to this is the &lt;code&gt;@prompt_chain&lt;/code&gt; decorator, which lets the LLM call a function and use the returned results to generate its response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env -S uv run --script
# /// script
# requires-python = "&amp;gt;=3.10"
# dependencies = [
#     "fire",
#     "magentic",
#     "duckduckgo_search",
# ]
# ///
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fire&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Fire&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt_chain&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;duckduckgo_search&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DDGS&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;web_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Searches the web for a given query&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;DDGS&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ddgs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ddgs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;

&lt;span class="nd"&gt;@prompt_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant that can search the web for information. Use your tools to answer the user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s question: {query}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;functions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;web_search&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nc"&gt;Fire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Using Other LLMs
&lt;/h2&gt;

&lt;p&gt;If you’re a data-conscious person, or just want to keep your options open, Magentic can be configured to work with nearly any other LLM, as long as it’s supported by &lt;a href="https://github.com/BerriAI/litellm" rel="noopener noreferrer"&gt;LiteLLM&lt;/a&gt; or offers an OpenAI-compatible API. Here’s an example of a script that runs entirely locally using &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; and &lt;a href="https://ollama.com/library/gemma3" rel="noopener noreferrer"&gt;Google’s Gemma 3&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env -S uv run --script
# /// script
# requires-python = "&amp;gt;=3.10"
# dependencies = [
#     "fire",
#     "magentic"
# ]
# ///
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fire&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OpenaiChatModel&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenaiChatModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemma3:27b-it-qat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:11434/v1/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Add more &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ness to: {phrase}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dudeify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;  &lt;span class="c1"&gt;# No function body as this is never executed
&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;fire&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Fire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dudeify&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your chosen LLM is one of the &lt;a href="https://docs.litellm.ai/docs/providers" rel="noopener noreferrer"&gt;many supported by LiteLLM&lt;/a&gt;, you can use Magentic’s LiteLLM extra.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env -S uv run --script
# /// script
# requires-python = "&amp;gt;=3.10"
# dependencies = [
#     "fire",
#     "magentic[litellm]"
# ]
# ///
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fire&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic.chat_model.litellm_chat_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LitellmChatModel&lt;/span&gt;

&lt;span class="c1"&gt;# this specific example requires GEMINI_API_KEY env var to be set
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LitellmChatModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini/gemini-2.0-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Add more &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ness to: {phrase}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dudeify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;  &lt;span class="c1"&gt;# No function body as this is never executed
&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;fire&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Fire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dudeify&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can use the LiteLLM method with Anthropic’s Claude series of models, or you can use Magentic’s official Anthropic extension.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env -S uv run --script
# /// script
# requires-python = "&amp;gt;=3.10"
# dependencies = [
#     "fire",
#     "magentic[anthropic]"
# ]
# ///
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fire&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic.chat_model.anthropic_chat_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnthropicChatModel&lt;/span&gt;

&lt;span class="c1"&gt;# this specific example requires GEMINI_API_KEY env var to be set
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnthropicChatModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-4-sonnet-latest&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Add more &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ness to: {phrase}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dudeify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;  &lt;span class="c1"&gt;# No function body as this is never executed
&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;fire&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Fire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dudeify&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No LLM left behind!&lt;/p&gt;

&lt;h2&gt;
  
  
  Advanced Usage
&lt;/h2&gt;

&lt;p&gt;Need an async function? Just prefix with &lt;code&gt;async def&lt;/code&gt; instead of &lt;code&gt;def&lt;/code&gt;!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# incomplete snippet
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;

&lt;span class="nd"&gt;@prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tell me more about {topic}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tell_me_more_about&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can combine an &lt;code&gt;AsyncIterable&lt;/code&gt; return type with &lt;code&gt;asyncio&lt;/code&gt; to make multiple simultaneous calls to the LLM.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# incomplete snippet
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncIterable&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;

&lt;span class="nd"&gt;@prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;List ten presidents of the United States&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;iter_presidents&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AsyncIterable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="n"&gt;tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;president&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;iter_presidents&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Use asyncio.create_task to schedule the coroutine for execution before awaiting it
&lt;/span&gt;    &lt;span class="c1"&gt;# This way descriptions will start being generated while the list of presidents is still being generated
&lt;/span&gt;    &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tell_me_more_about&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;president&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;descriptions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
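&lt;p&gt;The &lt;code&gt;create_task&lt;/code&gt; pattern in the comments above doesn’t depend on Magentic itself. Here’s a minimal, runnable sketch of the same idea in plain &lt;code&gt;asyncio&lt;/code&gt;, with hypothetical stand-in coroutines in place of the LLM calls:&lt;/p&gt;

```python
import asyncio
from typing import AsyncIterable


async def iter_items() -> AsyncIterable[str]:
    # Stand-in for the LLM-backed iter_presidents(): items arrive one at a time
    for name in ["Washington", "Adams", "Jefferson"]:
        await asyncio.sleep(0.01)
        yield name


async def describe(item: str) -> str:
    # Stand-in for tell_me_more_about(): a slower downstream call
    await asyncio.sleep(0.05)
    return f"About {item}"


async def main() -> list[str]:
    tasks = []
    async for item in iter_items():
        # create_task schedules the coroutine immediately, so descriptions
        # start generating while the list is still streaming in
        tasks.append(asyncio.create_task(describe(item)))
    # gather preserves input order
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

&lt;p&gt;With real Magentic functions, the shape is identical; only the coroutines change.&lt;/p&gt;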



&lt;p&gt;Need to stream the response back to the user? Use Magentic’s &lt;code&gt;StreamedStr&lt;/code&gt; to loop through the response chunks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env -S uv run --script
# /// script
# requires-python = "&amp;gt;=3.10"
# dependencies = [
#     "fire",
#     "magentic"
# ]
# ///
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fire&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StreamedStr&lt;/span&gt;

&lt;span class="nd"&gt;@prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tell me about {country}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;describe_country&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;StreamedStr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;describe_country&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;fire&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Fire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Streaming also works for structured output: annotate the return type as &lt;code&gt;Iterable&lt;/code&gt; of your model class, and each object is yielded as soon as it has been generated.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env -S uv run --script
# /// script
# requires-python = "&amp;gt;=3.10"
# dependencies = [
#     "fire",
#     "magentic",
#     "pydantic",
# ]
# ///
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections.abc&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fire&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Fire&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;magentic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Animal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;species&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;legs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;latin_species&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;predators&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;prey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nd"&gt;@prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Give me information on the animals in the family {family}.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;animal_family_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;family&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Animal&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;family&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;animal&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;animal_family_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;family&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;animal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nc"&gt;Fire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Working with LLMs is now easier than ever, and Magentic makes it even easier than the standard SDKs to quickly add LLMs to any Python script, regardless of its complexity. Used in tandem with something like uv and the new script metadata, it lets you build command line tools that use AI quickly and effectively. I won’t use Magentic for every project that needs an LLM, but I’ll definitely reach for it for my small one-offs and utilities.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Agentic Coding (Vibe Coding) Best Practices</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Fri, 28 Mar 2025 21:49:53 +0000</pubDate>
      <link>https://forem.com/timesurgelabs/agentic-coding-vibe-coding-best-practices-b4b</link>
      <guid>https://forem.com/timesurgelabs/agentic-coding-vibe-coding-best-practices-b4b</guid>
      <description>&lt;h2&gt;
  
  
  TLDR
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwolb0rz23myp0eub2led.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwolb0rz23myp0eub2led.png" alt="Comic" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you've been living under a rock, you may not be aware of the "vibe coding" phenomenon.&lt;br&gt;
For a good explanation of what "vibe coding" (or, to use a more technical term, agentic coding) is, check out &lt;a href="https://youtu.be/Tw18-4U7mts?si=wmgKylbi-gEEmXzU" rel="noopener noreferrer"&gt;Fireship's video&lt;/a&gt; on the subject. He does a great job of explaining the concept in a way that's both easy to understand and objective about the pros and cons.&lt;/p&gt;
&lt;h2&gt;
  
  
  Tooling
&lt;/h2&gt;

&lt;p&gt;If you're going to ignore all the cons and go ahead with agentic coding, there are some best practices I use to make sure my code doesn't turn into a complete mess of AI generated garbage. I personally use Cursor, so this guide is going to use their &lt;a href="https://docs.cursor.com/context/rules-for-ai" rel="noopener noreferrer"&gt;Rules feature&lt;/a&gt; to organize and apply rules to the LLM for code generation.&lt;/p&gt;
&lt;h2&gt;
  
  
  Rules
&lt;/h2&gt;

&lt;p&gt;Cursor has a concept of a rule file. This is a markdown file in the &lt;code&gt;.cursor/rules&lt;/code&gt; directory, ending in &lt;code&gt;.mdc&lt;/code&gt; rather than &lt;code&gt;.md&lt;/code&gt;, with some extra front matter that specifies when and how the rule is applied to the LLM. Rules can be broken down into 4 categories.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Language Rules

&lt;ul&gt;
&lt;li&gt;Applies to specific languages.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Framework Rules

&lt;ul&gt;
&lt;li&gt;Applies to specific frameworks.&lt;/li&gt;
&lt;li&gt;Can also apply to libraries that have special rules, like shadcn/ui.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Practice Rules

&lt;ul&gt;
&lt;li&gt;For coding practice guidelines.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Project Rules

&lt;ul&gt;
&lt;li&gt;Should be used to describe project specific guidelines like file structure, dependencies used, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rules can be applied with 4 different methods.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Always Apply&lt;/li&gt;
&lt;li&gt;Auto Apply

&lt;ul&gt;
&lt;li&gt;Uses a glob pattern to apply the rule to all files that match the pattern.&lt;/li&gt;
&lt;li&gt;Especially useful for language and framework rules where specific file extensions are used.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Agent Requested

&lt;ul&gt;
&lt;li&gt;Uses a description of the rule to allow the agent to decide when to apply the rule.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Manual Apply

&lt;ul&gt;
&lt;li&gt;Only applied when you directly ask the agent to apply the rule.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rules can also have other files from within your project linked to them and will also be loaded into context. This is especially useful for the project rules where you can link the README as well as any other documentation that the LLM should know about, like an Architecture or Contributing Guide.&lt;/p&gt;
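&lt;p&gt;For example, a project rule might pull its context from other docs via &lt;code&gt;@&lt;/code&gt;-references (the file names here are hypothetical):&lt;/p&gt;

```markdown
---
description: "Project structure and conventions"
alwaysApply: true
---

# Project Guidelines

See the linked docs for full context:

@README.md
@docs/ARCHITECTURE.md
```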
&lt;h2&gt;
  
  
  Writing Rules
&lt;/h2&gt;

&lt;p&gt;The actual contents of a rule are written in markdown and should be concise, clear guidelines that are both human- and LLM-readable, with very minimal code and command examples. Here are some examples of rules from my &lt;a href="https://github.com/chand1012/cursorrules" rel="noopener noreferrer"&gt;personal collection&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Example Rules
&lt;/h2&gt;

&lt;p&gt;Here's an example I made for best practices when using Go.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Go&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;coding&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;standards&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;best&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;practices&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;modern&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;development"&lt;/span&gt;
&lt;span class="na"&gt;globs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="err"&gt;**&lt;/span&gt;&lt;span class="s"&gt;/*.go&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Go Best Practices&lt;/span&gt;

&lt;span class="gu"&gt;## Package and Import Statements&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Use meaningful package names that reflect their purpose (e.g., &lt;span class="sb"&gt;`auth`&lt;/span&gt;, &lt;span class="sb"&gt;`config`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Group imports in this order: standard library, third-party, then local packages, separated by blank lines.
&lt;span class="p"&gt;-&lt;/span&gt; Avoid import cycles to maintain clean dependency graphs.

&lt;span class="gu"&gt;## Type System&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Use &lt;span class="sb"&gt;`struct`&lt;/span&gt; types to define complex data structures.
&lt;span class="p"&gt;-&lt;/span&gt; Define &lt;span class="sb"&gt;`interface`&lt;/span&gt; types to specify behavior and enable polymorphism.
&lt;span class="p"&gt;-&lt;/span&gt; Use type aliases sparingly for clarity (e.g., &lt;span class="sb"&gt;`type ID string`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Leverage Go’s built-in types (e.g., &lt;span class="sb"&gt;`map`&lt;/span&gt;, &lt;span class="sb"&gt;`slice`&lt;/span&gt;) and composite types effectively.
&lt;span class="p"&gt;-&lt;/span&gt; Avoid unnecessary type conversions to maintain type safety.
&lt;span class="p"&gt;-&lt;/span&gt; Use struct embedding for composition instead of inheritance.

&lt;span class="gu"&gt;## Naming Conventions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Use &lt;span class="sb"&gt;`camelCase`&lt;/span&gt; for variable and function names (e.g., &lt;span class="sb"&gt;`getUser`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Use &lt;span class="sb"&gt;`PascalCase`&lt;/span&gt; for type names and exported identifiers (e.g., &lt;span class="sb"&gt;`UserService`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Use &lt;span class="sb"&gt;`ALL_CAPS`&lt;/span&gt; for constants (e.g., &lt;span class="sb"&gt;`MAX_RETRIES`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Be descriptive yet concise in naming (e.g., &lt;span class="sb"&gt;`userCount`&lt;/span&gt; over &lt;span class="sb"&gt;`cnt`&lt;/span&gt;).

&lt;span class="gu"&gt;## Code Organization&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Follow the standard Go project layout (e.g., &lt;span class="sb"&gt;`cmd/`&lt;/span&gt;, &lt;span class="sb"&gt;`pkg/`&lt;/span&gt;, &lt;span class="sb"&gt;`internal/`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Keep related code within the same package for cohesion.
&lt;span class="p"&gt;-&lt;/span&gt; Use subdirectories for larger packages to organize functionality (e.g., &lt;span class="sb"&gt;`api/handlers`&lt;/span&gt;).

&lt;span class="gu"&gt;## Functions and Methods&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Keep functions short and focused on a single responsibility.
&lt;span class="p"&gt;-&lt;/span&gt; Use named return values for clarity in complex functions (e.g., &lt;span class="sb"&gt;`func getData() (data string, err error)`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Avoid side effects in functions to improve predictability.

&lt;span class="gu"&gt;## Best Practices&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Follow the Go proverb: "A little copying is better than a little dependency."
&lt;span class="p"&gt;-&lt;/span&gt; Use interfaces to define behavior and decouple components.
&lt;span class="p"&gt;-&lt;/span&gt; Prefer composition over inheritance using embedding.
&lt;span class="p"&gt;-&lt;/span&gt; Avoid unnecessary abstractions; prioritize simplicity.
&lt;span class="p"&gt;-&lt;/span&gt; Use the &lt;span class="sb"&gt;`init`&lt;/span&gt; function sparingly for package initialization.
&lt;span class="p"&gt;-&lt;/span&gt; Avoid global variables; if unavoidable, ensure they are thread-safe (e.g., with &lt;span class="sb"&gt;`sync.Mutex`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Be mindful of memory allocations; use profiling tools (e.g., &lt;span class="sb"&gt;`pprof`&lt;/span&gt;) to optimize performance.
&lt;span class="p"&gt;-&lt;/span&gt; Use &lt;span class="sb"&gt;`gofmt`&lt;/span&gt; for consistent formatting and &lt;span class="sb"&gt;`go vet`&lt;/span&gt; for static analysis.

&lt;span class="gu"&gt;## Error Handling&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Always check errors explicitly (e.g., &lt;span class="sb"&gt;`if err != nil`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Use descriptive error messages for debugging (e.g., &lt;span class="sb"&gt;`errors.New("failed to open file")`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Consider error wrapping with &lt;span class="sb"&gt;`fmt.Errorf`&lt;/span&gt; and &lt;span class="sb"&gt;`%w`&lt;/span&gt; for context (e.g., &lt;span class="sb"&gt;`fmt.Errorf("query failed: %w", err)`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Use &lt;span class="sb"&gt;`defer`&lt;/span&gt; with &lt;span class="sb"&gt;`recover`&lt;/span&gt; to handle panics in critical sections (e.g., HTTP handlers).

&lt;span class="gu"&gt;## Concurrency&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Use goroutines for concurrent tasks (e.g., &lt;span class="sb"&gt;`go processData()`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Use channels for safe communication between goroutines (e.g., &lt;span class="sb"&gt;`ch := make(chan int)`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Avoid shared state when possible; prefer message passing via channels.

&lt;span class="gu"&gt;## Testing&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Write tests for all public functions using the &lt;span class="sb"&gt;`testing`&lt;/span&gt; package.
&lt;span class="p"&gt;-&lt;/span&gt; Use table-driven tests for multiple test cases (e.g., &lt;span class="sb"&gt;`tests := []struct{...}`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Aim for high test coverage with &lt;span class="sb"&gt;`go test -cover`&lt;/span&gt;.

&lt;span class="gu"&gt;## Documentation&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Write doc comments for all exported identifiers (e.g., &lt;span class="sb"&gt;`// UserService handles user operations`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Follow the standard Go doc format, starting with the identifier name (e.g., &lt;span class="sb"&gt;`// Package auth provides...`&lt;/span&gt;).
&lt;span class="p"&gt;-&lt;/span&gt; Include examples in doc comments when possible (e.g., &lt;span class="sb"&gt;`// Example: ...`&lt;/span&gt;).

&lt;span class="gu"&gt;## Patterns&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Use interfaces for dependency injection to improve testability.
&lt;span class="p"&gt;-&lt;/span&gt; Implement the Repository pattern for data access (e.g., &lt;span class="sb"&gt;`UserRepository`&lt;/span&gt; interface).
&lt;span class="p"&gt;-&lt;/span&gt; Use the Factory pattern for object creation (e.g., &lt;span class="sb"&gt;`NewUserService()`&lt;/span&gt;).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This allows the LLM to properly structure Go code, and its output is great! &lt;a href="https://gist.github.com/chand1012/05bbe89f2d41c2cc335f684f7281a2fa" rel="noopener noreferrer"&gt;Here&lt;/a&gt; is some code generated using the rule.&lt;/p&gt;

&lt;p&gt;Here's another example of a project-level rule that I use for a Supabase backend project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; 
&lt;span class="na"&gt;globs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; 
&lt;span class="na"&gt;alwaysApply&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="gh"&gt;# Lancer DB&lt;/span&gt;

This is our monorepo for our Supabase Database migrations as well as our Supabase Edge Functions written in Deno.

&lt;span class="gu"&gt;## Directory Structure&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; docs/: Markdown documentation relevant to the repo and development.
&lt;span class="p"&gt;-&lt;/span&gt; supabase/: Contains all the Supabase related configurations, migrations, and edge functions.
&lt;span class="p"&gt;  -&lt;/span&gt; supabase/migrations/: Contains the migrations. All migrations names should be formatted like so: &lt;span class="sb"&gt;`20240821194157_subnets.sql`&lt;/span&gt;. That is a raw date with no spaces or formatting + &lt;span class="sb"&gt;`_`&lt;/span&gt; + followed by a snake case description of the migration.
&lt;span class="p"&gt;  -&lt;/span&gt; supabase/functions/: Contains Deno edge functions.
&lt;span class="p"&gt;    -&lt;/span&gt; supabase/functions/&lt;span class="ge"&gt;**&lt;/span&gt;/index.ts: Each of the main entrypoints for each edge function. Edge functions have folders which are their name, and any related files that the edge function uses that are not shared between functions should be included in the same directory as &lt;span class="sb"&gt;`index.ts`&lt;/span&gt;.
&lt;span class="p"&gt;    -&lt;/span&gt; supabase/functions/_shared/: Directory of all shared code that gets reused between multiple functions.
&lt;span class="p"&gt;  -&lt;/span&gt; supabase/seed.sql: Seed data for local development and testing only. If dummy data is needed for local testing, it should be added here.
&lt;span class="p"&gt;  -&lt;/span&gt; supabase/config.toml: Configuration data for the local Supabase instance for local dev and testing.
&lt;span class="p"&gt;-&lt;/span&gt; scripts/: Deno scripts for development and testing.
&lt;span class="p"&gt;-&lt;/span&gt; Justfile: Command runner script. Holds commands and bash scripts we use frequently as we work on the project. Automatically loads a &lt;span class="sb"&gt;`.env`&lt;/span&gt; if present. Commands can be run with &lt;span class="sb"&gt;`just &amp;lt;command name&amp;gt;`&lt;/span&gt;

&lt;span class="gu"&gt;## Code Style&lt;/span&gt;

&lt;span class="gu"&gt;### General Guidelines&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Follow DRY (Don't Repeat Yourself)
&lt;span class="p"&gt;-&lt;/span&gt; Code should be well-named while following the case practices defined below for the language.
&lt;span class="p"&gt;-&lt;/span&gt; Code should be readable by human devs as well as LLMs alike.
&lt;span class="p"&gt;-&lt;/span&gt; Use meaningful variable and function names.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This basically just tells the LLM to follow the project's overall coding standards and file structure. In other repos, I've also linked other documentation that the LLM should know about via an &lt;code&gt;@&lt;/code&gt; symbol.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating Rules
&lt;/h2&gt;

&lt;p&gt;Create a new file in the &lt;code&gt;.cursor/rules&lt;/code&gt; directory followed by &lt;code&gt;.mdc&lt;/code&gt; as the extension.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; .cursor/rules
&lt;span class="nb"&gt;touch&lt;/span&gt; .cursor/rules/go.mdc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cursor will by default open the rule file using a special editor that allows you to set the rule type and globs without having to manually edit the front matter.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjaub6s8lir0yee1s2p9e.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjaub6s8lir0yee1s2p9e.gif" alt="Cursor Rule Editor" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For manual and always apply rules, you can simply write the rule contents and save the file.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw04eaprjibls6fa3j40m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw04eaprjibls6fa3j40m.png" alt=" " width="580" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For auto apply rules, you can use a glob pattern so the rule is attached to every file that matches it.&lt;/p&gt;
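&lt;p&gt;In the front matter, that corresponds to filling in the &lt;code&gt;globs&lt;/code&gt; field, for example (the glob value here is illustrative):&lt;/p&gt;

```markdown
---
description: 
globs: "**/*.go"
alwaysApply: false
---
```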

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4btn9dtbfk9ei6067av1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4btn9dtbfk9ei6067av1.png" alt=" " width="561" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For agent requested rules, you should write a good description of the rule and the conditions that should trigger the rule.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4c8440xkv1rkyeg7bdo3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4c8440xkv1rkyeg7bdo3.png" alt=" " width="800" height="313"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once that's done, your rules are finished and will be loaded into context when you open Cursor.&lt;/p&gt;

&lt;h2&gt;
  
  
  Documentation
&lt;/h2&gt;

&lt;p&gt;Sometimes you'll need to link both internal and external documentation for the LLM. For internal documentation, such as a project's README, you can use the &lt;code&gt;@&lt;/code&gt; symbol, which will bring up a menu of files you can select from. Start typing the name of the file you want to link to and it will filter down the list.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwqkokhrslezk9r3h4sas.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwqkokhrslezk9r3h4sas.gif" alt=" " width="560" height="316"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For external documentation, you should link it via a markdown link.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjuqc1eqm2ch3aragavwh.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjuqc1eqm2ch3aragavwh.gif" alt=" " width="562" height="316"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I've been using these rules for a while now and they've helped me write better code. I've also found that the LLM is able to follow the rules more often than not, and when it doesn't, it's usually because I need to update the rule to be more specific.&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tutorial</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Command Line Power-Ups: Boost Your Workflow with Mods and Freeze</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Thu, 27 Jun 2024 23:05:53 +0000</pubDate>
      <link>https://forem.com/timesurgelabs/command-line-power-ups-boost-your-workflow-with-mods-and-freeze-407j</link>
      <guid>https://forem.com/timesurgelabs/command-line-power-ups-boost-your-workflow-with-mods-and-freeze-407j</guid>
      <description>&lt;p&gt;For developers and tech enthusiasts, the command line is a powerful tool. But did you know there are ways to make it even more efficient and visually appealing? Two of my favorite tools I’ve been using lately are &lt;strong&gt;Mods&lt;/strong&gt; and &lt;strong&gt;Freeze&lt;/strong&gt;. These tools, brought to you by the innovative team at Charm Bracelet (charm.sh), will revolutionize how you interact with your terminal, automating tasks and creating beautiful code snippets. &lt;/p&gt;

&lt;h3&gt;
  
  
  Mods
&lt;/h3&gt;

&lt;p&gt;Imagine having the capabilities of ChatGPT directly in your command line. That's the power of Mods. This AI-driven tool excels at scripting and automation, allowing you to generate code snippets and streamline repetitive tasks with ease.&lt;/p&gt;

&lt;p&gt;Let's say you need a basic Python script to print "Hello World" with user input. With Mods, it's as simple as typing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mods -f "Generate a python hello world app with user input. Only output the code and no other text" -r &amp;gt; test.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command instructs Mods to generate the code and output it in raw format (without Markdown); the shell redirect then saves it to a file named &lt;code&gt;test.py&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Mods is still under development, so you might need to make minor adjustments to the output format. However, its ability to understand natural language commands and generate code is truly impressive. Plus, Mods remembers your conversation history, allowing you to reference previous commands and build upon your work seamlessly. &lt;/p&gt;

&lt;h3&gt;
  
  
  Freeze
&lt;/h3&gt;

&lt;p&gt;Sharing code snippets for documentation, presentations, or blog posts can be cumbersome. Freeze comes to the rescue, enabling you to generate visually stunning code screenshots with just a single command.&lt;/p&gt;

&lt;p&gt;To create a beautiful image of your Python script (&lt;code&gt;test.py&lt;/code&gt;), simply type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;freeze test.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will generate a PNG image file showcasing your code with elegant syntax highlighting. Freeze also offers a range of customization options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--window&lt;/code&gt;&lt;/strong&gt;: Adds macOS-style window controls for a realistic look. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--theme&lt;/code&gt;&lt;/strong&gt;: Allows you to apply various themes like the popular GitHub dark mode.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--execute&lt;/code&gt;&lt;/strong&gt;: Captures the output of terminal commands within the screenshot. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With Freeze, you can effortlessly create professional-looking code visuals, enhancing your projects and communication.&lt;/p&gt;

&lt;p&gt;Mods and Freeze offer developers and tech enthusiasts powerful tools to enhance their productivity and creativity. Whether you're automating tasks, generating scripts, or creating eye-catching code visuals, these tools will streamline your workflow and elevate your projects. Explore these and other innovative command line tools to unlock the full potential of your terminal!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mods:&lt;/strong&gt; &lt;a href="https://github.com/charmbracelet/mods" rel="noopener noreferrer"&gt;https://github.com/charmbracelet/mods&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Freeze:&lt;/strong&gt; &lt;a href="https://github.com/charmbracelet/freeze" rel="noopener noreferrer"&gt;https://github.com/charmbracelet/freeze&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What are your favorite command line tools? Share them in the comments below!&lt;/strong&gt; &lt;/p&gt;

</description>
    </item>
    <item>
      <title>Video Tutorial - How To Run Llama 3 locally with Ollama and OpenWebUI!</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Thu, 16 May 2024 16:52:07 +0000</pubDate>
      <link>https://forem.com/timesurgelabs/video-tutorial-how-to-run-llama-3-locally-with-ollama-and-openwebui-1dn1</link>
      <guid>https://forem.com/timesurgelabs/video-tutorial-how-to-run-llama-3-locally-with-ollama-and-openwebui-1dn1</guid>
      <description>&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/GT-Fwg124-I"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Learn how to run LLaMA 3 locally on your computer using Ollama and Open WebUI! In this tutorial, we'll take you through a step-by-step guide on how to install and set up Ollama, and demonstrate the power of LLaMA 3 in action. Whether you're a developer, AI enthusiast, or just curious about the possibilities of local AI, this video is for you. So sit back, relax, and let's dive into the world of LLaMA 3, Ollama, and Open WebUI!&lt;/p&gt;

&lt;h3&gt;
  
  
  Links
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Recommended &lt;a href="https://youtu.be/eGz9DS-aIeY?feature=shared" rel="noopener noreferrer"&gt;Docker Tutorial&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.openwebui.com" rel="noopener noreferrer"&gt;Open WebUI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openwebui.com" rel="noopener noreferrer"&gt;Open WebUI Repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Follow TimeSurge Labs!
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://twitter.com/TimeSurgeLabs" rel="noopener noreferrer"&gt;Twitter/X&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/company/timesurge-labs" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/timesurgelabs"&gt;Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AI Disclosure
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Thumbnail Background by &lt;a href="https://openai.com/index/dall-e-3/" rel="noopener noreferrer"&gt;OpenAI Dalle 3&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Title and Description partially AI generated by Llama 3 on &lt;a href="https://console.groq.com/" rel="noopener noreferrer"&gt;GroqCloud&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;All music generated by &lt;a href="https://suno.com" rel="noopener noreferrer"&gt;Suno&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Video filmed and edited by &lt;a href="https://twitter.com/Chand1012Dev" rel="noopener noreferrer"&gt;Chandler&lt;/a&gt; (NOT AI).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Music Links
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://suno.com/song/73419c49-e237-47eb-8b11-c644e0384e8c" rel="noopener noreferrer"&gt;Putting In The Hours&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://suno.com/song/4fd4615d-43e1-45e8-bb7a-32cb10848cab" rel="noopener noreferrer"&gt;The Chase Scene&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://suno.com/song/e213b6a8-6785-409d-a869-cdd60f66d72e" rel="noopener noreferrer"&gt;TimeSurge Labs Outro&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>tutorial</category>
      <category>ai</category>
      <category>beginners</category>
      <category>productivity</category>
    </item>
    <item>
      <title>I Said Goodbye to ChatGPT and Hello to Llama 3 on Open WebUI - You Should Too</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Wed, 24 Apr 2024 18:29:03 +0000</pubDate>
      <link>https://forem.com/timesurgelabs/i-said-goodbye-to-chatgpt-and-hello-to-llama-3-on-open-webui-you-should-too-4g6k</link>
      <guid>https://forem.com/timesurgelabs/i-said-goodbye-to-chatgpt-and-hello-to-llama-3-on-open-webui-you-should-too-4g6k</guid>
      <description>&lt;p&gt;I’m a huge fan of open source models, especially the newly release &lt;a href="https://llama.meta.com/llama3/" rel="noopener noreferrer"&gt;Llama 3&lt;/a&gt;. Because of the performance of both the large 70B Llama 3 model as well as the smaller and self-host-able 8B Llama 3, I’ve actually cancelled my ChatGPT subscription in favor of &lt;a href="https://docs.openwebui.com/" rel="noopener noreferrer"&gt;Open WebUI&lt;/a&gt;, a self-hostable ChatGPT-like UI that allows you to use &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; and other AI providers while keeping your chat history, prompts, and other data locally on any computer you control.&lt;/p&gt;

&lt;p&gt;My &lt;a href="https://dev.to/timesurgelabs/how-to-run-llama-3-locally-with-ollama-and-open-webui-297d"&gt;previous article&lt;/a&gt; went over how to get Open WebUI set up with &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; and Llama 3, however this isn’t the only way I take advantage of Open WebUI. The other way I use it is with external API providers, of which I use three. I’ll go over each of them, give you the pros and cons of each, and then show you how I set up all three of them in my Open WebUI instance!&lt;/p&gt;

&lt;h2&gt;
  
  
  External AIs
&lt;/h2&gt;

&lt;h3&gt;
  
  
  OpenAI
&lt;/h3&gt;

&lt;p&gt;OpenAI can either be considered the classic or the monopoly. Their AI tech is the most mature, and trades blows with the likes of Anthropic and Google. Even though Llama 3 70B (and even the smaller 8B model) is good enough for 99% of people and tasks, sometimes you just need the best, so I like having the option to either quickly answer my question or use it alongside other LLMs to quickly get options for an answer. &lt;/p&gt;

&lt;p&gt;OpenAI is the example that is most often used throughout the Open WebUI docs; however, Open WebUI can support any number of OpenAI-compatible APIs. Here’s another favorite of mine that I now use even more than OpenAI!&lt;/p&gt;

&lt;h3&gt;
  
  
  Groq Cloud
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://wow.groq.com/why-groq/" rel="noopener noreferrer"&gt;Groq&lt;/a&gt; is an AI hardware and infrastructure company that’s developing their own hardware LLM chip (which they call &lt;a href="https://wow.groq.com/groq-lpu-inference-engine-crushes-first-public-llm-benchmark/" rel="noopener noreferrer"&gt;an LPU&lt;/a&gt;). They offer an API to use their new LPUs with a number of open source LLMs (including Llama 3 8B and 70B) on their &lt;a href="https://console.groq.com" rel="noopener noreferrer"&gt;GroqCloud platform&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Their claim to fame is their insanely fast inference times - sequential token generation in the hundreds per second for 70B models and thousands for smaller models. Here’s Llama 3 70B running in real time on Open WebUI.&lt;/p&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-1782783466406322202-998" src="https://platform.twitter.com/embed/Tweet.html?id=1782783466406322202"&gt;
&lt;/iframe&gt;

&lt;/p&gt;

&lt;p&gt;Here’s the best part - GroqCloud is &lt;strong&gt;free&lt;/strong&gt; for most users. With no credit card input, they’ll grant you some pretty high rate limits, significantly higher than most AI API companies allow. Here are the limits for my newly created account.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff628pyddn7ssjp8ay4kk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff628pyddn7ssjp8ay4kk.png" alt="API Example" width="800" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;14k requests per day is a lot, and 12k tokens per minute is significantly higher than the average person can use on an interface like Open WebUI. &lt;/p&gt;

&lt;p&gt;Using GroqCloud with Open WebUI is possible thanks to an &lt;a href="https://console.groq.com/docs/openai" rel="noopener noreferrer"&gt;OpenAI-compatible API&lt;/a&gt; that Groq provides. All you have to do is generate an API Key &lt;a href="https://console.groq.com/keys" rel="noopener noreferrer"&gt;via the dashboard&lt;/a&gt;, set the API base URL in Open WebUI to &lt;code&gt;https://api.groq.com/openai/v1&lt;/code&gt;, and it’ll work just like OpenAI’s API!&lt;/p&gt;

&lt;p&gt;This is how I was able to use and evaluate Llama 3 as my replacement for ChatGPT!&lt;/p&gt;

&lt;h3&gt;
  
  
  Cloudflare Workers AI
&lt;/h3&gt;

&lt;p&gt;This is the part where I toot my own horn a little. Using Open WebUI via Cloudflare Workers is not natively possible; however, I developed my own &lt;a href="https://github.com/chand1012/openai-cf-workers-ai" rel="noopener noreferrer"&gt;OpenAI-compatible API for Cloudflare Workers&lt;/a&gt; a few months ago. I recently &lt;a href="https://github.com/chand1012/openai-cf-workers-ai/pull/8" rel="noopener noreferrer"&gt;added the &lt;code&gt;/models&lt;/code&gt; endpoint&lt;/a&gt; to it to make it compatible with Open WebUI, and it's been working great ever since. The main advantage of using Cloudflare Workers over something like GroqCloud is their &lt;a href="https://developers.cloudflare.com/workers-ai/models/#text-generation" rel="noopener noreferrer"&gt;massive variety of models&lt;/a&gt;. This allows you to test out many models quickly and effectively for many use cases, such as &lt;a href="https://developers.cloudflare.com/workers-ai/models/deepseek-math-7b-instruct/" rel="noopener noreferrer"&gt;DeepSeek Math&lt;/a&gt; (&lt;a href="https://huggingface.co/deepseek-ai/deepseek-math-7b-instruct" rel="noopener noreferrer"&gt;model card&lt;/a&gt;) for math-heavy tasks and &lt;a href="https://developers.cloudflare.com/workers-ai/models/llamaguard-7b-awq/" rel="noopener noreferrer"&gt;Llama Guard&lt;/a&gt; (&lt;a href="https://huggingface.co/meta-llama/LlamaGuard-7b" rel="noopener noreferrer"&gt;model card&lt;/a&gt;) for moderation tasks. They even support &lt;a href="https://developers.cloudflare.com/workers-ai/models/llama-3-8b-instruct/" rel="noopener noreferrer"&gt;Llama 3 8B&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;The main con of Workers AI is token limits and model size. Currently Llama 3 8B is the largest model supported, and they have &lt;a href="https://developers.cloudflare.com/workers-ai/models/llama-2-7b-chat-int8/#properties" rel="noopener noreferrer"&gt;token generation limits&lt;/a&gt; much smaller than some of the models available. I still think they’re worth having in this list due to the sheer variety of models they have available with no setup on your end other than deploying the API. If you want to set up the OpenAI-compatible API for Workers AI yourself, check out &lt;a href="https://github.com/chand1012/openai-cf-workers-ai?tab=readme-ov-file#deploying" rel="noopener noreferrer"&gt;the guide in the README&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adding External AIs to Open WebUI
&lt;/h2&gt;

&lt;p&gt;Now, how do you add all these to your Open WebUI instance? Assuming you’ve installed Open WebUI (&lt;a href="https://docs.openwebui.com/getting-started/" rel="noopener noreferrer"&gt;Installation Guide&lt;/a&gt;), the best way is via environment variables.&lt;/p&gt;

&lt;p&gt;When running Open WebUI using Docker, you can set the &lt;code&gt;OPENAI_API_BASE_URLS&lt;/code&gt; and &lt;code&gt;OPENAI_API_KEYS&lt;/code&gt; environment variables to configure the API endpoints.&lt;/p&gt;

&lt;p&gt;For example, to integrate OpenAI, GroqCloud, and Cloudflare Workers AI, you would set the environment variables as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 3000:8080 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; open-webui:/app/backend/data &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;OPENAI_API_BASE_URLS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://api.openai.com/v1;https://api.groq.com/openai/v1;https://openai-cf.yourusername.workers.dev/v1"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;OPENAI_API_KEYS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"sk-proj-ABCDEFGHIJK1234567890abcdef;gsk_1234567890abcdefabcdefghij;0123456789abcdef0123456789abcdef"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; open-webui &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--restart&lt;/span&gt; always &lt;span class="se"&gt;\&lt;/span&gt;
  ghcr.io/open-webui/open-webui:main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace &lt;code&gt;sk-proj-ABCDEFGHIJK1234567890abcdef&lt;/code&gt;, &lt;code&gt;gsk_1234567890abcdefabcdefghij&lt;/code&gt;, and &lt;code&gt;0123456789abcdef0123456789abcdef&lt;/code&gt; with your actual API keys. Make sure to list the keys in the same order as their respective base URLs. If you don’t, you’ll get errors saying that the APIs could not authenticate.&lt;/p&gt;

&lt;p&gt;When using Docker Compose, you can define the environment variables in your &lt;code&gt;docker-compose.yaml&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;open-webui&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_BASE_URLS=${OPENAI_API_BASE_URLS}'&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEYS=${OPENAI_API_KEYS}'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, you can define the values of these variables in an &lt;code&gt;.env&lt;/code&gt; file, placed in the same directory as the &lt;code&gt;docker-compose.yaml&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_BASE_URLS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://api.openai.com/v1;https://api.groq.com/openai/v1;https://openai-cf.yourusername.workers.dev/v1"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_API_KEYS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"sk-proj-ABCDEFGHIJK1234567890abcdef;gsk_1234567890abcdefabcdefghij;0123456789abcdef0123456789abcdef"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By following these steps, you can easily integrate multiple OpenAI-compatible APIs with your Open WebUI instance, unlocking the full potential of these powerful AI models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Open WebUI has opened up a whole new world of possibilities for me, allowing me to take control of my AI experiences and explore the vast array of OpenAI-compatible APIs out there. By seamlessly integrating multiple APIs, including OpenAI, Groq Cloud, and Cloudflare Workers AI, I've been able to break free from the shackles of proprietary chat platforms and take my AI experiences to the next level. If you're tired of being limited by traditional chat platforms, I highly recommend giving Open WebUI a try and discovering the vast possibilities that await you.&lt;/p&gt;

</description>
      <category>tutorial</category>
      <category>ai</category>
      <category>productivity</category>
      <category>api</category>
    </item>
    <item>
      <title>How to Run Llama 3 Locally with Ollama and Open WebUI</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Sun, 21 Apr 2024 14:26:46 +0000</pubDate>
      <link>https://forem.com/timesurgelabs/how-to-run-llama-3-locally-with-ollama-and-open-webui-297d</link>
      <guid>https://forem.com/timesurgelabs/how-to-run-llama-3-locally-with-ollama-and-open-webui-297d</guid>
      <description>&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/GT-Fwg124-I"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;I’m a big fan of Llama. Meta releasing their LLMs as open source is a net benefit for the tech community at large, and their permissive license allows most small and medium businesses to use the models with little to no restriction (within the bounds of the law, of course). Their latest release, Llama 3, has been highly anticipated.&lt;/p&gt;

&lt;p&gt;Llama 3 comes in two sizes: 8 billion and 70 billion parameters. Models like these are trained on massive amounts of text data and can be used for a variety of tasks, including generating text, translating languages, writing different kinds of creative content, and answering questions in an informative way. Meta touts Llama 3 as one of the best open models available, though it is still under development. Here are the 8B model’s benchmarks compared to Mistral and Gemma (according to Meta).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fax9r9z2w2zghv81grbh7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fax9r9z2w2zghv81grbh7.png" alt="Benchmarks" width="800" height="899"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This raises the question: how can I, a regular individual, run these models locally on my own computer?&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started with Ollama
&lt;/h2&gt;

&lt;p&gt;That’s where &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; comes in! Ollama is a free and open-source application that allows you to run various large language models, including Llama 3, on your own computer, even with limited resources. Ollama takes advantage of the performance gains of llama.cpp, an open source library designed to allow you to run LLMs locally with relatively low hardware requirements. It also includes a sort of package manager, allowing you to download and use LLMs quickly and effectively with just a single command. &lt;/p&gt;

&lt;p&gt;The first step is &lt;a href="https://ollama.com/download" rel="noopener noreferrer"&gt;installing Ollama&lt;/a&gt;. It supports all three major OSes, with &lt;a href="https://ollama.com/blog/windows-preview" rel="noopener noreferrer"&gt;Windows being a “preview”&lt;/a&gt; (a nicer word for beta).&lt;/p&gt;

&lt;p&gt;Once this is installed, open up your terminal. On all platforms, the command is the same.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama run llama3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait a few minutes while it downloads and loads the model, and then start chatting! It should bring you to a chat prompt similar to this one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama run llama3
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; Who was the second president of the united states?
The second President of the United States was John Adams. He served from 1797 to 1801, succeeding
George Washington and being succeeded by Thomas Jefferson.

&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; Who was the 30th?
The 30th President of the United States was Calvin Coolidge! He served from August 2, 1923, to March 4,
1929.

&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; /bye
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can chat all day in this terminal, but what if you want something more ChatGPT-like?&lt;/p&gt;

&lt;h2&gt;
  
  
  Open WebUI
&lt;/h2&gt;

&lt;p&gt;Open WebUI is an extensible, self-hosted UI that runs entirely inside &lt;a href="https://docs.docker.com/desktop/" rel="noopener noreferrer"&gt;Docker&lt;/a&gt;. It can be used with Ollama or with other OpenAI-compatible APIs, such as LiteLLM or my own &lt;a href="https://github.com/chand1012/openai-cf-workers-ai" rel="noopener noreferrer"&gt;OpenAI API for Cloudflare Workers&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Assuming you already have &lt;a href="https://docs.docker.com/desktop/" rel="noopener noreferrer"&gt;Docker&lt;/a&gt; and Ollama running on your computer, &lt;a href="https://docs.openwebui.com/getting-started/#quick-start-with-docker-" rel="noopener noreferrer"&gt;installation&lt;/a&gt; is super simple.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 3000:8080 &lt;span class="nt"&gt;--add-host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;host.docker.internal:host-gateway &lt;span class="nt"&gt;-v&lt;/span&gt; open-webui:/app/backend/data &lt;span class="nt"&gt;--name&lt;/span&gt; open-webui &lt;span class="nt"&gt;--restart&lt;/span&gt; always ghcr.io/open-webui/open-webui:main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then simply go to &lt;a href="http://localhost:3000" rel="noopener noreferrer"&gt;http://localhost:3000&lt;/a&gt;, make an account, and start chatting away!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frdi1d35zh09s78o8vqvb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frdi1d35zh09s78o8vqvb.png" alt="OpenWebUI Example" width="800" height="1169"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you didn’t run Llama 3 earlier, you’ll have to pull some models down before you can start chatting. The easiest way to do this is to click your name in the bottom left, then click the settings icon.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqyetksyn0y4a0p12ylu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqyetksyn0y4a0p12ylu.png" alt="Settings" width="634" height="744"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then click “Models” on the left side of the modal and paste in the name of a model from the &lt;a href="https://ollama.com/models" rel="noopener noreferrer"&gt;Ollama registry&lt;/a&gt;. Here are some models I’ve used and recommend for general purposes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;llama3&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mistral&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;llama2&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftxc581jf4w3xszymjfbg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftxc581jf4w3xszymjfbg.png" alt="Models Setting Page" width="800" height="544"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Ollama API
&lt;/h2&gt;

&lt;p&gt;If you want to integrate Ollama into your own projects, Ollama offers both its own API and an OpenAI-compatible API. The APIs automatically load a locally stored LLM into memory, run the inference, then unload it after a certain timeout. You do have to pull any models you want to use before calling them via the API, which can easily be done from the command line.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull mistral
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Native API
&lt;/h3&gt;

&lt;p&gt;Ollama provides its own API, along with a &lt;a href="https://github.com/ollama/ollama?tab=readme-ov-file#libraries" rel="noopener noreferrer"&gt;couple of SDKs&lt;/a&gt; for JavaScript and Python.&lt;/p&gt;

&lt;p&gt;Here is how you can do a simple text generation inference with the API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:11434/api/generate &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
  "model": "mistral",
  "prompt":"Why is the sky blue?"
}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here’s how you can do a Chat generation inference with the API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:11434/api/chat &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
  "model": "mistral",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace the &lt;code&gt;model&lt;/code&gt; parameter with whatever model you want to use. See the &lt;a href="https://github.com/ollama/ollama/blob/main/docs/api.md" rel="noopener noreferrer"&gt;official API docs&lt;/a&gt; for more information.&lt;/p&gt;
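&lt;p&gt;By default these endpoints stream the reply as newline-delimited JSON, one object per line, each carrying a fragment of text and a &lt;code&gt;done&lt;/code&gt; flag. Here is a minimal Python sketch, assuming that response shape, that reassembles the full text from a stream:&lt;/p&gt;

```python
import json

def assemble_stream(ndjson_lines):
    """Join the "response" fragments from a streamed /api/generate reply.

    Each line is assumed to be a JSON object like
    {"response": "...", "done": false}; the final chunk has "done": true.
    """
    text = []
    for line in ndjson_lines:
        if not line.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)
        text.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(text)

# Example chunks shaped like Ollama's streaming output:
chunks = [
    '{"response": "The sky ", "done": false}',
    '{"response": "is blue.", "done": true}',
]
print(assemble_stream(chunks))  # The sky is blue.
```

You can also pass &lt;code&gt;"stream": false&lt;/code&gt; in the request body to get the whole reply in a single JSON object instead.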

&lt;h3&gt;
  
  
  OpenAI Compatible API
&lt;/h3&gt;

&lt;p&gt;You can also use Ollama as a drop-in replacement (depending on your use case) for the OpenAI libraries. Here’s an example from &lt;a href="https://github.com/ollama/ollama/blob/main/docs/openai.md" rel="noopener noreferrer"&gt;their documentation&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http://localhost:11434/v1/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="c1"&gt;# required but ignored
&lt;/span&gt;    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ollama&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;chat_completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Say this is a test&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;mistral&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This also works for Javascript.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Javascript&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://localhost:11434/v1/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

  &lt;span class="c1"&gt;// required but ignored&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ollama&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chatCompletion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Say this is a test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;llama2&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The release of Meta's Llama 3 and the open-sourcing of its Large Language Model (LLM) technology mark a major milestone for the tech community. With these advanced models now accessible through local tools like Ollama and Open WebUI, ordinary individuals can tap into their immense potential to generate text, translate languages, craft creative writing, and more. Furthermore, the availability of APIs enables developers to seamlessly integrate LLMs into new projects or enhance existing ones. Ultimately, the democratization of LLM technology through open-source initiatives like Llama 3 unlocks a vast realm of innovative possibilities and fuels creativity in the tech industry.&lt;/p&gt;

</description>
      <category>tutorial</category>
      <category>ai</category>
      <category>productivity</category>
      <category>api</category>
    </item>
    <item>
      <title>Building a Fast, Efficient Web App: The Technology Stack of PromptSmithy Explained</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Tue, 26 Mar 2024 17:08:07 +0000</pubDate>
      <link>https://forem.com/timesurgelabs/building-a-fast-efficient-web-app-the-technology-stack-of-promptsmithy-explained-184f</link>
      <guid>https://forem.com/timesurgelabs/building-a-fast-efficient-web-app-the-technology-stack-of-promptsmithy-explained-184f</guid>
      <description>&lt;p&gt;I’ve written a lot of one-off project, internal scripts for my own use, B2C apps, B2B apps, and everything in between. Every time I start a new project I like to use a new stack to try and diversify my own skillset, and so that if I’m ever tasked with doing another similar project in the future, my knowledge can accelerate my workflow. &lt;a href="https://promptsmithy.com/" rel="noopener noreferrer"&gt;PromptSmithy&lt;/a&gt; was slightly different though, as the stack hadn’t changed from other recent projects of mine, however the frontend tooling did greatly. In this article I’m going to break down the stack we used and talk about the new development flow we used for rapid development.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;h3&gt;
  
  
  React + Vite + React Router
&lt;/h3&gt;

&lt;p&gt;We all know what &lt;a href="https://react.dev/" rel="noopener noreferrer"&gt;React&lt;/a&gt; is at this point, but why use it with &lt;a href="https://vitejs.dev/" rel="noopener noreferrer"&gt;Vite&lt;/a&gt; and &lt;a href="https://reactrouter.com/en/main" rel="noopener noreferrer"&gt;React Router DOM&lt;/a&gt; over something like &lt;a href="https://nextjs.org/" rel="noopener noreferrer"&gt;NextJS&lt;/a&gt;? &lt;/p&gt;

&lt;p&gt;The reasons are twofold: we didn’t need any of NextJS’s backend functionality, and I wanted something that wouldn’t complicate our development with SSR or other special cases found only in NextJS.&lt;/p&gt;

&lt;p&gt;On top of that, Vite’s compiler is super fast, supports &lt;a href="https://www.typescriptlang.org/" rel="noopener noreferrer"&gt;Typescript&lt;/a&gt; (which we of course used), and built just fine on our host, &lt;a href="https://pages.cloudflare.com/" rel="noopener noreferrer"&gt;Cloudflare Pages&lt;/a&gt;. Cloudflare Pages is a super fast static website hosting service by Cloudflare that serves your site from their global CDN, keeping it as close to your users as possible. It supports nearly any JS framework you could want for your site, and can even host plain old HTML if you’re of that persuasion.&lt;/p&gt;

&lt;p&gt;React Router is also super minimal, stays out of the way, and provides the same functionality as NextJS’s static-output router without making your bundle sizes massive. Our entire built site (before we added all the fancy physics animations) was just a bit over 135KB.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tailwind + shadcn/ui + v0.dev
&lt;/h3&gt;

&lt;p&gt;For development of the UI components, we tried something new. Vercel has a new AI tool called &lt;a href="http://v0.dev" rel="noopener noreferrer"&gt;v0.dev&lt;/a&gt; that lets developers build with &lt;a href="https://ui.shadcn.com/" rel="noopener noreferrer"&gt;shadcn/ui&lt;/a&gt; and &lt;a href="https://tailwindcss.com/" rel="noopener noreferrer"&gt;Tailwind&lt;/a&gt; using nothing but words; the results can then be pulled into your local project with a simple &lt;code&gt;npx&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjxoy51g12efo45ozzr77.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjxoy51g12efo45ozzr77.png" alt="Here is the original example code for a 404 page that I ended up using in the final app!" width="800" height="566"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is the &lt;a href="https://v0.dev/t/RB8eJs2Kd6R" rel="noopener noreferrer"&gt;original example code&lt;/a&gt; for a 404 page that I ended up using in the final app! &lt;code&gt;npx v0 add RB8eJs2Kd6R&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;While I have experience with Tailwind and frontend development, I don’t really have the patience for it. I usually end up using something like &lt;a href="https://mantine.dev/" rel="noopener noreferrer"&gt;Mantine&lt;/a&gt;, a complete component library UI kit, or &lt;a href="https://daisyui.com/" rel="noopener noreferrer"&gt;Daisy UI&lt;/a&gt;, a component library built on top of Tailwind. shadcn/ui is quite similar to Daisy in this sense, but because the components get installed into your own components folder, customizing them individually made development more streamlined. On top of that, being able to restyle components with natural language thanks to v0 made development super easy and fast. shadcn may be too minimalist a style for some, but since all the components are local, you can customize them quickly and easily!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsijgkghiwn9ibl4bru0x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsijgkghiwn9ibl4bru0x.png" alt="This is the structure of the project’s components directory" width="746" height="740"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is the structure of the project’s components directory.&lt;/p&gt;

&lt;h3&gt;
  
  
  Supabase
&lt;/h3&gt;

&lt;p&gt;Here’s the thing that accelerated my development the most: &lt;a href="https://supabase.com/" rel="noopener noreferrer"&gt;Supabase&lt;/a&gt;. Thanks to its Database, Authentication, and Edge Functions, we were able to rapidly develop the app. Their JS library made development seamless, and their local development stack made testing a breeze.&lt;/p&gt;

&lt;p&gt;The development process is simple: install their &lt;a href="https://supabase.com/docs/guides/cli" rel="noopener noreferrer"&gt;CLI&lt;/a&gt;, run &lt;code&gt;supabase init&lt;/code&gt;, then run &lt;code&gt;supabase start&lt;/code&gt;. That’s it (assuming you have Docker installed).&lt;/p&gt;

&lt;p&gt;The database service is pretty self-explanatory. Rather than writing SQL queries in a remote API and hosting that API along with the database behind it, you simply create tables with migrations (created using &lt;code&gt;supabase migrations new name_here&lt;/code&gt;), then query those tables with the frontend client library. From there you can configure row level security, which restricts access to specific rows on a per-user basis, using either the migrations themselves or the local UI. I opted for the former so that I could easily apply the migrations. Here is one I wrote for the project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="n"&gt;prompts&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;primary&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt; &lt;span class="k"&gt;generated&lt;/span&gt; &lt;span class="n"&gt;always&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;identity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;metaprompt_id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;references&lt;/span&gt; &lt;span class="n"&gt;metaprompts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;variables&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;time&lt;/span&gt; &lt;span class="k"&gt;zone&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;time&lt;/span&gt; &lt;span class="k"&gt;zone&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nb"&gt;boolean&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;alter&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="n"&gt;prompts&lt;/span&gt;
  &lt;span class="n"&gt;enable&lt;/span&gt; &lt;span class="k"&gt;row&lt;/span&gt; &lt;span class="k"&gt;level&lt;/span&gt; &lt;span class="k"&gt;security&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="nv"&gt;"Users can insert their own prompts"&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;prompts&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;insert&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="k"&gt;check&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="nv"&gt;"Users can read their own prompts that are private"&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;prompts&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="nv"&gt;"Users can read all prompts that are public"&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;prompts&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This then got applied locally by running &lt;code&gt;supabase db reset&lt;/code&gt;, and deployed remotely with &lt;code&gt;supabase db push&lt;/code&gt;. We could then query on the frontend using the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;prompts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;public&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;created_at&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;ascending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see &lt;a href="https://supabase.com/docs/guides/getting-started/quickstarts/reactjs" rel="noopener noreferrer"&gt;Supabase’s excellent guides&lt;/a&gt; on how to do this for more information.&lt;/p&gt;

&lt;p&gt;We also took advantage of Supabase’s Authentication service, which lets us log users in quickly so we can handle authenticated requests (which are automatic once Row Level Security is set up). Since this was a weekend-project sort of app, we went with Supabase Magic Links, which let our users log in by simply entering their email and clicking the link that gets sent to them. Here’s all the code to do that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;me@example.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;signInWithOtp&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;email&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s it! From there, all we had to do was have the user click the link (assuming the site URL is configured properly in the Supabase settings) and they were logged in!&lt;/p&gt;

&lt;p&gt;Finally, we used Supabase Edge Functions to handle payments with Stripe and to hold PromptSmithy’s business logic, which primarily consists of calling Anthropic’s AI. Edge Functions are written in Deno, a NodeJS alternative that I like very much. You create a new edge function by running &lt;code&gt;supabase functions new name_here&lt;/code&gt; and deploy it with &lt;code&gt;supabase functions deploy&lt;/code&gt;. You can also run these functions locally for testing (which is what we did, along with the Stripe CLI’s webhook feature) with &lt;code&gt;supabase functions serve&lt;/code&gt;.&lt;/p&gt;
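
&lt;p&gt;For a sense of what such a function looks like, here is a minimal sketch of an edge-function-style handler. Supabase Edge Functions are built on the web-standard Request and Response types; in a deployed function you would pass this handler to Deno.serve. The prompt-building logic below is a placeholder, not PromptSmithy’s actual code.&lt;/p&gt;

```typescript
// Minimal sketch of a Supabase Edge Function handler. Edge Functions
// are built on the web-standard Request/Response types; in a real
// deployment this handler would be passed to Deno.serve.
// The prompt-building logic is a placeholder for illustration only.
const handler = async (req: Request) => {
  // Parse the JSON body sent by supabase.functions.invoke on the frontend.
  const { task } = await req.json();
  // Placeholder "business logic": build a prompt string from the task.
  const prompt = `You are a helpful assistant. Complete this task: ${task}`;
  // Always respond with JSON, as the frontend expects.
  return new Response(JSON.stringify({ prompt }), {
    status: 200,
    headers: { "Content-Type": "application/json" },
  });
};
```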

&lt;p&gt;Calling your functions from the frontend is super simple too. Whenever we wanted to call the AI, this is the code we ran on the frontend.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Write me an email response to my boss asking for a raise&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;functions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;invoke&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;create-task-prompt&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;public&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;metapromptID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The value of &lt;code&gt;resp&lt;/code&gt; would be whatever the function responded with, which is always JSON for our application.&lt;/p&gt;

&lt;p&gt;Functions can also be invoked by remote applications, for example Stripe webhooks. If you want this, you’ll need to make sure that JWT verification is disabled for that function, which can be done in the &lt;code&gt;config.toml&lt;/code&gt; in the &lt;code&gt;supabase&lt;/code&gt; directory of your project. Here’s an example.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[functions.stripe-webhook]&lt;/span&gt;
&lt;span class="py"&gt;verify_jwt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now whenever this function is deployed you can check your Edge Functions page for a URL to give to Stripe!&lt;/p&gt;
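
&lt;p&gt;Disabling JWT verification is safe here because Stripe authenticates its webhooks a different way: each request carries a signature computed with HMAC-SHA256 over the timestamp and raw payload using your signing secret. In practice you’d let Stripe’s SDK verify this for you, but a hand-rolled sketch (the header value and secret below are illustrative) shows the idea:&lt;/p&gt;

```typescript
// Hand-rolled sketch of Stripe's v1 webhook signature scheme, for
// illustration only; in a real function, use Stripe's SDK to verify.
// The header looks like "t=1694000000,v1=hexdigest", where the digest
// is an HMAC-SHA256 of "timestamp.payload" keyed by your signing secret.
import { createHmac, timingSafeEqual } from "node:crypto";

const verifySignature = (payload: string, header: string, secret: string) => {
  // Split "t=...,v1=..." into a key/value map.
  const parts: { [key: string]: string } = {};
  for (const piece of header.split(",")) {
    const [k, v] = piece.split("=");
    parts[k] = v;
  }
  // Recompute the digest from the timestamp and raw payload.
  const expected = createHmac("sha256", secret)
    .update(`${parts.t}.${payload}`)
    .digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(parts.v1 ?? "");
  if (a.length !== b.length) return false; // different lengths can never match
  return timingSafeEqual(a, b); // constant-time comparison
};
```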

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Our choice of stack for PromptSmithy was driven primarily by speed of development and the performance of the end product. Using tools like Vite, React, Supabase, and v0.dev, we were able to develop rapidly and effectively, resulting in a highly functional and efficient application.&lt;/p&gt;

&lt;p&gt;Want to give &lt;a href="https://promptsmithy.com/" rel="noopener noreferrer"&gt;PromptSmithy&lt;/a&gt; a try? All new users get &lt;strong&gt;$5 in free credits!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>react</category>
      <category>productivity</category>
    </item>
    <item>
      <title>🦉 AthenaDB: Distributed Vector Database Powered by Cloudflare 🌩️</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Mon, 19 Feb 2024 18:47:15 +0000</pubDate>
      <link>https://forem.com/timesurgelabs/athenadb-distributed-vector-database-powered-by-cloudflare-4p59</link>
      <guid>https://forem.com/timesurgelabs/athenadb-distributed-vector-database-powered-by-cloudflare-4p59</guid>
      <description>&lt;h2&gt;
  
  
  What is AthenaDB?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/TimeSurgeLabs/athenadb" rel="noopener noreferrer"&gt;AthenaDB&lt;/a&gt; is a serverless vector database designed to be highly distributed and easily accessible as an API. It leverages &lt;a href="https://developers.cloudflare.com/workers-ai/" rel="noopener noreferrer"&gt;Cloudflare’s Workers AI&lt;/a&gt; platform to create the vectors, &lt;a href="https://developers.cloudflare.com/vectorize/" rel="noopener noreferrer"&gt;Cloudflare Vectorize&lt;/a&gt; for handling vector querying, and &lt;a href="https://developers.cloudflare.com/d1/" rel="noopener noreferrer"&gt;Cloudflare D1&lt;/a&gt; as its database for storing text. This combination allows AthenaDB to offer a simple yet powerful set of API endpoints for inserting, querying, retrieving, and deleting vector text data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features of AthenaDB
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simple API Endpoints&lt;/strong&gt;: AthenaDB provides straightforward endpoints for various database operations, making it accessible for developers of all skill levels.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Distributed Nature&lt;/strong&gt;: With data replication across multiple data centers, AthenaDB ensures high availability and resilience.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Built-In Data Replication&lt;/strong&gt;: Due to Cloudflare Workers’ underlying architecture, data is replicated across data centers automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: AthenaDB is designed to handle large amounts of vector text data, making it suitable for projects with high data volumes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serverless Architecture&lt;/strong&gt;: With AthenaDB being serverless, you don't have to worry about managing infrastructure, allowing for more focus on development.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What Are Vector Databases?
&lt;/h2&gt;

&lt;p&gt;Vector databases are a special kind of computer storage that helps artificial intelligence (AI) programs quickly understand and use information. They work by turning data into numbers (called vectors) that the AI can easily compare to find similarities. This is really useful for things like online searches, suggesting products you might like, or creating smart chatbots. &lt;/p&gt;

&lt;p&gt;For example, if a vector database has the following three items: “Python is cool”, “Java is cool”, and “C is statically typed”, and the user searches for “coffee”, it would return “Java is cool”. Why? Because while the user may not have been talking about the programming language Java, the words “java” and “coffee” are closely related in meaning, and the neural network that created the vectors captures that relationship mathematically.&lt;/p&gt;

&lt;p&gt;You can learn more about them &lt;a href="https://www.cloudflare.com/learning/ai/what-is-vector-database/" rel="noopener noreferrer"&gt;in this article&lt;/a&gt;.&lt;/p&gt;
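
&lt;p&gt;To make the “coffee” example concrete, here is a toy sketch of the ranking step a vector database performs. The three-dimensional vectors below are made up purely for illustration; real embeddings, like the ones Workers AI produces, have hundreds of dimensions.&lt;/p&gt;

```typescript
// Toy illustration of what a vector database does under the hood:
// embed each item as a vector, then rank items by cosine similarity
// to the query's vector. These 3-dimensional vectors are invented to
// mirror the "java"/"coffee" example; real embeddings come from a
// neural network and are much higher-dimensional.
const dot = (a: number[], b: number[]) =>
  a.reduce((sum, v, i) => sum + v * b[i], 0);

const cosine = (a: number[], b: number[]) =>
  dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));

// Hypothetical embeddings: "java" and "coffee" point in similar directions.
const items: { [text: string]: number[] } = {
  "Python is cool": [0.9, 0.1, 0.0],
  "Java is cool": [0.2, 0.9, 0.1],
  "C is statically typed": [0.1, 0.0, 0.9],
};
const query = [0.1, 0.95, 0.05]; // pretend embedding of "coffee"

// Rank every item by similarity to the query; the highest score wins.
const best = Object.entries(items)
  .map(([text, vec]) => ({ text, score: cosine(query, vec) }))
  .sort((x, y) => y.score - x.score)[0];
// With these made-up vectors, best.text is "Java is cool".
```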

&lt;h2&gt;
  
  
  Why Cloudflare?
&lt;/h2&gt;

&lt;p&gt;Cloudflare has a serverless compute platform called &lt;a href="https://workers.cloudflare.com/" rel="noopener noreferrer"&gt;Workers&lt;/a&gt;. Workers are automatically replicated across all Cloudflare data centers, meaning a developer can build an API or other application that scales automatically with zero infrastructure. Workers also route each user’s requests to their nearest data center, which significantly reduces latency.&lt;/p&gt;

&lt;p&gt;This means AthenaDB inherits many of the features of Cloudflare’s platform (data replication, distribution across data centers, and a highly scalable serverless architecture) with no complicated codebase or infrastructure management.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started with AthenaDB
&lt;/h2&gt;

&lt;p&gt;Deploying and using AthenaDB involves a few steps, starting from setting up your environment to deploying your instance of AthenaDB.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;Before you begin, make sure you have the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cloudflare account&lt;/li&gt;
&lt;li&gt;Node.js and npm installed&lt;/li&gt;
&lt;li&gt;Wrangler CLI installed (&lt;code&gt;npm install -g wrangler&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Deployment Steps
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Clone the Repository&lt;/strong&gt;: Start by cloning the AthenaDB repository to your local machine, installing dependencies, and logging in to Wrangler.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/TimeSurgeLabs/athenadb.git
&lt;span class="nb"&gt;cd &lt;/span&gt;athenadb
npm i
npx wrangler login
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Create a Vector and Database&lt;/strong&gt;: Use the provided npm scripts to create a vector and database for your AthenaDB instance.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run create-vector
npm run create-db
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;After running these commands, copy the output Database ID and update the &lt;code&gt;wrangler.toml&lt;/code&gt; file under &lt;code&gt;database_id&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Initialize the Database&lt;/strong&gt;: Run the initialization script to set up the database schema.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run init-db
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deploy AthenaDB&lt;/strong&gt;: Finally, deploy your instance of AthenaDB using Wrangler.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run deploy
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Upon successful deployment, you will receive an output with your API URL, which indicates that AthenaDB is now ready for use.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Using AthenaDB
&lt;/h3&gt;

&lt;p&gt;With AthenaDB deployed, you can start interacting with the database through its API endpoints. Here are some examples of how you can use AthenaDB:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Inserting Text Data&lt;/strong&gt;: Use the &lt;code&gt;/insert&lt;/code&gt; endpoint to add text data into the database.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://athenadb.yourusername.workers.dev/your-namespace/insert&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Your text here&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Querying the Database&lt;/strong&gt;: To find similar text embeddings, use the &lt;code&gt;/query&lt;/code&gt; endpoint.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://athenadb.yourusername.workers.dev/your-namespace/query&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Query text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Retrieving an Entry&lt;/strong&gt;: Retrieve specific entries using their UUID with the &lt;code&gt;GET&lt;/code&gt; endpoint.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://athenadb.yourusername.workers.dev/your-namespace/your-uuid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GET&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deleting Data&lt;/strong&gt;: Use the &lt;code&gt;/delete&lt;/code&gt; endpoint to remove data from the database.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://athenadb.yourusername.workers.dev/your-namespace/your-uuid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DELETE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;
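
&lt;p&gt;If you’re calling several of these endpoints, it can be handy to wrap them in a tiny client so the &lt;code&gt;fetch&lt;/code&gt; boilerplate lives in one place. Here is one possible sketch; the deployment URL and namespace are placeholders, and the paths simply mirror the examples above.&lt;/p&gt;

```typescript
// A small convenience wrapper around the AthenaDB endpoints shown
// above. The base URL and namespace are placeholders; swap in your
// own deployment's values.
const makeClient = (baseUrl: string, namespace: string) => {
  // Build a full endpoint URL for this namespace.
  const url = (path: string) => `${baseUrl}/${namespace}/${path}`;
  const jsonPost = (path: string, input: string) =>
    fetch(url(path), {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ input }),
    });
  return {
    url, // exposed so callers can see where requests go
    insert: (input: string) => jsonPost("insert", input),   // add text
    query: (input: string) => jsonPost("query", input),     // similarity search
    getEntry: (uuid: string) => fetch(url(uuid), { method: "GET" }),
    remove: (uuid: string) => fetch(url(uuid), { method: "DELETE" }),
  };
};

// Hypothetical deployment URL, matching the examples above.
const db = makeClient("https://athenadb.yourusername.workers.dev", "notes");
```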

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AthenaDB stands out as a powerful tool for developers needing a scalable, serverless database solution for managing vector text data. By following the steps outlined in this blog post, you can deploy your own instance of AthenaDB and begin leveraging its capabilities for your projects. Whether you're building search engines, recommendation systems, or any application that requires efficient handling of vector data, AthenaDB provides a robust, easy-to-use solution.&lt;/p&gt;

&lt;p&gt;If you’re looking to integrate AI into your existing workflow or products, &lt;a href="https://timesurgelabs.com/" rel="noopener noreferrer"&gt;TimeSurge Labs&lt;/a&gt; is here to help. Specializing in AI consulting, development, internal tooling, and LLM hosting, our team of passionate AI experts is dedicated to building the future of AI and helping your business thrive in this rapidly changing industry. &lt;a href="https://timesurgelabs.com/#contact" rel="noopener noreferrer"&gt;Contact us&lt;/a&gt; today!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>database</category>
      <category>cloud</category>
      <category>serverless</category>
    </item>
    <item>
      <title>Credit Card Skimmers &amp; Open Source | Casual Coders Podcast</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Mon, 12 Feb 2024 13:46:47 +0000</pubDate>
      <link>https://forem.com/casualcoders/credit-card-skimmers-open-source-causal-coders-podcast-362g</link>
      <guid>https://forem.com/casualcoders/credit-card-skimmers-open-source-causal-coders-podcast-362g</guid>
      <description>&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/pKY3kB39iWc"&gt;
&lt;/iframe&gt;
&lt;br&gt;
Season 2 of the Casual Coders podcast is live! I wasn’t in this first episode, but it’s still a good one! The hosts talk about credit card skimmers and how credit cards work, the best open source tools, and a long-running web server project by one of our hosts. Available on Spotify, Apple Podcasts, YouTube, or wherever you get your podcasts!&lt;br&gt;
&lt;/p&gt;
&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
        &lt;div class="c-embed__cover"&gt;
          &lt;a href="https://creators.spotify.com/pod/profile/casual-coders/embed" class="c-link align-middle" rel="noopener noreferrer"&gt;
            &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd3t3ozftmdmh3i.cloudfront.net%2Fstaging%2Fpodcast_uploaded_nologo400%2F16776332%2F16776332-1707408112500-5db323d9bbb53.jpg" height="400" class="m-0" width="400"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="c-embed__body"&gt;
        &lt;h2 class="fs-xl lh-tight"&gt;
          &lt;a href="https://creators.spotify.com/pod/profile/casual-coders/embed" rel="noopener noreferrer" class="c-link"&gt;
            Casual Coders Podcast • A podcast on Spotify for Creators
          &lt;/a&gt;
        &lt;/h2&gt;
          &lt;p class="truncate-at-3"&gt;
            Welcome to the Casual Coders Podcast! We're a group of makers and engineers who talk nerd and make cool stuff! 

We strive to release two podcasts a month!

Follow our socials!
YouTube | https://www.youtube.com/c/CasualCodersOfficial
Instagram | https://www.instagram.com/casualcodersprojects/
Twitter | https://twitter.com/CasualCoders 
Facebook | https://www.facebook.com/CasualCodersProjects 
TikTok | https://www.tiktok.com/@casualcoders 

          &lt;/p&gt;
        &lt;div class="color-secondary fs-s flex items-center"&gt;
            &lt;img alt="favicon" class="c-embed__favicon m-0 mr-2 radius-0" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd12xoj7p9moygp.cloudfront.net%2Ffavicon%2Ffavicon-s4p-196x196.png" width="196" height="196"&gt;
          creators.spotify.com
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


</description>
      <category>podcast</category>
      <category>opensource</category>
      <category>programming</category>
    </item>
    <item>
      <title>How I Use Google's Gemini Pro with LangChain</title>
      <dc:creator>Chandler</dc:creator>
      <pubDate>Thu, 04 Jan 2024 17:00:26 +0000</pubDate>
      <link>https://forem.com/timesurgelabs/how-to-use-googles-gemini-pro-with-langchain-1eje</link>
      <guid>https://forem.com/timesurgelabs/how-to-use-googles-gemini-pro-with-langchain-1eje</guid>
      <description>&lt;p&gt;Google's Gemini Pro is one of the newest LLMs publicly available, and to the surprise of some its relatively price competitive.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvin3nlq60w8hd6m2wc99.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvin3nlq60w8hd6m2wc99.png" alt="Gemini Pro Pricing" width="800" height="816"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's effectively free while you're developing, and after your development is complete it's relatively cheap, costing around $0.00025 per 1K characters (&lt;strong&gt;characters&lt;/strong&gt;, not &lt;strong&gt;tokens&lt;/strong&gt; like OpenAI), which is slightly more expensive than GPT-3.5-Turbo, and $0.0025 per image, which is effectively the same as OpenAI's GPT-4.&lt;/p&gt;
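
&lt;p&gt;Those per-unit rates are easier to reason about with a quick estimate. Here is a small cost calculator using the prices quoted above; double-check Google's current pricing page before budgeting off these numbers.&lt;/p&gt;

```typescript
// Back-of-the-envelope cost estimator for the Gemini Pro rates quoted
// in this post: $0.00025 per 1K characters of text and $0.0025 per
// image. These numbers come from the article, not a live price list.
const TEXT_RATE_PER_1K_CHARS = 0.00025;
const IMAGE_RATE = 0.0025;

// Total estimated cost in dollars for a given usage.
const estimateCost = (chars: number, images: number = 0) =>
  (chars / 1000) * TEXT_RATE_PER_1K_CHARS + images * IMAGE_RATE;

// A 4,000-character exchange plus one image comes out to roughly $0.0035.
const example = estimateCost(4000, 1);
```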

&lt;h2&gt;
  
  
  Okay, I get it, how do I use it?
&lt;/h2&gt;

&lt;p&gt;Let's start fresh with a new project. Assuming you're using Python &amp;gt;= 3.10, let's initialize a new virtual environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; venv &lt;span class="nb"&gt;env
source env&lt;/span&gt;/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And let's install LangChain and our dotenv file loader first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install langchain python-dotenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once that's done, we can install Gemini Pro's libraries and its LangChain adapter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install google-generativeai langchain-google-genai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, you need to acquire an API key, which can be done on &lt;a href="https://makersuite.google.com" rel="noopener noreferrer"&gt;Google MakerSuite&lt;/a&gt;. In the top left of the page you should see a "Get API Key" button.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9icwjzh0oco2n5d0zgl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9icwjzh0oco2n5d0zgl.png" alt="API Key Button" width="620" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click that button, then click "Create API Key in new project". &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiw4uv7wwo2baq2ge5rim.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiw4uv7wwo2baq2ge5rim.png" alt="New Key Button" width="800" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Copy the new API Key and save it to a .env file in your project directory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;new_api_key_here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can create a script that calls Gemini Pro via LangChain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_google_genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatGoogleGenerativeAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PromptTemplate&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LLMChain&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;GOOGLE_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatGoogleGenerativeAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-pro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;google_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;tweet_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a content creator. Write me a tweet about {topic}.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;tweet_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LLMChain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tweet_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;how ai is really cool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tweet_chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And that's it! You've now integrated Gemini Pro with LangChain! If you're interested in learning more about LangChain and AI, follow us here on Dev.to as well as on &lt;a href="https://twitter.com/TimeSurgeLabs" rel="noopener noreferrer"&gt;X&lt;/a&gt;! I also post a lot of AI and developer content on my &lt;a href="https://twitter.com/Chand1012Dev" rel="noopener noreferrer"&gt;personal X account&lt;/a&gt;! We also have more articles on LangChain and AI on &lt;a href="https://dev.to/timesurgelabs"&gt;our Dev Page&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;Happy coding! &lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>tutorial</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
