<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Victor</title>
    <description>The latest articles on Forem by Victor (@demosjarco).</description>
    <link>https://forem.com/demosjarco</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1410166%2F6b161cb7-6a9d-4149-94d6-898ec96a6822.jpeg</url>
      <title>Forem: Victor</title>
      <link>https://forem.com/demosjarco</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/demosjarco"/>
    <language>en</language>
    <item>
      <title>MATT AI</title>
      <dc:creator>Victor</dc:creator>
      <pubDate>Sun, 14 Apr 2024 00:08:23 +0000</pubDate>
      <link>https://forem.com/demosjarco/matt-ai-99f</link>
      <guid>https://forem.com/demosjarco/matt-ai-99f</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/devteam/join-us-for-the-cloudflare-ai-challenge-3000-in-prizes-5f99"&gt;Cloudflare AI Challenge&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;What I Built&lt;/h2&gt;

&lt;p&gt;This open-source project demonstrates the possibilities with Cloudflare Workers AI in a single, seamless conversation. Additionally, for privacy reasons, everything is stored locally in the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API" rel="noopener noreferrer"&gt;browser&lt;/a&gt; with no server logging or storage.&lt;/p&gt;
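The local-only storage described above could look something like this minimal sketch; the database name, store name, and record shape are illustrative stand-ins, not the project's actual schema:

```typescript
// Hypothetical sketch of browser-side persistence with IndexedDB.
// Names ('chat-db', 'messages') and the record shape are illustrative.

interface ChatMessage {
	role: 'user' | 'assistant';
	content: string;
	sentAt: number;
}

// Pure helper: build the record that would be persisted.
export function makeMessage(role: ChatMessage['role'], content: string, sentAt: number = Date.now()): ChatMessage {
	return { role, content, sentAt };
}

// Append a message to a local IndexedDB store; nothing leaves the browser.
export function saveMessage(msg: ChatMessage): Promise<void> {
	const idb = (globalThis as any).indexedDB; // browser global
	return new Promise((resolve, reject) => {
		const open = idb.open('chat-db', 1);
		open.onupgradeneeded = () => open.result.createObjectStore('messages', { autoIncrement: true });
		open.onsuccess = () => {
			const tx = open.result.transaction('messages', 'readwrite');
			tx.objectStore('messages').add(msg);
			tx.oncomplete = () => resolve();
			tx.onerror = () => reject(tx.error);
		};
		open.onerror = () => reject(open.error);
	});
}
```

Because the store lives entirely in the browser, clearing site data wipes the history — that is the privacy trade-off being made.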

&lt;h2&gt;Demo&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://matt-ai.pages.dev" rel="noopener noreferrer"&gt;https://matt-ai.pages.dev&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;My Code&lt;/h2&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/demosjarco" rel="noopener noreferrer"&gt;
        demosjarco
      &lt;/a&gt; / &lt;a href="https://github.com/demosjarco/matt-ai" rel="noopener noreferrer"&gt;
        matt-ai
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Magically All The Things AI
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;matt-ai&lt;/h1&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Get Started&lt;/h2&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Visit live&lt;/h3&gt;
&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://matt-ai.pages.dev" rel="nofollow noopener noreferrer"&gt;matt-ai.pages.dev&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Run local&lt;/h3&gt;

&lt;/div&gt;
&lt;div class="markdown-alert markdown-alert-tip"&gt;
&lt;p class="markdown-alert-title"&gt;Tip&lt;/p&gt;

&lt;p&gt;This repo supports &lt;a href="https://github.com/features/codespaces" rel="noopener noreferrer"&gt;GitHub Codespaces&lt;/a&gt; and is &lt;a href="https://github.com/demosjarco/matt-ai.devcontainer" rel="noopener noreferrer"&gt;preconfigured&lt;/a&gt; for it.&lt;/p&gt;


&lt;/div&gt;

&lt;div class="markdown-alert markdown-alert-important"&gt;
&lt;p class="markdown-alert-title"&gt;Important&lt;/p&gt;
&lt;p&gt;Make sure you are running this project with the latest &lt;code&gt;lts&lt;/code&gt; version of Node.js (GitHub Codespaces is already set up with &lt;code&gt;lts/*&lt;/code&gt;). Other versions may work but are not guaranteed.&lt;/p&gt;
&lt;/div&gt;

&lt;ol&gt;
&lt;li&gt;Duplicate &lt;a href="https://github.com/demosjarco/matt-ai/blob/production/pages/.dev.vars.example" rel="noopener noreferrer"&gt;&lt;code&gt;pages/.dev.vars.example&lt;/code&gt;&lt;/a&gt;, but without the &lt;code&gt;.example&lt;/code&gt; extension, and fill in the values appropriately&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="markdown-alert markdown-alert-note"&gt;
&lt;p class="markdown-alert-title"&gt;Note&lt;/p&gt;
&lt;p&gt;On &lt;code&gt;localhost&lt;/code&gt;, Turnstile is configured with the &lt;a href="https://developers.cloudflare.com/turnstile/reference/testing/#dummy-sitekeys-and-secret-keys" rel="nofollow noopener noreferrer"&gt;dummy keys&lt;/a&gt; (&lt;code&gt;Always passes&lt;/code&gt;/&lt;code&gt;invisible&lt;/code&gt;). Use the &lt;code&gt;Always passes&lt;/code&gt; secret key to allow usage.&lt;/p&gt;
&lt;/div&gt;


&lt;ol start="2"&gt;

&lt;li&gt;Install packages (if you are running in GitHub Codespaces, you can skip this step)

&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;npm ci --include-workspace-root --workspaces&lt;/pre&gt;

&lt;/div&gt;


&lt;/li&gt;

&lt;li&gt;Build everything (in GitHub Codespaces/VS Code, you can simply press &lt;code&gt;ctrl&lt;/code&gt;/&lt;code&gt;cmd&lt;/code&gt; + &lt;code&gt;shift&lt;/code&gt; + &lt;code&gt;b&lt;/code&gt;):

&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;npm run build:local&lt;/pre&gt;

&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;If there's ever a build error or corruption:
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;npm run clean&lt;/pre&gt;

&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;To run the site, there are…&lt;/li&gt;

&lt;/ol&gt;
&lt;/div&gt;
&lt;br&gt;
  &lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/demosjarco/matt-ai" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;p&gt;I will stop pushes to the production branch at the submission deadline; however, work (outside of the competition) will continue in other branches.&lt;/p&gt;

&lt;h2&gt;Journey&lt;/h2&gt;

&lt;p&gt;From the start, I wanted a private (as much as possible without running the inference yourself) solution for chats. That means no server-side storage, nor even accounts to identify people. To combat spam, bots, and abuse, I implemented &lt;a href="https://developers.cloudflare.com/turnstile" rel="noopener noreferrer"&gt;Turnstile&lt;/a&gt; in invisible mode (on every message send) and &lt;a href="https://developers.cloudflare.com/workers-ai/models/llamaguard-7b-awq/" rel="noopener noreferrer"&gt;LlamaGuard&lt;/a&gt; for message content.&lt;/p&gt;
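Server-side Turnstile verification boils down to one POST against Cloudflare's siteverify endpoint. A minimal sketch (the function names here are illustrative, not the project's):

```typescript
// Hedged sketch of server-side Turnstile token verification, following
// Cloudflare's documented siteverify endpoint. Function names are mine.

// Pure helper: assemble the form body siteverify expects.
export function buildSiteverifyBody(secret: string, token: string, remoteIp?: string): URLSearchParams {
	const body = new URLSearchParams({ secret, response: token });
	if (remoteIp) body.set('remoteip', remoteIp);
	return body;
}

// POST the client-supplied token; only `success: true` lets the message through.
export async function verifyTurnstile(secret: string, token: string, remoteIp?: string): Promise<boolean> {
	const res = await fetch('https://challenges.cloudflare.com/turnstile/v0/siteverify', {
		method: 'POST',
		body: buildSiteverifyBody(secret, token, remoteIp),
	});
	const outcome = (await res.json()) as { success: boolean };
	return outcome.success;
}
```

Running this on every message send keeps the check invisible to the user while still gating each request.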

&lt;p&gt;The cornerstone of this project is &lt;a href="https://github.com/microsoft/TypeChat" rel="noopener noreferrer"&gt;TypeChat&lt;/a&gt;, originally developed by Microsoft's TypeScript team. I patched it to eliminate the &lt;code&gt;node:fs&lt;/code&gt; requirement and decoupled it from OpenAI/Azure. My version on &lt;a href="https://www.npmjs.com/package/@chainfuse/typechat" rel="noopener noreferrer"&gt;npm&lt;/a&gt; uses LangChain, supporting virtually any AI provider. For this submission, however, I used a further modified version that calls Workers AI over bindings, since LangChain (as of this writing) runs only over HTTP REST, and bindings provide even better performance.&lt;/p&gt;
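The TypeChat idea — ask the model for JSON, validate it, and feed parse errors back until it complies — can be sketched over a Workers AI binding like this. The `AiBinding` interface and the retry policy are simplified placeholders for illustration, not the patched library's real API:

```typescript
// Illustrative sketch of TypeChat-style structured output over a
// Workers AI binding. The binding interface and retry loop are
// simplified assumptions, not the actual @chainfuse/typechat code.

// Pure helper: attempt to parse model output as JSON.
export function tryParseJson<T>(raw: string): { data?: T; error?: string } {
	try {
		return { data: JSON.parse(raw) as T };
	} catch (e) {
		return { error: (e as Error).message };
	}
}

// Minimal stand-in for the env.AI binding surface.
interface AiBinding {
	run(model: string, input: { messages: { role: string; content: string }[] }): Promise<{ response?: string }>;
}

// Ask for JSON; on a parse failure, append the error and retry.
export async function translateToJson<T>(ai: AiBinding, prompt: string, retries = 2): Promise<T> {
	const messages = [{ role: 'user', content: prompt }];
	for (let attempt = 0; attempt <= retries; attempt++) {
		const out = await ai.run('@hf/mistralai/mistral-7b-instruct-v0.2', { messages });
		const parsed = tryParseJson<T>(out.response ?? '');
		if (parsed.data !== undefined) return parsed.data;
		messages.push({ role: 'user', content: `That was not valid JSON (${parsed.error}); reply with JSON only.` });
	}
	throw new Error('model never produced valid JSON');
}
```

The payoff of the binding route is that `ai.run` is an in-runtime call rather than an outbound HTTPS request, so each repair round-trip stays cheap.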

&lt;p&gt;&lt;a href="https://qwik.builder.io/" rel="noopener noreferrer"&gt;Qwik&lt;/a&gt; is exceptionally fast (resisting the obvious pun here). Honestly, try loading &lt;a href="https://matt-ai.pages.dev" rel="noopener noreferrer"&gt;this project&lt;/a&gt; on cellular data with 4G/5G turned off. Despite this, due to Vite's bundling quirks, several issues arose (such as &lt;code&gt;node:buffer&lt;/code&gt; not being externalized despite explicit configuration). As a workaround, I paired it with a worker for those specific tasks. Initially, the worker used service bindings and the &lt;a href="https://hono.dev" rel="noopener noreferrer"&gt;hono&lt;/a&gt;/&lt;a href="https://the-guild.dev/graphql/yoga-server" rel="noopener noreferrer"&gt;yoga&lt;/a&gt;/&lt;a href="https://graphql.org" rel="noopener noreferrer"&gt;gql&lt;/a&gt; HTTP stack. It was fast, albeit cluttered. I later switched to &lt;a href="https://developers.cloudflare.com/workers/runtime-apis/rpc/" rel="noopener noreferrer"&gt;RPC&lt;/a&gt; reducing latency and the bundle size by almost 90%.&lt;/p&gt;

&lt;p&gt;I am also developing a &lt;a href="https://developers.cloudflare.com/queues/" rel="noopener noreferrer"&gt;Queues&lt;/a&gt; callback system using WebSockets and Durable Objects to handle extremely rate-limited services like &lt;a href="https://developers.cloudflare.com/browser-rendering/" rel="noopener noreferrer"&gt;Browser Rendering&lt;/a&gt;. For more details, see the &lt;a href="https://github.com/demosjarco/matt-ai/wiki/Overcoming-Browser-Rendering-limitations" rel="noopener noreferrer"&gt;wiki&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A major future goal is to allow users to select the AI model preference before dispatch and to regenerate parts of previous messages with the same context and instructions.&lt;/p&gt;

&lt;p&gt;There's a secret mode under development that will revolutionize AI interaction... but more on that later. However, I did leave a fun easter egg in the source code...&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multiple Models and/or Triple Task Types&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When working with models, the priority is to deliver data with minimal latency, even if some decision-making needs to happen first. To achieve this, &lt;a href="https://developers.cloudflare.com/workers-ai/models/llamaguard-7b-awq/" rel="noopener noreferrer"&gt;LlamaGuard&lt;/a&gt; fires off immediately&lt;del&gt;, along with initial text generation and TypeChat. The last two are buffered and not displayed until LlamaGuard approves them; once approved, all loaded chunks display immediately, followed by any remaining content&lt;/del&gt;. The parallel scheme is currently shelved due to buffering and loss-of-context issues and will return at a later date.&lt;/p&gt;

&lt;p&gt;TypeChat orchestrates the entire experience, managing everything from previous content lookup to image generation to fully autonomous internet browsing. This provides not just AI-driven responses but a complete AI-controlled experience.&lt;/p&gt;

&lt;p&gt;Current capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  TypeChat (&lt;code&gt;@hf/mistralai/mistral-7b-instruct-v0.2&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;  Text gen (&lt;del&gt;&lt;code&gt;@cf/meta/llama-2-7b-chat-fp16&lt;/code&gt;&lt;/del&gt; &lt;code&gt;@hf/thebloke/llama-2-13b-chat-awq&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;  Previous message searching (not using Vector DBs, but keyword generation AND searching)&lt;/li&gt;
&lt;li&gt;  Web searching (thx duckduckgo - even if it's a limited version)&lt;/li&gt;
&lt;li&gt;  Image generation (&lt;code&gt;@cf/lykon/dreamshaper-8-lcm&lt;/code&gt;, &lt;code&gt;@cf/stabilityai/stable-diffusion-xl-base-1.0&lt;/code&gt;, &lt;code&gt;@cf/bytedance/stable-diffusion-xl-lightning&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
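The keyword-based previous-message search in the list above could work roughly like this. Note the real project generates keywords with a model; the plain-heuristic extraction and scoring below are my own illustrative stand-ins:

```typescript
// Hypothetical sketch of keyword-scored message search (no vector DB).
// The actual project derives keywords via an AI model; this extraction
// heuristic and ranking are simplified assumptions.

const STOPWORDS = new Set(['the', 'a', 'an', 'and', 'or', 'of', 'to', 'in', 'is', 'it']);

// Lowercase, split on non-alphanumerics, and drop trivial words.
export function extractKeywords(text: string): string[] {
	return text
		.toLowerCase()
		.split(/[^a-z0-9]+/)
		.filter((w) => w.length > 2 && !STOPWORDS.has(w));
}

// Score = how many query keywords appear in the message.
export function scoreMessage(query: string[], message: string): number {
	const words = new Set(extractKeywords(message));
	return query.filter((k) => words.has(k)).length;
}

// Rank stored messages by keyword overlap, discarding zero-score hits.
export function searchMessages(query: string, messages: string[]): string[] {
	const keys = extractKeywords(query);
	return messages
		.map((m) => ({ m, score: scoreMessage(keys, m) }))
		.filter((x) => x.score > 0)
		.sort((a, b) => b.score - a.score)
		.map((x) => x.m);
}
```

Since the history lives in IndexedDB, a scan-and-score pass like this stays entirely client-side, consistent with the no-server-storage design.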

&lt;p&gt;Eventually:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Web browsing (Browser Rendering API - but get in &lt;del&gt;line&lt;/del&gt;/queue)&lt;/li&gt;
&lt;li&gt;  Translation (&lt;code&gt;@cf/meta/m2m100-1.2b&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;  Image detection (&lt;code&gt;@cf/microsoft/resnet-50&lt;/code&gt;, &lt;code&gt;@cf/unum/uform-gen2-qwen-500m&lt;/code&gt;, &lt;code&gt;@cf/facebook/detr-resnet-50&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;  Audio (Uploading recorded audio or live mic recording)&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>cloudflarechallenge</category>
      <category>devchallenge</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
