<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Michel Sabchuk</title>
    <description>The latest articles on Forem by Michel Sabchuk (@michelts).</description>
    <link>https://forem.com/michelts</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F521939%2Fb6e32a81-fa2c-4206-bced-ba443eca72c4.png</url>
      <title>Forem: Michel Sabchuk</title>
      <link>https://forem.com/michelts</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/michelts"/>
    <language>en</language>
    <item>
      <title>Learning spell: using cloudflare's AI to improve speaking skills</title>
      <dc:creator>Michel Sabchuk</dc:creator>
      <pubDate>Sun, 14 Apr 2024 18:45:19 +0000</pubDate>
      <link>https://forem.com/michelts/learning-spell-using-cloudflares-ai-to-improve-speaking-skills-1558</link>
      <guid>https://forem.com/michelts/learning-spell-using-cloudflares-ai-to-improve-speaking-skills-1558</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/devteam/join-us-for-the-cloudflare-ai-challenge-3000-in-prizes-5f99"&gt;Cloudflare AI Challenge&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://learning-spell.turbosys.workers.dev/"&gt;Learning Spell&lt;/a&gt; is a web application for practicing language skills. Users are challenged with a sentence obtained using &lt;a href="https://developers.cloudflare.com/workers-ai/models/#text-generation"&gt;Cloudflare's Text Generation models&lt;/a&gt; and read it aloud while being recorded. The recording is transcribed using &lt;a href="https://developers.cloudflare.com/workers-ai/models/#automatic-speech-recognition"&gt;Cloudflare's Automatic Speech Recognition&lt;/a&gt;, and the transcription is then compared to the original sentence.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;You can test the application at the link below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://learning-spell.turbosys.workers.dev/"&gt;https://learning-spell.turbosys.workers.dev/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Watch the video below for a quick demonstration:&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/cIA5XYrBCpw"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  My Code
&lt;/h2&gt;

&lt;p&gt;The code is available on GitHub:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/michelts"&gt;
        michelts
      &lt;/a&gt; / &lt;a href="https://github.com/michelts/learning-spell"&gt;
        learning-spell
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Web application for practicing language skills
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Learning Spell&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;Web application for practicing language skills.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Technical overview&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;The project uses Remix running on Cloudflare Workers.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Development&lt;/h2&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Install dependencies using &lt;code&gt;pnpm install&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Create the database by running the command &lt;code&gt;npx wrangler d1 migrations apply --local DB&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Start the development server using &lt;code&gt;pnpm dev&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Open up &lt;a href="http://127.0.0.1:8787" rel="nofollow"&gt;http://127.0.0.1:8787&lt;/a&gt; and you should be ready to go!&lt;/p&gt;
&lt;p&gt;The project uses Cloudflare Workers AI, so you might need to set your
account id even to run the project locally. Do it by creating a &lt;code&gt;.env&lt;/code&gt; file
with the variable &lt;code&gt;CLOUDFLARE_ACCOUNT_ID&lt;/code&gt; set to your account id.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Deployment&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;If you don't already have an account, &lt;a href="https://dash.cloudflare.com/sign-up" rel="nofollow"&gt;create a Cloudflare account
here&lt;/a&gt;. After verifying your email
address with Cloudflare, go to your dashboard and set up your free custom
Cloudflare Workers subdomain.&lt;/p&gt;
&lt;p&gt;Once that's done, you should be able to deploy your app:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;npm run deploy&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;You might need to change the database name and…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/michelts/learning-spell"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  Journey
&lt;/h2&gt;

&lt;p&gt;When I learned about the hackathon, I was afraid it was too late: it was the last Monday before the due date, with all the usual work and life stuff going on.&lt;/p&gt;

&lt;p&gt;But I had been wanting to test Cloudflare Workers for a while (plus a couple of other tools like Remix, Drizzle, and Tailwind). It was a good excuse!&lt;/p&gt;

&lt;p&gt;In my current position, we have been working on speech-to-text and text-to-speech features (we use AWS there), so I wanted to see what I could do with Cloudflare's tooling.&lt;/p&gt;

&lt;p&gt;I had also just finished reading all the Harry Potter books to my two daughters (we have read books together at bedtime since they were babies).&lt;/p&gt;

&lt;p&gt;With that, the theme came to me naturally: an application that lets kids (and adults, of course) practice speaking by reading Harry Potter movie quotes, all set in a magical atmosphere.&lt;/p&gt;

&lt;h3&gt;
  
  
  Proof of concept
&lt;/h3&gt;

&lt;p&gt;The POC consisted of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Recording audio using &lt;a href="https://github.com/samhirtarif/react-audio-recorder"&gt;react-audio-voice-recorder&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Submitting it to a worker using Remix's &lt;a href="https://remix.run/docs/en/main/hooks/use-submit"&gt;useSubmit&lt;/a&gt; and &lt;code&gt;multipart/form-data&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Transcribing it using &lt;a href="https://developers.cloudflare.com/workers-ai/models/#automatic-speech-recognition"&gt;automatic speech recognition&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://remix.run/"&gt;Remix&lt;/a&gt; is compatible with Cloudflare Workers. I use React and Django in my current position, and we use react-router-dom, so the mental model and the project structure felt familiar to me. The experience with Remix was very positive!&lt;/p&gt;

&lt;p&gt;I began with a couple of hardcoded sentences, but my goal was to use &lt;a href="https://developers.cloudflare.com/workers-ai/models/#text-generation"&gt;text generation&lt;/a&gt; to get them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Generating quotes
&lt;/h3&gt;

&lt;p&gt;Lack of time kept me from thoroughly testing which text generation model best fits my use case. I hand-picked a couple of them and ended up using &lt;a href="https://developers.cloudflare.com/workers-ai/models/mistral-7b-instruct-v0.1-awq/"&gt;mistral-7b-instruct-v0.1-awq&lt;/a&gt; to return sentences in JSON format.&lt;/p&gt;

&lt;p&gt;I could probably automate the model comparison by generating a few sentences with the same app workflow and comparing the results. That's one for the next steps!&lt;/p&gt;

&lt;p&gt;The model generates the same sentence for the same instructions, so, to be able to generate additional quotes, I'm keeping the generated messages in a database table. Even so, I can only generate a handful of distinct sentences; after some point, the model returns repeated content.&lt;/p&gt;
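
&lt;p&gt;A minimal TypeScript sketch of how repeated content could be filtered out, not the app's actual code. It assumes the model replies with a JSON array of strings and uses a plain array in place of the database table; the helper names &lt;code&gt;parseSentences&lt;/code&gt;, &lt;code&gt;normalize&lt;/code&gt;, and &lt;code&gt;pickNewSentences&lt;/code&gt; are hypothetical.&lt;/p&gt;

```typescript
// Assumption: the model reply is a JSON array of strings, e.g. '["one", "two"]'.
function parseSentences(reply: string): string[] {
  try {
    const parsed: unknown = JSON.parse(reply);
    if (!Array.isArray(parsed)) return [];
    // Keep only string items; anything else in the array is dropped.
    return parsed.filter((item): item is string => typeof item === "string");
  } catch {
    // The reply was not valid JSON; treat it as "no sentences".
    return [];
  }
}

// Lowercase and strip punctuation so "Hello!" and "hello" compare as equal.
function normalize(sentence: string): string {
  return sentence
    .toLowerCase()
    .replace(/[^a-z0-9 ]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// `stored` stands in for the sentences already saved in the database table.
function pickNewSentences(reply: string, stored: string[]): string[] {
  const seen = new Set(stored.map(normalize));
  const fresh: string[] = [];
  for (const sentence of parseSentences(reply)) {
    const key = normalize(sentence);
    if (!seen.has(key)) {
      seen.add(key); // also guards against duplicates inside one reply
      fresh.push(sentence);
    }
  }
  return fresh;
}
```

&lt;p&gt;If the table already holds a sentence and the model returns it again alongside a new one, only the new one comes back, ready to be stored and shown.&lt;/p&gt;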

&lt;p&gt;I didn't look into improving that for one reason: using AI to generate the same quotes for a single theme seems like overkill! This approach would shine when generating sentences for different themes, and at that point caching the sentences in the database could be appropriate.&lt;/p&gt;

&lt;p&gt;Imagine students picking their own theme: Harry Potter, Dragon Ball, The Godfather, or whatever they like! That was outside the scope of my POC, though: from what I have already done, I know it is possible, and that's enough for now!&lt;/p&gt;

&lt;h3&gt;
  
  
  Comparing sentences
&lt;/h3&gt;

&lt;p&gt;I'm using &lt;a href="https://github.com/kpdecker/jsdiff"&gt;jsdiff&lt;/a&gt; to compare sentences and render the correct, incorrect, and missing terms:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8cztub9vogq4tz6m5fx9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8cztub9vogq4tz6m5fx9.png" alt="Sentence with correctness indication" width="472" height="99"&gt;&lt;/a&gt;&lt;/p&gt;
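
&lt;p&gt;To illustrate the kind of word-level comparison involved, here is a small self-contained TypeScript sketch. It is not the app's code and does not call jsdiff: it swaps in a plain longest-common-subsequence walk, and the names &lt;code&gt;tokenize&lt;/code&gt;, &lt;code&gt;compareSentences&lt;/code&gt;, and &lt;code&gt;WordResult&lt;/code&gt; are hypothetical. Words only in the original sentence are marked as missing, and words only in the transcription as incorrect.&lt;/p&gt;

```typescript
type WordStatus = "correct" | "incorrect" | "missing";

interface WordResult {
  word: string;
  status: WordStatus;
}

// Lowercase, replace punctuation with spaces, and split into words.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9' ]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 0);
}

function compareSentences(expected: string, spoken: string): WordResult[] {
  const a = tokenize(expected);
  const b = tokenize(spoken);

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both word lists, following the table to keep matches aligned.
  const result: WordResult[] = [];
  let i = 0;
  let j = 0;
  while (a.length > i || b.length > j) {
    const expectedLeft = a.length > i;
    const spokenLeft = b.length > j;
    if (!spokenLeft) {
      result.push({ word: a[i], status: "missing" });
      i++;
    } else if (!expectedLeft) {
      result.push({ word: b[j], status: "incorrect" });
      j++;
    } else if (a[i] === b[j]) {
      result.push({ word: a[i], status: "correct" });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      // Skipping the expected word keeps the best alignment: it was not said.
      result.push({ word: a[i], status: "missing" });
      i++;
    } else {
      // Otherwise the spoken word has no counterpart in the original.
      result.push({ word: b[j], status: "incorrect" });
      j++;
    }
  }
  return result;
}
```

&lt;p&gt;Each entry of the returned list can then be rendered with its own style, one per status. Case and punctuation are ignored here on purpose, since a transcription rarely matches them exactly.&lt;/p&gt;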

&lt;h3&gt;
  
  
  Layout
&lt;/h3&gt;

&lt;p&gt;Time to make the app beautiful! I have been playing with Tailwind on side projects and it can make you really productive! I love the utility-first approach!&lt;/p&gt;

&lt;p&gt;It is powerful! Even the wand from the image below was built using Tailwind:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxd8wpsf67iygrxtupw1g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxd8wpsf67iygrxtupw1g.png" alt="Screenshot of a wand taken from the app screen" width="200" height="306"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One interesting thought: I have more than a decade of experience with CSS, so translating Tailwind to CSS and vice versa is easy for me. I wonder how it feels for someone not familiar with CSS at all 👀.&lt;/p&gt;

&lt;h4&gt;
  
  
  Translation
&lt;/h4&gt;

&lt;p&gt;I'm Brazilian and so are my daughters. I like to give them a chance to understand what they are reading, so I also included a translation of the sentence, shown only after submitting to avoid drawing attention away from the main task:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F32o2gfd0692g4adnlp04.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F32o2gfd0692g4adnlp04.png" alt="Image of the translated text taken from the app" width="481" height="170"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For now, I'm hard-coding the translation to Portuguese. In a production app, though, I would read the user's language and translate conditionally (e.g. English speakers don't need a translation).&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;Cloudflare Workers uses its own runtime. I faced only one compatibility issue: generating ULIDs. The ulid package is not compatible with Workers.&lt;/p&gt;

&lt;p&gt;I worked around it by using &lt;a href="https://github.com/ryan-mars/ulid-workers"&gt;ulid-workers&lt;/a&gt;, which &lt;a href="https://github.com/ryan-mars/ulid-workers?tab=readme-ov-file#monotonicity-and-ulid-time-in-cloudflare-workers"&gt;has its own limitations&lt;/a&gt; but works for my use case.&lt;/p&gt;

&lt;p&gt;Also, Remix sourcemaps didn't work with Cloudflare Workers, but it seems that &lt;a href="https://github.com/remix-run/remix/issues/6702"&gt;it should be working already&lt;/a&gt;. I should double-check it!&lt;/p&gt;

&lt;p&gt;Other than that, the experience was great!&lt;/p&gt;

&lt;h3&gt;
  
  
  Next steps
&lt;/h3&gt;

&lt;p&gt;The project is already fun, but there's a lot more that can be done:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple themes for more variety&lt;/li&gt;
&lt;li&gt;Fit the translated text better in the app&lt;/li&gt;
&lt;li&gt;Add monitoring tools (e.g. Sentry)&lt;/li&gt;
&lt;li&gt;Handle server errors better&lt;/li&gt;
&lt;li&gt;Parse transcription with streaming&lt;/li&gt;
&lt;li&gt;Authentication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Multiple Models and/or Triple Task Types&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The project is using 3 different task types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Text generation&lt;/li&gt;
&lt;li&gt;Automatic Speech Recognition&lt;/li&gt;
&lt;li&gt;Translation&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>cloudflarechallenge</category>
      <category>devchallenge</category>
      <category>ai</category>
      <category>remix</category>
    </item>
  </channel>
</rss>
