<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: David Haley</title>
    <description>The latest articles on Forem by David Haley (@dchaley).</description>
    <link>https://forem.com/dchaley</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1186907%2F99d60363-9354-49c3-a55f-75d14e77e6a5.jpg</url>
      <title>Forem: David Haley</title>
      <link>https://forem.com/dchaley</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dchaley"/>
    <language>en</language>
    <item>
      <title>Auto-loading .nvmrc in JetBrains Junie terminal</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Tue, 23 Dec 2025 06:06:59 +0000</pubDate>
      <link>https://forem.com/dchaley/auto-loading-nvmrc-in-junie-terminal-48pc</link>
      <guid>https://forem.com/dchaley/auto-loading-nvmrc-in-junie-terminal-48pc</guid>
      <description>&lt;p&gt;I've been using JetBrains's &lt;a href="https://www.jetbrains.com/junie/" rel="noopener noreferrer"&gt;Junie&lt;/a&gt; product for agentic AI coding since it went public in April 2025. It's been an overall great experience.&lt;/p&gt;

&lt;p&gt;I noticed it was struggling with a React/TypeScript project. When running &lt;code&gt;yarn&lt;/code&gt; commands, it wasn't honoring the &lt;code&gt;.nvmrc&lt;/code&gt; file, so it didn't load the correct &lt;code&gt;node&lt;/code&gt; version. This caused issues with files generated/built against different Node.js versions.&lt;/p&gt;
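&lt;p&gt;To make the mismatch concrete, here's a minimal sketch of the check involved (the pinned and active versions below are illustrative placeholders, not values from my project):&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: compare a .nvmrc-style pinned version against the active node.
# Both values are hard-coded placeholders for illustration.
pinned="20"          # what .nvmrc would pin
active="v18.19.0"    # what 'node --version' would report
case "$active" in
  "v${pinned}".*) echo "node matches .nvmrc" ;;
  *) echo "mismatch: .nvmrc wants ${pinned}, running ${active}" ;;
esac
```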

&lt;p&gt;My setup is slightly unusual: NVM takes a while to load, and I'm impatient, so I don't load it automatically in my shell. Whereas the usual setup would be,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"/opt/homebrew/opt/nvm/nvm.sh"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\.&lt;/span&gt; &lt;span class="s2"&gt;"/opt/homebrew/opt/nvm/nvm.sh"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;my &lt;code&gt;.zsh-local-mac&lt;/code&gt; (included from &lt;code&gt;.zshrc&lt;/code&gt;) sets up an alias:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;alias &lt;/span&gt;load-nvm&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'[ -s "/opt/homebrew/opt/nvm/nvm.sh" ] &amp;amp;&amp;amp; \. "/opt/homebrew/opt/nvm/nvm.sh"'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without that line, Junie doesn't have NVM enabled in its terminal. But just loading NVM isn't enough: we also need to honor the &lt;code&gt;.nvmrc&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;Junie helpfully sets the &lt;code&gt;$TERMINAL_EMULATOR&lt;/code&gt; environment variable to &lt;code&gt;JetBrains-JediTerm&lt;/code&gt;. So I added this to my shell config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if [[ "$TERMINAL_EMULATOR" == "JetBrains-JediTerm" ]]; then
  load-nvm

  autoload -U add-zsh-hook
  load-nvmrc() {
    local node_version="$(nvm version)"
    local nvmrc_path="$(nvm_find_nvmrc)"

    if [ -n "$nvmrc_path" ]; then
      local nvmrc_node_version=$(nvm version "$(cat "${nvmrc_path}")")

      if [ "$nvmrc_node_version" = "N/A" ]; then
        nvm install
      elif [ "$nvmrc_node_version" != "$node_version" ]; then
        nvm use
      fi
    elif [ "$node_version" != "$(nvm version default)" ]; then
      echo "Reverting to nvm default version"
      nvm use default
    fi
  }
  add-zsh-hook chpwd load-nvmrc
  load-nvmrc
fi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After making this change, Junie terminals honor &lt;code&gt;.nvmrc&lt;/code&gt; files. This is vital for running tests, e.g. &lt;code&gt;yarn jest src/file.test.tsx&lt;/code&gt;, allowing the agent to iterate more intelligently on its progress.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>jetbrains</category>
      <category>programming</category>
      <category>shell</category>
    </item>
    <item>
      <title>Deploying to Firebase Hosting + Firestore from GitHub actions</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Mon, 06 Oct 2025 03:26:15 +0000</pubDate>
      <link>https://forem.com/dchaley/deploying-to-firebase-hosting-firestore-from-github-actions-5g52</link>
      <guid>https://forem.com/dchaley/deploying-to-firebase-hosting-firestore-from-github-actions-5g52</guid>
      <description>&lt;p&gt;I recently set up a GitHub action to deploy Firebase after pull request merge. It's a tremendous time-saver. Previously, I was deploying from my dev machine, doing some toil to switch between a development environment (emulators) and the production environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Project setup
&lt;/h2&gt;

&lt;p&gt;I use environment variables to control the various Firebase settings (project/app ID, API key, etc.). Note that the Firebase API key is not a secret: it identifies the project, and access is governed by your security rules. I put these in &lt;code&gt;.envrc&lt;/code&gt; on my local machine, but the action needs a bit more help setting up the environment.&lt;/p&gt;
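&lt;p&gt;For reference, a sketch of what that &lt;code&gt;.envrc&lt;/code&gt; might look like (every value below is a placeholder, not a real project value):&lt;/p&gt;

```shell
# Sketch of a direnv-style .envrc with the non-secret Firebase settings.
# All values are placeholders for illustration.
export FIREBASE_PROJECT_ID="my-project"
export FIREBASE_APP_ID="1:1234567890:web:abc123"
export FIREBASE_API_KEY="AIzaPlaceholderKey"
export FIREBASE_AUTH_DOMAIN="my-project.firebaseapp.com"
export FIREBASE_STORAGE_BUCKET="my-project.appspot.com"
export FIREBASE_MESSAGING_SENDER_ID="1234567890"
export FIREBASE_USE_EMULATORS="true"   # flip to "false" for production
```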

&lt;p&gt;I have a script that uses &lt;code&gt;jq&lt;/code&gt; to create JSON files from templates. For example, &lt;code&gt;write-firebase-config.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c"&gt;# Write the overall firebase config:&lt;/span&gt;

jq &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; FIREBASE_PROJECT_ID &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FIREBASE_PROJECT_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; .firebaserc.jq &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; .firebaserc

&lt;span class="c"&gt;# Write the json file loaded by the kotlin-angular build:&lt;/span&gt;

jq &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; FIREBASE_PROJECT_ID &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FIREBASE_PROJECT_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; FIREBASE_APP_ID &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FIREBASE_APP_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; FIREBASE_STORAGE_BUCKET &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FIREBASE_STORAGE_BUCKET&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; FIREBASE_API_KEY &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FIREBASE_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; FIREBASE_AUTH_DOMAIN &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FIREBASE_AUTH_DOMAIN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; FIREBASE_MESSAGING_SENDER_ID &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FIREBASE_MESSAGING_SENDER_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; FIREBASE_USE_EMULATORS &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FIREBASE_USE_EMULATORS&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; webApp/src/jsMain/resources/firebase-config.json.jq &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; webApp/src/jsMain/resources/firebase-config.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The template files are quite simple; here's the one for &lt;code&gt;.firebaserc&lt;/code&gt; (used by the Firebase CLI):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "projects": {
    "default": "\($FIREBASE_PROJECT_ID)"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You also need your Firebase environment configured in the client. This particular project builds Angular via Gradle (it's a long story; see also &lt;a href="https://dev.to/dchaley/series/25958"&gt;Kotlin in the Browser&lt;/a&gt;). But I used the same JSON format that AngularFire recommends. Here's the &lt;code&gt;firebase-config.json.jq&lt;/code&gt; template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "projectId": "\($FIREBASE_PROJECT_ID)",
  "appId": "\($FIREBASE_APP_ID)",
  "apiKey": "\($FIREBASE_API_KEY)",
  "authDomain": "\($FIREBASE_AUTH_DOMAIN)",
  "storageBucket": "\($FIREBASE_STORAGE_BUCKET)",
  "messagingSenderId": "\($FIREBASE_MESSAGING_SENDER_ID)",
  "useEmulators": "\($FIREBASE_USE_EMULATORS)"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
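&lt;p&gt;You can sanity-check the interpolation directly on the command line. Here the template is inlined as a program string rather than loaded with &lt;code&gt;-f&lt;/code&gt;, and the project ID is a placeholder:&lt;/p&gt;

```shell
# Inline jq demo of the \($VAR) interpolation used by the templates.
# "demo-project" is a placeholder value.
jq -n --arg FIREBASE_PROJECT_ID "demo-project" \
  '{"projectId": "\($FIREBASE_PROJECT_ID)"}'
# prints a JSON object with projectId set to "demo-project"
```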



&lt;h2&gt;
  
  
  Repository setup
&lt;/h2&gt;

&lt;p&gt;Set up a target environment (e.g. "Production") and populate it with values from the Firebase console:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;FIREBASE_PROJECT_ID&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FIREBASE_APP_ID&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FIREBASE_STORAGE_BUCKET&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FIREBASE_API_KEY&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FIREBASE_AUTH_DOMAIN&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FIREBASE_MESSAGING_SENDER_ID&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Also, create a secret named &lt;code&gt;FIREBASE_SERVICE_ACCOUNT_BASE64&lt;/code&gt; containing a base64-encoded, newly exported JSON service account key (see below).&lt;/p&gt;
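&lt;p&gt;One way to produce that secret value, assuming a key file exported from the console (the filename and the &lt;code&gt;gh&lt;/code&gt; step below are illustrative):&lt;/p&gt;

```shell
# Sketch: base64-encode the exported service-account key for the secret.
# "service-account.json" is a placeholder filename.
KEY_B64="$(base64 -w0 service-account.json)"   # GNU coreutils; on macOS use: base64 -i service-account.json
echo "$KEY_B64"
# Paste the output into the FIREBASE_SERVICE_ACCOUNT_BASE64 secret, or push
# it with the GitHub CLI: gh secret set FIREBASE_SERVICE_ACCOUNT_BASE64 --body "$KEY_B64"
```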

&lt;h2&gt;
  
  
  GitHub action
&lt;/h2&gt;

&lt;p&gt;The action is a straightforward series of steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check out the repo&lt;/li&gt;
&lt;li&gt;Generate config (as above)&lt;/li&gt;
&lt;li&gt;Install Firebase CLI&lt;/li&gt;
&lt;li&gt;Build the app with gradle&lt;/li&gt;
&lt;li&gt;Deploy web app to Hosting&lt;/li&gt;
&lt;li&gt;Deploy Firestore rules&lt;/li&gt;
&lt;li&gt;Clean up&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Services deployed
&lt;/h2&gt;

&lt;p&gt;The following Firebase services are deployed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hosting

&lt;ul&gt;
&lt;li&gt;Provides the main web app&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Firestore (rules)

&lt;ul&gt;
&lt;li&gt;Defines the database security rules&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Service account &amp;amp; permissions required
&lt;/h2&gt;

&lt;p&gt;Create a new service account in the GCP IAM console panel. Call it something like "GitHub deploy", and only use it for GitHub action deploys.&lt;/p&gt;

&lt;p&gt;The permissions are a bit trickier. The easy way out is to make the service account an overall admin, but consider following the principle of least privilege to limit the impact of a malicious or mistaken actor.&lt;/p&gt;

&lt;p&gt;Through trial and error, I believe this is the minimal set of roles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Firebase Hosting Admin

&lt;ul&gt;
&lt;li&gt;Needed to deploy to Hosting&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Firebase Rules Admin

&lt;ul&gt;
&lt;li&gt;Needed to deploy Rules (for Firestore)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Service Account User

&lt;ul&gt;
&lt;li&gt;Needed to act as the service account&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Service Usage Consumer

&lt;ul&gt;
&lt;li&gt;Needed to test if APIs are active&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Full YAML source
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy to Firebase on merge&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;main&lt;/span&gt;
&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build_and_deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;production&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Use Node.js&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;node-version-file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;./.nvmrc'&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Generate .firebaserc&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;./write-firebase-config.sh&lt;/span&gt;
          &lt;span class="s"&gt;cat .firebaserc&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;FIREBASE_PROJECT_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.FIREBASE_PROJECT_ID }}&lt;/span&gt;
          &lt;span class="na"&gt;FIREBASE_APP_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.FIREBASE_APP_ID }}&lt;/span&gt;
          &lt;span class="na"&gt;FIREBASE_STORAGE_BUCKET&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.FIREBASE_STORAGE_BUCKET }}&lt;/span&gt;
          &lt;span class="na"&gt;FIREBASE_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.FIREBASE_API_KEY }}&lt;/span&gt;
          &lt;span class="na"&gt;FIREBASE_AUTH_DOMAIN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.FIREBASE_AUTH_DOMAIN }}&lt;/span&gt;
          &lt;span class="na"&gt;FIREBASE_MESSAGING_SENDER_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.FIREBASE_MESSAGING_SENDER_ID }}&lt;/span&gt;
          &lt;span class="na"&gt;FIREBASE_USE_EMULATORS&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install Firebase CLI&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;npm install -g firebase-tools&lt;/span&gt;
          &lt;span class="s"&gt;firebase --version&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build Angular app&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;./gradlew webApp:buildProductionWebApp&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy hosting&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;echo "${{ secrets.FIREBASE_SERVICE_ACCOUNT_BASE64 }}" | base64 --decode &amp;gt; "google-application-credentials.json"&lt;/span&gt;
          &lt;span class="s"&gt;firebase deploy --only hosting --non-interactive&lt;/span&gt;
          &lt;span class="s"&gt;rm -rf "google-application-credentials.json"&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;GOOGLE_APPLICATION_CREDENTIALS&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;google-application-credentials.json"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy Firestore rules&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;echo "${{ secrets.FIREBASE_SERVICE_ACCOUNT_BASE64 }}" | base64 --decode &amp;gt; "google-application-credentials.json"&lt;/span&gt;
          &lt;span class="s"&gt;firebase deploy --only firestore:rules --non-interactive&lt;/span&gt;
          &lt;span class="s"&gt;rm -rf "google-application-credentials.json"&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;GOOGLE_APPLICATION_CREDENTIALS&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;google-application-credentials.json"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Cleanup credentials&lt;/span&gt;
        &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always()&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;rm -rf "google-application-credentials.json"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Happy Firebasing!&lt;/p&gt;

</description>
      <category>githubactions</category>
      <category>firebase</category>
      <category>webdev</category>
      <category>cicd</category>
    </item>
    <item>
      <title>Quiz: Ruby &amp; Rspec scoping</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Fri, 25 Jul 2025 19:51:32 +0000</pubDate>
      <link>https://forem.com/dchaley/quiz-ruby-rspec-scoping-3gnh</link>
      <guid>https://forem.com/dchaley/quiz-ruby-rspec-scoping-3gnh</guid>
      <description>&lt;p&gt;Consider this test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;      context "scoping" do
        let(:var) { 1 }

        it "is tricky" do
          var = var + 1
          expect(var).to eq(2)
        end
      end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Will this pass or fail?&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>programming</category>
      <category>beginners</category>
      <category>learning</category>
    </item>
    <item>
      <title>Kotlin in the browser: attempting Firebase + Multiplatform</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Fri, 04 Jul 2025 08:43:40 +0000</pubDate>
      <link>https://forem.com/dchaley/kotlin-in-the-browser-attempting-firebase-multiplatform-f6k</link>
      <guid>https://forem.com/dchaley/kotlin-in-the-browser-attempting-firebase-multiplatform-f6k</guid>
      <description>&lt;p&gt;I &lt;a href="https://dev.to/dchaley/kotlin-in-the-browser-g3f"&gt;started using Kotlin&lt;/a&gt; to write browser apps about 1.5 years ago. It's definitely an early adopter experience. It worked well for a non-trivial but basic app. For the next frontier, multi-platform development with Firebase, I got stuck.&lt;/p&gt;

&lt;p&gt;Since then, I've largely completed my basic needs for &lt;a href="https://github.com/dchaley/you-need-a-splitter" rel="noopener noreferrer"&gt;You Need a Splitter&lt;/a&gt;: an app that integrates with &lt;a href="https://www.ynab.com/" rel="noopener noreferrer"&gt;You Need a Budget&lt;/a&gt; to speed up my budget splitting workflow.&lt;/p&gt;

&lt;p&gt;In parallel I also wrote &lt;a href="https://github.com/redwoodconsulting-io/reservations-app/" rel="noopener noreferrer"&gt;ReservationsApp&lt;/a&gt; in TypeScript with Angular, using Firebase. This gave me a baseline for comparison.&lt;/p&gt;

&lt;p&gt;YNAS was written with &lt;a href="https://kvision.gitbook.io/" rel="noopener noreferrer"&gt;KVision&lt;/a&gt;, an object-oriented web framework for Kotlin/JS. I was able to develop a non-trivial app: multiple components, dialog boxes, asynchronous interaction… All in all, it was a good experience.&lt;/p&gt;

&lt;p&gt;At first I struggled with state management (understanding which changes did and didn't trigger reactive renderings). I found Angular's state management clearer, although Angular has changed so much &amp;amp; so fast that some community content is obsolete. (On that note, the KVision community is much smaller.)&lt;/p&gt;

&lt;p&gt;One example is using arrays: is it changing the array, or its contents, that triggers a re-render? And how do you avoid tearing down &amp;amp; rebuilding elements that didn't change, if you react to the list reference changing?&lt;/p&gt;

&lt;p&gt;There were also rough edges around exact DOM interaction. Here's &lt;a href="https://github.com/dchaley/you-need-a-splitter/pull/14/commits/8b61021d5d901768955debd4acafa47da43720dd" rel="noopener noreferrer"&gt;an example&lt;/a&gt; where a component re-render wasn't updating FontAwesome icons. I needed to use a special unique DOM key so the buttons' icon changes would take effect.&lt;/p&gt;

&lt;p&gt;An interesting side quest: to use the YNAB javascript SDK in Kotlin, I needed to generate Kotlin type declarations. I wrote a &lt;a href="https://github.com/dchaley/you-need-a-splitter/tree/main/src/jsMain/kotlin/ynab" rel="noopener noreferrer"&gt;quick README&lt;/a&gt; with some notes. TLDR, I used &lt;a href="https://github.com/Kotlin/dukat" rel="noopener noreferrer"&gt;dukat&lt;/a&gt; to generate the declarations but had some manual work to do.&lt;/p&gt;

&lt;p&gt;All in all: I thought it was a success 😤 My main regret though was not accomplishing the multiplatform vision. Running YNAS on a mobile app would require a UX rewrite. 😩&lt;/p&gt;

&lt;p&gt;Emboldened by this experience I wanted to tackle that more ambitious project: a Kotlin-Multiplatform app backed by Firebase.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempt: a multiplatform Firebase app
&lt;/h2&gt;

&lt;p&gt;I'm building a financial planning application. You Need a Budget is great for understanding where your money went and allocating it to future goals. But it doesn't help me plan for the future in bigger ways: asking what-if questions, like buying with cash vs. financing a purchase.&lt;/p&gt;

&lt;p&gt;The UX would center on charts: projecting cash flow scenarios, that sort of thing.&lt;/p&gt;

&lt;p&gt;Why Firebase? Two reasons: serverless, and realtime data access. I really don't want to maintain (or pay for) servers but I do need data accessible beyond my laptop. The realtime data is a delightful cherry on top: the reactive rendering loop is built on shared data, not just local state.&lt;/p&gt;

&lt;p&gt;I was encouraged by GitLive's &lt;a href="https://github.com/GitLiveApp/firebase-kotlin-sdk" rel="noopener noreferrer"&gt;Firebase Kotlin SDK&lt;/a&gt; … at first glance, it was exactly what I need: a multiplatform Kotlin library.&lt;/p&gt;

&lt;p&gt;Now for the UX framework. The Compose-Multiplatform framework's big promise is to support "all" targets: web, desktop, iOS, and Android.&lt;/p&gt;

&lt;p&gt;Compose-Multiplatform supports web through WASM, not regular JavaScript (that is, Kotlin &lt;a href="https://en.wikipedia.org/wiki/Source-to-source_compiler" rel="noopener noreferrer"&gt;transpiled&lt;/a&gt; to JavaScript, indistinguishable from normal JavaScript apps).&lt;/p&gt;

&lt;p&gt;The WASM build has some caveats (I discussed in my &lt;a href="https://dev.to/dchaley/kotlin-in-the-browser-g3f"&gt;previous post&lt;/a&gt;). But if it meant one codebase for all platforms, I was willing to bet on WASM's future development.&lt;/p&gt;

&lt;p&gt;Then I needed a charting component. I settled on &lt;a href="https://github.com/KoalaPlot/koalaplot-core/tree/main" rel="noopener noreferrer"&gt;KoalaPlot&lt;/a&gt; which has a variety of plots, various customization options, and ongoing development.&lt;/p&gt;

&lt;h2&gt;
  
  
  Result: defeat
&lt;/h2&gt;

&lt;p&gt;Alas, I got &lt;em&gt;this&lt;/em&gt; close, but not close enough.&lt;/p&gt;

&lt;p&gt;I was able to build a HelloWorld app running on all platforms (well, I didn't test iOS, and only ran Android on the emulator). Here it is running on web (after &lt;a href="https://github.com/redwoodconsulting-io/streamwise/pull/3" rel="noopener noreferrer"&gt;PR#3&lt;/a&gt;):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1f0tfexjb02lg8zj1up2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1f0tfexjb02lg8zj1up2.png" alt="Chart showing a line incrementing upwards" width="800" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So far so good…! I even &lt;a href="https://github.com/redwoodconsulting-io/streamwise/pull/4" rel="noopener noreferrer"&gt;got hot-reload working&lt;/a&gt; for desktop, a major development cycle speedup.&lt;/p&gt;

&lt;p&gt;Then I tried incorporating the GitLive Firebase SDK … I hit a hard wall and the magic ended. 😮‍💨&lt;br&gt;
It doesn't support WASM (&lt;a href="https://github.com/GitLiveApp/firebase-kotlin-sdk/issues/440" rel="noopener noreferrer"&gt;issue #440&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;But it does support Kotlin/JS, meaning a Kotlin project transpiling to Javascript could use GitLive's Kotlin Firebase SDK.&lt;/p&gt;

&lt;p&gt;However, that meant getting Compose itself running on Kotlin/JS. I gather it's possible…? Or at least, at some point in the experimental lifecycle of Compose-for-Web, Compose-HTML, and other library evolutions, it was maybe possible to do some things.&lt;/p&gt;

&lt;p&gt;KoalaPlot itself also referenced Kotlin/JS so I was hopeful that I could convince the pieces to work together.&lt;/p&gt;

&lt;p&gt;… No.&lt;/p&gt;

&lt;p&gt;I got a Compose app compiling, but it was unable to find various font rendering functions at runtime. I went down a rabbit-hole of preloading custom emoji fonts, an apparently related but actually different set of problems. I had to downgrade from material3 to material to get it compiling. The problems piled up but solutions did not.&lt;/p&gt;

&lt;p&gt;My conclusion was, if I want Firebase, I can't have WASM. And if I can't have WASM, then I can't have Compose. And if I can't have Compose, then I can't build a UI toward multiple targets. Sad…&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;At this point, I don't have great options with Kotlin:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Multiplatform&lt;/th&gt;
&lt;th&gt;Kotlin/JS&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Firebase&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UX framework&lt;/td&gt;
&lt;td&gt;Compose&lt;/td&gt;
&lt;td&gt;Questionable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;How feasible is building Firebase on Multiplatform? The discussion on &lt;a href="https://github.com/GitLiveApp/firebase-kotlin-sdk/issues/440" rel="noopener noreferrer"&gt;issue #440&lt;/a&gt; shows several attempts to get GitLive's Firebase SDK working on WASM, with only limited success (e.g. just the Auth library). That's great, but I need more; Firestore in particular.&lt;/p&gt;

&lt;p&gt;In principle, I could write a multiplatform interface layer, with a WASM native implementation that calls out to the JS SDK… but no, I'm not gonna do that.&lt;/p&gt;

&lt;p&gt;I could use the Firebase SDK on mobile &amp;amp; desktop, but if I have to choose between those and web, I choose web. Mobile apps pose a number of distribution challenges that simply aren't worth it at this stage. Besides, a responsive web app would be OK on a phone.&lt;/p&gt;

&lt;p&gt;Given that I'm stuck with Kotlin/JS for now, it's worth highlighting how straightforward it was to build an Angular + Firebase app with TypeScript. There's no question that working within a major web app framework provides a far more integrated &amp;amp; seamless experience. I still prefer Kotlin 😝😭 but is it worth it?&lt;/p&gt;

&lt;p&gt;Generally, I like my code's core framework (e.g. the UX layer) to be battle-tested. I have limited tolerance for rough edges. 🫩 &lt;/p&gt;

&lt;p&gt;To proceed with Kotlin, I'd need to use Kotlin/JS, which means sticking with KVision (I don't want to learn yet another non-Compose Kotlin+HTML framework). While KVision is fine (and I've already paid the learning price), the KVision developer has put recent focus into &lt;a href="https://kilua.gitbook.io/kilua-guide/introduction" rel="noopener noreferrer"&gt;Kilua&lt;/a&gt;, which does target WASM, working directly with the DOM no less. But it's even more cutting-edge than KVision.&lt;/p&gt;

&lt;p&gt;Could I combine Kotlin with Angular? Well, in principle Kotlin/JS means it's "just" a question of getting the build system set up correctly. Brian White wrote a &lt;a href="https://stackoverflow.com/a/75478365/211771" rel="noopener noreferrer"&gt;StackOverflow post&lt;/a&gt; explaining how to do this; it's worth a try, although 2 years is "forever" in early-adoption timelines. Besides, this lets Angular see the Kotlin code: I don't think it lets Kotlin code drive the Angular framework.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;I had high hopes for this, high enough to spend about 10 hours exploring all these rough edges. Unfortunately, at this juncture multiplatform building (mobile plus web) remains elusive. It's not enough to compile code to run on a platform: you also need frameworks to turn code into pixels, etc…&lt;/p&gt;

&lt;p&gt;Given all this friction: is Kotlin in the browser worth it? The main advantage (besides language preference) is a monorepo that powers frontend and backend. After all: background processing needs to wrangle the same data types. What if I learned Dart/Flutter…&lt;/p&gt;

&lt;p&gt;But today, I need to move on with my problem: visualizing my cash flow scenarios. Here's my backup option, the simplest solution: duplication. Build Kotlin + TypeScript code around Firebase's data model, and make sure the data models don't get out of sync.&lt;/p&gt;

&lt;p&gt;One day, as these frameworks mature, I'll revisit how much more I can do with Kotlin in the browser. In the meantime, my quip from Part I feels even more real:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqi4iv299sordpvjlj5xp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqi4iv299sordpvjlj5xp.png" alt="Boromir saying: one does not simply render HTML" width="651" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;My goals were ambitious– perhaps one does not simply do &lt;em&gt;anything&lt;/em&gt;. 😎&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>kotlin</category>
      <category>browser</category>
      <category>firebase</category>
    </item>
    <item>
      <title>Performance trap: general libraries &amp; helper objects</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Wed, 09 Oct 2024 23:26:30 +0000</pubDate>
      <link>https://forem.com/dchaley/performance-trap-general-libraries-helper-objects-h2k</link>
      <guid>https://forem.com/dchaley/performance-trap-general-libraries-helper-objects-h2k</guid>
<description>&lt;p&gt;Convenience and performance are typically inversely correlated. If the code is easy to use, it's less optimized. If it's optimized, it's less convenient. Efficient code needs to get closer to the nitty-gritty details of what is actually running, and how.&lt;/p&gt;

&lt;p&gt;I came across an example in our ongoing work to run &amp;amp; optimize DeepCell cellular segmentation for cancer research. The DeepCell AI model predicts which pixels are most likely to be in a cell. From there, we "flood fill" from the most likely pixels, until reaching the cell border (below some threshold).&lt;/p&gt;

&lt;p&gt;Part of this process involves smoothing over small gaps inside predicted cells, which can happen for various reasons but isn't biologically possible. (Think donut holes, not a cell's porous membrane.)&lt;/p&gt;

&lt;p&gt;The hole-filling algorithm goes like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify objects (contiguous pixels sharing the same numeric cell label).&lt;/li&gt;
&lt;li&gt;Compute the "&lt;a href="https://en.wikipedia.org/wiki/Euler_characteristic" rel="noopener noreferrer"&gt;Euler number&lt;/a&gt;" of these cells, a topological measure of the shape (in 2D, the number of objects minus the number of holes).&lt;/li&gt;
&lt;li&gt;If the Euler number is less than 1 (i.e., the shape has holes), smooth out the holes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is an example of Euler numbers from the Wikipedia article; a circle (just the line part) has an Euler characteristic of zero, whereas a disk (the "filled-in" circle) has an Euler characteristic of 1.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn3q91ix1bf9derb7ovpc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn3q91ix1bf9derb7ovpc.png" alt="Picture of a circular line with Euler number 0, and a circular disk with Euler number 1" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;
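&lt;p&gt;We can sanity-check those two values with scikit-image (a quick sketch; it assumes &lt;code&gt;skimage.measure.euler_number&lt;/code&gt; with its default 2D connectivity):&lt;/p&gt;

```python
import numpy as np
from skimage.measure import euler_number

# A filled 5x5 "disk": one object, no holes.
disk = np.ones((5, 5), dtype=bool)

# A "circle": the same square with its interior removed, leaving one hole.
ring = disk.copy()
ring[1:4, 1:4] = False

print(euler_number(disk))  # 1 (one object, no holes)
print(euler_number(ring))  # 0 (one object minus one hole)
```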

&lt;p&gt;We're not here to talk about defining or computing Euler numbers though. We'll talk about how the library's easy path to computing Euler numbers is quite inefficient.&lt;/p&gt;

&lt;p&gt;First things first. We noticed the problem by looking at this profile using &lt;a href="https://www.speedscope.app/" rel="noopener noreferrer"&gt;Speedscope&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fex6jodu2lohcvm8in0nk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fex6jodu2lohcvm8in0nk.png" alt="Speedscope profile of the postprocessing runtime" width="800" height="623"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It shows ~32ms (~15%) spent in &lt;code&gt;regionprops&lt;/code&gt;. This view is left-heavy; if we switch to timeline view and zoom in, we get this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuhm4nwzx49n5pm7ame6b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuhm4nwzx49n5pm7ame6b.png" alt="Speedscope profile zoomed in to regionprops, showing a very large number of very small function calls" width="800" height="485"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(Note that we do this twice, hence ~16ms here and ~16ms elsewhere, not shown.)&lt;/p&gt;

&lt;p&gt;This is immediately suspect: the "interesting" part of finding the objects with &lt;code&gt;find_objects&lt;/code&gt; is that first sliver, 0.5ms. It returns a list of tuples, not a generator, so when it's done it's done. So what's up with all the other stuff? We're &lt;a href="https://github.com/scikit-image/scikit-image/blob/e3e7c48b76d38a5f92db6ff9f355f933f9f977ff/skimage/measure/_regionprops.py#L1383-L1390" rel="noopener noreferrer"&gt;constructing &lt;code&gt;RegionProperties&lt;/code&gt; objects&lt;/a&gt;. Let's zoom in on one of them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F28273vtmliqsih8rfldb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F28273vtmliqsih8rfldb.png" alt="Profile creating a single RegionProperty object" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The tiny slivers (which we won't zoom into) are custom &lt;code&gt;__setattr__&lt;/code&gt; calls: the RegionProperties objects support aliasing, for instance if you set the attribute &lt;code&gt;ConvexArea&lt;/code&gt; it redirects to a standard attribute &lt;code&gt;area_convex&lt;/code&gt;. Even though we're not making use of that we still go through the attribute converter.&lt;/p&gt;
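&lt;p&gt;To illustrate the mechanism (a hypothetical sketch, not skimage's actual code): every attribute write funnels through a lookup, whether or not the legacy alias is used.&lt;/p&gt;

```python
class AliasedProperties:
    """Redirects legacy attribute names to their canonical equivalents."""

    _aliases = {"ConvexArea": "area_convex"}

    def __setattr__(self, name, value):
        # Every write pays for this lookup, aliased or not.
        object.__setattr__(self, self._aliases.get(name, name), value)


props = AliasedProperties()
props.ConvexArea = 42     # redirected to area_convex
props.euler_number = 0    # not aliased, but still routed through __setattr__
print(props.area_convex)  # 42
```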

&lt;p&gt;Furthermore, we aren't even using most of the calculated region properties. We only care about the Euler number:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;props&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;regionprops&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;squeeze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;label_img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;int&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;prop&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prop&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;euler_number&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In turn, that only uses the most basic aspect of the region properties: the image regions detected by &lt;code&gt;find_objects&lt;/code&gt; (slices of the original image).&lt;/p&gt;

&lt;p&gt;So, we changed the &lt;code&gt;fill_holes&lt;/code&gt; code to simply bypass the &lt;code&gt;regionprops&lt;/code&gt; general-purpose function. Instead, we call &lt;code&gt;find_objects&lt;/code&gt; directly and pass the resulting image sub-regions to the &lt;code&gt;euler_number&lt;/code&gt; function (not the method on a &lt;code&gt;RegionProperties&lt;/code&gt; object).&lt;/p&gt;
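&lt;p&gt;In sketch form (hedged: the function name is mine, and it assumes &lt;code&gt;scipy.ndimage.find_objects&lt;/code&gt; plus the free function &lt;code&gt;skimage.measure.euler_number&lt;/code&gt;), the replacement loop looks something like:&lt;/p&gt;

```python
import numpy as np
from scipy import ndimage
from skimage.measure import euler_number

def labels_with_holes(label_img):
    """Find labels whose objects have holes, without building RegionProperties."""
    holes = []
    # find_objects returns one bounding slice per label (None if a label is unused).
    for label, region in enumerate(ndimage.find_objects(label_img), start=1):
        if region is None:
            continue
        mask = label_img[region] == label
        # An Euler number below 1 means the object's surface has gaps.
        if euler_number(mask) < 1:
            holes.append(label)
    return holes

# Example: a 5x5 block of label 1 with a one-pixel hole in the middle.
img = np.zeros((7, 7), dtype=int)
img[1:6, 1:6] = 1
img[3, 3] = 0
print(labels_with_holes(img))  # [1]
```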

&lt;p&gt;Here's the pull request: &lt;a href="https://github.com/dchaley/deepcell-imaging/pull/358" rel="noopener noreferrer"&gt;deepcell-imaging#358 Skip regionprops construction&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By skipping the intermediate object, we got a decent performance improvement for the &lt;code&gt;fill_holes&lt;/code&gt; operation:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Image size&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;260k pixels&lt;/td&gt;
&lt;td&gt;48ms&lt;/td&gt;
&lt;td&gt;40ms&lt;/td&gt;
&lt;td&gt;8ms (17%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;140M pixels&lt;/td&gt;
&lt;td&gt;15.6s&lt;/td&gt;
&lt;td&gt;11.7s&lt;/td&gt;
&lt;td&gt;3.9s (25%)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For the larger image, 4s is ~3% of the overall runtime: not the bulk of it, but not too shabby either.&lt;/p&gt;

</description>
      <category>performance</category>
      <category>software</category>
      <category>python</category>
      <category>bioinformatics</category>
    </item>
    <item>
      <title>Improve container build time by 70% w/ better caching</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Thu, 26 Sep 2024 00:34:37 +0000</pubDate>
      <link>https://forem.com/dchaley/improve-container-build-time-by-70-w-better-caching-2o2l</link>
      <guid>https://forem.com/dchaley/improve-container-build-time-by-70-w-better-caching-2o2l</guid>
      <description>&lt;p&gt;Our ongoing work to run DeepCell on GCP Batch produces a very large container: 5 GB compressed. Most of it is the Python &amp;amp; binaries required to run TensorFlow and all associated GPU code. It took ~13 minutes to build on GCP Cloud Build.&lt;/p&gt;

&lt;p&gt;By &lt;a href="https://github.com/dchaley/deepcell-imaging/pull/353" rel="noopener noreferrer"&gt;leveraging Docker's cache better&lt;/a&gt;, we brought that down to ~4 minutes, a roughly 70% improvement.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;th&gt;Delta&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;13min&lt;/td&gt;
&lt;td&gt;4min&lt;/td&gt;
&lt;td&gt;-9min (-70%)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Docker builds containers by creating a layer for each build command. The layers "stack" onto each other, adding or changing what's in the container so far. Loosely speaking, the layers are like snapshots of the container contents.&lt;/p&gt;

&lt;p&gt;Docker can &lt;a href="https://docs.docker.com/build/cache/" rel="noopener noreferrer"&gt;cache&lt;/a&gt; layers in the build process. Unless the build instruction changes, like updating the command or copying a different source file, the layer doesn't need to be rebuilt.&lt;/p&gt;

&lt;p&gt;Our Dockerfile looked like this: (&lt;a href="https://github.com/dchaley/deepcell-imaging/blob/9393e407d023a31366bb5fc054ff640d281ad802/container/Dockerfile" rel="noopener noreferrer"&gt;unabridged version here&lt;/a&gt;)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FROM &amp;lt;base_container&amp;gt;

RUN apt-get update -y &amp;amp;&amp;amp; apt-get install -y &amp;lt;packages&amp;gt;

# Add the repo sha to the container as the version.
ADD https://api.github.com/repos/dchaley/deepcell-imaging/git/refs/heads/main version.json

# Clone the deepcell-imaging repo
RUN git clone https://github.com/dchaley/deepcell-imaging.git

# Switch into the repo directory
WORKDIR "/deepcell-imaging"

# Install python requirements
RUN pip install --user --upgrade -r requirements.txt

# Install our own module
RUN pip install .
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When we first added caching, we only saw a smaller speedup, about 30%. We avoided reinstalling the &lt;code&gt;apt-get&lt;/code&gt; packages, but we were still reinstalling the Python dependencies … some of which (like TensorFlow) are very hefty, and many of which require compilation.&lt;/p&gt;

&lt;p&gt;The full &lt;a href="https://docs.docker.com/build/cache/invalidation/" rel="noopener noreferrer"&gt;cache invalidation rules&lt;/a&gt; are a bit tricky.  But the basic idea is simple. Layers are invalidated if the command changes or copied files change. If any layer is invalidated, all subsequent layers must be rebuilt.&lt;/p&gt;

&lt;p&gt;In our case, by adding &lt;code&gt;version.json&lt;/code&gt;, we were invalidating everything below, in particular installing the Python dependencies from &lt;code&gt;requirements.txt&lt;/code&gt;. But these change quite rarely, compared to our application code!&lt;/p&gt;

&lt;p&gt;Normally it's a GoodThing™️ to force a rebuild if code changes. But we don't want to lose the Python dependencies cache. To stop invalidating the cache for dependencies, we explicitly pulled in just &lt;code&gt;requirements.txt&lt;/code&gt;, installed those, and then pulled in the overall source code. This means we still rebuild dependencies if they change, but if they don't … we don't!&lt;/p&gt;

&lt;p&gt;Our new Dockerfile looks like this: (&lt;a href="https://github.com/dchaley/deepcell-imaging/blob/90c67248548c929ca056bf1447353c15376bbf12/container/Dockerfile" rel="noopener noreferrer"&gt;unabridged version here&lt;/a&gt;)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FROM &amp;lt;base_container&amp;gt;

RUN apt-get update -y &amp;amp;&amp;amp; apt-get install -y &amp;lt;packages&amp;gt;

# Fetch the Python dependencies
ADD https://raw.githubusercontent.com/dchaley/deepcell-imaging/refs/heads/main/requirements.txt requirements.txt

# Install python requirements
RUN pip install --user --upgrade -r requirements.txt

# Add the repo sha to the container as the version.
ADD https://api.github.com/repos/dchaley/deepcell-imaging/git/refs/heads/main version.json

# Clone the deepcell-imaging repo
RUN git clone https://github.com/dchaley/deepcell-imaging.git

# Switch into the repo directory
WORKDIR "/deepcell-imaging"

# Install our own module
RUN pip install .
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we rebuilt the container after a small code change, and observed the fuller benefits of the cache: avoiding the needless rebuilding of the Python dependencies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwmlw8ripsw4g0kknr34q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwmlw8ripsw4g0kknr34q.png" alt="GitHub actions showing before/after build times of 13m38s to 4m37s" width="800" height="230"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;It's been really interesting learning the various ways of building containers &amp;amp; their pros/cons. A lot of containers are built by copying files from the local directories into the container, rather than checking out from source. This has advantages: for example, you can build a test/dev container from whatever you currently have. I wanted to simplify and make sure the container was &lt;em&gt;always&lt;/em&gt; built from &lt;code&gt;main&lt;/code&gt;. The double-edged sword of simplicity.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>tensorflow</category>
      <category>googlecloud</category>
      <category>cicd</category>
    </item>
    <item>
      <title>Optimizing QuPath intensity measurements: 12.5 hr to 2min</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Sat, 31 Aug 2024 06:50:54 +0000</pubDate>
      <link>https://forem.com/dchaley/optimizing-qupath-intensity-measurements-125-hr-2min-416c</link>
      <guid>https://forem.com/dchaley/optimizing-qupath-intensity-measurements-125-hr-2min-416c</guid>
      <description>&lt;p&gt;Spatial biology analyzes tissue sample images to derive patterns and data. A key first step is identifying cells on the image and gathering quantitative measurements about those cells.&lt;/p&gt;

&lt;p&gt;In our ongoing work &lt;a href="https://github.com/dchaley/deepcell-imaging" rel="noopener noreferrer"&gt;scaling DeepCell on GCP Batch&lt;/a&gt;, we'd previously gotten pretty efficient at the first part: segmenting the image into cells. But we hit a major performance roadblock for the next step: generating quantitative measurements.&lt;/p&gt;

&lt;p&gt;The measurements are fairly straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;size of each cell (convert pixels in each detected cell to physical dimensions, assuming some number of microns per pixel)&lt;/li&gt;
&lt;li&gt;pixel intensity of each cell&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Of note, for a ~140M pixel image, it took about 12.5 hours (‼️) to measure the detected cells. That's … not great 😩 What the heck?? We're just counting pixels and reading pixel values. An HD image is ~2M pixels, and computers (and TVs &amp;amp; phones) render &amp;gt;30 of those &lt;em&gt;per second&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Profiling to the rescue. The great thing about JVM code is that it's extremely easy to profile. Just click "profile" instead of "run".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft3dat807zdd2b20c9asu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft3dat807zdd2b20c9asu.png" alt="Screenshot of profiler button" width="660" height="292"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's the resulting flamegraph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifqwr6286rhnyvnxlxvk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifqwr6286rhnyvnxlxvk.png" alt="IntelliJ flamegraph adding cell measurements" width="800" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Of note, 99.9% of adding intensity measurements–84% of the total time–is spent simply reading the image.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F734ph3ia6hcrzpwei73b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F734ph3ia6hcrzpwei73b.png" alt="Screenshot of time spent in readRegion: 84.25% of all, 99.88% of parent, amounting to 79.5 seconds" width="800" height="108"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OK: so we need to stop reading the image repeatedly. In our case, the entire image can (for now) fit into RAM. If only we could prefetch the image once, then read regions out of that in-memory copy.&lt;/p&gt;

&lt;p&gt;Sounds like a great use case for the &lt;a href="https://en.wikipedia.org/wiki/Proxy_pattern" rel="noopener noreferrer"&gt;Proxy pattern&lt;/a&gt;. We need an &lt;code&gt;ImageServer&lt;/code&gt; that behaves just like the original image server, except, it reads from an in-memory image not from disk (or wherever the wrapped server reads).&lt;/p&gt;

&lt;p&gt;The resulting code is quite simple. Here's the &lt;a href="https://github.com/dchaley/qupath-project-initializer/pull/41" rel="noopener noreferrer"&gt;pull request&lt;/a&gt;. We subclass the abstract &lt;code&gt;ImageServer&lt;/code&gt;, wrapping another &lt;code&gt;ImageServer&lt;/code&gt; and forwarding all methods to the original.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;UPDATE 2024-09-10&lt;/em&gt;: Thanks to Adrián Szegedi (GitHub &lt;a href="https://github.com/HawkSK" rel="noopener noreferrer"&gt;HawkSK&lt;/a&gt;) the code is even simpler (&lt;a href="https://github.com/dchaley/qupath-project-initializer/pull/42" rel="noopener noreferrer"&gt;PR#42&lt;/a&gt;): no need to explicitly forward methods. Instead we use Kotlin's &lt;a href="https://kotlinlang.org/docs/delegation.html" rel="noopener noreferrer"&gt;delegation syntax&lt;/a&gt; which implicitly forwards non-overridden methods. This removes 100 lines of boilerplate 💪🏻&lt;/p&gt;

&lt;p&gt;The one non-forwarded method is the core operation: reading a region.&lt;/p&gt;

&lt;p&gt;That one turns into extracting the region from the entire (prefetched) image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  private fun readFullImage() {
    if (prefetchedImage != null)
      return

    logger.info("Prefetching full image at path: ${wrappedImageServer.path}")

    val wholeImageRequest = RegionRequest.createInstance(
      wrappedImageServer.path,
      1.0,
      0,
      0,
      wrappedImageServer.width,
      wrappedImageServer.height
    )
    prefetchedImage = wrappedImageServer.readRegion(wholeImageRequest)
  }

  override fun readRegion(request: RegionRequest?): BufferedImage {
    if (request?.z != 0 || request?.t != 0)
      throw IllegalArgumentException("PrefetchedImageServer only supports z=0 and t=0")

    readFullImage()
    return prefetchedImage!!.getSubimage(request!!.x, request.y, request.width, request.height)
  }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This way, we only read the image once, and fetch all subregions from the in-memory image.&lt;/p&gt;

&lt;p&gt;Here's the speed-up in the real world (Google Batch):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh7qc100w58ghcpfuwx39.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh7qc100w58ghcpfuwx39.png" alt="Google Batch jobs showing new runtime 2min 14s, and old runtime 12hr 25min" width="800" height="161"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Before (min)&lt;/th&gt;
&lt;th&gt;After (min)&lt;/th&gt;
&lt;th&gt;Delta&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;745&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;-743 min (-99.7%)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In the words of the great Tina Turner: Boom, Shaka Laka.&lt;/p&gt;

</description>
      <category>performance</category>
      <category>bioinformatics</category>
      <category>kotlin</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Re-rebuilding TF2.8 image: 369 patches</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Sun, 18 Aug 2024 07:56:00 +0000</pubDate>
      <link>https://forem.com/dchaley/re-rebuilding-tf28-image-369-patches-kif</link>
      <guid>https://forem.com/dchaley/re-rebuilding-tf28-image-369-patches-kif</guid>
      <description>&lt;p&gt;I wrote previously about &lt;a href="https://dev.to/dchaley/rebuilding-tensorflow-284-on-ubuntu-2204-to-patch-vulnerabilities-3j3m"&gt;rebuilding the TF2.8 image to patch vulnerabilities&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I noticed the issue scanner's count had crept up again. So I rebuilt the container again, using the same Dockerfiles (etc.).&lt;/p&gt;

&lt;p&gt;The upstream changes (e.g. in Ubuntu 22.04) pulled in 369 security fixes. 🎉&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13zdfgbdp376uj5u5vcs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13zdfgbdp376uj5u5vcs.png" alt="A screenshot of the Google artifact repository showing vulnerability counts going from 839 to 470" width="800" height="166"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Along the way, I regretted that the &lt;a href="https://github.com/dchaley/tensorflow-2.8.4-redux" rel="noopener noreferrer"&gt;tensorflow-2.8.4-redux repo&lt;/a&gt; doesn't have automatic container building. I can't build it locally any longer as it needs x86_64 but I have arm64. 😓 I was able to build it in Cloud Shell easily enough for now.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>tensorflow</category>
      <category>security</category>
      <category>ai</category>
    </item>
    <item>
      <title>gs-fastcopy: get CPU count for upload workers</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Tue, 23 Jul 2024 02:48:43 +0000</pubDate>
      <link>https://forem.com/dchaley/gs-fastcopy-get-cpu-count-for-upload-workers-3ke7</link>
      <guid>https://forem.com/dchaley/gs-fastcopy-get-cpu-count-for-upload-workers-3ke7</guid>
      <description>&lt;p&gt;See previous post: &lt;a href="https://dev.to/dchaley/introducing-gs-fastcopy-9pi"&gt;Introducing gs-fastcopy&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I shipped the enhancement &lt;a href="https://github.com/redwoodconsulting-io/gs-fastcopy-python/issues/10" rel="noopener noreferrer"&gt;gs-fastcopy-python#10: Inspect processor count for better upload defaults&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Previously, we were defaulting to 8 workers (Google's default). On a system with more than 8 cores, that's leaving a lot idle!&lt;/p&gt;

&lt;p&gt;Now, we inspect the available CPU count. We honor &lt;code&gt;os.sched_getaffinity&lt;/code&gt; on systems that support it (the processors available to &lt;em&gt;this process&lt;/em&gt;, not just in general); otherwise, we use &lt;code&gt;os.cpu_count()&lt;/code&gt;.&lt;/p&gt;
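&lt;p&gt;The check boils down to a few lines (a sketch of the logic; the helper name is mine, not the library's):&lt;/p&gt;

```python
import os

def default_upload_workers(fallback=8):
    """Pick a worker count: the CPUs available to this process, if knowable."""
    try:
        # Linux: respects CPU affinity (e.g. container limits), not just the machine.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # macOS / Windows: fall back to the total CPU count, or Google's default of 8.
        return os.cpu_count() or fallback
```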

&lt;p&gt;Benchmarking results: [&lt;a href="https://docs.google.com/spreadsheets/d/13LRSnfddt4nSBfc-MAx8jTL3xCp7riqGR8Vcc_skUxw/edit?gid=1577232550#gid=1577232550" rel="noopener noreferrer"&gt;source sheet&lt;/a&gt;]&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgj23zxl4a2grurlsr7ut.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgj23zxl4a2grurlsr7ut.png" alt="Bar chart showing time taken to complete upload operation, with and without compressing first" width="800" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note how adding workers speeds up the process, but with diminishing returns. I think that's the point where the network transfer itself becomes the bottleneck, though it's likely that tweaking chunk sizes would help too.&lt;/p&gt;

&lt;p&gt;Also note the more dramatic effect when using compression (via &lt;code&gt;pigz&lt;/code&gt;, parallel gzip). pigz would've picked up on the max workers before; what's new here is using them for the upload as well.&lt;/p&gt;

</description>
      <category>performance</category>
      <category>cloudstorage</category>
      <category>googlecloud</category>
      <category>python</category>
    </item>
    <item>
      <title>Introducing gs-fastcopy</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Sun, 21 Jul 2024 23:23:35 +0000</pubDate>
      <link>https://forem.com/dchaley/introducing-gs-fastcopy-9pi</link>
      <guid>https://forem.com/dchaley/introducing-gs-fastcopy-9pi</guid>
      <description>&lt;p&gt;These days, a single laptop can chomp through gigabytes of data in seconds. So why was it taking ~1.5min to compress &amp;amp; upload 2 GB? Why was it taking ~10s to download just 100 MB?&lt;/p&gt;

&lt;p&gt;I get bothered by code that "should" be fast but isn't, when I have to wait around for it. Maybe it's 30+ years of experience with software, 25+ of them in web dev: I have a pretty good sense of when something is slower than it "should" be.&lt;/p&gt;

&lt;p&gt;And o', but am I never satisfied with needlessly slow code.&lt;/p&gt;

&lt;p&gt;Time is both time and money. The faster cancer researchers can process data, the faster we get to innovative treatments and save lives. And going 2x as fast on the same hardware typically means spending half as much. In an eventual clinical setting where tests are given freely, every cent matters… which can mean life &amp;amp; death.&lt;/p&gt;

&lt;p&gt;I checked with my co-conspirator Lynn Langit: "these speeds, but really though?" She pointed me at the gcloud CLI tool's much superior performance in file transfer. &lt;/p&gt;

&lt;p&gt;That began an investigation into optimizing transfer: basically, the standard Python (&amp;amp; other) Blob implementation is single-threaded. So much computing power just … sitting there sad &amp;amp; idle.&lt;/p&gt;

&lt;p&gt;It's nice when default settings "just work" – correctly, but also fast. The numpy library is absolutely brilliant because it brings all kinds of low-level hardware optimization into Python; you don't have to think about it.&lt;/p&gt;

&lt;p&gt;In that spirit, I hope to make cloud storage file transfer just that much easier, so that you don't have to think about it to get fast performance.&lt;/p&gt;

&lt;p&gt;Without further ado: introducing gs-fastcopy:&lt;br&gt;
&lt;a href="https://medium.com/@dchaley/introducing-gs-fastcopy-36bb3bb71818" rel="noopener noreferrer"&gt;https://medium.com/@dchaley/introducing-gs-fastcopy-36bb3bb71818&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's my first open-source public Python package 🐍 📦 🎉&lt;/p&gt;

&lt;p&gt;Package: &lt;a href="https://pypi.org/project/gs-fastcopy/" rel="noopener noreferrer"&gt;https://pypi.org/project/gs-fastcopy/&lt;/a&gt;&lt;br&gt;
Source code: &lt;a href="https://github.com/redwoodconsulting-io/gs-fastcopy-python" rel="noopener noreferrer"&gt;https://github.com/redwoodconsulting-io/gs-fastcopy-python&lt;/a&gt;&lt;/p&gt;
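&lt;p&gt;Usage looks roughly like this. This sketch is based on the package README at the time of writing; treat the API names as assumptions and check the README for the current interface (it also needs GCS credentials to actually run):&lt;/p&gt;

```python
# Sketch based on the gs-fastcopy README; bucket/object names are
# placeholders, and the API should be verified against the README.
import gs_fastcopy
import numpy as np

# Writing: yields a local file object; on close, the file is
# uploaded to cloud storage using parallel transfer.
with gs_fastcopy.write("gs://my-bucket/my-file.npz") as f:
    np.savez(f, a=np.zeros(12), b=np.ones(23))

# Reading: downloads with parallel transfer, then yields a local
# file object to read from.
with gs_fastcopy.read("gs://my-bucket/my-file.npz") as f:
    npz = np.load(f)
```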

&lt;p&gt;Now I download &amp;amp; uncompress those 100 MB in just a couple of seconds, not 10. I'll take a 5x speedup. And the impact only grows as the files get larger.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fra7kvarot9abdauuhrtw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fra7kvarot9abdauuhrtw.png" alt="Bar chart of benchmark results for local &amp;amp; cloud environments, using Blob-based smart_open and gs-fastcopy" width="800" height="318"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>performance</category>
      <category>googlecloud</category>
      <category>cloudstorage</category>
    </item>
    <item>
      <title>Ensuring GCE instances have full access to GCP APIs</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Sat, 20 Jul 2024 02:58:50 +0000</pubDate>
      <link>https://forem.com/dchaley/ensuring-gce-instances-have-full-access-to-gcp-apis-3cdg</link>
      <guid>https://forem.com/dchaley/ensuring-gce-instances-have-full-access-to-gcp-apis-3cdg</guid>
      <description>&lt;p&gt;The default settings for GCE instances are fairly locked down from accessing Google APIs, but it's not obvious that's happening!&lt;/p&gt;

&lt;p&gt;Check out the instance creation settings:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddfx9q1qifce0isgpwpz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddfx9q1qifce0isgpwpz.png" alt="Screenshot of the Identity and API access settings" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You might think that "allow default access" means "use normal permissions as already configured". But … no 😅 Hover over the "?" icon and see:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Default: read-only access to Storage and Service Management, write access to Stackdriver Logging and Monitoring, read/write access to Service Control.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In other words, creating a GCE instance with default settings means you can't write to storage &lt;em&gt;even if&lt;/em&gt; the default service account has write permissions.&lt;/p&gt;

&lt;p&gt;You have two options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Go with full access according to permissions: &lt;em&gt;Allow full access to all Cloud APIs&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Customize each service: &lt;em&gt;Set access for each API&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I went with the former, as I'm OK relying on the service account's permissions. It's nice to know that a more security-conscious environment could lock the account down to just what's needed for the particular case (vs everything the account can do).&lt;/p&gt;
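&lt;p&gt;For scripted setups, the same choice is made with the &lt;code&gt;--scopes&lt;/code&gt; flag on instance creation (instance name and zone below are placeholders):&lt;/p&gt;

```shell
# Create an instance whose access scopes don't restrict API access,
# leaving authorization entirely to the service account's IAM roles.
# "cloud-platform" is the alias for the full-access scope.
gcloud compute instances create my-instance \
  --zone=us-central1-a \
  --scopes=cloud-platform
```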

&lt;p&gt;🔐&lt;/p&gt;

&lt;p&gt;After this change, I can create VMs that can read/write storage. Ahh 😌&lt;/p&gt;

</description>
      <category>cloud</category>
      <category>googlecloud</category>
      <category>security</category>
    </item>
    <item>
      <title>Improve TensorFlow model load time by ~70% using HDF5 instead of SavedModel</title>
      <dc:creator>David Haley</dc:creator>
      <pubDate>Thu, 11 Jul 2024 03:55:16 +0000</pubDate>
      <link>https://forem.com/dchaley/improve-tensorflow-model-load-time-by-70-using-hdf5-instead-of-savedmodel-5c8e</link>
      <guid>https://forem.com/dchaley/improve-tensorflow-model-load-time-by-70-using-hdf5-instead-of-savedmodel-5c8e</guid>
      <description>&lt;p&gt;In our &lt;a href="https://dev.to/dchaley/series/27298"&gt;ongoing work&lt;/a&gt; running DeepCell on Google Batch, we noted that it takes ~9s to load the model into memory, whereas prediction (the interesting part of loading the model) takes ~3s for a 512x512 image.&lt;/p&gt;

&lt;p&gt;The ideal runtime environment is serverless, so we don't have long-lived processes that could load the model once and then predict many samples across many jobs. Instead, each task instance has to load the model before doing any work. It hurts when loading the model takes 3x as long as the actual work… and it makes scaling horizontally, with one short-lived compute node per prediction, inefficient.&lt;/p&gt;

&lt;p&gt;My local machine (a MacBook Pro with an M3 Max) took ~12 s to load the model, making it the slowest part of the entire preprocess → predict → postprocess pipeline.&lt;/p&gt;

&lt;p&gt;I was curious why it took so long to load the model into memory. It's "only" ~100 MB on disk.&lt;/p&gt;

&lt;p&gt;I came across &lt;a href="https://towardsdatascience.com/tensorflow-performance-loading-models-fb2d0dc340a3" rel="noopener noreferrer"&gt;TensorFlow Performance: Loading Models&lt;/a&gt; by Libor Vanek. It compares the load times for different formats. Here's the punchline:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhf5nbhaqkbfxmr6avsgz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhf5nbhaqkbfxmr6avsgz.png" alt="Chart of load times for SavedModel vs HDF5 showing a drop from ~10s to ~2s" width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I was intrigued 🤞🏻 Could we get similar speed-ups just by changing the format?&lt;/p&gt;

&lt;p&gt;Yes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Environment&lt;/th&gt;
&lt;th&gt;SavedModel&lt;/th&gt;
&lt;th&gt;HDF5&lt;/th&gt;
&lt;th&gt;Diff&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MacBook Pro (M3 Max)&lt;/td&gt;
&lt;td&gt;12.3 s&lt;/td&gt;
&lt;td&gt;0.84 s&lt;/td&gt;
&lt;td&gt;-11.46 s (-93%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;n1-standard-8 w/ 1 T4 GPU&lt;/td&gt;
&lt;td&gt;8.99 s&lt;/td&gt;
&lt;td&gt;2.68 s&lt;/td&gt;
&lt;td&gt;-6.31 s (-70%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;n1-standard-32 w/ 1 T4 GPU&lt;/td&gt;
&lt;td&gt;8.21 s&lt;/td&gt;
&lt;td&gt;2.72 s&lt;/td&gt;
&lt;td&gt;-5.49 s (-67%)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Of note, loading the model into memory used to take ~3x the time of prediction. Now, it's roughly the same.&lt;/p&gt;
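&lt;p&gt;Measuring this is just wall-clock timing around the load call. A minimal stdlib-only sketch of the pattern (in practice the callable was &lt;code&gt;tf.keras.models.load_model(...)&lt;/code&gt; for each format):&lt;/p&gt;

```python
# Minimal timing sketch: wrap any zero-argument callable and report
# wall-clock seconds via a monotonic high-resolution clock.
import time


def time_call(fn):
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    return result, elapsed


# Example with a stand-in for the model-loading call:
_, seconds = time_call(lambda: sum(range(1000)))
```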

&lt;p&gt;Converting the model was easy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Load the SavedModel version
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/Users/davidhaley/.keras/models/MultiplexSegmentation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Save as HDF5
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MultiplexSegmentation-resaved-20240710.h5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We needed to adjust one thing: the &lt;code&gt;load_model&lt;/code&gt; call needs an additional parameter to locate custom training objects:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;deepcell.layers.location&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Location2D&lt;/span&gt;

&lt;span class="c1"&gt;# [...]
&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;custom_objects&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Location2D&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Location2D&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We learned this by importing the HDF5 file without the &lt;code&gt;custom_objects&lt;/code&gt; and getting the error that &lt;code&gt;Location2D&lt;/code&gt; wasn't found.&lt;/p&gt;

&lt;p&gt;This is the only caveat we've found with the HDF5 format: needing to tell it where to find the custom objects. The prediction results appear to be the same.&lt;/p&gt;
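&lt;p&gt;"Appear to be the same" here means numerically equal within floating-point tolerance. A small sketch of the check (the two prediction arrays are placeholders for outputs from the SavedModel-loaded and HDF5-loaded models):&lt;/p&gt;

```python
# Sketch: verify two models produce numerically identical predictions.
# pred_a / pred_b stand in for outputs of model.predict(...) from the
# SavedModel and HDF5 versions, respectively.
import numpy as np


def predictions_match(pred_a, pred_b, rtol=1e-5, atol=1e-6):
    """True if the two prediction arrays agree within tolerance."""
    return np.allclose(pred_a, pred_b, rtol=rtol, atol=atol)
```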

&lt;p&gt;A 70% improvement, just by using a different file format!&lt;/p&gt;

</description>
      <category>tensorflow</category>
      <category>performance</category>
      <category>cloud</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
