<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Paul</title>
    <description>The latest articles on Forem by Paul (@platinn).</description>
    <link>https://forem.com/platinn</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1330234%2F5d83c7fb-a970-46ce-ad01-099d7d38538e.jpg</url>
      <title>Forem: Paul</title>
      <link>https://forem.com/platinn</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/platinn"/>
    <language>en</language>
    <item>
      <title>When was the last time you learned something from your LLM logs? Here is the solution: phospho</title>
      <dc:creator>Paul</dc:creator>
      <pubDate>Wed, 06 Mar 2024 03:27:37 +0000</pubDate>
      <link>https://forem.com/platinn/when-was-the-last-time-you-learn-something-from-your-llm-logs-here-is-the-solution-phospho-1opn</link>
      <guid>https://forem.com/platinn/when-was-the-last-time-you-learn-something-from-your-llm-logs-here-is-the-solution-phospho-1opn</guid>
      <description>&lt;p&gt;&lt;em&gt;TL;DR: &lt;a href="https://github.com/phospho-app/phospho"&gt;phospho&lt;/a&gt; is an open-source text analytics for LLM Apps. It helps companies turn their LLM prototype into a product with testing, evaluation, monitoring and guardrails at the semantic level.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Building LLM apps has never been easier. There are TONS of tools. Yet, companies that ship to production are &lt;strong&gt;scarce&lt;/strong&gt;. And lots of AI tools that have made it to production have either a &lt;strong&gt;HIGH churn rate&lt;/strong&gt; or a &lt;strong&gt;LOW usage rate&lt;/strong&gt;. Why?&lt;/p&gt;

&lt;p&gt;Unfortunately, many AI builders (I was one of them) are trapped at ground zero:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They don’t know what to improve in their products, because there are so many ways to improve (and many yet to come!)&lt;/li&gt;
&lt;li&gt;To make decisions, they either have &lt;strong&gt;KPIs&lt;/strong&gt; irrelevant to their use case or just &lt;strong&gt;gut feelings&lt;/strong&gt; from everyone but their users&lt;/li&gt;
&lt;li&gt;Who the users are, what they do, and what they say is usually a &lt;strong&gt;big unknown&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No wonder they feel stuck. Their best chance at improving is either guesswork or reading through thousands of logs &amp;amp; messages. It is like looking for a needle in a haystack. &lt;/p&gt;

&lt;p&gt;There is no secret. Here is what the best companies shipping LLM products do that others don’t:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They release often, and fast… because they have a &lt;strong&gt;clear set of custom metrics&lt;/strong&gt; that act as a simple “green light/red light”&lt;/li&gt;
&lt;li&gt;They improve on precise product issues… because they &lt;strong&gt;understand in great detail&lt;/strong&gt; who uses their products and why&lt;/li&gt;
&lt;li&gt;They act on feedback quickly… because they &lt;strong&gt;listen to it every day&lt;/strong&gt; in their Slack channels or by email and &lt;strong&gt;get alerted&lt;/strong&gt; when something is going wrong&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🧪 This is the purpose of phospho. phospho is an &lt;strong&gt;&lt;a href="https://github.com/phospho-app/phospho"&gt;open-source text analytics tool for LLM apps&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It gathers all the tools that enable your team to go from prototype to product at record pace: testing, evaluation, and monitoring at the semantic level. Let’s dive in:&lt;/p&gt;

&lt;h3&gt;
  
  
  Build and Test
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxk0ig4hru988fzofk811.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxk0ig4hru988fzofk811.png" alt="phospho test" width="800" height="473"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Define your own textual event detection pipeline&lt;/li&gt;
&lt;li&gt;Set up webhooks and enforce guardrails&lt;/li&gt;
&lt;li&gt;Assess quality before each release with personalized evals, and A/B test continuously&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Understand and Analyze
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmqcqwp2cpsvl3ytnla5k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmqcqwp2cpsvl3ytnla5k.png" alt="phospho dashboard" width="800" height="473"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Detect usage patterns, categorize interactions by type, topics, intents, and more&lt;/li&gt;
&lt;li&gt;Evaluate app response quality&lt;/li&gt;
&lt;li&gt;Run tests at scale and in real time&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Improve and Take Action
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgkyd59kga6rsi9j8ik2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgkyd59kga6rsi9j8ik2.png" alt="phospho events webhook" width="800" height="473"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trigger workflows, escalations, and alerts based on detected events or evaluations&lt;/li&gt;
&lt;li&gt;Dive deeper into the data; get consolidated reports through the platform or via the API&lt;/li&gt;
&lt;li&gt;Break down the analysis by users or sessions&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Integrations
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.phospho.ai/integrations/python/logging"&gt;Python&lt;/a&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;a href="https://docs.phospho.ai/integrations/javascript/logging"&gt;Javascript&lt;/a&gt;&lt;/strong&gt; SDKs to easily integrate in your LLM stack&lt;/p&gt;

&lt;p&gt;Phospho can be &lt;strong&gt;&lt;a href="https://github.com/phospho-app/phospho"&gt;self hosted&lt;/a&gt;&lt;/strong&gt; or used in a &lt;strong&gt;&lt;a href="https://phospho.ai/"&gt;managed cloud version&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  How to use this text analytics tool?
&lt;/h3&gt;

&lt;p&gt;✨ The repo is open-source on &lt;strong&gt;&lt;a href="https://github.com/phospho-app/phospho"&gt;GitHub&lt;/a&gt;&lt;/strong&gt;. Join the &lt;strong&gt;&lt;a href="https://discord.com/invite/MXqBJ9pBsx"&gt;Discord&lt;/a&gt; here&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;⚙️ If you run an LLM app, try the platform for &lt;strong&gt;&lt;a href="https://phospho.ai/"&gt;free&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>analytics</category>
      <category>opensource</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
