<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: jianzs@pluto-lang</title>
    <description>The latest articles on Forem by jianzs@pluto-lang (@jianzs).</description>
    <link>https://forem.com/jianzs</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1216267%2F400ed481-7d6e-4701-ab0b-07f4ad7a4ab5.jpeg</url>
      <title>Forem: jianzs@pluto-lang</title>
      <link>https://forem.com/jianzs</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/jianzs"/>
    <language>en</language>
    <item>
      <title>Bridging the Last Mile in LangChain Application Development</title>
      <dc:creator>jianzs@pluto-lang</dc:creator>
      <pubDate>Thu, 16 May 2024 14:06:21 +0000</pubDate>
      <link>https://forem.com/jianzs/bridging-the-last-mile-in-langchain-application-development-2i2m</link>
      <guid>https://forem.com/jianzs/bridging-the-last-mile-in-langchain-application-development-2i2m</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Deploying LangChain applications can be complex due to the need for various cloud services. This article explores the challenges developers face and introduces Pluto, a tool that enables developers to focus on writing business logic rather than getting bogged down in tedious configuration tasks.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Undoubtedly, &lt;a href="https://github.com/langchain-ai/langchain" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; is the most popular framework for AI application development at the moment. The advent of LangChain has greatly simplified the construction of AI applications based on Large Language Models (LLM). If we compare an AI application to a person, the LLM would be the "brain," while LangChain acts as the "limbs" by providing various tools and abstractions. Combined, they enable the creation of AI applications capable of "thinking." However, this article does not delve into the specific usage of LangChain but aims to discuss with readers the last-mile issue in LangChain application development—&lt;strong&gt;how to deploy LangChain applications&lt;/strong&gt;, using AWS as an example. &lt;em&gt;Why deploy on AWS? The free tier is simply too appealing for daily use.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;First, let's define the scope of this discussion: we are not talking about only deploying a LangChain application's code to the cloud. If that were the case, we would only need to consider using services like EC2 virtual machines, Fargate container services, or Lambda functions. However, a complete AI application often requires a series of backend services for support, such as using databases to save session histories or vector databases to store knowledge base embeddings. To achieve more comprehensive AI application functionalities, we might also need messaging queues, API Gateways, and so on. Therefore, we will discuss: &lt;strong&gt;How to deploy a LangChain application and its dependent backend services to the cloud together.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🔗 LangServe
&lt;/h2&gt;

&lt;p&gt;Those familiar with the LangChain ecosystem might think of LangChain's sub-project &lt;a href="https://github.com/langchain-ai/langserve" rel="noopener noreferrer"&gt;LangServe&lt;/a&gt; upon reading this. LangServe's goal is to simplify the deployment of LangChain applications. It can package LangChain apps into API servers and provides default endpoints such as stream, async, docs, and playground. But LangServe alone does not resolve the deployment issues of LangChain applications. It ultimately provides an API server based on FastAPI, akin to frameworks like Flask and Django. How to deploy LangServe applications to the cloud and how to create and manage the dependent backend services remain unanswered by LangServe.&lt;/p&gt;

&lt;p&gt;Nonetheless, LangChain is actively offering hosted LangServe capabilities on the LangSmith platform, indicating that the LangChain community is aware of the deployment issues and is working towards a solution. Even so, what about the backend services that LangChain applications depend on? Should LangSmith also provide these services? Aren't application hosting and backend services core competencies of cloud service providers? Why not directly use services from AWS, Azure, and others?&lt;/p&gt;

&lt;h2&gt;
  
  
  🪄 Three Ways to Deploy LangChain Applications
&lt;/h2&gt;

&lt;p&gt;Let's examine how to deploy LangChain applications on AWS. Here, we introduce three different methods for deployment. If you have better approaches, feel free to join the discussion.&lt;/p&gt;

&lt;h3&gt;
  
  
  ⚙️ AWS CDK
&lt;/h3&gt;

&lt;p&gt;In an AWS GenAI Day event, AWS invited Harrison Chase, the CEO of LangChain. The theme was "Building and Deploying Cutting-edge Generative AI Applications with LangChain and Amazon Bedrock." You can watch the recorded event &lt;a href="https://aws.amazon.com/startups/learn/build-and-deploy-a-cutting-edge-generative-ai-application-with-langchain-and-amazon-bedrock" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Interestingly, a significant portion of the event was dedicated to introducing how AWS's services like OpenSearch, Bedrock, Kendra, etc., can integrate with LangChain. However, the final demonstration did not show how to create instances of these services or how to deploy LangChain applications on AWS. Instead, it showcased a locally executed LangChain application that utilized pre-deployed instances of AWS Bedrock and Kendra services.&lt;/p&gt;

&lt;p&gt;However, I found a &lt;a href="https://github.com/langchain-ai/langchain-aws-template" rel="noopener noreferrer"&gt;&lt;code&gt;langchain-aws-template&lt;/code&gt; GitHub repository&lt;/a&gt; in the Resources list at the end of the video. It contains two example applications integrating AWS with LangChain, complete with deployment guides. The deployment process includes four steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a specific Python environment using Conda;&lt;/li&gt;
&lt;li&gt;Configure keys and other necessary application data;&lt;/li&gt;
&lt;li&gt;Execute a Bash script to package the application;&lt;/li&gt;
&lt;li&gt;Deploy the application using AWS CDK.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It seems quite straightforward, right? But if you need to implement more complex features and rely on more backend services, you will have to modify the packaging process and the CDK deployment scripts. This can be challenging for developers unfamiliar with AWS CDK or AWS cloud services.&lt;/p&gt;

&lt;p&gt;Additionally, we previously conducted a &lt;a href="https://pluto-lang.vercel.app/cookbook/langchain-llama2-chatbot-sagemaker" rel="noopener noreferrer"&gt;comparison&lt;/a&gt; that found using IaC-based deployment methods (like Terraform) results in &lt;strong&gt;IaC code that is 2-3 times the volume of the business code&lt;/strong&gt;. This means using IaC tools requires spending a significant amount of time maintaining IaC code, while developers clearly prefer to focus more on business code development, as the goal is to implement application functionalities.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A quick note here: AWS CDK is one type of Infrastructure as Code (IaC) tool, alongside others like Terraform and Pulumi. They have similar usage patterns and share the issues mentioned above.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  ⌨️ AWS Console
&lt;/h3&gt;

&lt;p&gt;If we don't use AWS CDK, we can manually create the backend services that applications depend on by logging into the AWS console. However, this method is quite cumbersome, involving repetitive navigation across different console pages to create various service instances and configure permissions between them. Moreover, these processes cannot be automated, making team collaboration, continuous integration, and continuous deployment impractical for complex applications.&lt;/p&gt;

&lt;p&gt;As we can see from the above, both the AWS CDK deployment method and manual creation have their difficulties:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Prone to errors: Both methods essentially involve manually creating granular service instances, which can lead to configuration omissions and errors that are hard to detect during deployment and only surface when the application runs.&lt;/li&gt;
&lt;li&gt;Requires AWS background knowledge: Whether defining service instances through CDK code or manually creating them via the console, developers need an in-depth understanding of AWS services, including direct dependencies like DynamoDB, S3, and indirect ones like IAM.&lt;/li&gt;
&lt;li&gt;Tedious permission configuration: For security reasons, we usually adhere to the principle of least privilege when configuring permissions for resource service instances. If developers manually manage these permissions through CDK or the console, it will undoubtedly be a very cumbersome process. Moreover, it's easy to forget to update permission configurations after modifying business code.&lt;/li&gt;
&lt;li&gt;Dependency management: When publishing a LangChain application as an AWS Lambda function instance, we need to package the application's SDK dependencies during the packaging process. This requires manual management by developers, which can lead to missed dependencies. If the local device's operating system or CPU architecture doesn't match AWS's platform, the packaging process becomes even more troublesome.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  🤖️ Pluto
&lt;/h3&gt;

&lt;p&gt;The analysis above shows that despite the powerful services offered by large cloud service providers like AWS, there's still a significant learning curve for developers to effectively utilize these services. This led us to an idea: what if we could deduce the infrastructure resource requirements of an application directly from the LangChain application code, and then automatically create corresponding resource instances on cloud platforms like AWS? This approach could simplify the process of resource creation and application deployment. Based on this idea, we developed a research and development tool named &lt;a href="https://github.com/pluto-lang/pluto" rel="noopener noreferrer"&gt;Pluto&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Pluto is a tool designed for individual developers, aimed at &lt;strong&gt;making the construction of cloud and AI applications more convenient&lt;/strong&gt;, addressing the aforementioned usability issues related to cloud services. Developers can &lt;strong&gt;directly define and use the cloud services their application requires within the application code&lt;/strong&gt;, including AWS DynamoDB, SageMaker, and more. Pluto employs &lt;strong&gt;static program analysis&lt;/strong&gt; to automatically extract the infrastructure requirements from the code and create the necessary service instances on the designated cloud platform.&lt;/p&gt;

&lt;p&gt;So, what's it like to deploy a LangChain application using Pluto? Let's look at a simple example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pluto_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HttpRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HttpResponse&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.pydantic_v1&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SecretStr&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.output_parsers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StrOutputParser&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tell me a short joke about {topic}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-3.5-turbo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;SecretStr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;output_parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StrOutputParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;HttpRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;HttpResponse&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;output_parser&lt;/span&gt;
    &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ice cream&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;joke&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;joke&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pluto&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The code snippet above is a LangChain application implemented with Pluto, and it looks just like a regular Python application, doesn't it? However, by simply executing &lt;code&gt;pluto deploy&lt;/code&gt;, Pluto can construct the application architecture shown in the figure below on the AWS platform. During the process, it will automatically create instances for the API Gateway and Lambda, and configure the routing, triggers, permissions, etc., from API Gateway to Lambda.&lt;/p&gt;

&lt;p&gt;
  &lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcmr37qioooa13i8zr8ak.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcmr37qioooa13i8zr8ak.png"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;Due to space limitations, the example above only demonstrates the integration of a LangChain application with the API Gateway resource. Similarly, using variable definitions, you can &lt;strong&gt;integrate more resources such as DynamoDB, S3, SageMaker&lt;/strong&gt;, and more. You can find additional examples &lt;a href="https://pluto-lang.vercel.app/cookbook" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Since the infrastructure configuration is defined alongside the application code, developers can freely alter the code according to their needs. In the next &lt;code&gt;pluto deploy&lt;/code&gt;, Pluto will automatically update the infrastructure configuration of the application without any extra steps from the developer, thus solving the previously mentioned issues of error-proneness, code packaging, and cumbersome permission configurations.&lt;/p&gt;

&lt;h2&gt;
  
  
  💬 Conclusion
&lt;/h2&gt;

&lt;p&gt;The analysis above shows that despite the powerful services offered by large cloud service providers like AWS, there's still a significant learning curve for developers to effectively utilize these services. This might explain the emergence of AI Infra products like &lt;a href="https://smith.langchain.com" rel="noopener noreferrer"&gt;LangSmith&lt;/a&gt;, &lt;a href="https://modal.com" rel="noopener noreferrer"&gt;Modal&lt;/a&gt;, and &lt;a href="https://www.lepton.ai" rel="noopener noreferrer"&gt;LaptonAI&lt;/a&gt;, which aim to be one-stop service providers for AI applications. We, on the other hand, take a different approach by directly deducing the infrastructure requirements from the application code and automatically creating corresponding service instances on cloud platforms, thus assisting developers with deployment issues. Our hope is to enable developers to focus on writing business logic, allowing even those unfamiliar with AWS to deploy applications to the cloud without getting bogged down in the tedious configuration of infrastructure.&lt;/p&gt;

&lt;p&gt;Lastly, if you like the Pluto project and want to give it a try, you can visit our &lt;a href="https://pluto-lang.vercel.app/documentation/getting-started" rel="noopener noreferrer"&gt;Getting Started guide&lt;/a&gt;, which offers various usage options including containers and online. If you have any questions or suggestions, or would like to contribute (very welcome!), feel free to &lt;a href="https://github.com/pluto-lang/community" rel="noopener noreferrer"&gt;join our community&lt;/a&gt; to participate in discussions and co-creation.&lt;/p&gt;

&lt;p&gt;And finally, if you've come this far, why not &lt;strong&gt;give us a star🌟&lt;/strong&gt;? GitHub link 👉 &lt;a href="https://github.com/pluto-lang/pluto" rel="noopener noreferrer"&gt;https://github.com/pluto-lang/pluto&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reference
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://pluto-lang.vercel.app/cookbook" rel="noopener noreferrer"&gt;Recipe Collection&lt;/a&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://pluto-lang.vercel.app/cookbook/langchain-llama2-chatbot-sagemaker-python" rel="noopener noreferrer"&gt;Llama2 Conversational Chatbot&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pluto-lang.vercel.app/cookbook/rag-qa-bot-with-web" rel="noopener noreferrer"&gt;RAG Document Bot&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pluto-lang.vercel.app/cookbook/deploy-langserve-to-aws" rel="noopener noreferrer"&gt;Deploying LangServe Applications to AWS&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://github.com/langchain-ai/langchain" rel="noopener noreferrer"&gt;LangChain GitHub&lt;/a&gt;.&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://github.com/langchain-ai/langserve" rel="noopener noreferrer"&gt;LangServe GitHub&lt;/a&gt;.&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://aws.amazon.com/cn/free" rel="noopener noreferrer"&gt;AWS Free Resources&lt;/a&gt;.&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://aws.amazon.com/startups/learn/build-and-deploy-a-cutting-edge-generative-ai-application-with-langchain-and-amazon-bedrock" rel="noopener noreferrer"&gt;Build and deploy a cutting-edge generative AI application with LangChain and Amazon Bedrock | AWS GenAI Day&lt;/a&gt;.&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://github.com/langchain-ai/langchain-aws-template" rel="noopener noreferrer"&gt;LangChain AWS Template&lt;/a&gt;.&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://github.com/pluto-lang/pluto" rel="noopener noreferrer"&gt;Pluto GitHub&lt;/a&gt;.&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://github.com/pluto-lang/community" rel="noopener noreferrer"&gt;Pluto Community&lt;/a&gt;.&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://pluto-lang.vercel.app/zh-CN" rel="noopener noreferrer"&gt;Pluto Website&lt;/a&gt;.&lt;/li&gt;

&lt;/ul&gt;

</description>
      <category>langchian</category>
      <category>aws</category>
      <category>infrastructureascode</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Craft a Document QA Assistant for Your Project in Just 5 Minutes!</title>
      <dc:creator>jianzs@pluto-lang</dc:creator>
      <pubDate>Mon, 13 May 2024 11:55:36 +0000</pubDate>
      <link>https://forem.com/jianzs/craft-a-document-qa-assistant-for-your-project-in-just-5-minutes-40pc</link>
      <guid>https://forem.com/jianzs/craft-a-document-qa-assistant-for-your-project-in-just-5-minutes-40pc</guid>
      <description>&lt;p&gt;Let me introduce you to an incredibly simple way to build a document Q&amp;amp;A assistant, which allows you to create a personalized Web Q&amp;amp;A assistant based on your GitHub documentation repository in &lt;strong&gt;just 5 minutes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;First, let's take a look at the result. You can also experience it by opening &lt;a href="https://xw3vdvjmyp7jig7tmrvrqbisiu0peosf.lambda-url.us-east-1.on.aws/" rel="noopener noreferrer"&gt;this link&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fchmgn97xnrcjkjl472ry.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fchmgn97xnrcjkjl472ry.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Brief Introduction
&lt;/h2&gt;

&lt;p&gt;First, let's quickly go over the working principle and implementation method of this document Q&amp;amp;A assistant.&lt;/p&gt;

&lt;p&gt;The main logic of this document assistant is: upon initialization, it downloads documents from the GitHub repository where the documents are stored. Then, through LangChain, it calls OpenAI's Embeddings model to generate vectors for the documents and saves them to AWS S3 to avoid wasting Tokens by recreating document vectors. When a user inputs a question, LangChain is used again to generate vectors for the question. Then, the FAISS vector database retrieves related documents, and finally, GPT-3.5 is used to synthesize an answer. Additionally, this document Q&amp;amp;A assistant automatically updates the document vectors at 0:00 UTC every day.&lt;/p&gt;

&lt;p&gt;To construct this document Q&amp;amp;A assistant, we utilized a series of frameworks and tools including LangChain, FastUI, and Pluto, and eventually deployed it on AWS. LangChain is used for the main Q&amp;amp;A functionality, invoking models such as OpenAI's Embedding and GPT-3.5; FastUI is used for the Web interface shown above; Pluto is in charge of creating and configuring cloud resources, as well as application deployment. With Pluto, you can create and configure resources such as AWS Lambda and S3 by just creating a few variables in the code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;To make it easy for everyone to get started, I've made this document Q&amp;amp;A assistant into a CodeSandbox template. Simply click the link below to open CodeSandbox's online IDE, then click Fork in the top right corner to copy the project to your account. After that, you are free to modify the code and deploy it to AWS with one click.&lt;br&gt;
&lt;a href="https://codesandbox.io/p/devbox/pluto-doc-qa-bot-forked-8njh42" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcodesandbox.io%2Fstatic%2Fimg%2Fplay-codesandbox.svg" alt="Open in CodeSandbox"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, I will introduce how to step by step build your own document Q&amp;amp;A assistant.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prepare Tokens
&lt;/h3&gt;

&lt;p&gt;Before you start, you need to prepare several key Tokens, including a GitHub Token for downloading document data from GitHub, an OpenAI API Key for invoking OpenAI models, and AWS credentials for application deployment, among others.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub Token: You can create a GitHub Token on this page &lt;a href="https://github.com/settings/tokens/new" rel="noopener noreferrer"&gt;New personal access token (classic)&lt;/a&gt;, with just the &lt;code&gt;public_repo&lt;/code&gt; permission required.&lt;/li&gt;
&lt;li&gt;OpenAI API Key: You can obtain an API Key from the &lt;a href="https://platform.openai.com/account/api-keys" rel="noopener noreferrer"&gt;OpenAI platform&lt;/a&gt;. Of course, you can also use other APIs that are compatible with the OpenAI API, where there's a place to configure &lt;code&gt;base_url&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;AWS Credentials: You need to get your AWS Access Key and Secret Key from &lt;a href="https://us-east-1.console.aws.amazon.com/iam/home?region=us-east-1#security_credential" rel="noopener noreferrer"&gt;AWS's console&lt;/a&gt; for deploying the application to AWS later.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Modify Basic Configuration
&lt;/h3&gt;

&lt;p&gt;Once you enter the development environment, the console will automatically display the Configure AWS Certificate tab. Enter your AWS certificate information here to ensure the application can be successfully deployed to AWS. You can leave the &lt;code&gt;output format&lt;/code&gt; field blank. After filling in the other necessary information, if everything is correct, you will see a green check mark ✔️ next to the tab name.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjspugatnzb170h55vzjc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjspugatnzb170h55vzjc.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we need to modify the configuration of the assistant. Open the &lt;code&gt;app/main.py&lt;/code&gt; file, starting from line 25, the following lines contain the assistant's basic configuration, including the GitHub repository where the documents are stored, the repository branch, the relative path of the documents in the repository, OpenAI's API Key, GitHub Token, etc. You will need to modify these configurations according to your actual situation.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;PROJECT_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Pluto&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# Project name, related to the title of the Web page
&lt;/span&gt;&lt;span class="n"&gt;REPO&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pluto-lang/website&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# GitHub repository storing the documents
&lt;/span&gt;&lt;span class="n"&gt;BRANCH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# Branch of the repository
&lt;/span&gt;&lt;span class="n"&gt;DOC_RELATIVE_PATH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# Relative path of the documents in the repository
&lt;/span&gt;
&lt;span class="n"&gt;OPENAI_BASE_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.openai.com/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# Base URL of the OpenAI API
&lt;/span&gt;&lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;replace_with_your_openai_api_key&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# OpenAI API Key
&lt;/span&gt;&lt;span class="n"&gt;GITHUB_ACCESS_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;replace_with_your_github_access_key&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# GitHub Token
&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To customize your robot, such as changing the style of the robot's responses, you can achieve this by altering the &lt;code&gt;prompt&lt;/code&gt; variable in the code.&lt;/p&gt;

&lt;h3&gt;
  
  
  One-Click Deployment
&lt;/h3&gt;

&lt;p&gt;After the configuration is complete, simply click the terminal icon and select &lt;code&gt;Deploy&lt;/code&gt; from the menu to deploy the document Q&amp;amp;A assistant to AWS, without any additional steps on your part. The deployment may take about a minute. Once it's done, the deployment URL will be displayed in the console, and you can access the document Q&amp;amp;A assistant by clicking this URL!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0rp5xkocy1byd4i2cbbo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0rp5xkocy1byd4i2cbbo.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note that due to AWS Lambda cold starts and the delay in building the vector database, the first visit may require a few seconds to a few tens of seconds of patience.&lt;/p&gt;

&lt;h3&gt;
  
  
  Destruction
&lt;/h3&gt;

&lt;p&gt;If you want to take the application offline, simply click &lt;code&gt;Destroy&lt;/code&gt; from the menu, and the resources created in AWS will be completely deleted.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjorvs37n7rcobok1y3hs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjorvs37n7rcobok1y3hs.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The implementation of this document Q&amp;amp;A assistant is very simple; you only need to modify a few configurations to deploy it on AWS.&lt;/p&gt;

&lt;p&gt;The reason it can be implemented so effortlessly is that you don't need to worry about the creation and configuration of cloud resources or the deployment of applications; this is mainly thanks to Pluto's capabilities. Pluto can automatically deduce the resources an application depends on from the code and automatically create and configure these resources, allowing developers to focus on the implementation of business logic.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Pluto's features, feel free to visit Pluto's &lt;a href="https://pluto-lang.vercel.app" rel="noopener noreferrer"&gt;official documentation&lt;/a&gt; or the &lt;a href="https://github.com/pluto-lang/pluto" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;. If you could give a Star🌟, that would be even better! We also welcome everyone to submit issues and PRs.&lt;/p&gt;

&lt;h2&gt;
  
  
  More Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;If you want to create based on Llama3, SageMaker, you can refer to this case: &lt;a href="https://pluto-lang.vercel.app/cookbook/rag-qa-bot-llama3" rel="noopener noreferrer"&gt;Document-Q&amp;amp;A assistant based on Llama3&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;If you want to support session functionality, you can refer to this case: &lt;a href="https://pluto-lang.vercel.app/zh-CN/cookbook/langchain-llama2-chatbot-sagemaker-python" rel="noopener noreferrer"&gt;Building a Llama2 Conversational Chatbot with AWS and LangChain&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Pluto Official Documentation: &lt;a href="https://pluto-lang.vercel.app" rel="noopener noreferrer"&gt;https://pluto-lang.vercel.app&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Pluto GitHub Repository: &lt;a href="https://github.com/pluto-lang/pluto" rel="noopener noreferrer"&gt;https://github.com/pluto-lang/pluto&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>rag</category>
      <category>langchain</category>
      <category>openai</category>
      <category>aws</category>
    </item>
    <item>
      <title>Deploy LangServe Application to AWS</title>
      <dc:creator>jianzs@pluto-lang</dc:creator>
      <pubDate>Tue, 07 May 2024 09:46:55 +0000</pubDate>
      <link>https://forem.com/jianzs/deploy-langserve-application-to-aws-3f65</link>
      <guid>https://forem.com/jianzs/deploy-langserve-application-to-aws-3f65</guid>
      <description>&lt;p&gt;&lt;strong&gt;This guide will introduce how to deploy the LangServe application to AWS with one click through Pluto, requiring only AWS access credentials, no need to learn AWS operations or log in to the AWS console.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/langchain-ai/langserve" rel="noopener noreferrer"&gt;LangServe&lt;/a&gt; is a subproject of LangChain, which can help developers deploy LangChain's Runnable and Chain through REST API. At the same time, it also provides a client for calling Runnable deployed on the server, including multiple versions such as Python and TypeScript, and provides Playground by default for online trial after deployment.&lt;/p&gt;

&lt;p&gt;You can get all the code for this example from &lt;a href="https://github.com/pluto-lang/pluto/tree/main/examples/deploy-langserve-to-aws/" rel="noopener noreferrer"&gt;here&lt;/a&gt;. &lt;a href="https://codesandbox.io/p/devbox/deploy-langserve-application-to-aws-csj8wj" rel="noopener noreferrer"&gt;This link&lt;/a&gt; provides an online IDE for this sample application. Click the Fork button in the upper right corner to create your own development environment, and then you can directly modify the code and deploy it to AWS in the browser.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⚠️Note:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Since Pluto currently only supports single files, the code of the LangServe application needs to be placed in one file.&lt;/li&gt;
&lt;li&gt;Limited by the current packaging method of Pluto, it does not yet support LangChain's &lt;a href="https://github.com/langchain-ai/langchain/tree/master/templates" rel="noopener noreferrer"&gt;Template Ecosystem&lt;/a&gt;. &lt;em&gt;Coming soon&lt;/em&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Environment Preparation
&lt;/h2&gt;

&lt;p&gt;If you have not configured the Pluto development environment, please refer to the local development in &lt;a href="https://pluto-lang.vercel.app/documentation/getting-started" rel="noopener noreferrer"&gt;Getting Started&lt;/a&gt; for configuration, or you can use the online sandbox or container provided by Pluto for experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Developing LangServe Application
&lt;/h2&gt;

&lt;p&gt;Here we introduce two different ways to develop LangServe applications: one is the development method mentioned in the &lt;a href="https://github.com/langchain-ai/langserve" rel="noopener noreferrer"&gt;langserve tutorial&lt;/a&gt;, using the &lt;code&gt;langchain app new&lt;/code&gt; command to create a new LangChain application; the other is using the &lt;code&gt;pluto new&lt;/code&gt; command to create a new Pluto application.&lt;/p&gt;

&lt;h3&gt;
  
  
  Method 1: langchain app new
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Install LangChain CLI
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;langchain-cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Create LangServe Application
&lt;/h4&gt;

&lt;p&gt;Use the &lt;code&gt;langchain app new&lt;/code&gt; command to create a new LangChain application. This command will create a new directory in the current directory, and the directory name is the application name you specified:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;langchain app new &lt;span class="nt"&gt;--non-interactive&lt;/span&gt; my-app
&lt;span class="nb"&gt;cd &lt;/span&gt;my-app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: The &lt;code&gt;langchain app new&lt;/code&gt; command depends on &lt;code&gt;git&lt;/code&gt;, please make sure &lt;code&gt;git&lt;/code&gt; is installed in your environment. If you are using the container environment provided by Pluto, please execute this command &lt;code&gt;apt-get update &amp;amp;&amp;amp; apt-get install -y git&lt;/code&gt; to install &lt;code&gt;git&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Write LangServe Application
&lt;/h4&gt;

&lt;p&gt;You can develop AI applications based on LangChain in the &lt;code&gt;app/server.py&lt;/code&gt; file according to your needs. In the end, you should develop one or more Runnable instances such as LangChain's Agent, Chain, etc. These instances can be added to FastAPI through the &lt;code&gt;add_routes&lt;/code&gt; method provided by LangServe, and then provided to users in the form of HTTP services.&lt;/p&gt;

&lt;p&gt;We take the &lt;a href="https://github.com/langchain-ai/langserve?tab=readme-ov-file#sample-application" rel="noopener noreferrer"&gt;sample application on the LangServe homepage&lt;/a&gt; as an example. This example uses the &lt;code&gt;add_routes&lt;/code&gt; method to add multiple Runnable instances of LangChain to FastAPI.&lt;/p&gt;

&lt;h4&gt;
  
  
  Modify Code to Adapt to Pluto
&lt;/h4&gt;

&lt;p&gt;Next, we need to adapt the LangServe application to the Pluto application so that Pluto can deploy it to AWS. The adaptation process is also very simple, just two steps&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;First, you need to put the code related to the FastAPI app into a function and make this function return the FastAPI app instance. Here we assume that the function name is &lt;code&gt;return_fastapi_app&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Then, replace the entire &lt;code&gt;if __name__ == "__main__":&lt;/code&gt; code block with the following 4 statements. You can modify &lt;code&gt;router_name&lt;/code&gt; to your favorite name. This name is related to the name of the Api Gateway instance created on AWS.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mangum&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Mangum&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pluto_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Router&lt;/span&gt;

&lt;span class="n"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;router_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Mangum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;return_fastapi_app&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;api_gateway_base_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/dev&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The final code is as follows:&lt;/p&gt;

&lt;p&gt;&lt;span id="modified-code"&gt;&lt;/span&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;
&lt;span class="c1"&gt;# from langchain.chat_models import ChatAnthropic, ChatOpenAI
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_anthropic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatAnthropic&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langserve&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;add_routes&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.pydantic_v1&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SecretStr&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mangum&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Mangum&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pluto_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Router&lt;/span&gt;

&lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SecretStr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sk-EUk0Tal8cIkmG4vJF904F57a9eE241A8Ae72666fAxxxxxxx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ANTHROPIC_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SecretStr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sk-EUk0Tal8cIkmG4vJF904F57a9eE241A8Ae72666fAxxxxxxx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatAnthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tell me a joke about {topic}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;return_fastapi_app&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# The langserve depends on this, but it may not come pre-installed.
&lt;/span&gt;    &lt;span class="c1"&gt;# So, we write it here to ensure it is installed.
&lt;/span&gt;    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sse_starlette&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LangChain Server&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A simple api server using Langchain&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s Runnable interfaces&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;add_routes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/openai&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;add_routes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nc"&gt;ChatAnthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/anthropic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;add_routes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/joke&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;

&lt;span class="n"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;router_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Mangum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;return_fastapi_app&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;api_gateway_base_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/dev&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Deploy to AWS
&lt;/h4&gt;

&lt;p&gt;Before the official deployment, we need to initialize this project as a Pluto project, so that Pluto can recognize and deploy the project. Run the following command in the project root directory, Pluto will guide you to initialize the project interactively, please choose Python for the programming language:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the initialization is completed, we need to install some necessary dependencies, execute the following two commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt;

&lt;span class="c"&gt;# When the Python version does not match, please modify the python version number in pyproject.toml&lt;/span&gt;
poetry add pluto-client mangum langchain-openai langchain_anthropic sse_starlette
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we can deploy the LangServe application to AWS by executing the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;poetry shell
pluto deploy app/server.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note: If your development environment is Arm64 architecture&lt;/strong&gt;, please install and start &lt;code&gt;docker&lt;/code&gt; in the environment. If you are using the container environment provided by Pluto, &lt;code&gt;docker&lt;/code&gt; has been installed in the environment, but you need to configure the &lt;code&gt;--privileged&lt;/code&gt; parameter when starting, and then manually start the &lt;code&gt;docker&lt;/code&gt; service in the container, the startup command is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dockerd &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command will deploy your LangServe application as a serverless application to AWS, creating an Api Gateway instance and a Lambda function instance to handle requests. At the same time, the URL of AWS's Api Gateway will be printed in the terminal. You can access the deployed application by visiting this URL.&lt;/p&gt;

&lt;h3&gt;
  
  
  Method 2: pluto new
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Create Pluto Application
&lt;/h4&gt;

&lt;p&gt;Use the &lt;code&gt;pluto new&lt;/code&gt; command to create a new Pluto application. This command will interactively create a new Pluto application and create a new directory in the current directory. The directory name is the application name you specified. Please choose Python for the programming language:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto new
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After creation, enter the created application directory and install the necessary dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; &amp;lt;project name&amp;gt;
npm &lt;span class="nb"&gt;install
&lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Write LangServe Application
&lt;/h4&gt;

&lt;p&gt;You can develop AI applications based on LangChain in the &lt;code&gt;app/main.py&lt;/code&gt; file according to your needs. In the end, you should develop one or more Runnable instances such as LangChain's Agent, Chain, etc. These instances can be added to FastAPI through the &lt;code&gt;add_routes&lt;/code&gt; method provided by LangServe, and then provided to users in the form of HTTP services.&lt;/p&gt;

&lt;p&gt;However, here, we need to put the code related to the FastAPI app into a function and make this function return the FastAPI app instance, and finally encapsulate this function in the &lt;code&gt;all&lt;/code&gt; method of &lt;code&gt;Router&lt;/code&gt;, so that Pluto can deploy it to AWS.&lt;/p&gt;

&lt;p&gt;Take the sample application on the LangServe homepage as an example, the final code is the same as the adapted code in the previous method.&lt;/p&gt;

&lt;h4&gt;
  
  
  Deploy to AWS
&lt;/h4&gt;

&lt;p&gt;Ensure that all dependencies are installed, and you can deploy the LangServe application to AWS by executing the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note: If your development environment is Arm architecture&lt;/strong&gt;, please install and start &lt;code&gt;docker&lt;/code&gt; in the environment. If you are using the container environment provided by Pluto, &lt;code&gt;docker&lt;/code&gt; has been installed in the environment, but you need to configure the &lt;code&gt;--privileged&lt;/code&gt; parameter when starting, and then manually start the &lt;code&gt;docker&lt;/code&gt; service in the container, the startup command is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dockerd &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;pluto deploy&lt;/code&gt; will deploy your LangServe application as a serverless application to AWS, creating an Api Gateway instance and a Lambda function instance to handle requests. At the same time, the URL of AWS's Api Gateway will be printed in the terminal. You can access the deployed application by visiting this URL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Access
&lt;/h2&gt;

&lt;p&gt;After the deployment is complete, you can see the URL output by Pluto from the terminal. You can access your LangServe application through this URL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⚠️Note:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pluto does not yet support Stream access, and the result is still returned at once when using the &lt;code&gt;astream&lt;/code&gt; method of LangServe.&lt;/li&gt;
&lt;li&gt;Because the first load of LangChain dependencies may be slow, the first call to the LangServe service or access to the Playground may be slow, and it will automatically time out after 30 seconds. Therefore, if you encounter a timeout issue when accessing, please try again.&lt;/li&gt;
&lt;li&gt;Each instance of an AWS Lambda function can only handle one request at a time, and the initialization time of each LangChain Lambda instance is close to 2 minutes, so there may be request timeout issues in high concurrency situations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Call via RemoteRunnable
&lt;/h3&gt;

&lt;p&gt;Still taking the &lt;a href="https://github.com/langchain-ai/langserve?tab=readme-ov-file#client" rel="noopener noreferrer"&gt;Client provided by the sample application on the LangServe homepage&lt;/a&gt; as an example, you only need to replace the local URL in the LangServe example with the URL output by Pluto.&lt;/p&gt;

&lt;p&gt;We did not use the Anthropic model, so we only retained the call of the OpenAI and Joke models. The modified Python client code is as follows, please replace the &lt;code&gt;https://fcz1u130w3.execute-api.us-east-1.amazonaws.com/dev&lt;/code&gt; in the code with the URL output by Pluto:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.schema&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HumanMessage&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.schema.runnable&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RunnableMap&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langserve&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RemoteRunnable&lt;/span&gt;


&lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RemoteRunnable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://fcz1u130w3.execute-api.us-east-1.amazonaws.com/dev/openai/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;joke_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RemoteRunnable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://fcz1u130w3.execute-api.us-east-1.amazonaws.com/dev/joke/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sync_inoke&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;joke_chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parrots&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&amp;gt; The result of `joke_chain.invoke({&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;parrots&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;})` is:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;async_inoke&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;joke_chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ainvoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parrots&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&amp;gt; The result of `await joke_chain.ainvoke({&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;parrots&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;})` is:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Act like either a cat or a parrot.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Supports astream
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&amp;gt; The result of `openai.astream(prompt)` is:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;astream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; | &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;custom_chain&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tell me a long story about {topic}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Can define custom chains
&lt;/span&gt;    &lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nc"&gt;RunnableMap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;([{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parrots&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cats&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}])&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&amp;gt; The result of `chain.batch([{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;parrots&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;}, {&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cats&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;}])` is:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;sync_inoke&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;async_inoke&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;custom_chain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The following figure shows the result of executing the Python client code:&lt;/p&gt;

&lt;p&gt;
   &lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpluto-lang.vercel.app%2Fassets%2Fdeploy-langserve-to-aws%2Fpyclient.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpluto-lang.vercel.app%2Fassets%2Fdeploy-langserve-to-aws%2Fpyclient.png" alt="Python Client Result"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;The modified TypeScript client code is as follows, please replace the &lt;code&gt;&amp;lt;your-api-gateway-url&amp;gt;&lt;/code&gt; in the code with the URL output by Pluto:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;RemoteRunnable&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/core/runnables/remote&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RemoteRunnable&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&amp;lt;your-api-gateway-url&amp;gt;/joke/`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cats&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Access via curl
&lt;/h3&gt;

&lt;p&gt;Similarly, you only need to replace the &lt;code&gt;&amp;lt;your-api-gateway-url&amp;gt;&lt;/code&gt; in the example with the URL output by Pluto:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;--location&lt;/span&gt; &lt;span class="nt"&gt;--request&lt;/span&gt; POST &lt;span class="s1"&gt;'&amp;lt;your-api-gateway-url&amp;gt;/joke/invoke'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--data-raw&lt;/span&gt; &lt;span class="s1"&gt;'{
        "input": {
            "topic": "cats"
        }
    }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The following figure shows the result of executing the curl command:&lt;/p&gt;

&lt;p&gt;
   &lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpluto-lang.vercel.app%2Fassets%2Fdeploy-langserve-to-aws%2Fcurl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpluto-lang.vercel.app%2Fassets%2Fdeploy-langserve-to-aws%2Fcurl.png" alt="Curl Result"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  Access Playground via Browser
&lt;/h3&gt;

&lt;p&gt;Due to the current routing policy of LangServe, we cannot directly access LangServe's Playground through the browser without modifying the code. After &lt;a href="https://github.com/langchain-ai/langserve/pull/580" rel="noopener noreferrer"&gt;this PR&lt;/a&gt; is merged, LangServe's Playground can be directly accessed through the browser.&lt;/p&gt;

&lt;p&gt;Now, we need to add an additional &lt;code&gt;add_routes&lt;/code&gt; method for each &lt;code&gt;add_routes&lt;/code&gt; method, and add the &lt;code&gt;/dev&lt;/code&gt; prefix to the &lt;code&gt;path&lt;/code&gt; parameter, so that LangServe's Playground can be accessed in the browser. Below is a sample code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;add_routes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/openai&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;add_routes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/dev/openai&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After modifying and deploying, you can access the Playground of the sample application through the following URL. Note that you need to add &lt;code&gt;/dev&lt;/code&gt; in the access path, that is, there are two &lt;code&gt;/dev&lt;/code&gt; in the path. Note that the URL may be redirected. If it is modified, please adjust the path and try again.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI: &lt;code&gt;&amp;lt;your-api-gateway-url&amp;gt;/dev/openai/playground&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Anthropic: &lt;code&gt;&amp;lt;your-api-gateway-url&amp;gt;/dev/anthropic/playground&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Joke: &lt;code&gt;&amp;lt;your-api-gateway-url&amp;gt;/dev/joke/playground&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The two figures below show the results of accessing the Playground of OpenAI and Joke through the browser, respectively:&lt;/p&gt;

&lt;p&gt;
   &lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpluto-lang.vercel.app%2Fassets%2Fdeploy-langserve-to-aws%2Fopenai-playground.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpluto-lang.vercel.app%2Fassets%2Fdeploy-langserve-to-aws%2Fopenai-playground.png" alt="OpenAI Playground"&gt;&lt;/a&gt;
   &lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpluto-lang.vercel.app%2Fassets%2Fdeploy-langserve-to-aws%2Fjoke-playground.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpluto-lang.vercel.app%2Fassets%2Fdeploy-langserve-to-aws%2Fjoke-playground.png" alt="Joke Playground"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Cleanup
&lt;/h2&gt;

&lt;p&gt;If you want to take the deployed LangServe application offline from AWS, you only need to execute the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto destroy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we have explored in detail how to use Pluto to deploy the LangServe application to the AWS cloud platform with one click. This method allows you to easily deploy the LangServe application to the cloud and implement remote calls and Playground access, even if you are not familiar with AWS operations.&lt;/p&gt;

&lt;p&gt;Pluto also provides the ability to automatically create resources such as DynamoDB, SNS, SageMaker, etc. You only need to write code, and &lt;code&gt;pluto deploy&lt;/code&gt; will automatically create and configure these resources on AWS, providing you with the computing, storage and other capabilities of the cloud more conveniently, helping you to develop powerful AI applications more easily, and realize your ideas💡. You can get more information from More Resources.&lt;/p&gt;

&lt;p&gt;We try to make the steps in this article as simple and easy to understand as possible, so even if you are not very familiar with Pluto or AWS, you can easily get started. If you encounter problems during reading and practice, or have new ideas, please feel free to seek help by &lt;a href="https://github.com/pluto-lang/pluto/issues/new/choose" rel="noopener noreferrer"&gt;submitting an issue&lt;/a&gt; or &lt;a href="https://join.slack.com/t/plutolang/shared_invite/zt-25gztklfn-xOJ~Xvl4EjKJp1Zn1NNpiw" rel="noopener noreferrer"&gt;joining the Pluto Slack community&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  More Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/langchain-ai/langserve" rel="noopener noreferrer"&gt;LangServe&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.langchain.com/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pluto-lang.vercel.app/zh-CN" rel="noopener noreferrer"&gt;Pluto Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pluto-lang.vercel.app/zh-CN/cookbook/" rel="noopener noreferrer"&gt;Pluto Sample Applications&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/pluto-lang/pluto" rel="noopener noreferrer"&gt;Pluto GitHub Repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://join.slack.com/t/plutolang/shared_invite/zt-25gztklfn-xOJ~Xvl4EjKJp1Zn1NNpiw" rel="noopener noreferrer"&gt;Pluto Slack Community&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Quick Experience
&lt;/h2&gt;

&lt;p&gt;Replace the &lt;code&gt;OPENAI_API_KEY&lt;/code&gt;, &lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt;, &lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt;, &lt;code&gt;AWS_SECRET_ACCESS_KEY&lt;/code&gt;, &lt;code&gt;AWS_REGION&lt;/code&gt; and other parameters in this script with your actual values, and then save this script to your local machine.&lt;/p&gt;

&lt;p&gt;Executing this script will automatically create a LangServe sample application and deploy it to AWS, finally outputting the deployed URL. You can refer to the Access section in the text above to access the deployed application.&lt;/p&gt;

&lt;p&gt;After execution, it will enter an interactive command line, making it easy for you to take the deployed application offline with &lt;code&gt;pluto destroy&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;your-openai-api-key&amp;gt;"&lt;/span&gt;
&lt;span class="nv"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;your-aws-access-key-id&amp;gt;"&lt;/span&gt;
&lt;span class="nv"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;your-aws-secret-access-key&amp;gt;"&lt;/span&gt;
&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;

&lt;span class="c"&gt;# Prepare the modified code of LangServe application&lt;/span&gt;
&lt;span class="nv"&gt;MODIFIED_CODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langserve import add_routes
from langchain.pydantic_v1 import SecretStr

from mangum import Mangum
from pluto_client import Router

OPENAI_API_KEY = SecretStr("&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;")

model = ChatOpenAI(api_key=OPENAI_API_KEY)
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")

def return_fastapi_app():
    # The langserve depends on this, but it may not come pre-installed.
    # So, we write it here to ensure it is installed.
    import sse_starlette

    app = FastAPI(
      title="LangChain Server",
      version="1.0",
      description="A simple api server using Langchain's Runnable interfaces",
    )

    add_routes(
      app,
      ChatOpenAI(api_key=OPENAI_API_KEY),
      path="/openai",
    )

    add_routes(
      app,
      ChatOpenAI(api_key=OPENAI_API_KEY),
      path="/dev/openai",
    )

    add_routes(
      app,
      prompt | model,
      path="/joke",
    )

    add_routes(
      app,
      prompt | model,
      path="/dev/joke",
    )

    return app

router = Router("router_name")
router.all(
    "/*",
    lambda *args, **kwargs: Mangum(return_fastapi_app(), api_gateway_base_path="/dev")(*args, **kwargs),
    raw=True,
)
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Prepare the package.json file, used by the Pluto&lt;/span&gt;
&lt;span class="nv"&gt;PACKAGE_JSON&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
{
  "name": "my-app",
  "private": true,
  "version": "0.0.0",
  "scripts": {
    "test:dev": "pluto test --sim",
    "test:prod": "pluto test",
    "deploy": "pluto deploy",
    "destroy": "pluto destroy"
  },
  "dependencies": {},
  "devDependencies": {
    "@types/node": "^20",
    "typescript": "^5.2.2",
    "@plutolang/base": "latest",
    "@plutolang/pluto-infra": "latest",
    "@pulumi/pulumi": "^3.88.0"
  },
  "main": "dist/index.js"
}
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Prepare the Pluto configuration file&lt;/span&gt;
&lt;span class="nv"&gt;PLUTO_YML&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
current: aws
language: python
stacks:
  - configs: {}
    name: aws
    platformType: AWS
    provisionType: Pulumi
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Prepare the AWS credentials&lt;/span&gt;
&lt;span class="nv"&gt;AWS_CREDENTIALS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
[default]
aws_access_key_id = &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
aws_secret_access_key = &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Prepare the AWS configuration&lt;/span&gt;
&lt;span class="nv"&gt;AWS_CONFIG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
[default]
region = &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Prepare the script to run inside the Docker container&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF1&lt;/span&gt;&lt;span class="sh"&gt; &amp;gt; script.sh
#!/bin/bash

apt update
apt install -y git

pip install langchain-cli poetry

langchain app new --non-interactive my-app
cd my-app

cat &amp;lt;&amp;lt; EOF2 &amp;gt; app/server.py
&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;MODIFIED_CODE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
EOF2

cat &amp;lt;&amp;lt; EOF3 &amp;gt; package.json
&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PACKAGE_JSON&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
EOF3

mkdir -p .pluto
cat &amp;lt;&amp;lt; EOF4 &amp;gt; .pluto/pluto.yml
&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PLUTO_YML&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
EOF4

npm install
sed -i 's/&lt;/span&gt;&lt;span class="se"&gt;\^&lt;/span&gt;&lt;span class="sh"&gt;3.11/&lt;/span&gt;&lt;span class="se"&gt;\^&lt;/span&gt;&lt;span class="sh"&gt;3.10/' pyproject.toml
poetry add pluto-client mangum langchain-openai sse_starlette

mkdir -p ~/.aws
cat &amp;lt;&amp;lt; EOF5 &amp;gt; ~/.aws/credentials
&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_CREDENTIALS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
EOF5
cat &amp;lt;&amp;lt; EOF6 &amp;gt; ~/.aws/config
&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_CONFIG&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
EOF6

source &lt;/span&gt;&lt;span class="se"&gt;\$&lt;/span&gt;&lt;span class="sh"&gt;(poetry env info --path)/bin/activate
pluto deploy -y --force app/server.py

bash
&lt;/span&gt;&lt;span class="no"&gt;EOF1

&lt;/span&gt;&lt;span class="c"&gt;# Run the script inside the Docker container&lt;/span&gt;
docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--platform&lt;/span&gt; linux/amd64 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/script.sh:/script.sh &lt;span class="se"&gt;\&lt;/span&gt;
  plutolang/pluto:latest bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"bash /script.sh"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>webdev</category>
      <category>langchian</category>
      <category>aws</category>
      <category>pluto</category>
    </item>
    <item>
      <title>Integrate Llama3 Into Your Application with Just One Command!</title>
      <dc:creator>jianzs@pluto-lang</dc:creator>
      <pubDate>Thu, 25 Apr 2024 12:46:59 +0000</pubDate>
      <link>https://forem.com/jianzs/integrate-llama3-into-your-application-with-just-one-command-13mj</link>
      <guid>https://forem.com/jianzs/integrate-llama3-into-your-application-with-just-one-command-13mj</guid>
      <description>&lt;p&gt;Llama3, heralded as the world's first "open-source GPT4," has finally arrived!&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Llama3, the latest open-source Large Language Model (LLM) launched by Meta, includes the Llama3 8B with 8 billion parameters and the Llama3 70B with 70 billion parameters. Llama3 has made significant progress in performance, with the 8B model outperforming Gemma 7B and Mistral 7B Instruct across various benchmarks such as MMLU, GPQA, HumanEval, while the 70B model has surpassed the proprietary Claude 3 Sonnet and is on par with Google's Gemini Pro 1.5. Additionally, Meta is developing a version with more than 400 billion (400B) parameters, expected to have enhanced multilingual processing capabilities and the ability to understand non-textual patterns such as images.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;With Llama3 at our disposal, we can craft an array of innovative applications, from engaging chatbots to intelligent Retrieval-Augmented Generation (RAG) QA bots, and beyond. However, deploying Llama3, integrating the deployed Llama3 with one's application, and deploying the application itself can be challenging for many developers.&lt;/p&gt;

&lt;p&gt;This article introduces a development approach based on &lt;a href="https://github.com/pluto-lang/pluto"&gt;Pluto&lt;/a&gt;, which requires only writing application code and &lt;strong&gt;executing a single command to deploy Llama3 and release the application&lt;/strong&gt;. This article will use a RAG-based document QA bot as an example to demonstrate this development method. The main function of this QA bot is to retrieve project documentation from a specified GitHub repository and then use the Llama3 model to answer questions based on the document content.&lt;/p&gt;

&lt;p&gt;The following image shows the interaction with this QA bot, with the specified repository being Pluto's documentation repository. Thus, from the content of the image, one can get a basic understanding of what Pluto is:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy7omkbpg2hi0th8qh2ca.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy7omkbpg2hi0th8qh2ca.png" alt="Result" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Architecture
&lt;/h2&gt;

&lt;p&gt;The example application to be implemented is based on the LangChain framework and uses OpenAI Embeddings as the document vectorization tool. The entire application will be deployed on AWS, and the deployed architecture is shown in the figure below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feeysewaezapmaxjpb45d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feeysewaezapmaxjpb45d.png" alt="Arch" width="800" height="704"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Specifically, the deployed application will include the following AWS resource instances:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Llama3 model will be deployed on SageMaker.&lt;/li&gt;
&lt;li&gt;An S3 bucket will be created to store the document vector database, thus avoiding the need to rebuild the vector database each time a Lambda function is started.&lt;/li&gt;
&lt;li&gt;A CloudWatch rule will be created to update the document vector database daily.&lt;/li&gt;
&lt;li&gt;Two Lambda instances will be created, one for receiving user query requests and the other for updating the document vector database.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In addition to creating these resource instances, it is also necessary to configure the dependencies between resources, including triggers, IAM roles, and permission policies. However, you don't need to worry about these complex creation and configuration processes, as &lt;strong&gt;Pluto can deduce this information from the code and then automatically complete the creation and configuration of resources&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up Your Development Environment and Tokens
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prepare the Development Environment
&lt;/h3&gt;

&lt;p&gt;First, you need to prepare the development environment for the Pluto application. Pluto provides three different development methods: container development, online development, and local development:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Container development: Create a container based on the &lt;code&gt;plutolang/pluto:latest&lt;/code&gt; container image to serve as the development environment for the Pluto application.&lt;/li&gt;
&lt;li&gt;Online development: Open and fork the &lt;a href="https://codesandbox.io/p/devbox/github/pluto-lang/codesandbox/tree/main/python?file=/README_zh.md"&gt;template application&lt;/a&gt; created on CodeSandbox to develop the application directly in the browser.&lt;/li&gt;
&lt;li&gt;Local development: Refer to the &lt;a href="https://pluto-lang.vercel.app/documentation/getting-started/local-development"&gt;Pluto Local Development Guide&lt;/a&gt; to configure the Pluto development environment locally.
You can learn about the detailed usage methods of various environments from the &lt;a href="https://pluto-lang.vercel.app/documentation/getting-started"&gt;Pluto Getting Started Guide&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Prepare AWS Resource Quotas
&lt;/h3&gt;

&lt;p&gt;To deploy the Llama3 8B model, at least the &lt;code&gt;ml.g5.2xlarge&lt;/code&gt; instance type is required, and to deploy the Llama3 70B model, at least the &lt;code&gt;ml.p4d.24xlarge&lt;/code&gt; instance type is required. The initial quota for these two types of instances is zero, so if you haven't applied for an increase, you may need to do so through the &lt;a href="https://console.aws.amazon.com/servicequotas/home"&gt;AWS Management Console&lt;/a&gt;. For a trial experience, you can also use the &lt;code&gt;TinyLlama-1.1B-Chat-v1.0&lt;/code&gt; model, which is compatible with the &lt;code&gt;ml.m5.xlarge&lt;/code&gt; instance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prepare Tokens
&lt;/h3&gt;

&lt;p&gt;This example application requires preparing API Keys from several different websites to implement the application functions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;OpenAI API Key: Used to access the OpenAI Embeddings API. You can obtain the API Key from &lt;a href="https://platform.openai.com/account/api-keys"&gt;OpenAI&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;GitHub Access Token: Used to fetch documents from the GitHub repository. You can create a personal access token from &lt;a href="https://github.com/settings/tokens"&gt;GitHub&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Hugging Face Hub Token: Used to download the model from the Hugging Face Hub when deploying the model on AWS SageMaker. You can obtain the Token from &lt;a href="https://huggingface.co/settings/tokens"&gt;Hugging Face&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Create a Pluto Application
&lt;/h2&gt;

&lt;p&gt;If you are in a local or container environment, execute the following command interactively to create a new Pluto application. After completion, a new project directory will be created in the current directory with the project name.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto new
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cloud development environment already includes the basic project directory, so there is no need to create a new application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Application Dependencies
&lt;/h2&gt;

&lt;p&gt;After entering the project root directory, add the following dependency libraries to the &lt;code&gt;requirements.txt&lt;/code&gt; file. These libraries are the dependencies required for this example application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pluto_client
faiss-cpu
langchain-core
langchain-community
langchain-openai
langchain_text_splitters
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then execute the following two commands to install the dependency libraries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install
&lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Write Application Code
&lt;/h2&gt;

&lt;p&gt;After installing the dependencies, you can start writing the application program in the &lt;code&gt;app/main.py&lt;/code&gt; file. The example code for the document QA bot is attached at the end of this article. You can directly copy it into the &lt;code&gt;app/main.py&lt;/code&gt; file and then modify the configuration parameters as needed.&lt;/p&gt;

&lt;p&gt;Writing Pluto application code is similar to writing pure business code. Developers do not need to worry about how cloud resources are created and configured. They only need to define the resources required for the application and implement the application's business logic by creating objects in the code. When deploying, Pluto will automatically deduce the dependencies between cloud resources from the application logic and create and configure these resources on the cloud platform.&lt;br&gt;
For example, we can define a SageMaker endpoint with the Llama3 model deployed using the following code snippet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;sagemaker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SageMaker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3-model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.1-tgi2.0.0-gpu-py310-cu121-ubuntu22.04-v2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;SageMakerOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;instanceType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ml.g5.2xlarge&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;envs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HF_MODEL_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;meta-llama/Meta-Llama-3-8B-Instruct&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HF_TASK&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-generation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HUGGING_FACE_HUB_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;HUGGING_FACE_HUB_TOKEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In subsequent code, we can directly call the &lt;code&gt;sagemaker.invoke()&lt;/code&gt; method to trigger the SageMaker endpoint or use &lt;code&gt;sagemaker.endpointName&lt;/code&gt; to get the name of the SageMaker endpoint. Pluto will automatically set the correct permissions for the functions calling the SageMaker instance, allowing the SageMaker endpoint to be called.&lt;/p&gt;

&lt;p&gt;Similarly, creating resources like S3 and Lambda is the same. However, Pluto provides an abstraction layer for these common resource types provided by most cloud platforms to reduce developers' learning costs and facilitate the feature of deploying directly to multiple cloud platforms without modifying the code.&lt;br&gt;
In the code snippet below, the Function resource type corresponds to Lambda functions on AWS and to Knative Service on Kubernetes, while the Bucket resource type corresponds to S3 storage buckets on AWS, and the corresponding type on Kubernetes is expected to be PV, which has not been implemented yet. Everyone is welcome to contribute.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;vector_store_bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector-store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nc"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;FunctionOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Deploy the Application
&lt;/h2&gt;

&lt;p&gt;After completing the application code, you only need to execute the following command to directly deploy the application on AWS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The process of creating SageMaker may take a long time, even exceeding 20 minutes. Please be patient. After deployment, Pluto will return a URL address, which you can use to access your application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Test the Application
&lt;/h2&gt;

&lt;p&gt;This URL address accepts POST requests. The request body must be a JSON array containing only one element, corresponding to the parameter of the &lt;code&gt;query&lt;/code&gt; function in the code. You can use the &lt;code&gt;curl&lt;/code&gt; command or other HTTP client tools to interact with your application. For example, you can replace the &lt;code&gt;URL&lt;/code&gt; in the code snippet below with the URL address you received, and also replace the specific question with the one you want to ask. Then, execute the code to interact with your application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST &amp;lt;URL&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'["What is Pluto?"]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is a simple interactive script. You can save this script to a file and then execute the file to interact with your application, achieving the effect shown at the beginning of this article.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# set -o xtrace&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"NOTE: This QA bot cannot keep a conversation. It can only answer the question you just asked."&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;

&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"Input the URL that Pluto has outputted: "&lt;/span&gt; URL
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="nv"&gt;$URL&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Please set the BOT_URL env var first"&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Now you can ask your question!"&lt;/span&gt;
&lt;span class="nv"&gt;user_message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; :&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Press 'q' to quit."&lt;/span&gt;
    &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"User &amp;gt; "&lt;/span&gt; user_message
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nv"&gt;$user_message&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"q"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Bye. 👋"&lt;/span&gt;
        &lt;span class="nb"&gt;break
    &lt;/span&gt;&lt;span class="k"&gt;fi

    &lt;/span&gt;&lt;span class="nv"&gt;payload&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;jq &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="nt"&gt;--arg&lt;/span&gt; msg &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$user_message&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s1"&gt;'[$msg]'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-w&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;%{http_code}"&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$URL&lt;/span&gt;&lt;span class="s2"&gt;?n=1"&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$payload&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-type: application/json'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

    &lt;span class="nv"&gt;http_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-n1&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;response_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-n-1&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$response_content&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.body'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$response_content&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.code'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$http_code&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-ne&lt;/span&gt; 200 &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$code&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-ne&lt;/span&gt; 200 &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Server responded with error: &lt;/span&gt;&lt;span class="nv"&gt;$http_code&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="nv"&gt;$response_content&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="nb"&gt;exit &lt;/span&gt;1
    &lt;span class="k"&gt;fi

    &lt;/span&gt;&lt;span class="nv"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$response_content&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.body'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Bot  &amp;gt; &lt;/span&gt;&lt;span class="nv"&gt;$body&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Take Down the Application
&lt;/h2&gt;

&lt;p&gt;If you want to take down the application, you only need to execute the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto destroy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Extend the Application
&lt;/h2&gt;

&lt;p&gt;If you want to implement a &lt;strong&gt;session-based QA bot&lt;/strong&gt;, you can use the KVStore resource type to save the session. The example application "&lt;a href="https://pluto-lang.vercel.app/cookbook/langchain-llama2-chatbot-sagemaker-python"&gt;Session Chatbot&lt;/a&gt;" can be used as a reference.&lt;/p&gt;

&lt;p&gt;If you want to &lt;strong&gt;rewrite the application as a LangServe application&lt;/strong&gt; to use LangServe's RemoteRunable and Playground components, you can refer to the "&lt;a href="https://pluto-lang.vercel.app/cookbook/deploy-langserve-to-aws"&gt;Deploy LangServe to AWS&lt;/a&gt;" document.&lt;/p&gt;

&lt;h2&gt;
  
  
  More Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://pluto-lang.vercel.app/"&gt;Pluto Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/pluto-lang/pluto"&gt;Pluto GitHub Repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pluto-lang.vercel.app/cookbook"&gt;Pluto Cookbook&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Example Code
&lt;/h2&gt;

&lt;p&gt;Below is the example code for the document QA bot. You can copy it into the &lt;code&gt;app/main.py&lt;/code&gt; file and modify the configuration parameters as needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pluto_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FunctionOptions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Schedule&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pluto_client.sagemaker&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SageMaker&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SageMakerOptions&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.pydantic_v1&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SecretStr&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PromptTemplate&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.runnables&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RunnablePassthrough&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.output_parsers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StrOutputParser&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.vectorstores.faiss&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FAISS&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_text_splitters&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MarkdownTextSplitter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.document_loaders.github&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GithubFileLoader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.llms.sagemaker_endpoint&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;SagemakerEndpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;LLMContentHandler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="c1"&gt;# ====== Configuration ======
# 1. The OpenAI API key is used to access the OpenAI Embeddings API. You can get the API key from
# https://platform.openai.com/account/api-keys
# 2. The GitHub Access Key is used to fetch the documents from the GitHub repository. You can create
# a personal access token from https://github.com/settings/tokens
# 3. The Hugging Face Hub token is used to download the model from the Hugging Face Hub when
# deploying the model on AWS SageMaker. You can get the token from
# https://huggingface.co/settings/tokens
&lt;/span&gt;
&lt;span class="n"&gt;REPO&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pluto-lang/website&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;BRANCH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;DOC_RELATIVE_PATH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;OPENAI_BASE_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.openai.com/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;replace_with_your_key&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;GITHUB_ACCESS_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;replace_with_your_key&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;HUGGING_FACE_HUB_TOKEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;replace_with_your_key&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="c1"&gt;# ===========================
&lt;/span&gt;

&lt;span class="n"&gt;FAISS_INDEX&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;index&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;PKL_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;FAISS_INDEX&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.pkl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;FAISS_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;FAISS_INDEX&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.faiss&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;OPENAI_BASE_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;SecretStr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vector_store_bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector-store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Deploy the Llama3 model on AWS SageMaker using the Hugging Face Text Generation Inference (TGI)
container. If you&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re unable to deploy the model because of the instance type, consider using the
TinyLlama-1.1B-Chat-v1.0 model, which is compatible with the ml.m5.xlarge instance.

Below is a set up minimum requirements for each model size of Llama3 model:
Model      Instance Type      # of GPUs per replica
Llama 8B   ml.g5.2xlarge      1
Llama 70B  ml.p4d.24xlarge    8

The initial limit set for these instances is zero. If you need more, you can request an increase
in quota via the [AWS Management Console](https://console.aws.amazon.com/servicequotas/home).
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;sagemaker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SageMaker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3-model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.1-tgi2.0.0-gpu-py310-cu121-ubuntu22.04-v2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;SageMakerOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;instanceType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ml.g5.2xlarge&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;envs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HF_MODEL_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;meta-llama/Meta-Llama-3-8B-Instruct&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HF_TASK&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-generation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="c1"&gt;# If you want to deploy the Meta Llama3 model, you need to request a permission and
&lt;/span&gt;            &lt;span class="c1"&gt;# prepare the token. You can get the token from https://huggingface.co/settings/tokens
&lt;/span&gt;            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HUGGING_FACE_HUB_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;HUGGING_FACE_HUB_TOKEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ContentHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;LLMContentHandler&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;content_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;accepts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transform_input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;|eot_id|&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;|eot_id|&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;|eot_id|&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;input_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parameters&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;input_str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transform_output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# type: ignore
&lt;/span&gt;        &lt;span class="n"&gt;response_json&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response_json&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generated_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;assistant_beg_flag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&amp;lt;|end_header_id|&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;answerStartPos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_beg_flag&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_beg_flag&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;answerStartPos&lt;/span&gt;&lt;span class="p"&gt;:].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_logger&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setLevel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INFO&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Create a console handler
&lt;/span&gt;    &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;StreamHandler&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flush&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flush&lt;/span&gt;
    &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setLevel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INFO&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Create a formatter and add it to the handler
&lt;/span&gt;    &lt;span class="n"&gt;formatter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Formatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;%(asctime)s - %(name)s - %(levelname)s - %(message)s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setFormatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Add the handler to the logger
&lt;/span&gt;    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;


&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_logger&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_vector_store&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;FAISS&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Explicitly import Faiss to alert Pluto that this function relies on it, ensuring the inclusion
&lt;/span&gt;    &lt;span class="c1"&gt;# of the Faiss package in the deployment bundle.
&lt;/span&gt;    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;faiss&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;file_filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;DOC_RELATIVE_PATH&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/.*&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;.mdx?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="n"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GithubFileLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;REPO&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;branch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BRANCH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;access_token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;GITHUB_ACCESS_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;github_api_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.github.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;file_filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;file_filter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No documents updated&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Loaded &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Starting to split documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MarkdownTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;splits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Starting to create vector store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FAISS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;splits&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Finished creating vector store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;download_vector_store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;ensure_dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;vector_store_bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PKL_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PKL_KEY&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;vector_store_bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FAISS_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FAISS_KEY&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;flush_vector_store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/tmp/vector_store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_vector_store&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;

    &lt;span class="nf"&gt;ensure_dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save_local&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FAISS_INDEX&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;vector_store_bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PKL_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PKL_KEY&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;vector_store_bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FAISS_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FAISS_KEY&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_retriever&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;vector_store_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/tmp/vector_store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Vector store not found, downloading...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;download_vector_store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to download vector store: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;flush_vector_store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Loading vector store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;vectorstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FAISS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_local&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;vector_store_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;allow_dangerous_deserialization&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Vector store loaded&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ensure_dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;makedirs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_aws_region&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;aws_region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_REGION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;aws_region&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_REGION environment variable must be set&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;aws_region&lt;/span&gt;


&lt;span class="c1"&gt;# Leaving the following variable outside the handler function will allow them to be reused across
# multiple invocations of the function.
&lt;/span&gt;&lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_retriever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Create the prompt template in accordance with the structure provided in the Llama3 documentation,
# which can be found at https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/
&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;|begin_of_text|&amp;gt;&amp;lt;|start_header_id|&amp;gt;system&amp;lt;|end_header_id|&amp;gt;

You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t know the answer, just say that you don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t know. Use three sentences maximum and keep the answer concise. In case the query requests a link, respond that you don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t support links.
Context: {context}&amp;lt;|eot_id|&amp;gt;&amp;lt;|start_header_id|&amp;gt;user&amp;lt;|end_header_id|&amp;gt;

{question}&amp;lt;|eot_id|&amp;gt;&amp;lt;|start_header_id|&amp;gt;assistant&amp;lt;|end_header_id|&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SagemakerEndpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;endpoint_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sagemaker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;endpoint_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# SageMaker endpoint name
&lt;/span&gt;    &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;get_aws_region&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;content_handler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;ContentHandler&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_new_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;do_sample&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;top_p&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;format_docs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;rag_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;format_docs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;RunnablePassthrough&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
        &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;
        &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;
        &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nc"&gt;StrOutputParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;rag_chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# The line below serves as a notification to Pluto that the function will trigger SageMaker
&lt;/span&gt;    &lt;span class="c1"&gt;# endpoint. So, Pluto will set the appropriate permissions for the function.
&lt;/span&gt;    &lt;span class="n"&gt;sagemaker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;schd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Schedule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schedule&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;schd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cron&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0 0 * * *&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flush_vector_store&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# This application requires a minimum of 256MB memory to run.
&lt;/span&gt;&lt;span class="nc"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;FunctionOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Q&amp;amp;A
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why Not Use Api Gateway?
&lt;/h3&gt;

&lt;p&gt;There are two reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;ApiGateway has an inalterable 30-second timeout limit. This means if the generation process exceeds this time window, we would receive a &lt;code&gt;503 Service Unavailable&lt;/code&gt; error. Therefore, we use Lambda functions directly to handle requests. We will attempt to improve the experience by supporting WebSocket in the future.&lt;/li&gt;
&lt;li&gt;The application example requires more than the default 128MB memory. As for the routing functions for ApiGateway, Pluto currently does not support setting the memory size. For the Function resource type corresponding to AWS Lambda, the memory size can be set.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>llama3</category>
      <category>aws</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Rethinking a Cloud-Native Application Development Paradigm</title>
      <dc:creator>jianzs@pluto-lang</dc:creator>
      <pubDate>Tue, 02 Jan 2024 11:45:00 +0000</pubDate>
      <link>https://forem.com/jianzs/rethinking-a-cloud-native-application-development-paradigm-4jik</link>
      <guid>https://forem.com/jianzs/rethinking-a-cloud-native-application-development-paradigm-4jik</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: The existing development model for cloud-native applications faces complexity and orchestration challenges in the FaaS environment. This article introduces a novel development paradigm that combines Monolithic Programming with Compile-Time Splitting to streamline cloud-based execution and enhance development efficiency. The concept involves leveraging a compiler to automatically partition monolithic application code, enabling distributed execution on cloud infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Cloud-native applications are conventionally identified as those designed and nurtured on cloud infrastructure. Such applications, rooted in cloud technologies, skillfully benefit from the cloud's features of autoscaling and high reliability. This offers robust support for individual developers and SMEs, liberating them from wrestling with the complexities of infrastructure management—an undoubtedly attractive proposition.&lt;/p&gt;

&lt;h3&gt;
  
  
  Present Scenario
&lt;/h3&gt;

&lt;p&gt;So, how are cloud applications currently being developed? References to cloud-native applications usually conjure images of containers and microservices, both of which are intrinsic to the CNCF's definition of the term. A typical development workflow might encompass writing application code at a microservices level, packaging it into containers, and deploying it onto a PaaS platform.&lt;/p&gt;

&lt;p&gt;However, as cloud technologies evolve, Function as a Service (FaaS) has emerged as a pivotal cloud component. Compared to PaaS, FaaS integrates more intimately with cloud capabilities, offering superior performance in scaling and cold starts. For instance, AWS Lambda has achieved cold start latencies on the order of hundreds of milliseconds&lt;/p&gt;

&lt;h3&gt;
  
  
  Hurdles
&lt;/h3&gt;

&lt;p&gt;But does the traditional development methodology of microservices and containers still prove effective for FaaS-based cloud-native applications? Consider the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Services encapsulated by functions, often called NanoServices, possess a finer granularity than microservices. The process of packaging and deploying each function individually can be &lt;strong&gt;tedious&lt;/strong&gt;, despite the command-line tools provided by cloud vendors.&lt;/li&gt;
&lt;li&gt;Dividing an application into numerous functions, each functionally isolated, necessitates inter-function communication via SDKs. This results in a development experience that lacks &lt;strong&gt;consistency&lt;/strong&gt;, with frequent context-switching among functions.&lt;/li&gt;
&lt;li&gt;Orchestrating functions entails a &lt;strong&gt;steep learning curve&lt;/strong&gt;, demanding familiarity with the cloud's event mechanisms and orchestration tools.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Essentially, the development model directly based on FaaS falls short of ideal, posing substantial challenges in effectively managing and coordinating functions. It's only natural to aspire for an improved application development experience. To this end, a fresh concept is proposed: &lt;strong&gt;Monolithic Programming, Compile-Time Splitting, and Distributed Execution&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--cKJ8c09W--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6js9vbvfpgrlv2hn9wxo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--cKJ8c09W--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6js9vbvfpgrlv2hn9wxo.png" alt="principle" width="800" height="271"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conceptual Analogies
&lt;/h2&gt;

&lt;p&gt;This concept might strike a chord with the parallel programming framework OpenMP. OpenMP leverages the compiler to inject multithreading code for parallel execution within code segments earmarked for concurrency. Similarly, this novel concept uses the compiler to identify and extract code regions capable of independent computation, while the cloud infrastructure assumes the responsibility of managing the distributed execution.&lt;/p&gt;

&lt;p&gt;Frameworks like MapReduce, Spark, and the increasingly popular Ray, all operating in the realm of cloud computing, target large-scale distributed computing. The notable difference here lies in the underlying runtime implementation. While these frameworks construct their runtime environments to support the distributed execution of specific task types, the proposed concept harnesses FaaS provided by cloud infrastructure as a uniform underlying environment to enable general computation for a range of cloud-native applications. This approach presents the potential for tighter integration with cloud capabilities and accommodates diverse workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Concept
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why "Monolithic Programming"?
&lt;/h3&gt;

&lt;p&gt;Developing a monolithic application is a remarkably seamless experience. With all context residing within a single project, tools like linters, formatters, and IDE plugins can verify variable dependencies and function calls prior to execution.&lt;/p&gt;

&lt;h3&gt;
  
  
  How is "Compile-time Splitting" Achievable?
&lt;/h3&gt;

&lt;p&gt;Files containing code devoid of programming constraints are challenging to split. Nevertheless, by defining keywords, special classes, or functions, the compiler can be directed to partition the code.&lt;/p&gt;

&lt;p&gt;Consider the code snippet below. The &lt;code&gt;Function&lt;/code&gt; class can be viewed as a special construct, where the function definition passed to its constructor is analyzed and extracted into a standalone computational module. Of course, real-world implementations would necessitate addressing many more nuances.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Function&lt;/span&gt;  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(...&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* implementation details */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The sum of 1 + 1 is &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Crucially, once computational modules are demarcated, the original code file excluding these modules should also be treated as a computational module. This module serves as the application's entrypoint, akin to the main function in a monolithic app, orchestrating the entire application logic.&lt;/p&gt;

&lt;p&gt;Thus, we achieve the development experience of monolithic programming, paired with the capability for cloud-based distributed execution through compile-time splitting. The outcome of this development approach is an application that thrives directly on the cloud infrastructure, embodying a true cloud-native app.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example
&lt;/h2&gt;

&lt;p&gt;Take an example program based on this concept. The program applies the Monte Carlo method to compute Pi. The logic is straightforward: spawn 10 Workers, each conducting a million samples, then accumulate the results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;calculatePi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;iterations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;insideCircle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;iterations&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;y&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;y&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;insideCircle&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;piEstimate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;insideCircle&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;iterations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;piEstimate&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;workerCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;iterationsPerWorker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;piPromises&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;workerCount&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;piPromises&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;calculatePi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;iterationsPerWorker&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;piResults&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;piPromises&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;piSum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;piResults&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;piSum&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;workerCount&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Estimated value of π: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;pi&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--c7BXSNs_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8ngr64z0don35jyfeiju.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--c7BXSNs_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8ngr64z0don35jyfeiju.png" alt="example" width="800" height="219"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The execution of this code is expected to pan out as depicted above:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;During the compilation stage, two computational modules are extracted: one for &lt;code&gt;calculatePi&lt;/code&gt;, and another for the main code, excluding &lt;code&gt;calculatePi&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;These modules are subsequently deployed as separate FaaS resource instances.&lt;/li&gt;
&lt;li&gt;Upon deployment, invoking the instance corresponding to the main code yields the output results from the logs.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For more elaborate examples, consider the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/pluto-lang/pluto/issues/108"&gt;Big Data Scenario: Word Count&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/pluto-lang/pluto/issues/109"&gt;Web Scenario&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://github.com/pluto-lang/pluto"&gt;Pluto&lt;/a&gt; is committed to further exploring this concept throughout 2024, with a focus on leveraging static analysis and Infrastructure as Code (IaC) to facilitate implementation. If you find this concept intriguing or have relevant use cases, we encourage you to connect with us.&lt;/p&gt;

&lt;p&gt;If you support this idea, please star the &lt;a href="https://github.com/pluto-lang/pluto"&gt;project&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/cncf/toc/blob/main/DEFINITION.md"&gt;CNCF Cloud Native Definition v1.0&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/lambda/latest/operatorguide/execution-environments.html"&gt;Lambda execution environments - AWS Lambda&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/pluto-lang/pluto"&gt;Pluto | GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pluto-lang.vercel.app"&gt;Pluto Website&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>webdev</category>
      <category>serverless</category>
      <category>cloudnative</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Building Cloud-Native Applications Made Easy with Pluto: A Guide for Developers</title>
      <dc:creator>jianzs@pluto-lang</dc:creator>
      <pubDate>Wed, 22 Nov 2023 15:13:13 +0000</pubDate>
      <link>https://forem.com/jianzs/building-cloud-native-applications-made-easy-with-pluto-a-guide-for-developers-3l9</link>
      <guid>https://forem.com/jianzs/building-cloud-native-applications-made-easy-with-pluto-a-guide-for-developers-3l9</guid>
      <description>&lt;p&gt;Developers define variables in their code, and &lt;a href="https://github.com/pluto-lang/pluto" rel="noopener noreferrer"&gt;Pluto&lt;/a&gt; takes care of automatically creating and managing the required cloud resource components based on those variables. This simplifies the process of deploying and managing cloud infrastructure, enabling developers to make better use of the cloud.&lt;/p&gt;

&lt;p&gt;In the context, cloud resources do not refer to Infrastructure as a Service (IaaS), but rather to managed resource components such as Backend as a Service (BaaS) and Function as a Service (FaaS). These managed components generally provide enhanced reliability and cost-effectiveness compared to building and managing your own instances.&lt;/p&gt;

&lt;p&gt;In this article, we will guide you through the steps of getting started with Pluto and help you become familiar with its features.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://nodejs.org/en/" rel="noopener noreferrer"&gt;Node.js&lt;/a&gt;: Pluto supports writing cloud applications using TypeScript.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.pulumi.com/docs/install/" rel="noopener noreferrer"&gt;Pulumi&lt;/a&gt;: Pluto uses Pulumi to interact with cloud platforms (such as AWS or Kubernetes) and deploy cloud resources.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pluto CLI
&lt;/h3&gt;

&lt;p&gt;The Pluto command-line tool is distributed via &lt;a href="https://www.npmjs.com/" rel="noopener noreferrer"&gt;npm&lt;/a&gt;. Install it by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @plutolang/cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify your installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Hello, Pluto
&lt;/h2&gt;

&lt;p&gt;Now, let's get started with your first Pluto program.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create your project
&lt;/h3&gt;

&lt;p&gt;Create a Pluto project using the Pluto CLI by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto new
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command will interactively create a project and create a directory with the provided project name. Here's an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ pluto new
? Project name hello-pluto
? Stack name dev
? Select a platform AWS
? Select an IaC engine Pulumi
Info:  Created a project, hello-pluto
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After you have created the new project, go to the project root directory and install the dependencies.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; &amp;lt;project_root&amp;gt;  
npm &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Write your business logic
&lt;/h3&gt;

&lt;p&gt;Use your preferred code editor to write the following code in &lt;code&gt;&amp;lt;project_root&amp;gt;/src/index.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;KVStore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;CloudEvent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;HttpRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;HttpResponse&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@plutolang/pluto&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;router&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;queue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;kvstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;KVStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;kvstore&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Publish the access time to the queue, and respond with the last access time.&lt;/span&gt;
&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/access&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;HttpRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Anonym&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;accessAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lastAccess&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;kvstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;respMsg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;lastAccess&lt;/span&gt;
    &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s2"&gt;`Hello, &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;! The last access was at &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;lastAccess&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Hello, &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;!`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;respMsg&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Subscribe to messages in the queue and store them in the KV database.&lt;/span&gt;
&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;evt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CloudEvent&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;evt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;kvstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;accessAt&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/http%3A%2F%2Fcdn.zhengsj.cn%2Fob-1700630175532.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/http%3A%2F%2Fcdn.zhengsj.cn%2Fob-1700630175532.png" alt="arch"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This code includes 3 resource variables and 2 processes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An HTTP service called "router" that accepts &lt;code&gt;/access&lt;/code&gt; HTTP requests. It publishes the access time to the message queue "queue" and retrieves the last access time from the KV database "kvstore" and returns it in the response.&lt;/li&gt;
&lt;li&gt;A message queue named "queue" with a subscriber that saves messages from the queue to the KV database "kvstore".&lt;/li&gt;
&lt;li&gt;A KV database named "kvstore" used to store users' last access time.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Deploy your application
&lt;/h3&gt;

&lt;p&gt;To deploy your application to the cloud platform you configured initially, run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you specified AWS as the cloud platform, make sure the &lt;code&gt;AWS_REGION&lt;/code&gt; environment variable is correctly configured, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/http%3A%2F%2Fcdn.zhengsj.cn%2Fob-1700630203893.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/http%3A%2F%2Fcdn.zhengsj.cn%2Fob-1700630203893.png" alt="aws-resource"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Pluto will create 3 resource components and 2 function objects on the specified cloud platform. For example, if you chose AWS, it will create:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An ApiGateway named "router"&lt;/li&gt;
&lt;li&gt;An SNS named "queue"&lt;/li&gt;
&lt;li&gt;A DynamoDB named "kvstore"&lt;/li&gt;
&lt;li&gt;Two Lambda functions starting with "function"&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Multi-platform deployment
&lt;/h4&gt;

&lt;p&gt;If you want to deploy to another cloud platform, you can create a new stack and specify the stack during deployment. Here are the steps:&lt;/p&gt;

&lt;p&gt;Create a new stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto stack new 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Specify the stack during deployment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pluto deploy &lt;span class="nt"&gt;--stack&lt;/span&gt; &amp;lt;new_stack&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  More Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Example: &lt;a href="https://github.com/pluto-lang/pluto/tree/main/examples/chat-bot" rel="noopener noreferrer"&gt;Command-line chatbot based on OpenAI&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Example: &lt;a href="https://github.com/pluto-lang/pluto/tree/main/examples/daily-joke-slack" rel="noopener noreferrer"&gt;Share a daily computer joke on Slack&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Repository: &lt;a href="https://github.com/pluto-lang/pluto" rel="noopener noreferrer"&gt;Pluto | GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pluto's main approach is to leverage techniques such as static program analysis and infrastructure as code to &lt;strong&gt;automatically create cloud resource components on the cloud platform by defining a variable&lt;/strong&gt;. The goal of Pluto is to assist individual developers in building cloud-native applications with ease and reduce the learning curve associated with cloud capabilities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Please note that Pluto is still in its early stages, and we welcome contributions from interested developers. If you are using AWS or Kubernetes, you can provide us with your requirements. We also appreciate any ideas or suggestions you may have, as they can be implemented in future versions. Feel free to join our &lt;a href="https://join.slack.com/t/plutolang/shared_invite/zt-25gztklfn-xOJ~Xvl4EjKJp1Zn1NNpiw" rel="noopener noreferrer"&gt;Slack&lt;/a&gt; community&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>typescript</category>
      <category>tutorial</category>
      <category>serverless</category>
      <category>cloudnative</category>
    </item>
  </channel>
</rss>
