<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Hexmos</title>
    <description>The latest articles on Forem by Hexmos (@hexmos).</description>
    <link>https://forem.com/hexmos</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F8372%2F7c5608e9-6f35-4925-8d50-f3f7dae28e2d.png</url>
      <title>Forem: Hexmos</title>
      <link>https://forem.com/hexmos</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/hexmos"/>
    <language>en</language>
    <item>
      <title>LiveAPI Devlogs Part 3: Transforming User Onboarding with 3 Industry-Inspired Methods</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Sun, 22 Dec 2024 14:14:43 +0000</pubDate>
      <link>https://forem.com/hexmos/liveapi-devlogs-part-3-transforming-user-onboarding-with-3-industry-inspired-methods-g4f</link>
      <guid>https://forem.com/hexmos/liveapi-devlogs-part-3-transforming-user-onboarding-with-3-industry-inspired-methods-g4f</guid>
      <description>&lt;p&gt;Here, we will explore the new updates and improvements in &lt;a href="https://hexmos.com/liveapi" rel="noopener noreferrer"&gt;LiveAPI&lt;/a&gt; from the past few weeks.&lt;/p&gt;

&lt;p&gt;Developers rarely have time to document their code properly. Projects are rushed to meet deadlines, which makes documenting the codebase challenging.&lt;/p&gt;

&lt;p&gt;While Swagger helps, it's not a complete solution; it has considerable setup and maintenance costs. Additionally, an exposed Swagger endpoint can introduce security threats. &lt;a href="https://hexmos.com/liveapi" rel="noopener noreferrer"&gt;LiveAPI&lt;/a&gt; addresses all these issues. &lt;/p&gt;

&lt;p&gt;LiveAPI delivers a &lt;strong&gt;Super-Convenient&lt;/strong&gt; solution by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using AI to automate documentation.&lt;/li&gt;
&lt;li&gt;Staying up-to-date with revisions and interacting with APIs through widgets.&lt;/li&gt;
&lt;li&gt;Ensuring end-to-end code security.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even though all these features have been built and are in use, new users may struggle to use everything without proper guidance.&lt;/p&gt;

&lt;p&gt;So we were faced with a new problem: &lt;strong&gt;how do we onboard users without friction?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In this article, we will demonstrate how we addressed these issues using &lt;strong&gt;three methods&lt;/strong&gt; inspired by other products.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6pj5qj825yl6kh3865xa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6pj5qj825yl6kh3865xa.png" alt="alt text" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Difficulty We Faced with Onboarding New Users
&lt;/h3&gt;

&lt;p&gt;Specifically, we were facing three issues.&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Difficulty in Getting Users to Sign up and Use the Product
&lt;/h4&gt;

&lt;p&gt;A mandatory signup alone can be a barrier for new users getting familiar with the product, as they will look for simpler ways to try it.&lt;/p&gt;

&lt;p&gt;So we had to think of a solution that would get people to use the product in the easiest way possible, without barriers.&lt;/p&gt;

&lt;h4&gt;
  
  
  2. Difficulty of Keeping the Users Engaged
&lt;/h4&gt;

&lt;p&gt;Even if the user signs up, if there is no proper guidance for beginners, they won't know what to do and may eventually drop off.&lt;/p&gt;

&lt;p&gt;So we needed some beginner-level activities that help the user understand the product and stay engaged.&lt;/p&gt;

&lt;h4&gt;
  
  
  3. Difficulty of Explaining the Features
&lt;/h4&gt;

&lt;p&gt;Even after signing up, with some guidance on what to do, some users won't fully understand the written instructions and will find the product difficult to use.&lt;/p&gt;

&lt;p&gt;In such cases we need a proper tour of the UI, highlighting the areas to interact with and showing how to use them.&lt;/p&gt;

&lt;p&gt;To address these problems, let's explore how existing products in the market handle them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Case Study 1: How Picwordify Gets Users to Sign Up
&lt;/h3&gt;

&lt;p&gt;Continue &lt;a href="https://journal.hexmos.com/liveapi-devlogs03/" rel="noopener noreferrer"&gt;reading the article here&lt;/a&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>webdev</category>
      <category>programming</category>
      <category>learning</category>
    </item>
    <item>
      <title>From Lama2 to LiveAPI: Building Super-Convenient API Documentation (Part II)</title>
      <dc:creator>Athreya aka Maneshwar</dc:creator>
      <pubDate>Sat, 14 Dec 2024 19:40:48 +0000</pubDate>
      <link>https://forem.com/hexmos/from-lama2-to-liveapi-building-super-convenient-api-documentation-part-ii-1p2a</link>
      <guid>https://forem.com/hexmos/from-lama2-to-liveapi-building-super-convenient-api-documentation-part-ii-1p2a</guid>
      <description>&lt;p&gt;&lt;em&gt;Hello, I'm Maneshwar. I'm building git-lrc, an AI code reviewer that runs on every commit. It is free, unlimited, and source-available on Github. &lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;Star Us&lt;/a&gt; to help devs discover the project. Do give it a try and share your feedback for improving the product.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hello, I'm Maneshwar. I'm working on &lt;a href="https://hexmos.com/freedevtools/" rel="noopener noreferrer"&gt;FreeDevTools online&lt;/a&gt;, currently building &lt;strong&gt;one place for all dev tools, cheat codes, and TLDRs&lt;/strong&gt; — a free, open-source hub where developers can quickly find and use tools without the hassle of searching all over the internet.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In my &lt;a href="https://dev.to/lovestaco/how-a-hobby-api-collection-and-execution-tool-is-evolving-into-a-product-299n"&gt;previous post&lt;/a&gt;, I shared how a small team of students working part-time built Lama2—a tool that simplified API collection and execution. &lt;/p&gt;

&lt;p&gt;It quickly became an essential part of our workflow, but as our API repositories grew, Lama2's manual process started showing its limits.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Challenges of Scaling &lt;a href="https://hexmos.com/lama2/index.html" rel="noopener noreferrer"&gt;Lama2&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;When we started, our team consisted of five students juggling part-time work and studies.  &lt;/p&gt;

&lt;p&gt;We worked 3-4 hours daily, often pushing the boundaries of our limited capabilities. Lama2 was just one of three projects we were building at the time.  &lt;/p&gt;

&lt;p&gt;Despite our constraints, &lt;strong&gt;Lama2 received a good reception on &lt;a href="https://news.ycombinator.com/item?id=34206333" rel="noopener noreferrer"&gt;Hacker News&lt;/a&gt;&lt;/strong&gt;. We even gained some early advocates for the product. For a CLI tool and niche language, it was a solid response.  &lt;/p&gt;

&lt;p&gt;However, shipping features still took us longer than we hoped. By the time we were ready to compete, the market for API clients was already crowded.  &lt;/p&gt;

&lt;p&gt;Established teams working full-time on similar products gained traction through their hard work and outreach. While Lama2 solved real problems, it didn’t generate the widespread buzz we had envisioned.  &lt;/p&gt;

&lt;p&gt;We realized that for Lama2 to make a real impact, it needed more than just execution tools.  &lt;/p&gt;
&lt;h2&gt;
  
  
  The Problem with Manual API Documentation
&lt;/h2&gt;

&lt;p&gt;Even with Lama2, maintaining large API collections was daunting. Initially, collecting APIs in a single repository for all services felt manageable. But as we scaled to four backends and hundreds of APIs, the process became overwhelming.&lt;/p&gt;

&lt;p&gt;We knew firsthand how frustrating it was to manually document and sync API changes. And we weren’t alone—every developer faces this challenge when dealing with large API collections.&lt;/p&gt;
&lt;h2&gt;
  
  
  A Vision for Automation
&lt;/h2&gt;

&lt;p&gt;We knew we needed to automate the workflow, making API documentation effortless and execution seamless. Our goal was to eliminate manual steps and create a tool that could:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Automatically document APIs as code is merged, without requiring meta tags or other annotations&lt;/li&gt;
&lt;li&gt;Keep documentation updated with every change&lt;/li&gt;
&lt;li&gt;Allow anyone in the organization to execute APIs with ease&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Our goal was simple: &lt;strong&gt;"&lt;a href="https://hexmos.com/liveapi/#pricing" rel="noopener noreferrer"&gt;Super-Convenient API Documentation&lt;/a&gt;."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Imagine a system where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input&lt;/strong&gt;: A repository link&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output&lt;/strong&gt;: Fully documented APIs that stay updated with every commit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfvhduzcnfwz8fhq4z42.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfvhduzcnfwz8fhq4z42.png" alt="Swagger Alternatives" width="800" height="342"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Building &lt;a href="https://hexmos.com/liveapi/" rel="noopener noreferrer"&gt;LiveAPI&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;To bring this vision to life, we started developing &lt;strong&gt;LiveAPI&lt;/strong&gt;, a platform designed with the following key features:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;&lt;a href="https://hexmos.com/liveapi/#features:~:text=Github%20and%20Gitlab%20integration" rel="noopener noreferrer"&gt;One-Click Repository Connection&lt;/a&gt;:&lt;/strong&gt; Developers could connect their GitHub, GitLab, or Bitbucket repository effortlessly.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;&lt;a href="https://hexmos.com/liveapi/#features:~:text=%3C%3C-,Code%20to%20Docs,-We%20use%20your" rel="noopener noreferrer"&gt;Automated Documentation Generation&lt;/a&gt;:&lt;/strong&gt; Documentation would be generated automatically for every commit, with auto-syncing to keep it up-to-date.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://hexmos.com/liveapi/#features:~:text=Execute%20APIs%20and%20Code%20Generation" rel="noopener noreferrer"&gt;Automated Code Snippets&lt;/a&gt;:&lt;/strong&gt; Generate code snippets for any language, enabling frontend developers to move faster. &lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Developer-Friendly Experience:&lt;/strong&gt; Minimal setup, maximum convenience.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LiveAPI Runner with Privacy First:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;We never store your repository’s code.&lt;/li&gt;
&lt;li&gt;Using our logic, we extract only routes and API validators.&lt;/li&gt;
&lt;li&gt;This entire process runs on your private server, ensuring your data never leaves your infrastructure.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Spreading the Word
&lt;/h3&gt;

&lt;p&gt;After months of work, LiveAPI is ready. We built a tool that could take the pain out of managing and documenting APIs, enabling teams to focus on building features rather than wrangling documentation. &lt;/p&gt;


&lt;p&gt;&lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyzvpkxm9mga1pweneahx.png" alt="git-lrc" width="800" height="109"&gt;&lt;/a&gt; &lt;br&gt;
 &lt;em&gt;AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Any feedback or contributors are welcome! It's online, source-available, and ready for anyone to use. &lt;/p&gt;

&lt;p&gt;⭐ Star it on GitHub: &lt;br&gt;
 &lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/HexmosTech" rel="noopener noreferrer"&gt;
        HexmosTech
      &lt;/a&gt; / &lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;
        git-lrc
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Free, Unlimited AI Code Reviews That Run on Commit
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div&gt;
&lt;p&gt;| &lt;a href="https://github.com/HexmosTech/git-lrc/readme/README.da.md" rel="noopener noreferrer"&gt;🇩🇰 Dansk&lt;/a&gt; | &lt;a href="https://github.com/HexmosTech/git-lrc/readme/README.es.md" rel="noopener noreferrer"&gt;🇪🇸 Español&lt;/a&gt; | &lt;a href="https://github.com/HexmosTech/git-lrc/readme/README.fa.md" rel="noopener noreferrer"&gt;🇮🇷 Farsi&lt;/a&gt; | &lt;a href="https://github.com/HexmosTech/git-lrc/readme/README.fi.md" rel="noopener noreferrer"&gt;🇫🇮 Suomi&lt;/a&gt; | &lt;a href="https://github.com/HexmosTech/git-lrc/readme/README.ja.md" rel="noopener noreferrer"&gt;🇯🇵 日本語&lt;/a&gt; | &lt;a href="https://github.com/HexmosTech/git-lrc/readme/README.nn.md" rel="noopener noreferrer"&gt;🇳🇴 Norsk&lt;/a&gt; | &lt;a href="https://github.com/HexmosTech/git-lrc/readme/README.pt.md" rel="noopener noreferrer"&gt;🇵🇹 Português&lt;/a&gt; | &lt;a href="https://github.com/HexmosTech/git-lrc/readme/README.ru.md" rel="noopener noreferrer"&gt;🇷🇺 Русский&lt;/a&gt; | &lt;a href="https://github.com/HexmosTech/git-lrc/readme/README.sq.md" rel="noopener noreferrer"&gt;🇦🇱 Shqip&lt;/a&gt; | &lt;a href="https://github.com/HexmosTech/git-lrc/readme/README.zh.md" rel="noopener noreferrer"&gt;🇨🇳 中文&lt;/a&gt; |&lt;/p&gt;
&lt;br&gt;
&lt;br&gt;
&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/948c8f2d5cf41b48985cd364d48c3a2dc9bfbfd42eab3e0a9a1b3e61f5f17ce3/68747470733a2f2f6865786d6f732e636f6d2f66726565646576746f6f6c732f7075626c69632f6c725f6c6f676f2e737667"&gt;&lt;img width="60" alt="git-lrc logo" src="https://camo.githubusercontent.com/948c8f2d5cf41b48985cd364d48c3a2dc9bfbfd42eab3e0a9a1b3e61f5f17ce3/68747470733a2f2f6865786d6f732e636f6d2f66726565646576746f6f6c732f7075626c69632f6c725f6c6f676f2e737667"&gt;&lt;/a&gt;
&lt;br&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;git-lrc&lt;/h1&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Free, Unlimited AI Code Reviews That Run on Commit&lt;/h2&gt;
&lt;/div&gt;
&lt;br&gt;
&lt;br&gt;
&lt;p&gt;&lt;a href="https://www.producthunt.com/products/git-lrc?embed=true&amp;amp;utm_source=badge-top-post-badge&amp;amp;utm_medium=badge&amp;amp;utm_campaign=badge-git-lrc" rel="nofollow noopener noreferrer"&gt;&lt;img alt="git-lrc - Free, unlimited AI code reviews that run on commit | Product Hunt" width="200" src="https://camo.githubusercontent.com/87bf2d4283c1e0aa99e254bd17fefb1c67c0c0d39300043a243a4aa633b6cecc/68747470733a2f2f6170692e70726f6475637468756e742e636f6d2f776964676574732f656d6265642d696d6167652f76312f746f702d706f73742d62616467652e7376673f706f73745f69643d31303739323632267468656d653d6c6967687426706572696f643d6461696c7926743d31373731373439313730383638"&gt;&lt;/a&gt;
 &lt;/p&gt;
&lt;br&gt;
&lt;a href="https://goreportcard.com/report/github.com/HexmosTech/git-lrc" rel="nofollow noopener noreferrer"&gt;&lt;img alt="Go Report Card" src="https://camo.githubusercontent.com/e74c0651c3ee9165a2ed01cb0f6842c494029960df30eb9c24cf622d3d21bf46/68747470733a2f2f676f7265706f7274636172642e636f6d2f62616467652f6769746875622e636f6d2f4865786d6f73546563682f6769742d6c7263"&gt;&lt;/a&gt; &lt;a href="https://github.com/HexmosTech/git-lrc/actions/workflows/gitleaks.yml" rel="noopener noreferrer"&gt;&lt;img alt="gitleaks.yml" title="gitleaks.yml: Secret scanning workflow" src="https://github.com/HexmosTech/git-lrc/actions/workflows/gitleaks.yml/badge.svg"&gt;&lt;/a&gt; &lt;a href="https://github.com/HexmosTech/git-lrc/actions/workflows/osv-scanner.yml" rel="noopener noreferrer"&gt;&lt;img alt="osv-scanner.yml" title="osv-scanner.yml: Dependency vulnerability scan" src="https://github.com/HexmosTech/git-lrc/actions/workflows/osv-scanner.yml/badge.svg"&gt;&lt;/a&gt; &lt;a href="https://github.com/HexmosTech/git-lrc/actions/workflows/govulncheck.yml" rel="noopener noreferrer"&gt;&lt;img alt="govulncheck.yml" title="govulncheck.yml: Go vulnerability check" src="https://github.com/HexmosTech/git-lrc/actions/workflows/govulncheck.yml/badge.svg"&gt;&lt;/a&gt; &lt;a href="https://github.com/HexmosTech/git-lrc/actions/workflows/semgrep.yml" rel="noopener noreferrer"&gt;&lt;img alt="semgrep.yml" title="semgrep.yml: Static analysis security scan" src="https://github.com/HexmosTech/git-lrc/actions/workflows/semgrep.yml/badge.svg"&gt;&lt;/a&gt; &lt;a rel="noopener noreferrer" href="https://github.com/HexmosTech/git-lrc/./gfx/dependabot-enabled.svg"&gt;&lt;img alt="dependabot-enabled" title="dependabot-enabled: Automated dependency updates are enabled" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2FHexmosTech%2Fgit-lrc%2F.%2Fgfx%2Fdependabot-enabled.svg"&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;br&gt;
&lt;br&gt;

&lt;p&gt;AI agents write code fast. They also &lt;em&gt;silently remove logic&lt;/em&gt;, change behavior, and introduce bugs -- without telling you. You often find out in production.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;git-lrc&lt;/code&gt; fixes this.&lt;/strong&gt; It hooks into &lt;code&gt;git commit&lt;/code&gt; and reviews every diff &lt;em&gt;before&lt;/em&gt; it lands. 60-second setup. Completely free.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;See It In Action&lt;/h2&gt;
&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;See git-lrc catch serious security issues such as leaked credentials, expensive cloud
operations, and sensitive material in log statements&lt;/p&gt;
&lt;/blockquote&gt;

  
    
    

    &lt;span class="m-1"&gt;git-lrc-intro-60s.mp4&lt;/span&gt;
    
  

  

  


&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Why&lt;/h2&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;🤖 &lt;strong&gt;AI agents silently break things.&lt;/strong&gt; Code removed. Logic changed. Edge cases gone. You won't notice until production.&lt;/li&gt;
&lt;li&gt;🔍 &lt;strong&gt;Catch it before it ships.&lt;/strong&gt; AI-powered inline comments show you &lt;em&gt;exactly&lt;/em&gt; what changed and what looks wrong.&lt;/li&gt;
&lt;li&gt;🔁 &lt;strong&gt;Build a&lt;/strong&gt;…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>programming</category>
      <category>beginners</category>
    </item>
    <item>
      <title>How a Hobby API Collection and Execution Tool is Evolving into a Product</title>
      <dc:creator>Athreya aka Maneshwar</dc:creator>
      <pubDate>Thu, 12 Dec 2024 18:38:38 +0000</pubDate>
      <link>https://forem.com/hexmos/how-a-hobby-api-collection-and-execution-tool-is-evolving-into-a-product-299n</link>
      <guid>https://forem.com/hexmos/how-a-hobby-api-collection-and-execution-tool-is-evolving-into-a-product-299n</guid>
      <description>&lt;p&gt;In any startup, managing APIs across multiple services is a common challenge. &lt;/p&gt;

&lt;p&gt;We faced three main issues:  &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Documenting APIs&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Publishing the documentation&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Updating it whenever APIs change&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each of these had its own set of questions: &lt;em&gt;how to do it, where to do it, what tools to use, and who would take ownership.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;To tackle this, our team decided to consolidate all APIs into a single repository called &lt;strong&gt;APIHub&lt;/strong&gt;. Each service’s APIs were stored in a simple and consistent &lt;a href="https://hexmos.com/lama2/explanation/syntax.html" rel="noopener noreferrer"&gt;format&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET | POST | PUT | DELETE | PATCH  
${baseurl}/endpoint  
{  
  "body": "if present"  
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We named the files according to their function. Below is an example of a &lt;code&gt;.l2&lt;/code&gt; file for a "Leave Apply" API, along with a sidebar showing other APIs in the repository:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1smezz8e6wbuv0qbz57a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1smezz8e6wbuv0qbz57a.png" alt="VSCode APIHub Leave API" width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;
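&lt;p&gt;For illustration, a hypothetical &lt;code&gt;apply_leave.l2&lt;/code&gt; following the format above might look like this (the endpoint and body fields are made up):&lt;/p&gt;

```plaintext
POST
${baseurl}/leave/apply
{
  "from": "2024-12-20",
  "to": "2024-12-22",
  "reason": "vacation"
}
```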

&lt;h3&gt;
  
  
  Improving Documentation Practices
&lt;/h3&gt;

&lt;p&gt;We made it mandatory to include the corresponding &lt;code&gt;.l2&lt;/code&gt; file in every pull/merge request. If it wasn’t there, the request wouldn’t be approved. This simple rule increased API documentation consistency across the team.&lt;/p&gt;
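&lt;p&gt;A rule like this is easy to automate. As a sketch (not our actual setup; the branch name and file paths are illustrative), a CI step could fail any merge request whose diff contains no &lt;code&gt;.l2&lt;/code&gt; file:&lt;/p&gt;

```shell
# check_l2 succeeds when the newline-separated file list contains a .l2 file.
check_l2() {
  echo "$1" | grep -q '\.l2$'
}

# In CI, the changed-file list would come from git, e.g.:
#   changed=$(git diff --name-only origin/main...HEAD)
changed="backend/leave/views.py
apihub/feedback/fb_v3/leave/apply_leave.l2"

if check_l2 "$changed"; then
  echo "OK: .l2 documentation present"
else
  echo "ERROR: merge request must include the corresponding .l2 file" >&2
  exit 1
fi
```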

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fga7zejk7jbii4bukaky1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fga7zejk7jbii4bukaky1.png" alt="Merge requests" width="536" height="291"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  From Documentation to Execution
&lt;/h3&gt;

&lt;p&gt;We soon realized that manually testing APIs by copying URLs and payloads to tools like Postman was time-consuming. So, we built a CLI tool called &lt;strong&gt;Lama2&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
&lt;a href="https://hexmos.com/lama2/index.html?ref=dev.to"&gt;Lama2&lt;/a&gt; is a plain-text API manager optimized for Git-based collaboration.  &lt;/p&gt;

&lt;p&gt;With Lama2, you could pass a &lt;code&gt;.l2&lt;/code&gt; file as input, and the CLI would execute the API and show the response in the terminal:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuk0ih1nqt3dbl1as9vzs.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuk0ih1nqt3dbl1as9vzs.gif" alt="Lama2 cli" width="758" height="471"&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;p&gt;This saved us from constant copy-pasting, but switching directories to find &lt;code&gt;.l2&lt;/code&gt; files was still tedious:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;lovestaco@i3nux:~/apihub/feedback/fb_v3/leave$ l2 apply_leave.l2  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Taking it to VSCode
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpj2z9mcexhsx46k7gr2p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpj2z9mcexhsx46k7gr2p.png" alt="Options" width="630" height="181"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To streamline things further, we developed a &lt;a href="https://marketplace.visualstudio.com/items?itemName=hexmos.Lama2" rel="noopener noreferrer"&gt;VSCode extension&lt;/a&gt;. It came with features that made our workflow even smoother:  &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Execute &lt;code&gt;.l2&lt;/code&gt; files directly in the editor
&lt;/li&gt;
&lt;li&gt;Copy the file’s Git URL for easy sharing
&lt;/li&gt;
&lt;li&gt;Prettify JSON payloads
&lt;/li&gt;
&lt;li&gt;Generate code snippets for any language from &lt;code&gt;.l2&lt;/code&gt; syntax
&lt;/li&gt;
&lt;li&gt;Create templates for new APIs in seconds
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://journal.hexmos.com/lama2-lsp-journey/" rel="noopener noreferrer"&gt;Auto-completion of variables using LSP&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1eqnfymj1v83fage61dt.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1eqnfymj1v83fage61dt.gif" alt=" " width="1489" height="585"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This extension quickly became a favorite among the team, and we decided to release it on GitHub so others could benefit too.  &lt;/p&gt;

&lt;h3&gt;
  
  
  The Next Problem: Scaling Documentation
&lt;/h3&gt;

&lt;p&gt;As our APIs grew, we asked ourselves:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Why manually document APIs for each service?&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Isn’t it time-consuming to update documentation for every change?&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And that’s where the next chapter of our journey begins...&lt;br&gt;
Follow me to find out what happens in the next post.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>beginners</category>
      <category>programming</category>
    </item>
    <item>
      <title>From Vulnerabilities to Vault: How We Stopped Hardcoding Secrets and Started Using Hashicorp Vault</title>
      <dc:creator>Athreya aka Maneshwar</dc:creator>
      <pubDate>Sun, 17 Nov 2024 14:26:53 +0000</pubDate>
      <link>https://forem.com/hexmos/from-vulnerabilities-to-vault-how-we-stopped-hardcoding-secrets-and-started-using-hashicorp-vault-ajn</link>
      <guid>https://forem.com/hexmos/from-vulnerabilities-to-vault-how-we-stopped-hardcoding-secrets-and-started-using-hashicorp-vault-ajn</guid>
      <description>&lt;p&gt;We recently &lt;a href="https://journal.hexmos.com/install-nomad-in-production/" rel="noopener noreferrer"&gt;migrated our infrastructure from Kubernetes to HashiCorp Nomad&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Soon after, we &lt;a href="https://journal.hexmos.com/consul-service-discovery/" rel="noopener noreferrer"&gt;encountered service discovery issues&lt;/a&gt; and integrated Consul to address them.&lt;/p&gt;

&lt;p&gt;At this point, we were feeling a bit more relaxed, knowing that we could dedicate less time to infrastructure and focus more on product development, as we don’t have a separate resource dedicated to DevOps.&lt;/p&gt;

&lt;p&gt;However, we encountered another challenge—a Nomad port was open on the Node, and a friendly person helped us identify this vulnerability.&lt;/p&gt;

&lt;p&gt;They were able to access it directly using the server's IP address, without needing a domain.&lt;/p&gt;

&lt;p&gt;As a result, we had to quickly rotate all our payment keys, AWS credentials, and other secrets.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem: Hardcoded Secrets and Security Risks
&lt;/h3&gt;

&lt;p&gt;In some areas, our team had hardcoded keys directly into the codebase.&lt;/p&gt;

&lt;p&gt;This meant there wasn’t a single place to manage these secrets, and updating them across our systems was a manual, error-prone process.&lt;/p&gt;

&lt;p&gt;It was clear we needed a solution that would allow us to change a secret in one place, and have it propagate automatically throughout the system.&lt;/p&gt;

&lt;p&gt;Initially, we considered using AWS Secrets Manager alongside our &lt;a href="https://docs.gitlab.com/ee/ci/" rel="noopener noreferrer"&gt;GitLab&lt;/a&gt; CI/CD pipeline to inject environment variables during container deployments.&lt;/p&gt;

&lt;p&gt;However, this approach required passing a large number of variables as arguments to &lt;code&gt;nomad job run&lt;/code&gt; from the project's &lt;code&gt;.gitlab-ci.yml&lt;/code&gt;, which cluttered our configuration and reduced system readability.&lt;/p&gt;
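&lt;p&gt;Roughly, each deployment step ended up looking something like this (a hypothetical sketch; the secret names, variables, and job file are illustrative):&lt;/p&gt;

```yaml
# Hypothetical .gitlab-ci.yml deploy job: every secret is fetched from
# AWS Secrets Manager and then passed to `nomad job run` one by one.
deploy:
  stage: deploy
  script:
    - export DB_PASSWORD=$(aws secretsmanager get-secret-value --secret-id prod/db-password --query SecretString --output text)
    - export STRIPE_KEY=$(aws secretsmanager get-secret-value --secret-id prod/stripe-key --query SecretString --output text)
    # ...dozens more of these...
    - nomad job run -var "db_password=$DB_PASSWORD" -var "stripe_key=$STRIPE_KEY" app.nomad
```

&lt;p&gt;Every new secret meant touching the pipeline file again, which is exactly the clutter described above.&lt;/p&gt;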

&lt;p&gt;That’s when we thought, &lt;em&gt;Why not go full &lt;a href="https://developer.hashicorp.com/" rel="noopener noreferrer"&gt;HashiStack&lt;/a&gt;?&lt;/em&gt; We decided to explore &lt;a href="https://developer.hashicorp.com/vault" rel="noopener noreferrer"&gt;Vault&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Exploring Secret Management Options
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;a href="https://aws.amazon.com/secrets-manager/" rel="noopener noreferrer"&gt;AWS Secrets Manager&lt;/a&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;AWS Secrets Manager&lt;/em&gt; offers capabilities for managing and rotating secrets such as database credentials, API keys, and other sensitive information.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Access Management&lt;/strong&gt;: AWS Secrets Manager relies on AWS IAM (Identity and Access Management) for managing permissions. For granular control, additional tools like &lt;em&gt;Chamber&lt;/em&gt; may be required. Chamber leverages AWS IAM roles and policies to restrict access by namespace, simplifying the process of partitioning secrets across different environments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-Cloud Limitations&lt;/strong&gt;: While effective within AWS (our existing infra), it is harder to use in multi-cloud environments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost of service:&lt;/strong&gt; $0.40 per secret per month.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;a href="https://developer.hashicorp.com/vault" rel="noopener noreferrer"&gt;HashiCorp Vault&lt;/a&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;HashiCorp Vault&lt;/em&gt; provides a robust, identity-based security solution designed to authenticate and authorize access to secrets automatically, supporting a wide range of infrastructures.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Access Control&lt;/strong&gt;: Vault offers RBAC (Role-Based Access Control) for user logins, enabling administrators to manage human access to secrets. For machine access, Vault provides policy-driven access with leases, allowing for precise control over which applications can access secrets and for how long. This includes options like automatic key rotation and expiration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Secrets Generation&lt;/strong&gt;: Vault can dynamically generate AWS IAM credentials, providing temporary access that’s tightly controlled by policies and easily revoked if needed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-Environment Compatibility&lt;/strong&gt;: Vault supports cross-region, cross-cloud, and cross-datacenter replication, making it a seamless choice for organizations with multi-cloud setups.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost of service:&lt;/strong&gt; Free.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why We Chose HashiCorp Vault
&lt;/h3&gt;

&lt;p&gt;Vault offered the ideal solution: it allowed &lt;a href="https://developer.hashicorp.com/nomad" rel="noopener noreferrer"&gt;Nomad&lt;/a&gt; to request sensitive information, such as payment secrets or AWS keys, directly from Vault.&lt;/p&gt;

&lt;p&gt;Vault would verify if Nomad was authorized to access those secrets and promptly deliver the data.&lt;/p&gt;

&lt;p&gt;This way, we could centralize key management, automate secret updates, and improve security by avoiding hardcoded keys.&lt;/p&gt;

&lt;p&gt;We developed a proof of concept, where we demonstrated Vault's ability to manage key rotation and securely store secrets in one place, accessible across all our projects.&lt;/p&gt;

&lt;p&gt;This not only simplified secret management but also ensured consistency and security for our entire infrastructure stack.&lt;/p&gt;
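&lt;p&gt;As a minimal sketch of the pattern (paths and key names are illustrative, assuming the KV v2 secrets engine): the secret is written once to Vault, and each Nomad job renders it through a template block, so a rotation in Vault propagates on the next render without touching CI configuration.&lt;/p&gt;

```hcl
# Illustrative Nomad task snippet: pull a secret from Vault at render time.
# The secret is stored once, centrally:
#   $ vault kv put secret/billing stripe_key=sk_live_example
# Assumes the job has a vault block granting a policy that can read this path.
template {
  data        = <<EOF
STRIPE_KEY={{ with secret "secret/data/billing" }}{{ .Data.data.stripe_key }}{{ end }}
EOF
  destination = "secrets/app.env"
  env         = true
}
```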

&lt;p&gt;&lt;a href="https://journal.hexmos.com/how-to-use-hashicorp-vault/" rel="noopener noreferrer"&gt;Continue reading full article...&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>webdev</category>
      <category>security</category>
    </item>
    <item>
      <title>Why We Chose NGINX + HashiStack Over Kubernetes for Our Service Discovery Needs</title>
      <dc:creator>Athreya aka Maneshwar</dc:creator>
      <pubDate>Sun, 06 Oct 2024 17:20:20 +0000</pubDate>
      <link>https://forem.com/hexmos/why-we-chose-nginx-hashistack-over-kubernetes-for-our-service-discovery-needs-3ebd</link>
      <guid>https://forem.com/hexmos/why-we-chose-nginx-hashistack-over-kubernetes-for-our-service-discovery-needs-3ebd</guid>
<description>&lt;p&gt;We recently switched from &lt;a href="https://journal.hexmos.com/install-nomad-in-production/" rel="noopener noreferrer"&gt;Kubernetes to Nomad&lt;/a&gt; to manage our infrastructure. At first, with two nodes and multiple services, we had a hard time getting the request routing to work &lt;strong&gt;reliably&lt;/strong&gt;.&lt;br&gt;
In this post, I’ll walk through how we built an &lt;strong&gt;efficient and low-cost service discovery solution&lt;/strong&gt; for our infrastructure—and why it could benefit others facing similar routing issues.&lt;/p&gt;

&lt;p&gt;Spoiler: You can achieve smooth results &lt;strong&gt;without&lt;/strong&gt; needing &lt;strong&gt;NGINX Plus&lt;/strong&gt;, thanks to NGINX’s robust features and the power of &lt;strong&gt;open-source modules&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Routing Problem: A Snapshot of Our Setup
&lt;/h2&gt;

&lt;p&gt;At the core of our infrastructure lies a &lt;strong&gt;typical setup&lt;/strong&gt;: a browser making requests to our server, an &lt;strong&gt;NGINX reverse proxy&lt;/strong&gt; forwarding those requests, and &lt;strong&gt;Nomad&lt;/strong&gt; managing the services on multiple nodes.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhj8uequf4i4f49n8yt96.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhj8uequf4i4f49n8yt96.png" alt="Group 565"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Initially, we had &lt;strong&gt;hardcoded the node IPs&lt;/strong&gt; into our &lt;a href="https://nginx.org/en/" rel="noopener noreferrer"&gt;Nginx&lt;/a&gt; configuration,&lt;br&gt;
which caused a major problem when services were &lt;strong&gt;redeployed to different nodes&lt;/strong&gt;.&lt;/p&gt;
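&lt;p&gt;A hypothetical sketch of that brittle configuration (IPs, ports, and names invented for illustration):&lt;/p&gt;

```nginx
# If Nomad reschedules the service onto another node, this upstream
# silently goes stale until someone edits it by hand.
upstream echo_server {
    server 10.0.1.12:25468;  # node and port assigned by Nomad at deploy time
}

server {
    location / {
        proxy_pass http://echo_server;
    }
}
```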

&lt;p&gt;Every redeployment required &lt;strong&gt;manual NGINX configuration&lt;/strong&gt; updates.&lt;br&gt;
This quickly became &lt;strong&gt;unsustainable&lt;/strong&gt; as the number of services grew.&lt;/p&gt;

&lt;p&gt;That’s when we decided to &lt;strong&gt;integrate service discovery&lt;/strong&gt; into our stack.&lt;/p&gt;
&lt;h2&gt;
  
  
  Service Discovery: Why It Matters
&lt;/h2&gt;

&lt;p&gt;Service discovery is the process of automatically detecting services on a network.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuoj5ynpmde0291n8tcbs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuoj5ynpmde0291n8tcbs.png" alt="Group 571"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In a dynamic, multi-node setup like ours, services constantly &lt;strong&gt;shift between nodes&lt;/strong&gt; and deployments. Without proper service discovery, it’s impossible to route requests to the right service on the right node.&lt;/p&gt;

&lt;p&gt;Without this, a request arriving at our server might hit the &lt;strong&gt;wrong service&lt;/strong&gt;, or worse, &lt;strong&gt;fail completely&lt;/strong&gt; because NGINX had no idea where the service was running.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nomad alone doesn’t solve this issue&lt;/strong&gt;. That’s where &lt;a href="https://www.consul.io/" rel="noopener noreferrer"&gt;Consul&lt;/a&gt; comes in.&lt;/p&gt;

&lt;p&gt;Consul tracks the &lt;strong&gt;location and port&lt;/strong&gt; of each service deployed via Nomad, and &lt;strong&gt;NGINX uses this data&lt;/strong&gt; to ensure requests reach their intended destinations. This is key if you want &lt;strong&gt;scalable&lt;/strong&gt;, &lt;strong&gt;robust routing&lt;/strong&gt; without hardcoding IP addresses or relying on static configurations.&lt;/p&gt;
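&lt;p&gt;As a rough sketch of what Consul exposes (assuming a local Consul agent and a service registered as echo-server; the names are illustrative):&lt;/p&gt;

```shell
# Ask Consul's HTTP API for the healthy instances of a service
curl -s 'http://127.0.0.1:8500/v1/health/service/echo-server?passing=true'

# The same data is available via Consul's DNS interface (default port 8600)
dig @127.0.0.1 -p 8600 echo-server.service.consul SRV
```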
&lt;h2&gt;
  
  
  Why Should You Care?
&lt;/h2&gt;

&lt;p&gt;This isn’t just a neat technical trick, but &lt;strong&gt;a necessary move&lt;/strong&gt; if you want to avoid downtime by making requests &lt;strong&gt;go to healthy services&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Whether you’re a startup or running a large-scale app, the right &lt;strong&gt;service discovery mechanism&lt;/strong&gt; helps reduce complexity, improve reliability, and &lt;strong&gt;keep your infrastructure flexible&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Think of it like moving from a fragile system to something far more robust without the need for heavyweight orchestration like Kubernetes.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffsakk9ev12zb8chhwjei.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffsakk9ev12zb8chhwjei.png" alt="0_72Hjj4cz4_GTMDTO"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  NGINX and Service Discovery: Exploring the Options
&lt;/h2&gt;

&lt;p&gt;While NGINX offers &lt;strong&gt;built-in service discovery features&lt;/strong&gt;, they are part of &lt;strong&gt;NGINX Plus&lt;/strong&gt;, the enterprise version.&lt;/p&gt;

&lt;p&gt;We opted to explore &lt;strong&gt;open-source alternatives&lt;/strong&gt; and found a pre-built &lt;strong&gt;open-source NGINX module&lt;/strong&gt; that allows NGINX to retrieve &lt;strong&gt;service location data from Consul&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here’s why we made this choice:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cost Considerations&lt;/strong&gt;: NGINX Plus requires &lt;strong&gt;paid subscriptions&lt;/strong&gt;, which means ongoing management of licenses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature Set&lt;/strong&gt;: NGINX Plus comes with a broad set of features, but many of them were unnecessary for our specific use case.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full Control&lt;/strong&gt;: By using a &lt;strong&gt;purely FOSS solution&lt;/strong&gt;, we maintain &lt;strong&gt;100% control&lt;/strong&gt; over our infrastructure, without relying on external enterprise solutions.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Why This Solution Worked
&lt;/h3&gt;

&lt;p&gt;We’re now serving &lt;strong&gt;all internal and customer-facing&lt;/strong&gt; &lt;a href="https://hexmos.com/#products" rel="noopener noreferrer"&gt;Hexmos apps&lt;/a&gt; using this &lt;strong&gt;custom NGINX and HashiStack setup&lt;/strong&gt;. The &lt;strong&gt;custom NGINX&lt;/strong&gt; is necessary because of our &lt;strong&gt;legacy configurations&lt;/strong&gt; as we &lt;a href="https://journal.hexmos.com/install-nomad-in-production/" rel="noopener noreferrer"&gt;transitioned&lt;/a&gt; from Kubernetes. This makes our case particularly interesting for others facing similar transitions.&lt;/p&gt;

&lt;p&gt;A lot of smaller teams are using NGINX with PM2 to manage their processes. While that works, it doesn’t scale easily if you’re trying to handle multiple nodes or containers.&lt;/p&gt;

&lt;p&gt;For teams using &lt;a href="https://github.com/nginx/nginx" rel="noopener noreferrer"&gt;NGINX&lt;/a&gt;+&lt;a href="https://github.com/Unitech/pm2" rel="noopener noreferrer"&gt;PM2&lt;/a&gt;, moving to NGINX + HashiStack is a more robust and flexible solution—a great fit for startups looking for scalability without the complexity of Kubernetes.&lt;/p&gt;

&lt;p&gt;In fact, many startups are likely using PM2, and very few truly need Kubernetes.&lt;/p&gt;
&lt;h2&gt;
  
  
  Moving to NGINX+HashiStack
&lt;/h2&gt;

&lt;p&gt;Larger organizations like &lt;a href="https://github.com/zerodha/nomad-cluster-setup" rel="noopener noreferrer"&gt;Zerodha&lt;/a&gt; and &lt;a href="https://blog.cloudflare.com/how-we-use-hashicorp-nomad/" rel="noopener noreferrer"&gt;Cloudflare&lt;/a&gt; are using &lt;strong&gt;Nomad&lt;/strong&gt; to manage their infrastructure. Both companies have substantial setups but avoid Kubernetes, showing that &lt;strong&gt;Nomad + Consul&lt;/strong&gt; can scale effectively without the overhead of Kubernetes.&lt;/p&gt;

&lt;p&gt;For startups, &lt;strong&gt;HashiStack&lt;/strong&gt; is like &lt;strong&gt;PM2 on steroids&lt;/strong&gt;, providing &lt;strong&gt;multi-node and Docker control&lt;/strong&gt;. It allows you to &lt;strong&gt;easily manage different&lt;/strong&gt; &lt;a href="https://developer.hashicorp.com/nomad/docs/drivers" rel="noopener noreferrer"&gt;workloads&lt;/a&gt;, whether &lt;strong&gt;binaries&lt;/strong&gt; or &lt;strong&gt;Docker containers&lt;/strong&gt;, across multiple nodes, while being lightweight enough for smaller operations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kubernetes&lt;/strong&gt; is often &lt;strong&gt;overkill&lt;/strong&gt; unless you’re running at a very large scale. &lt;strong&gt;HashiStack&lt;/strong&gt; with custom NGINX offers a much simpler, cost-effective, and scalable alternative.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffx30wwlqytf2fa9smsoh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffx30wwlqytf2fa9smsoh.png" alt="Group 579"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our transition from Kubernetes to Nomad was &lt;a href="https://journal.hexmos.com/install-nomad-in-production/#kubernetes-vs-nomad-a-clash-of-titans" rel="noopener noreferrer"&gt;eye-opening&lt;/a&gt;. Here’s why this solution could be a good fit for teams considering an upgrade:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Simplicity&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HashiStack + NGINX&lt;/strong&gt;: Lightweight and easy to manage with just two binaries—&lt;strong&gt;Nomad&lt;/strong&gt; (orchestration) and &lt;strong&gt;Consul&lt;/strong&gt; (service discovery). The custom NGINX module integrates seamlessly without complex setups.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kubernetes&lt;/strong&gt;: A full-featured but complex platform, requiring numerous services and configurations. Often needs a dedicated team for ongoing management.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Flexibility and Scale&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HashiStack + NGINX&lt;/strong&gt;: Supports a range of workloads (containers, binaries, VMs) and scales smoothly across nodes and regions. Ideal for startups or teams seeking flexible deployment management.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kubernetes&lt;/strong&gt;: Excels in container-heavy environments but can be overkill for smaller setups. Its complexity makes scaling harder to manage.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cost Efficiency and Operational Effort&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Both &lt;strong&gt;HashiStack + NGINX and Kubernetes&lt;/strong&gt; are open-source, offering flexibility without upfront licensing fees.
However, their cost efficiency varies when factoring in operational complexity and labor hours.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HashiStack + NGINX&lt;/strong&gt;: Free and open-source, avoiding enterprise license costs (like NGINX Plus). Easier to set up and maintain, making it a cost-effective solution for smaller teams with limited DevOps resources.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kubernetes&lt;/strong&gt;: Also open-source, but it may require additional tools (e.g., Ingress controllers), increasing operational complexity. Its steep learning curve and management demands can lead to higher labor costs.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Options for Service Discovery Integration with NGINX
&lt;/h2&gt;

&lt;p&gt;When comparing &lt;strong&gt;Nomad's template stanza&lt;/strong&gt; with &lt;strong&gt;Consul Template&lt;/strong&gt;, the choice largely depends on your use case, but both have their strengths and challenges. Let’s break down the pros and cons of each approach:&lt;/p&gt;
&lt;h3&gt;
  
  
  1. &lt;strong&gt;Nomad Template Stanza&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Usage&lt;/strong&gt;: The template stanza in Nomad is often used for injecting dynamic content (like load balancer configs) directly into tasks. It relies heavily on the integration with Consul to fetch service details and generate configurations dynamically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pros&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tight integration&lt;/strong&gt;: Works seamlessly with Nomad jobs and Consul service discovery. It automatically reconfigures services when Nomad or Consul detects changes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No extra processes&lt;/strong&gt;: Since it's native to Nomad, you don’t need to run a separate daemon for templates to update.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Signal-based reload&lt;/strong&gt;: Can signal the containerized service (e.g., NGINX) to reload configurations on updates (&lt;code&gt;SIGHUP&lt;/code&gt; signal).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;All-in-one Job Spec&lt;/strong&gt;: Everything is packed into the same Nomad job file (code, template logic, service configuration), which could simplify management for some.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Complexity&lt;/strong&gt;: The inline template can get quite complex and difficult to maintain, especially as your configuration grows. Writing Nomad templates with complex &lt;code&gt;range&lt;/code&gt; statements to handle service discovery (like the upstream block for NGINX) can become cumbersome.
&lt;em&gt;Example:&lt;/em&gt; &lt;code&gt;{{ range service "echo-server" }} ... {{ else }}server 127.0.0.1:65535;{{ end }}&lt;/code&gt; could be tricky for large applications.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Limited portability&lt;/strong&gt;: The template configuration is tied to Nomad’s job files, which can make it harder to migrate or adapt to environments where Nomad is not in use.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Steeper learning curve&lt;/strong&gt;: The embedded logic in the template stanza can feel overwhelming. For newcomers, this can make understanding and debugging more difficult.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
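&lt;p&gt;For context, a complete upstream rendered via the template stanza might look roughly like this (service name illustrative); the else branch is the fallback from the example above, forcing a fast failure instead of an NGINX crash when no instance is healthy:&lt;/p&gt;

```hcl
# Illustrative Nomad template stanza rendering an NGINX upstream from Consul.
template {
  data = <<EOF
upstream echo_server {
  {{ range service "echo-server" }}server {{ .Address }}:{{ .Port }};
  {{ else }}server 127.0.0.1:65535; # fail fast when no healthy instance exists
  {{ end }}
}
EOF
  destination   = "local/upstreams.conf"
  change_mode   = "signal"
  change_signal = "SIGHUP"
}
```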
&lt;h3&gt;
  
  
  2. &lt;strong&gt;Consul Template Daemon&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Usage&lt;/strong&gt;: Consul Template is a standalone daemon that fetches data from Consul, Nomad, or Vault and renders it into templates, offering more flexibility for updating service configurations. It can be used independently or alongside Nomad.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Separation of concerns&lt;/strong&gt;: The configuration and template management are decoupled from Nomad, so you can manage templates independently. This is useful when you have multiple services and configurations that need to be updated based on Consul data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Powerful templating features&lt;/strong&gt;: Consul Template can handle more complex scenarios and logic than the Nomad template stanza due to its broader templating syntax.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run custom commands&lt;/strong&gt;: It can run any arbitrary command after rendering a template (like restarting a service), offering more flexibility in how you manage updates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-system&lt;/strong&gt;: Consul Template can be used for other systems as well (e.g., Vault or just plain Consul), making it more versatile and portable.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extra daemon&lt;/strong&gt;: You need to run an additional process (&lt;code&gt;consul-template&lt;/code&gt;) which adds operational overhead.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual setup and management&lt;/strong&gt;: It requires setting up configuration and managing the lifecycle of the daemon. You’ll also need to configure reload logic manually, which could be overkill for smaller systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reloading complexity&lt;/strong&gt;: You have to configure signals or restart logic to handle service restarts correctly, and incorrect configurations could lead to service downtime or stale configurations.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
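&lt;p&gt;A minimal sketch of running the daemon (template path and reload command are illustrative): each -template flag takes an input template, an output path, and an optional command to run after rendering.&lt;/p&gt;

```shell
# Watch Consul and re-render an NGINX include, then reload NGINX.
consul-template \
  -template "/etc/nginx/upstreams.ctmpl:/etc/nginx/conf.d/upstreams.conf:nginx -s reload"
```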
&lt;h3&gt;
  
  
  3. &lt;strong&gt;DNS Service Discovery with NGINX Plus&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We wish we could tell you more about NGINX Plus, but it's a paid tool and we haven't had a chance to try it out. From what we've heard, it's a really smooth experience. It automatically keeps track of where your services are and sends traffic to the right places. If you're looking for a hassle-free solution and don't mind spending a bit extra, NGINX Plus might be a great fit.&lt;/p&gt;
&lt;h3&gt;
  
  
  4. &lt;strong&gt;NGINX’s ngx_http_consul_backend_module&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/hashicorp/ngx_http_consul_backend_module" rel="noopener noreferrer"&gt;ngx_http_consul_backend_module&lt;/a&gt; is a NGINX add-on that I've found incredibly useful for establishing a &lt;strong&gt;direct connection between NGINX and Consul&lt;/strong&gt;. This module uses the &lt;strong&gt;Consul Go backend&lt;/strong&gt; to efficiently discover and route to healthy services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7neffi2st0roh23ekjn9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7neffi2st0roh23ekjn9.png" alt="Group 564"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No need for NGINX reloads&lt;/strong&gt;: Since NGINX queries the &lt;a href="https://github.com/hashicorp/consul/tree/main/api" rel="noopener noreferrer"&gt;Consul Go API client&lt;/a&gt; for healthy services on each request, there’s no need to reload NGINX whenever a &lt;strong&gt;service moves between nodes&lt;/strong&gt; or when new instances are added.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplified service discovery&lt;/strong&gt;: The module directly &lt;strong&gt;routes each request through Consul&lt;/strong&gt;, ensuring that traffic is always directed to &lt;strong&gt;healthy services&lt;/strong&gt;. NGINX fetches the healthy services without needing custom health checks, external scripts, or manual intervention.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improved reliability&lt;/strong&gt;: Since the Consul backend only provides information about healthy hosts, there is &lt;strong&gt;no risk&lt;/strong&gt; of requests being sent to &lt;strong&gt;dead or unhealthy services&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficient connection pooling&lt;/strong&gt;: By using the official Consul Go API client, the module benefits from efficient connection management, contributing to faster and more reliable service discovery.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Familiar configuration interface&lt;/strong&gt;: The setup with Consul and NGINX is relatively straightforward, and familiar configuration directives (like &lt;code&gt;proxy_pass&lt;/code&gt; and &lt;code&gt;$backend&lt;/code&gt;) make it easy to integrate into existing NGINX configurations.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Need to rebuild NGINX from source&lt;/strong&gt;: The biggest downside is that you need to &lt;strong&gt;rebuild NGINX from source&lt;/strong&gt; with this module. This adds an extra step to your deployment process and makes updates or migrations slightly more cumbersome. If you’re using packaged NGINX versions from repositories, this could be a hassle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maintenance overhead&lt;/strong&gt;: Rebuilding from source means you’ll need to maintain your own version of NGINX, handle upgrades, and ensure compatibility with other NGINX modules you may want to use.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  Workflow
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;A request arrives at NGINX and matches a specific &lt;strong&gt;location block&lt;/strong&gt; that includes a &lt;code&gt;consul&lt;/code&gt; directive.
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    location / &lt;span class="o"&gt;{&lt;/span&gt;
        consul &lt;span class="nv"&gt;$backend&lt;/span&gt; echo-server-lovestaco-com&lt;span class="p"&gt;;&lt;/span&gt;
        add_header X-Debug-Backend &lt;span class="nv"&gt;$backend&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        proxy_pass http://&lt;span class="nv"&gt;$backend&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol start="2"&gt;
&lt;li&gt;NGINX then calls the &lt;strong&gt;ngx_http_consul_backend&lt;/strong&gt; function, providing it with two pieces of information.&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;The first piece of information is a variable where the result will be stored (for example, &lt;code&gt;$backend&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;The second piece of information is the name of the Consul service to which the request should be routed (like &lt;code&gt;echo-server-lovestaco-com&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="3"&gt;
&lt;li&gt;&lt;p&gt;The &lt;a href="https://github.com/hashicorp/ngx_http_consul_backend_module/blob/62c7a501aeeab71bb72dd620ad6f00e848be0b1e/src/ngx_http_consul_backend_module.c#L52" rel="noopener noreferrer"&gt;ngx_http_consul_backend&lt;/a&gt; function uses &lt;a href="https://man7.org/linux/man-pages/man3/dlopen.3.html" rel="noopener noreferrer"&gt;dlopen&lt;/a&gt; to load the &lt;a href="https://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html" rel="noopener noreferrer"&gt;shared C library&lt;/a&gt; (the &lt;a href="https://superuser.com/questions/71404/what-is-an-so-file" rel="noopener noreferrer"&gt;.so&lt;/a&gt; file) and calls the Go function defined within that library.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This Go function interacts with Consul using the official API client library. It gathers a list of available IP addresses and selects one to return.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The chosen IP address is sent back to the &lt;strong&gt;ngx_http_consul_backend&lt;/strong&gt; function, and assigned to &lt;code&gt;$backend&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The next step involves using NGINX's built-in &lt;code&gt;proxy_pass&lt;/code&gt; directive to forward the traffic to the selected host.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The image below shows the flow of a request using Consul:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxl4v10hcjej7r65bs1ed.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxl4v10hcjej7r65bs1ed.png" alt="Consul 1-2024-10-02-145903"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  Step-by-Step Guide on How We Made It Work by Rebuilding NGINX from Source
&lt;/h3&gt;
&lt;h4&gt;
  
  
  1. Install the Essential Build Tools
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;apt-get &lt;span class="nt"&gt;-yqq&lt;/span&gt; &lt;span class="nb"&gt;install &lt;/span&gt;build-essential curl git libpcre3 libpcre3-dev libssl-dev zlib1g-dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  2. Download and Extract NGINX from Source
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /tmp
curl &lt;span class="nt"&gt;-sLo&lt;/span&gt; nginx.tgz https://nginx.org/download/nginx-1.24.0.tar.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Extract the downloaded tarball to access the NGINX source code
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;tar&lt;/span&gt; &lt;span class="nt"&gt;-xzvf&lt;/span&gt; nginx.tgz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  4. Download and Extract the NGINX Development Kit (NDK)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Download the ngx_devel_kit module, which is required for building the backend.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sLo&lt;/span&gt; ngx_devel_kit-0.3.0.tgz https://github.com/simpl/ngx_devel_kit/archive/v0.3.0.tar.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;tar&lt;/span&gt; &lt;span class="nt"&gt;-xzvf&lt;/span&gt; ngx_devel_kit-0.3.0.tgz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  6. Clone the &lt;code&gt;ngx_http_consul_backend_module&lt;/code&gt; Repository
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/hashicorp/ngx_http_consul_backend_module.git /go/src/github.com/hashicorp/ngx_http_consul_backend_module
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  7. Change Ownership of the NGINX Extensions Directory
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo chown&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;whoami&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;whoami&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; /go/src/github.com/hashicorp/ngx_http_consul_backend_module
&lt;span class="nb"&gt;sudo chown&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;whoami&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;whoami&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; /usr/local/nginx/ext/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  8. Tidy Go Modules
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go mod tidy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  9. Compile the Go Code as a Shared C Library That NGINX Will Dynamically Load
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Set the CGO flags to include the &lt;code&gt;ngx_devel_kit&lt;/code&gt; directory
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;CGO_CFLAGS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"-I /tmp/ngx_devel_kit-0.3.0/src"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
go build &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-buildmode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;c-shared &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; /usr/local/nginx/ext/ngx_http_consul_backend_module.so &lt;span class="se"&gt;\&lt;/span&gt;
  ./ngx_http_consul_backend_module.go
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;This will compile the shared library to &lt;code&gt;/usr/local/nginx/ext/ngx_http_consul_backend_module.so&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  10. Configure NGINX with Required Paths and Modules
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;To add a module during the NGINX build process, use the following configuration command
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /tmp/nginx-1.24.0

&lt;span class="nv"&gt;CFLAGS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"-g -O0"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
./configure &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--with-debug&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--prefix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/nginx &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--sbin-path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/usr/sbin/nginx &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--conf-path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/nginx/nginx.conf &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--pid-path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/var/run/nginx.pid &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--error-log-path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/var/log/nginx/error.log &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--http-log-path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/var/log/nginx/access.log &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--add-module&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/tmp/ngx_devel_kit-0.3.0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--add-module&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/go/src/github.com/hashicorp/ngx_http_consul_backend_module
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Common Configuration Options&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--prefix=/etc/nginx&lt;/code&gt;: Installation directory for NGINX binaries and configuration files.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--sbin-path=/usr/sbin/nginx&lt;/code&gt;: Path to the NGINX binary executable.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--conf-path=/etc/nginx/nginx.conf&lt;/code&gt;: Path to the main NGINX configuration file.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--pid-path=/var/run/nginx.pid&lt;/code&gt;: Path to the NGINX process ID file.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--error-log-path=/var/log/nginx/error.log&lt;/code&gt;: Path to the NGINX error log file.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--http-log-path=/var/log/nginx/access.log&lt;/code&gt;: Path to the NGINX access log file.&lt;/li&gt;
&lt;li&gt;(Enable other optional modules with flags such as &lt;code&gt;--with-http_ssl_module&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Make sure to include the &lt;code&gt;--add-module&lt;/code&gt; option for each static module you want to build into NGINX.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  11. Build and Install NGINX
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;make &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  12. Verify NGINX Installation and Configuration
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/usr/sbin/nginx &lt;span class="nt"&gt;-V&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Hardcoded Backend vs. Consul-driven Backend&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Let's compare two scenarios:&lt;/p&gt;
&lt;h4&gt;
  
  
  1. Hardcoded Backend
&lt;/h4&gt;

&lt;p&gt;This is the traditional approach: you manually specify the IP address and port of the backend server in your NGINX configuration. Here's an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;server &lt;span class="o"&gt;{&lt;/span&gt;
  listen        80&lt;span class="p"&gt;;&lt;/span&gt;
  server_name   one.example.com www.one.example.com&lt;span class="p"&gt;;&lt;/span&gt;

  location / &lt;span class="o"&gt;{&lt;/span&gt;
    proxy_pass     http://127.0.0.1:8080/&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c"&gt;# Hardcoded IP and port&lt;/span&gt;

    proxy_set_header  Host      &lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    proxy_set_header  X-Real-IP  &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach has limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static Configuration:&lt;/strong&gt; If the backend server IP or port changes, you need to manually update the NGINX configuration and reload NGINX.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Issues:&lt;/strong&gt; Manually managing configurations becomes cumbersome as your infrastructure grows.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  2. Consul-driven Backend with &lt;code&gt;ngx_http_consul_backend_module&lt;/code&gt;
&lt;/h4&gt;

&lt;p&gt;The &lt;code&gt;ngx_http_consul_backend_module&lt;/code&gt; simplifies backend management by leveraging Consul's service discovery capabilities. Here's how it works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Consul Service Listing:&lt;/strong&gt; First, list the available services registered in Consul using the &lt;code&gt;consul catalog services -tags&lt;/code&gt; command. This will display service names and tags for easier identification.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ubuntu@master:~&lt;span class="nv"&gt;$ &lt;/span&gt;consul catalog services &lt;span class="nt"&gt;-tags&lt;/span&gt;
consul           consul
one-example-com   one-example-com,primary
dns              primary
echo-server-1
nomad            http,rpc,serf
nomad-client     http
python-http-server  http,python-http-server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
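
&lt;p&gt;For a service such as &lt;code&gt;one-example-com&lt;/code&gt; to appear in that listing, it must first be registered with the local Consul agent. A minimal service definition might look like the following (the port and health-check path are illustrative assumptions, not taken from the setup above):&lt;/p&gt;

```json
{
  "service": {
    "name": "one-example-com",
    "tags": ["primary"],
    "port": 8080,
    "check": {
      "http": "http://localhost:8080/health",
      "interval": "10s"
    }
  }
}
```

&lt;p&gt;Saved into the agent's configuration directory, the definition is picked up with &lt;code&gt;consul reload&lt;/code&gt;.&lt;/p&gt;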



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;NGINX Configuration:&lt;/strong&gt; Update your NGINX configuration to utilize the &lt;code&gt;consul&lt;/code&gt; directive within the location block. This directive retrieves the healthy backend server information for the specified service name and stores it in a variable (e.g., &lt;code&gt;$backend&lt;/code&gt;).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;server &lt;span class="o"&gt;{&lt;/span&gt;
  listen        80&lt;span class="p"&gt;;&lt;/span&gt;
  server_name   one.example.com www.one.example.com&lt;span class="p"&gt;;&lt;/span&gt;

  location / &lt;span class="o"&gt;{&lt;/span&gt;
    consul        &lt;span class="nv"&gt;$backend&lt;/span&gt; one-example-com&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c"&gt;# Retrieve backend from Consul&lt;/span&gt;
    proxy_pass     http://&lt;span class="nv"&gt;$backend&lt;/span&gt;/&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c"&gt;# Use retrieved backend address&lt;/span&gt;

    proxy_set_header  Host      &lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    proxy_set_header  X-Real-IP  &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Benefits of Consul-driven Backend&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Configuration:&lt;/strong&gt; NGINX automatically discovers healthy backend servers registered in Consul, eliminating the need for manual configuration updates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability:&lt;/strong&gt; As your infrastructure grows with more backend servers, NGINX seamlessly adjusts to route traffic to healthy instances.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Additional Notes&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Remember to install and configure &lt;code&gt;ngx_http_consul_backend_module&lt;/code&gt; for this approach to work.&lt;/li&gt;
&lt;li&gt;Refer to the module's documentation for advanced configuration options.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By employing &lt;code&gt;ngx_http_consul_backend_module&lt;/code&gt;, you can achieve a dynamic and scalable backend management system for your NGINX server, simplifying configuration and enhancing overall application reliability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion: A Lightweight, Flexible Solution
&lt;/h2&gt;

&lt;p&gt;Switching from Kubernetes to Nomad allowed us to streamline our deployments, but it also required better service discovery to ensure smooth routing between services.&lt;/p&gt;

&lt;p&gt;By using Consul and an open-source NGINX module, we avoided the complexity and cost of NGINX Plus while still getting an efficient, scalable solution.&lt;/p&gt;

&lt;p&gt;For anyone currently running NGINX with PM2 or those looking for a simpler alternative to Kubernetes, NGINX with the HashiStack (Nomad + Consul) is a flexible, powerful, and cost-effective solution.&lt;br&gt;
It’s lightweight, robust, and much easier to manage at scale.&lt;/p&gt;

&lt;p&gt;If you're exploring service discovery for a similar setup, give it a try—it might be the neat solution you need.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Stay ahead of the curve! Subscribe for a weekly dose of insights on&lt;br&gt;
development, IT, operations, design, leadership and more.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/company/hexmos" rel="noopener noreferrer"&gt;&lt;br&gt;
    &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fkarma-src-x02msdf8-23.s3.ap-south-1.amazonaws.com%2Fproduct-menu-logo%2Flinkedin.gif"&gt;&lt;br&gt;
  &lt;/a&gt;&lt;br&gt;
  &lt;a href="https://twitter.com/HexmosTech" rel="noopener noreferrer"&gt;&lt;br&gt;
    &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fkarma-src-x02msdf8-23.s3.ap-south-1.amazonaws.com%2Fproduct-menu-logo%2Ftwitter.gif"&gt;&lt;br&gt;
  &lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>nginx</category>
      <category>webdev</category>
      <category>devops</category>
    </item>
    <item>
      <title>Why your API Doc Structure Matters: How these 5 Principles make it easy for your readers</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Sun, 15 Sep 2024 14:30:00 +0000</pubDate>
      <link>https://forem.com/hexmos/why-your-api-doc-structure-matters-how-these-5-principles-make-it-easy-for-your-readers-1gdp</link>
      <guid>https://forem.com/hexmos/why-your-api-doc-structure-matters-how-these-5-principles-make-it-easy-for-your-readers-1gdp</guid>
      <description>&lt;h2&gt;
  
  
  Common Issues with API Docs: And How I Managed to Solve Them
&lt;/h2&gt;

&lt;p&gt;Good API documentation is crucial for developers to understand and use your APIs effectively, and it can contribute to the success of the project.&lt;/p&gt;

&lt;p&gt;A project without good documentation is like a tool without proper instructions: no matter how good the tool is, if people can't understand it, it's useless. So we need a clear idea of the common mistakes people make when writing API documentation for their projects.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk23k75y3oa8ojs5jnn43.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk23k75y3oa8ojs5jnn43.png" alt="alt text" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Discovering the right API that I need
&lt;/h3&gt;

&lt;p&gt;When a developer enters your documentation, they will usually have a &lt;strong&gt;specific goal in mind&lt;/strong&gt;. They will be searching your documentation for the API they need.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Suppose your API docs include an endpoint for sending emails.&lt;br&gt;
If the developer searches for "email", the relevant endpoint should come up right away.&lt;/p&gt;

&lt;p&gt;If the API documentation has &lt;strong&gt;poor search functionality and an unorganized structure&lt;/strong&gt;, the developer can get exhausted trying to find the API they need.&lt;/p&gt;

&lt;p&gt;The solution to this problem is to implement proper search functionality along with a well-organized documentation structure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.divio.com/documentation-system/" rel="noopener noreferrer"&gt;Divio&lt;/a&gt; has a good structure that helps to easily grasp the contents.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmfqttt3cyl12p6qovss.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmfqttt3cyl12p6qovss.png" alt="alt text" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
This system divides documentation into four main types:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Tutorials (Learning oriented)&lt;/strong&gt;: Designed to teach; they help a developer achieve something from start to finish.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. How-To Guides (Problem oriented)&lt;/strong&gt;: Help solve a particular problem step by step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Explanations (Understanding oriented)&lt;/strong&gt;: Provide in-depth explanations of how things work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Reference (Information oriented)&lt;/strong&gt;: Serves as a technical resource when you need specific details.&lt;/p&gt;
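
&lt;p&gt;In practice, these four types often map directly onto a docs site's top-level navigation. An illustrative layout (directory names are hypothetical):&lt;/p&gt;

```plaintext
docs/
├── tutorials/      # learning oriented: end-to-end walkthroughs
├── how-to/         # problem oriented: step-by-step recipes
├── explanation/    # understanding oriented: concepts and background
└── reference/      # information oriented: endpoints, parameters, errors
```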
&lt;h3&gt;
  
  
  2. Figuring out how to use it
&lt;/h3&gt;

&lt;p&gt;When writing API documentation, we often forget that it is meant to be &lt;strong&gt;read by a developer who has no prior experience with the system&lt;/strong&gt;. They can be left wondering what an endpoint does and where it fits in the bigger picture.&lt;/p&gt;

&lt;p&gt;To solve such issues, we need a detailed, beginner-friendly description that demonstrates real-world use cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Suppose the following API is shown in the documentation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://dogapi.dog/api/v2/breeds/{id}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user might not immediately know what the API does or what value to insert for &lt;code&gt;{id}&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To resolve this, the documentation needs a clear description and a real-world example, like:&lt;/p&gt;

&lt;p&gt;The API (&lt;a href="https://dogapi.dog/api/v2/breeds" rel="noopener noreferrer"&gt;https://dogapi.dog/api/v2/breeds&lt;/a&gt;) provides detailed information about various dog breeds. This API allows users to retrieve specific data related to dog breeds, including attributes like name, physical characteristics, life expectancy, weight, temperament, and other related details.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://dogapi.dog/api/v2/breeds/68f47c5a-5115-47cd-9849-e45d3c378f12
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Procrastination in creating the code and getting it done
&lt;/h3&gt;

&lt;p&gt;Even if the documentation is well-defined, &lt;strong&gt;the developer may still feel some friction in integrating the API&lt;/strong&gt; into their code. As a solution to this problem, we can provide code snippets in various languages generated from the API.&lt;/p&gt;

&lt;p&gt;The developer can directly copy and paste these snippets into their code and get things running quickly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For example&lt;/strong&gt;&lt;br&gt;
A Python developer will prefer a ready-made request like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://dogapi.dog/api/v2/breeds/68f47c5a-5115-47cd-9849-e45d3c378f12&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Rather than a plain URL:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;https://dogapi.dog/api/v2/breeds/68f47c5a-5115-47cd-9849-e45d3c378f12&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Handling Errors and Authentication
&lt;/h3&gt;

&lt;p&gt;Imagine a new developer using your API &lt;strong&gt;encounters an error or faces an issue related to authentication&lt;/strong&gt;. The first place they would look is your documentation, to see the possible errors and the solutions that can be applied.&lt;/p&gt;

&lt;p&gt;But if the documentation doesn't have such information, then the &lt;strong&gt;whole debugging process can become painful&lt;/strong&gt;. This can even lead to the user ultimately dropping off from your API and searching for easier alternative solutions.&lt;/p&gt;

&lt;p&gt;We can solve this problem by listing the common errors, showing detailed troubleshooting steps, and linking to instructions for obtaining authentication tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For example:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Imagine a developer trying to access a protected API endpoint, but faces errors.&lt;/p&gt;

&lt;p&gt;The API docs should demonstrate how to use the authentication token, like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; GET &lt;span class="s2"&gt;"https://httpbin.org/bearer"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_ACCESS_TOKEN"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And show the possible errors like so.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Error Code&lt;/th&gt;
&lt;th&gt;Error Message&lt;/th&gt;
&lt;th&gt;Troubleshooting Tip&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;401&lt;/td&gt;
&lt;td&gt;Missing Authorization Header&lt;/td&gt;
&lt;td&gt;Ensure you include the &lt;code&gt;Authorization&lt;/code&gt; header with the Bearer Token.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;403&lt;/td&gt;
&lt;td&gt;Invalid or Expired Token&lt;/td&gt;
&lt;td&gt;Verify that the token is valid and has not expired.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;500&lt;/td&gt;
&lt;td&gt;Internal Server Error&lt;/td&gt;
&lt;td&gt;Try again later, or contact support if the issue persists.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
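
&lt;p&gt;The same table can be mirrored in client code, so that failures surface the documented tip directly. A minimal sketch (the helper names here are hypothetical, not part of any real SDK):&lt;/p&gt;

```python
import requests

# Troubleshooting tips mirroring the error table in the docs
TROUBLESHOOTING = {
    401: "Ensure you include the Authorization header with the Bearer Token.",
    403: "Verify that the token is valid and has not expired.",
    500: "Try again later, or contact support if the issue persists.",
}

def explain_error(status_code: int) -> str:
    """Return the documented troubleshooting tip for a known error code."""
    return TROUBLESHOOTING.get(status_code, "See the API reference for details.")

def get_bearer(token: str) -> dict:
    """Call the protected endpoint; raise with a helpful tip on failure."""
    response = requests.get(
        "https://httpbin.org/bearer",
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    if response.status_code != 200:
        raise RuntimeError(
            f"{response.status_code}: {explain_error(response.status_code)}"
        )
    return response.json()
```

&lt;p&gt;With this in place, a failed call raises an error message that tells the developer what to check, instead of a bare status code.&lt;/p&gt;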

&lt;h2&gt;
  
  
  How Mainstream API Docs Structure Information for the Readers: A Case Study
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Mintlify
&lt;/h3&gt;

&lt;p&gt;For our Mintlify example, let's visit the &lt;a href="https://infisical.com/docs/api-reference/endpoints/identities/create" rel="noopener noreferrer"&gt;infisical&lt;/a&gt; documentation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqxfe6h1yi527ymk1hqcb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqxfe6h1yi527ymk1hqcb.png" width="800" height="328"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Suppose I want to look for an &lt;strong&gt;API for creating a client secret&lt;/strong&gt;. There are two ways I can access it: I can go through the Auth-related categories and find it, or I can use the search bar and search for "Client secret". It's pretty straightforward to get the APIs we need.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuyttjr28dvhxln7cnfqv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuyttjr28dvhxln7cnfqv.png" width="800" height="512"&gt;&lt;/a&gt;&lt;br&gt;
Now that we've found the API we need, the next step is to understand it. Details are provided on &lt;strong&gt;Authorizations, Path parameters, Body, and Response&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Now that we understand the API, the next thing to do is implement it. We have code snippets in various languages (cURL, Python, JavaScript, etc.) as well as sample responses to see what the output looks like.&lt;/p&gt;

&lt;h3&gt;
  
  
  Readme
&lt;/h3&gt;

&lt;p&gt;Let's explore the Readme documentation.&lt;br&gt;
We will be visiting &lt;a href="https://docs.withpersona.com/reference/list-all-api-keys" rel="noopener noreferrer"&gt;persona&lt;/a&gt; documentation for this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F75e14qv6m4g5pntwz4b6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F75e14qv6m4g5pntwz4b6.png" alt="image" width="800" height="486"&gt;&lt;/a&gt;&lt;br&gt;
The UI consists of a sidebar and a topbar. The topbar shows &lt;strong&gt;OpenAPI Spec&lt;/strong&gt;, a standard format for defining RESTful APIs that allows for clear documentation and interaction with the API. This also means the documentation is synced with the OpenAPI definition, ensuring the details in the documentation are always up to date.&lt;/p&gt;

&lt;p&gt;There is an &lt;strong&gt;API Changelog&lt;/strong&gt; as well, so every change is tracked.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgmdv9vc60xhwjmni08yz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgmdv9vc60xhwjmni08yz.png" alt="image" width="800" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Suppose we want to use the &lt;strong&gt;Create Authorization API&lt;/strong&gt;. I can either go to the OAuth section or use the search bar to find the API.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnd1nojnlolqh2wjdfpko.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnd1nojnlolqh2wjdfpko.png" alt="image" width="800" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To understand how the API works, I can utilize the "Try it" button to run the API myself and observe what the output looks like.&lt;/p&gt;

&lt;p&gt;For deeper understanding, we have sections like &lt;strong&gt;Form-Data, Headers, and Responses&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We can also observe an Alert, which informs the user about the prerequisites.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fern
&lt;/h3&gt;

&lt;p&gt;Finally, let's check the Fern documentation.&lt;br&gt;
We will be visiting &lt;a href="https://docs.vellum.ai/api-reference/overview/getting-started" rel="noopener noreferrer"&gt;Vellum&lt;/a&gt; documentation.&lt;/p&gt;

&lt;p&gt;We will try searching for &lt;strong&gt;Execute Prompt API&lt;/strong&gt;. This API can be similarly accessed via the search bar or the sidebar.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcr4doraecsnhqrzrvq1o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcr4doraecsnhqrzrvq1o.png" alt="image" width="800" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For understanding the API, we have a play button to try the APIs ourselves, as well as descriptions of the request and response.&lt;/p&gt;

&lt;p&gt;Based on our learnings, we now know the features these API documentation platforms use. Let's go through them one by one.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Key features that improve Developer Engagement
&lt;/h2&gt;

&lt;p&gt;Read the &lt;a href="https://journal.hexmos.com/doc-structure-principles/" rel="noopener noreferrer"&gt;remaining article&lt;/a&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>openapi</category>
      <category>documentation</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Faster, Easier Deployments: How We Simplified Our Infrastructure with Nomad in 15 Hours (Goodbye, Kubernetes!)</title>
      <dc:creator>Athreya aka Maneshwar</dc:creator>
      <pubDate>Sun, 11 Aug 2024 17:36:23 +0000</pubDate>
      <link>https://forem.com/hexmos/faster-easier-deployments-how-we-simplified-our-infrastructure-with-nomad-in-15-hours-goodbye-kubernetes-38oi</link>
      <guid>https://forem.com/hexmos/faster-easier-deployments-how-we-simplified-our-infrastructure-with-nomad-in-15-hours-goodbye-kubernetes-38oi</guid>
      <description>&lt;p&gt;We're a lean team of 10 managing a relatively simple infrastructure.&lt;/p&gt;

&lt;p&gt;Our setup includes a Kubernetes v1.24.4 master, two v1.25.1 nodes, and a self-hosted GitLab for version control.&lt;/p&gt;

&lt;p&gt;Our CI/CD pipeline relies on &lt;a href="https://docs.gitlab.com/runner/" rel="noopener noreferrer"&gt;GitLab runners&lt;/a&gt; to build images and deploy Docker containers using Kubernetes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kubernetes: The Goliath We Couldn't Tame
&lt;/h2&gt;

&lt;p&gt;For a while, our infrastructure was running smoothly with eight services deployed across our Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;However, a few months ago, we hit a snag.&lt;/p&gt;

&lt;p&gt;Nodes became unreachable, deployments stalled, and pods started crashing.&lt;/p&gt;

&lt;p&gt;This instability was diverting our energy into fixing things up instead of producing.&lt;/p&gt;

&lt;p&gt;As a temporary fix, we decided to host four of our deployments on a separate node using Docker.&lt;/p&gt;

&lt;p&gt;This provided some stability, but the underlying issue persisted: the nodes continued to go offline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Drowning in YAML: Our Kubernetes Nightmare
&lt;/h2&gt;

&lt;p&gt;Maintaining Kubernetes for our small team felt manageable at first.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://helm.sh/" rel="noopener noreferrer"&gt;Helm&lt;/a&gt; templates streamlined our deployments, but downtime issues persisted.&lt;/p&gt;

&lt;p&gt;Maintaining Kubernetes for a small team like ours was becoming increasingly burdensome.&lt;/p&gt;

&lt;p&gt;The constant troubleshooting and firefighting were taking a toll on our productivity.&lt;/p&gt;

&lt;p&gt;It felt like we were spending more time managing the infrastructure than building our product.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://github.com/hashicorp/nomad" rel="noopener noreferrer"&gt;Nomad&lt;/a&gt;: The Oasis in Our Infrastructure Desert
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fol3evo49f7kiegxvn83f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fol3evo49f7kiegxvn83f.png" alt="image"&gt;&lt;/a&gt;&lt;br&gt;
(Source: &lt;a href="https://www.hashicorp.com/solutions/modern-application-delivery" rel="noopener noreferrer"&gt;HashiCorp&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;After careful consideration, we decided to explore Nomad.&lt;/p&gt;

&lt;p&gt;Large organizations like &lt;a href="https://blog.cloudflare.com/how-we-use-hashicorp-nomad/" rel="noopener noreferrer"&gt;Cloudflare&lt;/a&gt; and &lt;a href="https://tech.trivago.com/post/2019-01-25-nomadourexperiencesandbestpractices" rel="noopener noreferrer"&gt;Trivago&lt;/a&gt; have been running Nomad for some time now.&lt;/p&gt;

&lt;p&gt;Their endorsement of Nomad's &lt;a href="https://developer.hashicorp.com/nomad/docs/nomad-vs-kubernetes#simplicity" rel="noopener noreferrer"&gt;simplicity&lt;/a&gt; and reduced operational overhead increased our interest.&lt;/p&gt;

&lt;p&gt;We were eager to explore how this platform could benefit our smaller team.&lt;/p&gt;

&lt;p&gt;What we wanted was something like a supercharged &lt;a href="https://github.com/Unitech/pm2" rel="noopener noreferrer"&gt;pm2&lt;/a&gt;, but multi-node with good support for both docker and binaries.&lt;/p&gt;

&lt;p&gt;Nomad seemed like the perfect fit, so we decided to give it a shot.&lt;/p&gt;

&lt;p&gt;Our next steps involve diving into Nomad's capabilities and potentially conducting a proof of concept to assess its suitability for our environment. Before that, let's understand the key differences between Kubernetes and Nomad.&lt;/p&gt;
&lt;h2&gt;
  
  
  Kubernetes vs. Nomad: A Clash of Titans
&lt;/h2&gt;

&lt;p&gt;Kubernetes is like a big, complex city. It’s got everything, but it can be hard to navigate and manage. It is a collection of many services &amp;amp; tools working together.&lt;/p&gt;

&lt;p&gt;Nomad is more like a cozy town. It’s simpler, easier to get around, and still gets the job done. It is a single binary: run it, and you're done.&lt;/p&gt;

&lt;p&gt;Here's the breakdown:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Simplicity: Nomad wins. It's like comparing a bike to a car.

&lt;ul&gt;
&lt;li&gt;Nomad is simple to learn and use, making it accessible even for teams with limited experience in managing complex container orchestration systems.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Flexibility: Nomad can handle different types of workloads, while Kubernetes is mostly focused on containers.

&lt;ul&gt;
&lt;li&gt;It can schedule, deploy, and manage various workload types (virtualized, containerized, and standalone applications), including both Windows- and Linux-based containers.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Consistency: Nomad is the same everywhere, while Kubernetes can vary depending on how you set it up.

&lt;ul&gt;
&lt;li&gt;Setting up Kubernetes in production can be complex and resource-intensive. Kubernetes, while versatile, can introduce complexities due to its various deployment options (like minikube, kubeadm, and k3s). This can lead to differences in setup and management.&lt;/li&gt;
&lt;li&gt;Nomad stands out with its consistent deployment process. As a single binary, it works seamlessly across different environments, from development to production. This unified approach simplifies management and reduces potential issues.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Scaling: Both can handle big jobs, but Nomad seems to do it smoother.&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;According to Kubernetes &lt;a href="https://kubernetes.io/docs/setup/best-practices/cluster-large/" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;, it supports clusters with up to 5,000 nodes and 300,000 total containers. However, as the environment grows, managing the system at scale can become increasingly challenging.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Nomad is built for scale. It effortlessly handles tens of thousands of nodes and can be spread across multiple regions without added complexity. Rigorous testing, including the 2 million &lt;a href="https://www.hashicorp.com/c2m" rel="noopener noreferrer"&gt;container challenge&lt;/a&gt;, has proven Nomad's ability to manage massive workloads efficiently.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, if you want something easy to manage and versatile, Nomad might be your jam.&lt;/p&gt;

&lt;p&gt;But if you need a full-featured platform and don't mind the complexity, Kubernetes could be the way to go.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgd6slamx0zfduni1v512.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgd6slamx0zfduni1v512.jpg" alt="k8s-vs-nomad"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Image source: This image is extracted from a presentation included in the Internet Archive article "&lt;a href="https://www.hashicorp.com/resources/migrating-kubernetes-nomad-consul-doubling-pipeline-speed-gitlab-internet-archive" rel="noopener noreferrer"&gt;Migrating from Kubernetes to Nomad and Consul&lt;/a&gt;" which details the migration process.&lt;/p&gt;

&lt;p&gt;Nomad has a pretty solid UI for real-time insights.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpa332ktp0y9smn51dx9o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpa332ktp0y9smn51dx9o.png" alt="image 348"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  From YAML to HCL: A Developer-Friendly Shift
&lt;/h2&gt;

&lt;p&gt;Thankfully, getting started with Nomad was a breeze.&lt;/p&gt;

&lt;p&gt;In Kubernetes, we define deployments via &lt;a href="https://kubernetes.io/docs/concepts/overview/working-with-objects/" rel="noopener noreferrer"&gt;YAML&lt;/a&gt; files. Whereas Nomad leverages &lt;a href="https://github.com/hashicorp/hcl" rel="noopener noreferrer"&gt;HCL&lt;/a&gt; (HashiCorp Configuration Language).&lt;/p&gt;

&lt;p&gt;This language, implemented in &lt;a href="https://go.dev/" rel="noopener noreferrer"&gt;Go&lt;/a&gt;, proved to be surprisingly user-friendly for developers like us.&lt;/p&gt;

&lt;p&gt;We were able to quickly grasp the syntax and start building deployments with minimal learning curve.&lt;/p&gt;
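&lt;p&gt;As a rough illustration (a hypothetical snippet, not one of our real jobs), HCL trades YAML's indentation sensitivity for braces and simple &lt;code&gt;key = value&lt;/code&gt; pairs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# A minimal Nomad job in HCL: blocks use braces, attributes use key = value
job "hello-world" {
  datacenters = ["dc1"]

  group "web" {
    task "server" {
      driver = "docker"

      config {
        image = "nginx:latest"
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;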
&lt;h2&gt;
  
  
  Nomad 101: Your Quick Start Guide
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. Installing Nomad for Production
&lt;/h3&gt;

&lt;p&gt;We set up Nomad with a single server and multiple clients.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy38916j7w5i0tt4s5gkq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy38916j7w5i0tt4s5gkq.png" alt="Group 538"&gt;&lt;/a&gt;&lt;br&gt;
If you wish to deploy services on the server machine as well, you can register it as a client.&lt;br&gt;
This way, one node can act as a coordinator/leader while also deploying services, making efficient use of compute resources.&lt;/p&gt;
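&lt;p&gt;For example, a single node can take on both roles with an agent configuration along these lines (a minimal sketch; file paths and values will vary with your setup):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# /etc/nomad.d/nomad.hcl -- sketch of a node acting as both server and client
server {
  enabled          = true
  bootstrap_expect = 1   # expect a single server in this cluster
}

client {
  enabled = true         # also accept job allocations on this node
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;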

&lt;p&gt;The official HashiCorp &lt;a href="https://developer.hashicorp.com/nomad/tutorials/enterprise/production-deployment-guide-vm-with-consul#production-deployment-guide-vm-with-consul" rel="noopener noreferrer"&gt;installation&lt;/a&gt; instructions for Linux are clear and easy to follow.&lt;/p&gt;

&lt;p&gt;We found them to be much simpler than the sometimes cumbersome setup process experienced with Kubernetes years earlier.&lt;/p&gt;

&lt;p&gt;Note: To run Docker-based services with Nomad, Docker must be installed on the client nodes.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Exposing Nomad Ports
&lt;/h3&gt;

&lt;p&gt;To ensure your Nomad servers and clients can communicate (and join the cluster), you need to open specific ports on your network.&lt;/p&gt;
&lt;h4&gt;
  
  
  Create Security Group
&lt;/h4&gt;

&lt;p&gt;A security group acts as a firewall, controlling incoming and outgoing traffic for your instances.&lt;/p&gt;

&lt;p&gt;In our AWS setup, we'll create a security group to allow access to the following ports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Port &lt;code&gt;4648&lt;/code&gt;: Used for &lt;em&gt;&lt;a href="https://www.serf.io/" rel="noopener noreferrer"&gt;Serf communication&lt;/a&gt; between Nomad servers&lt;/em&gt;. Both a TCP and a UDP listener are exposed on this port.
This gossip protocol is responsible for cluster membership and failure detection.&lt;/li&gt;
&lt;li&gt;Port &lt;code&gt;4647&lt;/code&gt;: Dedicated to &lt;em&gt;Nomad &lt;a href="https://developer.hashicorp.com/nomad/docs/configuration#rpc-1" rel="noopener noreferrer"&gt;RPC communication&lt;/a&gt; between nodes&lt;/em&gt;. Nomad clients connect to servers on this port, so it must be reachable by all client nodes, including clients behind a NAT gateway.
This is the primary channel for task scheduling and status updates.&lt;/li&gt;
&lt;li&gt;Port &lt;code&gt;4646&lt;/code&gt;: Enables &lt;em&gt;&lt;a href="https://developer.hashicorp.com/nomad/tutorials/transport-security/security-concepts#:~:text=for%20its%20traffic.-,HTTP,-%2D%20Used%20to%20communicate" rel="noopener noreferrer"&gt;HTTP&lt;/a&gt; API access&lt;/em&gt;, used by the CLI and the web UI to communicate with Nomad agents.
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g06ls9pkpkeapqpdjbm.png" alt="Group 534"&gt;
&lt;/li&gt;
&lt;/ul&gt;
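&lt;p&gt;These are Nomad's default port assignments; if you need different ones, they can be overridden in the agent configuration (shown here with the defaults, for reference):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Agent configuration: Nomad's default port assignments
ports {
  http = 4646  # HTTP API (CLI and web UI)
  rpc  = 4647  # RPC between clients and servers
  serf = 4648  # Serf gossip between servers (TCP and UDP)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;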
&lt;h4&gt;
  
  
  Attach Security Group to Server and Client Nodes
&lt;/h4&gt;

&lt;p&gt;You need to associate the created security group with both your Nomad server and client instances.&lt;/p&gt;

&lt;p&gt;By doing this, you allow incoming traffic on the specified ports, enabling communication within the Nomad cluster.&lt;br&gt;
EC2 -&amp;gt; Instance -&amp;gt; Actions -&amp;gt; Security -&amp;gt; Change Security Groups&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbs7e9or74spr2dme2w4t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbs7e9or74spr2dme2w4t.png" alt="Group 537"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrvffr6b0yf096a7ydcp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrvffr6b0yf096a7ydcp.png" alt="Group 535"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Joining Forces: Connecting Clients to Your Nomad Cluster
&lt;/h3&gt;

&lt;p&gt;Assuming Nomad is already installed on your client machines, follow these steps to join them to the Nomad server:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Edit &lt;code&gt;client.hcl&lt;/code&gt;&lt;/strong&gt;
Open the client.hcl file located at &lt;code&gt;/etc/nomad.d/client.hcl&lt;/code&gt; on each client machine.
Add the following configuration, replacing &lt;code&gt;152.21.12.71&lt;/code&gt; with the actual IP address of your Nomad server:
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;client &lt;span class="o"&gt;{&lt;/span&gt;
  enabled &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true
  &lt;/span&gt;server_join &lt;span class="o"&gt;{&lt;/span&gt;
    retry_join &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"152.21.12.71:4647"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol start="2"&gt;
&lt;li&gt;
&lt;strong&gt;Restart Nomad&lt;/strong&gt;
To apply the configuration changes, restart the Nomad service on each client machine:
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart nomad
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol start="3"&gt;
&lt;li&gt;
&lt;strong&gt;Verify Client Status&lt;/strong&gt;
Check the Nomad logs on each client to ensure they have joined the cluster successfully:
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; nomad &lt;span class="nt"&gt;-f&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status nomad
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol start="4"&gt;
&lt;li&gt;
&lt;strong&gt;Look for messages indicating that the client has connected to the server.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;node4 nomad[1441379]: &lt;span class="o"&gt;[&lt;/span&gt;INFO]  agent.joiner: starting retry &lt;span class="nb"&gt;join&lt;/span&gt;: &lt;span class="nv"&gt;servers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;152.21.12.71:4647
node4 nomad[1441379]: &lt;span class="o"&gt;[&lt;/span&gt;INFO]  agent.joiner: retry &lt;span class="nb"&gt;join &lt;/span&gt;completed: &lt;span class="nv"&gt;initial_servers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;agent_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;client
node4 nomad[1441379]: &lt;span class="o"&gt;[&lt;/span&gt;INFO]  client: node registration &lt;span class="nb"&gt;complete&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Your Nomad clients will now join the cluster and start receiving job allocations from the server.&lt;/p&gt;

&lt;ol start="5"&gt;
&lt;li&gt;
&lt;strong&gt;Listing the node status&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nomad node status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fypz29gk41ap3vfof6t06.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fypz29gk41ap3vfof6t06.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can view the clients and servers with their status from the dashboard, which is pretty great.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx5y0qodiwzoogj6vtnf1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx5y0qodiwzoogj6vtnf1.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  4. Deploying with Ease: Your First Nomad Job
&lt;/h3&gt;

&lt;p&gt;Before deploying a job, let's create an HCL configuration file to define the &lt;a href="https://developer.hashicorp.com/nomad/docs/job-specification" rel="noopener noreferrer"&gt;job's specifications&lt;/a&gt;, including where it runs, how many instances to launch, and resource requirements.&lt;/p&gt;
&lt;h4&gt;
  
  
  Generating an Example Job File
&lt;/h4&gt;

&lt;p&gt;Nomad offers a handy command to generate a basic HCL file as a starting point.&lt;br&gt;
&lt;code&gt;nomad init&lt;/code&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Example Job File (fw-parse/fw-parse.hcl)
&lt;/h4&gt;

&lt;p&gt;This example showcases a job that runs a Node.js backend service within a Docker container.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://developer.hashicorp.com/nomad/docs/configuration#datacenter" rel="noopener noreferrer"&gt;Datacenters&lt;/a&gt;&lt;/strong&gt;:&lt;br&gt;
A datacenter in Nomad represents a region or availability zone where your jobs will run. It's essential to specify the datacenter for your job.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;job &lt;span class="s2"&gt;"fw-parse"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  datacenters &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"dc1"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="c"&gt;# ... rest of the configuration&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, &lt;code&gt;dc1&lt;/code&gt; is the name of the datacenter. You can have multiple datacenters in your Nomad cluster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://developer.hashicorp.com/nomad/docs/job-specification/group" rel="noopener noreferrer"&gt;Groups&lt;/a&gt;&lt;/strong&gt;:&lt;br&gt;
A group is a logical container for one or more tasks. You can use groups to organize your job and define constraints.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;job &lt;span class="s2"&gt;"fw-parse"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  datacenters &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"dc1"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;

  group &lt;span class="s2"&gt;"servers"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;# Specifies the number of instances of this group that should be running.&lt;/span&gt;
    &lt;span class="c"&gt;# Use this to scale or parallelize your job.&lt;/span&gt;
    &lt;span class="c"&gt;# This can be omitted and it will default to 1.&lt;/span&gt;
    count &lt;span class="o"&gt;=&lt;/span&gt; 1

    network &lt;span class="o"&gt;{&lt;/span&gt;
      port &lt;span class="s2"&gt;"http"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        static &lt;span class="o"&gt;=&lt;/span&gt; 1337  &lt;span class="c"&gt;# Use the internal port your Docker container listens on&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="c"&gt;# ... task configuration&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="c"&gt;# ... rest of the configuration&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
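&lt;p&gt;A side note on the static port above: Nomad can also allocate a dynamic host port and map it to the container's internal port, which avoids port conflicts when several instances land on one node (a sketch, not part of our actual job file):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Alternative: dynamic host port mapped to the container's internal port
network {
  port "http" {
    to = 1337  # the container listens on 1337; Nomad picks a free host port
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;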



&lt;p&gt;&lt;strong&gt;&lt;a href="https://developer.hashicorp.com/nomad/docs/job-specification/task" rel="noopener noreferrer"&gt;Task&lt;/a&gt; Configuration: Docker Driver&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For our Node.js backend, we'll use the &lt;a href="https://developer.hashicorp.com/nomad/docs/drivers/docker" rel="noopener noreferrer"&gt;Docker driver&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;job &lt;span class="s2"&gt;"fw-parse"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  datacenters &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"dc1"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;

  group &lt;span class="s2"&gt;"servers"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    task &lt;span class="s2"&gt;"fw-parse"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
      driver &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"docker"&lt;/span&gt;
      config &lt;span class="o"&gt;{&lt;/span&gt;
        image &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"gitlab.site:5050/fw-parse/main:latest"&lt;/span&gt;
        &lt;span class="c"&gt;# ... other config options&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;image&lt;/code&gt; attribute specifies the Docker image to use.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;fw-parse/fw-parse.hcl&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;job &lt;span class="s2"&gt;"fw-parse"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="c"&gt;# Specifies the datacenter where this job should be run&lt;/span&gt;
  datacenters &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"dc1"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;

  group &lt;span class="s2"&gt;"servers"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;# Specifies the number of instances of this group that should be running.&lt;/span&gt;
    &lt;span class="c"&gt;# Use this to scale or parallelize your job.&lt;/span&gt;
    &lt;span class="c"&gt;# This can be omitted and it will default to 1.&lt;/span&gt;
    count &lt;span class="o"&gt;=&lt;/span&gt; 1

    network &lt;span class="o"&gt;{&lt;/span&gt;
      port &lt;span class="s2"&gt;"http"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        static &lt;span class="o"&gt;=&lt;/span&gt; 1337  &lt;span class="c"&gt;# Use the internal port your Docker container listens on&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;# This particular task starts a simple web server within a Docker container&lt;/span&gt;
    task &lt;span class="s2"&gt;"fw-parse"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
      driver &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"docker"&lt;/span&gt;

      config &lt;span class="o"&gt;{&lt;/span&gt;
        image &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"gitlab.hexmos.site:5050/backend/fw-parse/main:latest"&lt;/span&gt;
        auth &lt;span class="o"&gt;{&lt;/span&gt;
          username &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"gitlab@hexmos.site"&lt;/span&gt;
          password &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"password"&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        ports &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"http"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;

      &lt;span class="c"&gt;# Specify the maximum resources required to run the task&lt;/span&gt;
      resources &lt;span class="o"&gt;{&lt;/span&gt;
        cpu     &lt;span class="o"&gt;=&lt;/span&gt; 500  &lt;span class="c"&gt;# CPU in MHz&lt;/span&gt;
        memory  &lt;span class="o"&gt;=&lt;/span&gt; 256  &lt;span class="c"&gt;# memory in MB&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;More about &lt;a href="https://developer.hashicorp.com/nomad/docs/drivers/docker#allocating-ports" rel="noopener noreferrer"&gt;ports&lt;/a&gt;:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If your service exposes multiple ports, you can configure Nomad to handle them by specifying each port in the network block and mapping them to the tasks. This is particularly useful when your application requires several different ports for various services, such as HTTP, HTTPS, and an admin interface.&lt;/p&gt;

&lt;p&gt;For example, if your service exposes ports &lt;code&gt;1337&lt;/code&gt;, &lt;code&gt;1339&lt;/code&gt;, and &lt;code&gt;3000&lt;/code&gt;, you would define them in your Nomad job file like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;job &lt;span class="s2"&gt;"fw-parse"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  datacenters &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"dc1"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;

  group &lt;span class="s2"&gt;"servers"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    network &lt;span class="o"&gt;{&lt;/span&gt;
      port &lt;span class="s2"&gt;"http"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        static &lt;span class="o"&gt;=&lt;/span&gt; 1337  &lt;span class="c"&gt;# Use the internal port your Docker container listens on&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
      port &lt;span class="s2"&gt;"https"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        static &lt;span class="o"&gt;=&lt;/span&gt; 1339  &lt;span class="c"&gt;# Another port used by your container&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
      port &lt;span class="s2"&gt;"admin"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        static &lt;span class="o"&gt;=&lt;/span&gt; 3000  &lt;span class="c"&gt;# Port for an admin interface or additional service&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    task &lt;span class="s2"&gt;"fw-parse"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
      driver &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"docker"&lt;/span&gt;

      config &lt;span class="o"&gt;{&lt;/span&gt;
        image &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"gitlab.hexmos.site:5050/backend/fw-parse/main:latest"&lt;/span&gt;
        auth &lt;span class="o"&gt;{&lt;/span&gt;
          username &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"gitlab@hexmos.site"&lt;/span&gt;
          password &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"password"&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        ports &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"http"&lt;/span&gt;, &lt;span class="s2"&gt;"https"&lt;/span&gt;, &lt;span class="s2"&gt;"admin"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;  &lt;span class="c"&gt;# List all the ports that the container will use&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Running and Stopping Deployments
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Running a Deployment&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Nomad offers two ways to kickstart your deployment:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Command Line: Use the &lt;code&gt;nomad job run&lt;/code&gt; command.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nomad job run fw-parse/fwparse.hcl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Nomad UI: For a more visual experience, leverage Nomad's intuitive user interface.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1np0eio2t8n67r6ujz83.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1np0eio2t8n67r6ujz83.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Stopping a Deployment
&lt;/h4&gt;

&lt;p&gt;When you need to halt a running deployment, simply execute &lt;code&gt;nomad job stop &amp;lt;job-name&amp;gt;&lt;/code&gt;, where &lt;code&gt;&amp;lt;job-name&amp;gt;&lt;/code&gt; is the name assigned to your deployment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nomad stop example
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb9zsotjsmfk5mrqokj2x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb9zsotjsmfk5mrqokj2x.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Extras
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Format HCL files: &lt;a href="https://github.com/fatih/hclfmt" rel="noopener noreferrer"&gt;hclfmt&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Config generator: &lt;a href="https://nomcfg.mrkaran.dev/" rel="noopener noreferrer"&gt;nomcfg&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Guides for getting started: &lt;a href="https://developer.hashicorp.com/nomad/tutorials" rel="noopener noreferrer"&gt;Nomad tutorial&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Our Roadmap: Expanding Nomad's Potential
&lt;/h2&gt;

&lt;p&gt;We've recently embarked on our Nomad journey, setting up a foundational environment with a single server and three clients.&lt;/p&gt;

&lt;p&gt;After a week of running a basic Docker-based Node.js backend, we're pleased with the initial results.&lt;/p&gt;

&lt;p&gt;Our vision for Nomad extends far beyond this initial proof of concept.&lt;/p&gt;

&lt;p&gt;We're eager to integrate &lt;a href="https://www.hashicorp.com/resources/nomad-ci-cd-developer-workflows-and-integrations?ajs_aid=6b0556e2-4a03-429d-b44a-a06895e1a67f&amp;amp;product_intent=nomad" rel="noopener noreferrer"&gt;GitLab CI/CD&lt;/a&gt; for streamlined deployments and explore the use of &lt;a href="https://www.hashicorp.com/products/vault" rel="noopener noreferrer"&gt;HashiCorp Vault&lt;/a&gt; for robust secret management.&lt;/p&gt;

&lt;p&gt;Given our team's shared ownership model, Nomad's developer-centric approach aligns perfectly with our goals of rapid iteration and efficient deployment.&lt;/p&gt;

&lt;p&gt;We anticipate Nomad to enhance our operational efficiency and are excited to share our experiences and learnings as we progress.&lt;/p&gt;

&lt;p&gt;We're open to feedback and suggestions from the Nomad community as we continue to refine our setup.&lt;/p&gt;

&lt;p&gt;If you have any recommendations based on your experiences, we'd love to hear them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Want to dive deeper into the Kubernetes world? Check out our other blog posts:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://journal.hexmos.com/kube-network/" rel="noopener noreferrer"&gt;Demystifying Kubernetes Port Forwarding&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://journal.hexmos.com/kubernetes-ports-overview-covering-expose-nodeport-targetport/" rel="noopener noreferrer"&gt;6 Kubernetes Ports: A Definitive Look - Expose, NodePort, TargetPort, &amp;amp; More&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://journal.hexmos.com/spotting-kube-failures/" rel="noopener noreferrer"&gt;Spotting Silent Pod Failures in Kubernetes with Grafana&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://journal.hexmos.com/install-nomad-in-production/" rel="noopener noreferrer"&gt;Subscribe&lt;/a&gt; for a weekly dose of insights on development, IT, operations, design, leadership and more.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>devops</category>
      <category>webdev</category>
      <category>beginners</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>I spent the last 6 months building LiveAPI Proxy: Here are 10 HARD-EARNED Engineering Lessons you can use now</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Sun, 09 Jun 2024 16:15:13 +0000</pubDate>
      <link>https://forem.com/hexmos/i-spent-the-last-6-months-building-liveapi-proxy-here-are-10-hard-earned-engineering-lessons-you-can-use-now-1kc6</link>
      <guid>https://forem.com/hexmos/i-spent-the-last-6-months-building-liveapi-proxy-here-are-10-hard-earned-engineering-lessons-you-can-use-now-1kc6</guid>
      <description>&lt;h2&gt;
  
  
  How LiveAPI Taught Me Some Important Engineering Lessons
&lt;/h2&gt;

&lt;p&gt;I have been working on a product named &lt;strong&gt;LiveAPI&lt;/strong&gt;. Let me give you an idea of what it does.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb04gon80u6pocm5d9nxx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb04gon80u6pocm5d9nxx.png" width="800" height="210"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above API doc is static: users can't execute requests or change things themselves.&lt;/p&gt;

&lt;p&gt;Static API docs like these often lose customer attention before the developers even try the APIs. &lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F23m29ri59x0wcl3jwgrs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F23m29ri59x0wcl3jwgrs.png" width="785" height="733"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above API doc uses LiveAPI: developers can execute the APIs instantly, right in their browser, so their attention is captured within the first 30 seconds of a visit.&lt;/p&gt;

&lt;p&gt;LiveAPI uses a WASM binary and a language core to execute the APIs. With these already built, we started testing against some httpbin URLs, and everything seemed fine.&lt;/p&gt;

&lt;p&gt;When we tried a GET request to &lt;a href="http://www.google.com" rel="noopener noreferrer"&gt;www.google.com&lt;/a&gt;, it failed.&lt;br&gt;
We investigated and found a &lt;strong&gt;CORS&lt;/strong&gt; error.&lt;/p&gt;

&lt;p&gt;A CORS error prevents the browser from making requests from one site to another.&lt;br&gt;
This capability is vital for us, because we are always requesting from one site (the API docs) to another (the target API URL).&lt;/p&gt;

&lt;p&gt;We thought about the issue for a while, and an idea popped up: &lt;strong&gt;how about using proxy servers&lt;/strong&gt;? This could solve the problem and get us back up and running. Let's see how proxy servers can help.&lt;/p&gt;

&lt;h2&gt;
  
  
  Learning about Proxies: Engineering a Solution for CORS-Free browser requests
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is a proxy server?
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flic0s3072k3vxx60ta7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flic0s3072k3vxx60ta7w.png" alt="alt text" width="800" height="602"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Consider this example.&lt;br&gt;
Here you can see two people, Alice and Bob, with a proxy in the middle.&lt;/p&gt;

&lt;p&gt;Alice asks the proxy to forward a message to Bob, and Bob does the same in reverse.&lt;br&gt;
The proxy acts as a middleman, passing information between the two.&lt;/p&gt;

&lt;p&gt;This is how proxy servers work.&lt;br&gt;
&lt;strong&gt;A proxy server acts as a middleman between a client and a server.&lt;/strong&gt; There are three parts: the client request, the proxy server, and the response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Client Request:&lt;/strong&gt; When you send a request to a website, the proxy server receives it first instead of the website.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Proxy Server:&lt;/strong&gt; The proxy server then forwards your request to the actual website. It’s like a middleman that handles the communication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt; The website responds to the proxy server, which then forwards the response back to you.&lt;/p&gt;

&lt;h3&gt;
  
  
  How Proxies aid with solving the CORS problem
&lt;/h3&gt;

&lt;p&gt;The proxy server makes the request to the target API on behalf of our LiveAPI tool. Because the cross-origin call now happens server-side, where CORS does not apply, and the proxy can attach permissive CORS headers to the response, the browser accepts it.&lt;/p&gt;
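&lt;p&gt;To make the idea concrete, here is a minimal sketch of such a proxy in Python. This is not LiveAPI's actual implementation: the &lt;code&gt;url&lt;/code&gt; query parameter is a hypothetical interface chosen for illustration.&lt;/p&gt;

```python
# Minimal CORS-bypassing proxy sketch: the browser calls this proxy,
# the proxy fetches the target URL server-side (no CORS there), and
# returns the body with permissive CORS headers attached.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
from urllib.request import urlopen

def cors_headers():
    # Headers the proxy adds so the browser accepts the response,
    # regardless of what the target API itself sends back.
    return {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type, Authorization",
    }

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Extract the target address from the ?url=... query parameter.
        target = parse_qs(urlparse(self.path).query).get("url", [None])[0]
        if target is None:
            self.send_error(400, "missing url parameter")
            return
        with urlopen(target) as upstream:  # fetch on the browser's behalf
            body = upstream.read()
        self.send_response(200)
        for name, value in cors_headers().items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)

# To run: HTTPServer(("localhost", 8080), ProxyHandler).serve_forever()
```

&lt;p&gt;The key point is the final step: the response reaches the browser with permissive CORS headers attached by the proxy, regardless of what the target API sends.&lt;/p&gt;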

&lt;h3&gt;
  
  
  Figuring out how to build a proxy server: The approach I took
&lt;/h3&gt;

&lt;p&gt;Now that we had an idea of what the solution looks like, we thought about which technologies to use.&lt;/p&gt;

&lt;p&gt;In our case, we already had an &lt;strong&gt;Apache2 server&lt;/strong&gt; up and running, and since Apache is widely used with extensive module support, we felt it was the better choice for building our proxy server.&lt;/p&gt;

&lt;h2&gt;
  
  
  Putting the Solution into Action: Building an Apache2 Proxy and Getting LiveAPI Working
&lt;/h2&gt;

&lt;p&gt;Continue reading &lt;a href="https://journal.hexmos.com/liveapi-engineering-lessons/" rel="noopener noreferrer"&gt;the article&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>proxy</category>
      <category>api</category>
      <category>apache2</category>
    </item>
    <item>
      <title>Bot Invasion To Automated Defense: My Journey With ML Deployment</title>
      <dc:creator>Athreya aka Maneshwar</dc:creator>
      <pubDate>Sun, 12 May 2024 15:25:15 +0000</pubDate>
      <link>https://forem.com/hexmos/bot-invasion-to-automated-defense-my-journey-with-ml-deployment-11f2</link>
      <guid>https://forem.com/hexmos/bot-invasion-to-automated-defense-my-journey-with-ml-deployment-11f2</guid>
      <description>&lt;p&gt;Remember the &lt;a href="https://dev.to/hexmos/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml-1f65"&gt;bot invasion&lt;/a&gt; of my newsletter? I fought back with ML, but building a fancy bot zapper wasn't in the budget. Instead, I deployed a cost-effective model to keep those pesky bots out. Here is how I did it!&lt;/p&gt;

&lt;p&gt;In my previous article, "&lt;a href="https://journal.hexmos.com/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml/?src=hn" rel="noopener noreferrer"&gt;Bots Invaded My Newsletter. Here's How I Fought Back with ML&lt;/a&gt;" I shared my experience building a bot-signup detector using machine learning to tackle a surge of unwanted bot signups on my newsletter.&lt;/p&gt;

&lt;p&gt;While it wasn't the most sophisticated solution, it was a valuable learning experience.&lt;/p&gt;

&lt;p&gt;A few people on &lt;a href="https://www.reddit.com/r/learnmachinelearning/comments/1bscd2c/bots_invaded_my_newsletter_heres_how_i_fought" rel="noopener noreferrer"&gt;Reddit&lt;/a&gt; were curious about deploying their own models.&lt;/p&gt;

&lt;p&gt;Since I'm still learning ML deployment, I wanted to share what I've found so far about an easy, cost-effective approach, and perhaps hear about even more feasible approaches from you.&lt;/p&gt;

&lt;p&gt;Even though I'm just starting out, hopefully this can be helpful for others who are in the same boat!&lt;/p&gt;

&lt;h2&gt;
  
  
  Backstory Of Identifying The Enemy
&lt;/h2&gt;

&lt;p&gt;After the &lt;a href="https://journal.hexmos.com/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml/#the-bot-invasion" rel="noopener noreferrer"&gt;&lt;strong&gt;bot invasion&lt;/strong&gt;&lt;/a&gt; of my &lt;a href="https://hexmos.com/365reasons?ref=journal.hexmos.com" rel="noopener noreferrer"&gt;newsletter&lt;/a&gt;, I knew what the bots looked like (name, email).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdium04aa7r04grn0l1vv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdium04aa7r04grn0l1vv.png" alt="Bots" width="800" height="120"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Their data, collected from signups, was stored in the DB; I used it to build the &lt;a href="https://journal.hexmos.com/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml/#how-did-i-create-the-dataset" rel="noopener noreferrer"&gt;dataset&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For my weapon of choice, I picked a &lt;a href="https://journal.hexmos.com/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml/#the-model-bert-plus-a-bit-more" rel="noopener noreferrer"&gt;BERT&lt;/a&gt; transformer.&lt;/p&gt;

&lt;p&gt;I &lt;a href="https://journal.hexmos.com/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml/#what-are-the-numbers-for-training-and-testing" rel="noopener noreferrer"&gt;trained&lt;/a&gt; it with a bunch of emails (144 to be exact) to learn the difference between human and bot names and emails; and it was &lt;a href="https://journal.hexmos.com/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml/#is-it-working" rel="noopener noreferrer"&gt;working&lt;/a&gt; most of the time.&lt;/p&gt;

&lt;p&gt;So it was all ready; I just needed to deploy it live and use it in the signup process.&lt;/p&gt;

&lt;h2&gt;
  
  
  Preparing My Weapon To Fire Against The Enemy
&lt;/h2&gt;

&lt;p&gt;Now that I had my trusty bot detector trained, it was time to figure out how to load it into the battlefield (deployment).&lt;/p&gt;

&lt;p&gt;Here's what I learned about deploying a machine learning model in a simple and cost-effective way.&lt;/p&gt;

&lt;p&gt;Integrating the bot detector with my newsletter signup process was an exciting adventure.&lt;/p&gt;

&lt;p&gt;It felt like discovering a whole new system, just like writing the final line of code that unlocks a new functionality!&lt;/p&gt;

&lt;p&gt;Previously I had a &lt;a href="https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)?ref=journal.hexmos.com" rel="noopener noreferrer"&gt;transformer&lt;/a&gt; that took the name and email as input and returned a boolean indicating whether the signup was a bot.&lt;/p&gt;

&lt;p&gt;For deployment, we didn't want to spin up a new VM with a server constantly listening for calls, nor did we want to bolt this onto our existing services. So we went with serverless deployment on AWS &lt;a href="https://docs.aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;Lambda&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can't Use Lambda Straight Away
&lt;/h3&gt;

&lt;p&gt;When I tried to deploy the transformer model, I realized I couldn't use Lambda the usual way, because it needs packages like &lt;a href="https://pypi.org/project/transformers/" rel="noopener noreferrer"&gt;transformers&lt;/a&gt;, &lt;a href="https://scikit-learn.org/stable/?ref=journal.hexmos.com" rel="noopener noreferrer"&gt;scikit-learn&lt;/a&gt;, and many more.&lt;/p&gt;

&lt;p&gt;The alternative was to run &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/images-create.html" rel="noopener noreferrer"&gt;Lambda with Docker&lt;/a&gt; container images.&lt;/p&gt;

&lt;p&gt;This was a good exploration: you build a Docker image with all the dependencies you need pre-installed, and host it as a Lambda function.&lt;/p&gt;

&lt;h3&gt;
  
  
  Docker Container Images (But Too Big!)
&lt;/h3&gt;

&lt;p&gt;I loaded my previously built transformer (a 419 MB &lt;code&gt;.bin&lt;/code&gt; file) and installed transformers, scikit-learn, and many other packages; by the time I built the image, it was 9.2 GB!&lt;/p&gt;

&lt;p&gt;Clearly that was a horrible solution for such a basic problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  Logistic Regression - Smaller and Faster
&lt;/h3&gt;

&lt;p&gt;I moved on to &lt;a href="https://en.wikipedia.org/wiki/Logistic_regression" rel="noopener noreferrer"&gt;Logistic Regression&lt;/a&gt;, which took less time to train and prepare than the transformer, and the crazy thing was that the binary came out to just 27 KB :D&lt;/p&gt;
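&lt;p&gt;As an illustration (not the article's actual training code, and with toy samples standing in for the real signup dataset), the lighter approach can be as simple as a scikit-learn pipeline:&lt;/p&gt;

```python
# A minimal sketch of the lighter approach: character n-gram features
# feeding scikit-learn's LogisticRegression. The samples below are
# illustrative toy data, not the real dataset.
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

signups = [
    "alice alice@example.com",                                                   # human
    "bob bob@gmail.com",                                                         # human
    "Withdrawing 32 911 Dollars https://forms.example/x spam@ecocryptolab.com",  # bot
    "Claim your prize now http://spam.example win@spamdomain.com",               # bot
]
labels = [0, 0, 1, 1]  # 0 = human, 1 = bot

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(),
)
model.fit(signups, labels)

# Unlike the 419 MB BERT binary, the fitted pipeline pickles to kilobytes.
blob = pickle.dumps(model)
print(len(blob), "bytes")
```

&lt;p&gt;The whole pipeline (vectorizer plus classifier) serializes with &lt;code&gt;pickle&lt;/code&gt;, which is what makes the tiny model file possible.&lt;/p&gt;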

&lt;p&gt;I added the deps and the logic, and voila: an 820 MB Docker image.&lt;/p&gt;

&lt;p&gt;So I went ahead and pushed the Docker image to &lt;a href="https://aws.amazon.com/ecr/" rel="noopener noreferrer"&gt;Elastic Container Registry&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;ECR is like &lt;a href="https://hub.docker.com/" rel="noopener noreferrer"&gt;Docker Hub&lt;/a&gt;: a place to store the Docker images I build.&lt;/p&gt;

&lt;p&gt;Then I created a Lambda function that uses the Docker image from the ECR repo I created earlier. The cannon was loaded with shot and powder; I just had to figure out the firing mechanism.&lt;/p&gt;

&lt;h2&gt;
  
  
  Firing The Weapon
&lt;/h2&gt;

&lt;p&gt;The initial plan was to trigger the Lambda function directly using the &lt;code&gt;AWS CLI&lt;/code&gt; or &lt;code&gt;Boto3&lt;/code&gt; library.&lt;/p&gt;

&lt;p&gt;However, I needed a more user-friendly way to activate the bot detector from the frontend.&lt;/p&gt;

&lt;p&gt;This led me to explore API Gateway.&lt;/p&gt;

&lt;p&gt;It's a good service that allows you to create a public endpoint (like a trigger point) that accepts requests and forwards them to your Lambda function behind the scenes.&lt;/p&gt;

&lt;p&gt;This was exactly what I needed – a way to invoke the Lambda function using a simple API call.&lt;/p&gt;

&lt;p&gt;Integrating the API Gateway with my signup form wasn't completely smooth sailing.&lt;/p&gt;

&lt;p&gt;I encountered some challenges mapping the data received by the API Gateway to the format expected by the Lambda function.&lt;/p&gt;

&lt;p&gt;Luckily, &lt;a href="https://aws.amazon.com/cloudwatch/" rel="noopener noreferrer"&gt;CloudWatch&lt;/a&gt; logs came to the rescue.&lt;/p&gt;

&lt;p&gt;With its detailed logs, I could easily debug the issue and get everything working seamlessly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Killing The Enemy
&lt;/h2&gt;

&lt;p&gt;Now, whenever someone signs up for my newsletter, the API in my frontend form automatically triggers the Lambda function. Here's the magic that happens behind the scenes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The signup data is sent to the Lambda function.&lt;/li&gt;
&lt;li&gt;The function analyzes the data using the trained model to identify potential bots.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If a bot is detected, the function automatically blocks the subscriber using Listmonk's built-in &lt;a href="https://listmonk.app/docs/apis/subscribers/#put-apisubscriberssubscriber_idblocklist" rel="noopener noreferrer"&gt;Block API&lt;/a&gt;.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/botsignups.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/botsignups.png" alt="Alt text" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Finally, the function sends a notification to my Discord channel, keeping me informed about signup activity (including any blocked bots).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
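&lt;p&gt;Condensed into code, the handler looks roughly like this. This is a sketch: the field names, the Listmonk host, and the Discord webhook URL are placeholders, and &lt;code&gt;load_model&lt;/code&gt; stands in for unpickling the classifier.&lt;/p&gt;

```python
# Sketch of the Lambda handler behind the signup form: classify the
# signup, block bots via Listmonk, and notify Discord.
import json
import pickle
import urllib.request

def load_model():
    # Placeholder: load the pickled classifier shipped in the image.
    with open("dt_model_file.pkl", "rb") as f:
        return pickle.load(f)

def blocklist_url(base, subscriber_id):
    # Listmonk's built-in block endpoint: PUT /api/subscribers/{id}/blocklist
    return f"{base}/api/subscribers/{subscriber_id}/blocklist"

def lambda_handler(event, context):
    body = json.loads(event["body"])            # 1. signup data from API Gateway
    name, email = body["name"], body["email"]

    model = load_model()                        # 2. classify with the trained model
    is_bot = bool(model.predict([f"{name} {email}"])[0])

    if is_bot:                                  # 3. block the subscriber in Listmonk
        req = urllib.request.Request(
            blocklist_url("https://listmonk.example.com", body["id"]),
            method="PUT",
        )
        urllib.request.urlopen(req)

    message = {"content": f"signup: {email} bot={is_bot}"}  # 4. Discord notification
    urllib.request.urlopen(urllib.request.Request(
        "https://discord.com/api/webhooks/PLACEHOLDER",
        data=json.dumps(message).encode(),
        headers={"Content-Type": "application/json"},
    ))
    return {"statusCode": 200, "body": json.dumps({"bot": is_bot})}
```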

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4mbiq65ho7c9iz5uukih.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4mbiq65ho7c9iz5uukih.png" alt="Not bot" width="627" height="97"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhuzc020dc332g57z4i7c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhuzc020dc332g57z4i7c.png" alt="bot" width="800" height="85"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With this system in place, I've successfully automated bot detection and eliminated the need for manual intervention.&lt;/p&gt;

&lt;p&gt;This feels like a victory in the fight against newsletter bot signups.&lt;/p&gt;

&lt;h2&gt;
  
  
  Continue reading &lt;a href="https://journal.hexmos.com/from-idea-to-deployment-deploy-ml-model-aws-lambda-apigateway/#how-to-setup-a-bot-detector-for-yourself" rel="noopener noreferrer"&gt;How to setup a bot detector for yourself&lt;/a&gt; here.
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;This journey of deploying a machine learning model to fight newsletter bots has been a valuable learning experience.&lt;/p&gt;

&lt;p&gt;In my previous article, "&lt;a href="https://journal.hexmos.com/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml/" rel="noopener noreferrer"&gt;Bots Invaded My Newsletter. Here's How I Fought Back with ML ⚔️&lt;/a&gt;" I covered building the bot detector model.&lt;/p&gt;

&lt;p&gt;Now, we've explored the deployment side – a crucial step for putting your model to practical use.&lt;/p&gt;

&lt;p&gt;Here are some resources to help you get started on your own AI/ML adventure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Building a Logistic Regression Model (ipynb file): &lt;a href="https://github.com/lovestaco/bot-spam-signup-to-newsletter" rel="noopener noreferrer"&gt;Logistic_regression.ipynb&lt;/a&gt; (This file demonstrates how I built the simpler and more efficient logistic regression model.)&lt;/li&gt;
&lt;li&gt;Lightweight Model File (23kb): &lt;a href="https://github.com/lovestaco/bot_detect_lambda/blob/main/dt_model_file.pkl" rel="noopener noreferrer"&gt;dt_model_file.pkl&lt;/a&gt; (Feel free to download and use this pre-trained model for basic bot detection in your own newsletter signup process.)&lt;/li&gt;
&lt;li&gt;Lambda Function Code Repository: &lt;a href="https://github.com/lovestaco/bot_detect_lambda" rel="noopener noreferrer"&gt;bot_detect_lambda&lt;/a&gt; (This repository contains the code for integrating the bot detection model with AWS Lambda for a serverless deployment.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Spread the Knowledge!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Share this blog post with your friends who are interested in getting started with AI and machine learning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Want to learn more or connect with me?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Reddit: &lt;a href="https://www.reddit.com/user/athreyaaaa" rel="noopener noreferrer"&gt;athreyaaaa&lt;/a&gt;&lt;br&gt;
LinkedIn: &lt;a href="https://www.linkedin.com/in/maneshwar-athreya" rel="noopener noreferrer"&gt;maneshwar-athreya&lt;/a&gt;&lt;/p&gt;


&lt;div class="ltag__user ltag__user__id__1002302"&gt;
    &lt;a href="/lovestaco" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1002302%2F5233b7df-6ee3-46b2-b8d7-1fafe103e8a3.jpg" alt="lovestaco image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/lovestaco"&gt;Athreya aka Maneshwar&lt;/a&gt;Follow
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/lovestaco"&gt;Software Dev | Technical Writer | 400k+ Reads | Learning, building, improving, writing :)&lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


</description>
      <category>webdev</category>
      <category>machinelearning</category>
      <category>ai</category>
      <category>aws</category>
    </item>
    <item>
      <title>Credits got depleted and can't create AI images anymore? How to run your own image generator for free</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Sun, 07 Apr 2024 14:32:04 +0000</pubDate>
      <link>https://forem.com/hexmos/credits-got-depleted-and-cant-create-ai-images-anymore-how-to-run-your-own-image-generator-for-free-4a0e</link>
      <guid>https://forem.com/hexmos/credits-got-depleted-and-cant-create-ai-images-anymore-how-to-run-your-own-image-generator-for-free-4a0e</guid>
      <description>&lt;h3&gt;
  
  
  Struggles of Running Image Generators: Limited Use and Frustration 
&lt;/h3&gt;

&lt;p&gt;New innovations like AI image generators are gaining popularity. With tools like &lt;a href="https://www.midjourney.com/home" rel="noopener noreferrer"&gt;Midjourney&lt;/a&gt; and &lt;a href="https://openai.com/dall-e-3" rel="noopener noreferrer"&gt;DALL-E 3&lt;/a&gt;, people can generate amazing images from their imagination alone.&lt;/p&gt;

&lt;p&gt;The one thing all these tools have in common is &lt;strong&gt;pricing&lt;/strong&gt;: if you plan on &lt;strong&gt;generating images for free&lt;/strong&gt;, you will be limited either by a &lt;strong&gt;free trial&lt;/strong&gt; or by a &lt;strong&gt;limited credit system&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For example, if you are a designer and you want to create image assets, mockups, or design ideas, there would be a lot of iterations involved, which is expensive.&lt;/p&gt;

&lt;p&gt;Sooner or later, after some usage, &lt;strong&gt;your credits will become depleted&lt;/strong&gt; and you will &lt;strong&gt;no longer be able to generate images&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What technology does Image Generation use?
&lt;/h3&gt;

&lt;p&gt;The technology that these image generators use is called &lt;a href="https://en.wikipedia.org/wiki/Stable_Diffusion" rel="noopener noreferrer"&gt;Stable Diffusion&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Stable Diffusion &lt;strong&gt;requires a lot of resources&lt;/strong&gt;, making it expensive to offer to everyone for free.&lt;/p&gt;

&lt;p&gt;So I thought: why not run Stable Diffusion using my own resources? That way, I won't need to deal with payments, and I can create as many images as I want.&lt;/p&gt;

&lt;p&gt;By the end of this article, I will demonstrate how I ran an AI image generator on my own, &lt;strong&gt;absolutely free&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://journal-wa6509js.s3.ap-south-1.amazonaws.com/90d6fdad4b5f77e98d1bcdf413a53f4f803b41851e155b023455c72a80a4fa4e.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbh3ewarkarlkzqvd2nf2.png" alt="Demo Image" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here I gave a simple prompt and got a high-quality generated image.&lt;/p&gt;

&lt;p&gt;Let's see how the stable diffusion technology works, so we can start running it on our own.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the Tech Behind AI Image Generators Works
&lt;/h3&gt;

&lt;p&gt;To learn how AI image generation works, we need to know about &lt;strong&gt;Stable Diffusion&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
We can think of stable diffusion like the physical diffusion we learned about in school.&lt;/p&gt;

&lt;p&gt;Take a &lt;strong&gt;clear beaker of water&lt;/strong&gt; and add a few drops of dye. The dye diffuses throughout the liquid until it reaches a state of &lt;strong&gt;equilibrium&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://journal-wa6509js.s3.ap-south-1.amazonaws.com/60b78a48c83c0b6d4951f96759a56b05c45f0e7af4f5a2928dedc67cb9d43721.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fri7gaxztihuxk0pp2nrx.png" alt="Diffusion image" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now let's apply the same concept to real stable diffusion.&lt;/p&gt;

&lt;p&gt;For training a stable diffusion model, we will start with a process called &lt;strong&gt;Forward diffusion&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://journal-wa6509js.s3.ap-south-1.amazonaws.com/f893aa44072c93e9b19a5585623e017dd5cb6a22c152e98528bf97113b8f6bd5.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsufjecnbvzsam7qybfnv.png" alt="Forward Diffusion" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In forward diffusion, we take an image and add noise to it.&lt;br&gt;&lt;br&gt;
For those who don't know, think of noise like the static you see when the TV loses its signal.&lt;/p&gt;

&lt;p&gt;The type of noise added here is called &lt;strong&gt;Gaussian noise&lt;/strong&gt;. The process is repeated many times, so there are multiple layers of noise.&lt;/p&gt;
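&lt;p&gt;A toy numeric sketch of this noising process (under assumptions: a tiny 4x4 array stands in for a real image, and the per-step noise level is a fixed value rather than a learned schedule):&lt;/p&gt;

```python
# Forward diffusion in miniature: repeatedly mix fresh Gaussian noise
# into a small array until the original signal is drowned in static.
import numpy as np

rng = np.random.default_rng(0)
image = np.linspace(0.0, 1.0, 16).reshape(4, 4)  # stand-in for a real image

def forward_step(x, beta=0.1):
    # Variance-preserving mix: shrink the signal slightly and add
    # Gaussian noise, so repeated steps converge to pure noise.
    noise = rng.standard_normal(x.shape)
    return np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise

x = image
for _ in range(200):
    x = forward_step(x)

# By now the array is statistically indistinguishable from pure noise.
print(round(float(x.mean()), 2), round(float(x.std()), 2))
```

&lt;p&gt;After enough steps, nothing of the original image survives; training teaches the model to undo this process one step at a time.&lt;/p&gt;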

&lt;p&gt;&lt;a href="https://journal-wa6509js.s3.ap-south-1.amazonaws.com/cfe18a55b7cf7e863489ee414915fea47f55d11f5864ee06f1069a4df9772992.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdld2pou4pgdzux1l826x.png" alt="Reverse Diffusion" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After forward diffusion, &lt;strong&gt;Reverse Diffusion&lt;/strong&gt; is performed: the Gaussian noise is removed step by step until we recover the original image.&lt;/p&gt;

&lt;p&gt;The model gradually starts learning how to predict images from noise.&lt;/p&gt;

&lt;p&gt;This forward and reverse diffusion is performed on millions of images to properly train the model.&lt;/p&gt;

&lt;p&gt;Once training is done, we can start from random noise, and the model will predict an image.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://journal-wa6509js.s3.ap-south-1.amazonaws.com/74f8f5536987ef97f34c1d65d1c81f05d64130ceeef1ee359052c4454872c2db.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzfu19y3xg630uotb07sk.png" alt="Noise guessing" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You may wonder: &lt;strong&gt;How is the model able to generate images from text prompts?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Images used for training have alt text associated with them, describing what each image is about.&lt;br&gt;&lt;br&gt;
This way, each image is linked to text, and the model gradually learns the relationship between the text and the images.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://journal-wa6509js.s3.ap-south-1.amazonaws.com/ba2956cf7ed3e9b0c10568becccfcfaa7dad7cdd9ac16a64fe7e4486b9b7b604.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx7fue6coeqkf35pnud0m.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is how stable diffusion models work in a simple way. Now let's get the stable diffusion running on your own.&lt;/p&gt;

&lt;h3&gt;
  
  
  Let's get stable diffusion running
&lt;/h3&gt;

&lt;p&gt;Continue reading the &lt;a href="https://journal.hexmos.com/run-your-image-generator/" rel="noopener noreferrer"&gt;rest of the article&lt;/a&gt; to see how we can get stable diffusion running on your machine!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aiart</category>
      <category>stablediffusion</category>
      <category>lora</category>
    </item>
    <item>
      <title>Bots Invaded My Newsletter. Here's How I Fought Back with ML ⚔️ 🤖</title>
      <dc:creator>Athreya aka Maneshwar</dc:creator>
      <pubDate>Sun, 31 Mar 2024 15:36:20 +0000</pubDate>
      <link>https://forem.com/hexmos/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml-1f65</link>
      <guid>https://forem.com/hexmos/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml-1f65</guid>
      <description>&lt;p&gt;My newsletter was overrun by bots! I decided to try a machine-learning solution. It was my first ML experiment and I learned a lot. Want to know how I built a bot detector and gained some ML skills along the way?&lt;/p&gt;

&lt;h2&gt;
  
  
  The bot invasion
&lt;/h2&gt;

&lt;p&gt;I have a free newsletter that encourages you to &lt;a href="https://hexmos.com/365reasons" rel="noopener noreferrer"&gt;read daily&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;There are 100+ subscribers, and recently a lot of bots have signed up too.&lt;/p&gt;

&lt;p&gt;Bots sign up to market their own products, newsletters, etc.&lt;br&gt;
They usually put a link in the name field along with the message they want to convey.&lt;/p&gt;

&lt;p&gt;Ex: &lt;br&gt;
   &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg7d6ux838yuj14yyrq5f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg7d6ux838yuj14yyrq5f.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Email&lt;/th&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="mailto:watcher2112@ecocryptolab.com"&gt;watcher2112@ecocryptolab.com&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;🔶 Withdrawing 32 911 Dollars. Gо tо withdrаwаl &amp;gt;&amp;gt;&amp;gt; &lt;a href="https://forms.yandex.com/cloud/65e6228102848f1a71edd8c9?hs=0cebe66d8b7ba4d5f0159e88dd472e8b&amp;amp;" rel="noopener noreferrer"&gt;https://forms.yandex.com/cloud/65e6228102848f1a71edd8c9?hs=0cebe66d8b7ba4d5f0159e88dd472e8b&amp;amp;&lt;/a&gt; 🔶&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These spammy signups aren't just annoying, they're a real headache! &lt;/p&gt;

&lt;p&gt;I was tired of manually blocking bot emails and worrying about how they might hurt my email reputation. &lt;/p&gt;

&lt;p&gt;I know I have numerous options to filter out the bot signups by embedding traditional methods like &lt;a href="https://www.cloudflare.com/en-gb/learning/bots/how-captchas-work/" rel="noopener noreferrer"&gt;CAPTCHA&lt;/a&gt;, &lt;a href="https://www.campaignmonitor.com/resources/glossary/double-opt-in/" rel="noopener noreferrer"&gt;Double Opt-in&lt;/a&gt;, &lt;a href="https://andrewwoods.net/blog/2018/name-validation-regex/" rel="noopener noreferrer"&gt;Regex patterns&lt;/a&gt;, or &lt;a href="https://stackoverflow.com/a/36227377/13791562" rel="noopener noreferrer"&gt;Honeypot Fields&lt;/a&gt;  in the form.&lt;/p&gt;

&lt;p&gt;At the same time, I felt I wasn't adapting to newer technology, particularly machine learning; I wanted to get started but had no clue where to begin.&lt;/p&gt;

&lt;p&gt;Then one of my mentors, &lt;a href="https://www.linkedin.com/in/shrijith-venkatramana-32741b2b0" rel="noopener noreferrer"&gt;Shrijith&lt;/a&gt;, suggested building a solution to the bot-signup problem using ML.&lt;/p&gt;

&lt;p&gt;I felt this was the right experiment I could begin with to learn ML.&lt;/p&gt;

&lt;p&gt;And so, I am here with my first machine learning experiment!&lt;/p&gt;
&lt;h2&gt;
  
  
  What should I expect from the model?
&lt;/h2&gt;

&lt;p&gt;Picture this: You've built a website with a newsletter signup form. You want to make sure your subscribers are real people, not automated bots. &lt;/p&gt;

&lt;p&gt;So, you implement a bot detection system. But what does it mean when someone tells you their system is "95% accurate"?&lt;/p&gt;

&lt;p&gt;Let me break it down:&lt;/p&gt;
&lt;h3&gt;
  
  
  Catching true bots
&lt;/h3&gt;

&lt;p&gt;Imagine 100 signups are actually bots. &lt;/p&gt;

&lt;p&gt;A 95% sensitive system should correctly identify 95 of them as bots. &lt;/p&gt;

&lt;p&gt;5 bots might slip through the cracks and be mistaken for humans (false negatives), which is tolerable and not a big deal.&lt;/p&gt;
&lt;h3&gt;
  
  
  Not mistaking humans
&lt;/h3&gt;

&lt;p&gt;Now, imagine 100 signups are from real humans.  &lt;/p&gt;

&lt;p&gt;A 95% specific system should accurately recognize 95 of these as humans. &lt;/p&gt;

&lt;p&gt;However, 5 people could be mistakenly labeled as bots (false positives). This is much worse: a real person gets ignored, which means a lost potential business lead (and, frankly, an injustice).&lt;/p&gt;
&lt;h3&gt;
  
  
  The formulas
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Sensitivity&lt;/strong&gt; = True Bots Detected / (True Bots Detected + Bots Missed)&lt;br&gt;
The system's ability to find true bots.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Specificity&lt;/strong&gt; = True Humans Detected / (True Humans Detected + Humans Mistaken for Bots)&lt;br&gt;
The system's ability to avoid mislabeling real people.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accuracy&lt;/strong&gt; = (True Bots Detected + True Humans Detected) / (Total Signups)&lt;br&gt;
Overall correctness, but it can be misleading if your dataset has way more of one type (bots or humans).&lt;/p&gt;

&lt;p&gt;If all three are 1.0, then congrats, you have a perfect model. &lt;/p&gt;
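&lt;p&gt;The three formulas can be checked with a few lines of plain Python. This is a standalone sketch using the example counts from the scenarios above, not code from my notebook:&lt;/p&gt;

```python
# Example counts from the 95%-sensitive / 95%-specific scenarios above.
true_bots_detected = 95      # true positives
bots_missed = 5              # false negatives
true_humans_detected = 95    # true negatives
humans_mistaken = 5          # false positives

sensitivity = true_bots_detected / (true_bots_detected + bots_missed)
specificity = true_humans_detected / (true_humans_detected + humans_mistaken)
accuracy = (true_bots_detected + true_humans_detected) / (
    true_bots_detected + bots_missed + true_humans_detected + humans_mistaken
)

print(sensitivity, specificity, accuracy)  # 0.95 0.95 0.95
```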
&lt;h2&gt;
  
  
  One big mental mistake
&lt;/h2&gt;

&lt;p&gt;I used to underestimate the &lt;em&gt;power of data&lt;/em&gt; when training machine learning models. &lt;br&gt;
  I assumed that algorithms would simply "figure it out" no matter what I fed them. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgiphy.com%2Fgifs%2Fufc-mma-3ohBVaelMOlKmNsLwk" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgiphy.com%2Fgifs%2Fufc-mma-3ohBVaelMOlKmNsLwk"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With a small dataset of 103 signups (only 12 bots!), I threw it at &lt;a href="https://en.wikipedia.org/wiki/Decision_tree_learning" rel="noopener noreferrer"&gt;Decision Trees&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Logistic_regression" rel="noopener noreferrer"&gt;Logistic Regression&lt;/a&gt;, and &lt;a href="https://en.wikipedia.org/wiki/Random_forest" rel="noopener noreferrer"&gt;Random Forest&lt;/a&gt; models. &lt;/p&gt;

&lt;p&gt;I got an initial &lt;strong&gt;accuracy&lt;/strong&gt; of &lt;strong&gt;77%&lt;/strong&gt;, but that was a classic &lt;a href="https://en.wikipedia.org/wiki/Overfitting" rel="noopener noreferrer"&gt;overfitting&lt;/a&gt; trap. &lt;br&gt;
My models were just memorizing the training data, making them useless for real-world scenarios.&lt;/p&gt;

&lt;p&gt;Frustrated, I jumped to &lt;a href="https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)" rel="noopener noreferrer"&gt;transformers&lt;/a&gt;, thinking the solution lay in fancy algorithms.&lt;br&gt;
I got a slight boost to &lt;strong&gt;87.4%&lt;/strong&gt;, which was a relief but still left much to be desired. &lt;/p&gt;

&lt;p&gt;To hit that &lt;strong&gt;90%&lt;/strong&gt; target, I needed to debug. Using a &lt;a href="https://en.wikipedia.org/wiki/Confusion_matrix" rel="noopener noreferrer"&gt;confusion matrix&lt;/a&gt;, I finally saw the light: &lt;strong&gt;it was the data, not the models&lt;/strong&gt;, holding me back.&lt;/p&gt;

&lt;p&gt;I used &lt;a href="https://imbalanced-learn.org/stable/references/generated/imblearn.over_sampling.SMOTE.html" rel="noopener noreferrer"&gt;SMOTE&lt;/a&gt; to balance my dataset with equal numbers of bot and human signups (90 humans and 90 bots), and my accuracy shot up to &lt;strong&gt;94%&lt;/strong&gt;! &lt;/p&gt;
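&lt;p&gt;Under the hood, SMOTE synthesizes new minority samples by interpolating between real minority samples. Here is a toy illustration of that idea in plain Python; this is not the imblearn API, and the 2-feature bot vectors are made up:&lt;/p&gt;

```python
import random

random.seed(42)

# Hypothetical 2-feature vectors for the minority class (bots).
bots = [[0.9, 0.1], [0.8, 0.2], [0.95, 0.05]]
target_count = 90  # match the majority (human) class size

synthetic = []
while len(bots) + len(synthetic) < target_count:
    a, b = random.sample(bots, 2)  # pick two real minority samples
    t = random.random()            # interpolation factor in [0, 1)
    # New sample lies on the line segment between a and b.
    synthetic.append([a[i] + t * (b[i] - a[i]) for i in range(len(a))])

balanced_bots = bots + synthetic
print(len(balanced_bots))  # 90
```

Real SMOTE restricts the second sample to the k nearest minority neighbours, but the interpolation step is the same.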

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgiphy.com%2Fgifs%2Fbreaking-bad-aaron-paul-yeah-science-QC7UQbxq89MnL9r6AN" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgiphy.com%2Fgifs%2Fbreaking-bad-aaron-paul-yeah-science-QC7UQbxq89MnL9r6AN"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Long story short: How I got to the 100% accuracy bot detector
&lt;/h2&gt;

&lt;p&gt;Note: my training data has 180 rows.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Preparation
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Imports for models and packages&lt;/li&gt;
&lt;li&gt;Extracting data from my newsletter database to CSV.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  2. Creating the Dataset
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;I can't feed the raw database data directly to &lt;a href="https://huggingface.co/google-bert/bert-base-cased" rel="noopener noreferrer"&gt;BERT&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;I need a &lt;a href="https://huggingface.co/docs/transformers/v4.39.2/en/model_doc/bert#transformers.BertTokenizer" rel="noopener noreferrer"&gt;tokenizer&lt;/a&gt; to break the text into tokens (units BERT can work with).
  I created a class (&lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=78kRG9Qkzi3W&amp;amp;line=4&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;NewsletterCollectionDataset&lt;/code&gt;&lt;/a&gt;) to do both.&lt;/li&gt;
&lt;/ol&gt;
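&lt;p&gt;To picture what a tokenizer produces: it maps each token to an integer ID and pads shorter inputs, with an attention mask marking which positions are real tokens. A toy version with a made-up vocabulary (the real BERT tokenizer uses WordPiece subwords and special [CLS]/[SEP] tokens):&lt;/p&gt;

```python
# Hypothetical vocabulary; real BERT vocabularies have ~30k entries.
vocab = {"[PAD]": 0, "athreya": 1, "c": 2, "athreyac4@gmail.com": 3}

def toy_encode(text, max_len):
    """Map whitespace tokens to IDs, then pad to max_len with an attention mask."""
    ids = [vocab.get(tok, 0) for tok in text.split()]
    mask = [1] * len(ids)           # 1 marks a real token
    padding = max_len - len(ids)
    return ids + [0] * padding, mask + [0] * padding

input_ids, attention_mask = toy_encode("athreya c athreyac4@gmail.com", 6)
print(input_ids)       # [1, 2, 3, 0, 0, 0]
print(attention_mask)  # [1, 1, 1, 0, 0, 0]
```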
&lt;h3&gt;
  
  
  3. Splitting data and loading
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;I split the data into three sets

&lt;ul&gt;
&lt;li&gt;training (to teach the model) 144 rows, &lt;/li&gt;
&lt;li&gt;validation (to check progress during training) 18 rows, and &lt;/li&gt;
&lt;li&gt;testing (for final evaluation) 18 rows.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Then a function(&lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=0RK3hJDCzS73&amp;amp;line=7&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;create_data_loader&lt;/code&gt;&lt;/a&gt;) turns each of those data splits into '&lt;a href="https://pytorch.org/docs/stable/data.html" rel="noopener noreferrer"&gt;DataLoaders&lt;/a&gt;' which the model can easily train on. &lt;/li&gt;
&lt;/ol&gt;
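&lt;p&gt;The split sizes follow from an 80/20 split of the 180 rows, then halving the 20% hold-out:&lt;/p&gt;

```python
total = 180
train = int(total * 0.8)        # 144 rows for training
holdout = total - train         # 36 rows held out
validation = holdout // 2       # 18 rows for validation
test = holdout - validation     # 18 rows for final testing
print(train, validation, test)  # 144 18 18
```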
&lt;h3&gt;
  
  
  4. Building the model
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=YXy7V_kJ8g7H&amp;amp;line=13&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;BotClassifier&lt;/code&gt;&lt;/a&gt; is a class where my bot-detection model is defined. &lt;/li&gt;
&lt;li&gt;It's based on BERT but adds some extra layers:

&lt;ul&gt;
&lt;li&gt;bert: Loads the pre-trained BERT model.&lt;/li&gt;
&lt;li&gt;drop: A technique called 'dropout' to help prevent overfitting (the model memorizing too much about the training data).&lt;/li&gt;
&lt;li&gt;out: A final output layer to turn BERT's output into the prediction (bot or human).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Setting up the Model: 

&lt;ul&gt;
&lt;li&gt;Get the model ready to run. &lt;/li&gt;
&lt;li&gt;Specify an optimizer (&lt;a href="https://keras.io/api/optimizers/adamw/" rel="noopener noreferrer"&gt;AdamW&lt;/a&gt;). &lt;/li&gt;
&lt;li&gt;Learning rate scheduler for how the model's learning changes over time.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
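&lt;p&gt;The 'dropout' idea is simple: during training, randomly zero out some activations so the model can't lean too hard on any single feature; at evaluation time, everything passes through unchanged. A minimal plain-Python sketch of the concept (not the &lt;code&gt;nn.Dropout&lt;/code&gt; implementation):&lt;/p&gt;

```python
import random

def dropout(activations, p, training):
    """Zero each value with probability p during training; pass through at eval time."""
    if not training:
        return list(activations)
    # Survivors are scaled by 1/(1-p) so the expected sum stays the same.
    return [0.0 if random.random() < p else a / (1 - p) for a in activations]

random.seed(0)
print(dropout([1.0, 1.0, 1.0, 1.0], p=0.5, training=True))   # some values zeroed, rest become 2.0
print(dropout([1.0, 1.0, 1.0, 1.0], p=0.5, training=False))  # [1.0, 1.0, 1.0, 1.0]
```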
&lt;h3&gt;
  
  
  5. Training the model
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Setting the model to training mode.&lt;/li&gt;
&lt;li&gt;Looping through the data and updating the model's knowledge (backpropagation) using the optimizer.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  6. The main function
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;A function(&lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=gN6kZep2aAsY&amp;amp;line=1&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;start_training&lt;/code&gt;&lt;/a&gt;) that contains the training loop.&lt;/li&gt;
&lt;li&gt;This loop runs for a fixed number of epochs (training cycles). 
   In each epoch:

&lt;ul&gt;
&lt;li&gt;The model &lt;a href="https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.train" rel="noopener noreferrer"&gt;trains&lt;/a&gt; on the training data.&lt;/li&gt;
&lt;li&gt;The model is &lt;a href="https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval" rel="noopener noreferrer"&gt;evaluated&lt;/a&gt; on the validation data.&lt;/li&gt;
&lt;li&gt;The best-performing model is saved.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
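&lt;p&gt;Stripped of the PyTorch details, the train-validate-save cycle described above boils down to a loop that keeps the best checkpoint. A skeleton sketch, where the train/evaluate callables are dummies for illustration:&lt;/p&gt;

```python
def start_training(epochs, train_one_epoch, evaluate):
    """Run the training loop and keep the best-scoring model state."""
    best_accuracy = 0.0
    best_state = None
    for epoch in range(epochs):
        state = train_one_epoch(epoch)   # teach the model on the training data
        accuracy = evaluate(state)       # check it on the validation data
        if accuracy > best_accuracy:     # save only improvements
            best_accuracy = accuracy
            best_state = state
    return best_state, best_accuracy

# Dummy stand-ins: validation accuracy improves, then dips.
scores = [0.70, 0.85, 0.94, 0.92]
state, acc = start_training(
    4,
    lambda e: f"epoch-{e}",
    lambda s: scores[int(s.split("-")[1])],
)
print(state, acc)  # epoch-2 0.94
```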
&lt;h3&gt;
  
  
  7. Final Evaluation
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;A function(&lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=G9PtqPEFenkA&amp;amp;line=4&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;evaluate_model&lt;/code&gt;&lt;/a&gt;) to get the truest sense of how well the model has learned to generalize to unseen data.&lt;/li&gt;
&lt;li&gt;After training was done, I evaluated the model one more time on the held-out testing set (&lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=rSd4b1Mx6OR4&amp;amp;line=5&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;test_data_loader&lt;/code&gt;&lt;/a&gt;). &lt;/li&gt;
&lt;li&gt;A function(&lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=G_yenye14IKI&amp;amp;line=14&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;test_with_single_data&lt;/code&gt;&lt;/a&gt;) to test out a signup on the model.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Now I will explain the above stages as simply as possible.&lt;/p&gt;
&lt;h2&gt;
  
  
  How did I create the Dataset?
&lt;/h2&gt;

&lt;p&gt;My newsletter signup form has only name and email fields, with no verification. &lt;br&gt;
I then manually blocklisted all the bots in the email service &lt;a href="https://listmonk.app/" rel="noopener noreferrer"&gt;Listmonk&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So the raw data was in the following format: &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Email&lt;/th&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Available&lt;/td&gt;
&lt;td&gt;&lt;a href="mailto:athreyac4@gmail.com"&gt;athreyac4@gmail.com&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;athreya c&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Blocklisted&lt;/td&gt;
&lt;td&gt;&lt;a href="mailto:watcher2112@ecocryptolab.com"&gt;watcher2112@ecocryptolab.com&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;🔶 Withdrawing 32 911 Dollars. Gо tо withdrаwаl &amp;gt;&amp;gt;&amp;gt; &lt;a href="https://forms.yandex.com/cloud/65e6228102848f1a71edd8c9?hs=0cebe66d8b7ba4d5f0159e88dd472e8b&amp;amp;" rel="noopener noreferrer"&gt;https://forms.yandex.com/cloud/65e6228102848f1a71edd8c9?hs=0cebe66d8b7ba4d5f0159e88dd472e8b&amp;amp;&lt;/a&gt; 🔶&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This was good enough for me to do an experiment.&lt;/p&gt;
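&lt;p&gt;Conceptually, turning the export into training rows just means concatenating name and email into one text field and mapping the status to a label. A rough sketch with made-up rows, not my actual export script:&lt;/p&gt;

```python
raw_rows = [
    # (status, email, name) as exported from the email service
    ("Available", "athreyac4@gmail.com", "athreya c"),
    ("Blocklisted", "watcher2112@ecocryptolab.com", "Withdrawing 32 911 Dollars"),
]

# Blocklisted entries are the manually identified bots (label 1).
dataset = [
    {"name_email": f"{name} {email}", "bot": 1 if status == "Blocklisted" else 0}
    for status, email, name in raw_rows
]

print(dataset[0]["name_email"])  # athreya c athreyac4@gmail.com
print(dataset[1]["bot"])         # 1
```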

&lt;p&gt;I converted the above data into a simpler format so that I could train on it easily.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://raw.githubusercontent.com/usrername/repo/dataset.csv&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name_email&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bot&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;head&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0gc705yvpmdsf8vczcg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0gc705yvpmdsf8vczcg.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What are the numbers for training and testing?
&lt;/h2&gt;

&lt;p&gt;I had 103 signup emails. 91 were human and 12 were bot.&lt;/p&gt;

&lt;p&gt;I used &lt;a href="https://imbalanced-learn.org/stable/references/generated/imblearn.over_sampling.SMOTE.html" rel="noopener noreferrer"&gt;SMOTE&lt;/a&gt; to generate data such that I had 90 bots and 90 humans.&lt;/p&gt;

&lt;p&gt;I finally used 144 signups for training the model,&lt;br&gt;
  18 for testing, and 18 for validation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data preparation
&lt;/h3&gt;

&lt;p&gt;We use the &lt;a href="https://pandas.pydata.org/docs/user_guide/index.html#user-guide" rel="noopener noreferrer"&gt;Pandas&lt;/a&gt;, &lt;a href="https://pytorch.org/docs/stable/torch.html" rel="noopener noreferrer"&gt;Torch&lt;/a&gt;, and &lt;a href="https://scikit-learn.org/stable/" rel="noopener noreferrer"&gt;Sklearn&lt;/a&gt; packages for their utilities, such as splitting data into training and testing sets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="n"&gt;sklearn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_selection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;train_test_split&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tts&lt;/span&gt;

  &lt;span class="n"&gt;INITIAL_TEST_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;
  &lt;span class="n"&gt;RANDOM_SEED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;
  &lt;span class="n"&gt;VALIDATION_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;

&lt;span class="c1"&gt;# Splits the dataset into a training set (for model training) and a testing set (for evaluating its performance).
&lt;/span&gt;
  &lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;INITIAL_TEST_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;RANDOM_SEED&lt;/span&gt;
                        &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Further splits the testing set into a validation set (for tuning model parameters) and a final testing set.
&lt;/span&gt;
&lt;span class="n"&gt;df_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;VALIDATION_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;RANDOM_SEED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Custom Dataset&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=78kRG9Qkzi3W&amp;amp;line=23&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;NewsletterCollectionDataset&lt;/code&gt;&lt;/a&gt; Class&lt;br&gt;
    This class defines a dataset that can be used with PyTorch models. &lt;/p&gt;

&lt;p&gt;It takes care of preprocessing the raw name email data using a BERT tokenizer and &lt;br&gt;
   converting it into suitable input for a machine-learning model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;   &lt;span class="c1"&gt;# Provide tools for creating custom datasets and loading data in batches for machine learning. 
&lt;/span&gt;   &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
   &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;torch.utils.data&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Dataset&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;NewsletterCollectionDataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Dataset&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Args:
        bot: Labels for each sample (0 or 1).
        name_emails:  List of name email text samples.
        tokenizer: BERT tokenizer for preprocessing.
        max_len:  Maximum sequence length.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bots&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name_emails&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_len&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name_emails&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name_emails&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bots&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bots&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;max_len&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__len__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name_emails&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the heart of the class. Here's what happens:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Grabs a name email signup and its bot/human label.&lt;/li&gt;
&lt;li&gt;Uses the BERT tokenizer to turn the text into numbers the model understands.&lt;/li&gt;
&lt;li&gt;Bundles everything neatly with labels ready for PyTorch.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__getitem__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;name_email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name_emails&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;bot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bots&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;encoding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode_plus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;name_email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;add_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;truncation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;return_token_type_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;pad_to_max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;return_attention_mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name_email&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;name_email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;input_ids&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;input_ids&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;flatten&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;attention_mask&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;attention_mask&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;flatten&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bot&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;long&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Data Loaders&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=0RK3hJDCzS73&amp;amp;line=1&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;create_data_loader&lt;/code&gt;&lt;/a&gt; Function&lt;/p&gt;

&lt;p&gt;Creates DataLoader objects, which handle loading data in batches and&lt;br&gt;
   shuffling for the training, validation, and testing sets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;torch.utils.data&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataLoader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BertTokenizer&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_data_loader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Args:
        df (pandas.DataFrame): The DataFrame containing email name data and &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bot&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; labels.
        tokenizer: The BERT tokenizer for text preprocessing.
        max_len (int): The maximum length for tokenized sequences.
        batch_size (int): Number of samples per batch.

    Returns:
        torch.utils.data.DataLoader: A DataLoader instance for iterating over the dataset.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;ds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NewsletterCollectionDataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;bots&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bot&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;to_numpy&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;name_emails&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name_email&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;to_numpy&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_len&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;max_len&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;num_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Creating model data for training, validation, and testing using the data loaders.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Loads the BERT tokenizer for text preprocessing.
&lt;/span&gt;&lt;span class="n"&gt;PRE_TRAINED_MODEL_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bert-base-cased&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;TOKENIZER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BertTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PRE_TRAINED_MODEL_NAME&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Maximum sequence length for tokenization.
&lt;/span&gt;&lt;span class="n"&gt;MAX_LEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;

&lt;span class="c1"&gt;# Batch size for training.
&lt;/span&gt;&lt;span class="n"&gt;BATCH_SIZE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;

&lt;span class="n"&gt;train_data_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_data_loader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TOKENIZER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_LEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BATCH_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;test_data_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_data_loader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TOKENIZER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_LEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BATCH_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;val_data_loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_data_loader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TOKENIZER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_LEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BATCH_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Model: BERT Plus a Bit More
&lt;/h2&gt;

&lt;p&gt;My core model (&lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=YXy7V_kJ8g7H&amp;amp;line=4&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;BotClassifier&lt;/code&gt;&lt;/a&gt;) isn't crazy complex. Think of it like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BERT Does the Heavy Lifting&lt;/strong&gt;: I feed BERT the name and email from each signup, and it turns them into meaningful representations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch.nn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BertModel&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BotClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Args:
        n_classes (int): The number of output classes (e.g., 2 for bot vs. human).
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_classes&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BotClassifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bert&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BertModel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PRE_TRAINED_MODEL_NAME&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bert&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hidden_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_classes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Dropout&lt;/strong&gt;: A Little Bit of Randomness. Dropout randomly zeroes out some connections during training, making the model less prone to overfitting.&lt;/p&gt;
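&lt;p&gt;Conceptually, (inverted) dropout zeroes each value with probability &lt;code&gt;p&lt;/code&gt; during training and rescales the survivors. Here is a minimal pure-Python sketch of the idea; it is not PyTorch's actual implementation, just the behavior of &lt;code&gt;nn.Dropout&lt;/code&gt; in miniature:&lt;/p&gt;

```python
import random

def dropout(values, p=0.3, training=True):
    """Inverted dropout: during training, zero each value with
    probability p and scale survivors by 1/(1-p) so the expected
    activation stays the same; at eval time it is the identity."""
    if not training:
        return list(values)
    return [0.0 if random.random() < p else v / (1 - p) for v in values]

# At evaluation time nothing is dropped:
print(dropout([1.0, 2.0, 3.0], training=False))  # [1.0, 2.0, 3.0]
```

&lt;p&gt;Scaling the survivors by &lt;code&gt;1/(1-p)&lt;/code&gt; keeps the expected activation unchanged, which is why no correction is needed at inference time.&lt;/p&gt;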

&lt;p&gt;&lt;strong&gt;The Output Layer&lt;/strong&gt;: "Bot" or "Not"? A simple linear layer takes BERT's output and makes the final prediction.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;forward&lt;/code&gt; method defines the forward pass through the bot classification model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attention_mask&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        Args:
            input_ids (torch.Tensor): Tokenized input sequences.
            attention_mask (torch.Tensor): Attention mask indicating real vs. padded tokens.

        Returns:
            torch.Tensor: The model&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s output logits (unnormalized class probabilities).
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

        &lt;span class="n"&gt;pooled_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;attention_mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;attention_mask&lt;/span&gt;
        &lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pooled_output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;out&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Check for CUDA (GPU) availability; otherwise defaults to CPU.
&lt;/span&gt;&lt;span class="n"&gt;DEVICE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;device&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cuda&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cuda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_available&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cpu&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BotClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_classes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DEVICE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What did the training involve?
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=7sad0YiJRVjH&amp;amp;line=7&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;train&lt;/code&gt;&lt;/a&gt; function is where I teach this model to spot the bots.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;loss_fn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;scheduler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;data_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;n_examples&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Args:
        model (nn.Module): The PyTorch model to train.
        loss_fn (nn.Module): The loss function for calculating error.
        optimizer (torch.optim.Optimizer): The optimizer used for updating model parameters.
        scheduler: A learning rate scheduler to adjust learning rate during training.
        device (torch.device): The device where the model and data should be loaded (&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cpu&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; or &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cuda&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
        data_loader (torch.utils.data.DataLoader): A DataLoader providing batches of training data.
        n_examples (int): The total number of training examples in the dataset.

    Returns:
        tuple: A tuple containing:
            * train_acc (float): Training accuracy for the epoch.
            * train_loss (float): Average training loss for the epoch.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Sets the model to training mode
&lt;/span&gt;
    &lt;span class="n"&gt;losses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;correct_predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For each batch of data, it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Feeds data to the model.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;data_loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Data preparation
&lt;/span&gt;        &lt;span class="n"&gt;input_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;input_ids&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;attention_mask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;attention_mask&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;targets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bot&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Forward pass
&lt;/span&gt;        &lt;span class="n"&gt;outputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attention_mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;attention_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Calculates how wrong the model was (that's the loss).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;        &lt;span class="c1"&gt;# Loss calculation
&lt;/span&gt;        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;loss_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;targets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Accuracy calculation
&lt;/span&gt;        &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;preds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;correct_predictions&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;preds&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;targets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;losses&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;item&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
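&lt;p&gt;The &lt;code&gt;torch.max(outputs, dim=1)&lt;/code&gt; call is just a row-wise argmax over the logits. Stripped of tensors, the accuracy bookkeeping boils down to this plain-Python sketch:&lt;/p&gt;

```python
def batch_accuracy(logits, targets):
    """Row-wise argmax over logits (the indices that
    torch.max(outputs, dim=1) returns), compared to the true labels."""
    preds = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(int(p == t) for p, t in zip(preds, targets))
    return correct / len(targets)

logits = [[0.2, 1.5], [2.0, 0.1], [0.3, 0.4]]  # 3 signups, 2 classes
print(batch_accuracy(logits, [1, 0, 0]))  # 2 of 3 correct
```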



&lt;ul&gt;
&lt;li&gt;Tweaks the model to be better next time (backpropagation and the optimizer).&lt;/li&gt;
&lt;li&gt;Learning rate magic: The scheduler adjusts the learning rate, so the model learns quickly at first and then fine-tunes itself.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;        &lt;span class="c1"&gt;# Back propagation
&lt;/span&gt;        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clip_grad_norm_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;max_norm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Gradient clipping
&lt;/span&gt;
        &lt;span class="c1"&gt;# Optimization
&lt;/span&gt;        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;scheduler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;train_acc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;correct_predictions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;double&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n_examples&lt;/span&gt;
    &lt;span class="n"&gt;train_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;losses&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;train_acc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;train_loss&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
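&lt;p&gt;Note that &lt;code&gt;train&lt;/code&gt; assumes &lt;code&gt;loss_fn&lt;/code&gt;, &lt;code&gt;optimizer&lt;/code&gt; and &lt;code&gt;scheduler&lt;/code&gt; already exist; for BERT fine-tuning the scheduler is typically built with the transformers helper &lt;code&gt;get_linear_schedule_with_warmup&lt;/code&gt;. The schedule itself is simple enough to sketch without any library: a linear warmup ramp followed by linear decay (this is a hypothetical stand-in, not the notebook's actual code):&lt;/p&gt;

```python
def linear_warmup_schedule(num_warmup_steps, num_training_steps):
    """Return an LR multiplier per step: it ramps 0 to 1 over the
    warmup steps, then decays linearly back to 0 by the end of
    training, mirroring the shape of a linear warmup schedule."""
    def lr_lambda(step):
        if step < num_warmup_steps:
            return step / max(1, num_warmup_steps)
        remaining = num_training_steps - step
        return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))
    return lr_lambda

lam = linear_warmup_schedule(num_warmup_steps=10, num_training_steps=100)
print(lam(5), lam(10), lam(55))  # 0.5 1.0 0.5
```

&lt;p&gt;Warmup keeps the first updates small while the classifier head is still random, which helps avoid wrecking BERT's pretrained weights early in training.&lt;/p&gt;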





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;
&lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;EPOCHS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;start_training&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;best_accuracy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EPOCHS&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Epoch &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;EPOCHS&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where the core learning happens for one epoch: accuracy and loss are calculated on the training data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;        &lt;span class="n"&gt;train_acc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;train_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;loss_fn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;scheduler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;DEVICE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;train_data_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Train loss &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;train_loss&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; accuracy &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;train_acc&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=G9PtqPEFenkA&amp;amp;line=1&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;evaluate_model&lt;/code&gt;&lt;/a&gt; function tests how well the model is doing on a validation dataset it hasn't seen before. &lt;br&gt;
This helps prevent overfitting.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;        &lt;span class="n"&gt;val_acc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;evaluate_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;loss_fn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;DEVICE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;val_data_loader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Validation loss &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;val_loss&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; accuracy &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;val_acc&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the model beats its previous best performance on the validation set, it's saved.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;        &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;train_acc&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_acc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;train_loss&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_loss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;val_acc&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val_acc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;val_loss&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val_loss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;val_acc&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;best_accuracy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;state_dict&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;best_detector_model.bin&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;best_accuracy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;val_acc&lt;/span&gt;

&lt;span class="nf"&gt;start_training&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Is it working?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Testing the model with a signup
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Single Signups&lt;/strong&gt;: The &lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=G_yenye14IKI&amp;amp;line=11&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;test_with_single_data&lt;/code&gt;&lt;/a&gt; function&lt;br&gt;
demonstrates how to use the model on one signup at a time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prepping the Input&lt;/strong&gt;: Just like during training, we use our trusty BERT tokenizer (TOKENIZER) to turn a new signup into the right format.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_with_single_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_to_test&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Tests a single signup to determine if it&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s likely from a bot or human.

    Args:
        data_to_test (str): The name and email data from a newsletter signup.

    Prints:
        The input signup data along with the model&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s prediction (bot or human).
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Tokenize and prepare input data for the model
&lt;/span&gt;    &lt;span class="n"&gt;encoding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TOKENIZER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode_plus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;data_to_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;add_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MAX_LEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;truncation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;return_token_type_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;max_length&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;return_attention_mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;input_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input_ids&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DEVICE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;attention_mask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attention_mask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DEVICE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;To the Model!&lt;/strong&gt;: The model spits out raw logits, and we turn them into probabilities using &lt;code&gt;torch.nn.functional.softmax&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;
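&lt;p&gt;Softmax just exponentiates each logit and normalizes so the outputs sum to 1. A pure-Python sketch (with the standard max-subtraction trick for numerical stability):&lt;/p&gt;

```python
import math

def softmax(logits):
    """Exponentiate and normalize; subtracting the max first avoids
    overflow for large logits without changing the result."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([0.0, 0.0]))       # equal logits -> [0.5, 0.5]
print(softmax([1000.0, 999.0]))  # huge logits still work
```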

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="c1"&gt;# Set model to evaluation mode and run prediction
&lt;/span&gt;    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;no_grad&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;outputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attention_mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;attention_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;prob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;functional&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Get the class prediction (0 = human, 1 = bot)
&lt;/span&gt;    &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prob&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Bot or Not&lt;/strong&gt;? Based on that probability, we decide whether it's likely a bot or a real human signup.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="c1"&gt;# Print the input data and the prediction result
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Input Name Email: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;data_to_test&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The signup is likely from a bot.  &lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The signup is likely from a human. &lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rishic2013@gmail.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Rishi C &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;email2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lama2@hexmos.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;name2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔶Lama2.  G t 12     &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="nf"&gt;test_with_single_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;test_with_single_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name2&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;email2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc5fyilzt1n9wlc1rp21.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc5fyilzt1n9wlc1rp21.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Debugging Method That Took Accuracy from 87% to 94%
&lt;/h2&gt;

&lt;p&gt;When I wanted to improve the accuracy, I didn't know exactly what was going wrong. &lt;/p&gt;

&lt;p&gt;When I implemented and studied the confusion matrix,&lt;br&gt;
it showed one False Negative.&lt;/p&gt;

&lt;p&gt;So let me explain what a confusion matrix is.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpjlbkywua3102asrnwvh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpjlbkywua3102asrnwvh.png" alt="confusion matrix"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The confusion matrix is a simple and powerful tool that gives a clear picture of how the model's predictions line up with the true labels.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://scikit-learn.org/stable/" rel="noopener noreferrer"&gt;&lt;em&gt;Sklearn&lt;/em&gt;&lt;/a&gt; provides a function called &lt;em&gt;confusion_matrix&lt;/em&gt; to visualize the classification.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.metrics&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;confusion_matrix&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;seaborn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sns&lt;/span&gt; 

&lt;span class="n"&gt;cm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;confusion_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; 
&lt;span class="n"&gt;custom_colors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;#f0a9b1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;#a9f0b9&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;heatmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;annot&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;custom_colors&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;d&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Predicted&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;True&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For plotting the confusion matrix, I used &lt;a href="https://matplotlib.org/stable/users/getting_started/" rel="noopener noreferrer"&gt;Matplotlib&lt;/a&gt; and the &lt;a href="https://seaborn.pydata.org/tutorial/introduction.html" rel="noopener noreferrer"&gt;Seaborn&lt;/a&gt; library in Python.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxub9berhu10461w70eu7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxub9berhu10461w70eu7.png" alt="confuse"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Think of it like a truth table for your model. It lays everything out:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;True Negative&lt;/strong&gt; (Top left): 6 - The model correctly identified 6 human signups.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;False Positive&lt;/strong&gt; (Top right): 0 - No human signups were incorrectly flagged as bots.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;False Negative&lt;/strong&gt; (Bottom Left): 1 - The model incorrectly identified 1 bot signup as human.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;True Positive&lt;/strong&gt; (Bottom right): 11 - The model correctly identified 11 bot signups.&lt;/p&gt;
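&lt;p&gt;From these four counts we can recover the headline numbers quoted in this post. A minimal sketch in plain Python (counts taken from the matrix above):&lt;/p&gt;

```python
# Counts read off the confusion matrix (0 = human, 1 = bot)
tn, fp, fn, tp = 6, 0, 1, 11

accuracy = (tp + tn) / (tp + tn + fp + fn)  # overall correctness
recall = tp / (tp + fn)                     # share of actual bots caught
precision = tp / (tp + fp)                  # flagged signups that really are bots

print(f"accuracy={accuracy:.1%}  bot recall={recall:.1%}  precision={precision:.1%}")
```

&lt;p&gt;This works out to roughly 94% accuracy and a ~91.6% bot catch rate, the figures reported in the conclusion.&lt;/p&gt;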

&lt;p&gt;Coming back to the original problem, I had one False Negative.&lt;/p&gt;

&lt;p&gt;On the face of it, the model was letting one bot slip through as a human. A quick look at my data with my &lt;a href="https://colab.research.google.com/drive/1ZZIa5NG4tLHemsGOdWF5jl-ree8JtOqK#scrollTo=Hl3_-YevJHtW&amp;amp;line=14&amp;amp;uniqifier=1" rel="noopener noreferrer"&gt;&lt;code&gt;show_misclasified()&lt;/code&gt;&lt;/a&gt; function revealed the real culprit.&lt;/p&gt;

&lt;p&gt;I realized I had mislabeled data during my balancing act. &lt;br&gt;
A single human mislabeled as a bot was causing the dip. &lt;/p&gt;

&lt;p&gt;One fix, one retrain, and done – 94% accuracy!&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk80gdlgmrcwi2pon5qgw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk80gdlgmrcwi2pon5qgw.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;My bot detector achieved a 91.6% success rate catching bots, with a perfect score (100%) identifying real subscribers.&lt;/p&gt;

&lt;p&gt;Not bad, since accidentally blocking a real person (false positive) is a much bigger concern than missing a sneaky bot.&lt;/p&gt;

&lt;p&gt;This is a good start, but I'm always looking to improve. I'll be gathering more data and experimenting to see if I can boost the accuracy even further.&lt;/p&gt;

&lt;p&gt;Want to stay updated on my progress? Subscribe to our journal for next week's content on &lt;strong&gt;fine-tuning Stable Diffusion!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Originally published at &lt;a href="https://journal.hexmos.com/bots-invaded-my-newsletter-heres-how-i-fought-back-with-ml/" rel="noopener noreferrer"&gt;https://journal.hexmos.com&lt;/a&gt; on March 31, 2024.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>webdev</category>
      <category>ai</category>
      <category>python</category>
    </item>
    <item>
      <title>Training LLMs Taking Too Much Time? Technique you need to know to train it faster</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Sun, 03 Mar 2024 15:48:23 +0000</pubDate>
      <link>https://forem.com/hexmos/training-llms-taking-too-much-time-technique-you-need-to-know-to-train-it-faster-3k8d</link>
      <guid>https://forem.com/hexmos/training-llms-taking-too-much-time-technique-you-need-to-know-to-train-it-faster-3k8d</guid>
      <description>&lt;h4&gt;
  
  
  The Challenges of Training LLMs: Lots of Time and Resources
&lt;/h4&gt;

&lt;p&gt;Suppose you want to train a &lt;strong&gt;Large Language Model (LLM)&lt;/strong&gt;, which can understand and produce human-like text, so that you can ask it questions about your organization and get answers from it.&lt;/p&gt;

&lt;p&gt;The problem is that the LLM doesn't know your organization; it only knows general information. That is where techniques like &lt;strong&gt;Finetuning&lt;/strong&gt;, &lt;strong&gt;RAG&lt;/strong&gt; and many others come in.&lt;/p&gt;

&lt;p&gt;Training big LLMs requires &lt;strong&gt;a lot of resources and time&lt;/strong&gt;, so it's a hefty task unless you have the proper machine for the job.&lt;/p&gt;

&lt;h4&gt;
  
  
  Story of How We Solved The Problem of Time and Resources
&lt;/h4&gt;

&lt;p&gt;Suppose we want to train the &lt;a href="https://llama.meta.com/" rel="noopener noreferrer"&gt;Llama 2&lt;/a&gt; LLM based on the information of our organization, and we are using &lt;strong&gt;Google Colab&lt;/strong&gt; to train it. The free version of Colab provides a single &lt;a href="https://www.nvidia.com/en-in/data-center/tesla-t4/" rel="noopener noreferrer"&gt;Nvidia T4&lt;/a&gt; GPU, which provides &lt;strong&gt;16GB&lt;/strong&gt; of memory.&lt;/p&gt;

&lt;p&gt;But training the Llama 2 7-billion-parameter model requires &lt;strong&gt;28GB&lt;/strong&gt; of memory.&lt;br&gt;
This is a problem: we can't train the model with only &lt;strong&gt;16GB&lt;/strong&gt; of memory.&lt;/p&gt;
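&lt;p&gt;Where does the 28GB figure come from? A rough back-of-the-envelope sketch, assuming the weights are stored as 32-bit floats (training needs even more memory on top of this for gradients and optimizer state):&lt;/p&gt;

```python
params = 7_000_000_000  # Llama 2 7B parameter count
bytes_per_param = 4     # fp32: 4 bytes per parameter

weights_gb = params * bytes_per_param / 1e9
print(weights_gb)  # 28.0 -- for the weights alone
```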

&lt;p&gt;To solve this, we researched optimization techniques and found &lt;a href="https://arxiv.org/abs/2106.09685" rel="noopener noreferrer"&gt;LoRA&lt;/a&gt;, which stands for &lt;em&gt;Low-Rank Adaptation of Large Language Models&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LoRA&lt;/strong&gt; adds a small finetuning layer on top of the model without modifying the existing weights, which consumes far less time and memory.&lt;/p&gt;

&lt;p&gt;By using LoRA, I was able to finetune the Llama-2 Model and get the outputs from it from a single T4 GPU.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0n4jcarbqim1g2iej58g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0n4jcarbqim1g2iej58g.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Refer to the image above. Before finetuning, I asked the Llama 2 model the question &lt;em&gt;How many servers does Hexmos have?&lt;/em&gt; It replied that it was unable to provide the information.&lt;/p&gt;

&lt;p&gt;After finetuning, I asked the same question, and it replied:&lt;br&gt;
&lt;em&gt;Hexmos has 2 servers in Azure and 4 servers in AWS&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Let's see how &lt;strong&gt;LoRA&lt;/strong&gt; helped me achieve this.&lt;/p&gt;

&lt;h4&gt;
  
  
  How LoRA Helps with Finetuning More Efficiently
&lt;/h4&gt;

&lt;p&gt;Let's have a deeper dive into how &lt;strong&gt;LoRA&lt;/strong&gt; works.&lt;/p&gt;

&lt;p&gt;Large models like &lt;strong&gt;GPT-3&lt;/strong&gt; have 175 billion parameters. &lt;strong&gt;Parameters&lt;/strong&gt; are numbers stored in matrices; they are the knobs and dials the model tweaks to get better at its task. Fully finetuning all of them to our needs is a daunting task and requires &lt;strong&gt;a lot of computational resources&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LoRA&lt;/strong&gt; takes a different approach to this problem: instead of fine-tuning the entire model, it focuses on training a much smaller set of parameters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2fqu3x205herl5wgoqjg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2fqu3x205herl5wgoqjg.png" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
Consider the two boxes above. One represents the weights of the existing model; the other represents our fine-tuned weights (based on our custom dataset). These are added together to form our &lt;strong&gt;fine-tuned model&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;With this method, we don't need to change the existing weights of the model. Instead, we add our fine-tuned weights on top of the original ones, which makes training far less computationally expensive.&lt;/p&gt;

&lt;p&gt;Another question may arise: how are these finetuned weights calculated?&lt;br&gt;
The answer involves a property of matrices called Rank.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rank&lt;/strong&gt;, in simple words, determines the precision of the model after finetuning. If the rank is low, you get more optimization, but at the same time you sacrifice some of the model's accuracy.&lt;/p&gt;

&lt;p&gt;If the rank is high, the precision will be higher, but there will be less optimization.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3b931npayq5sg5azu1d0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3b931npayq5sg5azu1d0.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The LoRA weight matrix is calculated by multiplying two smaller matrices.&lt;/p&gt;

&lt;p&gt;For example, we multiply a &lt;strong&gt;5x1&lt;/strong&gt; matrix and a &lt;strong&gt;1x5&lt;/strong&gt; matrix together to form a &lt;strong&gt;5x5&lt;/strong&gt; LoRA weight matrix.&lt;/p&gt;

&lt;p&gt;We can set the rank of the smaller matrix to determine the balance between precision and optimization.&lt;/p&gt;
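&lt;p&gt;The idea can be sketched in a few lines of NumPy (a toy illustration with made-up values, not the actual training code): a 5x1 matrix multiplied by a 1x5 matrix yields a rank-1 5x5 update, which is added to the frozen weights, so only 10 numbers are trained instead of 25.&lt;/p&gt;

```python
import numpy as np

d, r = 5, 1  # weight size and LoRA rank (toy values)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))  # frozen pretrained weights: 5x5, 25 numbers
B = rng.standard_normal((d, r))  # trainable 5x1 matrix
A = rng.standard_normal((r, d))  # trainable 1x5 matrix

delta = B @ A          # rank-1 5x5 LoRA update
W_adapted = W + delta  # effective fine-tuned weights; W itself is untouched

print(delta.shape)                                     # (5, 5)
print(B.size + A.size, "trainable numbers vs", W.size)
```

&lt;p&gt;At real model scale the savings are dramatic: for a d x d weight matrix, LoRA trains 2 x d x r numbers instead of d x d, and r is typically tiny compared to d.&lt;/p&gt;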

&lt;h4&gt;
  
  
  Real Life Example: Training a Llama2 Model with Custom Dataset
&lt;/h4&gt;

&lt;p&gt;Continue reading &lt;a href="https://journal.hexmos.com/train-llm-faster/" rel="noopener noreferrer"&gt;the article&lt;/a&gt;&lt;/p&gt;

</description>
      <category>llms</category>
      <category>ai</category>
      <category>llama2</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
