Forem: Mohamed El-Bably

Rate Limiting: A Dynamic Distributed Rate Limiting with Redis

Mohamed El-Bably — Tue, 14 Nov 2023 16:15:47 +0000

In our previous article, “Rate Limiting: The Sliding Window Algorithm”, we explored the theory behind rate limiting and how the Sliding Window Algorithm serves as an effective solution. Now it’s time to roll up our sleeves and put that theoretical knowledge into practice.

Slide Limiter

Slide Limiter is an open-source TypeScript implementation of a sliding window rate-limiting algorithm that provides distributed rate-limiting functionality using Redis. It allows you to limit the number of requests or actions that can be performed within a specific time window in a dynamic, efficient way.

Architecture and Implementation

Slide Limiter takes a dynamic rate-limiting approach, allowing you to fine-tune rate limits to your specific needs. You create a SlideLimiter instance with customizable options, such as the duration of the time window and the maximum number of requests, which gives you the flexibility to enforce rate limits tailored to your application's requirements.

The architecture involves two essential components:

Slide Limiter (SlideLimiter): This is the central orchestrator responsible for managing rate limits. By interfacing with different store options, it offers a seamless way to enforce rate limiting across applications. The heart of the Slide Limiter is the hit method, which checks and enforces rate limits. If a request exceeds the defined rate limit, it can trigger the necessary actions, ensuring the orderly flow of requests.
Storage Options (RedisStore and MemoryStore): Slide Limiter allows you to choose between RedisStore and MemoryStore as storage backends. RedisStore is particularly well-suited for distributed systems where consistency and shared rate limiting across multiple servers are paramount. It employs a Lua script executed atomically in Redis, ensuring that rate-limiting operations are thread-safe. On the other hand, MemoryStore is ideal for single-node or testing environments, offering simplicity and ease of use.

The following code snippet shows roughly what you need to create a simple rate limiter and check the hit count.

import Redis, { RedisOptions } from 'ioredis';
import { RedisStore, SlideLimiter } from "slide-limiter";

// Create a RedisStore instance (connection settings come from environment variables)
const store = new RedisStore({
    host: process.env.REDIS_HOST || 'localhost',
    port: Number(process.env.REDIS_PORT) || 6379,
    password: process.env.REDIS_PASSWORD,
    db: Number(process.env.REDIS_DATABASE) || 0,
});
// SlideLimiter options
const options = {
  windowMs: 60000,  // 1 minute
  maxLimit: 11,     // allowed requests (10) + 1
};
const limiter = new SlideLimiter(store, options);
const bucket = 'users';
const key = "user123";
// hit() records this request and returns the remaining count
const remaining = await limiter.hit(bucket, key);
if (remaining > 0) {
  // Allow the action
  console.log(`Action performed. Remaining requests: ${remaining}`);
} else {
  // Rate limit exceeded
  console.log("Rate limit exceeded. Please try again later.");
}

Set maxLimit to the limit you need + 1. The single hit method both records the request and returns the count remaining after consuming that hit, so the last allowed request must still return a value greater than 0.

By creating a SlideLimiter instance with specific options, such as a one-minute time window and a maximum of 10 requests (configured above as maxLimit: 11), we enable fine-grained control over the rate-limiting process.

With a defined bucket and key, we can track and manage the rate limits for different users or entities. If the remaining count is greater than zero, the action is allowed to proceed and the remaining request count is displayed. If the rate limit has been reached, the caller is told to try again later.

Adaptive Rate Limiting with Dynamic Configuration

Dynamic rate limiting offers significant advantages when dealing with varying rate-limiting requirements across different operations or clients.

This flexibility enables us to tailor rate limits to specific use cases, such as granting higher limits to premium users or imposing stricter limits on resource-intensive operations.

The sliding window algorithm provides an elegant solution to this dynamic rate-limiting need by allowing real-time adjustments to the windowMs and maxLimit parameters.

For instance, when working with a specific bucket (e.g., main) and a key (e.g., 1234) with initial settings of windowMs = 60000 and maxLimit = 5, we can seamlessly modify these parameters for larger or smaller values with each new call to the hit function. This adaptability empowers applications to fine-tune rate limits on the fly, ensuring optimal performance and resource allocation.
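As a rough sketch of how per-call limits might be chosen, the helper below maps a client tier to a windowMs/maxLimit pair. The tier names, the numbers, and the limitsFor helper are hypothetical; only the windowMs and maxLimit option names come from this article. How the resulting options are handed to hit depends on the library's actual signature, which is not shown here.

```typescript
// Hypothetical tiers for illustration; not part of slide-limiter.
type Tier = "free" | "premium";

interface LimitOptions {
  windowMs: number; // window length in milliseconds
  maxLimit: number; // value given to the limiter (allowed requests + 1)
}

// Allowed requests per window for each tier (illustrative numbers).
const allowedPerWindow: Record<Tier, number> = {
  free: 5,
  premium: 50,
};

function limitsFor(tier: Tier, windowMs: number = 60000): LimitOptions {
  // Follows the article's convention: configure the limit you need + 1,
  // because hit() returns the count left after consuming one hit.
  return { windowMs, maxLimit: allowedPerWindow[tier] + 1 };
}

// limitsFor("free")    -> { windowMs: 60000, maxLimit: 6 }
// limitsFor("premium") -> { windowMs: 60000, maxLimit: 51 }
```

Keeping the selection logic in one small function makes it easy to change a tier's limit without touching the code that calls the limiter.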

Fine-Grained Rate Limiting Control

Slide Limiter uses the concept of buckets with keys to effectively control and manage the flow of requests within an application or system. These parameters serve specific roles in the rate-limiting process:

Bucket

The bucket parameter is a higher-level categorization that allows us to group rate limits for different resources or endpoints. It can be used to separate the count for specific endpoints or functionalities within your application.

In the context of a web API, for example, we might want to rate limit different endpoints differently. By using a bucket, we can create distinct rate limits for different parts of your application.

For instance, we could have separate buckets for /users/auth, /api/orders, and /api/profile, each with its own rate limit configuration.

Key

The key parameter represents the identifier for the resource or entity we want to rate limit per bucket. It is typically associated with a specific user, IP address, or any other unique identifier for the client making the request. For example, if we want to rate limit requests from different users or clients separately, we can use the user’s ID or IP address as the key. The key is an essential part of the rate-limiting process because it allows us to track and limit requests on a per-client basis.

Together, buckets and keys enable fine-grained control over rate limiting. Buckets categorize rate limits, while keys distinguish requests from different clients or entities, ensuring that rate limiting is not only efficient but also customizable to the specific needs of your application.
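One possible way to derive a bucket and key from an incoming request is sketched below. The IncomingRequest shape, the bucket names, and the rateLimitTarget helper are assumptions made for this example; they are not part of Slide Limiter.

```typescript
// Hypothetical request shape for illustration only.
interface IncomingRequest {
  path: string;
  ip: string;
  userId?: string;
}

interface RateLimitTarget {
  bucket: string;
  key: string;
}

function rateLimitTarget(req: IncomingRequest): RateLimitTarget {
  // Bucket: which endpoint group this request counts against.
  const bucket = req.path.startsWith("/users/auth") ? "auth" : "api";
  // Key: who is making the request. Anonymous clients fall back to their IP.
  // The prefixes keep user IDs and IP addresses from colliding.
  const key = req.userId !== undefined ? `user:${req.userId}` : `ip:${req.ip}`;
  return { bucket, key };
}

// rateLimitTarget({ path: "/users/auth/login", ip: "10.0.0.1" })
//   -> { bucket: "auth", key: "ip:10.0.0.1" }
// rateLimitTarget({ path: "/api/orders", ip: "10.0.0.1", userId: "1234" })
//   -> { bucket: "api", key: "user:1234" }
```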

Redis Store

Redis is used here to store and update hits count atomically, which is suitable for distributed systems where rate-limiting needs to be consistent and shared across multiple servers.

How Redis Store Works

When we use Redis as a store for Slide Limiter, calls to the hit function are delegated to RedisStore's hit function, which runs the following Lua script on the Redis server.

local current_time = redis.call('TIME')  
local bucket = KEYS[1]  
local id = KEYS[2]  
local key = bucket .. ":" .. id  
local window = tonumber(ARGV[1]) / 1000  
local limit = tonumber(ARGV[2])  
local trim_time = tonumber(current_time[1]) - window  
redis.call('ZREMRANGEBYSCORE', key, 0, trim_time)  
local request_count = redis.call('ZCARD', key)  
if request_count < limit then  
  redis.call('ZADD', key, current_time[1], current_time[1] .. current_time[2])  
  redis.call('EXPIRE', key, window)  
  return limit - request_count - 1;  
end  
return 0

This Lua script is designed to be executed atomically in Redis, ensuring that rate-limiting operations are consistent and thread-safe when multiple requests are made concurrently. It checks if the request count is within the defined limit and records each request’s timestamp, allowing for accurate rate-limiting enforcement.

It performs the following steps:

Get the Current Time: Use redis.call('TIME') to retrieve the current time in Redis, which is an array with two elements: the seconds and microseconds.
Construct the Key: Combine the bucket and id to create a unique key in Redis. This key represents the rate-limiting window for a specific resource (Both are provided when invoking the script).
Calculate the Time Window: Convert the provided windowMs (in milliseconds) to seconds by dividing it by 1000.
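Trim Expired Entries: Subtract the window from the current time and remove every entry whose score is at or before that point by using redis.call('ZREMRANGEBYSCORE', key, 0, trim_time). Only requests inside the current window remain.
Check the Requests Count: Count the remaining entries by using redis.call('ZCARD', key). This count represents the requests made within the defined time window.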
Rate Limit Check: If the request count is less than the limit, it means there's still capacity for more requests.
Add Request Timestamp: Add the current time to the Redis sorted set with the timestamp as the score. This effectively records the time of the request.
Set Expiry Time: Set an expiry time on the Redis key using redis.call('EXPIRE', key, window). This ensures that rate-limiting data is automatically cleared from Redis after the defined time window.
Calculate Remaining Requests: Calculate the remaining requests by subtracting the request count from the limit and then subtracting 1. This accounts for the current request.
Return Remaining Requests: Return the number of remaining requests. If it’s greater than 0, the request is allowed; otherwise, it’s rate-limited.
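To make the steps above easier to follow, here is a minimal in-memory sketch that mirrors the script's logic in TypeScript. It is not the library's MemoryStore: the hitsByKey map and this standalone hit function are illustrative stand-ins for the Redis sorted set, and nothing here expires idle keys the way EXPIRE does.

```typescript
// In-memory stand-in for the Redis sorted set: "bucket:id" -> request times in seconds.
const hitsByKey = new Map<string, number[]>();

function hit(
  bucket: string,
  id: string,
  windowMs: number,
  limit: number,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): number {
  const key = `${bucket}:${id}`;
  const windowSeconds = windowMs / 1000;
  const trimTime = nowSeconds - windowSeconds;

  // ZREMRANGEBYSCORE key 0 trimTime: drop entries at or before the window start.
  const recent = (hitsByKey.get(key) ?? []).filter((t) => t > trimTime);

  // ZCARD key: number of requests still inside the window.
  const requestCount = recent.length;

  if (requestCount < limit) {
    // ZADD key now member: record this request.
    recent.push(nowSeconds);
    hitsByKey.set(key, recent);
    return limit - requestCount - 1;
  }

  hitsByKey.set(key, recent);
  return 0;
}

// With limit = 3 (two allowed requests + 1) and every call at second 100:
// hit("users", "user123", 60000, 3, 100) returns 2, then 1, then 0.
```

Passing the clock in as a parameter is a design choice for this sketch only; it makes the window behaviour easy to test without waiting for real time to pass.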

Using Redis for rate limiting offers several benefits

Distributed and Scalable: Redis is a distributed, in-memory data store that is well-suited for scaling applications. By using a RedisStore, we can implement rate limiting in a distributed system, making it easier to handle high traffic and load balancing across multiple instances.
High-Performance: Redis is known for its exceptional read and write performance, making it an excellent choice for rate limiting. It can quickly store and retrieve rate-limiting data, reducing the latency of rate-limiting checks.
Expiration and Automatic Cleanup: Redis allows you to set expiration times for keys, making it easy to implement automatic data cleanup. This feature is crucial for rate limiting because it ensures that old rate limit data is removed, maintaining the accuracy of the rate limiter.
Atomic Operations: Redis supports atomic operations, which means rate limit checks and updates can be performed atomically in a single step. This ensures that rate-limiting operations are consistent, even in a multi-threaded or distributed environment.

Slide Limiter on GitHub

If you’re interested in more details, please visit the project page on GitHub

https://github.com/m-elbably/slide-limiter

Conclusion

Slide Limiter is a powerful tool for implementing rate limiting in your distributed applications. By combining the efficiency of the sliding window rate-limiting algorithm with the scalability and consistency of Redis, Slide Limiter offers a robust solution for managing the flow of requests and actions within your system. Whether you need to protect endpoints, manage client-specific rate limits, or dynamically adjust rate-limiting parameters, I think Slide Limiter provides the flexibility and control you will need.

Rate Limiting: The Sliding Window Algorithm

Mohamed El-Bably — Tue, 07 Nov 2023 12:01:00 +0000

Introduction

Rate limiting is a fundamental method for managing the flow of traffic to a service or server by imposing restrictions on the number of requests that can be made within a specific time window. This essential technique is widely employed in various network and web applications to ensure the orderly and secure operation of systems.

What is Rate Limiting?

In simple terms, rate limiting is a way to govern the speed at which clients can execute requests or operations on a service or server within a specific timeframe. The core objective is to strike a balance between serving genuine users efficiently while protecting the system from abuse, overuse, or malicious activities.
Imagine a scenario where one malicious zombie machine or user can launch an astonishing 20 or more HTTP GET requests per second. Compare this to the legitimate rate, which is often much less.

Research has shown that such flooding rates can wreak havoc on web services, leading to downtime, poor performance, and potential security vulnerabilities. The result? A service that's unreliable and vulnerable to abuse.

How Does Rate Limiting Work?

Rate limiting operates as an integral part of an application, residing within its code rather than on the web server itself. It predominantly relies on tracking the IP addresses from which requests originate and closely monitoring the time intervals between each request. The IP address plays a pivotal role in identifying the source of a request within an application.

A rate-limiting solution essentially performs a dual measurement: it evaluates the time lapses between requests from individual IP addresses and counts the total number of requests within a predefined time window. If the analysis indicates an excessive number of requests emanating from a single IP address within this designated timeframe, the rate-limiting mechanism intervenes by imposing a temporary halt on fulfilling further requests from that IP address.

In simpler terms, a rate-limited application effectively communicates a friendly “Slow down, please” to users who are submitting requests at an accelerated pace. This is akin to a traffic police officer pulling over a speeding driver, or a parent kindly advising their child not to consume too much candy in a short span of time. Rate limiting, therefore, serves as a guardian that promotes fair usage, prevents system overload, and ensures a harmonious coexistence between applications and users.

Examples

API Requests Rate Limiting

A public API service allows free users to make up to 100 API requests per hour. When a user’s request rate exceeds this limit, the service throttles their requests, causing a delay in processing.

Password Reset Attempts Rate Limiting

A website’s password reset feature limits users to a maximum of 3 password reset attempts in a 24-hour period. If a user exceeds this limit, they are temporarily locked out of the password reset functionality to prevent misuse.

These examples showcase different scenarios where rate limiting is applied to control access to resources or services and prevent abuse or overuse. Rate limiting helps maintain fair usage and ensures the stability and availability of systems and services.

Depending on the context, different identifiers can be employed to track and regulate requests. For public requests, the IP address can serve as an identifier, helping to manage traffic and prevent abuse. In other scenarios, a user's unique identifier may be more appropriate, ensuring that rate limiting is applied at an individual level. The choice of identifier depends on the specific needs and objectives of the rate-limiting implementation.

Benefits of Rate Limiting

Preventing Overload: It helps prevent server overload by controlling the number of requests or operations that can be processed in a given time frame.
Protection from Abuse: Rate limiting safeguards against abusive or malicious behavior, such as Distributed Denial of Service (DDoS) attacks or brute force attacks.
Improved User Experience: By ensuring fair resource allocation, rate limiting helps maintain a good user experience by preventing a single user or client from monopolizing resources.
Predictable Performance: It provides predictability in system performance by avoiding sudden spikes in traffic that can disrupt operations.
Resource Efficiency: Rate limiting optimizes resource usage and helps in efficient resource allocation.

Which Attacks Are Prevented by Rate Limiting?

Rate limiting is a powerful defense mechanism against a range of attacks that aim to overwhelm a system through an excessive influx of requests. By regulating the pace of incoming requests, it acts as a shield against resource exhaustion, reduces the vulnerability to downtime, and guards against various attack vectors. Notably, four types of attacks find their adversary in rate limiting:

DDoS Attacks — Distributed Denial of Service (DDoS) attacks are notorious for their ability to flood a system or service with a massive wave of requests, rendering it unresponsive or inaccessible. Rate limiting acts as a formidable barrier against DDoS attacks by either blocking or delaying requests from a single IP address or client that surpasses a predefined threshold. This strategic move makes it significantly more challenging for attackers to incapacitate the system.
Brute Force Attacks — Brute force attacks involve relentless attempts to guess login credentials or other sensitive information, leveraging automated tools to flood the system with a barrage of requests. Rate limiting comes to the rescue by limiting the number of login attempts that can be made within a specific time frame, thereby raising the difficulty level for attackers attempting to crack valid credentials.
API Abuse — API abuse entails the excessive sending of requests to an API with the aim of extracting vast amounts of data or causing resource depletion. Rate limiting serves as a robust deterrent against API abuse by restricting the number of requests that a single client or application can make. This ensures the equitable distribution of API resources among all clients, effectively preventing resource exhaustion and ensuring that API services remain available and responsive.

Web Scraping — While web scraping is often perceived as a benign activity, it can become a subtle form of attack when performed extensively. Rate limiting, although a valuable measure, may not entirely prevent web scraping. However, it does act as a deterrent by setting restrictions on the speed at which data can be harvested from a website or application. This serves a dual purpose by safeguarding the resources of the target site and ensuring the continuity of data integrity and accessibility.

Applications of Rate Limiting

APIs: Protecting APIs from overuse or abuse is one of the most common use cases. This ensures that API endpoints remain available to all users.
Web Servers: Rate limiting can be applied to web servers to control incoming HTTP requests and prevent server overload.
Microservices: In microservices architectures, rate limiting can be used to manage inter-service communication and prevent cascading failures.
IoT Devices: Limiting the rate of data ingestion from IoT devices to cloud services helps maintain data quality and system stability.

Common Rate Limiting Algorithms

Token Bucket: Tokens are added to a bucket at a fixed rate, and requests are allowed if tokens are available.
Leaky Bucket: Requests are placed in a bucket and leak out at a fixed rate. If the bucket overflows, requests are rejected.
Fixed Window: In this algorithm, the rate limit is applied within fixed time windows, and requests exceeding the limit are rejected.
Sliding Window: The sliding window rate limiting algorithm is based on a dynamic time window that moves with time, allowing for more flexibility in managing bursts of traffic.
Data Store: Not an algorithm in itself, but a component every algorithm above depends on. It is the repository where rate limit settings and request counts are stored and retrieved. Typically, a key-value store like Redis is used for this purpose.
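Of the algorithms listed above, the token bucket is perhaps the easiest to sketch in a few lines. The class below is a minimal single-process illustration, with state kept in memory and the clock passed in explicitly. The class and parameter names are chosen for this example and do not come from any library.

```typescript
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Returns true when a token was available and has been consumed.
  tryConsume(nowMs: number = Date.now()): boolean {
    // Add the tokens earned since the last call, capped at the bucket capacity.
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// A bucket of 2 tokens refilled at 1 token per second:
// two immediate requests pass, a third is rejected, and one more passes a second later.
```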

Sliding Window Rate Limiting Algorithm

The Sliding Window Algorithm is a time-based method used to track and control the rate at which requests or operations can be made within a specific time window. It’s a dynamic system that adapts to changing traffic patterns, making it an effective tool for rate limiting in various applications.

The central concept behind the Sliding Window Algorithm is the utilization of a dynamic time window that moves with the flow of time. Unlike static methods that reset rate limits at fixed intervals, the sliding window continuously adjusts to the current time, allowing for a more flexible and adaptable approach.

Understanding the Sliding Window Algorithm

The Sliding Window Algorithm relies on a combination of two essential components: a fixed-size time window and a counter that tracks the number of requests or operations made within that window.

Fixed-Size Time Window: The first component is a time window, which can be set to any desired duration, such as one second, one minute, or one hour. This window determines the timeframe within which the rate limit is enforced.
Request Counter: The second component is a counter that keeps track of the requests or operations made within the time window. As requests come in, the counter increments.

Figure: Sliding Window rate limiting for 6 requests / second

The innovative aspect of the Sliding Window Algorithm is that it continuously adjusts the time window as time progresses. This dynamic approach allows the algorithm to maintain a rolling window of time that moves with the current moment. For example, if we have a one-minute time window, the algorithm tracks requests made in the last 60 seconds, rather than resetting every minute.
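The difference from a fixed window is easiest to see at a window boundary. The sketch below counts how many requests each approach lets through for the same burst, using a limit of 6 requests per second. Both functions are simplified, in-memory illustrations written for this example, and they expect timestamps in ascending order.

```typescript
// Fixed window: the count resets whenever a new window index starts.
function fixedWindowAllowed(timestampsMs: number[], limit: number, windowMs: number): number {
  const counts = new Map<number, number>();
  let allowed = 0;
  for (const t of timestampsMs) {
    const windowIndex = Math.floor(t / windowMs);
    const used = counts.get(windowIndex) ?? 0;
    if (used < limit) {
      counts.set(windowIndex, used + 1);
      allowed += 1;
    }
  }
  return allowed;
}

// Sliding window log: only requests accepted within the last windowMs count.
function slidingWindowAllowed(timestampsMs: number[], limit: number, windowMs: number): number {
  let accepted: number[] = [];
  let allowed = 0;
  for (const t of timestampsMs) {
    accepted = accepted.filter((prev) => prev > t - windowMs);
    if (accepted.length < limit) {
      accepted.push(t);
      allowed += 1;
    }
  }
  return allowed;
}

// Six requests just before the 1000 ms boundary and six just after it.
const burst = [
  ...Array.from({ length: 6 }, (_, i) => 900 + i),
  ...Array.from({ length: 6 }, (_, i) => 1000 + i),
];

console.log(fixedWindowAllowed(burst, 6, 1000)); // 12: both halves fit their own window
console.log(slidingWindowAllowed(burst, 6, 1000)); // 6: the second half is still within 1 s of the first
```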

Rate Limit Enforcement

To enforce the rate limit, the Sliding Window Algorithm compares the number of requests made within the sliding time window to the predefined rate limit. If the count exceeds the limit, the algorithm can take one of several actions, such as delaying, rejecting, or throttling the excess requests. This ensures that the system remains within the defined rate limits while still allowing for bursts of activity when traffic naturally fluctuates.

Sliding Window Rate Limiting in Action

In the presented sequence diagram, we can observe the process of Sliding Window Rate Limiting in action.

Participants:

Client: Represents the entity or user making requests to a service.
Rate Limiter: The component responsible for rate limiting.
Service: The actual service to which the requests are being made.

The diagram depicts a sequence of events for handling incoming requests:

The client initiates requests to access a service.
These requests are intercepted by the Rate Limiter to ensure they comply with the defined rate limit.
Within each request, the following actions occur:
- The Rate Limiter checks if the sliding window needs to be adjusted. This is a crucial aspect of sliding window rate limiting. The window slides over time to maintain the rate limit.
- Expired requests that fall outside the sliding window are removed from consideration.
- The timestamp of the current request is added to the sliding window, indicating its occurrence.
Each request has two scenarios according to the window limit:
- Within Limit: If the current request rate is within the defined limit, the Rate Limiter forwards the request to the Service. The Service processes the request and sends a response back to the Rate Limiter, which, in turn, forwards the response to the Client.
- Exceeds Limit: If the current request rate exceeds the defined limit, the Rate Limiter responds to the Client with a request rejection, indicating that the limit has been exceeded.
The process continues in a loop for incoming requests.
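The flow above can be sketched as a small wrapper that sits between the client and the service. Everything here is an assumption made for illustration: the ServiceResponse shape, the createRateLimitedService helper, and the use of HTTP status 429 (a common convention for rejected requests). Unlike the sequence described above, this sketch records a timestamp only for requests that are accepted.

```typescript
interface ServiceResponse {
  status: number;
  body: string;
}

type Service = (clientId: string) => ServiceResponse;

function createRateLimitedService(
  service: Service,
  limit: number,
  windowMs: number,
): (clientId: string, nowMs?: number) => ServiceResponse {
  // One list of accepted request timestamps per client.
  const windows = new Map<string, number[]>();

  return (clientId: string, nowMs: number = Date.now()): ServiceResponse => {
    // Slide the window: drop timestamps that are no longer inside it.
    const recent = (windows.get(clientId) ?? []).filter((t) => t > nowMs - windowMs);

    if (recent.length >= limit) {
      windows.set(clientId, recent);
      // Exceeds limit: reject without calling the service.
      return { status: 429, body: "Too Many Requests" };
    }

    // Within limit: record the request and forward it to the service.
    recent.push(nowMs);
    windows.set(clientId, recent);
    return service(clientId);
  };
}

// Usage: allow 2 requests per second per client.
const limited = createRateLimitedService((id) => ({ status: 200, body: `hello ${id}` }), 2, 1000);
```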

The Benefits of the Sliding Window Algorithm

The Sliding Window Algorithm offers several advantages, including:

Flexibility: It adapts to varying traffic patterns and prevents abrupt cutoffs when rate limits are exceeded.
Fairness: The algorithm ensures that resources are allocated fairly, preventing any single client or user from monopolizing them.
Protection: It safeguards against abuse, ensuring that the system remains secure and available even in the face of excessive traffic.

Comprehensive Rate Limiting

Rate limiting every endpoint in an API, including internal health checks, is a thoughtful approach with several compelling reasons. Firstly, it ensures a consistent and fair allocation of resources across all aspects of your system. Whether it’s a critical external API endpoint serving thousands of users or an internal health check that monitors system well-being, rate limiting helps maintain an equitable balance.

Furthermore, by rate-limiting internal health checks, you prevent the accidental or deliberate overuse of these checks. In a busy or complex system, frequent health checks can inadvertently strain resources, leading to performance issues. Rate limiting mitigates this risk, ensuring that essential monitoring activities remain effective without becoming a source of contention.

Additionally, rate limiting provides a layer of defense against potential security threats. Even internal endpoints can be vulnerable to abuse or misuse. By implementing rate limits, you can protect your system from unintentional or intentional disruptions and ensure that resources are allocated where they are genuinely needed.

Beyond the Limit: Handling Excess Requests

When the rate limit is exceeded, two common strategies come into play: dropping or throttling requests. The choice between these strategies depends on the specific requirements and objectives of the system.

Dropping Requests

When the rate limit is exceeded, the “drop” strategy involves simply rejecting or discarding the excess requests. In this scenario, the server does not process the requests beyond the rate limit. This can be an effective way to ensure that the server remains within its operational limits and maintains optimal performance.

Use Cases for Dropping Requests:

Security: Dropping requests is an excellent choice when dealing with potentially malicious traffic, such as Distributed Denial of Service (DDoS) attacks. By immediately discarding excess requests, you can protect the system from being overwhelmed.
Predictable Performance: In situations where predictable performance is paramount, like critical real-time applications, dropping requests can help maintain a consistent quality of service for legitimate users by preventing the server from becoming overburdened.

Throttling Requests

Throttling, on the other hand, is a more gentle approach to managing excess requests. Instead of rejecting them outright, the server processes them at a reduced rate, slowing down the response time. Throttling aims to ensure that the system remains responsive even when the rate limit is exceeded.

Use Cases for Throttling Requests:

User Experience: Throttling can be ideal when maintaining a positive user experience is essential. It allows legitimate users to continue accessing the service, albeit at a slower rate, rather than abruptly blocking their requests.
Graceful Degradation: Throttling is beneficial in scenarios where graceful degradation of service is preferable. For instance, during periods of high traffic, it allows the system to remain operational, though at a reduced capacity.

The choice between dropping and throttling requests when the rate limit is exceeded depends on the specific use case and the desired outcome. Dropping requests provides swift and strict enforcement of rate limits, making it suitable for security-critical and performance-sensitive scenarios. Throttling requests, on the other hand, offers a more lenient approach that prioritizes maintaining user access and system availability, making it a good fit for situations where user experience and graceful service degradation are vital.
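As a minimal sketch of the two strategies, the function below decides what to do with a request given the timestamps of recently accepted requests. The Decision type and the decide function are hypothetical names for this example. For throttling, it computes how long the request would have to wait until a slot frees up in the sliding window.

```typescript
type Decision =
  | { action: "allow" }
  | { action: "reject" }
  | { action: "delay"; delayMs: number };

function decide(
  recentTimestampsMs: number[], // accepted requests, oldest first
  nowMs: number,
  limit: number,
  windowMs: number,
  strategy: "drop" | "throttle",
): Decision {
  const inWindow = recentTimestampsMs.filter((t) => t > nowMs - windowMs);
  if (inWindow.length < limit) {
    return { action: "allow" };
  }
  if (strategy === "drop") {
    // Drop: refuse the request outright.
    return { action: "reject" };
  }
  // Throttle: wait until enough old requests leave the window to free one slot.
  const blocking = inWindow[inWindow.length - limit] ?? nowMs;
  return { action: "delay", delayMs: blocking + windowMs - nowMs };
}

// With 3 requests already accepted at 0, 100 and 200 ms, a limit of 3 per second,
// and a new request at 500 ms:
//   "drop"     -> { action: "reject" }
//   "throttle" -> { action: "delay", delayMs: 500 }  (the request at 0 ms leaves the window at 1000 ms)
```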



Distributed High-Speed Spell Checking & Correction In Node.js

Mohamed El-Bably — Wed, 03 May 2023 19:13:35 +0000

Introduction

Spell-checking is a widely used feature in software applications such as search engines, email clients, word processors and chatbots.

SymSpellEx (Extended SymSpell)

Node.js spelling correction & Fuzzy search library based on Symmetric Delete Spelling Correction algorithm (SymSpell)

Why SymSpellEx

Simple spell checking algorithms, such as the Norvig Algorithm, are not efficient and can be slow.
I needed a spell checker that would work on a distributed system with multiple nodes, to perform spelling correction without duplicating data on each node.
I needed to ensure that any changes or updates to the training data would be reflected across all connected nodes.

Edit Distance

Edit distance is a measure of how different two strings are in terms of the minimum number of operations required to transform one string into the other.

Edit distance algorithms compute the minimum number of edit operations required to transform one string into another, where edit operations include insertion, deletion, and substitution of characters. Some popular edit distance algorithms include:

Levenshtein distance: This algorithm computes the minimum number of insertions, deletions, and substitutions required to transform one string into another.
Damerau-Levenshtein distance: This is similar to Levenshtein distance, but also allows for transpositions (i.e., swapping adjacent characters).
Jaro distance: This algorithm computes the similarity between two strings based on the number of matching characters and the number of transpositions needed to make the strings identical.
Jaro-Winkler distance: This is an extension of the Jaro distance that assigns higher weights to prefix matches.

Example

The Levenshtein distance between "kitten" and "sitting" is 3. A minimal edit script that transforms the former into the latter is:

kitten → sitten (substitute "s" for "k")
sitten → sittin (substitute "i" for "e")
sittin → sitting (insert "g" at the end)

The total number of operations needed to transform "kitten" into "sitting" is three, which is the edit distance between the two strings.
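For reference, here is a compact sketch of the classic dynamic-programming computation of Levenshtein distance. It compares strings by UTF-16 code unit, which is fine for the ASCII example above but is a simplification for text with characters outside the Basic Multilingual Plane.

```typescript
function levenshtein(a: string, b: string): number {
  // previous[j] is the distance between the first i - 1 characters of a
  // and the first j characters of b.
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j);

  for (let i = 1; i <= a.length; i++) {
    const current = [i];
    for (let j = 1; j <= b.length; j++) {
      const substitutionCost = a[i - 1] === b[j - 1] ? 0 : 1;
      current[j] = Math.min(
        previous[j] + 1, // deletion
        current[j - 1] + 1, // insertion
        previous[j - 1] + substitutionCost, // substitution or match
      );
    }
    previous = current;
  }
  return previous[b.length];
}

console.log(levenshtein("kitten", "sitting")); // 3
```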

Naive approach

It involves checking each word in the dictionary to find the word with the smallest edit distance to the misspelled word.

For example, if the misspelled word is "teh", the algorithm will check every word in the dictionary to find the word with the smallest edit distance. The algorithm may find "the" as a possible correction, since it only requires one edit under Damerau-Levenshtein distance (a transposition of "e" and "h"). Under plain Levenshtein distance the same fix counts as two substitutions.

This approach is computationally very expensive, which makes it not practical to use.
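A minimal sketch of the naive approach is shown below. It uses the optimal string alignment variant of Damerau-Levenshtein distance so that a swapped pair of adjacent letters counts as one edit. The function names and the tiny dictionary are illustrative only.

```typescript
// Optimal string alignment distance: Levenshtein plus adjacent transpositions.
function osaDistance(a: string, b: string): number {
  // d[i][j] is the distance between the first i chars of a and the first j chars of b.
  const d: number[][] = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0)),
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost);
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1); // transposition
      }
    }
  }
  return d[a.length][b.length];
}

// Naive correction: compare the input against every dictionary word.
function naiveCorrect(word: string, dictionary: string[]): string | undefined {
  let best: string | undefined;
  let bestDistance = Infinity;
  for (const candidate of dictionary) {
    const distance = osaDistance(word, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

console.log(naiveCorrect("teh", ["they", "them", "the"])); // "the"
```

Every lookup runs one distance computation per dictionary word, which is why the cost grows with the size of the dictionary.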

Peter Norvig Algorithm

The Peter Norvig algorithm is a spell-checking algorithm that generates all possible terms within a given edit distance from the query term and searches for them in a dictionary to find the correct spelling. The edit distance is calculated based on the number of operations needed to transform the query term into the correct spelling, where an operation can be a deletion, transposition, replacement, or insertion of a single character.

The algorithm follows these steps:

Generate all possible candidate words within an edit distance of n from the input word, by applying deletions, transpositions, replacements, and insertions to the input word.
Filter the candidate words to keep only the ones that appear in a pre-existing corpus of text, in this case, a large English text corpus.
Rank the remaining candidate words based on their frequency of occurrence in the corpus.
Return the most likely candidate word as the corrected spelling.

Example:
Suppose the input word is "teh", and we have a large English text corpus to draw from. We first generate all possible candidate strings within an edit distance of 2 from "teh". Most of these are not real words, but the set also contains words such as "the", "tea", "ten", "they", and "them".

Next, we filter this set to only include the candidates that appear in the English text corpus. This leaves known words such as "the", "tea", "ten", "they", and "them".

Finally, we rank the remaining candidate words by their frequency of occurrence in the corpus. Since "the" is a very common word in English, it is much more likely to be the intended spelling than the other candidates. Therefore, "the" is returned as the corrected spelling for the input word "teh".

However, this method can be expensive and language-dependent. For a word of length n, an alphabet size a, and an edit distance of 1, there will be n deletions, n-1 transpositions, a * n alterations, and a * (n+1) insertions.

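The number of candidate words at edit distance 1 can be calculated using:
2n + 2an + a - 1, or 54n + 25 (assuming the English alphabet, where a = 26). For n = 8 this gives 457 candidates at edit distance 1, and applying the same edits a second time to reach edit distance 2 can lead to 90902 candidate words to search.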
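As a quick check of the arithmetic, the edit-distance-1 count can be evaluated term by term. The function name `candidateCount` is just an illustrative choice.

```typescript
// Number of distance-1 candidates for a word of length n over an alphabet of size a.
function candidateCount(n: number, a: number): number {
    const deletions = n;
    const transpositions = n - 1;
    const alterations = a * n;
    const insertions = a * (n + 1);
    return deletions + transpositions + alterations + insertions; // = 2n + 2an + a - 1
}

console.log(candidateCount(8, 26)); // 457, which equals 54 * 8 + 25
```

The count is linear in n for a single edit, but each additional unit of edit distance applies the same expansion to every candidate again, which is where the cost explodes.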

In some languages, such as Chinese, the alphabet can be enormous, resulting in even more possible words to search through.

Symmetric Delete Spelling Correction algorithm

The Symmetric Delete spelling correction algorithm simplifies the process of generating possible spellings and searching in a dictionary. It does so by using only deletions, instead of deletions combined with transpositions, replacements, and insertions. As a result, it is reported to be six orders of magnitude faster than traditional methods for an edit distance of 3. Moreover, it is language-independent.

Why is the algorithm fast?

The work is pre-calculated in the training phase: all possible spelling error variants (deletes only) are generated and stored in a hash table, which also makes searching fast, with an average search time complexity of O(1).

This makes the algorithm very fast, but it also requires a large memory footprint, and the training phase takes a considerable amount of time to build the dictionary the first time.
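The idea can be sketched in a few lines of TypeScript. This is a simplified illustration and not SymSpell's actual implementation: the function names are assumptions, and a real implementation would still verify each returned candidate with a true edit distance function and rank the results by frequency.

```typescript
// All strings reachable from `word` by deleting up to `maxDistance` characters
// (the word itself is included).
function deleteVariants(word: string, maxDistance: number): Set<string> {
    const results = new Set<string>([word]);
    let frontier: string[] = [word];
    for (let d = 0; d < maxDistance; d++) {
        const next: string[] = [];
        for (const w of frontier) {
            for (let i = 0; i < w.length; i++) {
                const shorter = w.slice(0, i) + w.slice(i + 1);
                if (!results.has(shorter)) {
                    results.add(shorter);
                    next.push(shorter);
                }
            }
        }
        frontier = next;
    }
    return results;
}

// Training phase: map every delete variant to the dictionary terms that produce it.
function buildDeleteIndex(terms: string[], maxDistance: number): Map<string, Set<string>> {
    const index = new Map<string, Set<string>>();
    for (const term of terms) {
        for (const variant of deleteVariants(term, maxDistance)) {
            let bucket = index.get(variant);
            if (!bucket) {
                bucket = new Set<string>();
                index.set(variant, bucket);
            }
            bucket.add(term);
        }
    }
    return index;
}

// Search phase: generate deletes of the input and do one hash lookup per variant.
function lookupCandidates(
    input: string,
    index: Map<string, Set<string>>,
    maxDistance: number
): Set<string> {
    const candidates = new Set<string>();
    for (const variant of deleteVariants(input, maxDistance)) {
        const bucket = index.get(variant);
        if (bucket) {
            for (const term of bucket) candidates.add(term);
        }
    }
    return candidates;
}

const deleteIndex = buildDeleteIndex(['argument', 'computer'], 1);
// 'argoment' and 'argument' both reduce to 'argment', so the replacement is found
// even though only deletions were ever generated.
console.log(lookupCandidates('argoment', deleteIndex, 1).has('argument')); // true
```

Because deletes are applied to both the dictionary terms and the input, insertions, replacements, and transpositions are covered without ever enumerating the alphabet, which is also why the approach does not depend on the language.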

For more details check SymSpell

SymSpellEx

SymSpellEx builds on SymSpell and adds extensibility: you can plug in different edit distance algorithms or different data stores.

Features

Very fast
Word suggestions
Word correction
Multiple languages supported - The algorithm and the implementation are language-independent
Extendable - Edit distance and data stores can be implemented to extend library functionalities

Main Components

1. Tokenizer

This interface can be implemented to provide a different tokenizer for the library. A simple core tokenizer is provided.

export interface Tokenizer {
    tokenize(input: string): Array<Token>;
}
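
As a rough illustration of implementing this interface, the sketch below splits the input on whitespace. The `Token` shape used here (`value` and `start`) is a hypothetical stand-in, since the library's actual `Token` type is not shown in this article, and `WhitespaceTokenizer` is not part of the library.

```typescript
// Hypothetical Token shape for this example; the library's real Token type may differ.
interface Token {
    value: string;
    start: number;
}

// Same contract as the Tokenizer interface above, redeclared to keep the sketch self-contained.
interface Tokenizer {
    tokenize(input: string): Array<Token>;
}

class WhitespaceTokenizer implements Tokenizer {
    tokenize(input: string): Array<Token> {
        const tokens: Token[] = [];
        const pattern = /\S+/g; // each run of non-whitespace characters is one token
        let match: RegExpExecArray | null;
        while ((match = pattern.exec(input)) !== null) {
            tokens.push({ value: match[0], start: match.index });
        }
        return tokens;
    }
}

const sampleTokens = new WhitespaceTokenizer().tokenize('Special relatvity was');
console.log(sampleTokens.map((t) => t.value)); // [ 'Special', 'relatvity', 'was' ]
```

Keeping the start offset of each token makes it possible to write corrected words back into the original sentence without disturbing its spacing.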

2. Edit Distance

This interface can be implemented to provide different algorithms to use to calculate edit distance between two words.

interface EditDistance {
    name: string;
    calculateDistance(source: string, target: string): number;
}
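
To show what a custom implementation might look like, here is a sketch that plugs in plain Levenshtein distance (no transpositions), which is a different algorithm from the built-in one. The class name `LevenshteinDistance` is an assumption for this example and is not part of the library.

```typescript
// Same contract as the EditDistance interface above, redeclared to keep the sketch self-contained.
interface EditDistance {
    name: string;
    calculateDistance(source: string, target: string): number;
}

class LevenshteinDistance implements EditDistance {
    name = 'levenshtein';

    // Classic dynamic programming over two rows: insert, delete, and replace each cost 1.
    calculateDistance(source: string, target: string): number {
        let previous = Array.from({ length: target.length + 1 }, (_, j) => j);
        for (let i = 1; i <= source.length; i++) {
            const current = [i];
            for (let j = 1; j <= target.length; j++) {
                const cost = source[i - 1] === target[j - 1] ? 0 : 1;
                current[j] = Math.min(
                    previous[j] + 1, // deletion
                    current[j - 1] + 1, // insertion
                    previous[j - 1] + cost // replacement or match
                );
            }
            previous = current;
        }
        return previous[target.length];
    }
}

const levenshtein = new LevenshteinDistance();
console.log(levenshtein.calculateDistance('kitten', 'sitting')); // 3
```

Since swapped letters count as two edits under this metric, a transposition such as "teh" for "the" has distance 2 here, whereas a Damerau-Levenshtein implementation counts it as 1.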

Built-in Edit Distance Algorithms

Damerau–Levenshtein distance

3. Data Store

This interface can be implemented to provide an additional way to store data besides the built-in stores (Memory, Redis).

Data store should handle storage for these 2 data types:

Terms: List data structure to store terms and retrieve it by index
Entries: Hash Table data structure to store dictionary entries and retrieve data by term (Key)

The data store should also handle storage for multiple languages and switching between them. Check the implemented data stores for reference.

export interface DataStore {
    name: string;
    initialize(): Promise<void>;
    isInitialized(): boolean;
    setLanguage(language: string): Promise<void>;
    pushTerm(value: string): Promise<number>;
    getTermAt(index: number): Promise<string>;
    getTermsAt(indexes: Array<number>): Promise<Array<string>>;
    getEntry(key: string): Promise<Array<number>>;
    getEntries(keys: Array<string>): Promise<Array<Array<number>>>;
    setEntry(key: string, value: Array<number>): Promise<boolean>;
    hasEntry(key: string): Promise<boolean>;
    maxEntryLength(): Promise<number>;
    clear(): Promise<void>;
}
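
For orientation, here is a minimal sketch of a custom store backed by plain arrays and `Map` objects. It is not the library's `MemoryStore`: the class name `SimpleMapStore` is made up, and several behaviours are assumptions, namely that `pushTerm` resolves to the index of the new term, that missing terms and entries resolve to an empty string or empty array, and that `maxEntryLength` tracks the longest entry key.

```typescript
// Interface copied from the article so the sketch is self-contained.
interface DataStore {
    name: string;
    initialize(): Promise<void>;
    isInitialized(): boolean;
    setLanguage(language: string): Promise<void>;
    pushTerm(value: string): Promise<number>;
    getTermAt(index: number): Promise<string>;
    getTermsAt(indexes: Array<number>): Promise<Array<string>>;
    getEntry(key: string): Promise<Array<number>>;
    getEntries(keys: Array<string>): Promise<Array<Array<number>>>;
    setEntry(key: string, value: Array<number>): Promise<boolean>;
    hasEntry(key: string): Promise<boolean>;
    maxEntryLength(): Promise<number>;
    clear(): Promise<void>;
}

// Per-language state: a term list, an entries hash table, and the longest key seen.
interface LanguageData {
    terms: string[];
    entries: Map<string, number[]>;
    maxLength: number;
}

class SimpleMapStore implements DataStore {
    name = 'simple-map';
    private initialized = false;
    private language = 'default';
    private data = new Map<string, LanguageData>();

    // Returns (and lazily creates) the state of the active language.
    private current(): LanguageData {
        let state = this.data.get(this.language);
        if (!state) {
            state = { terms: [], entries: new Map<string, number[]>(), maxLength: 0 };
            this.data.set(this.language, state);
        }
        return state;
    }

    async initialize(): Promise<void> { this.initialized = true; }
    isInitialized(): boolean { return this.initialized; }
    async setLanguage(language: string): Promise<void> { this.language = language; }

    // Assumption: resolves to the index of the newly stored term.
    async pushTerm(value: string): Promise<number> {
        return this.current().terms.push(value) - 1;
    }
    async getTermAt(index: number): Promise<string> {
        return this.current().terms[index] ?? '';
    }
    async getTermsAt(indexes: Array<number>): Promise<Array<string>> {
        return Promise.all(indexes.map((i) => this.getTermAt(i)));
    }
    async getEntry(key: string): Promise<Array<number>> {
        return this.current().entries.get(key) ?? [];
    }
    async getEntries(keys: Array<string>): Promise<Array<Array<number>>> {
        return Promise.all(keys.map((k) => this.getEntry(k)));
    }
    async setEntry(key: string, value: Array<number>): Promise<boolean> {
        const state = this.current();
        state.entries.set(key, value);
        state.maxLength = Math.max(state.maxLength, key.length);
        return true;
    }
    async hasEntry(key: string): Promise<boolean> { return this.current().entries.has(key); }
    async maxEntryLength(): Promise<number> { return this.current().maxLength; }
    // Assumption: clears every language, not only the active one.
    async clear(): Promise<void> { this.data.clear(); }
}
```

Entries hold term indexes rather than the terms themselves, so many delete variants can point at the same term without duplicating the string.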

Built-in data stores

Memory: Stores data in memory, using array structure for terms and high speed hash table (megahash) to manage dictionary entries

May be limited by node process memory limits, which can be overridden

Redis: Stores data into Redis database using list structure to store terms and hash to store dictionary data

This is a very efficient way to train and store data. It allows access by multiple processes and/or nodes/machines, so data can be trained once on a centralized, distributed store. With enough memory, data can be updated in production without interruptions by using a different namespace on Redis, and dumping and migrating data is easy.

The Redis data store uses Lua scripting to set entries efficiently, running multiple Redis commands on the server side through a custom command named hSetEntry:

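async initialize(): Promise<void> {
    // Register a custom command; its Lua body runs atomically on the Redis server.
    // KEYS[1]: entries hash, KEYS[2]: entry key, ARGV[1]: entry value,
    // ARGV[2]: config field that tracks the longest entry key stored so far.
    await this._redis.defineCommand('hSetEntry', {
        numberOfKeys: 2,
        lua: `
            local olen = redis.call("hget", "${this._configNamespace}", ARGV[2])
            local value = redis.call("hset", KEYS[1], KEYS[2], ARGV[1])
            local nlen = #KEYS[2]
            if(not olen or nlen > tonumber(olen)) then
              redis.call("hset", "${this._configNamespace}", ARGV[2], nlen)
            end
            return value
        `
    });
    this._initialized = true;
}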

Usage

Training

For single term training you can use add function:

import {SymSpellEx, MemoryStore} from 'symspell-ex';

const LANGUAGE = 'en';
// Create a SymSpellEx instance and inject a new store instance
const symSpellEx = new SymSpellEx(new MemoryStore());
await symSpellEx.initialize();
// Train data
await symSpellEx.add("argument", LANGUAGE);
await symSpellEx.add("computer", LANGUAGE);

For multiple terms (Array) you can use train function:

const terms = ['argument', 'computer'];
await symSpellEx.train(terms, 1, LANGUAGE);

Searching

search function can be used to get multiple suggestions if available up to the maxSuggestions value

Arguments:

input String (Wrong/Invalid word we need to correct)
language String (Language to be used in search)
maxDistance Number, optional, default = 2 (Maximum distance for suggestions)
maxSuggestions Number, optional, default = 5 (Maximum suggestions number to return)

Return: Array<Suggestion> Array of suggestions

Example

await symSpellEx.search('argoments', 'en');

Example Suggestion Object:

{
  "term": "argoments",
  "suggestion": "arguments",  
  "distance": 2,
  "frequency": 155
}

Correction

correct function can be used to get the best suggestion for input word or sentence in terms of edit distance and frequency

Arguments:

input String (Wrong/Invalid word or sentence we need to correct)
language String (Language to be used in search)
maxDistance Number, optional, default = 2 (Maximum distance for suggestions)

Return: Correction object which contains original input and corrected output string, with array of suggestions

Example

await symSpellEx.correct('Special relatvity was orignally proposed by Albert Einstein', 'en');

Returns this Correction object:

This output depends entirely on the quality of the training data that was pushed into the store.

{
  "suggestions": [],
  "input": "Special relatvity was orignally proposed by Albert Einstein",
  "output": "Special relativity was originally proposed by Albert Einstein"
}

Github Repository

https://github.com/m-elbably/symspell-ex

Conclusion

Spell-checking is a widely used feature in software applications. Simple spell-checking algorithms are not efficient and can be slow.

There are many popular edit distance algorithms such as Levenshtein distance, Damerau-Levenshtein distance, Jaro distance, and Jaro-Winkler distance.

The naive approach involves checking each word in the dictionary to find the word with the smallest edit distance to the misspelled word, which is computationally expensive.

Peter Norvig's algorithm is a spell-checking algorithm that generates all possible terms with a given edit distance from the query term and searches them in a dictionary to find the correct spelling.

The Symmetric Delete spelling correction algorithm simplifies the process of generating possible spellings and searching in a dictionary by using only deletion as an operation, which makes it much faster and language-independent.

GPT Graph: A Simple Tool for Knowledge Graph Exploration

Mohamed El-Bably — Sun, 30 Apr 2023 09:28:18 +0000

As a developer, exploring and organizing information is a crucial part of the job. That's why I wanted to share with you an open-source tool that I developed called "GPT Graph". It is a knowledge graph explorer that utilizes the powerful GPT 3.5 turbo model to help users explore information in an organized and intuitive manner.

I believe graphs are an excellent way to leverage LLMs in a variety of use cases, including brainstorming, studying, and reasoning about topics and how they relate to each other.

What is a Knowledge Graph

A knowledge graph is a type of database that is used to store and represent knowledge in a machine-readable format. It uses a graph-based model, consisting of nodes (entities) and edges (relationships), to represent information and the connections between them. Knowledge graphs are often used to represent complex information in a structured and intuitive way, making it easier for machines to understand and analyze. They can be used in various domains, such as natural language processing, search engines, recommendation systems, and data analytics.
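
To make the node and edge model concrete, here is a minimal in-memory sketch. The type and class names (`TopicNode`, `TopicEdge`, `KnowledgeGraph`) are illustrative assumptions and are not taken from the GPT Graph source code.

```typescript
// An entity in the graph.
interface TopicNode {
    id: string;
    label: string;
}

// A directed, labeled relationship between two entities.
interface TopicEdge {
    source: string;
    target: string;
    relation: string;
}

class KnowledgeGraph {
    private nodes = new Map<string, TopicNode>();
    private edges: TopicEdge[] = [];

    addNode(node: TopicNode): void {
        this.nodes.set(node.id, node);
    }

    addEdge(source: string, target: string, relation: string): void {
        if (!this.nodes.has(source) || !this.nodes.has(target)) {
            throw new Error('Both endpoints must be added as nodes first');
        }
        this.edges.push({ source, target, relation });
    }

    // Entities directly reachable from `id`, paired with the relationship label.
    related(id: string): Array<{ node: TopicNode; relation: string }> {
        const result: Array<{ node: TopicNode; relation: string }> = [];
        for (const edge of this.edges) {
            if (edge.source !== id) continue;
            const node = this.nodes.get(edge.target);
            if (node) result.push({ node, relation: edge.relation });
        }
        return result;
    }
}

const topicGraph = new KnowledgeGraph();
topicGraph.addNode({ id: 'quadratic', label: 'Quadratic equation' });
topicGraph.addNode({ id: 'formula', label: 'Quadratic formula' });
topicGraph.addEdge('quadratic', 'formula', 'solved by');
console.log(topicGraph.related('quadratic')[0].node.label); // Quadratic formula
```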

Why GPT Graph?

It's a unique way to explore information in an organized and intuitive manner. With GPT Graph, you can easily navigate through different topics, discover new relationships between them, and generate creative ideas.

It leverages the power of GPT-3 to generate relevant and high-quality content. Unlike traditional keyword-based searches, GPT Graph takes a more semantic approach to explore the topics and generate the graph. It helps to uncover hidden relationships between different topics and provides a comprehensive view of the entire knowledge domain.

Moreover, GPT Graph provides a user-friendly interface that allows users to interact with the graph easily. Users can ask questions, generate prompts, and add their own ideas to the graph. It's a powerful tool that enables users to collaborate, brainstorm, and generate new insights in a very efficient way.

Features

The ability to describe a specific query and generate a graph of related topics.
Auto-generated prompts for generated nodes to discover more about the topic.
Custom prompts support, asking questions and getting answers, along with generated prompts to branch from your own ideas.
Markdown formatted descriptions.

Example

Adding the following prompt as a starting point for the graph:

Solve 5x^2 + 6x + 1 = 0

GPT Graph will solve the equation and provide these helpful prompts to expand your knowledge about quadratic equations:

What is a quadratic equation?
How do you derive the quadratic formula?
What are some methods to solve quadratic equations?

These auto-generated prompts may vary from time to time, depending on the temperature parameter value used.

Project Details

This project uses the following technologies and frameworks:

OpenAI GPT 3.5 turbo model
TypeScript
Vue.js 3.0 JavaScript Framework
Ant Design Vue UI Framework
G6 Graph Visualization Engine

Usage

npm install
npm run dev

Content Parsing

Most of the time GPT 3 returns a consistent JSON object, but unfortunately this is not always the case, so an additional layer was added to extract the actual JSON or transform it into a valid format using the jsonrepair library.
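
The extraction half of that layer might look roughly like the sketch below. It is not the project's code and does not call jsonrepair: `extractJson` is a hypothetical helper that only cuts the outermost `{...}` span out of a response that wraps the JSON in extra prose. In the real tool, a repair step would fit where this sketch gives up and returns `null`.

```typescript
// Pull the first-to-last brace span out of a model response and try to parse it.
// Returns null when no parsable JSON object is found.
function extractJson(response: string): unknown {
    const start = response.indexOf('{');
    const end = response.lastIndexOf('}');
    if (start === -1 || end <= start) {
        return null;
    }
    try {
        return JSON.parse(response.slice(start, end + 1));
    } catch {
        // Malformed JSON (trailing commas, single quotes, ...) would need a repair step here.
        return null;
    }
}

const wrappedReply = 'Sure! Here is the graph: {"nodes": ["Quadratic formula"]} Hope that helps.';
console.log(JSON.stringify(extractJson(wrappedReply))); // {"nodes":["Quadratic formula"]}
```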

Content Rendering

The G6 library is used for canvas rendering and graph arrangement, with custom nodes and edges.
Two tree graph layouts are used:

dendrogram for vertical layout
mindmap for horizontal layout

Model Parameters

temperature: 0.7 is used to get different and more creative responses every time. You may play around with different values and check the output.
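
As an illustration of where this parameter goes, the sketch below builds a request body for OpenAI's chat completions endpoint. The helper name `buildChatRequest` and the single user message are assumptions for the example. The project's actual prompts and request code are not shown here.

```typescript
interface ChatMessage {
    role: 'system' | 'user' | 'assistant';
    content: string;
}

interface ChatRequest {
    model: string;
    messages: ChatMessage[];
    temperature: number;
}

// Build the JSON body for a chat completion request.
// Higher temperature values produce more varied output; the API accepts values from 0 to 2.
function buildChatRequest(prompt: string, temperature = 0.7): ChatRequest {
    if (temperature < 0 || temperature > 2) {
        throw new RangeError('temperature must be between 0 and 2');
    }
    return {
        model: 'gpt-3.5-turbo',
        messages: [{ role: 'user', content: prompt }],
        temperature,
    };
}

const sampleRequest = buildChatRequest('Solve 5x^2 + 6x + 1 = 0');
console.log(sampleRequest.temperature); // 0.7
```

Lowering the value toward 0 makes the generated prompts more repeatable between runs, which can be handy while debugging the parsing layer.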

API Key

An OpenAI API key is required to run the tool. You can add your own key to the .env file under the name VITE_OPENAI_KEY. Check the .env.example file for reference.

GitHub Repository

https://github.com/m-elbably/gpt-graph

Future Enhancements

While this is just a limited technical example, I think there are many ways to improve it. Here are some of the potential enhancements:

Use GPT message context with the path the user took to discover specific topics, allowing the user to control the divergence of the information and get creative ideas through auto-generated prompts or questions.
Use GPT inference capabilities to label topics and automatically connect or group related nodes.
Infer relations between topics and add them to the edges.
Store the result in a NoSQL database labeled with the primary input and retrieve it later to draw the graph and search instead of querying GPT again.
Allow users to add and delete custom nodes, enabling them to use the tool as a powerful mind-mapping tool with AI behind the scenes to provide creative ideas or discuss user-created ones.
Use a general graph structure instead of a tree graph (DAG). This would allow developers and architects to design systems by adding different components and the flow of data, and then have GPT generate Mermaid diagram code for the built graphs, which opens up many possibilities.

Potential Uses for GPT Graph

GPT Graph can be used in various domains where knowledge exploration is required. Here are a few examples of how GPT Graph can be used:

Research: GPT Graph can be used by researchers to explore new topics, discover new relationships between them, and generate new insights. It can help researchers to get a comprehensive view of the entire knowledge domain and generate creative ideas.
Education: GPT Graph can be used by students to study different topics, discover new relationships between them, and generate new insights. It can help students to get a better understanding of the subject matter and generate creative ideas.
Architecture and System Design: GPT Graph can be used by architects and designers to explore new solutions, discover new relationships between different components, and generate new ideas. It can help them to generate creative and innovative designs.
Business: GPT Graph can be used by businesses to explore new opportunities, discover new relationships between different markets and products, and generate new ideas. It can help businesses to innovate and stay ahead of the competition.

Overall, the idea of using graphs with GPT can lead to building powerful tools that can be used in various domains where knowledge exploration is required. It’s a unique way to explore information, generate new insights, and collaborate with others in an efficient way.

GPT Graph is a limited technical example that demonstrates the concept. However, it can also serve as an excellent example of how to build GPT 3 applications and use prompts as a developer to get results in a specific format and schema.

Conclusion

I hope that this small project serves as an example for you and other developers on how to build GPT 3 applications, or better yet, inspires you to create innovative tools that can be of assistance to others.

Feel free to check out the GitHub Repository and give it a try, or play with the online demo if you have an OpenAI API key.

The Original Article Published on Medium