Forem: AuthZed

Safeguarding Your Data When Using DeepSeek R1 In RAG Pipelines - Part II

Sohan — Fri, 31 Jan 2025 20:04:15 +0000

In Part I we learnt why we should secure our RAG pipelines with fine-grained authorization, and what the methods for doing so are.

Let's now get our hands dirty and write code to actually do so.

We'll be authorizing access to view blog articles and get information from them. We'll see what happens when a request is authorized and when it isn't. Here's our RAG pipeline with the software we're using.

1. Let's Talk Schema!

Let's set up our permissions system. Once you've installed SpiceDB, create a schema with two object types: users and articles. The setup is simple - users can be "viewers" of articles, and if you're tagged as a viewer, you get the all-access pass to view that article.

from authzed.api.v1 import (
    Client,
    WriteSchemaRequest,
)

import os

#change to bearer_token_credentials if you are using tls
from grpcutil import insecure_bearer_token_credentials

SCHEMA = """definition user {}

definition article {
    relation viewer: user

    permission view = viewer
}"""

client = Client(os.getenv('SPICEDB_ADDR'), insecure_bearer_token_credentials(os.getenv('SPICEDB_API_KEY')))

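# Top-level await works in a Jupyter notebook; in a plain script, run this inside an async function.
try:
    resp = await client.WriteSchema(WriteSchemaRequest(schema=SCHEMA))
except Exception as e:
    print(f"Write schema error: {type(e).__name__}: {e}")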

2. Write a Relationship

Alright, first things first - we're gonna tell SpiceDB that Tim should be able to peek at document 123 and document 456. Think of it like giving Tim a special pass to view these specific files.

This is how we write a Relationship in SpiceDB. Once we've done this, SpiceDB will know exactly what Tim can and can't see.

from authzed.api.v1 import (
    ObjectReference,
    Relationship,
    RelationshipUpdate,
    SubjectReference,
    WriteRelationshipsRequest,
)

try:
    resp = await (client.WriteRelationships(
        WriteRelationshipsRequest(
            updates=[
                RelationshipUpdate(
                    operation=RelationshipUpdate.Operation.OPERATION_TOUCH,
                    relationship=Relationship(
                        resource=ObjectReference(object_type="article", object_id="123"),
                        relation="viewer",
                        subject=SubjectReference(
                            object=ObjectReference(
                                object_type="user",
                                object_id="tim",
                            )
                        ),
                    ),
                ),
                RelationshipUpdate(
                    operation=RelationshipUpdate.Operation.OPERATION_TOUCH,
                    relationship=Relationship(
                        resource=ObjectReference(object_type="article", object_id="456"),
                        relation="viewer",
                        subject=SubjectReference(
                            object=ObjectReference(
                                object_type="user",
                                object_id="tim",
                            )
                        ),
                    ),
                ),
            ]
        )
    ))
except Exception as e:
    print(f"Write relationships error: {type(e).__name__}: {e}")

3. Writing to our Vector DB

Pinecone is a vector database where we store our embeddings. Let's set up our Pinecone serverless index - don't worry, it's not as complicated as it sounds!

#from pinecone.grpc import PineconeGRPC as Pinecone
from pinecone import ServerlessSpec
from pinecone import Pinecone
import os

pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))

index_name = "oscars"

pc.create_index(
    name=index_name,
    dimension=1024,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

Here's where it gets fun - we're going to create a totally made-up fact: "Bill Gates won the 2025 Oscar for best football movie." (I know, wild right? 😄). We're using this made-up fact to show how RAG handles information that LLMs don't already know about.

We'll also add a little tag (article_id) to keep track of where this info came from. This is super important because it helps us link everything back to our permission system.

from langchain_pinecone import PineconeEmbeddings
from langchain_pinecone import PineconeVectorStore

from langchain.schema import Document
import os

# Create a Document object for our made-up article and attach the article_id as metadata.
text = "Bill Gates won the 2025 Oscar for best football movie"
metadata = {
    "article_id": "123"
}
document = Document(page_content=text, metadata=metadata)


# Initialize a LangChain embedding object.
model_name = "multilingual-e5-large"
embeddings = PineconeEmbeddings(
    model=model_name,
    pinecone_api_key=os.environ.get("PINECONE_API_KEY")
)

namespace_name = "oscar"

# Upsert the embedding into your Pinecone index.
docsearch = PineconeVectorStore.from_documents(
    documents=[document],
    index_name=index_name,
    embedding=embeddings,
    namespace=namespace_name
)

4. Checking Tim's VIP Permissions

Now comes the cool part! We'll ask SpiceDB what documents Tim can actually see. This is how you can check for permissions and look up resources in SpiceDB. Here we're using the LookupResources API to get a list of articles that Tim has permission to view.

from authzed.api.v1 import (
    LookupResourcesRequest,
    ObjectReference,
    SubjectReference,
)

subject = SubjectReference(
    object=ObjectReference(
        object_type="user",
        object_id="tim",
    )
)

def lookupArticles():
    return client.LookupResources(
        LookupResourcesRequest(
            subject=subject,
            permission="view",
            resource_object_type="article",
        )
    )

# Start from an empty list so a failed lookup grants access to nothing.
authorized_articles = []

try:
    resp = lookupArticles()

    async for response in resp:
        authorized_articles.append(response.resource_object_id)
except Exception as e:
    print(f"Lookup error: {type(e).__name__}: {e}")

print("Article IDs that Tim is authorized to view:")
print(authorized_articles)

Output:

Article IDs that Tim is authorized to view:
['123', '456']
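Since we'll run this lookup more than once, it can help to wrap the "drain the stream into a list" step in a helper. Here's a minimal sketch: `authorized_ids` and `fake_stream` are hypothetical names, and `fake_stream` is only a stand-in for the stream that `client.LookupResources` returns, so that the snippet runs without a SpiceDB instance.

```python
import asyncio
from types import SimpleNamespace

async def authorized_ids(stream):
    """Drain a streaming lookup into a list; return no IDs if the lookup fails."""
    ids = []
    try:
        async for item in stream:
            ids.append(item.resource_object_id)
    except Exception as e:
        print(f"Lookup error: {type(e).__name__}: {e}")
        return []  # fail closed: an error must not leave partial or stale access
    return ids

# Stand-in for the stream returned by client.LookupResources (assumption for this sketch).
async def fake_stream():
    for article_id in ("123", "456"):
        yield SimpleNamespace(resource_object_id=article_id)

result = asyncio.run(authorized_ids(fake_stream()))
print(result)
```

In a notebook, where an event loop is already running, you would write `await authorized_ids(lookupArticles())` instead of calling `asyncio.run`.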

With that sorted, we can chat with our DeepSeek R1 model, but only about stuff Tim's allowed to see. It's like having a really smart assistant who's also great at keeping secrets!

Quick side notes:

We're using OpenRouter to access the DeepSeek R1 LLM.
We're sticking with OpenAI for the embeddings part because they're pretty much the gold standard for this kind of thing.

from langchain_community.chat_models import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import (
    RunnableParallel,
    RunnablePassthrough
)
import os

# Custom wrapper for OpenRouter
class ChatOpenRouter(ChatOpenAI):
    openai_api_base: str
    openai_api_key: str
    model_name: str

    def __init__(self,
                 model_name: str,
                 openai_api_base: str = "https://openrouter.ai/api/v1",
                 **kwargs):
        openai_api_key = os.environ.get("OPENROUTER_API_KEY") 
        super().__init__(openai_api_base=openai_api_base,
                         openai_api_key=openai_api_key,
                         model_name=model_name, **kwargs)

# Define the ask function
def ask():

    # Initialize a LangChain object for DeepSeek via OpenRouter.
    llm = ChatOpenRouter(
      model_name="deepseek/deepseek-r1-distill-llama-70b",
      max_tokens=None,
      max_retries=2,
    )

    # Initialize a LangChain object for a Pinecone index with OpenAI embeddings model.
    knowledge = PineconeVectorStore.from_existing_index(
        index_name=index_name,
        namespace=namespace_name,
        embedding=OpenAIEmbeddings(
            openai_api_key=os.environ.get("OPENAI_API_KEY"),
            dimensions=1024,
            model="text-embedding-3-large"
        )
    )

    # Initialize a retriever with a filter that restricts the search to authorized documents.
    retriever = knowledge.as_retriever(
        search_kwargs={
            "filter": {
                "article_id":
                    {"$in": authorized_articles},
            },
        }
    )

    # Initialize a string prompt template for context and question.
    prompt = ChatPromptTemplate.from_template(
        "Answer the question below using the context:\n\nContext: {context}\nQuestion: {question}\nAnswer:"
    )

    # Combine retrieval and prompt to pass through DeepSeek LLM via OpenRouter
    retrieval = RunnableParallel(
        {"context": retriever, "question": RunnablePassthrough()}
    )
    chain = retrieval | prompt | llm | StrOutputParser()

    # Example question
    question = "Who won the 2025 Oscar for best football movie?"

    print("Prompt: \n")
    print(question)
    result = chain.invoke(question)
    print(result)


# Invoke the ask function
ask()

Output:

Prompt: 

Who won the 2025 Oscar for best football movie?
Bill Gates won the 2025 Oscar for best football movie.

Answer: Bill Gates

There you go! Our RAG pipeline retrieved information that the LLM didn't already know.

5. What Happens When Tim's Pass Expires?

Let's shake things up and see what happens when Tim loses access to some docs.

First step: we're gonna revoke Tim's viewing privileges for a document. This code snippet deletes the relationship between Tim and document 123.

try: 
    resp = await client.WriteRelationships(
        WriteRelationshipsRequest(
            updates=[
                RelationshipUpdate(
                    operation=RelationshipUpdate.Operation.OPERATION_DELETE,
                    relationship=Relationship(
                        resource=ObjectReference(object_type="article", object_id="123"),
                        relation="viewer",
                        subject=SubjectReference(
                            object=ObjectReference(
                                object_type="user",
                                object_id="tim",
                            )
                        ),
                    ),
                ),
            ]
        )
    )
except Exception as e:
    print(f"Write relationships error: {type(e).__name__}: {e}")

Then we'll double-check what Tim can still see.

# lookupArticles() was defined above
authorized_articles = []

try:
    resp = lookupArticles()

    async for response in resp:
        authorized_articles.append(response.resource_object_id)
except Exception as e:
    print(f"Lookup error: {type(e).__name__}: {e}")

print("Documents that Tim can view:")
print(authorized_articles)

Output:

Documents that Tim can view:
['456']

Tim's lost access to document_123 which had the vital piece of info about the "2025 Oscar for Best Football Movie".

Time to try our query again!

#this function was defined above
ask()

Output:

Prompt: 

Who won the 2025 Oscar for best football movie?
The 2025 Oscars, which honored films released in 2024, did not include a category for "best football movie." The Academy Awards do not have a specific category dedicated to sports films or football-themed movies. Therefore, no award was given in that non-existent category. It's possible there might be confusion with another award ceremony that recognizes sports-related films. 

Answer: No one won an Oscar for best football movie in 2025 because the Academy Awards do not have such a category.

And... plot twist! The system won't spill the beans anymore because Tim's not authorized to see that document. It's like trying to read a book that's been checked out of the library.

Conclusion

This was a step-by-step guide on how you can add fine-grained authorization to your RAG pipelines. Do you have other ways of writing authorization logic for your LLMs and RAGs? Let me know in the comments!

As for the image: Well, this is what DALL-E thinks "Bill Gates won the 2025 Oscar for best football movie" looks like!

As promised, here is a link to the working Jupyter Notebook. Have fun!

Safeguarding Your Data When Using DeepSeek R1 In RAG Pipelines - Part 1

Sohan — Fri, 31 Jan 2025 19:38:58 +0000

DeepSeek is the talk of the tech world right now, and rightfully so!

If you're implementing the DeepSeek Large Language Model (or any LLM for that matter) in your Retrieval-Augmented Generation (RAG) pipeline, you have to ensure that the LLM accesses only the data it's authorized to.

This guide will walk you through the nuts and bolts of securing your RAG pipelines with fine-grained authorization while also making your queries secure and super efficient! There's also a notebook linked at the end if you want to look at some code.

Note: This example uses DeepSeek R1 but works with any LLM. Using authorization for RAG pipelines is a best practice regardless of which LLM and embedding model you are using.

How is this image relevant? It's relevant to our RAG Pipeline and you'll find out how at the end of this guide. 🤭

Software used in this guide:

DeepSeek R1 LLM (through OpenRouter)
OpenAI for Embeddings
SpiceDB for permissions
Pinecone as our Vector Database
Langchain for language model integration

Why is this important?

Because we now need to think about Day 2 AI Ops.

Enterprises are working extra hard to keep sensitive info (like personal details and company secrets) from leaking out. The go-to solution? Setting up some solid guardrails around RAG to keep data safe while making sure everything runs smoothly and efficiently.

To get these guardrails just right, you need to set up some smart permission systems that can keep track of who can see what and which resources they can access.

How It Works

Let me break down how a typical RAG pipeline works - it's pretty straightforward with two main parts:

1. Ingestion

Think of this as preparing your knowledge base. We grab all sorts of data, clean it up a bit, turn it into embeddings (vectors that represent real-world objects), and store them in a vector database. It's like organizing your digital library, where each book (or document) gets a special tag - like "document123" - so we can keep track of where everything came from.

2. Query & Response

Here's where it gets fun! When someone asks the chatbot a question, it transforms their question into the same kind of embedding format and goes hunting through the vector database for relevant matches. It's like having a super-smart librarian who knows exactly where to look! Once it finds the answer, the chatbot feeds this information to the LLM, which crafts a nice, helpful response based on what it found.
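To make the two stages concrete, here's a minimal sketch in plain Python. The bag-of-words `embed` function and the in-memory `store` list are stand-ins invented for illustration (a real pipeline would use an embedding model and a vector database); only the shape of the flow mirrors the description above.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (an assumption, not a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1. Ingestion: embed each document and keep its source tag next to the vector.
docs = {
    "document123": "q4 revenue grew ten percent",
    "document456": "the office picnic is in june",
}
store = [(doc_id, embed(text), text) for doc_id, text in docs.items()]

# 2. Query: embed the question the same way and rank stored vectors by similarity.
def retrieve(question: str, k: int = 1):
    q = embed(question)
    ranked = sorted(store, key=lambda row: cosine(q, row[1]), reverse=True)
    return [(doc_id, text) for doc_id, _, text in ranked[:k]]

top = retrieve("what was our q4 revenue")
print(top)
```

Notice that `retrieve` hands back the best match no matter who is asking.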

But here's the catch - and it's a big one - this setup is missing something crucial: authorization checks! 🚨

For example, if someone who shouldn't have access to sensitive financial data asks "What was our Q4 revenue?", they might get an answer they're not supposed to see. Not ideal, right?

Authorization, ReBAC & SpiceDB

In case you're new to the world of AuthZ, here's a quick primer:

Authorization determines whether you have permission to access a resource. Traditional models like Role-Based Access Control (RBAC) work well for simple setups, but as systems grow more complex, defining permissions based on roles alone can get messy. That’s where Relationship-Based Access Control (ReBAC) comes in. Instead of just assigning roles, ReBAC uses relationships—like “Alice is a manager of Project X” or “Bob is a friend of Charlie”—to determine access dynamically. This makes it ideal when it comes to securing your RAG pipelines.

This guide uses SpiceDB, a powerful, open-source database designed to handle ReBAC at scale. Inspired by Google’s Zanzibar (which powers Google's Authorization systems across Docs, YouTube and more), SpiceDB lets you define and enforce complex access rules efficiently. With it, you can model relationships between users and resources, then perform lightning-fast permission checks.

Three things about SpiceDB

Here's a quick TL;DR of how SpiceDB works:

Schema: This defines the types of objects found, how those objects relate to one another, and the permissions that can be computed off of those relations. Developers can read and write a schema based on their use-case and then store & query data.
Relationships: Relationships are what bind together a Subject and a Resource via a Relation. A functioning Permissions System that uses ReBAC is the combination of Schema and Relationships.
Checks & Lookups: Now that we have a schema and relationships in the database, we can issue checks on whether a subject has a permission on a specific resource, or what resources a subject can access whether via a computed permission or relation membership.
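These three ideas can be sketched with a toy in-memory model. This is not SpiceDB's API or storage format; the `SCHEMA` dict, the `relationships` set, and the `check` and `lookup_resources` helpers are hypothetical names used only to show how schema, relationships, and checks fit together.

```python
# Schema: for each object type, which relations grant each permission.
SCHEMA = {"article": {"view": {"viewer"}}}

# Relationships: (resource_type, resource_id, relation, subject) tuples.
relationships = {
    ("article", "123", "viewer", "user:tim"),
    ("article", "456", "viewer", "user:tim"),
    ("article", "789", "viewer", "user:ana"),
}

def check(resource_type, resource_id, permission, subject):
    """Check: does the subject hold the permission on this one resource?"""
    granting = SCHEMA[resource_type][permission]
    return any(
        (resource_type, resource_id, relation, subject) in relationships
        for relation in granting
    )

def lookup_resources(resource_type, permission, subject):
    """Lookup: which resources of this type can the subject access?"""
    granting = SCHEMA[resource_type][permission]
    return sorted(
        rid
        for (rtype, rid, relation, subj) in relationships
        if rtype == resource_type and relation in granting and subj == subject
    )

print(check("article", "123", "view", "user:tim"))
print(lookup_resources("article", "view", "user:tim"))
```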

Adding Authorization to your RAG Pipeline

Now there are two approaches to adding AuthZ to your RAG Pipeline.

Post-filter Authorization

So here's the deal: each embedding can have metadata showing which document it came from (like document123). We use this to check if you're actually allowed to see that content.

The process? We perform a check for each relevant embedding to see if the user has permission to view the document that the embedding originated from. You can specify how much context you require, for example: “I need 5 pieces of additional context before I make the prompt to the LLM” or “exhaust all the embeddings returned”.
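As a rough sketch of post-filtering (the `post_filter` helper, the `hits` list, and the `allowed` set are all made up for illustration, with `allowed` standing in for a real permissions system):

```python
def post_filter(ranked_hits, can_view, needed=5):
    """Walk hits in relevance order, keeping only those the user may view."""
    context = []
    for doc_id, text in ranked_hits:
        if can_view(doc_id):        # one permission check per candidate
            context.append(text)
        if len(context) == needed:  # stop once enough context is collected
            break
    return context

hits = [
    ("document123", "Q4 revenue was up"),
    ("document456", "Picnic in June"),
    ("document789", "Q4 costs were flat"),
]
allowed = {"document456", "document789"}  # stand-in for a permissions system
context = post_filter(hits, lambda doc_id: doc_id in allowed, needed=2)
print(context)
```

Passing `needed=None` never triggers the early stop, which corresponds to the "exhaust all the embeddings returned" option.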

Pre-filter Authorization

Here we make a query and embed it. But before diving in, we check with our permissions system to see what stuff we're actually allowed to peek at. It gives us back a list of all the documents we can access.

Then we just use that list as our filter, grab all the relevant embeddings we're allowed to see, and boom - we're good to go! That's what we'll be playing with in this guide.
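Here's a minimal sketch of the pre-filter idea in plain Python. The `pre_filter_search` helper and the precomputed `similarity` values are hypothetical; in a real pipeline the vector database applies the filter and computes similarity for you.

```python
def pre_filter_search(store, allowed_ids, score, k=5):
    """Restrict the search to authorized documents, then rank what is left."""
    candidates = [row for row in store if row["doc_id"] in allowed_ids]
    candidates.sort(key=score, reverse=True)
    return candidates[:k]

store = [
    {"doc_id": "document123", "text": "Q4 revenue was up", "similarity": 0.9},
    {"doc_id": "document456", "text": "Picnic in June", "similarity": 0.2},
]
allowed_ids = {"document456"}  # stand-in for the list the permissions system returns
results = pre_filter_search(store, allowed_ids, score=lambda row: row["similarity"])
print([row["doc_id"] for row in results])
```

Even though document123 is the better match, it never reaches the ranking step because it isn't on the allowed list.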

Step-by-step guide

Where's the code, you ask? Well, that's in Part II of this guide. Now that you've understood the concepts, here's the step-by-step guide to securing your RAG pipelines.

Don't use JWT for Authorization!

Sohan — Tue, 14 Jan 2025 14:30:00 +0000

What's with the shouty title? Well, I wanted to grab your attention and get straight to the point:

🗣️🗣️ Don't use JWT for your backend authorization

Look, there's a time and place for every piece of technology and the tricky part is determining if your use case actually is the time and place. Hopefully this post will walk you through why JWTs might not be your best friend, and the rare cases where they actually make sense.

🔄 Quick Crash Course: What's a JWT?

So, JWT (pronounced "jot") stands for JSON Web Token. It's part of this whole family of specs called JOSE (no way!) that deal with encrypting and signing JSON. JWT is the cool kid of the family - it's defined in RFC 7519 and gets all the attention. Why? Because while its siblings (JWA, JWE, JWK, JWS) handle the nitty-gritty of algorithms, encryption, keys, and signatures, JWT is the one carrying the actual payload.

Think of a JWT as a JSON object wearing a fancy coat (some headers) and carrying an ID card (a signature) to prove it's legit. It's got these things called "claims" - like when it expires (exp), who created it (iss), who it's for (aud), and so on. The most popular claim for authorization is called "scope", which, fun fact, isn't even from JOSE - it's borrowed from OAuth2. Most developers end up mixing and matching these pieces like an authorization puzzle until something works.
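To see the coat, the payload, and the ID card side by side, here's a minimal sketch that signs and verifies an HS256 token using only the Python standard library. The helper names, the secret, and the claim values are made up for illustration, and the sketch only checks the signature (it does not validate exp, iss, or aud), so treat it as a teaching aid and use a maintained JWT library in real code.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def sign_jwt(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"

def verify_jwt(token: str, secret: bytes) -> dict:
    """Check the signature only; claim validation (exp, aud, ...) is left out."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(signing_input.split(".")[1]))

# Hypothetical claims: issuer, audience, expiry, and an OAuth2-style scope.
token = sign_jwt(
    {"iss": "https://issuer.example", "aud": "api", "exp": 1767225600, "scope": "profile email"},
    b"demo-secret",
)
claims = verify_jwt(token, b"demo-secret")
print(claims["scope"])
```

Everything the verifier learns comes from the token itself, which is exactly why a token that has already been issued is hard to take back.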

⚔️ The New Enemy Problem: JWT's Achilles' Heel

Here's the thing: JWTs have a major weakness - once they're out there, you can't take them back (except waiting for them to expire). It's like giving someone an all-access pass and not being able to revoke it if they go rogue. This becomes super awkward with web sessions - ever tried implementing a proper "logout" with JWTs? Good luck with that! You're basically crossing your fingers hoping users will play nice and throw away their old tokens.

But wait, it gets worse for backend services. Imagine this: you revoke someone's access on your server, but they're still holding a valid JWT from before. They can keep accessing stuff they shouldn't - this is what the smart folks call the "New Enemy Problem" (first spotted in Google's Zanzibar paper). It's like changing the locks but forgetting about all the spare keys you handed out. Centralized authorization systems fix this by having a central service (think of it like a bouncer at a bar) checking everyone's credentials in real-time. The New Enemy problem is a really hard and interesting distributed systems problem (and perhaps a future post here).

An example of the New Enemy problem:

Alice removes Bob from the ACL of a document;
Alice then asks Charlie to add new contents to the document;
Bob should not be able to see the new contents, but may do so if the ACL check is evaluated with a stale ACL from before Bob's removal
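The three steps above can be sketched as a tiny simulation. The versioned `acl_history` list is a hypothetical model (not how any particular system stores ACLs); it just shows how the same check gives different answers depending on how fresh the ACL snapshot is.

```python
# Versioned ACL history: each write produces a new snapshot (hypothetical model).
acl_history = [{"alice", "bob", "charlie"}]  # version 0

def remove_from_acl(user):
    """Write a new ACL snapshot without the user and return its version number."""
    acl_history.append(acl_history[-1] - {user})
    return len(acl_history) - 1

def can_read(user, at_version):
    """Evaluate the ACL check against one specific snapshot."""
    return user in acl_history[at_version]

# 1. Alice removes Bob from the ACL.
removal_version = remove_from_acl("bob")

# 2. Charlie adds new contents, which only exist after the removal.
content_version = removal_version

# 3. A check against a stale snapshot still lets Bob in; a fresh one does not.
stale = can_read("bob", at_version=0)
fresh = can_read("bob", at_version=content_version)
print(stale, fresh)
```

The fix is to evaluate the check at a snapshot at least as fresh as the content being protected, which a long-lived token minted before the removal cannot do on its own.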

📏 JWT Scopes: Not as Fine-Grained as You'd Think

While JWTs look good on paper, things get messy in practice. Remember that scope claim I mentioned? It's... kinda vague. The spec basically just says "here's what characters you can use" and calls it a day. You'll see examples such as 'email profile phone address' floating around, and developers often try to get fancy with stuff like 'profile:admin'. But here's the million-dollar question: what does that actually mean? The whole site? Just one user's profile? Even GitHub's REST API has been wrestling with this for ages!

Modern apps need super specific permissions - we're talking granular stuff like 'issue/authzed/spicedb/52:author' instead of just 'issue:author'. When your users might need access to billions of things, you can't stuff all that into a token that's bouncing between services.

Centralized authorization is like having a smart assistant who keeps track of everything in one place. Need to check something? Just ask! For example: SpiceDB does this using something called ReBAC (Relationship-Based Access Control) - it's like a Swiss Army knife that can handle super detailed permissions while still playing nice with other permissions systems such as Role Based Access Control (RBAC), Attribute Based Access Control (ABAC), and other fancy patterns. Google also uses ReBAC for authorization across their services such as YouTube, Docs, and more.

🔮 The Crystal Ball Problem with JWT Authorization

Let's play pretend and say you're cool with using just a few JWT scopes. Even then, you've got a problem: how do you know what permissions you'll need? When your JWT gets created at the front door (like in an API gateway), it needs to predict what every downstream service might want. For anything beyond a super simple setup, that's like trying to predict next week's lottery numbers!

Plus, if you send a token with too many permissions to the next service, you're basically giving attackers a bigger target to hit. This headache led to the creation of Macaroons. These tokens can actually be trimmed down before being passed along - cool idea, right? But in reality, they're so complicated that most folks who tried them ended up saying "thanks, but no thanks."
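For the curious, the "trimming" trick behind Macaroons is a chain of HMACs: each added restriction (a caveat) is signed using the previous signature as the key. Here's a heavily simplified sketch of that idea using the standard library. The dict layout, function names, and caveat string are assumptions for illustration, and this is not a compatible Macaroon implementation.

```python
import hashlib
import hmac

def _mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def mint(root_key: bytes, identifier: str) -> dict:
    """The issuer creates a token with no restrictions yet."""
    return {"id": identifier, "caveats": [], "sig": _mac(root_key, identifier.encode())}

def attenuate(token: dict, caveat: str) -> dict:
    """Anyone holding the token can add a caveat; nobody can remove one."""
    return {
        "id": token["id"],
        "caveats": token["caveats"] + [caveat],
        "sig": _mac(token["sig"], caveat.encode()),
    }

def verify(root_key: bytes, token: dict, satisfied) -> bool:
    """Recompute the HMAC chain from the root key and check every caveat."""
    sig = _mac(root_key, token["id"].encode())
    for caveat in token["caveats"]:
        if not satisfied(caveat):
            return False
        sig = _mac(sig, caveat.encode())
    return hmac.compare_digest(sig, token["sig"])

root_key = b"demo-root-key"
broad = mint(root_key, "token-1")
narrow = attenuate(broad, "service = billing")  # trimmed before passing it downstream
print(verify(root_key, narrow, lambda c: c == "service = billing"))
```

Stripping the caveat back off doesn't work, because the holder of `narrow` never sees the earlier signature needed to rebuild the chain.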

Centralized authorization systems take a different approach. They're like "Hey, we know we can't predict the future, so just ask us when you need something!" Sure, you have to make an extra call, but systems like SpiceDB are optimized to keep data in-memory - so latency looks similar to reaching out to any other cache like Redis or Memcached.

🤔 So... Are JWTs Ever the Right Choice?

After all this JWT-bashing, you might think they're completely useless. But there is one scenario where they shine: one-time grants where access cannot be revoked. Though honestly, that's about as rare as finding a unicorn in your backyard!

What system do you use for your authorization needs? Let me know in the comments below.

UPDATE 1:
There's a nice discussion in the comments about either adding state to the JWT, or a system of using a denylist for token revocation. Both these approaches have their downsides and can be fraught with errors. Check the comments below for more info.

How I'm Learning SpiceDB

Sohan — Tue, 12 Nov 2024 16:30:00 +0000

(Cover pic by Kelly Sikkema on Unsplash)

A Life Update

I recently joined AuthZed as a Developer Advocate, and I want to document my learning journey for those going through a similar process.

Here are the 4 steps that helped me ramp up my knowledge of SpiceDB. I hope you'll find these helpful on your own learning journey!

1. Start with the Basics

It's always beneficial to have strong foundational knowledge. In the past, my eagerness to code got the better of me, and I dove headfirst into building something only to backtrack to actually understand how it works. This time, I didn't want to repeat that mistake, so I started with a refresher on authorization, ABAC, RBAC, and ReBAC. If these acronyms are new to you, I'd suggest starting here.

I then read the Google Zanzibar paper that inspired SpiceDB, and re-read it - this time with annotations. I have to admit - I find it hard to parse academic papers (who doesn't wish for a TikTok-style summary sometimes?)

That's where this presentation by Jake Moshenko came in really handy. His explanation brings to life all the concepts listed in the paper and reinforces understanding of how Zanzibar works.

Although SpiceDB is inspired by Zanzibar, there are some key differences. Here are some differences in a Q&A format that helped clarify the concepts. If the number of new concepts and terminologies seems overwhelming, that's okay! You don't have to understand all of it from the start, and hopefully, the rest of this article will help with your learning journey.

2. Get the Hang of Schema Design

Schema design is central to SpiceDB and was a new concept for me. A schema essentially defines the types of objects in your system, how those objects relate to one another, and the permissions that can be computed from those relations. I started by watching this video on modeling the GitHub permissions system using Schema.

For practice, I used real-life examples (such as Google Groups or a banking system) and sketched out the different users, objects, and relationships between them. Progressing from a basic user-document schema to a complex real-life example provides valuable practice in designing schemas for SpiceDB.

You can experiment with modeling these in the SpiceDB playground. I encourage you to try it out.

(My niece calls GitHub "Gibbut", so that's what I call it now 😎)

3. Build Something Starting from a Point of Familiarity

Having worked at companies like Amazon Web Services (AWS) and Fermyon, I have background knowledge in Cloud, Compute, and Serverless technologies. I looked through the documentation for familiar territory and found Deploying SpiceDB on Elastic Kubernetes Service. My experience with Amazon EKS helped me understand how SpiceDB integrates into that system.

If you come from an application development background, you might prefer starting with one of our client libraries to build a simple app that communicates with a local SpiceDB instance. Our getting started guide Protecting A Blog Application can be particularly helpful. For those with authorization experience, we offer guides on how SpiceDB compares with Open Policy Agent (OPA) or a comparison with Ruby on Rails CanCanCan. Both show different approaches but share some common ground.

Good time to shout-out that SpiceDB is completely open-source, and we welcome community contributions! Whether you'd like to suggest improvements, fix documentation typos, or contribute to the community, please feel free to do so. Check out our Good First Issues and join our Discord community.

4. Use AI Strategically

While learning to deploy SpiceDB on Amazon EKS, I encountered some challenges (a natural part of learning) and consulted ChatGPT about these errors. Here's a debugging step that I received:

(For context: zed is the AuthZed CLI tool)

Pretty straightforward, right?

Well, except that config is not a zed CLI command. LLMs can hallucinate and often do so with a lot of confidence. Watch out for inconsistencies like these that could trip you up when copying code from an LLM.

This highlights an important distinction between "learning something" and "building something". Asking ChatGPT "How do I install SpiceDB on EKS" and then just spamming the copy-paste keys is not the best way to learn something. I can attest to this because it's exactly what I did at the start! Only partway through did I realize that I hadn't achieved what I set out to do and had to backtrack. On the other hand, asking an LLM about how I could start debugging certain errors gave me a good understanding of what's under the hood. Use these tools thoughtfully and purposefully.

One Final Thought

I'm on a roll with the advice, so here's one more thing (yes, that's a Stevenote reference). This has held me in good stead over the years when learning anything new: enjoy the process, the results will follow.

Happy Learning!

P.S. Here's a webinar I recorded for CNCF about Deploying SpiceDB in EKS. There's nothing quite like learning in public! 😎

Authentication vs. Authorization

AuthZed — Wed, 10 Jan 2024 10:04:18 +0000

Authz and Authn, a primer

As humans, we’ve been guarding our stuff since we first invented locks 6,000 years ago.

Guarding assets and sharing access to them is human nature. There are two major steps involved in this process: authentication and authorization. First, one must authenticate that they are who they say they are. Then, once someone’s credentials verify their identity, the next question is whether or not that individual is allowed access to the asset.

Think of it this way:

You want to go on a trip to another country. You need a passport to verify that you are who you say you are - this is an example of authentication. You also need a visa that gives you permission to enter the other country. This is an example of authorization. The same principles can be applied in computing:

Authorization and authentication get confused OFTEN - I mean, look at how similar the words are! Instead of having the perspective of authentication vs. authorization, let’s understand how they work together. As you’ll see, you can’t have one without the other.

Authentication

What is Authentication?

Authentication, in computing, involves verifying the identity of a user, process, or machine. To authenticate a user, we can use one or a combination of three key factors:

Knowledge: Something a person knows – i.e. a password/username, a passcode or PIN, or a security question
Ownership: Physical token or something a person owns – i.e. a Yubikey, hardware capable of an encrypted handshake (like ACME for Apple devices), or Google Authenticator
Inherence: A vital piece of something a user is or does – i.e. biometrics or keystrokes

Whenever you sign into social media, your email, or any Web-based application, you are using authentication to prove you are who you say you are.

The History of Authentication

Authentication has been a part of our world since the dawn of time, but that is another conversation for another day. For now, let’s dive into the history of authentication in modern computing.

Authentication started all the way back in the 1960s (60 years ago!) with the first passwords in databases. From there it moved to passwords with a hash, and eventually, encryption. Modern encryption has been around since the 1970s with RSA asymmetric encryption.

Timeline

[Timeline is based on an excellent piece - “Digital authentication: The past, present and uncertain future of the keys to online identity” on GeekWire, with a few new additions by the authors]

We don’t know where authentication will go in the future but as technology continues to advance, the ability to identify an individual user will get more complicated. This rings true for all Web applications and services, but even more so for highly secure or zero trust environments such as government platforms and healthcare data. Authentication is the passport to our digital world.

Authorization

If authentication is our passport, authorization is our visa. Authorization is when one entity grants permission to another entity to engage with a resource within a set of boundaries. Authorization has become increasingly difficult for application developers, often causing performance issues when deployed at scale and blocking feature development.

As our digital footprint has grown, experts have developed new access control and permissions management models to meet the needs of feature hungry teams. Since the advent of the original Access Control List Model (ACL) in the 1960s – similar to an “invite only” guest list – we have designed increasingly abstract authorization systems.

Timeline

Moving forward, interest will likely trend toward dynamic, context-aware systems that can adjust permissions in real time. The exploration of decentralized authorization models that integrate concepts from ReBAC and systems like Zanzibar suggests a future with more secure, efficient, and flexible access control mechanisms.

As technology develops, so will both authentication and authorization. We at AuthZed are excited to be on the forefront of the authorization space, bringing Google’s Zanzibar to the masses with SpiceDB.

You can try out a schema here and get started with fine-grained access control across all of your environments and applications.

A Primer on Modern Enterprise Authorization (AuthZ) Systems

AuthZed — Wed, 06 Sep 2023 15:37:08 +0000

Introduction

A business operating complex software environments needs to provide end-user experiences that enable and delight users, but never at the cost of a poor security stance. In 2022, a data breach cost $9.44M on average in the US and $4.35M globally, according to IBM.

This post aims to help companies with their authorization decisions and systems and share what we see in the market through conversations with companies looking to solve their authorization challenges, specifically authorization that impacts end-user interaction with their products. I’ll cover how authorization became an issue for most companies, the different approaches, an introduction to Google Zanzibar–the solution the market is converging on, the prominent use cases driving businesses to adopt this new approach, and what you can expect when moving to a relationship-based access control system (ReBAC).

What is AuthZ?

Because the Internet is, by definition, a networked system that connects users, most Internet-facing software is designed with a multi-user experience in mind. These environments require constructs to facilitate a natural experience that protects users’ data: authentication, commonly referred to as authN, is the key that verifies an application-specific identity, typically called a user, and authorization, commonly referred to as authZ, dictates what doors that key can open for a user.

AuthZ plays a pivotal role in software security by ensuring that users have the appropriate level of access to different resources and functionalities within a system. At its core, authZ involves granting or denying access rights to specific resources or actions based on the identity and privileges of the user. It is crucial in preventing unauthorized access and verifying that only authorized users can perform specific actions within a system. It serves as a gatekeeper that protects sensitive data and functionalities. By implementing robust authorization mechanisms, developers can control the level of access granted to different users or roles, thereby safeguarding the system from potential security breaches.

Collectively in the context of a product’s end users, authN and authZ are called Customer Identity and Access Management (CIAM). Authorization is such a foundational part of the digital experience that its underlying design principles have become 99% invisible, even to developers, leading to fundamental challenges as a business scales.

Why AuthZ is an Issue

A company typically starts by aggregating all user requests into a single piece of software that tightly couples application logic with authZ. As the company’s product gains traction and its user base grows, the focus shifts to distributing the software and scaling infrastructure components, often ignoring a much-needed change to the authorization system. This further embeds an authorization construct not meant to handle a growing number of requirements.

From the business perspective, the two key limitations of a legacy authorization system are:

Permissions are inflexible: there isn’t a way to easily add additional constructs like user-defined roles, recursive relationships, attribute-based access control (ABAC), or fine-grained authorization (FGA).
Siloed permissions: as a company grows, it scales revenue by offering additional products; application teams then build bespoke authorization implementations that are hard to reason about and don’t consider a holistic user experience, especially at large scale.

Google set out to fix this problem, along with several “unique challenges involving data consistency and scalability.”

Google’s Solution to Authorization: Google Zanzibar

The confluence of business requirements driving the adoption of zero-trust architectures and 94% of consumers wanting more control of their data in near real-time galvanized the search for a modern authorization system. Google’s response was a modern approach that can scale with your business and maintain strict security requirements, now known as Google Zanzibar.

Among the requirements set forth by the Google Zanzibar team is “support for a rich set of access control policies as required by both consumer and enterprise applications” and “establish consistent semantics and user [developer] experience across applications.” Zanzibar powers authorization across hundreds of Google Products, including Google Calendar, Cloud, Drive, Maps, Photos, and YouTube. Notably, it unlocks unique experiences like cross-product authorization checks, e.g., Slack’s Gmail extension can check if a recipient has access to a Google Doc, unlocking growth through reduced friction points while maintaining user privacy.

Google Zanzibar is a relationship-based access control system (ReBAC), meaning that permissions are derived from the existence of a relationship between digital objects and users. This has positive performance implications, especially for recursive permissions, but, importantly, it’s a natural extension of sharing in the real world, which makes it intuitive for most developers.

Use-Cases Driving Adoption of Google Zanzibar

Product-Led Growth

To share something is a user choice and subsequent action. Companies we’re speaking with have learned that facilitating frictionless sharing can help onboard additional users to their platforms. For instance, a hiring platform we’re working with implements fine-grained authorization (FGA) so enterprise recruiters are comfortable exposing permissions to hiring managers related to the positions the managers are looking to fill. Hiring managers, in turn, can proactively engage with candidates and increase activity on the platform.

Another example is adding capabilities that foster more in-app experiences. For instance, a sharing economy company is boosting engagement with its platform by bringing resource management for users into their native applications instead of relying on third-party applications. Most companies we speak with share similar product-led growth initiatives built atop robust authorization.

Breaking into the Enterprise

Enhancing security and compliance is a key requirement for B2B companies looking to scale revenue within an enterprise market segment, and OWASP’s 2021 report cites broken access control as a top security concern. One of the main risks is “exposure of sensitive information to an unauthorized actor,” and preventing it is a core tenet of the Google Zanzibar paper; the system “must ensure consistency of access control decisions to respect user intentions.”

Enterprise users also require increased flexibility; these manifest as the following requirements for product teams:

Fine-grained authorization (FGA): the ability to control resources down to a granular level, e.g., a page in a document, though there is a balance, see How small is too small?
User-defined Roles and Permissions: beyond a typical application-defined Role-Based Access Control (RBAC) system, product teams need to allow end-user admins to create roles and associated permissions for delegating to internal teams.
Recursive relationships: at a certain scale, teams start owning teams. This is challenging for a traditional authorization system dealing with permissions stored in a relational database alongside application data.
Attributes-Based Access Control (ABAC): support for dynamic time-bound or otherwise caveated access.
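
To make the recursive-relationships requirement concrete, here is a minimal Python sketch. It is not how SpiceDB or Zanzibar is implemented; it assumes relationships are kept as in-memory (resource, relation, subject) tuples, and the names `RELATIONSHIPS` and `is_member` as well as the sample teams and users are hypothetical.

```python
# Hypothetical relationship tuples: (resource, relation, subject).
RELATIONSHIPS = {
    ("team:platform", "member", "team:infra"),  # a team nested inside a team
    ("team:infra", "member", "user:ana"),
    ("team:platform", "member", "user:raj"),
}


def is_member(team: str, user: str, seen: frozenset = frozenset()) -> bool:
    """Return True if `user` belongs to `team` directly or through nested teams."""
    if team in seen:  # guard against cycles (team A contains B, B contains A)
        return False
    seen = seen | {team}
    for resource, relation, subject in RELATIONSHIPS:
        if resource != team or relation != "member":
            continue
        if subject == user:
            return True
        # The subject is itself a team: recurse into it.
        if subject.startswith("team:") and is_member(subject, user, seen):
            return True
    return False
```

In this sketch `is_member("team:platform", "user:ana")` is true only because of the nested team, which is the kind of lookup that becomes awkward when permissions live in ordinary relational tables next to application data.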

What to Expect When Adopting Relationship-Based Access Control (ReBAC)

A crucial part of ReBAC systems like Google Zanzibar and our own Zanzibar-inspired authorization system, SpiceDB, is storing permissions data in a separate database; product-specific data (e.g. the content of a social media post) is stored in the application database, while the data that determines who can edit that content lives in the permissions database. If you have an existing authorization flow, you’ll have to translate that data into permissions data.

Modeling data is probably the most fun and intuitive part. SpiceDB, like other solutions, has a domain-specific language (DSL) called the SpiceDB Schema Language for defining the objects you want to create an authorization system for. The permissions schema defines the objects, e.g., users and documents, how they relate to each other, and the permissions those relationships define.

Since you’re writing permissions data, integration is a big part of the journey. A ReBAC authorization system is delivered over a gRPC or HTTP API; SpiceDB has libraries available in multiple languages to help developers get up to speed quickly. You’ll want to make sure whatever solution you choose delivers a solid developer experience.

Google Zanzibar doesn’t mention policy, but we’ve learned through our collaboration building Attribute-Based Access Control (ABAC) for Netflix that pairing policy with ReBAC is a powerful paradigm. An example of this capability is SpiceDB Caveats, described in the post “Caveats: A Scalable Solution for Policy.”

A common practice is to organize a core team of developers tasked with architecting and executing an overhaul to your authorization system. The effort must be cross-functional; you’ll want platform engineers, application engineers, and product managers to work together to ensure smooth adoption.

Get Started

Given how popular Google's approach to authorization has become, there are a number of new companies and projects looking to provide Zanzibar-aaS. At AuthZed, we've created a faithful open-source implementation of Google Zanzibar called SpiceDB, and offer managed commercial offerings that make it easy to get into production. Join the community on Discord or schedule a call to learn more!

Additional Reading

If you’re interested in learning more about Authorization and Google Zanzibar, we recommend reading the following posts:

Hotspot Caching in Google Zanzibar and SpiceDB

AuthZed — Wed, 10 May 2023 00:00:00 +0000

Anatomy of a Cache Entry

In section 3.2.5 of Google’s Zanzibar paper, the authors say: “We found the handling of hot spots to be the most critical frontier in our pursuit of low latency and high availability.” They go on to describe a system of splitting peer Zanzibar servers into a distributed cache where the cache keys include an evaluation timestamp and servers are selected based on consistent hashing. With SpiceDB, AuthZed’s open-source Google Zanzibar implementation, we’ve created a faithful implementation of this system. In today’s blog post, I want to talk about how this all works under the hood.

"We found the handling of hot spots to be the most critical frontier in our pursuit of low latency and high availability."

Section 3.2.5, Zanzibar: Google’s Consistent, Global Authorization System

You may recall in our Check it Out #2 blog post, Joey demonstrated how we can break down top-level problems into subproblems to make caching easier.

For a quick refresher, consider the following schema and relationships:

definition user {}

definition organization {
    relation admin: user
}

definition document {
    relation org: organization

    relation owner: user
    relation reader: user

    permission view = reader + owner + org->admin
}

document:doc1 has org of organization:org1

document:doc1 has owner of user:sally

document:doc1 has reader of user:billy

organization:org1 has admin of user:francesca

When evaluating the top level question: “Can user:francesca view document:doc1?”, there are several interesting subproblems:

“Is user:francesca reader of document:doc1?"

“Is user:francesca owner of document:doc1?"

“document:doc1 has org of organization:org1. Is user:francesca admin of organization:org1?"
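
The decomposition above can be sketched in a few lines of Python. This is only an illustration of how the top-level question breaks into sub-problems, assuming the example relationships are held in an in-memory set; `has_relation` and `can_view` are hypothetical helper names, not SpiceDB APIs.

```python
# Relationships from the example above, as (resource, relation, subject) tuples.
RELATIONSHIPS = {
    ("document:doc1", "org", "organization:org1"),
    ("document:doc1", "owner", "user:sally"),
    ("document:doc1", "reader", "user:billy"),
    ("organization:org1", "admin", "user:francesca"),
}


def has_relation(resource: str, relation: str, subject: str) -> bool:
    """A leaf sub-problem: does this exact relationship exist?"""
    return (resource, relation, subject) in RELATIONSHIPS


def can_view(document: str, user: str) -> bool:
    """Top-level problem: view = reader + owner + org->admin."""
    if has_relation(document, "reader", user):
        return True
    if has_relation(document, "owner", user):
        return True
    # org->admin: follow each `org` relationship, then ask the admin sub-problem.
    orgs = [s for (r, rel, s) in RELATIONSHIPS if r == document and rel == "org"]
    return any(has_relation(org, "admin", user) for org in orgs)
```

Each call to `has_relation` corresponds to one of the sub-problems listed above, and each one can be answered (and cached) independently of the others.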

In Zanzibar, in order to return consistent results, all problems are evaluated at a consistent snapshot timestamp. You can think of these as representing a single moment in time. If we pick an example timestamp 12345, this becomes an implicit clause for our top-level problem and all sub-problems:

“Can user:francesca view document:doc1 at timestamp 12345?”

“Is user:francesca reader of document:doc1 at timestamp 12345?"

And so on.

Ignoring SpiceDB Caveats for a moment, each of these questions also has a yes or no answer at any given time. In SpiceDB, we refer to this answer as “permissionship”, with values: PERMISSIONSHIP_HAS_PERMISSION and PERMISSIONSHIP_NO_PERMISSION respectively.

For any given problem or sub-problem, at a specified time, the result of the check is immutable: the value will not ever change. Here are the immutable answers to the problems and sub-problems from above:

“Is user:francesca reader of document:doc1 @ 12345?” → PERMISSIONSHIP_NO_PERMISSION

“Is user:francesca owner of document:doc1 @ 12345?” → PERMISSIONSHIP_NO_PERMISSION

“document:doc1 has org of organization:org1. Is user:francesca admin of organization:org1 @ 12345?” → PERMISSIONSHIP_HAS_PERMISSION

“Can user:francesca view document:doc1 @ 12345?” → PERMISSIONSHIP_HAS_PERMISSION

These become cache keys in SpiceDB as follows:

document:doc1#reader@user:francesca@12345 → PERMISSIONSHIP_NO_PERMISSION

document:doc1#owner@user:francesca@12345 → PERMISSIONSHIP_NO_PERMISSION

organization:org1#admin@user:francesca@12345 → PERMISSIONSHIP_HAS_PERMISSION

document:doc1#view@user:francesca@12345 → PERMISSIONSHIP_HAS_PERMISSION

It is important to note: if we were to change the evaluation timestamp, the cache keys would change as well and the existing answers would become irrelevant. That is why picking evaluation timestamps becomes so important.
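
Here is a minimal sketch of why the timestamp belongs in the cache key. It assumes a plain in-process dictionary rather than SpiceDB's actual cache, and `cache_key` and `cached_check` are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict

# Hypothetical in-process cache: key string -> immutable answer.
CACHE: Dict[str, bool] = {}


def cache_key(resource: str, relation: str, subject: str, timestamp: int) -> str:
    """Build a key in the resource#relation@subject@timestamp shape shown above."""
    return f"{resource}#{relation}@{subject}@{timestamp}"


def cached_check(
    resource: str,
    relation: str,
    subject: str,
    timestamp: int,
    compute: Callable[[], bool],
) -> bool:
    """Return the cached answer, calling `compute` only on a cache miss."""
    key = cache_key(resource, relation, subject, timestamp)
    if key not in CACHE:
        CACHE[key] = compute()
    return CACHE[key]
```

Two checks at timestamp 12345 share one entry, while the same check at timestamp 12346 produces a different key and has to be computed again.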

Picking a Good Snapshot Timestamp

In the previous section, I said that a timestamp represents a single unique point in time. While there is some debate about whether the passage of time is actually an illusion, for our lived experience a single instant of time occurs only once, and if we were randomly picking times, we would have an infinitesimally small chance of picking the exact same time twice. So how do snapshot timestamps help us?

To make this easier to visualize, we will use the following diagram of a timeline:

On the X-axis we have a time continuum from 0 to 10, and on the Y-axis we have the snapshot timestamp that the request was actually evaluated at. In this example our request was sent at time 5, and the snapshot timestamp chosen was also time 5.

There are several important factors that play together when determining the timestamp at which a problem and all of its subproblems will be evaluated. The most important of these factors though is the user’s specified consistency level.

SpiceDB allows you to choose amongst four levels of consistency on a per-request basis:


fully_consistent	Chooses the system’s current time as the snapshot timestamp, and will include all changes that were fully committed before the request was started.
at_exact_snapshot	Evaluates requests at the exact same timestamp as that encoded in a ZedToken from a previous request.
minimize_latency	Allows the system to freely choose a timestamp that balances freshness while also minimizing latency.
at_least_as_fresh	Allows the system to choose a timestamp, as long as the time selected is at or after the time represented in the user-supplied ZedToken.

Fully Consistent

At consistency level fully_consistent we are always picking the exact snapshot timestamp at which the request arrived at the system. Here is an example with many requests and the snapshot timestamp chosen.

As you can see, no two requests are evaluated at the same snapshot timestamp, effectively disabling the cache.

At Exact Snapshot

At consistency level at_exact_snapshot we are always picking the same snapshot revision for every request. This will result in perfect cache usage, but our results will get more and more stale as time goes on.

In this example, the Snapshot Revision is 4.25s and all of our requests are evaluated at that time. By the time we get to time 10, our results are already 5.75 seconds old.

Minimize Latency

At consistency level minimize_latency we allow the server to pick a snapshot timestamp freely to optimize the experience. We use quantization, a method to downsample high-frequency data into discrete buckets, to allow different servers to all choose the same request timestamps independently. In this example, the server is configured with --datastore-revision-quantization-interval=1s (the default is 5s) and --datastore-revision-quantization-max-staleness-percent=0.

As you can see, the request times are quantized (i.e. rounded) down to the nearest whole second. This lets us re-use cache entries for as many requests as we receive during that 1s window, and our results will be up to 1s stale. This allows us to make an intentional tradeoff between consistency and cache usage.
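
The rounding described above can be sketched as follows. This is an illustration only, assuming timestamps are plain floating-point seconds; `quantize` and the sample request times are hypothetical, and SpiceDB's real revision selection depends on the datastore.

```python
import math
from collections import Counter


def quantize(request_time: float, interval: float) -> float:
    """Round a request time down to the start of its quantization window."""
    return math.floor(request_time / interval) * interval


# Hypothetical request arrival times, in seconds.
request_times = [5.1, 5.4, 5.9, 6.0, 6.2, 7.8]
snapshots = [quantize(t, 1.0) for t in request_times]

# Requests that share a snapshot timestamp can share cache entries.
reuse = Counter(snapshots)

# Staleness is how far each request is ahead of the snapshot it was served from.
worst_staleness = max(t - s for t, s in zip(request_times, snapshots))
```

In this sample, six requests collapse onto three snapshot timestamps, and no request is served from a snapshot that is a full interval old.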

To avoid effectively evicting the entire cache every time a new quantization period starts, Pull Request #1285 introduced the ability to blend between one quantized revision and the next over a period of allowed staleness. The following example shows a staleness of 100% of the quantized revision, i.e. 1s:

As you can see, there are multiple active candidate snapshots at every request time. Older snapshot revisions are phased out and newer snapshot revisions are phased in. We can visualize this probability as a stacked area chart, where all probabilities always add up to 100%. Here is such a chart for 1s max staleness:

We can adjust the staleness time according to our requirements for freshness of answers, even beyond 100%. The following graphs demonstrate 10% (0.1s) and 200% (2s) of max staleness respectively:
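
As a rough sketch of which snapshots are candidates at a given moment, the function below lists the quantized revisions that are still eligible for a request. It rests on a loudly stated assumption that is not taken from the SpiceDB source: a revision r stays eligible while r <= request_time < r + interval + max_staleness. The name `candidate_snapshots` is hypothetical, and the sketch does not model how likely each candidate is to be picked.

```python
import math
from typing import List


def candidate_snapshots(
    request_time: float, interval: float, max_staleness: float
) -> List[float]:
    """List quantized revisions still eligible at `request_time`, newest first.

    Assumption for illustration: revision r is eligible while
    r <= request_time < r + interval + max_staleness.
    """
    newest = math.floor(request_time / interval) * interval
    candidates = []
    revision = newest
    while revision >= 0 and request_time < revision + interval + max_staleness:
        candidates.append(revision)
        revision -= interval
    return candidates
```

Under this assumption, a request at time 5.5 with a 1s interval has one candidate at 0% staleness, two at 100% (1s), and three at 200% (2s), which matches the intuition that a larger staleness window keeps older revisions in play for longer.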

At Least as Fresh

At consistency level at_least_as_fresh we use ZedTokens (Zookies in the Zanzibar paper) to give the user a powerful way to express causality between requests in the system. Functionally, you can think of an at_least_as_fresh request as being the same as minimize_latency unless the selected timestamp is less than the timestamp encoded in the ZedToken. When that happens, the chosen timestamp is adjusted forward to use the timestamp encoded in the ZedToken. In our scatterplot visualizations this looks like the following:

We see the familiar stepped pattern from minimize_latency. However, when a request arrives between the time the ZedToken was issued and the next quantized revision, its timestamp is adjusted forward to the one encoded in the ZedToken. The following zoomed scatterplot shows requests arriving between times 6.3 and 7 with the ZedToken at time 6.3:

Allowing a user to pick their consistency level means that SpiceDB is usually picking revisions using all of these modes at the same time, with each request making the appropriate tradeoff between consistency, freshness, and performance.
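
Putting the four levels side by side, the selection logic described in this section can be approximated as below. This is a simplified model, not SpiceDB code: it assumes timestamps are floating-point seconds, ignores the staleness blending discussed earlier, and uses a hypothetical `pick_snapshot` function with a `token_time` argument standing in for the timestamp encoded in a ZedToken.

```python
import math
from typing import Optional


def pick_snapshot(
    level: str,
    now: float,
    token_time: Optional[float] = None,
    interval: float = 1.0,
) -> float:
    """Choose an evaluation timestamp for one request (simplified model)."""
    quantized = math.floor(now / interval) * interval
    if level == "fully_consistent":
        return now  # always the current time, so entries are never shared
    if level == "minimize_latency":
        return quantized  # shared by every request in the same window
    if level in ("at_exact_snapshot", "at_least_as_fresh"):
        if token_time is None:
            raise ValueError(f"{level} requires a ZedToken timestamp")
        if level == "at_exact_snapshot":
            return token_time  # always the token's timestamp
        # at_least_as_fresh: never go backwards past the token.
        return max(quantized, token_time)
    raise ValueError(f"unknown consistency level: {level}")
```

For a ZedToken at time 6.3, a request at time 6.5 is evaluated at 6.3 under at_least_as_fresh (the quantized 6.0 would be too old), while a request at time 7.2 falls back to the ordinary quantized value of 7.0.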

Now that we see how the system as a whole creates and utilizes immutable cache entries, let’s see how we scale it!

Consistent Hashring Dispatch

The Zanzibar paper Section 3.2.5 (that we referenced earlier) has this to say about scaling this caching solution:

Zanzibar servers in each cluster form a distributed cache for both reads and check evaluations, including intermediate check results evaluated during pointer chasing. Cache entries are distributed across Zanzibar servers with consistent hashing.

Evan, one of our engineers, has written about the “how” for consistent hash load balancing before in his post: Consistent Hash Load Balancing for gRPC, but I would like to expand a little bit more on the “why”.

It’s simple math really: if all nodes were equally likely to compute (and cache) the results for every problem and subproblem, we would end up storing several copies of the same result. If our average cache result size is B bytes, and we’ve dedicated M bytes of memory on each server to caching, the total number of unique problems we can keep in cache (regardless of cluster size) is M / B. Additionally, our cache hit ratio will suffer due to needing to recompute each cache entry on each server the first time that problem or sub-problem is encountered.

If we were to always send the same problem or sub-problem to the same SpiceDB server or group of servers, we could achieve higher hit rates, and more overall caching. If we sent each problem to only one server, our overall number of unique cached items in a cluster of size N would be N * M / B. Thus, as our dataset grows, we can keep more of it in cache by expanding our cluster size correspondingly.
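
The arithmetic and the routing idea can be sketched together. The code below is a toy model and not SpiceDB's dispatcher: `unique_entries` simply evaluates M / B versus N * M / B, and `HashRing` is a hypothetical, minimal consistent hash ring that uses SHA-256 and 50 virtual nodes per server purely as illustrative choices.

```python
import bisect
import hashlib
from typing import List, Tuple


def unique_entries(
    cache_bytes_per_server: int, entry_bytes: int, servers: int, sharded: bool
) -> int:
    """Unique cacheable results: M / B if every server caches everything,
    N * M / B if each result lives on exactly one server."""
    per_server = cache_bytes_per_server // entry_bytes
    return per_server * servers if sharded else per_server


class HashRing:
    """Minimal consistent hash ring mapping cache keys to servers."""

    def __init__(self, servers: List[str], replicas: int = 50) -> None:
        # Each server gets several virtual points to smooth out the distribution.
        self._ring: List[Tuple[int, str]] = sorted(
            (self._hash(f"{server}-{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def owner(self, cache_key: str) -> str:
        """Return the server responsible for `cache_key`."""
        index = bisect.bisect(self._points, self._hash(cache_key)) % len(self._ring)
        return self._ring[index][1]
```

With 1,000,000 bytes of cache per server and 100-byte entries, four unsharded servers hold 10,000 unique results while four sharded servers hold 40,000. Because `owner` is deterministic, every server that builds the same ring sends a given sub-problem to the same peer.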

Hotspot Caching vs. Other Types of Caching

The type of caching we’ve discussed in this post, hotspot caching, is used to address latency problems when the same data is accessed frequently. There are other types of caching referenced in the Zanzibar paper as well. Section 3.2.5 also describes a relationship cache where groups of frequently accessed relationships are cached, and Section 3.2.4 describes the “Leopard Indexing System”, which is a continually updated denormalization of user group data stored in Zanzibar. Lovingly following the CPU cache naming model, we refer to these techniques internally as L0 and L2 caches, and have designs for how these will make their way into SpiceDB in the future.

But that’s a story for another time.

Sincerely, Your Most Faithful Implementation

I hope this has been an informative post, and that you’ve learned a lot about our implementation of hotspot caching in SpiceDB. If you’re using one of AuthZed’s hosted SpiceDB solutions, you’re already taking advantage of hotspot caching automatically. If you’re one of our self-hosted or open source customers, I hope you’ve learned enough about how the various datastore parameters and consistency levels work together to tailor your experience. If you’d like to learn more, you can reach us through our Discord, by scheduling a call, or by reaching out to support@authzed.com.

Additional Reading

Image Credit: NASA Goddard Space Flight Center, CC BY 2.0, via Wikimedia Commons
