Forem: Yolanda Robla Mota

Deploying an Okta-Authenticated BigQuery MCP Server on Kubernetes with ToolHive

Yolanda Robla Mota — Wed, 19 Nov 2025 09:56:11 +0000

In my previous article, I showed how to connect Okta authentication to a BigQuery MCP server running locally. The objective was to build a workflow that was secure (with user-level attribution and least privilege roles), short-lived, and that would save you the pain of managing Google service-account keys. That setup worked perfectly for local development, but it wasn’t something I’d confidently hand off to production.
This time, we’ll take that local prototype and transform it into a production-ready, cloud-native deployment running on Kubernetes, secured by Okta, and managed end-to-end by the ToolHive Operator. We’ll even make it accessible remotely through ngrok, so you can connect to it from anywhere using VS Code.

Setting the Stage

Before diving in, let’s make sure we have the right pieces in place. You’ll need a Kubernetes cluster (I’ll be using kind for simplicity), along with kubectl and helm. You’ll also need an Okta account with an authorization server configured, and a Google Cloud project with BigQuery enabled.
If you haven’t already, set up Workload Identity Federation in your Google Cloud project. That’s what allows Google Cloud to trust Okta tokens and issue temporary credentials for BigQuery access.
Finally, install the ToolHive CLI (thv) and sign up for an ngrok account — we’ll use both to expose your service later on.

Deploying the ToolHive Operator

Let’s start by getting the ToolHive Operator running in our cluster. The operator is what manages the lifecycle of MCP servers — it handles the pods, proxies, authentication, and updates automatically.
I’m using kind to create a local cluster:

kind create cluster --name toolhive

Next, install the ToolHive CRDs and the operator itself:

helm upgrade --install toolhive-operator-crds \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds

helm upgrade --install toolhive-operator \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator \
  --namespace toolhive-system --create-namespace

A quick check confirms the operator is running:

kubectl get pods -n toolhive-system

You should see something like:

toolhive-operator-7875c8c5cd-xxxxx   1/1     Running   0   30s

With that, our cluster is ready to start managing MCP servers.

Storing the Okta Secret

The next step is to give ToolHive access to your Okta client secret. This allows the proxy to validate incoming tokens. Instead of hardcoding secrets, Kubernetes encourages us to store them in a dedicated Secret resource.
Here’s the YAML to create one:

apiVersion: v1
kind: Secret
metadata:
  name: okta-client-secret
  namespace: default
type: Opaque
stringData:
  client-secret: <YOUR_OKTA_CLIENT_SECRET>

Save that as 00-okta-client-secret.yaml and apply it:

kubectl apply -f 00-okta-client-secret.yaml
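
As a side note, stringData is a convenience field: Kubernetes stores Secret values base64-encoded under data. The Python sketch below shows the equivalent encoding, using a made-up placeholder value and hypothetical helper names. Keep in mind that base64 is an encoding, not encryption, so access to the Secret should still be restricted.

```python
import base64


def to_secret_data(value):
    """Return the base64 form Kubernetes stores under a Secret's `data` field."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def from_secret_data(encoded):
    """Decode a value as it appears in `kubectl get secret -o yaml` output."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    example = "not-a-real-secret"  # hypothetical placeholder, not a real secret
    encoded = to_secret_data(example)
    print(encoded)
    print(from_secret_data(encoded) == example)
```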

Setting Up Token Exchange

To allow Okta to exchange its tokens for Google Cloud credentials, we’ll define an MCPExternalAuthConfig resource. This tells ToolHive how to talk to Google’s Security Token Service (STS) and request access tokens for BigQuery.
Here’s the config:

apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPExternalAuthConfig
metadata:
  name: bigquery-token-exchange
  namespace: default
spec:
  type: tokenExchange
  tokenExchange:
    tokenUrl: https://sts.googleapis.com/v1/token
    audience: //iam.googleapis.com/projects/<YOUR_PROJECT_NUMBER>/locations/global/workloadIdentityPools/okta-pool/providers/okta-provider
    subjectTokenType: id_token
    scopes:
      - https://www.googleapis.com/auth/bigquery
      - https://www.googleapis.com/auth/cloud-platform

Save it as 01-external-auth-config.yaml and apply it:

kubectl apply -f 01-external-auth-config.yaml

This configuration acts as a bridge between Okta and Google Cloud, handling the secure exchange behind the scenes.
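
To make the exchange a little less abstract, here is a rough Python sketch of the request body that an OAuth 2.0 token exchange (RFC 8693) call to the STS endpoint carries. The parameter names come from the RFC. The function name, and the assumption that subjectTokenType: id_token maps to the ID token URN, are mine; what ToolHive sends internally may differ.

```python
from urllib.parse import parse_qs, urlencode

# Google's STS endpoint, as configured in tokenUrl above.
STS_URL = "https://sts.googleapis.com/v1/token"


def build_sts_exchange_body(subject_token, audience, scopes):
    """Build the form-encoded body of an RFC 8693 token exchange request."""
    params = {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": " ".join(scopes),  # scopes are space-delimited on the wire
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token": subject_token,
        # Assumption: subjectTokenType: id_token corresponds to this URN.
        "subject_token_type": "urn:ietf:params:oauth:token-type:id_token",
    }
    return urlencode(params)


if __name__ == "__main__":
    body = build_sts_exchange_body(
        subject_token="OKTA_ID_TOKEN_PLACEHOLDER",  # not a real token
        audience="//iam.googleapis.com/projects/123/locations/global/"
                 "workloadIdentityPools/okta-pool/providers/okta-provider",
        scopes=["https://www.googleapis.com/auth/bigquery"],
    )
    print("POST", STS_URL)
    for name, values in parse_qs(body).items():
        print(name, "=", values[0])
```

The sketch only builds the body and does not send it, so it is safe to run without credentials.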

Deploying the BigQuery MCP Server

Now we can create the MCP server that will connect VS Code to BigQuery. This configuration ties together the image, authentication, and proxy.
We need a public endpoint to use as the resourceUrl. A service like ngrok works for this: configure a domain in the ngrok dashboard, or note your automatically generated “dev domain” if you’re on a free account. Set that domain on the custom resource, along with the other settings marked with angle-bracket placeholders:

apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: database-toolbox-bigquery
  namespace: default
spec:
  image: us-central1-docker.pkg.dev/database-toolbox/toolbox/toolbox:0.19.1
  env:
    - name: BIGQUERY_PROJECT
      value: <YOUR_GCP_PROJECT_ID>
    - name: BIGQUERY_USE_CLIENT_OAUTH
      value: "true"

  args:
    - --prebuilt
    - bigquery
    - --address
    - 0.0.0.0

  transport: streamable-http
  proxyPort: 8000
  mcpPort: 5000

  oidcConfig:
    type: inline
    resourceUrl: https://<YOUR_NGROK_DOMAIN>.ngrok-free.app/mcp   # Replace with your ngrok URL
    inline:
      issuer: https://<YOUR_OKTA_DOMAIN>.okta.com/oauth2/<YOUR_AUTH_SERVER_ID>
      audience: //iam.googleapis.com/projects/<YOUR_PROJECT_NUMBER>/locations/global/workloadIdentityPools/okta-pool/providers/okta-provider
      clientId: <YOUR_OKTA_CLIENT_ID>
      clientSecretRef:
        name: okta-client-secret
        key: client-secret

  externalAuthConfigRef:
    name: bigquery-token-exchange

  resources:
    limits:
      cpu: "1"
      memory: "512Mi"
    requests:
      cpu: "100m"
      memory: "128Mi"

Save it as 02-mcp-server-bigquery.yaml and apply it:

kubectl apply -f 02-mcp-server-bigquery.yaml

Kubernetes will create two pods: one running the MCP server, and another running the ToolHive proxy.

Exposing the Service Publicly

Once the MCP server is running, it needs a public URL so that authentication endpoints and clients can reach it. We’ll forward the proxy service to the local machine, then open a tunnel to it through ngrok using ToolHive’s built-in support and the domain chosen earlier.
Start by forwarding the proxy service locally:

kubectl port-forward -n default svc/database-toolbox-bigquery-proxy-svc 8000:8000

This makes the MCP proxy accessible at http://127.0.0.1:8000.

Now, use the ToolHive CLI to open a secure tunnel with ngrok:

thv proxy tunnel http://127.0.0.1:8000 tunnel \
  --tunnel-provider ngrok \
  --provider-args '{"auth-token": "<YOUR_NGROK_AUTH_TOKEN>", "url": "https://<YOUR_NGROK_DOMAIN>.ngrok-free.app"}'

ToolHive will create the tunnel and print a line like:

✔ Tunnel created
Public URL: https://<YOUR_NGROK_DOMAIN>.ngrok-free.app

If you want more background on this tunneling feature, the ToolHive team has a nice write-up: Exposing a Kubernetes-Hosted MCP Server with ToolHive + ngrok (with Basic Auth)

Verifying the Deployment

After a few moments, confirm everything’s running:

kubectl get pods -n default -l toolhive-name=database-toolbox-bigquery

You should see two pods in the “Running” state — one for the server, one for the proxy.
If you’d like to peek under the hood, tail the proxy logs to see the authentication and token exchange process in action:

kubectl logs -n default -l app.kubernetes.io/instance=database-toolbox-bigquery-proxy --tail=50

You should see debug lines referencing token validation and the STS endpoint.

Connect from VS Code

Once your MCP server is running, secured, and exposed via your public ngrok URL (for example: https://abc123.ngrok-free.app/mcp), you’ll use VS Code’s MCP support to connect.

Open VS Code. Make sure you have the MCP / Copilot Chat extension installed and enabled.
Open the Command Palette (Ctrl+Shift+P or ⌘+Shift+P) and run “MCP: Add Server” (or you can open the mcp.json configuration manually).
When prompted, enter a JSON configuration like this:

{
  "servers": {
    "toolbox": {
      "url": "https://<YOUR_NGROK_DOMAIN>.ngrok-free.app/mcp",
      "type": "http"
    }
  },
  "inputs": []
}

The "type": "http" indicates you’re connecting over HTTP transport.

After saving/accepting this config, VS Code will attempt to connect to the MCP server. During this process, it will prompt you to enter the Client ID and the Client Secret from your Okta app.
These credentials allow VS Code to authenticate and authorize with the server according to the MCP/OIDC handshake.
Once the authentication completes, the server will appear in your MCP server list. You can open the Chat view, select the MCP tools (for example, query_bigquery or list_datasets), and issue queries or commands as needed.
Try a test query to confirm everything is working.

Wrapping Up

We’ve come a long way from a local Okta-authenticated server to a fully managed, cloud-ready Kubernetes deployment. Now you have a secure, scalable, and remote-accessible BigQuery MCP server managed entirely by ToolHive.
This setup combines Okta’s identity management, Google Cloud’s token exchange, and Kubernetes automation into a single cohesive workflow. The result is a developer-friendly environment that’s easy to scale and safe to expose beyond your local machine.
If you’re interested in exploring further, join the ToolHive Discord community to share what you’ve built. The possibilities with ToolHive, Okta, and Kubernetes together are just getting started.

How to use Okta to remotely authenticate to your BigQuery MCP Server

Yolanda Robla Mota — Thu, 06 Nov 2025 12:07:50 +0000

This article builds on our previous post, where we explored the high-level architecture of token exchange, identity federation, and how to run MCP servers in a secure and IdP-agnostic way. Now we shift into the hands-on phase: how to use ToolHive to enable an MCP server to query Google BigQuery for users authenticated via Okta. While we use Okta and Google Cloud as the example stack, this flow is adaptable to any IdP and any cloud provider with a compatible STS / federation service.

Scenario overview

You run an MCP server that receives requests from users who are authenticated via Okta.
The MCP server must execute queries in Google Cloud BigQuery.
You don’t want to manage Google service-account keys, embed JSON credentials in config, or lose per-user audit.
You want: user-level attribution, least-privilege roles, secure, short-lived access, and federation between Okta and Google Cloud.

In this example, we’re implementing the IdP federation approach described as scenario “B” in the previous blog post. The diagram below shows how ToolHive, Okta, and Google Cloud interact in this flow.

Prerequisites

Before you start, make sure you have:

Okta admin access: You’ll need permissions to create an OIDC app and an authorization server.
A Google Cloud project: With BigQuery enabled and permissions to create a Workload Identity Pool.
ToolHive CLI: download it from toolhive.dev and confirm it’s in your system path.
Container runtime: Docker, Podman, or Rancher Desktop are supported.
An MCP client such as Claude Code (or any other client supporting the MCP protocol).

Detailed configuration steps

Step 1: Configure Okta as Identity Provider

In the Okta Admin Console, navigate to Applications → Applications and click Create App Integration. See https://help.okta.com/en-us/content/topics/apps/apps_app_integration_wizard_oidc.htm
Choose OIDC – OpenID Connect and then Web Application for the app type.
Configure the sign-in redirect URI to http://localhost:8666/callback (this is the callback needed for the MCP server that we will run later using ToolHive).

IMPORTANT: Note the client ID and client secret; you’ll need them in later steps.

Step 2: Create an Authorization Server in Okta

Your OIDC app issues tokens via an Authorization Server. For Workload Identity Federation and token exchange, you need one configured correctly.

In the Okta Admin Console, navigate to Security → API → Authorization Servers.
Click Add Authorization Server.
Name: BigQuery MCP Server (or any descriptive name)
Audience: set this to match the audience expected by your MCP server configuration (for example, mcpserver).
Click Save.
Configure an additional gcp.access scope.
Add the access policies for the types of tokens to generate, including Token Exchange.

With this setup, Okta will:

Issue standards-compliant OIDC tokens to your MCP server through ToolHive.
Include the claims Google Cloud expects during the token exchange.

IMPORTANT: Note the issuer URL for the Authorization Server; you’ll need it in the next steps.

Step 3: Create a Workload Identity Pool in Google Cloud

In the Google Cloud console, create a Workload Identity Pool and a matching provider, using the issuer URL you noted in the previous step.
Define custom audiences. The Okta client ID needs to be passed as an audience, so start by copying the default audience. Then select Allowed audiences, add the default value, and include your Okta client ID as well.

Configure permissions for the Okta user so they can read BigQuery data. Repeat this for each user you want to map:

gcloud projects add-iam-policy-binding <PROJECT_ID> \
  --member="principalSet://iam.googleapis.com/projects/<PROJECT_NUMBER>/locations/global/workloadIdentityPools/okta-pool/attribute.email/<MAPPED_OKTA_EMAIL>" \
  --role="roles/bigquery.dataViewer"
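
If you have more than a couple of users to map, it can help to generate these commands rather than type them. The Python sketch below is one way to do that. The helper names, the example project values, and the email addresses are all hypothetical. Note that Google’s principal identifiers use the numeric project number, while the first argument to gcloud is the project ID.

```python
import shlex


def principal_for_email(project_number, pool_id, email):
    """Build the principalSet identifier for one mapped Okta email."""
    return (
        "principalSet://iam.googleapis.com/"
        f"projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool_id}/attribute.email/{email}"
    )


def binding_command(project_id, project_number, pool_id, email,
                    role="roles/bigquery.dataViewer"):
    """Return a shell-safe gcloud command that grants `role` to one user."""
    member = principal_for_email(project_number, pool_id, email)
    return shlex.join([
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member={member}",
        f"--role={role}",
    ])


if __name__ == "__main__":
    # Hypothetical values; replace them with your own project and users.
    for user in ["alice@example.com", "bob@example.com"]:
        print(binding_command("my-project", "123456789", "okta-pool", user))
```

The script only prints the commands, so you can review them before running anything against your project.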

Step 4: Deploy MCP server + proxy with remote authentication via ToolHive

In this step, we bring together the MCP server and the remote authentication/federation flow. Using ToolHive, we’ll run the server and wrap it with a proxy that handles user authentication with Okta and token exchange into Google Cloud.

Start by creating a group. ToolHive automatically manages clients registered to your default group, adding or removing MCP servers as you run them. Since this server will sit behind an authenticated proxy, we don’t want that auto-configuration behavior, so we’ll create a separate group for it instead:

thv group create toolbox-group

Then start the open source MCP Toolbox for Databases server using the ToolHive CLI. ToolHive automatically pulls the server image using metadata from the ToolHive registry. You can view details about the image with thv registry info database-toolbox.

thv run --group toolbox-group database-toolbox \
--env BIGQUERY_PROJECT=<YOUR_PROJECT_ID> \
--env BIGQUERY_USE_CLIENT_OAUTH=true \
--proxy-port 6000 \
-- --prebuilt bigquery --address 0.0.0.0

Here’s what each parameter does:
--group toolbox-group: Name of the ToolHive group that the MCP server belongs to
database-toolbox: The MCP server image from the ToolHive registry
--env BIGQUERY_PROJECT: Your Google Cloud project ID containing BigQuery resources
--env BIGQUERY_USE_CLIENT_OAUTH=true: Use the OAuth flow instead of static service account credentials
--proxy-port: Port exposed on your host for the containerized MCP server
--: CLI arguments passed into the MCP server
--prebuilt bigquery: Use the prebuilt configuration for BigQuery
--address 0.0.0.0: Bind the server to all network interfaces so the proxy can reach it

ToolHive spins up the MCP server container and HTTP proxy process, ready to handle BigQuery queries using the MCP protocol. Using ToolHive ensures the server is containerized, isolated, and managed securely — avoiding the “run-it-manually” friction.

Next, the thv proxy command starts a proxy process that sits in front of the MCP server and handles all incoming requests. It prompts you to sign in with Okta, exchanges your Okta token for a Google Cloud access token, and then forwards your request to the MCP server using that token.

thv proxy \
  --target-uri http://127.0.0.1:6000 \
  --remote-auth-client-id <OKTA_CLIENT_ID> \
  --remote-auth-client-secret <OKTA_CLIENT_SECRET> \
  --remote-auth okta \
  --remote-auth-issuer <AUTHORIZATION_SERVER_URL> \
  --remote-auth-callback-port 8666 \
  --remote-auth-scopes 'openid,profile,email,gcp.access' \
  --port 62614 \
  --token-exchange-url https://sts.googleapis.com/v1/token \
  --token-exchange-scopes 'https://www.googleapis.com/auth/bigquery,https://www.googleapis.com/auth/cloud-platform' \
  --token-exchange-audience //iam.googleapis.com/projects/<GOOGLE_PROJECT_NUMBER>/locations/global/workloadIdentityPools/okta-pool/providers/okta-provider

Here’s what each flag does:
--target-uri: Points to the MCP server’s proxy port (from the previous step)
--remote-auth-client-id: Client ID of your Okta app (from step 1)
--remote-auth-client-secret: Client secret of your Okta app (from step 1)
--remote-auth okta: Specifies the remote auth provider
--remote-auth-issuer: URL of the Okta authorization server’s issuer (from step 2)
--remote-auth-callback-port: Local port used for the OAuth callback (must match the callback URL used in step 1)
--remote-auth-scopes: Scopes requested from Okta during authentication
--port: Port the ToolHive proxy exposes to clients
--token-exchange-url: Google STS endpoint for exchanging tokens
--token-exchange-scopes: Google Cloud scopes required to access BigQuery and related APIs
--token-exchange-audience: Google Workload Identity Pool audience for Okta federation

When your browser opens, sign in with Okta. The proxy uses your Okta credentials to generate ID tokens, exchange them for valid Google tokens with the right scopes, and then continues the request automatically.
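
The Google tokens that come back from the exchange are short-lived, which is the point of this setup. As a rough illustration, the Python sketch below parses a token exchange response and works out when the token should be refreshed. The response field names follow RFC 8693. The function names, the sample response, and the 60-second safety margin are my assumptions, and ToolHive’s own caching and refresh logic may work differently.

```python
import json
import time


def parse_exchange_response(body, now):
    """Extract the access token and compute its absolute expiry time."""
    data = json.loads(body)
    if "access_token" not in data:
        raise ValueError("token exchange response has no access_token")
    return {
        "access_token": data["access_token"],
        "token_type": data.get("token_type", "Bearer"),
        "expires_at": now + int(data.get("expires_in", 0)),
    }


def needs_refresh(token, now, skew_seconds=60):
    """True once the token is within `skew_seconds` of expiring."""
    return now >= token["expires_at"] - skew_seconds


if __name__ == "__main__":
    # Made-up response body for illustration only.
    sample = json.dumps({
        "access_token": "example-access-token",
        "issued_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "token_type": "Bearer",
        "expires_in": 3600,
    })
    token = parse_exchange_response(sample, now=time.time())
    print(token["token_type"], needs_refresh(token, now=time.time()))
```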

Step 5: Run the MCP server with Claude or another client

Let’s use Claude Code as an example. Because ToolHive doesn’t automatically manage client configurations for proxied MCP servers, you’ll need to add it manually:

# Add the authenticated ToolHive proxy
claude mcp add --scope user --transport http database-toolbox http://127.0.0.1:62614/mcp

# Run Claude Code
claude

The Toolbox MCP server uses the token provided by the ToolHive proxy and passes it to Google Cloud, giving you access to the resources available to your account.

Any other MCP-compatible client can connect the same way. Just point it to the ToolHive proxy endpoint.

Why this architecture is powerful

Simple for clients: Apps connect to the ToolHive proxy just like any other MCP server endpoint.
Secure authentication flow: The proxy makes you log in through Okta, so every request carries a verified user identity.
Federated access to Google Cloud: Instead of embedding service account keys in your server, the proxy handles a token exchange so Google recognizes your identity through the workload identity provider.
Least-privilege and auditable: BigQuery jobs run under your federated Okta identity, so logs show “user@domain.com ran a BigQuery job” rather than “service-account X”.
Separation of concerns: The MCP server (Toolbox) focuses on data tools and queries, while the proxy handles auth, token exchange, and routing. It’s a cleaner, safer architecture.

Of course, it’s easy to get started with ToolHive, since it’s free and open source. I encourage you to visit toolhive.dev, where you can download the project and explore our docs.

Using Token Exchange with ToolHive and Okta for MCP Server to GraphQL Authentication

Yolanda Robla Mota — Tue, 04 Nov 2025 16:37:21 +0000

This article builds on our previous post, where we introduced the core concepts of token exchange and its role in secure authentication. Here, we delve into a practical application, demonstrating how to leverage Okta and ToolHive to facilitate token exchange for authenticating an MCP server with a GraphQL API.

Environment

This demo mimics a (hopefully!) real world example where we run an API service and we want to expose it with an MCP server. The back end API requires a token with aud=backend and scopes=[backend-api:read].

"Aud" (audience) in a token specifies the intended recipient of the token, indicating which service or application is meant to consume it. "Scopes" define the specific permissions or access rights granted by the token, detailing what actions the token holder is authorized to perform. Only tokens having the expected audience and the expected scopes authorize the caller to use the service.

We don’t want to expose the back end service directly to the AI client, but only through the MCP server. We also want to maintain a clean audit trail showing us who accessed what.

The MCP server requires a token with aud=mcpserver and scopes=mcp:tools:call.
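
To illustrate what these two requirements mean in practice, here is a small Python sketch of an audience-and-scope check on already-decoded token claims. It is not how the Apollo server or ToolHive implement validation (both also verify the signature and issuer); the function names are hypothetical. The sketch assumes scopes arrive either as an Okta-style scp list or as a space-delimited scope string.

```python
def token_audiences(claims):
    """Return the token's audiences as a list (aud may be a string or a list)."""
    aud = claims.get("aud", [])
    return [aud] if isinstance(aud, str) else list(aud)


def token_scopes(claims):
    """Return the token's scopes as a set."""
    if "scp" in claims:  # Okta-style list claim
        return set(claims["scp"])
    return set(claims.get("scope", "").split())  # space-delimited string


def is_authorized(claims, expected_audience, required_scopes):
    """True only if the audience matches and every required scope is present."""
    return (
        expected_audience in token_audiences(claims)
        and set(required_scopes) <= token_scopes(claims)
    )


if __name__ == "__main__":
    mcp_token = {"aud": "mcpserver", "scp": ["mcp:tools:call"]}
    # Accepted by the MCP server, rejected by the back end API.
    print(is_authorized(mcp_token, "mcpserver", ["mcp:tools:call"]))
    print(is_authorized(mcp_token, "backend", ["backend-api:read"]))
```

The second check failing is exactly why a token exchange is needed: the token VSCode holds is not valid for the back end.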

Both the API service and the MCP server are part of the same Okta realm, but we’ll use different Authorization Servers to ensure that the token the MCP server receives and the token the back end API receives use different audiences.

We’ll simulate the whole flow as a developer connecting to this setup by adding the MCP server to VSCode and calling the tools it provides.

It should be noted that in this example, we’ll be using an Apollo-based GraphQL service as the backend API service and the existing Apollo MCP server, but the same setup applies to any kind of API services as long as they both use OAuth tokens from the same realm as the authentication mechanism.

In order to follow along, you can clone the Apollo GraphQL service from a demo repository.

Okta setup

I’ve used the Okta integrator setup to prepare this demo and therefore the instructions cover the whole setup from the ground up including creating the Authorization Servers. This is likely not needed or needs to be adjusted in a real world environment.

Authorization Servers

To logically separate the MCP server from the back end API service, we’ll configure two Okta Authorization servers - one for the MCP server and client and the other for the backend server.

Create the Authorization Servers and then the following scopes:

mcpserver AS: mcp:tools:call
backend AS: backend-api:read

Trust between authorization servers

In order to enable token exchange between two authorization servers - the one that issues tokens for access to the MCP server and the one that issues tokens for accessing the back end, we need to establish trust between the two.

In the back end AS, open the Settings tab and add the mcpserver AS as a trusted server.

Applications

We’ll set up two Applications:

A VSCode client to authenticate to the MCP server. We create a client directly to avoid Dynamic Client registration. This will be an OIDC application with a client ID and a secret. It is important to match the Redirect URIs that VSCode uses. Set the Redirect URIs to http://127.0.0.1:33418 and https://vscode.dev/redirect
A toolhive client that will perform the Token Exchange. This is an API Services type in Okta lingo. To create the application, go to:
Applications -> Create App Integration and select API Services
Name your application
In the application page, navigate to the General Settings page and uncheck the “Require Demonstrating Proof of Possession” header as this is not yet supported by ToolHive
Check the Token Exchange grant

Policies

In order for applications to authenticate, we need to include them in policies, otherwise Okta will not issue tokens to the clients. We’ll define two policies: One that allows the MCP Client (VSCode) to request tokens with mcp:tools:call and another one that allows the token exchange by the ToolHive process.

MCP client to MCP server

This policy is to be defined on the mcpserver AS side. Select “Add New Access Policy”, then “Assign to the following Clients” and select the VSCode client. When the policy is created, click “Add Rule” in the policy and in the “And the following scopes” section add both the “OpenID Connect” scopes and the mcp:tools:call scopes.

MCP server token exchange

This policy is to be defined on the back end AS side. Select “Add New Access Policy”, then “Assign to the following Clients” and select the ToolHive client. When adding the rule, don’t forget to unroll “Advanced” under the “If Grant Type Is” section and add Token Exchange. Add “backend-api:read” to the scopes.

Running the GraphQL server

Let’s clone our server locally:

git clone https://github.com/StacklokLabs/apollo-mcp-auth-demo

Next, let’s configure the IDP settings in the .env file:

cp .env.example .env
vim .env

Using my Okta integrator account, the .env file looks as follows:

# Okta Configuration
# Your Okta domain (e.g., dev-123456.okta.com)
OKTA_DOMAIN=integrator-3683736.okta.com

# Your Okta issuer URL (authorization server)
# For default authorization server: https://your-domain.okta.com/oauth2/default
# For custom authorization server: https://your-domain.okta.com/oauth2/{authServerId}
OKTA_ISSUER=https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697

# JWT Validation Configuration
# Expected audience in JWT tokens (space-separated if multiple)
OKTA_AUDIENCE=backend
# Required scopes in JWT tokens (space-separated)
REQUIRED_SCOPES=backend-api:read

# Authentication Configuration
# Set to 'true' to require valid tokens for all requests (recommended)
# Set to 'false' to disable authentication requirement (for testing)
REQUIRE_AUTH=true

# Server Configuration
PORT=4000

Now we’re ready to start the server:

npm install
npm start

Running ToolHive

In our testing, we’re using the already existing Apollo MCP server with no modifications - all the heavy lifting is done by ToolHive. The Apollo MCP server is merely configured to accept the downstream authentication token in the Authorization: Bearer HTTP header and forward it to the external API.
The MCP server configuration can be found in the mcp-server-data directory in the demo repository.

Because the unmodified MCP server also validates the incoming tokens, we need to set the transport.auth.servers attribute in the config file to the back end Authorization server:

vim mcp-server-data/apollo-mcp-config.yaml

...
transport:
  type: sse
  port: 8000
  auth:
    servers:
      - https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697
...

Now we can run the server with:

thv run \
--debug \
--foreground \
--transport streamable-http \
--name apollo \
--target-port 8000 \
--proxy-port 8000 \
--volume $(pwd)/mcp-server-data/apollo-mcp-config.yaml:/config.yaml \
--volume $(pwd)/mcp-server-data:/data \
--oidc-audience mcpserver \
--resource-url http://localhost:8000/mcp \
--oidc-issuer https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697 \
--oidc-jwks-url https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697/v1/keys \
--token-exchange-audience backend \
--token-exchange-client-id 0oawdgw7krVBSwzIx697 \
--token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 \
--token-exchange-scopes backend-api:read \
--token-exchange-url https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token \
apollo-mcp-server -- /config.yaml

Let’s unpack the parameters:
--oidc-audience mcpserver - When the OIDC token from VSCode arrives at ToolHive, ToolHive checks whether the token’s aud field matches this value and rejects the connection otherwise

--resource-url http://localhost:8000/mcp - Setting the resource explicitly helps VSCode discover the proper Protected Resource Metadata Endpoint as per the MCP specification and in effect points VSCode to the Okta instance. Typically not needed in e.g. Kubernetes environments where the service name can be used

--oidc-issuer https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697 - This is the issuer of the mcpserver Authorization Server

--oidc-jwks-url https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697/v1/keys - The JWKS endpoint of the mcpserver Authorization Server

--token-exchange-audience 'backend' - We want ToolHive to take the incoming tokens and exchange them for tokens with audience of “backend”

--token-exchange-client-id 0oawdgw7krVBSwzIx697 - The Client ID of the “ToolHive client”, the one the token exchange policy is assigned to

--token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 - the client secret of the ToolHive client. Outside demos, please use the --token-exchange-client-secret-file switch instead, or the TOOLHIVE_TOKEN_EXCHANGE_CLIENT_SECRET environment variable.

--token-exchange-scopes 'backend-api:read' - The scopes we request for the external token. Must match what’s in the policy.

--token-exchange-url https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token - the token endpoint of the back end Authorization Server.
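
On the --resource-url flag: the Protected Resource Metadata document is defined by RFC 9728, which places the well-known path between the host and the resource path. The Python sketch below derives that URL from a resource URL so you can check it with a browser or curl. The function name is hypothetical, and whether a given ToolHive version serves the document at exactly this location is an assumption worth verifying against your own deployment.

```python
from urllib.parse import urlsplit, urlunsplit

WELL_KNOWN = "/.well-known/oauth-protected-resource"


def protected_resource_metadata_url(resource_url):
    """Insert the RFC 9728 well-known segment between host and path."""
    parts = urlsplit(resource_url)
    path = parts.path.rstrip("/")  # a trailing slash is dropped before inserting
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN + path, parts.query, ""))


if __name__ == "__main__":
    print(protected_resource_metadata_url("http://localhost:8000/mcp"))
```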

Note that the example above uses thv run, but it’s equally possible to use the token exchange from thv proxy which can then also provide authentication to the MCP server:

thv proxy demo-mcp-server \
    --target-uri http://localhost:8091 \
    --port 3000 \
    --remote-auth \
    --remote-auth-client-id 0oawdhc2mlgHOwNvW697 \
    --remote-auth-client-secret Ag0Zj6ALuxxqascP6KJ-CA4uCRcOLmIKtQeR_o3ClGgxMxx0zcgZYYtg-TmHF6U- \
    --remote-auth-issuer https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697 \
    --remote-auth-scopes 'mcp:tools:call,openid,email' \
    --token-exchange-audience 'backend' \
    --token-exchange-client-id 0oawdgw7krVBSwzIx697 \
    --token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 \
    --token-exchange-scopes 'backend-api:read' \
    --token-exchange-url https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token

Authentication from VSCode and putting it all together

Once the server is running, it should automatically appear in the list of configured MCP servers in VSCode. Clicking Start prompts authentication against Okta. The first time, you’ll also be prompted to enter the client ID and secret. Once Okta authenticates you, VSCode receives the token and uses it to authenticate to the MCP server (ToolHive), which exchanges it for a token that can call the back end API.

Past the initial setup on the IDP side, authentication and authorization to the MCP server fronted by ToolHive, and by extension to the back end service, is seamless. It also lets you partition access to the back end services and provides a cleaner audit trail.

As the last step, we can invoke one of the MCP tools to verify the setup end-to-end:

As seen on the screenshot above, the GetCountry tool of the Apollo server was called and returned a reply! If we check the logs of the API server we ran earlier we also see details of the token that was validated:

This token has a different audience than the one passed to ToolHive. If you recall the thv run parameters, the --oidc-audience mcpserver argument specified that incoming tokens must set the aud claim to mcpserver, while the token that arrived at the back end API has the audience backend. Looking closely at the issuer, we also see that the token was issued by the back end Authorization Server, while the tokens issued to authenticate to ToolHive were issued by the mcpserver Authorization Server. This shows that the token exchange works correctly. In the next section, we’ll illustrate for completeness’ sake how the tokens look exactly and how the whole flow works.
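
If you want to compare the aud and iss claims of two tokens yourself, a few lines of Python are enough to decode a JWT payload. This is a debugging sketch only: it does not verify the signature, so it must never be used to make authorization decisions. The helper names and the demo token are made up for illustration.

```python
import base64
import json


def b64url_decode(segment):
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def unverified_claims(jwt):
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    _header, payload, _signature = jwt.split(".")
    return json.loads(b64url_decode(payload))


def make_demo_token(claims):
    """Build an unsigned sample token so the sketch runs without a real IdP."""
    def encode(obj):
        raw = json.dumps(obj).encode("utf-8")
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return f"{encode({'alg': 'none'})}.{encode(claims)}."


if __name__ == "__main__":
    demo = make_demo_token({"iss": "https://idp.example.com/oauth2/default",
                            "aud": "backend"})
    claims = unverified_claims(demo)
    print(claims["iss"], claims["aud"])
```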

The token exchange under the hood

The flow is described in the Mermaid diagram below.

The client authenticates to ToolHive, which exposes the interface and endpoints that the MCP standard describes. The ToolHive authentication middleware verifies that the token was issued by the expected IDP and has the expected audience. After authentication, the token is passed to the Token Exchange middleware, which contacts the IDP and exchanges the token meant for the MCP server for a token meant for the external service.

The token issued to the client might look like this (simplified):

{
    "iss": "https://idp.example.com/oauth2/default",
    "aud": "mcp-server",
    "scp": [
        "backend-mcp:tools:call",
        "backend-mcp:tools:list"
    ],
    "sub": "user@example.com"
}

While the exchanged token would have different scopes and a different audience, allowing the MCP server to authenticate to the back end service:

{
    "iss": "https://idp.example.com/oauth2/default",
    "aud": "backend-server",
    "scp": [
        "backend-api:read"
    ],
    "sub": "user@example.com"
}

This exchanged token is then injected into the Authorization: Bearer HTTP header and passed on to the actual MCP server running under ToolHive. The MCP server can then use the token.
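
Put together, the two middleware stages can be pictured as the short Python sketch below. This is a conceptual outline only, not ToolHive’s code: the function names are hypothetical, the claims are assumed to be already signature-verified, and the exchange step is passed in as a callable so the example runs without contacting an IDP.

```python
def authenticate(claims, expected_issuer, expected_audience):
    """Stage 1: reject tokens from the wrong issuer or for the wrong audience."""
    if claims.get("iss") != expected_issuer:
        raise PermissionError("unexpected issuer")
    aud = claims.get("aud") or []
    audiences = [aud] if isinstance(aud, str) else list(aud)
    if expected_audience not in audiences:
        raise PermissionError("unexpected audience")


def forward_headers(incoming_token, claims, expected_issuer,
                    expected_audience, exchange):
    """Stage 2: swap the validated token and build the upstream header."""
    authenticate(claims, expected_issuer, expected_audience)
    backend_token = exchange(incoming_token)  # e.g. an RFC 8693 call to the IDP
    return {"Authorization": f"Bearer {backend_token}"}


if __name__ == "__main__":
    # Stand-in for the real exchange call; it just tags the token.
    def fake_exchange(token):
        return "exchanged-" + token

    headers = forward_headers(
        "client-token",
        {"iss": "https://idp.example.com/oauth2/default", "aud": "mcp-server"},
        "https://idp.example.com/oauth2/default",
        "mcp-server",
        fake_exchange,
    )
    print(headers)
```

Keeping the exchange behind a callable mirrors the separation described above: the MCP server never sees the client’s original token, only the one minted for the back end.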

Summary and benefits

By leveraging token exchange, ToolHive enables MCP servers to authenticate to third-party APIs in a secure, efficient, and tenant-aware way. MCP servers receive properly scoped, short-lived access tokens instead of embedding long-lived secrets or bespoke authentication logic. Each API call made upstream can be attributed to the individual user identity rather than a generic service account, making audit trails clearer and more meaningful.

References

https://modelcontextprotocol.io/docs/tutorials/security/authorization

https://developer.okta.com/docs/guides/set-up-token-exchange/main/

Using Token Exchange with ToolHive and Okta for MCP Server to GraphQL Authentication

Yolanda Robla Mota — Tue, 04 Nov 2025 16:37:21 +0000

Environment

We don’t want to expose the back end service directly to the AI client, but only through the MCP server. We also want to maintain a clean audit trail showing us who accessed what.

The MCP server requires a token with aud=mcpserver and scopes=mcp:tools:call.

We’ll simulate the whole flow as a developer connecting to this setup by adding the MCP server to VSCode and calling the tools it provides.

In order to follow along, you can clone the Apollo GraphQL service from a demo repository.

Okta setup

Authorization Servers

To logically separate the MCP server from the back end API service, we’ll configure two Okta Authorization servers - one for the MCP server and client and the other for the backend server.

Create the Authorization Servers and then the following scopes:

mcpserver AS: mcp:tools:call
backend AS: backend-api:read

Trust between authorization servers

In the back end AS, scroll to the bottom of the Settings tab and add the mcpserver AS as a trusted server:

Applications

We’ll set up two Applications:

A VSCode client to authenticate to the MCP server. We create the client directly to avoid Dynamic Client Registration. This will be an OIDC application with a client ID and a secret. It is important to match the Redirect URIs that VSCode uses, so set the Redirect URIs to http://127.0.0.1:33418 and https://vscode.dev/redirect
A ToolHive client that will perform the Token Exchange. This is an API Services application in Okta lingo. To create the application:
Go to Applications -> Create App Integration and select API Services
Name your application
On the application page, navigate to General Settings and uncheck the “Require Demonstrating Proof of Possession” header option, as this is not yet supported by ToolHive
Check the Token Exchange grant

Policies

MCP client to MCP server

MCP server token exchange

Running the GraphQL server

Let’s clone our server locally:

git clone https://github.com/StacklokLabs/apollo-mcp-auth-demo

Next, let’s configure the IDP settings in the .env file:

cp .env.example .env
vim .env

Using my Okta integrator account, the .env file looks as follows:

# Okta Configuration
# Your Okta domain (e.g., dev-123456.okta.com)
OKTA_DOMAIN=integrator-3683736.okta.com

# Your Okta issuer URL (authorization server)
# For default authorization server: https://your-domain.okta.com/oauth2/default
# For custom authorization server: https://your-domain.okta.com/oauth2/{authServerId}
OKTA_ISSUER=https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697

# JWT Validation Configuration
# Expected audience in JWT tokens (space-separated if multiple)
OKTA_AUDIENCE=backend
# Required scopes in JWT tokens (space-separated)
REQUIRED_SCOPES=backend-api:read

# Authentication Configuration
# Set to 'true' to require valid tokens for all requests (recommended)
# Set to 'false' to disable authentication requirement (for testing)
REQUIRE_AUTH=true

# Server Configuration
PORT=4000
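The OKTA_AUDIENCE and REQUIRED_SCOPES settings boil down to a simple check on the claims of each incoming token. The demo server is a Node.js application and its internals are not shown here, so the following is only a Python sketch of the kind of check those settings imply. The is_authorized function is hypothetical, and it assumes scopes arrive in the scp claim, as in Okta access tokens.

```python
def parse_list(value: str) -> set:
    """Split a space-separated setting such as REQUIRED_SCOPES into a set."""
    return set(value.split())


def is_authorized(claims: dict, expected_audience: str, required_scopes: str) -> bool:
    """Check that a token's claims satisfy the audience and scope settings."""
    aud = claims.get("aud", [])
    audiences = {aud} if isinstance(aud, str) else set(aud)

    scp = claims.get("scp", [])
    scopes = set(scp.split()) if isinstance(scp, str) else set(scp)

    audience_ok = bool(audiences & parse_list(expected_audience))
    scopes_ok = parse_list(required_scopes) <= scopes
    return audience_ok and scopes_ok


# A token exchanged for the back end passes; the original MCP token does not.
exchanged = {"aud": "backend", "scp": ["backend-api:read"]}
original = {"aud": "mcpserver", "scp": ["mcp:tools:call"]}
print(is_authorized(exchanged, "backend", "backend-api:read"))
print(is_authorized(original, "backend", "backend-api:read"))
```

This is why the back end rejects a token that was minted for the MCP server, and why the exchange step is needed at all.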

Now we’re ready to start the server:

npm install
npm start

Running ToolHive

Because the unmodified MCP server also validates the incoming tokens, we need to set the transport.auth.servers attribute in the config file to the back end Authorization server:

vim mcp-server-data/apollo-mcp-config.yaml

...
transport:
  type: sse
  port: 8000
  auth:
    servers:
      - https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697
...

Now we can run the server with:

thv run \
--debug \
--foreground \
--transport streamable-http \
--name apollo \
--target-port 8000 \
--proxy-port 8000 \
--volume $(pwd)/mcp-server-data/apollo-mcp-config.yaml:/config.yaml \
--volume $(pwd)/mcp-server-data:/data \
--oidc-audience mcpserver \
--resource-url http://localhost:8000/mcp \
--oidc-issuer https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697 \
--oidc-jwks-url https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697/v1/keys \
--token-exchange-audience backend \
--token-exchange-client-id 0oawdgw7krVBSwzIx697 \
--token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 \
--token-exchange-scopes backend-api:read \
--token-exchange-url https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token \
apollo-mcp-server -- /config.yaml

--oidc-issuer https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697 - This is the issuer of the mcpserver Authorization Server (see the first screenshot of the document)

--oidc-jwks-url https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697/v1/keys - The JWKS endpoint of the mcpserver Authorization Server

--token-exchange-audience 'backend' - We want ToolHive to take the incoming tokens and exchange them for tokens with audience of “backend”

--token-exchange-client-id 0oawdgw7krVBSwzIx697 - The Client ID of the “ToolHive client”, the application the token exchange policy is assigned to

--token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 - the client secret of the ToolHive client. Outside demos, please use the --token-exchange-client-secret-file switch instead, or the TOOLHIVE_TOKEN_EXCHANGE_CLIENT_SECRET environment variable.

--token-exchange-scopes 'backend-api:read' - The scopes we request for the external token. Must match what’s in the policy.

--token-exchange-url https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token - the token endpoint of the back end Authorization Server.

Note that the example above uses thv run, but it’s equally possible to use the token exchange from thv proxy which can then also provide authentication to the MCP server:

thv proxy demo-mcp-server \
    --target-uri http://localhost:8091 \
    --port 3000 \
    --remote-auth \
    --remote-auth-client-id 0oawdhc2mlgHOwNvW697 \
    --remote-auth-client-secret Ag0Zj6ALuxxqascP6KJ-CA4uCRcOLmIKtQeR_o3ClGgxMxx0zcgZYYtg-TmHF6U- \
    --remote-auth-issuer https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697 \
    --remote-auth-scopes 'mcp:tools:call,openid,email' \
    --token-exchange-audience 'backend' \
    --token-exchange-client-id 0oawdgw7krVBSwzIx697 \
    --token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 \
    --token-exchange-scopes 'backend-api:read' \
    --token-exchange-url https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token

Authentication from VSCode and putting it all together

As the last step, we can invoke one of the MCP tools to verify the setup end-to-end:

The token exchange under the hood

The flow is described in the Mermaid diagram below.

The token issued to the client might look like this (simplified):

{
    "iss": "https://idp.example.com/oauth2/default",
    "aud": "mcp-server",
    "scp": [
        "backend-mcp:tools:call",
        "backend-mcp:tools:list"
    ],
    "sub": "user@example.com"
}

While the exchanged token would have different scopes and a different audience, allowing the MCP server to authenticate to the back end service:

{
    "iss": "https://idp.example.com/oauth2/default",
    "aud": "backend-server",
    "scp": [
        "backend-api:read"
    ],
    "sub": "user@example.com"
}

This exchanged token is then injected into the Authorization: Bearer HTTP header and passed on to the actual MCP server running under ToolHive. The MCP server can then use the token.

Summary and benefits

References

https://modelcontextprotocol.io/docs/tutorials/security/authorization

https://developer.okta.com/docs/guides/set-up-token-exchange/main/

Beyond API Keys: Token Exchange, Identity Federation & MCP Servers

Yolanda Robla Mota — Thu, 30 Oct 2025 11:04:03 +0000

Modern backend systems—especially in the era of AI agents, MCP servers, and multi-cloud architectures—are evolving far beyond static credentials and monolithic identity models. In this post we explore the architecture of token exchange, identity federation, and how a system like ToolHive enables secure deployment of MCP servers in this world.

The legacy problem: static credentials

The MCP authorization specification focuses on how to authorize access to the MCP server itself. It doesn't specify how an MCP server should authenticate with the server it's connecting to. This leaves MCP server creators without clear guidance.

In many deployments of MCP (Model Context Protocol) servers and tooling services today, developers still default to patterns like:

A service-account JSON key or a long-lived API key embedded in configuration.
All calls executed under a single “shared identity” with elevated permissions.
If the key is compromised, the impact spans many users or tenants; rotating or tracking the key is operationally heavy.
Least-privilege is often compromised because the shared identity needs broad access to avoid blocking tool invocation.

This approach doesn’t align with how modern identity systems, federated services and cloud tools are designed. It’s less secure, harder to govern, and doesn’t scale across users or multi‐tenant environments.

Step up: Short-lived tokens via an IdP

A much better pattern emerges when you shift to short-lived tokens:

A user (or service) authenticates via an Identity Provider (IdP) — for example, Okta or Azure AD.
They receive a short-lived token (OIDC ID token or OAuth access token) that's scoped to their identity and minimal permissions.
This token is used to authenticate to the MCP server (with the help of ToolHive), which validates it and establishes the user's identity.
ToolHive then acquires a separate token for the downstream backend API—either through token exchange (if using the same IdP) or federation (if crossing identity domains).
Your MCP server receives this backend-scoped token and uses it when calling downstream services or tools.

Because tokens are scoped, time-limited, and mapped to a specific user context, you get better auditability, enforce least-privilege, and eliminate static credentials. Next, we’ll show you how to ensure that your MCP server always has the right credentials for its backend API without embedding secrets or handling complex auth flows.

Token Exchange & Federation: crossing trust-boundaries

Token exchange refers to the process where one security token (issued by one identity domain) is presented to a “Security Token Service” (STS) or similar endpoint, and in return you receive a new token valid for another domain, audience, or scope.
The standard for this is RFC 8693 (OAuth 2.0 Token Exchange), which lets you request a new token via the grant type urn:ietf:params:oauth:grant-type:token-exchange.

Use-cases for token exchange include:

A token issued by your internal IdP being exchanged for a token valid for a cloud provider’s API.
A token from one IdP being reused to obtain tokens in another trust domain without forcing the user to log in again.
A service acting on behalf of a user, exchanging its own token for one with narrower scopes or different audiences.
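To make the exchange concrete, here is a minimal sketch of what an RFC 8693 request looks like on the wire. The grant type and token type URNs come from the RFC. The build_exchange_request function, the example URL, and the credentials are hypothetical, and the sketch assumes the authorization server accepts client credentials via HTTP Basic authentication. It only builds the request, it does not send it.

```python
import base64
import urllib.parse
import urllib.request

GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_exchange_request(token_url, client_id, client_secret,
                           subject_token, audience, scopes):
    """Build (but do not send) an RFC 8693 token exchange request."""
    form = {
        "grant_type": GRANT_TYPE,
        "subject_token": subject_token,          # the token we already hold
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,                    # who the new token is for
        "scope": " ".join(scopes),               # what the new token may do
    }
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        token_url,
        data=urllib.parse.urlencode(form).encode(),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Placeholder values for illustration only.
req = build_exchange_request(
    "https://idp.example.com/oauth2/default/v1/token",
    "example-client-id", "example-client-secret",
    "<incoming-access-token>", "backend-server", ["backend-api:read"],
)
print(req.get_method(), req.full_url)
print(req.data.decode())
```

Sending the request with urllib.request.urlopen(req) would, on success, return a JSON document whose access_token field holds the exchanged token.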

Two common scenarios

A) The downstream service uses the same IdP as the MCP server

In this case your identity provider (IdP) issues tokens for both the MCP server and the downstream resources. No cross-domain trust is needed.

User authenticates via IdP → obtains a token for the MCP server.
ToolHive validates the token and performs access control checks.
ToolHive exchanges that token with the same IdP for a new token with the downstream service's audience and scopes.
MCP server receives this exchanged token and uses it to call the downstream service.

This approach is simpler, with fewer moving parts, since the exchange happens within the same IdP ecosystem.

The token issued to the client might look like this (simplified):

{
   "iss": "https://idp.example.com/oauth2/default",
   "aud": "mcp-server",
   "scp": [
     "backend-mcp:tools:call",
     "backend-mcp:tools:list"
   ],
   "sub": "user@example.com"
}

While the exchanged token would have different scopes and a different audience, allowing the MCP server to authenticate to the back end service:

{
    "iss": "https://idp.example.com/oauth2/default",
    "aud": "backend-server",
    "scp": [
        "backend-api:read"
    ],
    "sub": "user@example.com"
}

B) The downstream service uses a different IdP and you rely on federation

Here you have two distinct identity/trust domains: one used by the MCP server (or its IdP) and another used by the back end resource. Instead of issuing separate credentials or having users log in twice, you rely on federation and token exchange.

User authenticates via IdP A → receives a token for domain A that is presented to ToolHive
ToolHive validates the token and performs access control checks.
ToolHive presents the token to an STS or federation service (e.g., Google Cloud STS) → obtains a federated token valid for domain B (cloud provider).
Downstream service validates the token from domain B and executes requests under that identity.

This approach enables your system to be IdP-agnostic and cloud-agnostic: authenticate with any IdP, then federate into any trust-configured domain.

The token issued to the client might look like this (simplified):

{
  "iss": "https://idp.example.com/oauth2/default",
  "aud": "mcp-server",
  "sub": "user@example.com",
  "email": "user@example.com",
  "scp": [
    "mcp:tools:call",
    "mcp:tools:list"
  ],
  "exp": 1729641600,
  "iat": 1729638000
}

The exchanged federated access token would have a different issuer, audience, and scopes, allowing the MCP server to authenticate to the downstream service as the federated user identity:

{
  "iss": "https://sts.googleapis.com",
  "aud": "https://bigquery.googleapis.com/",
  "sub": "principal://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/subject/user@example.com",
  "email": "user@example.com",
  "scp": [
    "https://www.googleapis.com/auth/bigquery"
  ],
  "exp": 1729641600,
  "iat": 1729638000
}
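For the federation case, the request sent to the STS differs from a same-IdP exchange mainly in its audience, which names a workload identity pool provider rather than a service. Below is a rough sketch of the form fields for Google Cloud's STS token endpoint. The build_sts_form function is hypothetical, PROVIDER_ID is a placeholder that does not appear in the examples above, and treating the IdP token as a JWT subject token with the BigQuery scope is an assumption for illustration.

```python
import urllib.parse

# Google Cloud Security Token Service endpoint used for federation.
STS_URL = "https://sts.googleapis.com/v1/token"


def build_sts_form(project_number, pool_id, provider_id, idp_token,
                   scope="https://www.googleapis.com/auth/bigquery"):
    """Build the form fields for exchanging an IdP token at Google's STS."""
    audience = (
        f"//iam.googleapis.com/projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool_id}/providers/{provider_id}"
    )
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": scope,
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "subject_token": idp_token,  # the token issued by IdP A
    }


form = build_sts_form("PROJECT_NUMBER", "POOL_ID", "PROVIDER_ID", "<token-from-idp-a>")
body = urllib.parse.urlencode(form)  # POST this body to STS_URL
print(body)
```

Note that no client secret is involved here: the trust comes from the Workload Identity Federation configuration on the Google Cloud side, not from a shared credential.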

Why this matters for MCP servers

MCP servers are often deployed to call different services on behalf of users. If they rely on static credentials or simplistic “shared identity” models, you lose user-level attribution, least-privilege control, and auditability.
By using token exchange + federation, you allow your MCP server to operate under the right identity context, even when the target service sits in a different trust domain.
It also lets you design your architecture so the authentication piece (login, token issuance) is decoupled from the MCP server logic — the server can remain auth-agnostic and medium-agnostic.

Where ToolHive fits

ToolHive simplifies deployment of MCP servers by handling the operational and security heavy-lifting.

You run your MCP servers in containers with minimal permissions and network access — ToolHive manages that.
ToolHive acts as a gateway: it verifies the user's token (via your IdP), enforces access policies, then acquires the appropriate backend token—either through exchange or federation—before passing that to your MCP server.
This separation means your MCP server remains auth-agnostic — ToolHive handles authN/authZ and you plug in any IdP or downstream STS.

This blog post is the first in a series. Over the coming posts we’ll dive into a set of practical examples using ToolHive — showing how to wire up different IdPs, federate into different clouds, run MCP servers securely, and deal with real-world edge cases.

Note: ToolHive is an open source project, and we encourage you to download it (from toolhive.dev) and start using it. We value your feedback and would love to engage with you via our GitHub repo and/or Discord channel.

Exposing a Kubernetes-Hosted MCP Server with ToolHive + ngrok (with Basic Auth)

Yolanda Robla Mota — Wed, 17 Sep 2025 15:17:38 +0000

In the previous post, we tunneled a local MCP server with ngrok to expose internal services externally (for testing and integration, demo access, branch office access, and other scenarios). Now let’s do the same for a Kubernetes-hosted workload managed by ToolHive. This is very much a production scenario, in which MCP servers are hosted in, and exposed from, Kubernetes clusters, but with ToolHive and ngrok we can keep the approach simple. Once you’ve got ToolHive and ngrok up and running, just follow the steps below:

1. Deploy ToolHive to your cluster, then the fetch MCP server

Follow the ToolHive Kubernetes Operator quickstart to install the operator and deploy an MCP server in your cluster (I’m using the fetch server here). The operator turns MCP servers into first-class Kubernetes resources you can manage declaratively:

kubectl apply -f https://raw.githubusercontent.com/stacklok/toolhive/refs/heads/main/examples/operator/mcp-servers/mcpserver_fetch.yaml

After applying your manifests/CRs, you’ll see Services like:

kubectl get service -n toolhive-system
# NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
# mcp-fetch-headless   ClusterIP   None            <none>        8080/TCP   12m
# mcp-fetch-proxy      ClusterIP   10.96.166.106   <none>        8080/TCP   12m

These are ClusterIP Services, which are intentionally in-cluster only (no host access yet). We’ll bridge them to the host next.

2. Port-forward the Service to your laptop

Use kubectl port-forward to map the Service’s port to localhost:8080 so you can reach it from your machine:

kubectl -n toolhive-system port-forward svc/mcp-fetch-proxy 8080:8080

Now http://127.0.0.1:8080 is a portal to the in-cluster Service.

3. Add a simple ngrok Traffic Policy (HTTP Basic Auth)

Before we open this to the internet, let’s require a username/password via ngrok Traffic Policy. Save a policy file like /tmp/policy.yaml:

on_http_request:
  - actions:
      - type: basic-auth
        config:
          credentials:
            - stacklok:p4ssw0rd

ngrok’s Basic Auth policy validates the Authorization: Basic … header, returning 200 OK when credentials match, and 401 Unauthorized otherwise.

Tip: echo -n 'stacklok:p4ssw0rd' | base64 helps you generate the header value locally.
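If you would rather generate the header value in code than on the command line, the same encoding takes a few lines of Python. This is a minimal sketch, and basic_auth_header is a hypothetical helper name.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


print(basic_auth_header("stacklok", "p4ssw0rd"))
# prints: Basic c3RhY2tsb2s6cDRzc3cwcmQ=
```

Keep in mind that Base64 is an encoding, not encryption, so the header should only ever travel over HTTPS.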

4. Launch the tunnel with ToolHive’s proxy

With the Service forwarded to 127.0.0.1:8080, start a ToolHive tunnel pointing at that local address, telling ToolHive to use ngrok and your policy file:

thv proxy tunnel http://127.0.0.1:8080 test \
  --tunnel-provider ngrok \
  --provider-args "{\"auth-token\":\"${NGROK_TOKEN}\",\"traffic-policy-file\":\"/tmp/policy.yaml\"}"

ToolHive will bring up an ngrok HTTPS endpoint and print the public URL for the fetch MCP server — something like:

"fetch": {
  "url": "https://bf18062fef8a.ngrok-free.app/mcp",
  "description": "Fetch MCP server for testing",
  "headers": {
    "Authorization": "Basic c3RhY2tsb2s6cDRzc3cwcmQ="
  },
  "type": "http"
}

Send requests with the Authorization header and you’ll get through; omit it and you’ll see a 401 by design.

Summarizing the benefits of this approach

Kubernetes-native management: ToolHive’s operator defines and manages MCP servers as Kubernetes resources, which is great for multi-user and production workflows.
Safe local bridge: kubectl port-forward exposes the internal Service to your host without changing cluster networking.
Hardened public edge: ngrok’s Traffic Policy adds Basic Auth at the edge so your tunnel isn’t wide open during tests/demos.

With these few steps, you’ve taken a Kubernetes-hosted MCP server, bridged it to your localhost safely, and published it behind a secure, temporary ngrok URL. This is perfect for quick external tests, demos, or sharing an endpoint without touching production.

We’re excited about the integration of ToolHive and ngrok and how it quickly and elegantly solves a problem that more enterprises will encounter as they adopt MCP. If you have questions or ideas, we’d love to hear from you. Please check out ToolHive and ngrok, and connect with us on Discord.

How-to Safely Expose your MCP Servers Externally Using ngrok and ToolHive

Yolanda Robla Mota — Tue, 09 Sep 2025 14:10:23 +0000

As you make increasing use of Model Context Protocol (MCP) servers, you’re going to find yourself in a situation where you need to expose these endpoints externally. For example, you may need to expose servers to a partner or customer for testing and integration. Perhaps your organization has a branch office without direct network access but the same need to reach MCP servers. Or, your product may offer MCP ‘tools-as-a-service’ to clients that live outside your VPC.
There’s a quick, simple and safe way to expose MCP servers when they’re managed by ToolHive and integrated with ngrok. Below we’ll show you how you can do it using ToolHive's proxy tunnel command. But, first, a quick description of ToolHive and ngrok for those new to these solutions.

ToolHive: The MCP Engine

ToolHive is your starting point for running MCP in production. It handles:

Server Lifecycle: Starting, stopping, and managing MCP server instances.
Transport Methods: Supporting multiple communication protocols (stdio, SSE, streamable-http).
Security: Managing secrets, permissions, and isolation.
Discovery: Providing a registry of available MCP servers.

ngrok: The API Gateway

ngrok is a flexible API gateway that provides instant and secure access, anywhere. It handles:

Identity and access: Supporting OIDC and OAuth2 and providing tenant-aware RBAC
Secure tunnel: Handling HTTPS with secure, automatically generated public URLs
Safety & governance: Establishing rate limits and managing blast radius
Observability: Providing logs and audit trails

Getting started:

Before diving in, you’ll need to address a few easy prerequisites:

Create an ngrok account

Visit the ngrok website and sign up for a free (or paid) account.

Obtain your ngrok auth token

After logging in, you'll find your authentication token in the ngrok dashboard—often under Auth or Setup.
Copy that token (represented as <your_ngrok_token> in the examples below).

Enable fixed or custom domains (optional)

Set up a permanent, branded domain (e.g. your-app.ngrok.io or a custom domain like api.yourcompany.com) instead of a random address by claiming your free static domain at dashboard.ngrok.com/domains.

Exposing MCP server endpoints

As we work through this example, imagine you’ve got an OSV MCP server that you want to expose externally, so that other users can test your integration. You set up and manage that OSV MCP server using ToolHive, with the internal workload listening on localhost.

Your first command would be:

export NGROK_TOKEN=<your_ngrok_token>
export NGROK_URL=<your_ngrok_url>

thv proxy tunnel osv test \
  --tunnel-provider ngrok \
  --provider-args "{
    \"auth-token\": \"${NGROK_TOKEN}\",
    \"url\": \"${NGROK_URL}\"
  }"

That command includes these actions:

thv proxy tunnel osv test spins up a tunnel for the ToolHive workload named osv (in a test context).
--tunnel-provider ngrok tells ToolHive to use ngrok as the tunneling mechanism.
--provider-args passes any needed parameters for ngrok, such as authentication credentials so the tunnel will establish properly under your account.
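Because --provider-args takes a JSON string, hand-escaping quotes in the shell gets error-prone quickly. One way around that is to build the JSON in a script and pass it as a single argument. The sketch below is only an illustration: build_provider_args is a hypothetical helper, the keys are the ones used in these posts (auth-token, url, traffic-policy-file), and it prints the command instead of running it.

```python
import json
import os


def build_provider_args(auth_token, url=None, traffic_policy_file=None):
    """Serialize ngrok provider settings into the JSON string thv expects."""
    args = {"auth-token": auth_token}
    if url:
        args["url"] = url
    if traffic_policy_file:
        args["traffic-policy-file"] = traffic_policy_file
    return json.dumps(args)


provider_args = build_provider_args(
    os.environ.get("NGROK_TOKEN", "<your_ngrok_token>"),
    url=os.environ.get("NGROK_URL"),
)

# Passing this list to subprocess.run() would bypass shell quoting entirely.
command = [
    "thv", "proxy", "tunnel", "osv", "test",
    "--tunnel-provider", "ngrok",
    "--provider-args", provider_args,
]
print(command)
```

Reading the token from the environment also keeps it out of the script itself and out of your shell history.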

The result is an endpoint in ngrok with all the settings configured:

Once the command runs successfully, you’ll get a public HTTPS URL that you can use and integrate into your tools. In our example, that URL looks like this:

"osv": {
  "url": "https://ricarda-presuggestive-archaically.ngrok-free.app",
  "description": "OSV MCP server for testing"
}

And that means that the local OSV MCP server, which was accessible only on localhost, is now reachable externally, and can be used by other users to test your integration:

Summarizing the benefits of this approach

With a single, simple command, you’ve set up an instant, public URL with no DNS changes or firewall configuration. It’s a secure HTTPS endpoint that’s lightweight and temporary, so it’s ideal for short-term testing, demos or collaborating with remote customers and teammates.