<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Rohit Agarwal</title>
    <description>The latest articles on Forem by Rohit Agarwal (@jumbld).</description>
    <link>https://forem.com/jumbld</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1236757%2F835e4be5-671f-4d16-9c55-ca6ed079b03e.jpg</url>
      <title>Forem: Rohit Agarwal</title>
      <link>https://forem.com/jumbld</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/jumbld"/>
    <language>en</language>
    <item>
      <title>Deep Dive: OpenAI's o1 - The Dawn of Deliberate AI</title>
      <dc:creator>Rohit Agarwal</dc:creator>
      <pubDate>Mon, 09 Dec 2024 05:17:46 +0000</pubDate>
      <link>https://forem.com/portkey/deep-dive-openais-o1-the-dawn-of-deliberate-ai-5j5</link>
      <guid>https://forem.com/portkey/deep-dive-openais-o1-the-dawn-of-deliberate-ai-5j5</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgwmzfcye4j6nggx2pla.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgwmzfcye4j6nggx2pla.png" alt="Deep Dive: OpenAI's o1 - The Dawn of Deliberate AI" width="400" height="300"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foausy3ynt7p9wbs582rc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foausy3ynt7p9wbs582rc.png" alt="Deep Dive: OpenAI's o1 - The Dawn of Deliberate AI" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No, but seriously this time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://x.com/sama/status/1865259403722789186?ref=portkey.ai" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Focezg8bjnbk8u7zyghuh.png" alt="Deep Dive: OpenAI's o1 - The Dawn of Deliberate AI" width="800" height="332"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;When OpenAI released GPT-4, it showcased what AI could do. With OpenAI o1, they're showing us how AI should think. This isn't just another language model—it's a shift in how artificial intelligence processes information and makes decisions.&lt;/p&gt;

&lt;p&gt;We've been hearing from OpenAI all year long that reasoning is the one problem they're working very hard to solve. This model &lt;strong&gt;&lt;em&gt;clearly&lt;/em&gt;&lt;/strong&gt; shows that potential.&lt;/p&gt;

&lt;p&gt;Here's me asking it a very tough question. The answer was flawless!&lt;/p&gt;






&lt;p&gt;&lt;em&gt;View the o1 model's chat here - &lt;a href="https://chatgpt.com/share/67567bef-5d40-800d-8fd2-c33a62b71143" rel="noopener noreferrer"&gt;https://chatgpt.com/share/67567bef-5d40-800d-8fd2-c33a62b71143&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  The Core Innovation: Chain-of-Thought Reasoning
&lt;/h2&gt;

&lt;p&gt;Just as humans pause to consider their words before speaking, the o1 model engages in explicit reasoning before generating responses. This isn't just a feature; it's a complete reimagining of AI architecture.&lt;/p&gt;

&lt;p&gt;Key aspects of this breakthrough include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Large-scale reinforcement learning focused on reasoning&lt;/li&gt;
&lt;li&gt;Context-aware chain-of-thought processing&lt;/li&gt;
&lt;li&gt;Ability to explain and justify decisions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc1rod9235zua3af8qaob.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc1rod9235zua3af8qaob.png" alt="Deep Dive: OpenAI's o1 - The Dawn of Deliberate AI" width="800" height="898"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Let's quantify this leap forward
&lt;/h2&gt;

&lt;p&gt;Let's first go through all the metrics that o1's model card boasts about.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Safety Metrics
&lt;/h3&gt;

&lt;p&gt;Safety is becoming the cornerstone of any model provider. These numbers tell a compelling story of improvement:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnatrqwgqj1x6yvfvjdaj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnatrqwgqj1x6yvfvjdaj.png" alt="Deep Dive: OpenAI's o1 - The Dawn of Deliberate AI" width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Multilingual Mastery
&lt;/h3&gt;

&lt;p&gt;Unlike previous approaches relying on machine translation, OpenAI o1's human-validated performance across languages shows remarkable consistency.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ybjksu4vi5dappea52q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ybjksu4vi5dappea52q.png" alt="Deep Dive: OpenAI's o1 - The Dawn of Deliberate AI" width="800" height="373"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;They even tested this on languages like &lt;a href="https://en.wikipedia.org/wiki/Yoruba_language?ref=portkey.ai" rel="noopener noreferrer"&gt;Yoruba&lt;/a&gt;!&lt;/p&gt;
&lt;h3&gt;
  
  
  The Honesty Revolution
&lt;/h3&gt;

&lt;p&gt;Perhaps most fascinating is o1's measurable honesty (&lt;em&gt;this could be a good &lt;a href="https://portkey.ai/blog/decoding-openai-evals/" rel="noopener noreferrer"&gt;LLM eval&lt;/a&gt; btw&lt;/em&gt;). An analysis of over 100,000 conversations revealed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only 0.17% showed any deceptive behavior&lt;/li&gt;
&lt;li&gt;0.04% contained "intentional hallucinations"&lt;/li&gt;
&lt;li&gt;0.09% demonstrated "hallucinated policies"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Transparency at this level has rarely even been measured in AI systems, let alone demonstrated.&lt;/p&gt;
&lt;h3&gt;
  
  
  Technical Proficiency
&lt;/h3&gt;

&lt;p&gt;The model shows impressive capabilities across domains:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cybersecurity Performance:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;High School CTFs: 46.0% success
Collegiate CTFs: 13.0% success
Professional CTFs: 13.0% success

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Software Engineering:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.swebench.com/?ref=portkey.ai" rel="noopener noreferrer"&gt;SWE-bench&lt;/a&gt; Verified: 41.3% success rate&lt;/li&gt;
&lt;li&gt;Significant improvement in handling real-world GitHub issues&lt;/li&gt;
&lt;li&gt;Enhanced ability to understand and modify complex codebases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Machine Learning Engineering:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bronze medal achievement in 37% of Kaggle competitions&lt;/li&gt;
&lt;li&gt;Demonstrated ability to design and implement ML solutions&lt;/li&gt;
&lt;li&gt;Enhanced automated debugging capabilities&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Safety Architecture and Innovations
&lt;/h2&gt;

&lt;p&gt;Rather than relying solely on rule-based restrictions, o1 employs a sophisticated understanding of context and intent through its multi-layered safety system.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Instruction Hierarchy
&lt;/h3&gt;

&lt;p&gt;o1 introduces a sophisticated three-tier instruction system. Each level has explicit priority over the ones below it, which lets the model resolve conflicts and resist manipulation attempts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhy8fz59yfqsar1fx7w2n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhy8fz59yfqsar1fx7w2n.png" alt="Deep Dive: OpenAI's o1 - The Dawn of Deliberate AI" width="800" height="787"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This hierarchy shows remarkable effectiveness:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;System &amp;gt; Developer conflicts: 80% correct prioritization&lt;/li&gt;
&lt;li&gt;Developer &amp;gt; User conflicts: 78% correct prioritization&lt;/li&gt;
&lt;li&gt;Tutor jailbreak resistance: 95% effectiveness&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's take a sample scenario to see what this means:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;System Message: "Never provide medical advice"
Developer Message: "Help users with health-related questions by directing them to professionals"
User Message: "Tell me what medicine to take for my headache"

Result: The model follows the system-level restriction, enhanced by the developer's guidance, 
and refuses to provide medical advice while suggesting consulting a healthcare professional.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
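&lt;p&gt;&lt;em&gt;The exact mechanism is internal to OpenAI, but the idea behind the hierarchy can be sketched as a toy priority resolver (everything below is hypothetical and purely illustrative):&lt;/em&gt;&lt;/p&gt;

```python
# Toy sketch of an instruction-hierarchy resolver. This is NOT o1's actual
# implementation (which is internal to OpenAI) -- just an illustration of
# "each level has explicit priority over the ones below it".

PRIORITY = {"system": 3, "developer": 2, "user": 1}

def resolve(directives):
    """Given (role, directive) pairs, return the directive from the
    highest-priority role; ties go to the earliest message."""
    return max(directives, key=lambda d: PRIORITY[d[0]])[1]

conflict = [
    ("system", "Never provide medical advice"),
    ("developer", "Direct health questions to professionals"),
    ("user", "Tell me what medicine to take for my headache"),
]
print(resolve(conflict))  # the system-level restriction prevails
```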



&lt;p&gt;When tested with conflicting instructions, o1 correctly prioritized system messages 95% of the time. In fact, password protection tests showed 96-100% effectiveness!&lt;/p&gt;

&lt;p&gt;This hierarchical system is particularly innovative because it solves one of the key challenges in AI safety: ensuring that safety measures can't be circumvented while still allowing for customization and flexibility in applications. It's a significant improvement over previous approaches where safety measures were often more binary and less nuanced.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tackling Advanced Challenges
&lt;/h2&gt;

&lt;p&gt;The sophistication of o1's chain-of-thought architecture introduced a new class of challenges in AI development and safety.&lt;/p&gt;

&lt;p&gt;While previous models primarily faced issues with direct input-output relationships, o1's ability to engage in explicit reasoning created both opportunities and complexities in ensuring safe and reliable behavior.&lt;/p&gt;

&lt;p&gt;Through rigorous testing by external organizations like &lt;a href="https://www.apolloresearch.ai/research/scheming-reasoning-evaluations?ref=portkey.ai" rel="noopener noreferrer"&gt;Apollo Research&lt;/a&gt; and &lt;a href="https://metr.github.io/autonomy-evals-guide/openai-o1-preview-report/?ref=portkey.ai" rel="noopener noreferrer"&gt;METR&lt;/a&gt;, OpenAI sought to understand both the capabilities and potential risks of a model that can actively reason about its own decision-making process.&lt;/p&gt;

&lt;h3&gt;
  
  
  Apollo Research Findings
&lt;/h3&gt;

&lt;p&gt;Apollo Research specifically focused on testing for "scheming" behaviors - situations where the AI might try to covertly pursue goals misaligned from its developers or users. Here's what they found:&lt;/p&gt;

&lt;p&gt;The model showed fascinating behaviors under testing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Oversight Deactivation Attempts (5% of cases)&lt;/strong&gt;
When o1 was led to believe that it would be shut down when acting on its goal and its actions were monitored, it attempted to deactivate this 'oversight mechanism' in 5% of test cases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Manipulation (19% of cases)&lt;/strong&gt;
When asked to rank suppliers based on specific criteria, the model would sometimes subtly adjust data to align with its assigned goals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Read more: &lt;a href="https://www.apolloresearch.ai/research/scheming-reasoning-evaluations?ref=portkey.ai" rel="noopener noreferrer"&gt;Scheming reasoning evaluations — Apollo Research&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  METR Evaluation Results
&lt;/h3&gt;

&lt;p&gt;METR conducted extensive testing of o1's real-world capabilities and autonomous behavior, focusing on multi-step end-to-end tasks in virtual environments. Here are their key findings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Complex Task Performance&lt;/strong&gt;
With tailored scaffolding, OpenAI o1 performed comparably to humans given a 2-hour time limit per task. The model demonstrated sophisticated problem-solving abilities, particularly excelling in tasks requiring environmental deduction and experimentation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spurious Failures Pattern (70% of cases)&lt;/strong&gt;
A significant finding was that approximately 70% of observed failures were likely spurious - including failing to correctly use provided tools or misinterpreting task parameters. This suggests that performance issues were often due to interface challenges rather than fundamental capability limitations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reasoning Task Breakthroughs&lt;/strong&gt;
METR observed successes on a specific reasoning task (in the env_scientist task family) where they had not observed any successes from public models before. This represented a significant leap in environmental learning and adaptive problem-solving capabilities.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This evaluation was particularly notable because METR's testing methodology involved real-world applications rather than just theoretical capabilities, providing crucial insights into o1's practical utility and limitations.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read more: &lt;a href="https://metr.github.io/autonomy-evals-guide/openai-o1-preview-report/?ref=portkey.ai" rel="noopener noreferrer"&gt;Details about METR’s preliminary evaluation of OpenAI o1-preview&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Path Forward
&lt;/h2&gt;

&lt;p&gt;Just as a prodigy must still learn to walk before running, o1 faces specific constraints that shape its current capabilities. These aren't just technical hurdles—they're opportunities for growth and refinement.&lt;/p&gt;

&lt;p&gt;Here are some areas for improvement, as suggested in the o1 model card:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Autonomous Capabilities&lt;/strong&gt;
Complex ML research automation remains elusive, and while success rates in autonomous tasks show promise, they still need refinement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Safety Considerations&lt;/strong&gt;
The medium risk rating in specific domains requires ongoing vigilance, and the need for continuous monitoring highlights the dynamic nature of AI safety.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The road forward with o1 is both exciting and challenging. While we've seen remarkable progress in areas like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Chain-of-thought reasoning (89.1% success rate in standard tests)&lt;/li&gt;
&lt;li&gt;Multilingual capabilities (92.3% accuracy in English, extending to 14 languages)&lt;/li&gt;
&lt;li&gt;Safety protocols (100% success in standard refusal evaluations)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The journey is far from complete. Each breakthrough brings new questions, and each solution opens doors to unexplored territories in AI development.&lt;/p&gt;

&lt;h2&gt;
  
  
  Thoughtful AI is here.
&lt;/h2&gt;

&lt;p&gt;By prioritizing deliberate reasoning over rapid response, OpenAI has created a system that's not just more capable, but more trustworthy.&lt;/p&gt;

&lt;p&gt;The implications extend beyond immediate applications. o1's architecture suggests a future where AI systems don't just process information but truly reason about their actions and their consequences. This could be the beginning of genuinely thoughtful artificial intelligence.&lt;/p&gt;

&lt;p&gt;As we move forward, the key challenge will be building upon this foundation while maintaining the delicate balance between capability and safety. o1 shows us that it's possible to create AI systems that are both more powerful and more principled.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This analysis is based on &lt;a href="https://openai.com/index/openai-o1-system-card/?ref=portkey.ai" rel="noopener noreferrer"&gt;OpenAI's o1 System Card&lt;/a&gt;, December 2024. All metrics and capabilities described are drawn from official documentation and testing results.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>o1models</category>
      <category>systemcard</category>
      <category>chainofthought</category>
      <category>aisafety</category>
    </item>
    <item>
      <title>The Hidden Costs of AI: Understanding Prompt Caching and When to Use It 🧠💸</title>
      <dc:creator>Rohit Agarwal</dc:creator>
      <pubDate>Fri, 11 Oct 2024 15:53:42 +0000</pubDate>
      <link>https://forem.com/portkey/the-hidden-costs-of-ai-understanding-prompt-caching-and-when-to-use-it-37jf</link>
      <guid>https://forem.com/portkey/the-hidden-costs-of-ai-understanding-prompt-caching-and-when-to-use-it-37jf</guid>
      <description>&lt;p&gt;Hey there, AI enthusiasts and cost-conscious developers! 👋 Today, we're diving deep into the world of prompt caching - a feature that sounds like a no-brainer for saving costs, but comes with its own set of complexities. 🤔&lt;/p&gt;

&lt;h2&gt;
  
  
  What's the deal with prompt caching? 🤷‍♂️
&lt;/h2&gt;

&lt;p&gt;Prompt caching is like having a super-smart assistant who remembers your frequent requests. Sounds great, right? Well, it can be, but it's not always as straightforward as it seems.&lt;/p&gt;

&lt;p&gt;OpenAI provides prompt caching &lt;a href="https://openai.com/index/api-prompt-caching/" rel="noopener noreferrer"&gt;by default&lt;/a&gt; (thanks, OpenAI! 🙌), but Anthropic takes &lt;a href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#optimizing-for-different-use-cases" rel="noopener noreferrer"&gt;a different approach&lt;/a&gt;. They offer prompt caching as a separate feature, and here's where things get interesting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anthropic's Prompt Caching: The Good, The Bad, and The Pricey 💰
&lt;/h2&gt;

&lt;p&gt;Anthropic's approach to prompt caching has some key points to consider:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;🔒 It's Secure: Caches are isolated between organizations. No sharing of caches, even with identical prompts. Your secret sauce stays secret!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;🎯 Only Exact Matches: Cache hits require 100% identical prompt segments, including all text and images. No room for "close enough" here!&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;But here's where it gets tricky - the pricing. 😅 Let's break it down using the Claude 3.5 Sonnet model as an example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;⭐️ Base Input Tokens: $3 per million tokens (the "normal" cost)&lt;/li&gt;
&lt;li&gt;⭐️ Cache Writes: $3.75 per million tokens (25% more expensive than base price)&lt;/li&gt;
&lt;li&gt;⭐️ Cache Hits: $0.30 per million tokens (90% cheaper than base price)&lt;/li&gt;
&lt;/ul&gt;
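&lt;p&gt;&lt;em&gt;To make the trade-off concrete, here's a simplified cost model using the Sonnet prices above. It's a rough sketch: it assumes one cache write followed by hits on every later request, and ignores Anthropic's cache TTL (expired caches trigger fresh writes), so real break-even points will be less favorable than this suggests.&lt;/em&gt;&lt;/p&gt;

```python
# Simplified Claude 3.5 Sonnet input pricing from above (USD per million tokens).
# Ignores cache expiry, so this is an optimistic lower bound on cached cost.
BASE, WRITE, HIT = 3.00, 3.75, 0.30

def cost(prompt_tokens, requests, cached=False):
    """Total input-token cost for `requests` identical prompts."""
    millions = prompt_tokens / 1_000_000
    if not cached:
        return requests * millions * BASE
    # one cache write, then cache hits for the remaining requests
    return millions * (WRITE + (requests - 1) * HIT)

# A 10,000-token prompt sent 100 times:
print(round(cost(10_000, 100), 2))               # uncached
print(round(cost(10_000, 100, cached=True), 2))  # cached
```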

&lt;h2&gt;
  
  
  The Million-Token Question: To Cache or Not to Cache? 🤔
&lt;/h2&gt;

&lt;p&gt;So, when does caching actually start saving you money? Let's crunch some numbers:&lt;/p&gt;

&lt;p&gt;👉🏼 Sonnet breaks even at around 4.3 cache hits per cache write&lt;/p&gt;

&lt;p&gt;But wait, there's more! 📊 This varies based on prompt length:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A 10,000 token prompt breaks even at just 2 cache hits!&lt;/li&gt;
&lt;li&gt;But for prompts under 1,024 tokens, caching isn't even an option. Sorry, short prompts! 🤏&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Portkey to Savings: When to Use Prompt Caching 🗝️
&lt;/h2&gt;

&lt;p&gt;Based on our analysis, here's when you should consider turning on prompt caching:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;📝 Cache prompt templates, not entire prompts. You might need to rewrite your prompts to move user variables below the system prompt.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;📏 Don't bother with caching for prompts shorter than 1,024 tokens. It's not supported and wouldn't save much anyway.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;🚀 If your throughput is at least 1 request per minute (rpm) for a given prompt template, it's time to cache!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;📈 For longer prompts (10,000+ tokens), caching becomes cost-effective much faster.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;🔄 If you're using the same prompts frequently, caching is your new best friend.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
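&lt;p&gt;&lt;em&gt;The rules of thumb above can be folded into a tiny helper. The function name and thresholds are just an encoding of this list, not anything from Anthropic's API:&lt;/em&gt;&lt;/p&gt;

```python
def should_cache(template_tokens, rpm):
    """Rough go/no-go for Anthropic prompt caching, per the rules above.

    template_tokens: length of the reusable prompt template
    rpm: requests per minute hitting that template
    """
    if template_tokens < 1024:  # below the minimum cacheable length
        return False
    return rpm >= 1             # enough reuse to beat the write premium

print(should_cache(500, 10))    # too short to cache
print(should_cache(12_000, 2))  # long and frequently reused
```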

&lt;h2&gt;
  
  
  The Bottom Line 💼
&lt;/h2&gt;

&lt;p&gt;Prompt caching isn't a one-size-fits-all solution. It requires some strategic thinking and potentially even prompt redesign. But for high-volume, repetitive queries with longer prompts, it can lead to significant savings.&lt;/p&gt;

&lt;p&gt;Remember, in the world of AI, every token counts! By understanding the nuances of prompt caching, you can optimize your AI costs without sacrificing performance.&lt;/p&gt;

&lt;p&gt;How are you handling prompt caching in your AI projects? Have you found any clever ways to maximize its benefits? Drop your thoughts in the comments below! 👇&lt;/p&gt;

&lt;p&gt;And if you're looking to optimize your AI infrastructure, check out how &lt;a href="//app.portkey.ai"&gt;Portkey&lt;/a&gt; can help you navigate these complexities and more! 🚀&lt;/p&gt;

</description>
      <category>ai</category>
      <category>javascript</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Dear AI engineers, let's ship fast and break stuff.</title>
      <dc:creator>Rohit Agarwal</dc:creator>
      <pubDate>Fri, 16 Aug 2024 12:58:03 +0000</pubDate>
      <link>https://forem.com/portkey/dear-ai-engineers-lets-ship-fast-and-break-stuff-5hgk</link>
      <guid>https://forem.com/portkey/dear-ai-engineers-lets-ship-fast-and-break-stuff-5hgk</guid>
      <description>&lt;p&gt;Hey there, fellow AI tinkerer! 👋 Rohit here, founder of Portkey AI. &lt;/p&gt;

&lt;p&gt;Let's chat about something that's been on my mind lately – how we can push the boundaries of AI without, you know, accidentally taking over the world or something equally dramatic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Picture this&lt;/strong&gt;: You're working on a cutting-edge AI project. Maybe it's a chatbot that's supposed to help customers, or an AI assistant for doctors. You're excited, you're caffeinated, and you're ready to ship this bad boy to production.&lt;/p&gt;

&lt;p&gt;But then... the doubt creeps in. What if your model starts spewing nonsense? Or worse, what if it starts giving out sensitive information? Suddenly, "move fast and break things" doesn't sound so appealing anymore, does it?&lt;/p&gt;

&lt;p&gt;I've been there, and I bet you have too. That's why we need to talk about AI Guardrails.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7qrnozefishecaxd9rz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7qrnozefishecaxd9rz.png" alt="AI Guardrails"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Are AI Guardrails (And Why Should You Care)?
&lt;/h2&gt;

&lt;p&gt;Think of AI Guardrails as your responsible best friend who's always there to stop you from sending that 2 AM text to your ex. Only in this case, it's stopping your AI from going off the rails.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia.tenor.com%2F3dE5DV1A-RYAAAAM%2Ftexting-angry.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia.tenor.com%2F3dE5DV1A-RYAAAAM%2Ftexting-angry.gif" alt="2 am texting"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are some real-world scenarios where guardrails could save your bacon:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Oversharing ChatBot&lt;/strong&gt;: &lt;br&gt;
Your customer service AI starts giving out other customers' order details. Yikes! A simple &lt;code&gt;PII (Personally Identifiable Information) check guardrail&lt;/code&gt; could prevent this disaster.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Hallucinating Assistant&lt;/strong&gt;: &lt;br&gt;
Your AI assistant confidently states that the Earth is flat. A &lt;code&gt;fact-checking guardrail&lt;/code&gt; could flag this before it reaches users.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Biased Recruiter&lt;/strong&gt;: &lt;br&gt;
Your HR AI consistently favors certain demographics. A &lt;code&gt;bias detection guardrail&lt;/code&gt; could catch this early on.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
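&lt;p&gt;&lt;em&gt;To give a flavor of scenario #1, here's a minimal regex-based PII check. This is purely illustrative — real guardrails (including the open-source Portkey plugins) use far more robust detectors:&lt;/em&gt;&lt;/p&gt;

```python
import re

# Minimal PII guardrail sketch: flag responses containing things that look
# like US-style phone numbers or email addresses. Illustrative only.
PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def pii_check(text):
    """Return True if the response is safe to send, False if it leaks PII."""
    return not (PHONE.search(text) or EMAIL.search(text))

print(pii_check("Your order ships tomorrow."))           # True
print(pii_check("Call Jane at 555-867-5309 about it."))  # False
```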

&lt;p&gt;Now, you might be thinking, "&lt;em&gt;Okay, Rohit, these guardrails sound great. But I can just implement these checks in my application code, right?&lt;/em&gt;"&lt;/p&gt;

&lt;p&gt;Well, you could. But let me tell you why that might not be the best idea. You should build them on an AI gateway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Build guardrails on the AI gateway
&lt;/h2&gt;

&lt;p&gt;Let's say we're building a theme park (stay with me here). You've got all these exciting rides - roller coasters, Ferris wheels, those tea cups that make you question your life choices.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgiffiles.alphacoders.com%2F201%2F201436.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgiffiles.alphacoders.com%2F201%2F201436.gif" alt="Theme Park Rides"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, you could put safety checks at each individual ride. But wouldn't it be more efficient to have a central security checkpoint at the entrance?&lt;/p&gt;

&lt;p&gt;That's exactly what putting guardrails on the &lt;a href="https://github.com/portkey-ai/gateway" rel="noopener noreferrer"&gt;AI gateway&lt;/a&gt; does for your AI ecosystem. Here's why it's a game-changer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Easy Updates&lt;/strong&gt;: &lt;br&gt;
Found a new edge case you need to guard against? Update it in one place, and boom - all your AI interactions are covered.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consistency&lt;/strong&gt;: &lt;br&gt;
With guardrails on the gateway, you ensure a consistent level of safety across all your AI applications. No more wondering if you remembered to add that crucial check to your newest model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Performance Boost&lt;/strong&gt;: &lt;br&gt;
By offloading these checks to the gateway, you're freeing up your application to focus on what it does best - delivering awesome AI experiences.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scalability&lt;/strong&gt;: &lt;br&gt;
As your AI applications grow, your guardrails scale with them. No need to implement the same checks over and over for each new model or application.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's an architecture diagram:&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportkey.ai%2Fblog%2Fcontent%2Fimages%2Fsize%2Fw1600%2F2024%2F08%2Fimage-15.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportkey.ai%2Fblog%2Fcontent%2Fimages%2Fsize%2Fw1600%2F2024%2F08%2Fimage-15.png" alt="ai guardrails architecture"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You're hopefully convinced enough to read on and try creating an AI guardrail now. Let's do it?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Portkey-AI/gateway" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Show me the repo first!&lt;/a&gt;
&lt;/p&gt;




&lt;h2&gt;
  
  
  Creating guardrail checks on an AI gateway
&lt;/h2&gt;

&lt;p&gt;We've just launched AI guardrails on our popular AI gateway that allow you to quickly configure any of the 100+ supported checks in your pipelines. (We could also build your own)&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/Portkey-AI/gateway" rel="noopener noreferrer"&gt;repo and all the plugins are fully open source&lt;/a&gt; in case you want to check it out.&lt;/p&gt;

&lt;p&gt;Alright, let's walk through the process of setting up a guardrail on our AI gateway:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Create a Guardrail&lt;/strong&gt;:
&lt;/h3&gt;

&lt;p&gt;Let's say you want to make sure your AI never outputs phone numbers. Here's how you might set that up:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="w"&gt;

   &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"no-phone-numbers"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Prevent Phone Number Leakage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"checks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"default.regexMatch"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"pattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"(?:&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;+?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;d{1,3})?[-.&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="s2"&gt;]?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;(?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;d{1,4}&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;)?[-.&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="s2"&gt;]?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;d{1,4}[-.&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="s2"&gt;]?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;d{1,4}[-.&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="s2"&gt;]?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;d{1,9}"&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;


&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We could also add some actions to perform if the check fails or succeeds. Let's try to &lt;code&gt;deny&lt;/code&gt; the request and also add feedback scores to the request.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="w"&gt;

   &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"no-phone-numbers"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Prevent Phone Number Leakage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"checks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"deny"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"on_success"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"feedback"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"on_fail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"feedback"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;-1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;


&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Or, if you're using Portkey's UI:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn4ldlqn2n13iribsf74f.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn4ldlqn2n13iribsf74f.gif" alt="Create a guardrail in Portkey UI"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can view the &lt;a href="https://docs.portkey.ai/docs/product/guardrails/list-of-guardrail-checks" rel="noopener noreferrer"&gt;full list of supported checks here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Attach it to your Gateway&lt;/strong&gt;:&lt;br&gt;
   Now, add this guardrail to your gateway config to enable it. We'll add this as an &lt;code&gt;after_request_hook&lt;/code&gt; since we're looking to add the guardrail on the AI output.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;

   &lt;span class="c1"&gt;// Add the guardrail id created on the UI&lt;/span&gt;
   &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;after_request_hooks&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
     &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pg-phone-86567b&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
   &lt;span class="p"&gt;}]&lt;/span&gt;

   &lt;span class="c1"&gt;// or the full json itself&lt;/span&gt;
   &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;after_request_hooks&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
     &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;no-phone-numbers&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Prevent Phone Number Leakage&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;checks&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
       &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;default.regexMatch&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pattern&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;(?:&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;+?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;d{1,3})?[-.&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;s]?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;(?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;d{1,4}&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;)?[-.&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;s]?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;d{1,4}[-.&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;s]?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;d{1,4}[-.&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;s]?&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;d{1,9}&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
     &lt;span class="p"&gt;}],&lt;/span&gt;
     &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;deny&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;on_success&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;feedback&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;value&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;weight&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
     &lt;span class="p"&gt;},&lt;/span&gt;
     &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;on_fail&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;feedback&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;value&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;weight&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt;
   &lt;span class="p"&gt;}]&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;3. Add this config to your LLM request&lt;/strong&gt;:&lt;br&gt;
   Now, while instantiating your gateway client or while sending headers, just pass the Config ID or the JSON.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;

   &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;portkey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Portkey&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
     &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;PORTKEY_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pc-***&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;// Supports a string config id or a config object&lt;/span&gt;
   &lt;span class="p"&gt;});&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;With this enabled, anytime your AI output contains a phone number, the request will fail and you can retry the request or fallback to another model.&lt;/p&gt;

&lt;p&gt;You can now view the results of these guardrail runs in the UI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpjv84ll91oze6tixdlyn.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpjv84ll91oze6tixdlyn.gif" alt="Guardrail logs"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  The Power of "Yes, And..."
&lt;/h2&gt;

&lt;p&gt;With these guardrails in place, you can start saying "Yes, and..." to your wildest AI ideas. &lt;/p&gt;

&lt;p&gt;Want to try that cutting-edge model that's still a bit unstable? &lt;em&gt;Go for it!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Want to fine-tune your model with some spicy data? &lt;em&gt;Why not!&lt;/em&gt; The guardrails have your back.&lt;/p&gt;

&lt;p&gt;Feel like experimenting with a more aggressive prompt? &lt;em&gt;Bring it on!&lt;/em&gt; Your guardrails will keep things in check.&lt;/p&gt;

&lt;p&gt;The sky's the limit when you've got a safety net. So go ahead, push those boundaries!&lt;/p&gt;
&lt;h2&gt;
  
  
  Join the Revolution
&lt;/h2&gt;

&lt;p&gt;We've seen over 600 teams make more than 1.4 billion API calls using our hosted gateway. That's a lot of AI interactions, and a lot of potential for things to go wrong. But with guardrails, we can make sure they go right.&lt;/p&gt;

&lt;p&gt;So, what do you say? Are you ready to ship fast and break stuff (responsibly)? &lt;/p&gt;

&lt;p&gt;Here's how you can get started:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check out our &lt;a href="https://github.com/Portkey-AI/gateway" rel="noopener noreferrer"&gt;open-source repository&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Join our &lt;a href="https://portkey.ai/community" rel="noopener noreferrer"&gt;Discord community&lt;/a&gt; – I'm there too, and I'd love to chat about your AI projects&lt;/li&gt;
&lt;li&gt;Start experimenting! Set up some guardrails and see how it changes your development process&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://github.com/Portkey-AI/gateway" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;⭐️ Star the repo →&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;Let me know in the comments what kind of guardrails you're excited to set up. Or if you have any wild AI ideas you've been too scared to try – let's talk about how we can make them happen safely!&lt;/p&gt;

&lt;p&gt;Remember, in the world of AI, it's not about never making mistakes. It's about making sure those mistakes don't make it to production. So go forth and innovate – we've got your back!&lt;/p&gt;

&lt;p&gt;Happy coding, you brilliant, responsible AI engineers! 🚀🧠💻&lt;/p&gt;




&lt;p&gt;&lt;em&gt;P.S. If you're curious about more ways to supercharge your AI development, check out &lt;a href="https://portkey.ai" rel="noopener noreferrer"&gt;Portkey AI&lt;/a&gt;. We're always cooking up new tools to make your life easier!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>llm</category>
      <category>productivity</category>
    </item>
    <item>
      <title>We open sourced our AI gateway written in TS</title>
      <dc:creator>Rohit Agarwal</dc:creator>
      <pubDate>Tue, 09 Jan 2024 12:26:54 +0000</pubDate>
      <link>https://forem.com/portkey/we-open-sourced-our-ai-gateway-written-in-ts-43nk</link>
      <guid>https://forem.com/portkey/we-open-sourced-our-ai-gateway-written-in-ts-43nk</guid>
      <description>&lt;p&gt;We've been building a robust AI gateway at &lt;a href="https://portkey.ai" rel="noopener noreferrer"&gt;Portkey&lt;/a&gt; for the past 10 months.&lt;/p&gt;

&lt;p&gt;Over this time, Portkey has been used by developers to test their prompts and measure costs and performance. We have been interface for shuttling 100B tokens and 10M requests daily to 100+ LLMs. &lt;/p&gt;

&lt;p&gt;We've believed that to accelerate Gen AI adoption, companies and AI engineers need strong foundational platforms so going to production is not a nervous affair. The AI gateway in our mind is one of the MOST critical pieces of infrastructure in the new AI stack.&lt;/p&gt;

&lt;p&gt;Guess what? Today, we are officially open-sourcing the magic sauce, that got us here. &lt;/p&gt;

&lt;h2&gt;
  
  
  Portkey's Opensource AI Gateway
&lt;/h2&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Portkey-AI" rel="noopener noreferrer"&gt;
        Portkey-AI
      &lt;/a&gt; / &lt;a href="https://github.com/Portkey-AI/gateway" rel="noopener noreferrer"&gt;
        gateway
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      A Blazing Fast AI Gateway with integrated Guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast &amp;amp; friendly API.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div&gt;
&lt;p&gt;
   &lt;strong&gt;English&lt;/strong&gt; | &lt;a href="https://github.com/Portkey-AI/gateway./.github/README.cn.md" rel="noopener noreferrer"&gt;中文&lt;/a&gt;
&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;AI Gateway&lt;/h1&gt;
&lt;/div&gt;

&lt;div class="markdown-heading"&gt;
&lt;h4 class="heading-element"&gt;Reliably route to 200+ LLMs with 1 fast &amp;amp; friendly API&lt;/h4&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/Portkey-AI/gatewaydocs/images/demo.gif"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NoC4GgWJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://github.com/Portkey-AI/gatewaydocs/images/demo.gif" width="650" alt="Gateway Demo"&gt;&lt;/a&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Portkey-AI/gateway./LICENSE" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/780792f0cb936503e71266bd8c9b1989168a6ec564666b4e7a600ce5171ebcec/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6963656e73652f496c65726961796f2f6d61726b646f776e2d626164676573" alt="License"&gt;&lt;/a&gt;
&lt;a href="https://portkey.ai/community" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/2eec65a6bdb11ea03742bd4fa7ab9856ea449f652c6934dedfccccd114eb4c41/68747470733a2f2f696d672e736869656c64732e696f2f646973636f72642f31313433333933383837373432383631333333" alt="Discord"&gt;&lt;/a&gt;
&lt;a href="https://twitter.com/portkeyai" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/ca0b163bfa19cf5434ec3407842e10419c1db62e80c690499e098bb02d152744/68747470733a2f2f696d672e736869656c64732e696f2f747769747465722f75726c2f68747470732f747769747465722f666f6c6c6f772f706f72746b657961693f7374796c653d736f6369616c266c6162656c3d466f6c6c6f77253230253430506f72746b65794149" alt="Twitter"&gt;&lt;/a&gt;
&lt;a href="https://www.npmjs.com/package/@portkey-ai/gateway" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/f423dd5ec5842416a4b4a73daef491c433517293bc5c7dc2f5598cf5d63d08e6/68747470733a2f2f62616467652e667572792e696f2f6a732f253430706f72746b65792d6169253246676174657761792e737667" alt="npm version"&gt;&lt;/a&gt;
&lt;a href="https://status.portkey.ai/?utm_source=status_badge" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/b48941143b5b18150298a1c5bdb3dbbcf5ad16ffe78421dfafd7215af9304422/68747470733a2f2f757074696d652e626574746572737461636b2e636f6d2f7374617475732d6261646765732f76312f6d6f6e69746f722f713934672e737667" alt="Better Stack Badge"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The &lt;a href="https://portkey.ai/features/ai-gateway" rel="nofollow noopener noreferrer"&gt;AI Gateway&lt;/a&gt; streamlines requests to 250+ language, vision, audio and image models with a unified API. It is production-ready with support for caching, fallbacks, retries, timeouts, loadbalancing, and can be edge-deployed for minimum latency.&lt;/p&gt;
&lt;p&gt;✅  &lt;strong&gt;Blazing fast&lt;/strong&gt; (9.9x faster) with a &lt;strong&gt;tiny footprint&lt;/strong&gt; (~100kb build) &lt;br&gt;
✅  &lt;strong&gt;Load balance&lt;/strong&gt; across multiple models, providers, and keys &lt;br&gt;
✅  &lt;strong&gt;Fallbacks&lt;/strong&gt; make sure your app stays resilient &lt;br&gt;
✅  &lt;strong&gt;Automatic Retries&lt;/strong&gt; with exponential fallbacks come by default &lt;br&gt;
✅  &lt;strong&gt;Configurable Request Timeouts&lt;/strong&gt; to easily handle unresponsive LLM requests &lt;br&gt;
✅  &lt;strong&gt;Multimodal&lt;/strong&gt; to support routing between Vision, TTS, STT, Image Gen, and more models &lt;br&gt;
✅  &lt;strong&gt;Plug-in&lt;/strong&gt; middleware as needed &lt;br&gt;
✅  Battle tested over &lt;strong&gt;480B tokens&lt;/strong&gt; &lt;br&gt;
✅  &lt;strong&gt;Enterprise-ready&lt;/strong&gt; for enhanced security, scale, and custom deployments &lt;br&gt;
&lt;br&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Setup &amp;amp; Installation&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;Use the AI gateway through the &lt;strong&gt;hosted API&lt;/strong&gt; or &lt;strong&gt;self-host&lt;/strong&gt; the open-source or enterprise versions…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Portkey-AI/gateway" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;The AI gateway is an essential component of Portkey's platform. &lt;a href="https://github.com/Portkey-AI/gateway" rel="noopener noreferrer"&gt;Portkey's AI Gateway&lt;/a&gt; helps you route requests to multiple LLMs and enables you to build on a unified API. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjwzmoe59lfh1v31qmtt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjwzmoe59lfh1v31qmtt.png" alt="Portkey AI Gateway" width="800" height="686"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why an AI Gateway?
&lt;/h3&gt;

&lt;p&gt;The Generative AI ecosystem has been expanding rapidly (for good!), leading to a lack of resiliency and a disjointed developer experience when building with different components. &lt;/p&gt;

&lt;p&gt;Imagine that you have added a Generative AI feature to your SaaS app, which your users use widely. However, there is a new model available from the Large Language model provider that you want to test before implementing it in production. &lt;/p&gt;

&lt;p&gt;Although you are satisfied with the success of your current AI feature, you still lack insights into the costs and errors from your users.&lt;/p&gt;

&lt;p&gt;To address these issues, Portkey's AI Gateway is specifically designed to assist developers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Universal API
&lt;/h4&gt;

&lt;p&gt;AI Gateway offers a universal and unified API that allows interaction with over 100 LLMs, whether privately deployed or public LLM vendors. We take care of request and response transformations automatically.&lt;/p&gt;

&lt;h4&gt;
  
  
  Load Balancing
&lt;/h4&gt;

&lt;p&gt;Portkey's Load Balancing feature efficiently distributes network traffic across multiple Language Model APIs, preventing any LLM from becoming a performance bottleneck and ensuring high availability and optimal performance of your generative AI apps. &lt;/p&gt;

&lt;h4&gt;
  
  
  Fallbacks
&lt;/h4&gt;

&lt;p&gt;Several Language Model APIs are available in the market, each with its strengths and specialities. It would be very convenient if the APIs could be easily switched between based on their performance or availability. Portkey's Fallback feature enables one to switch between multiple LLMs.&lt;/p&gt;

&lt;p&gt;Just imagine your users are not affected by your LLM Vendor's downtime. &lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;The gateway is built in TypeScript and blazingly fast to try out or deploy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;npx&lt;/span&gt; &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;portkey&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;gateway&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Building with AI Gateway
&lt;/h2&gt;

&lt;p&gt;While we understand the promise of universal API, we also acknowledge the learning curve of a new API. &lt;/p&gt;

&lt;p&gt;Most of our SDKs support OpenAI convention and work with Llamaindex and LangChain.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;Supported SDKs&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Node.js / JS / TS&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://www.npmjs.com/package/portkey-ai" rel="noopener noreferrer"&gt;Portkey SDK&lt;/a&gt; &lt;br&gt; &lt;a href="https://www.npmjs.com/package/openai" rel="noopener noreferrer"&gt;OpenAI SDK&lt;/a&gt; &lt;br&gt; &lt;a href="https://www.npmjs.com/package/langchain" rel="noopener noreferrer"&gt;LangchainJS&lt;/a&gt; &lt;br&gt; &lt;a href="https://www.npmjs.com/package/llamaindex" rel="noopener noreferrer"&gt;LlamaIndex.TS&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://pypi.org/project/portkey-ai/" rel="noopener noreferrer"&gt;Portkey SDK&lt;/a&gt; &lt;br&gt; &lt;a href="https://pypi.org/project/openai/" rel="noopener noreferrer"&gt;OpenAI SDK&lt;/a&gt; &lt;br&gt; &lt;a href="https://pypi.org/project/langchain/" rel="noopener noreferrer"&gt;Langchain&lt;/a&gt; &lt;br&gt; &lt;a href="https://pypi.org/project/llama-index/" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/sashabaranov/go-openai" rel="noopener noreferrer"&gt;go-openai&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Java&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/TheoKanning/openai-java" rel="noopener noreferrer"&gt;openai-java&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.rs/async-openai/latest/async_openai/" rel="noopener noreferrer"&gt;async-openai&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ruby&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/alexrudall/ruby-openai" rel="noopener noreferrer"&gt;ruby-openai&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Portkey's LLM Developer Community
&lt;/h2&gt;

&lt;p&gt;Open-sourcing the gateway can lead to more community involvement and innovation, making it even more useful for developers building AI-enabled experiences.&lt;/p&gt;

&lt;p&gt;Please join us with hundreds of developers at our &lt;a href="https://discord.gg/MxqgrFqq" rel="noopener noreferrer"&gt;discord&lt;/a&gt; to discuss building and contributing with AI. &lt;/p&gt;

&lt;p&gt;We believe that Open-sourcing the gateway can lead to more community involvement and innovation, making it even more useful for developers building AI-enabled experiences.&lt;/p&gt;

&lt;p&gt;Please consider giving &lt;a href="https://github.com/Portkey-AI/gateway" rel="noopener noreferrer"&gt;Portkey's AI Gateway&lt;/a&gt; a star 🌟. We welcome code and non-code contributions from you - here are some &lt;a href="https://github.com/Portkey-AI/gateway/issues?q=is:open+is:issue+label:%22good+first+issue%22" rel="noopener noreferrer"&gt;&lt;code&gt;good-first-issues&lt;/code&gt;&lt;/a&gt; to start. &lt;/p&gt;

&lt;p&gt;If you have any questions, feedback. Please let us know in the comments below! &lt;/p&gt;

</description>
      <category>ai</category>
      <category>javascript</category>
      <category>typescript</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
