Implications of OpenAI’s Reality-TV Moment
What OpenAI's incredible few days might mean for people building with LLMs
The past four days have been a rollercoaster for the OpenAI team and ecosystem. Sam Altman was fired by the OpenAI board for seemingly banal reasons, Greg Brockman resigned, then Sam returned for negotiations and... joined Microsoft? Phew. And now, it seems, Sam is back in the seat! Hooray.
As we continue to depend on LLMs (and in particular OpenAI's models) as our application intelligence layer, the stakes continue to rise. If you're reading this, AI is likely a top priority in your business, and you may well be feeling uneasy about what all of this means.
Let’s talk about some of the implications and takeaways stemming from the past few days.
Compute capacity is the oil of AI
The raw material of AI is compute, and a lot of it is needed. Anybody who has attempted to scale GPT-4 knows that lighting up racks of GPUs costs a lot of money and capacity is tight. In a normal situation, where an exiting CEO has the backing of the majority of the team, the answer would be a clean cap table and a fresh start. But without Microsoft and their $50b annual investment in compute, there is no GPT. Sam knows this, Satya knows this, and so, no matter how these cards fall, Microsoft will win.
For us, this means compute capacity planning and management is key. As we think about choosing model compute providers (and choice is increasing), it's important to think deeply about the policies, governance, and support escalation processes of different providers. Here are some areas that anyone building in production with LLMs should consider:
- Capacity planning: If you have a tool or service that begins to get traction, do you have a path to increased capacity quickly? Build a relationship with your LLM compute provider to ensure limits can be raised when you need them.
- Fallback providers: Consider redundancy plans in case of unexpected service interruption. At AirOps, we recommend having backup workflows that use alternative models from Anthropic in the case of an OpenAI service interruption.
- Model use hygiene matters: Keep an eye on usage and vectors for potential abuse. Being a responsible steward of your service will help you avoid interruption. There are rumors of widespread GPT-4 abuse by providers looking to train off its outputs, as well as complex misuse scenarios involving the new vision and DALL·E models.
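The fallback-provider pattern above can be sketched in a few lines. This is a minimal illustration, not our production implementation: the provider names and call functions are hypothetical stand-ins (in practice each would wrap the relevant vendor SDK), and real code would catch provider-specific error types rather than bare `Exception`.

```python
import logging

def complete_with_fallback(prompt, providers):
    """Try each (name, call_fn) provider in order; return the first success.

    `providers` is a list of (name, fn) pairs where fn(prompt) -> str and
    raises on failure (timeouts, rate limits, outages).
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # production code: catch provider-specific errors
            logging.warning("provider %s failed: %s", name, exc)
            errors.append((name, str(exc)))
    raise RuntimeError(f"all providers failed: {errors}")

# Hypothetical provider callables -- stand-ins for thin wrappers around
# the OpenAI and Anthropic SDKs.
def call_gpt4(prompt):
    raise TimeoutError("simulated OpenAI outage")

def call_claude2(prompt):
    return "Claude's answer to: " + prompt

provider, answer = complete_with_fallback(
    "Summarize this support ticket.",
    [("openai:gpt-4", call_gpt4), ("anthropic:claude-2", call_claude2)],
)
```

The key design choice is that the fallback list is ordered and data-driven, so swapping the primary provider (or adding a third) is a config change, not a code change.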
OAI application layer = a big unknown
It seems there are three OpenAIs. There is the deep, frontier-research OpenAI on a quest for AGI; the developer APIs aimed at commercializing the latest, safest, and greatest; and finally the nascent (and somewhat confusing) application layer in the form of ChatGPT and now GPTs. There is some indication this third pillar was a contributor to the internal rift.
The frontier research part of OpenAI is the most aligned with its mission. This is absolutely core to the founding objective of the company; no surprises there. The developer APIs are next, supportive of the mission by progressively giving AI models to society to inform alignment work. This business is approaching a $1b ARR run rate and is now a real one for the company, one that critics would argue is as much about revenue as it is about informing alignment. For the avoidance of doubt, and as a major beneficiary of the scaling up of model availability, I believe progressive and scaled availability of these models helps all of us build and learn what's possible. Alignment is helped massively by having AI systems scale in society in tandem with model capabilities.
The final piece is the application layer. ChatGPT was a significant milestone in AI. It did its job by giving the world its "aha" moment and kick-starting the revolution. Next came plugins, an interesting, though largely unsuccessful, attempt to expand ChatGPT into new domains. Then came the announcement of GPTs and the GPT store. This felt, to many, off-mission. Why would OpenAI, a company focused on creating super-intelligence, attempt to absorb all these low-impact use cases into ChatGPT?
At first glance, GPTs seemed underpowered and off-mission. In one keynote, the team announced custom training models for the most important and demanding use cases, while also introducing flavored GPTs that could tell jokes, order stickers, and occasionally fire a Zap. It felt like a distraction or a rushed attempt to compete with companies like Character.ai. Although GPTs could potentially replace a larger portion of the application layer in the future, the question arises: why bother if all roads ultimately lead to OpenAI's LLMs? Given indications that the rift was partially caused by the accelerated and perceived "rushed" development of the store, there are significant uncertainties surrounding the GPT roadmap.
A counterargument would be that GPTs can grow in sophistication to capture more and more of the application layer. But despite the magic of LLMs and the future promise of ephemeral UI and long-term memory, this seems like a very hard hill to climb in the next 2-3 years.
For people like us building in the AI space, GPTs and the GPT store are an interesting surface to watch. There are some interesting things happening, like ChatPRD, a Chief Product Officer chatbot that's actually pretty great, but I wonder about the longevity of usage for any of these. For the next 6-12 months, the best outcome for AI builders is that GPTs become a distribution channel for deeper services.
We need alternatives to GPT-4, at scale
AWS, Google, and Anthropic need to step it up, and quickly. There have been so many keynotes, fancy demos, papers, and marketing videos, but we need a real alternative to GPT-4 that is generally available at scale. Simpler LLM use cases can transition off GPT-3.5 to more efficient, fine-tuned smaller-parameter models like Llama 2, but at the top end the majority of companies are still relying on GPT-4.
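That tiering, frontier model for the hard cases, fine-tuned small models for narrow high-volume work, a cheap general model for the rest, can be expressed as a simple router. The thresholds and model names below are illustrative assumptions for the sketch, not benchmarks or recommendations:

```python
def pick_model(task):
    """Route a task to the cheapest model tier that can plausibly handle it.

    `task` is a dict of rough signals; thresholds here are illustrative.
    """
    if task.get("needs_reasoning") or task.get("context_tokens", 0) > 8000:
        return "gpt-4"            # frontier tier: complex reasoning, long context
    if task.get("fine_tuned_fit"):
        return "llama-2-13b-ft"   # hypothetical fine-tune for a narrow, high-volume task
    return "gpt-3.5-turbo"        # general default for simple prompts

# Examples of the routing decision:
pick_model({"needs_reasoning": True})   # frontier tier
pick_model({"fine_tuned_fit": True})    # fine-tuned small model
pick_model({})                          # cheap default
```

The point is less the specific rules than that the decision lives in one place, so as real GPT-4 alternatives arrive, re-routing traffic is a one-line change.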
The closest we have found is Claude 2, which is a comparable model in some use cases but suffers much more from hallucination. It's not just us who feel this overweight reliance on one provider; the Retool State of AI report agrees.
Update: Claude 2.1 was just announced with a 200k context window and reduced hallucinations... perfect timing.
OpenAI’s governance model doesn't work
"Show me the incentive, I'll show you the outcome." The trouble is, what are the incentives in a company with two factions that are arguably at odds? The mission driven non-profit and capped profit subsidiary seem to have created factional conflict. Others like Ben Thompson have done a fantastic job of talking about this.
Suffice it to say, conflicting incentives and opaque board decision-making undermine confidence in the foundational technology of this decade's primary platform shift. How can companies invest, build, raise, and scale in a world where there isn't total alignment (the human kind)?
What we're doing at AirOps
AirOps is being built to give customers choice and flexibility in what is going to be a rapidly evolving market. We will continue to deepen our relationship with OpenAI, Anthropic and emerging compute providers to ensure our customers can take advantage of the best models for their use case. Beyond that, we are continuing to plan our roadmap to equip our customers with tooling to confidently build and deploy meaningful solutions for their customers. That means more visibility, redundancy and tooling to reduce risk and improve performance. More to come on that soon. AI is going to change business and eventually society and along the way there are going to be a lot of other "shocks". We don't know what they will be, but we will be prepared.
Watch this space, I know you will…
This past week has highlighted the fragility of relying on a single provider for critical AI infrastructure. As CEOs, CTOs, and developers, it's crucial not only to monitor these developments but also to actively seek alternatives and make clear redundancy plans. We will continue to support our customers in adapting to the changing LLM landscape.
OpenAI's situation is a poignant reminder of the need for clear, consistent mission alignment within AI organizations. The apparent conflict between OpenAI's non-profit mission and its commercial aspirations reveals a critical challenge in the AI industry: balancing the pursuit of advanced AI research with the practicalities of running a successful business. As builders in the AI space, we should advocate for transparency and ethical governance in AI companies, ensuring that the technologies we depend on are developed and managed in a way that aligns with broader societal values and the needs of the AI community. The future of AI is not just in the hands of a few companies but in a collaborative ecosystem where diverse voices and perspectives shape the trajectory of this transformative technology.