OpenAI B2B Signals: Frontier Firms Use 3.5x More AI

OpenAI's first B2B Signals report finds frontier firms use 3.5x more AI per worker than typical firms. What B2B teams should learn from the data.

Direct answer – What did OpenAI’s B2B Signals report find about frontier firms?

OpenAI’s B2B Signals report is a recurring research initiative built on aggregated, privacy-preserving usage data from ChatGPT Enterprise, ChatGPT Team, the API, and Codex. Its first edition, published May 2026, found that frontier firms, the top 5% of enterprises by intelligence-per-worker, now consume 3.5x more AI per worker than typical firms, up from 2x a year ago. Message volume explains only 36% of that gap; the remaining 64% is depth: richer prompts and agent-delegated work.

OpenAI published its first B2B Signals report this week, a recurring research initiative built on privacy-preserving, aggregated usage data from enterprise OpenAI products. The headline finding: organizations at the 95th percentile of usage, what OpenAI calls “frontier firms,” now consume 3.5 times more AI intelligence per worker than typical firms. A year ago that ratio was 2x.

The gap is not about message volume. OpenAI’s data shows that raw message count explains only 36% of the frontier advantage. The remaining 64% comes from depth: longer prompts, richer context, harder tasks, and more substantive outputs. Frontier firms also send 16 times more Codex messages per worker than typical firms, the single largest separation in any workflow category OpenAI measured.

For B2B marketing, RevOps, and martech teams, the report is less a celebration of AI growth than a diagnostic of who is pulling ahead and why. The 3.5x number widening from 2x in twelve months is the rate-of-divergence signal. The 36/64 split between volume and depth is the operational signal. The first tells you the gap is compounding. The second tells you where it is compounding.

Key Takeaways

OpenAI’s first B2B Signals report finds frontier firms (95th percentile of usage) now use 3.5x more AI intelligence per worker than typical firms, up from 2x a year ago.
Message volume explains only 36% of the frontier advantage. The other 64% is depth: richer context, harder tasks, longer outputs.
Frontier firms send 16x more Codex messages per worker than typical firms, the largest single workflow gap in the dataset.
OpenAI revenue chief Giancarlo Dresser told CNBC on May 11 that enterprise AI adoption is “at a tipping point,” with usage now scaling beyond pilot teams into core operations.
The lever is not “use more AI.” It is delegated work with agents, governance for production use, and enablement that moves teams from chat to workflows.

What B2B Signals Actually Measures

B2B Signals is the enterprise extension of OpenAI’s broader Signals research program. It draws on aggregated, anonymized usage patterns across ChatGPT Enterprise, ChatGPT Team, the API, and Codex, mapping how organizations of different sizes, sectors, and maturity profiles actually use the products in production. The first report covers usage from late 2025 through Q1 2026, with the frontier cohort defined as the top 5% by intelligence-per-worker, a composite metric combining message volume, prompt complexity, output length, and feature adoption.

What OpenAI is doing here is publishing the same kind of usage benchmark that AWS publishes for cloud workloads and Snowflake publishes for warehouse queries: a vendor’s own view of how the leading users of the product use it. The methodology has an obvious bias: OpenAI cannot measure what happens on Anthropic, Google, or Microsoft stacks. For the population of enterprises using OpenAI products, though, it is the most authoritative dataset that exists. Analyst commentary will follow, but the primary signal sits in the OpenAI numbers themselves. That instinct to publish hard usage data is itself a positioning move, and it is spreading: proof-led positioning is becoming the B2B default as the AI leaders disclose named customers and deployment percentages that buyers and answer engines can both verify.

The framing matters because the 3.5x figure is easy to misread as “frontier firms send 3.5x more prompts.” That is not the finding. The 36% number, the share of the gap attributable to volume alone, is the more important data point. Two-thirds of the frontier advantage is in how prompts are constructed, what tasks they are pointed at, and what the team does with the outputs, not how many prompts get sent.
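To see why "3.5x more prompts" is a misreading, it helps to split the multiplier into a volume part and a depth part. The sketch below is illustrative only. It assumes the 36% and 64% shares apply to the logarithm of the 3.5x ratio, so that the two factors multiply back to 3.5. That assumption is ours; nothing here states how OpenAI actually computed the split, and the function name `split_multiplier` is hypothetical.

```python
import math

def split_multiplier(total_ratio: float, volume_share: float) -> tuple[float, float]:
    """Split a ratio into volume and depth factors.

    Assumption (not from the report): the shares apply to log(total_ratio),
    so the two factors multiply back to the total.
    """
    if total_ratio <= 0 or not 0.0 <= volume_share <= 1.0:
        raise ValueError("ratio must be positive and share must be in [0, 1]")
    log_total = math.log(total_ratio)
    volume_factor = math.exp(volume_share * log_total)
    depth_factor = math.exp((1.0 - volume_share) * log_total)
    return volume_factor, depth_factor

# Figures quoted in the article: 3.5x overall, 36% attributed to volume.
volume_factor, depth_factor = split_multiplier(3.5, 0.36)
print(f"volume factor: {volume_factor:.2f}x, depth factor: {depth_factor:.2f}x")
```

Under that assumption, a frontier worker sends well under twice as many messages as a typical worker, and the larger share of the 3.5x comes from what each message does.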

The 36% Number Is the Real Story

The 36% volume share of the frontier gap is the operational hinge. If volume explained most of the gap, the lever would be access: get more seats deployed, get more people using the tool. That is the easy enterprise-IT move, and most organizations have been making it. The 64% depth share says the lever is something else: enablement, workflow design, and the shift from chat-style assistance to delegated work.

OpenAI’s own framing in the report is direct on this point. The companies pulling ahead “measure depth, build governance for production use, invest in enablement, scale what works, and move from chat-based assistance to delegated work with agents.” Four of those five behaviors are operating-model decisions. Only one, scale what works, is an investment decision. The frontier cohort is not winning because it bought more AI. It is winning because it deployed AI into more substantive work. The staffing-side consequence is now measurable: Wynter’s 47% invisible-cut finding shows B2B marketing teams already compressing roles through attrition and stopped backfills as senior operators absorb more execution.

The 16x Codex gap is the most visible expression of that pattern. Codex is the agentic coding product, and code generation is the workflow where delegation produces the largest measurable productivity delta. The fact that frontier firms send 16x more Codex messages per worker than typical firms suggests engineering teams at the frontier have moved past assisted coding into agent-delegated coding. That is the same trajectory we covered in our analysis of ChatGPT workspace agents for B2B ops, where delegation rather than autocomplete is the productivity step-change. ZoomInfo’s native Codex app extends that delegation pattern into GTM work, where the limiting factor is not coding skill but verified data, permission scope, and CRM lineage.

What This Means for B2B Marketing Teams

The marketing-side equivalent of the Codex gap is the agent-delegated content, research, and campaign-orchestration workflow. Teams at the frontier are not using ChatGPT to draft emails faster. They are running structured research pipelines, multi-step campaign generation with brand and compliance guardrails, and account-research workflows that hand off to RevOps systems with minimal human intervention. The depth dimension OpenAI measured shows up in marketing as the difference between “rewrite this paragraph” prompts and “produce a 12-account research brief with intent signals, contact mapping, and a tailored outreach sequence” prompts. The depth dimension also shows up as infrastructure investment underneath the prompts, the kind Twilio’s GA of its agentic conversation layer at SIGNAL 2026 is selling: persistent memory, channel-spanning orchestration, and real-time signal layers that move teams past chat-style assistance into delegated work.

The CNBC interview with OpenAI revenue chief Giancarlo Dresser on May 11 added context: Dresser described enterprise AI adoption as “at a tipping point,” with usage now scaling beyond pilot teams into core operational workflows. That framing is consistent with the B2B Signals data: the frontier cohort has crossed from experimentation into production, and the gap to the rest of the market is widening because production deployment compounds while pilots do not.

The enterprise reality check is the Agentforce adoption gap: agent usage can rise quickly while paid rollout still lags.

Our read: the 3.5x divergence is the same story Gartner’s 2026 CMO Spend Survey measured from the marketing-budget side, where 70% of CMOs say AI leadership is a critical 2026 goal but only 30% have the readiness to scale AI investments. The OpenAI data is the usage-side mirror of the same readiness gap. Both datasets are measuring the same population, just from different ends. The CMOs who told Gartner they are AI-ready are very likely the same workforces showing up in OpenAI’s 95th percentile.

Three Moves to Make This Quarter

The B2B Signals report is most useful as a self-benchmark. If your team’s usage looks like high-frequency, low-depth prompting, you are in the 80% the report describes as “typical.” Three moves separate the typical cohort from the frontier:

Measure depth, not seat count. Stop reporting AI program success as “X% of the team has access to ChatGPT Enterprise.” Start reporting it as average prompt length, average output length, and share of work delegated to agents rather than assisted by chat. The 36% volume share of the frontier gap means depth metrics are now the more accurate leading indicator of program maturity. Supermetrics’ AI adoption gap research made the same point from the marketing-analytics angle: teams that converted AI use into measurable ROI built the measurement infrastructure first.
Move one workflow from chat to delegation this quarter. Pick the single highest-volume marketing workflow your team runs (account research, content brief generation, campaign QA, or lifecycle email orchestration, for example) and rebuild it as an agent-delegated workflow with explicit handoff conditions, brand guardrails, and a human review checkpoint at the end. The 16x Codex gap shows what delegated workflows look like in engineering. The marketing equivalent is structurally identical, just pointed at different artifacts.
Build production governance before scaling. Frontier firms invested in governance for production use (prompt libraries, output review processes, compliance gates) before they scaled. Salesforce’s State of Sales 2026 data shows the same pattern on the sales side: high-performing teams operationalized AI governance before they scaled AI tooling. Skipping the governance step is what produces the failed-pilot pattern behind Gartner’s forecast that 40% of agentic AI projects will be canceled by 2027.
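The depth metrics named in the first move can be computed from any usage log that records prompt size, output size, and whether the work was delegated. The following is a minimal sketch, assuming a hypothetical `Interaction` record with character counts and a `delegated` flag. These field names are illustrative and do not correspond to any OpenAI export format.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Interaction:
    """One logged AI interaction (hypothetical schema)."""
    prompt_chars: int   # length of the prompt in characters
    output_chars: int   # length of the model output in characters
    delegated: bool     # True if an agent ran the task, False for chat assistance

def depth_metrics(interactions: list[Interaction]) -> dict[str, float]:
    """Summarize depth rather than seat count or raw message volume."""
    if not interactions:
        raise ValueError("need at least one interaction")
    return {
        "avg_prompt_chars": mean(i.prompt_chars for i in interactions),
        "avg_output_chars": mean(i.output_chars for i in interactions),
        "delegation_share": sum(i.delegated for i in interactions) / len(interactions),
    }

# Made-up sample: two short chat prompts and two delegated research tasks.
sample = [
    Interaction(prompt_chars=120, output_chars=400, delegated=False),
    Interaction(prompt_chars=80, output_chars=250, delegated=False),
    Interaction(prompt_chars=2400, output_chars=9000, delegated=True),
    Interaction(prompt_chars=1800, output_chars=6500, delegated=True),
]
metrics = depth_metrics(sample)
print(metrics)
```

Tracking these three numbers per team and per quarter gives a trend line that seat-count reporting cannot show.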

The 12-month delta from 2x to 3.5x is the most underrated number in the report. If the ratio keeps growing by the same factor (1.75x per year), the gap doubles roughly every 15 months. Organizations that have not crossed from chat to delegation by the end of 2026 will be measurably further behind in 2027 than they are now, not because the technology will have changed, but because the frontier cohort will have compounded twelve more months of delegated-work practice. Google’s I/O 2026 information-agents launch reinforces the timeline: the autonomous monitoring layer Google built into Search only pays off for B2B teams whose content surfaces are already delegated-workflow-ready, which is the same population the OpenAI dataset identifies as frontier.
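Returning to the 2x-to-3.5x divergence, the doubling-time arithmetic can be checked with a short calculation. This is a sketch under one stated assumption: the frontier-to-typical ratio grows by a constant multiplicative factor each year. The report supplies only two data points, so other growth shapes (linear, for instance) would give different answers. The function name `doubling_time_months` is ours.

```python
import math

def doubling_time_months(ratio_start: float, ratio_end: float, months_elapsed: float) -> float:
    """Months for the ratio to double, assuming constant multiplicative growth."""
    growth = ratio_end / ratio_start
    if growth <= 1.0:
        raise ValueError("ratio must be increasing to have a doubling time")
    return months_elapsed * math.log(2) / math.log(growth)

# Figures quoted in the article: 2x a year ago, 3.5x now.
months = doubling_time_months(2.0, 3.5, 12)
print(f"doubling time: {months:.1f} months")
```

With those inputs the result comes out just under 15 months.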

The Skeptical Read

The methodology has limits worth naming. OpenAI cannot see usage on Anthropic Claude, Google Gemini, or Microsoft Copilot stacks, so a multi-vendor enterprise looks lighter on OpenAI’s dashboard than it actually is on AI overall. The 95th percentile cohort may be partially a selection artifact: organizations that standardize on OpenAI will use OpenAI more by definition. The report does not break out cohort composition by sector, which means a frontier dominated by software companies tells a different story for B2B marketing teams in financial services or industrials than for SaaS marketers.

None of those caveats reverse the directional finding. The depth-vs-breadth ratio is structural, not vendor-specific, and the rate of divergence is the operationally important variable regardless of whether the absolute level is calibrated correctly. The frontier is pulling ahead. The B2B Signals dataset is one of the cleaner measurements of how fast.

Frequently Asked Questions

What is OpenAI’s B2B Signals report?

B2B Signals is OpenAI’s new recurring research initiative measuring how enterprises actually use OpenAI products. It draws on privacy-preserving, aggregated usage data from ChatGPT Enterprise, ChatGPT Team, the API, and Codex. The first report, published in May 2026, benchmarks usage patterns and identifies what OpenAI calls “frontier firms,” the top 5% of enterprises by intelligence-per-worker.

What does 3.5x more AI intelligence per worker mean?

It is OpenAI’s composite measure combining message volume, prompt complexity, output length, and feature adoption. A worker at a frontier firm consumes 3.5x more on that composite than a worker at a typical firm. The metric explicitly weights depth (richer prompts, harder tasks) heavily, which is why volume alone explains only 36% of the gap.

Why does volume explain only 36% of the frontier advantage?

Because the other 64% is in how the prompts are constructed and what they are pointed at. Frontier firms send longer, more contextual prompts at harder problems and use agentic features more heavily. The Codex gap, 16x more messages per worker, illustrates that depth dimension: code delegation is structurally different from chat-style assistance, and the productivity delta reflects that.

What should B2B marketing teams change based on this report?

Three moves: stop measuring AI program success by seat count and start measuring it by depth (prompt length, output length, and delegation share); move at least one high-volume marketing workflow from chat-assisted to agent-delegated this quarter; and build production governance before scaling. The 12-month rate of divergence from 2x to 3.5x means the cost of staying typical compounds quickly.