Token-based billing: from premium request units to AI credits and tokens

This video explains GitHub Copilot’s shift from premium request units to token-based billing, what that means in practice, and what changes start in June.

Full summary based on transcript

What GitHub is changing: premium request units → token-based billing

Rob Bos explains that GitHub is replacing the previous billing model, which was based on premium request units, with usage-based billing measured in tokens.

Refresher: how premium request units worked

Under the old model:

Why GitHub is moving to tokens

The presenter frames token billing as a more “honest” model. With premium request units, a single unit could cover a very long or compute-heavy interaction (e.g., a long-running session) while still costing only $0.04, which is not sustainable for GitHub or the model providers.
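
The gap between flat and usage-based pricing can be sketched with some arithmetic. This is a minimal illustration, not GitHub's billing logic: the $0.04 figure comes from the summary above, while the per-million-token rates and the token counts are hypothetical values chosen only to show the shape of the difference.

```python
# Flat price per premium request unit, as quoted in the summary.
PRU_PRICE_USD = 0.04

# HYPOTHETICAL rates for illustration only; not GitHub's published pricing.
INPUT_USD_PER_MILLION_TOKENS = 3.00
OUTPUT_USD_PER_MILLION_TOKENS = 15.00


def flat_cost(requests: int) -> float:
    """Cost when every request is billed the same, regardless of size."""
    return requests * PRU_PRICE_USD


def token_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost when billing scales with the tokens actually processed."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_MILLION_TOKENS
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION_TOKENS
    )


# Hypothetical workloads: a short question and a long-running session.
short_chat = token_cost(input_tokens=2_000, output_tokens=500)
long_session = token_cost(input_tokens=4_000_000, output_tokens=200_000)

print(f"flat price for either request: ${flat_cost(1):.2f}")
print(f"token price, short chat:       ${short_chat:.4f}")
print(f"token price, long session:     ${long_session:.2f}")
```

Under these assumed rates the short chat costs less than one flat unit and the long session costs far more, which is the mismatch the presenter describes.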

The new model: AI credits and token costs

GitHub is moving to a system based on AI credits:

The presenter then explains how credits relate to tokens by referencing GitHub’s published model pricing page and using Anthropic models as an example:

The key takeaway is that developers and organizations need to start thinking in terms of:

Mini demo: how a single prompt can consume many tokens

The presenter demonstrates in an editor session that even a single question can consume a large number of tokens, because the request carries codebase context in its context window in addition to the question itself.
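
A rough way to see why context dominates is to estimate tokens for the question and for the attached files separately. The sketch below assumes about four characters per token, a common rule of thumb rather than any model's real tokenizer, and the helper names and sample inputs are hypothetical.

```python
# ASSUMPTION: ~4 characters per token; real tokenizers vary by model and language.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def estimate_request_tokens(question: str, context_files: list[str]) -> dict[str, int]:
    """Split an estimated request size into question tokens and context tokens."""
    question_tokens = estimate_tokens(question)
    context_tokens = sum(estimate_tokens(content) for content in context_files)
    return {
        "question": question_tokens,
        "context": context_tokens,
        "total": question_tokens + context_tokens,
    }


# Hypothetical request: one short question plus two files pulled in as context.
question = "Why does this test fail?"
context_files = ["x = 1\n" * 2000, "x = 1\n" * 2000]

estimate = estimate_request_tokens(question, context_files)
print(estimate)
```

In this made-up case the question accounts for a handful of tokens while the attached files account for thousands, so the size of the context, not the length of the prompt, drives the bill.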

What will be billed (and what stays included)

The presenter lists which Copilot features will be affected by token-based billing:

“Free models” going away

The presenter states that “free models” will no longer be free under the new system:

Enterprise impact: pooled credits (shared billing)

A major change highlighted for enterprise customers is pooling:

The presenter gives a worked example:
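
The presenter's own figures are not reproduced in this summary. As a separate, minimal sketch of the pooling idea, the code below assumes each seat contributes a fixed credit allowance to one shared pool; the seat allowance, user names, and usage numbers are all hypothetical.

```python
def per_user_overage(usage_by_user: dict[str, int], allowance_per_seat: int) -> int:
    """Credits over the limit when every user has an individual allowance."""
    return sum(max(0, used - allowance_per_seat) for used in usage_by_user.values())


def pooled_overage(usage_by_user: dict[str, int], allowance_per_seat: int) -> int:
    """Credits over the limit when all seat allowances are shared in one pool."""
    pool = allowance_per_seat * len(usage_by_user)
    return max(0, sum(usage_by_user.values()) - pool)


# HYPOTHETICAL numbers: one heavy user and two light users.
ALLOWANCE_PER_SEAT = 1000
usage = {"alice": 2500, "bob": 300, "carol": 200}

individual = per_user_overage(usage, ALLOWANCE_PER_SEAT)
pooled = pooled_overage(usage, ALLOWANCE_PER_SEAT)

print(f"overage with individual allowances: {individual} credits")
print(f"overage with a shared pool:         {pooled} credits")
```

With these numbers the heavy user exceeds an individual allowance, but the unused credits of the other two seats absorb the difference once the allowances are pooled.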

Practical takeaway

The presenter concludes that teams should prepare for token-based billing by: