To help organizations scale their AI usage without straining their budgets, we've added two new ways to reduce costs on consistent and asynchronous workloads:
- Discounted usage on committed throughput: Customers with a sustained level of tokens-per-minute (TPM) usage on GPT-4 or GPT-4 Turbo can request access to provisioned throughput to receive discounts ranging from 10% to 50%, depending on the size of the commitment.
- Reduced costs on asynchronous workloads: Customers can use our new Batch API to run non-urgent workloads asynchronously. Batch API requests are charged 50% off shared prices, offer much higher rate limits, and return results within 24 hours. This is ideal for use cases such as model evaluation, offline classification, summarization, and synthetic data generation.
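The asynchronous workflow above starts by collecting requests into a JSONL file, one request per line, which is then uploaded for batch processing. The sketch below shows how such an input file might be prepared for an offline summarization job; the request shape follows the Batch API's documented input format, while the model name, prompts, and file name are illustrative placeholders.

```python
import json

# Documents to summarize offline via the Batch API (placeholder data).
documents = [
    "The quarterly report shows a 12% increase in revenue.",
    "Customer complaints dropped after the latest release.",
]

# One request object per document; custom_id lets you match results
# back to inputs when the batch completes (within 24 hours).
requests = [
    {
        "custom_id": f"doc-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4-turbo",  # illustrative model name
            "messages": [
                {"role": "system", "content": "Summarize the text in one sentence."},
                {"role": "user", "content": doc},
            ],
        },
    }
    for i, doc in enumerate(documents)
]

# Batch inputs are a JSONL file: one JSON-encoded request per line.
with open("batch_input.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```

The resulting file would then be uploaded and submitted as a batch job; results arrive asynchronously at the discounted rate.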
We plan to continue adding new features focused on enterprise-grade security, administrative controls, and cost management. For more information on these launches, visit our API documentation, or contact our team to discuss custom solutions for your business.