Usage.AI + OpenAI Cost Optimization Documentation

November 24, 2023 | 5 min read

Kaveh Khorram

Founder and CEO

Kaveh Khorram is the Founder and CEO of Usage.AI

Announcement

Given the recent surge in popularity of OpenAI, with some people even calling OpenAI bills the next “AWS bills”, we decided that supporting it would be in line with our mission of helping customers optimize all their cloud costs. We are thrilled to announce that our OpenAI cost optimization feature is coming soon, alongside AWS, GCP, and Azure cost optimization support.

Since OpenAI doesn’t have an official billing API, we had to get a little creative to figure out how to get billing costs anyway.

We found out that you can access OpenAI usage reports by sending a GET request to https://api.openai.com/v1/usage?date=2023-11-24, and we got the following response:

{
  object: 'list',
  data: [
    {
      organization_id: 'org-aqof5ov5m',
      aggregation_timestamp: 1700758500,
      n_requests: 2,
      operation: 'completion',
      snapshot_id: 'gpt-3.5-turbo-0301',
      n_context_tokens_total: 26,
      n_generated_tokens_total: 10,
      user_id: null
    },
    {
      organization_id: 'org-aqof5ov5m',
      aggregation_timestamp: 1700758800,
      n_requests: 2,
      operation: 'completion',
      snapshot_id: 'gpt-3.5-turbo-0301',
      n_context_tokens_total: 26,
      n_generated_tokens_total: 10,
      user_id: null
    },
    {
      organization_id: 'org-aqof5ov5m',
      aggregation_timestamp: 1700759700,
      n_requests: 4,
      operation: 'completion',
      snapshot_id: 'gpt-3.5-turbo-0301',
      n_context_tokens_total: 52,
      n_generated_tokens_total: 20,
      user_id: null
    },
    {
      organization_id: 'org-aqof5ov5m',
      aggregation_timestamp: 1700761500,
      n_requests: 2,
      operation: 'completion',
      snapshot_id: 'gpt-4-0613',
      n_context_tokens_total: 24,
      n_generated_tokens_total: 10,
      user_id: null
    },
    {
      organization_id: 'org-aqof5ov5m',
      aggregation_timestamp: 1700771400,
      n_requests: 5,
      operation: 'completion',
      snapshot_id: 'gpt-4-0613',
      n_context_tokens_total: 60,
      n_generated_tokens_total: 25,
      user_id: null
    },
    {
      organization_id: 'org-aqof5ov5m',
      aggregation_timestamp: 1700771700,
      n_requests: 2,
      operation: 'completion',
      snapshot_id: 'gpt-4-0613',
      n_context_tokens_total: 24,
      n_generated_tokens_total: 10,
      user_id: null
    }
  ],
  ft_data: [],
  dalle_api_data: [],
  whisper_api_data: []
}

Here’s the code I used to make the request:

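const axios = require('axios');

// Base URL of the usage endpoint shown above; the date goes in the query string.
const OPENAI_USAGE_URL = 'https://api.openai.com/v1/usage';
// Assumption: the API key is read from an environment variable.
const OPEN_API_KEY = process.env.OPENAI_API_KEY;

async function fetchOpenAIUsageData(date) {
  try {
    const response = await axios.get(OPENAI_USAGE_URL, {
      params: { date }, // e.g. '2023-11-24'
      headers: { 'Authorization': `Bearer ${OPEN_API_KEY}` }
    });
    return response.data;
  } catch (error) {
    console.error('Error fetching usage data:', error);
    return null;
  }
}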

Ok great, so now we’re getting some OpenAI API usage data. Unfortunately, there’s nothing here that tells us what the costs are. However, we do get the following information:

organization_id: The ID of the organization making the API calls. This could be helpful if you’re using multiple organizations.
operation: The type of operation performed, such as a completion or an embedding. This can be useful in calculating costs.
snapshot_id: The model used for the requests.
n_context_tokens_total: The total number of context (input) tokens across the requests in this entry. OpenAI bills for context tokens.
n_generated_tokens_total: The total number of tokens generated in the output. OpenAI bills for these tokens as well.
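
To make these fields concrete, here is a minimal sketch that totals requests and tokens per snapshot_id. The helper name summarizeTokensByModel is made up for illustration and is not part of the original code; the sample entries are copied from the response above.

```javascript
// Hypothetical helper (not from the original post):
// totals requests and tokens for each snapshot_id in a usage response.
function summarizeTokensByModel(usageData) {
  const summary = {};
  for (const entry of usageData.data) {
    const key = entry.snapshot_id;
    if (!summary[key]) {
      summary[key] = { requests: 0, contextTokens: 0, generatedTokens: 0 };
    }
    summary[key].requests += entry.n_requests;
    summary[key].contextTokens += entry.n_context_tokens_total;
    summary[key].generatedTokens += entry.n_generated_tokens_total;
  }
  return summary;
}

// Three entries copied from the sample response above (unused fields omitted).
const sampleUsage = {
  data: [
    { snapshot_id: 'gpt-3.5-turbo-0301', n_requests: 2, n_context_tokens_total: 26, n_generated_tokens_total: 10 },
    { snapshot_id: 'gpt-3.5-turbo-0301', n_requests: 4, n_context_tokens_total: 52, n_generated_tokens_total: 20 },
    { snapshot_id: 'gpt-4-0613', n_requests: 5, n_context_tokens_total: 60, n_generated_tokens_total: 25 },
  ],
};

const tokenSummary = summarizeTokensByModel(sampleUsage);
console.log(tokenSummary);
```

Each key of the result is a snapshot_id, which is a convenient check before any prices are applied.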

Great - so now we have the usage data. We also have the number of tokens used, so the formula to get the cost of each usage entry becomes:

(n_context_tokens_total + n_generated_tokens_total) * (Cost per Output Token by Model)
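
As a quick worked example, take the first entry in the response above: 26 context tokens plus 10 generated tokens. The sketch below assumes the price is quoted per 1,000 tokens and uses an illustrative rate of 0.002; both the rate and the helper name estimateEntryCost are assumptions for this example only.

```javascript
// Hypothetical helper: applies the formula above to one usage entry.
// ratePer1k is assumed to be a price in USD per 1,000 tokens.
function estimateEntryCost(entry, ratePer1k) {
  const tokens = entry.n_context_tokens_total + entry.n_generated_tokens_total;
  return (tokens / 1000) * ratePer1k;
}

// Token counts from the first entry of the sample response.
const firstEntry = { n_context_tokens_total: 26, n_generated_tokens_total: 10 };

// 0.002 is an assumed, illustrative rate.
const exampleCost = estimateEntryCost(firstEntry, 0.002);
console.log(exampleCost.toFixed(6)); // prints "0.000072"
```

In other words, 36 tokens at 0.002 per 1,000 tokens comes to 0.000072.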

Ok, so now we need to get the Cost per Output Token by Model. Since there is no billing API, there is certainly no pricing endpoint either. So I used ChatGPT to turn the OpenAI pricing page into JSON:

https://openai.com/pricing

The result was:

// Token prices are applied per 1,000 tokens by calculateModelCosts
const openaiPricingData = {
  "gpt-4_turbo": {
    "gpt-4-1106-preview": {
      "input": 0.01,
      "output": 0.03
    },
    "gpt-4-1106-vision-preview": {
      "input": 0.01,
      "output": 0.03
    }
  },
  "gpt-4": {
    "gpt-4": {
      "input": 0.03,
      "output": 0.06
    },
    "gpt-4-32k": {
      "input": 0.06,
      "output": 0.12
    }
  },
  "gpt-3_5_turbo": {
    "gpt-3.5-turbo-1106": {
      "input": 0.0010,
      "output": 0.0020
    },
    "gpt-3.5-turbo-instruct": {
      "input": 0.0015,
      "output": 0.0020
    }
  },
  "fine_tuning_models": {
    "gpt-3.5-turbo": {
      "training": 0.0080,
      "input_usage": 0.0030,
      "output_usage": 0.0060
    },
    "davinci-002": {
      "training": 0.0060,
      "input_usage": 0.0120,
      "output_usage": 0.0120
    },
    "babbage-002": {
      "training": 0.0004,
      "input_usage": 0.0016,
      "output_usage": 0.0016
    }
  },
  "embedding_models": {
    "ada_v2": {
      "usage": 0.0001
    }
  },
  "base_models": {
    "davinci-002": {
      "usage": 0.0020
    },
    "babbage-002": {
      "usage": 0.0004
    }
  },
  "image_models": {
    "dall_e_3_standard_1024x1024": 0.040,
    "dall_e_3_standard_1024x1792_1792x1024": 0.080,
    "dall_e_3_hd_1024x1024": 0.080,
    "dall_e_3_hd_1024x1792_1792x1024": 0.120,
    "dall_e_2_1024x1024": 0.020,
    "dall_e_2_512x512": 0.018,
    "dall_e_2_256x256": 0.016
  },
  "audio_models": {
    "whisper": {
      "per_minute": 0.006
    },
    "tts": {
      "per_thousand_characters": 0.015
    },
    "tts_hd": {
      "per_thousand_characters": 0.030
    }
  }
};

Great – now we have all the ingredients to calculate the cost.

Here’s the function we used to calculate the cost of each request:

function calculateModelCosts(usageData) {
  // Maps each snapshot_id to its price path: [category, model, price type]
  const modelToPricePath = {
    'gpt-3.5-turbo-0301': ['gpt-3_5_turbo', 'gpt-3.5-turbo-1106', 'output'],
    'gpt-4-0613': ['gpt-4', 'gpt-4', 'output'],
  };

  return usageData.data.map(({ snapshot_id, n_context_tokens_total, n_generated_tokens_total, organization_id }) => {
    const path = modelToPricePath[snapshot_id];
    if (!path) {
      // Fail with a clear message instead of a destructuring TypeError
      throw new Error(`No price mapping for snapshot_id: ${snapshot_id}`);
    }
    const tokens = n_context_tokens_total + n_generated_tokens_total;
    const [category, model, type] = path;
    // Prices are applied per 1,000 tokens
    const cost = (tokens / 1000) * openaiPricingData[category][model][type];
    return { model: snapshot_id, cost, organization_id };
  });
}

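The snapshot_id values did not line up directly with the model names in the JSON generated from the pricing page, so we needed to create a custom path for each snapshot_id to map it to the right price. For example, gpt-3.5-turbo-0301 does not appear in the pricing JSON, so we priced it at the gpt-3.5-turbo-1106 rate.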
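
One possible variation, which is not what the function above does: calculateModelCosts applies the output rate to every token, while the pricing JSON also lists a lower input rate. A minimal sketch of pricing the two separately could look like the following. The names RATES_PER_1K, findRates, and estimateSplitCost are hypothetical, and matching a snapshot_id to a model family by prefix is an assumption rather than documented OpenAI behavior.

```javascript
// Hypothetical rate table, copied from the pricing JSON above
// (the gpt-4 and gpt-3.5-turbo-1106 entries).
const RATES_PER_1K = {
  'gpt-4': { input: 0.03, output: 0.06 },
  'gpt-3.5-turbo': { input: 0.0010, output: 0.0020 },
};

// Assumption: a snapshot_id such as 'gpt-4-0613' starts with its model family name.
function findRates(snapshotId) {
  const family = Object.keys(RATES_PER_1K)
    .sort((a, b) => b.length - a.length) // try the longest prefix first
    .find((name) => snapshotId.startsWith(name));
  return family ? RATES_PER_1K[family] : null;
}

// Prices context tokens at the input rate and generated tokens at the output rate.
function estimateSplitCost(entry) {
  const rates = findRates(entry.snapshot_id);
  if (!rates) return null; // unknown model: the caller decides how to handle it
  return (entry.n_context_tokens_total / 1000) * rates.input
    + (entry.n_generated_tokens_total / 1000) * rates.output;
}

// Token counts from one of the gpt-4-0613 entries in the sample response.
const splitCost = estimateSplitCost({
  snapshot_id: 'gpt-4-0613',
  n_context_tokens_total: 24,
  n_generated_tokens_total: 10,
});
console.log(splitCost);
```

Prefix matching is fragile: a gpt-4-32k snapshot would also match the gpt-4 prefix unless it gets its own entry in the table, so an explicit mapping like modelToPricePath remains the safer choice.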

We end up with this output from a call to calculateModelCosts:

[
  {
    model: 'gpt-3.5-turbo-0301',
    cost: 0.000072,
    organization_id: 'org-aqof5ov5m'
  },
  {
    model: 'gpt-3.5-turbo-0301',
    cost: 0.000072,
    organization_id: 'org-aqof5ov5m'
  },
  {
    model: 'gpt-3.5-turbo-0301',
    cost: 0.000144,
    organization_id: 'org-aqof5ov5m'
  },
  {
    model: 'gpt-4-0613',
    cost: 0.00204,
    organization_id: 'org-aqof5ov5m'
  },
  {
    model: 'gpt-4-0613',
    cost: 0.0051,
    organization_id: 'org-aqof5ov5m'
  },
  {
    model: 'gpt-4-0613',
    cost: 0.00204,
    organization_id: 'org-aqof5ov5m'
  }
]

Great - we now have the costs for all requests completed on a certain day. Next, we decided to clean up the output a bit by showing the aggregate model costs by organization_id. Here’s the code we used for that:

// Note: usageData here is the array returned by calculateModelCosts
function combineCosts(usageData) {
  const costs = usageData.reduce((acc, { model, cost, organization_id }) => {
    // Initialize organization if it doesn't exist
    if (!acc[organization_id]) {
      acc[organization_id] = { totalCost: 0, models: {} };
    }

    // Add cost to the organization total
    acc[organization_id].totalCost += cost;

    // Initialize model cost for the organization if it doesn't exist
    if (!acc[organization_id].models[model]) {
      acc[organization_id].models[model] = 0;
    }

    // Add cost to the model under the organization
    acc[organization_id].models[model] += cost;

    return acc;
  }, {});

  return costs;
}

The final, grouped output is now the following:

Combined Costs: {
  'org-aqof5ov5m': {
    totalCost: 0.009468,
    models: { 'gpt-3.5-turbo-0301': 0.000288, 'gpt-4-0613': 0.00918 }
  }
}

Great – now this is much easier to read. It breaks down total model costs by organization, and it shows each model and the associated costs.

So what’s next? As you can imagine, you can run this for as many past days as you’d like to get a breakdown of your OpenAI costs over a historical time period.
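
A minimal sketch of such a look-back loop is shown below. The helper names lastNDates and costsByDate are hypothetical; the fetch, cost, and combine steps are passed in as arguments so that the functions from this post can be plugged in, and dates are assumed to be UTC calendar days in YYYY-MM-DD form.

```javascript
// Hypothetical helper: returns the last `days` dates as YYYY-MM-DD strings (UTC),
// ending at `endDate`, oldest first.
function lastNDates(endDate, days) {
  const dates = [];
  for (let i = days - 1; i >= 0; i--) {
    const d = new Date(endDate); // copy, so the caller's Date is not mutated
    d.setUTCDate(d.getUTCDate() - i); // rolls over month boundaries automatically
    dates.push(d.toISOString().slice(0, 10));
  }
  return dates;
}

// Hypothetical driver: fetches usage for each date and stores the combined costs
// under that date. Days whose fetch returns null are skipped.
async function costsByDate(dates, fetchUsage, calculateCosts, combine) {
  const result = {};
  for (const date of dates) {
    const usage = await fetchUsage(date); // one request at a time
    if (usage) {
      result[date] = combine(calculateCosts(usage));
    }
  }
  return result;
}

const exampleDates = lastNDates(new Date('2023-11-24T00:00:00Z'), 3);
console.log(exampleDates); // [ '2023-11-22', '2023-11-23', '2023-11-24' ]
```

With the functions from this post, the call would be costsByDate(exampleDates, fetchOpenAIUsageData, calculateModelCosts, combineCosts).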

If you’re a Usage.AI user, all you’ll need to do is plug your OpenAI key into your Usage.AI dashboard and you’ll get the ability to:

Compare your weekly and monthly OpenAI costs to previous period costs.
Forecast your OpenAI costs, detect cost anomalies, and send alerts when your costs are higher (or lower) than normal.
Filter by organization, service, and model: combine multiple filters to drill into which teams are spending what on OpenAI.

We plan on launching our OpenAI integration in the coming weeks, so stay tuned. We wanted to document our journey getting billing data from OpenAI with this blog post. If you’re interested in beta testing our OpenAI integration ahead of its release, please get in touch with me directly at [email protected]. If you are interested in saving money on AWS, GCP, or Azure, you can get in touch with us by visiting our website at www.usage.ai or by emailing our SVP of sales at [email protected].