API usage on a per request basis #146

Open
fskuteken opened this issue Mar 20, 2025 · 0 comments

fskuteken commented Mar 20, 2025

Context

I have a multi-tenant SaaS application where I need to track how much of the Unstructured quota each tenant consumes.
Is there a way to track the number of billed pages on a per-request basis, the same way LLM APIs return the consumed tokens in each response?
If not, we would have to replicate the logic Unstructured uses to count pages for each file type. That may prevent us from continuing to use Unstructured, as we would be unable to track tenant consumption accurately.

The problem

Page count

Unstructured supports a wide range of file types, and it is not easy to replicate its page count for each of them (pdf, pptx, docx, etc.). As a result, we don't know the actual page count we are being billed for on a per-request basis.

Strategy

Unstructured supports the auto strategy as an input in the API. However, the strategy actually used depends on the contents of the file, and it is not trivial to replicate the logic Unstructured uses internally to choose it when partitioning the file. As a result, we don't know which strategy we are being billed for on a per-request basis.

Arguments

Cost tracking is a real problem. So real that there are products on the market that focus on providing insights into the usage of different services (e.g. Helicone, Keywords AI, Sentry).

Example

The Unstructured API could add headers to provide more information about the request, keeping it backwards compatible without modifying the response body used today (a JSON array containing the partition elements).

import { UnstructuredClient } from 'unstructured-client'

const client = new UnstructuredClient({
  serverURL: process.env.UNSTRUCTURED_BASE_URL,
  security: {
    apiKeyAuth: process.env.UNSTRUCTURED_API_KEY
  }
})

const response = await client.general.partition({
  partitionParameters: {
    files: {
      content: fileBuffer,
      fileName: fileName
    },
    strategy: 'auto'
  }
})

// Is it possible to update the Unstructured API to return this information? 🤔
// (Proposed headers; headers.get() returns null when a header is absent.)
const actualUsedStrategy = response.rawResponse.headers.get('x-unstructured-strategy')
const actualPagesConsumed = parseInt(response.rawResponse.headers.get('x-unstructured-pages') ?? '0', 10)
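
If the API did expose such headers, the tenant-side bookkeeping could stay small. Below is a minimal sketch, assuming the proposed `x-unstructured-strategy` and `x-unstructured-pages` headers exist (they are only a suggestion in this issue, not something the API is known to return). `HeaderReader`, `readUsage` and `TenantLedger` are hypothetical names used for illustration.

```typescript
// Hypothetical: the header names below are the ones proposed in this issue,
// not headers the Unstructured API is known to return today.
interface HeaderReader {
  get(name: string): string | null
}

interface PartitionUsage {
  strategy: string | null // strategy actually used, if reported
  pages: number | null    // billed pages, if reported
}

// Parse usage from response headers. Missing or non-numeric values become
// null so callers can tell "not reported" apart from "zero pages".
function readUsage(headers: HeaderReader): PartitionUsage {
  const rawPages = headers.get('x-unstructured-pages')
  const trimmed = rawPages === null ? '' : rawPages.trim()
  return {
    strategy: headers.get('x-unstructured-strategy'),
    pages: /^\d+$/.test(trimmed) ? Number.parseInt(trimmed, 10) : null
  }
}

// In-memory per-tenant ledger, keyed by tenant id and strategy.
class TenantLedger {
  private readonly totals = new Map<string, number>()

  record(tenantId: string, usage: PartitionUsage): void {
    if (usage.pages === null) return // nothing reported, nothing to add
    const key = `${tenantId}:${usage.strategy ?? 'unknown'}`
    this.totals.set(key, (this.totals.get(key) ?? 0) + usage.pages)
  }

  pagesFor(tenantId: string, strategy: string): number {
    return this.totals.get(`${tenantId}:${strategy}`) ?? 0
  }
}
```

With the SDK call above, `response.rawResponse.headers` could be passed directly to `readUsage`, since the standard `Headers` object already has a matching `get` method.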

Benchmarks

Many providers that support the OpenAI SDK provide transparency about the API consumption and pricing.

import OpenAI from 'openai'

const openai = new OpenAI()

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hi' }]
})

// We can calculate the actual cost of the request 🥳
const actualModelUsed = response.model
const promptTokensUsage = response.usage?.prompt_tokens ?? 0
const completionTokensUsage = response.usage?.completion_tokens ?? 0

OpenAI has different prices for different gpt-4o snapshots.
When we send the model as gpt-4o, it is an alias for a specific snapshot, and the model field of the response tells us which snapshot we are billed for.
This way, we can keep track of our usage in a very straightforward manner.
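
Once the resolved model and token counts are known, the cost calculation is simple arithmetic. Below is a sketch, assuming the caller maintains its own price table. The snapshot names and per-million-token rates are placeholders, not actual OpenAI prices, and `requestCost` is a hypothetical helper.

```typescript
interface TokenUsage {
  prompt_tokens: number
  completion_tokens: number
}

// USD per one million tokens for a given resolved model name.
interface ModelPrice {
  inputPerMillion: number
  outputPerMillion: number
}

// Placeholder entries for illustration only; replace with real model names
// and the provider's published prices before using this for billing.
const priceTable = new Map<string, ModelPrice>([
  ['example-snapshot-a', { inputPerMillion: 2, outputPerMillion: 8 }],
  ['example-snapshot-b', { inputPerMillion: 4, outputPerMillion: 12 }]
])

// Returns the request cost in USD, or null when the model is not in the
// table or the response carried no usage information.
function requestCost(
  model: string,
  usage: TokenUsage | undefined,
  prices: Map<string, ModelPrice>
): number | null {
  const price = prices.get(model)
  if (price === undefined || usage === undefined) return null
  const inputCost = usage.prompt_tokens * price.inputPerMillion
  const outputCost = usage.completion_tokens * price.outputPerMillion
  return (inputCost + outputCost) / 1_000_000
}
```

With the OpenAI response above, this would be called as `requestCost(response.model, response.usage, priceTable)`. The same shape would work for Unstructured if pages and strategy were reported per request.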

Benefits

  • Transparency is increased, bringing Unstructured on par with other AI providers regarding billing
  • Every Unstructured customer could benefit from this improvement, increasing its adoption as an important tool in the AI age
  • The API maintains its backwards compatibility