@fenos fenos commented Oct 3, 2025

What kind of change does this PR introduce?

Feature

What is the new behaviour?

Implement Vector Bucket data source

Supported Operations:

  • CreateIndex
  • DeleteIndex
  • GetIndex
  • ListIndexes
  • PutVectors
  • ListVectors
  • ListVectorBuckets
  • QueryVectors
  • DeleteVectors
  • GetVectorBucket
  • GetVectors

Authentication mechanisms:

  • SignV4
  • JWT service_role

snyk-io bot commented Oct 3, 2025

Snyk checks have failed. 7 issues have been found so far.

Scanner        Critical  High  Medium  Low  Total (7)
Code Security  0         7     0       0    7 issues

@fenos fenos force-pushed the feat/vector-buckets branch 5 times, most recently from 1cf95bd to 01797cc, on October 10, 2025 at 11:04
@fenos fenos force-pushed the feat/vector-buckets branch from 01797cc to a2715e7, on October 10, 2025 at 11:12
@coveralls

Pull Request Test Coverage Report for Build 18404797921

Details

  • 2035 of 2319 (87.75%) changed or added relevant lines in 40 files are covered.
  • 23 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.8%) to 77.143%

Changes Missing Coverage                  Covered Lines  Changed/Added Lines  %
src/http/plugins/jwt.ts                   3              5                    60.0%
src/http/routes/vector/create-bucket.ts   43             45                   95.56%
src/http/routes/vector/create-index.ts    64             66                   96.97%
src/http/routes/vector/delete-bucket.ts   43             45                   95.56%
src/http/routes/vector/delete-index.ts    51             53                   96.23%
src/http/routes/vector/delete-vectors.ts  48             50                   96.0%
src/http/routes/vector/get-bucket.ts      43             45                   95.56%
src/http/routes/vector/get-index.ts       62             64                   96.88%
src/http/routes/vector/get-vectors.ts     47             49                   95.92%
src/http/routes/vector/list-buckets.ts    44             46                   95.65%

Files with Coverage Reduction             New Missed Lines  %
src/http/plugins/signature-v4.ts          1                 47.19%
src/storage/protocols/s3/signature-v4.ts  22                73.11%

Totals
Change from base Build 18352064985: 0.8%
Covered Lines: 23149
Relevant Lines: 29723

💛 - Coveralls

app.register(routes.cdn, { prefix: 'cdn' })
app.register(routes.healthcheck, { prefix: 'health' })
app.register(routes.iceberg, { prefix: 'iceberg/v1' })
app.register(routes.vectors, { prefix: 'vectors' })
Contributor

should this be singular ("vector")? to match: bucket, object, ...

: 400

if (statusCode === 500) {
console.log('error')
Contributor

debug log, remove

fastify.addHook('preParsing', async (request: AWSRequest, reply, bodyPayload) => {
if (
opts.skipIfJwtToken &&
request.headers.authorization?.replace('Bearer ', '')?.match(JWT_SHAPE)
Contributor

@itslenny itslenny Oct 10, 2025

Maybe this could be moved into a helper function in @internal/auth/jwt.ts
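
A minimal sketch of what such a helper could look like. The name `isJwtBearerToken` is hypothetical, and the `JWT_SHAPE` pattern below is an assumption (three base64url segments separated by dots); the pattern actually used in the PR may differ.

```typescript
// Assumption: a loose "header.payload.signature" pattern made of base64url characters.
// The real JWT_SHAPE constant in the codebase may be defined differently.
const JWT_SHAPE = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$/

// Hypothetical helper: true when the Authorization header value looks like a JWT,
// with or without a leading "Bearer " scheme.
function isJwtBearerToken(authorization: string | undefined): boolean {
  if (!authorization) {
    return false
  }
  // Strip the scheme only when it is an actual prefix of the header value.
  const token = authorization.startsWith('Bearer ')
    ? authorization.slice('Bearer '.length)
    : authorization
  return JWT_SHAPE.test(token.trim())
}
```

With something like this in place, the hook condition could shrink to `opts.skipIfJwtToken && isJwtBearerToken(request.headers.authorization)`.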

? byteHasherStream!.toReadable({ autoCleanup: true })
: (body as Readable)

const { secret: jwtSecret, jwks } = await getJwtSecret(request.tenantId)
Contributor

If this is meant to use the same signing secret as URL signing (I'm not sure whether that's the intention), you should use urlSigningKey instead of secret. urlSigningKey uses the JWK signing key if one is set, and otherwise falls back to jwtSecret inside getJwtSecret().

ChunkSignatureV4Parser,
V4StreamingAlgorithm,
} from '@storage/protocols/s3/signature-v4-stream'
// @ts-expect-error - no types for compose
Contributor

@itslenny itslenny Oct 10, 2025

Probably better to manually declare typedef for compose and remove this bypass.

Adding this down by the fastify declare (line 38) seems to resolve it and maintain stricter typing

declare module 'stream' {
  export function compose<A extends Stream, B extends Stream>(s1: A, s2: B): B & A;
}


const indexResult = await request.s3Vector.listVectors(request.body)

return response.send(indexResult)
Contributor

Might be nice to align this with the existing listObjectsV2 naming for consistency.

  • include hasNext boolean in response
  • rename to cursor (request) and nextCursor (response)
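
As a rough illustration of the suggested response shape, the sketch below maps a token-based page onto `hasNext` / `nextCursor`. Both interfaces and the `toCursorPage` function are hypothetical, and `nextToken` is only an assumed name for the upstream continuation token.

```typescript
// Hypothetical upstream page shape; `nextToken` is an assumed field name.
interface TokenPage<T> {
  items: T[]
  nextToken?: string
}

// Hypothetical response shape following the naming suggested in the comment above.
interface CursorPage<T> {
  items: T[]
  hasNext: boolean
  nextCursor?: string
}

// Convert a token-based page into the cursor-based shape.
function toCursorPage<T>(page: TokenPage<T>): CursorPage<T> {
  const token = page.nextToken
  if (typeof token === 'string' && token.length > 0) {
    return { items: page.items, hasNext: true, nextCursor: token }
  }
  // No continuation token: this is the last page, so nextCursor is omitted.
  return { items: page.items, hasNext: false }
}
```

The request side would then accept `cursor` and pass it through as the upstream token.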

}

export default async function routes(fastify: FastifyInstance) {
fastify.post<listIndexRequest>(
Contributor

Same as listVectors: align the pagination types with the existing listObjectsV2.

}
}

throw ERRORS.TransactionError('Transaction failed after maximum retries', lastError)
Contributor

Might be nice to make "after maximum retries" conditional on whether there were any retries, and/or include the retry count in the error message, to avoid confusion if we need to debug this in the future.
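
One possible way to build such a message, sketched under assumptions: `transactionFailureMessage` is a hypothetical helper, and `attempts` / `maxRetries` are assumed counters that the retry loop would need to expose.

```typescript
// Hypothetical helper. `attempts` counts every try, including the first one,
// so the number of retries is attempts - 1.
function transactionFailureMessage(attempts: number, maxRetries: number): string {
  const retries = Math.max(0, attempts - 1)
  if (retries === 0) {
    // Failed on the first try: do not mention retries at all.
    return 'Transaction failed'
  }
  const noun = retries === 1 ? 'retry' : 'retries'
  const limit = retries >= maxRetries ? ' (maximum reached)' : ''
  return `Transaction failed after ${retries} ${noun}${limit}`
}
```

The resulting string could be passed to `ERRORS.TransactionError` in place of the fixed message.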

return {
vectorBucket: {
vectorBucketName: vectorBucket.id,
creationTime: Math.floor(vectorBucket.created_at?.getTime() / 1000),
Contributor

Check that vectorBucket.created_at is a Date before calling getTime(), to avoid a runtime error or NaN.
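
A small sketch of the kind of guard meant here. `toEpochSeconds` is a hypothetical helper, and returning `undefined` for unusable values is an assumption about what callers would want.

```typescript
// Hypothetical helper: converts a Date to whole epoch seconds.
// Returns undefined for anything that is not a valid Date, instead of throwing or yielding NaN.
function toEpochSeconds(value: unknown): number | undefined {
  if (!(value instanceof Date)) {
    return undefined
  }
  const ms = value.getTime()
  // An invalid Date (for example new Date('invalid')) reports NaN from getTime().
  return Number.isNaN(ms) ? undefined : Math.floor(ms / 1000)
}
```

The field would then read `creationTime: toEpochSeconds(vectorBucket.created_at)`, and the same helper would cover the list response.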

return {
vectorBuckets: bucketResult.vectorBuckets.map((bucket) => ({
vectorBucketName: bucket.id,
creationTime: Math.floor(bucket.created_at?.getTime() / 1000),
Contributor

Check that bucket.created_at is a Date before calling getTime(), to avoid a runtime error or NaN.

3 participants