Bring the module into a state of readiness for release #1

@tsmith023

Description

To be ready for a wide release that users can plug into their embedding pipelines, this module should provide the following features:

  • An async HTTP/1.1 server using the axum routing and tokio async crates
  • A multi-threaded backend for massively parallel inference using the rayon crate
  • An interlink between the axum/tokio front end and rayon using the tokio-rayon crate (see the server sketch after this list)
  • Support for transformer models sourced from HuggingFace running on CPU using ORT through the OnnxBert struct (sketched after this list)
  • Support for transformer models sourced from HuggingFace running on GPU using CUDA through the CandleBert struct (sketched after this list)
  • A pipeline that builds separate images for CPU and GPU support, required by the compiled nature of Rust (see the feature-flag sketch after this list):
    Setup CICD to build and push images to Dockerhub #2
  • Built and published images for the following embedding models:
    • BAAI/bge-large-en-v1.5 and BAAI/bge-small-en-v1.5
    • sentence-transformers/all-MiniLM-L6-v2
    • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 (note: no onnx/model.onnx on the HuggingFace Hub)
    • Snowflake/snowflake-arctic-embed-l and Snowflake/snowflake-arctic-embed-s
    • mixedbread-ai/mxbai-embed-large-v1
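
A minimal sketch of how the first three items could fit together, assuming an axum 0.7-style `axum::serve` entrypoint and the `tokio_rayon::spawn` bridge; the `EmbedRequest`/`EmbedResponse` types and the `run_model` stub are illustrative placeholders, not the module's actual API:

```rust
use axum::{routing::post, Json, Router};
use rayon::prelude::*;
use serde::{Deserialize, Serialize};

// Hypothetical request/response shapes for an embedding endpoint.
#[derive(Deserialize)]
struct EmbedRequest {
    texts: Vec<String>,
}

#[derive(Serialize)]
struct EmbedResponse {
    embeddings: Vec<Vec<f32>>,
}

// Stub standing in for OnnxBert/CandleBert inference.
fn run_model(_text: &str) -> Vec<f32> {
    vec![0.0; 384]
}

async fn embed(Json(req): Json<EmbedRequest>) -> Json<EmbedResponse> {
    // Hand the CPU-bound work to rayon's thread pool via tokio-rayon so the
    // tokio reactor is never blocked, then fan out across inputs with rayon.
    let embeddings = tokio_rayon::spawn(move || {
        req.texts.par_iter().map(|t| run_model(t)).collect()
    })
    .await;
    Json(EmbedResponse { embeddings })
}

#[tokio::main]
async fn main() {
    let app = Router::new().route("/embed", post(embed));
    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}
```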
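
For the ORT item, a sketch of what the OnnxBert struct might look like, assuming the `ort` 2.x API (`Session::builder()` / `commit_from_file`) and the `tokenizers` crate; the field names, paths, and the elided pooling step are assumptions, not the actual implementation:

```rust
use ort::session::Session;
use tokenizers::Tokenizer;

// Illustrative CPU-backed BERT embedder built on ONNX Runtime.
pub struct OnnxBert {
    tokenizer: Tokenizer,
    session: Session,
}

impl OnnxBert {
    pub fn new(model_path: &str, tokenizer_path: &str) -> anyhow::Result<Self> {
        Ok(Self {
            tokenizer: Tokenizer::from_file(tokenizer_path)
                .map_err(|e| anyhow::anyhow!(e))?,
            // Loads e.g. the onnx/model.onnx file exported to the HuggingFace Hub.
            session: Session::builder()?.commit_from_file(model_path)?,
        })
    }

    pub fn embed(&self, text: &str) -> anyhow::Result<Vec<f32>> {
        let encoding = self
            .tokenizer
            .encode(text, true)
            .map_err(|e| anyhow::anyhow!(e))?;
        let _input_ids: Vec<i64> =
            encoding.get_ids().iter().map(|&id| id as i64).collect();
        // Building the input tensors, calling self.session.run(...), and
        // mean-pooling the last hidden state are elided here.
        todo!()
    }
}
```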
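
Similarly, a hypothetical shape for CandleBert, assuming the `candle-core`, `candle-nn`, and `candle-transformers` crates; the CPU fallback and the exact loading details vary by candle version and are assumptions here:

```rust
use candle_core::{DType, Device};
use candle_nn::VarBuilder;
use candle_transformers::models::bert::{BertModel, Config};

// Illustrative GPU-backed BERT embedder built on candle.
pub struct CandleBert {
    model: BertModel,
    device: Device,
}

impl CandleBert {
    pub fn new(weights_path: &str, config: Config) -> anyhow::Result<Self> {
        // Prefer CUDA device 0; fall back to CPU if no GPU is present.
        let device = Device::new_cuda(0).unwrap_or(Device::Cpu);
        // Memory-map the safetensors checkpoint downloaded from HuggingFace.
        let vb = unsafe {
            VarBuilder::from_mmaped_safetensors(&[weights_path], DType::F32, &device)?
        };
        let model = BertModel::load(vb, &config)?;
        Ok(Self { model, device })
    }
}
```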
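
Finally, on why the compiled nature of Rust forces separate CPU and GPU images: the backend is selected at compile time, so each Docker build produces a different binary. A minimal illustration, with hypothetical `onnx` and `cuda` Cargo feature names:

```rust
// Chosen at build time, e.g. `cargo build --release --features onnx`
// for the CPU image or `--features cuda` for the GPU image; one binary
// cannot switch backends at runtime.
#[cfg(feature = "onnx")]
pub type Embedder = OnnxBert;

#[cfg(feature = "cuda")]
pub type Embedder = CandleBert;
```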
