API Reference

Complete API documentation for the Visual Search system.

Indexing Service

IndexingService

Main service for indexing images.

from visual_search.indexing.indexing_service import IndexingService

service = IndexingService(
    storage=storage,                      # StorageBackend instance
    embedding_service=embedding_service,  # EmbeddingService instance
    index=index,                          # VectorIndex instance
)

Methods

index_image(image, image_id) -> ImageMetadata

Index a single image.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| image | PIL.Image.Image | Image to index |
| image_id | str | Unique identifier for the image |

Returns: ImageMetadata object

Example:

from PIL import Image

image = Image.open("photo.jpg")
metadata = service.index_image(image, "photo_001")

index_batch(images, image_ids) -> list[ImageMetadata]

Index multiple images at once.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| images | list[PIL.Image.Image] | Images to index |
| image_ids | list[str] | Unique identifiers, one per image and in the same order |

Returns: List of ImageMetadata objects
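Because index_batch takes two parallel lists, it can help to check them before the call. The helper below is a hypothetical pre-flight check, not part of the documented API; whether index_batch itself validates list lengths or duplicate IDs is not stated in this reference.

```python
from collections import Counter


def check_batch(images, image_ids):
    """Hypothetical pre-flight check for two parallel lists (not a library function)."""
    if len(images) != len(image_ids):
        raise ValueError(
            f"got {len(images)} images but {len(image_ids)} image IDs"
        )
    # IDs are described as unique identifiers, so flag any that repeat.
    duplicates = sorted(i for i, n in Counter(image_ids).items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate image IDs: {duplicates}")


# Placeholder strings stand in for PIL images in this sketch.
check_batch(["<image 1>", "<image 2>"], ["photo_001", "photo_002"])
```

If the check passes, the same two lists can be handed to service.index_batch(images, image_ids).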

remove(image_id, delete_from_storage=True)

Remove an image from the index.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| image_id | str | Image ID to remove |
| delete_from_storage | bool | Also delete the image from storage |

save_index(path)

Save the index to disk.

load_index(path)

Load a previously saved index.

count() -> int

Return the number of indexed images.


Search Service

NearestNeighborSearch

Service for finding similar images.

from visual_search.prediction.nearest_neighbor import NearestNeighborSearch

search = NearestNeighborSearch(
    index=index,          # VectorIndex instance
    metadata_store=None,  # Optional dict of image metadata (default: None)
)

Methods

search(query_vector, query_id, k=10, include_metadata=False) -> SearchResults

Search for similar images using a query vector.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| query_vector | np.ndarray | 512-dim query embedding |
| query_id | str | Query identifier |
| k | int | Number of results |
| include_metadata | bool | Include image metadata |

Returns: SearchResults object

Example:

results = search.search(
    query_vector=embedding,
    query_id="q001",
    k=10,
    include_metadata=True,
)

for result in results.results:
    print(f"{result.rank}. {result.image_id}: {result.score:.4f}")

search_by_id(image_id, query_id, k=10, exclude_query=True) -> SearchResults

Search using an already-indexed image.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| image_id | str | Indexed image to use as query |
| query_id | str | Query identifier |
| k | int | Number of results |
| exclude_query | bool | Exclude query image from results |

search_batch(query_vectors, query_ids, k=10) -> list[SearchResults]

Batch search with multiple queries.
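The exclude_query option of search_by_id matters because an indexed image is normally its own nearest neighbor. One common way to implement this, shown here as a sketch and not taken from the library source, is to request k + 1 neighbors and drop the query image afterwards. The function name drop_query_hit and the sample data are hypothetical.

```python
def drop_query_hit(image_ids, distances, query_image_id, k):
    """Remove the query image from a hit list and keep at most k neighbors.

    Assumes the caller asked the index for k + 1 neighbors so that k remain
    once the query image is removed.
    """
    kept = [
        (image_id, distance)
        for image_id, distance in zip(image_ids, distances)
        if image_id != query_image_id
    ]
    return kept[:k]


# Illustrative data: the query image matches itself at distance 0.
ids = ["img001", "img007", "img003"]
dists = [0.0, 0.42, 0.58]
neighbors = drop_query_hit(ids, dists, query_image_id="img001", k=2)
print(neighbors)
```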


Reranker

Reranker

Post-process search results.

from visual_search.prediction.reranking import Reranker

reranker = Reranker(max_distance=10.0)

Methods

rerank(results, normalize=True, min_score=None, top_k=None, diversity_threshold=None) -> SearchResults

Apply reranking pipeline.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| results | SearchResults | Raw search results |
| normalize | bool | Convert L2 distance to similarity |
| min_score | float | Minimum score threshold |
| top_k | int | Limit number of results |
| diversity_threshold | float | Remove results too similar to each other |

Example:

reranked = reranker.rerank(
    results=raw_results,
    normalize=True,
    min_score=0.3,
    top_k=5,
)
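To make the normalize, min_score, and top_k steps concrete, here is a minimal pure-Python sketch of such a pipeline. The mapping score = 1 - distance / max_distance is an assumption suggested by the max_distance constructor argument; the formula Reranker actually uses is not stated in this reference, and the function name normalize_and_filter is hypothetical. Diversity filtering is left out.

```python
def normalize_and_filter(hits, max_distance=10.0, min_score=None, top_k=None):
    """Sketch of a rerank pipeline over (image_id, l2_distance) pairs.

    Assumed mapping: score = 1 - distance / max_distance, clamped to [0, 1].
    """
    scored = []
    for image_id, distance in hits:
        clamped = min(max(distance, 0.0), max_distance)
        scored.append((image_id, 1.0 - clamped / max_distance))

    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)

    if min_score is not None:
        scored = [pair for pair in scored if pair[1] >= min_score]
    if top_k is not None:
        scored = scored[:top_k]
    return scored


# Illustrative distances, not real search output.
raw_hits = [("img002", 5.0), ("img001", 1.0), ("img003", 9.0)]
reranked = normalize_and_filter(raw_hits, min_score=0.3, top_k=5)
```

Under this assumed mapping, a distance of 9.0 becomes a score of about 0.1, so img003 falls below min_score=0.3 and is dropped.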

Embedding Service

EmbeddingService

Generate CLIP embeddings from images.

from visual_search.prediction.embedding_service import EmbeddingService

service = EmbeddingService(
    model_name="clip-ViT-B-32",  # str; default model
    device=None,                 # str or None; None auto-detects GPU/CPU
)

Methods

generate_embedding(image, image_id) -> EmbeddingVector

Generate embedding for a single image.

batch_embed(images, image_ids) -> list[EmbeddingVector]

Generate embeddings for multiple images.


Vector Index

VectorIndex

FAISS-based vector index.

from visual_search.indexing.index_table import VectorIndex

index = VectorIndex(dimension=512)

Methods

add(image_id, vector)

Add a single vector.

add_batch(image_ids, vectors)

Add multiple vectors.

search(query, k=10) -> tuple[list[str], np.ndarray]

Search for nearest neighbors.

Returns: Tuple of (image_ids, distances)

remove(image_id)

Remove a vector.

contains(image_id) -> bool

Check if image ID exists.

get_vector(image_id) -> np.ndarray

Get vector by image ID.

save(path) / load(path)

Persistence operations.
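The method list above can be illustrated with a small stand-in that mirrors the documented names. This is a brute-force pure-Python sketch for reading and testing, not the FAISS-backed VectorIndex: the class name BruteForceIndex is hypothetical, and it returns distances as a plain list where the real search returns an np.ndarray. FAISS's IndexFlatL2 reports squared L2 distances; whether VectorIndex does the same is not stated here, and this sketch uses plain L2.

```python
import math


class BruteForceIndex:
    """Pure-Python stand-in that mirrors the documented method names."""

    def __init__(self, dimension):
        self.dimension = dimension
        self._vectors = {}  # image_id -> list of floats

    def add(self, image_id, vector):
        if len(vector) != self.dimension:
            raise ValueError(f"expected {self.dimension} values, got {len(vector)}")
        self._vectors[image_id] = list(vector)

    def contains(self, image_id):
        return image_id in self._vectors

    def get_vector(self, image_id):
        return self._vectors[image_id]

    def remove(self, image_id):
        del self._vectors[image_id]

    def search(self, query, k=10):
        # Exhaustive scan: L2 distance from the query to every stored vector.
        scored = sorted(
            (math.dist(query, vector), image_id)
            for image_id, vector in self._vectors.items()
        )[:k]
        image_ids = [image_id for _, image_id in scored]
        distances = [distance for distance, _ in scored]
        return image_ids, distances


index = BruteForceIndex(dimension=2)
index.add("img001", [0.0, 0.0])
index.add("img002", [3.0, 4.0])
index.add("img003", [1.0, 0.0])

ids, distances = index.search([0.0, 0.0], k=2)
```

The two returned sequences are aligned by position, so zip(ids, distances) pairs each image ID with its distance.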


Data Models

ImageMetadata

from visual_search.models.image import ImageMetadata

metadata = ImageMetadata(
    image_id="img001",
    file_path="images/img001.jpg",
    width=1920,
    height=1080,
    format="JPEG",
)

# Properties
metadata.aspect_ratio  # -> 1.778
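The value shown for aspect_ratio is consistent with width divided by height (1920 / 1080). A minimal dataclass sketch of that property follows. The class name ImageMetadataSketch is hypothetical, and the real ImageMetadata may be defined differently or carry more fields.

```python
from dataclasses import dataclass


@dataclass
class ImageMetadataSketch:
    """Illustrative stand-in; the real ImageMetadata may be defined differently."""

    image_id: str
    file_path: str
    width: int
    height: int
    format: str

    @property
    def aspect_ratio(self) -> float:
        # Width divided by height, so landscape images are greater than 1.
        return self.width / self.height


meta = ImageMetadataSketch("img001", "images/img001.jpg", 1920, 1080, "JPEG")
print(round(meta.aspect_ratio, 3))
```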

EmbeddingVector

from visual_search.models.embedding import EmbeddingVector

embedding = EmbeddingVector(
    image_id="img001",
    vector=[0.1, 0.2, ...],  # 512 floats
    model_name="clip-ViT-B-32",
)

# Methods
embedding.to_numpy()  # -> np.ndarray
embedding.normalize()  # -> EmbeddingVector (unit length)
embedding.cosine_similarity(other)  # -> float
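For reference, the math behind normalize() and cosine_similarity() can be written in a few lines of plain Python. These free functions are a sketch of the standard definitions, not the EmbeddingVector implementation, which may rely on NumPy and handle edge cases differently.

```python
import math


def l2_norm(vector):
    return math.sqrt(sum(x * x for x in vector))


def normalize(vector):
    """Scale a vector to unit length."""
    norm = l2_norm(vector)
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def cosine_similarity(a, b):
    """Dot product divided by the product of the two norms."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (l2_norm(a) * l2_norm(b))


unit = normalize([3.0, 4.0])
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

For vectors that are already unit length, cosine similarity reduces to a plain dot product.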

SearchResult / SearchResults

from visual_search.models.search_result import SearchResult, SearchResults

result = SearchResult(
    image_id="img001",
    score=0.95,
    rank=1,
    metadata={"file_path": "..."},
)

results = SearchResults(
    query_id="q001",
    results=[result, ...],
)

# Methods
results.get_top_k(5)  # -> list[SearchResult]
results.get_image_ids()  # -> list[str]
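The two helper methods can be pictured with a small dataclass sketch. The class names ResultSketch and ResultsSketch are hypothetical, and the sketch assumes results are stored in rank order; the real models may validate fields or sort differently.

```python
from dataclasses import dataclass, field


@dataclass
class ResultSketch:
    image_id: str
    score: float
    rank: int
    metadata: dict = field(default_factory=dict)


@dataclass
class ResultsSketch:
    query_id: str
    results: list = field(default_factory=list)

    def get_top_k(self, k):
        # Assumes results are already ordered by rank.
        return self.results[:k]

    def get_image_ids(self):
        return [result.image_id for result in self.results]


# Illustrative scores, not real search output.
hits = ResultsSketch(
    query_id="q001",
    results=[
        ResultSketch("img001", 0.95, 1),
        ResultSketch("img004", 0.90, 2),
        ResultSketch("img009", 0.72, 3),
    ],
)
top_two = hits.get_top_k(2)
all_ids = hits.get_image_ids()
```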