RAG systems don't just ignore time. They don't even know what time it is.
Most real-world data (scraped docs, blogs, PDFs) lacks consistent temporal metadata. Consequently, standard vector search prioritizes well-cited, high-density results from 2018 over more relevant breakthroughs from 2024.
HalfLife fixes this as a 1-line drop-in layer.
Ask any standard RAG pipeline: "What is the best way to manage state in React today?"
- ❌ Baseline (Vector-Only): #1 → Redux (2018). It matches the query perfectly but is 8 years outdated.
- ✅ With HalfLife: #1 → React 19 / Zustand (2024). HalfLife automatically detects the "today" intent and re-ranks based on temporal relevance.
Run the Live Demo (No Timestamps!) →
Most rerankers assume you've already extracted and cleaned your metadata. HalfLife handles the messy reality:
- Automated Inference: Assigns chronological confidence by extracting years directly from raw unstructured text.
- Intent Classification: Detects if your query is Fresh (latest is best), Historical (older is best), or Static.
- Temporal Fusion: Re-weights relevance scores using decay functions, even if your vector store is "timestamp-blind."
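As an illustration of the automated-inference idea, a regex pass over raw text is often enough to assign a rough timestamp. This is a hypothetical sketch, not HalfLife's actual extractor:

```python
import re

def extract_year(text, min_year=1990, max_year=2030):
    """Return the most recent plausible 4-digit year found in raw text, or None."""
    candidates = (int(y) for y in re.findall(r"\b(19\d{2}|20\d{2})\b", text))
    years = [y for y in candidates if min_year <= y <= max_year]
    return max(years) if years else None
```

Taking the most recent mentioned year is a heuristic; explicit publication metadata, when present, should override it.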
Validate HalfLife on your existing RAG results without changing your code:
- Export chunks: Save some nodes/payloads to `data.json`.
- Run the test script: `python3 scripts/try_on_your_data.py --query "Best LLM today" --file data.json`
- Compare: See the difference between the #1 baseline result and the #1 HalfLife result.
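The exact schema expected by `try_on_your_data.py` is defined in the script itself; based on the fields used elsewhere in this README (`text`, `timestamp`, a relevance score), a `data.json` export plausibly looks something like:

```json
[
  {"text": "Redux is the standard for React state management...", "timestamp": "2018-06-01", "vector_score": 0.91},
  {"text": "React 19 ships useOptimistic; Zustand is a popular minimal store...", "timestamp": "2024-05-01", "vector_score": 0.88}
]
```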
- Install via PyPI: `pip install halflife-rag`
- Integrate (LlamaIndex):

```python
from halflife.integrations.llamaindex import HalfLifePostprocessor

# Plug into any LlamaIndex query engine
query_engine = index.as_query_engine(
    node_postprocessors=[HalfLifePostprocessor(top_n=3)]
)
```
Evaluated on 120 real arXiv papers (Parameter-Efficient Fine-Tuning) from 2019–2024:
| Query Intent | Baseline Avg Result Age | HalfLife Avg Result Age | Δ |
|---|---|---|---|
| Fresh ("current state-of-the-art...") | 3.9 yr | 2.4 yr | −1.5 yr |
| Static ("explain how...") | 4.3 yr | 4.0 yr | −0.3 yr (Stable) |
| Historical ("original paper...") | 5.0 yr | 5.2 yr | +0.2 yr (Verified) |
For fresh queries, HalfLife surfaces results 1.5 years more recent on average. For static queries, behavior is unchanged: HalfLife is invisible when it doesn't need to act. Historical inversion is marginal on this corpus (2019–2024); deeper corpora show stronger results.
Reproduce this yourself; the Colab notebook fetches live arXiv data with no setup:
HalfLife sits between your vector retriever and your LLM. It re-scores each retrieved chunk using a weighted fusion of three signals:
`final_score = α · vector_score + β · temporal_score + γ · trust_score`

The weights α, β, γ are set dynamically based on query intent:
| Detected Intent | Example keywords | α (vector) | β (temporal) | γ (trust) |
|---|---|---|---|---|
| Fresh | "latest", "current", "today", "SOTA" | 0.3 | 0.6 | 0.1 |
| Historical | "originally", "history of", "early" | 0.4 | 0.5 | 0.1 |
| Static | "explain", "what is", "how does" | 0.8 | 0.1 | 0.1 |
For historical queries, the temporal signal is inverted, so older chunks score higher.
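A minimal sketch of this intent-keyed fusion, using the keywords and weights from the table above (the function names are illustrative, not HalfLife's API):

```python
# (alpha, beta, gamma) per detected intent, from the table above
WEIGHTS = {
    "fresh":      (0.3, 0.6, 0.1),
    "historical": (0.4, 0.5, 0.1),
    "static":     (0.8, 0.1, 0.1),
}

def classify_intent(query):
    """Keyword-based intent detection; 'static' is the safe default."""
    q = query.lower()
    if any(k in q for k in ("latest", "current", "today", "sota")):
        return "fresh"
    if any(k in q for k in ("originally", "history of", "early")):
        return "historical"
    return "static"

def fuse(vector_score, temporal_score, trust_score, intent):
    """Weighted fusion; historical intent inverts the temporal signal."""
    alpha, beta, gamma = WEIGHTS[intent]
    if intent == "historical":
        temporal_score = 1.0 - temporal_score  # older chunks score higher
    return alpha * vector_score + beta * temporal_score + gamma * trust_score
```

Note how a static query leaves ranking almost entirely to the vector score (α = 0.8), which is why HalfLife's behavior is unchanged on those queries.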
```
User Query
    ↓
QueryIntentClassifier → sets α, β, γ
    ↓
Vector Retrieval (Qdrant / any store)
    ↓
HalfLife Reranker
 ├── Temporal decay per chunk (exponential / piecewise / learned MLP)
 ├── Redis metadata cache (< 1ms overhead on cached chunks)
 └── Min-Max fusion
    ↓
Re-ranked chunks → LLM
```
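The Min-Max fusion step assumes all signals share a common scale before they are weighted. A standard min-max normalization (sketched here, not the library's exact code) achieves that:

```python
def min_max(scores):
    """Rescale raw scores to [0, 1] so signals can be fused on a common scale."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]  # degenerate case: all scores equal
    return [(s - lo) / (hi - lo) for s in scores]
```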
```bash
pip install halflife-rag
```

```python
from halflife import HalfLife

hl = HalfLife()

# Drop into your existing retrieval results
results = qdrant.search(query=query)
reranked = hl.rerank(query=query, chunks=results, top_k=5)

for chunk in reranked:
    print(f"[{chunk['final_score']:.3f}] ({chunk['timestamp'][:4]}) {chunk['text'][:80]}")
```

```bash
git clone https://github.com/amaydixit11/halflife.git
cd halflife
docker-compose up -d   # starts Qdrant + Redis
pip install -e .
halflife demo          # runs the adversarial demo
```

```python
from langchain.retrievers import ContextualCompressionRetriever
from halflife.integrations.langchain import HalfLifeReranker

retriever = ContextualCompressionRetriever(
    base_compressor=HalfLifeReranker(top_k=5),
    base_retriever=your_existing_retriever  # unchanged
)
docs = retriever.get_relevant_documents("Latest approach to LLM fine-tuning?")
# → surfaces 2024 papers instead of 2020 papers
```

```python
from halflife.integrations.llamaindex import HalfLifePostprocessor

query_engine = index.as_query_engine(
    similarity_top_k=20,
    node_postprocessors=[HalfLifePostprocessor(top_n=5)]
)
response = query_engine.query("What is the latest React version?")
```

Three decay functions are available, selectable per document at ingestion time:
| Strategy | Formula | Best for |
|---|---|---|
| Exponential | e^(−λΔt) | News, fast-moving fields, software versions |
| Piecewise | Step function (1.0 → 0.7 → 0.3) | Documentation, compliance, versioned specs |
| Learned | MLP-predicted λ from doc features | Mixed corpora with feedback signal |
The learned decay MLP runs at ingestion time only, so there is zero ML inference at query time.
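The exponential and piecewise strategies can be sketched as follows (λ = 0.5 and the 1-/3-year step thresholds are assumed defaults for illustration; the library's values may differ):

```python
import math

def exponential_decay(age_years, lam=0.5):
    """e^(-lambda * delta_t): smooth decay; halves every ln(2)/lambda years."""
    return math.exp(-lam * age_years)

def piecewise_decay(age_years):
    """Step function 1.0 -> 0.7 -> 0.3, with assumed 1- and 3-year thresholds."""
    if age_years < 1:
        return 1.0
    if age_years < 3:
        return 0.7
    return 0.3
```

Exponential suits fast-moving fields because it never plateaus; piecewise suits versioned docs, where a chunk stays fully trusted until a version boundary passes.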
- Learned Decay MLP: pure-NumPy; predicts per-chunk λ from doc type, source domain, text length, and feedback ratio. Train on your own benchmark results with `halflife train`.
- Feedback Loop: adaptive λ tuning via EMA from user "was this useful?" signals.
- Event Bus: hard/soft invalidation when a fact is superseded (e.g. a retraction, a product discontinuation).
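Conceptually, the event bus maps supersession events to score penalties. The sketch below is a toy illustration of hard vs. soft invalidation, not HalfLife's actual interface:

```python
class FactEventBus:
    """Toy model: hard invalidation zeroes a chunk, soft invalidation down-weights it."""

    def __init__(self, soft_penalty=0.3):
        self.penalties = {}            # chunk_id -> score multiplier
        self.soft_penalty = soft_penalty

    def supersede(self, chunk_id, hard=False):
        # e.g. hard for a retraction, soft for a product discontinuation
        self.penalties[chunk_id] = 0.0 if hard else self.soft_penalty

    def adjust(self, chunk_id, score):
        return score * self.penalties.get(chunk_id, 1.0)
```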
```bash
halflife demo                             # adversarial demo (no Docker)
halflife quickstart                       # end-to-end ingest → query → rerank
halflife benchmark --output results.json  # nDCG / MRR / temporal freshness
halflife evaluate --ablation              # compare exponential / linear / learned / baseline
halflife train --results results.json     # train the decay MLP
halflife serve --port 8000                # start the FastAPI middleware
```
- Phase 1: Core decay engine + Redis metadata store
- Phase 2: Intent-aware fusion + historical inversion
- Phase 3: Learned decay MLP + benchmark harness
- Phase 3.5: Real-world Arxiv validation + Colab notebook
- Phase 4: Event-driven fact supersession
- Phase 5: Pinecone, Weaviate, Chroma integrations
- Phase 6: Transformer-based intent classifier (replace keyword matching)
MIT License. Contributions welcome β especially new decay functions, vector store integrations, and domain-specific benchmark datasets.
If you're using HalfLife in a project, open an issue and let us know β we'd like to link to it.