
HalfLife: Temporal-Aware Reranking (Even Without Timestamps)

RAG systems don't just ignore time. They don't even know what time it is.

Most real-world data (scraped docs, blogs, PDFs) lacks consistent temporal metadata. Consequently, standard vector search prioritizes well-cited, high-density results from 2018 over more relevant breakthroughs from 2024.

HalfLife fixes this as a 1-line drop-in layer.


🧨 The "Authority Trap" (Fixed)

Ask any standard RAG pipeline: "What is the best way to manage state in React today?"

  • ❌ Baseline (Vector-Only): #1 → Redux (2018). It matches the query perfectly but is 8 years out of date.
  • ✅ With HalfLife: #1 → React 19 / Zustand (2024). HalfLife automatically detects the "today" intent and re-ranks based on temporal relevance.

Run the Live Demo (No Timestamps!) ↓


🧩 Works Even Without Timestamps

Most rerankers assume you've already extracted and cleaned your metadata. HalfLife handles the messy reality:

  1. Automated Inference: Assigns chronological confidence by extracting years directly from raw unstructured text.
  2. Intent Classification: Detects if your query is Fresh (latest is best), Historical (older is best), or Static.
  3. Temporal Fusion: Re-weights relevance scores using decay functions, even if your vector store is "timestamp-blind."
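
The inference step can be sketched with a simple regex pass over the raw text. The `infer_year` function below is a hypothetical illustration of the idea, not HalfLife's actual API:

```python
import re
from collections import Counter

def infer_year(text, min_year=1990, max_year=2030):
    """Guess a chunk's year from plausible 4-digit years in its raw text."""
    years = [int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", text)]
    years = [y for y in years if min_year <= y <= max_year]
    if not years:
        return None  # chunk stays "timestamp-blind"; decay can fall back to neutral
    # Most frequent plausible year wins (ties break toward first occurrence)
    return Counter(years).most_common(1)[0][0]

print(infer_year("LoRA (2021) is revisited in this 2024 survey of 2024 methods."))  # → 2024
```

A confidence score could then be derived from how many candidate years agree, which is what "chronological confidence" suggests.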

🧪 Try on your own data (5 mins)

Validate HalfLife on your existing RAG results without changing your code:

  1. Export chunks: Save some nodes/payloads to data.json.
  2. Run the test script:
    python3 scripts/try_on_your_data.py --query "Best LLM today" --file data.json
  3. Compare: see how the baseline's #1 result differs from HalfLife's #1 result.
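
The exact schema `scripts/try_on_your_data.py` expects is not documented here; the quickstart's rerank output suggests chunks carry at least a `text` field and, when available, a `timestamp`. A minimal assumed export looks like:

```python
import json

# Assumed minimal schema: a list of chunks with "text" and (optionally)
# "timestamp" -- check scripts/try_on_your_data.py for the exact fields.
chunks = [
    {"text": "Redux remains the canonical state container for React apps.",
     "timestamp": "2018-05-01"},
    {"text": "React 19 plus Zustand covers most state needs with less boilerplate.",
     "timestamp": "2024-06-10"},
]

with open("data.json", "w") as f:
    json.dump(chunks, f, indent=2)
```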

🚀 Get Started

  1. Install via PyPI:
    pip install halflife-rag
  2. Integrate (LlamaIndex):
    from halflife.integrations.llamaindex import HalfLifePostprocessor
    
    # Plug into any LlamaIndex query engine
    query_engine = index.as_query_engine(
        node_postprocessors=[HalfLifePostprocessor(top_n=3)]
    )

📊 Real-World Performance

Evaluated on 120 real Arxiv papers (Parameter-Efficient Fine-Tuning) from 2019–2024:

| Query intent | Baseline avg result age | HalfLife avg result age | Δ |
|---|---|---|---|
| Fresh ("current state-of-the-art...") | 3.9 yr | 2.4 yr | −1.5 yr |
| Static ("explain how...") | 4.3 yr | 4.0 yr | −0.3 yr (stable) |
| Historical ("original paper...") | 5.0 yr | 5.2 yr | +0.2 yr (verified) |

For fresh queries, HalfLife surfaces results 1.5 years more recent on average. For static queries, behavior is unchanged: HalfLife is invisible when it doesn't need to act. Historical inversion is marginal on this corpus (2019–2024); deeper corpora show stronger results.

Reproduce this yourself β€” the Colab notebook fetches live Arxiv data with no setup:

Open In Colab


How It Works

HalfLife sits between your vector retriever and your LLM. It re-scores each retrieved chunk using a weighted fusion of three signals:

final_score = α · vector_score + β · temporal_score + γ · trust_score

The weights α, β, γ are set dynamically based on query intent:

| Detected intent | Example keywords | α (vector) | β (temporal) | γ (trust) |
|---|---|---|---|---|
| Fresh | "latest", "current", "today", "SOTA" | 0.3 | 0.6 | 0.1 |
| Historical | "originally", "history of", "early" | 0.4 | 0.5 | 0.1 |
| Static | "explain", "what is", "how does" | 0.8 | 0.1 | 0.1 |

For historical queries, the temporal signal is inverted: older chunks score higher.
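
Putting the intent table and the inversion rule together, the fusion step can be sketched as follows. This is an illustration of the formula above, not HalfLife's internal code; the weight values are the ones from the table:

```python
# (alpha, beta, gamma) per detected intent, taken from the table above
WEIGHTS = {
    "fresh":      (0.3, 0.6, 0.1),
    "historical": (0.4, 0.5, 0.1),
    "static":     (0.8, 0.1, 0.1),
}

def fuse(intent, vector_score, temporal_score, trust_score):
    """Weighted fusion of the three signals; all scores assumed in [0, 1]."""
    alpha, beta, gamma = WEIGHTS[intent]
    if intent == "historical":
        temporal_score = 1.0 - temporal_score  # invert: older chunks score higher
    return alpha * vector_score + beta * temporal_score + gamma * trust_score

# A fresh-intent query rewards a recent chunk even with a weaker vector match:
old = fuse("fresh", vector_score=0.95, temporal_score=0.2, trust_score=0.8)  # 0.485
new = fuse("fresh", vector_score=0.80, temporal_score=0.9, trust_score=0.5)  # 0.830
```

Under static intent the same two chunks would rank the other way round, since α dominates.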

User Query
    ↓
QueryIntentClassifier  →  sets α, β, γ
    ↓
Vector Retrieval (Qdrant / any store)
    ↓
HalfLife Reranker
    ├── Temporal decay per chunk  (exponential / piecewise / learned MLP)
    ├── Redis metadata cache      (< 1ms overhead on cached chunks)
    └── Min-Max fusion
    ↓
Re-ranked chunks  →  LLM

Quickstart

No infrastructure (2 minutes)

pip install halflife-rag
from halflife import HalfLife

hl = HalfLife()

# Drop into your existing retrieval results
results = qdrant.search(query=query)
reranked = hl.rerank(query=query, chunks=results, top_k=5)

for chunk in reranked:
    print(f"[{chunk['final_score']:.3f}] ({chunk['timestamp'][:4]}) {chunk['text'][:80]}")

With Docker (full feature set including Redis cache)

git clone https://github.com/amaydixit11/halflife.git
cd halflife
docker-compose up -d   # starts Qdrant + Redis
pip install -e .
halflife demo          # runs the adversarial demo

LangChain Integration

from langchain.retrievers import ContextualCompressionRetriever
from halflife.integrations.langchain import HalfLifeReranker

retriever = ContextualCompressionRetriever(
    base_compressor=HalfLifeReranker(top_k=5),
    base_retriever=your_existing_retriever   # unchanged
)

docs = retriever.get_relevant_documents("Latest approach to LLM fine-tuning?")
# → surfaces 2024 papers instead of 2020 papers

LlamaIndex Integration

from halflife.integrations.llamaindex import HalfLifePostprocessor

query_engine = index.as_query_engine(
    similarity_top_k=20,
    node_postprocessors=[HalfLifePostprocessor(top_n=5)]
)

response = query_engine.query("What is the latest React version?")

Decay Strategies

Three decay functions are available, selectable per document at ingestion time:

| Strategy | Formula | Best for |
|---|---|---|
| Exponential | e^(−λΔt) | News, fast-moving fields, software versions |
| Piecewise | Step function (1.0 → 0.7 → 0.3) | Documentation, compliance, versioned specs |
| Learned | MLP-predicted λ from doc features | Mixed corpora with feedback signal |

The learned decay MLP runs at ingestion time only: zero ML inference at query time.
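
The first two strategies can be sketched directly; note the age thresholds in the piecewise version are assumptions, since the README only specifies the step values 1.0 → 0.7 → 0.3:

```python
import math

def exponential_decay(age_years, lam=0.3):
    """e^(-lambda * dt): smooth decay, suited to fast-moving topics."""
    return math.exp(-lam * age_years)

def piecewise_decay(age_years, thresholds=(1.0, 3.0), levels=(1.0, 0.7, 0.3)):
    """Step decay: full weight below the first threshold, then 0.7, then 0.3.
    The year thresholds here are illustrative, not taken from HalfLife."""
    for cutoff, level in zip(thresholds, levels):
        if age_years < cutoff:
            return level
    return levels[-1]
```

A per-document choice at ingestion time means the reranker only looks up a precomputed λ (or step schedule) per chunk at query time.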


Experimental Features

  • Learned Decay MLP: pure NumPy; predicts per-chunk λ from doc type, source domain, text length, and feedback ratio. Train on your own benchmark results with halflife train.
  • Feedback Loop: adaptive λ tuning via an EMA of user "was this useful?" signals.
  • Event Bus: hard/soft invalidation when a fact is superseded (e.g. a retraction or a product discontinuation).
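
The feedback-loop idea can be sketched as an exponential moving average over binary usefulness votes that nudges λ. Both function names and the adjustment rule below are assumptions for illustration, not HalfLife's API:

```python
def ema_update(ema, useful, alpha=0.1):
    """Exponential moving average of 'was this useful?' votes (1 or 0)."""
    return (1 - alpha) * ema + alpha * (1.0 if useful else 0.0)

def adjust_lambda(lam, usefulness_ema, target=0.5, step=0.05):
    """If served results rate below target, decay faster (favor fresher chunks);
    if above target, decay slower. The rule is illustrative."""
    if usefulness_ema < target:
        return lam + step
    return max(0.0, lam - step)
```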

CLI

halflife demo                              # adversarial demo (no Docker)
halflife quickstart                        # end-to-end ingest → query → rerank
halflife benchmark --output results.json   # nDCG / MRR / temporal freshness
halflife evaluate --ablation               # compare exponential / linear / learned / baseline
halflife train --results results.json      # train the decay MLP
halflife serve --port 8000                 # start the FastAPI middleware

Roadmap

  • Phase 1: Core decay engine + Redis metadata store
  • Phase 2: Intent-aware fusion + historical inversion
  • Phase 3: Learned decay MLP + benchmark harness
  • Phase 3.5: Real-world Arxiv validation + Colab notebook
  • Phase 4: Event-driven fact supersession
  • Phase 5: Pinecone, Weaviate, Chroma integrations
  • Phase 6: Transformer-based intent classifier (replace keyword matching)

License & Contributing

MIT License. Contributions welcome, especially new decay functions, vector store integrations, and domain-specific benchmark datasets.

If you're using HalfLife in a project, open an issue and let us know; we'd like to link to it.

About

A middleware layer for retrieval systems that applies temporal decay functions over retrieved chunks using explicit half-life modeling. It fuses vector similarity with time-dependent relevance scoring to enable adaptive, chunk-level reranking in retrieval-augmented pipelines.
