HELIX Genomic Sequencing Engine 🧬

HELIX is a research-grade, algorithmic pipeline and web dashboard designed for genomic sequencing, assembly, and analysis. It combines over 20 classical and advanced computer science algorithms to simulate the end-to-end process of reading, assembling, aligning, and analyzing DNA sequences.

Unlike basic sequence simulators, HELIX uses dynamic programming, backtracking, greedy algorithms, and graph theory techniques to handle complex biological simulations such as tumor aneuploidy, sex determination, and primer placement.

🌟 Key Features

1. Advanced Assembly & Alignment

Eulerian vs Hamiltonian Assembly: Demonstrates the performance shift from NP-Complete (Hamiltonian Path via Backtracking) to Linear Time $O(V+E)$ (Eulerian Path via De Bruijn Graphs) when assembling shredded k-mer reads.
Smith-Waterman Alignment: Uses dynamic programming to find optimal local alignments between reads and a reference, then flags mutations (Substitutions, Insertions, Deletions, Frameshifts) from the aligned differences.
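
The Eulerian approach above can be illustrated with a minimal sketch. This is not the code in assembly.py; the function name `de_bruijn_assemble` and the sample 3-mers are hypothetical. It builds a De Bruijn graph whose nodes are (k-1)-mers and walks an Eulerian path with Hierholzer's algorithm, which visits each edge once and so runs in $O(V+E)$.

```python
from collections import defaultdict


def de_bruijn_assemble(kmers):
    """Reconstruct a sequence from k-mers via an Eulerian path (Hierholzer)."""
    graph = defaultdict(list)   # (k-1)-mer prefix -> list of (k-1)-mer suffixes
    balance = defaultdict(int)  # out-degree minus in-degree for each node
    for kmer in kmers:
        prefix, suffix = kmer[:-1], kmer[1:]
        graph[prefix].append(suffix)
        balance[prefix] += 1
        balance[suffix] -= 1

    # An Eulerian path starts at the node with one extra outgoing edge.
    # If every node is balanced (Eulerian cycle), any node with edges works.
    start = next((n for n, b in balance.items() if b == 1), next(iter(graph)))

    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())         # dead end: node is finished
    path.reverse()

    # Spell the sequence: the first node, then the last base of each later node.
    return path[0] + "".join(node[-1] for node in path[1:])


reads = ["ATG", "TGC", "GCA", "CAT", "ATT"]  # hypothetical 3-mers of "ATGCATT"
print(de_bruijn_assemble(reads))  # ATGCATT
```

The sketch assumes error-free k-mers that admit exactly one Eulerian path. Real read sets contain repeats and sequencing errors, which make the path ambiguous or break it.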

2. Biological Simulation & Intelligence

Tumor Aneuploidy Simulation: Uses k-Color Graph Coloring to phase haplotypes and determine copy number variations (e.g., detecting 3N or 4N karyotypes in simulated cancer cells).
Sex Determination Consensus: Aggregates three independent algorithms (Coverage ratio, SRY gene detection, and Heterozygosity rates) to confidently predict the biological sex of the sample.
Gene-Level Tracking: Tracks the status of critical clinical markers (e.g., TP53, BRCA1, KRAS), especially when cancer simulation is toggled.
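
One plausible reading of the k-coloring step, offered here as an assumption rather than a description of HELIX's actual code: reads that disagree at a variant site are joined by an edge in a conflict graph, and the smallest number of colors that properly colors the graph estimates how many distinct haplotypes are present. The names `k_color` and `min_colors` and the sample graph are hypothetical.

```python
def k_color(adjacency, k):
    """Return a node -> color mapping using at most k colors, or None."""
    nodes = sorted(adjacency)
    colors = {}

    def assign(i):
        if i == len(nodes):
            return True
        node = nodes[i]
        for color in range(k):
            # A color is usable only if no already-colored neighbor has it.
            if all(colors.get(nb) != color for nb in adjacency[node]):
                colors[node] = color
                if assign(i + 1):
                    return True
                del colors[node]  # backtrack
        return False

    return dict(colors) if assign(0) else None


def min_colors(adjacency):
    """Smallest k for which the conflict graph is k-colorable."""
    for k in range(1, len(adjacency) + 1):
        if k_color(adjacency, k) is not None:
            return k
    return 0


# Hypothetical conflict graph: r1, r2, r3 mutually disagree; r4 disagrees with r1.
conflicts = {
    "r1": {"r2", "r3", "r4"},
    "r2": {"r1", "r3"},
    "r3": {"r1", "r2"},
    "r4": {"r1"},
}
print(min_colors(conflicts))  # 3
```

Under this reading, a result of 3 would be consistent with three haplotype groups (a 3N signal). Backtracking coloring is exponential in the worst case, so the sketch only suits small graphs.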

3. Resource Optimization & Efficiency

Huffman Compression: Compresses standard A/T/G/C streams based on frequency, saving disk space dynamically (often achieving ~20-30% savings).
0/1 Knapsack Read Selection: Discards low-quality overlapping reads intelligently within a strict RAM budget, optimizing for quality vs memory footprint.
Job Sequencing with Deadlines: Prioritizes the sequencing of high-value clinical genes (like BRCA1) over intergenic "junk" regions using weighted profit metrics.
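
As an illustration of the read-selection idea, here is a minimal 0/1 knapsack sketch. It assumes each read has an integer memory cost and a quality score; the function name `select_reads`, the tuple layout, and the sample values are hypothetical and not taken from the HELIX source.

```python
def select_reads(reads, ram_budget):
    """reads: list of (read_id, memory_cost, quality) with integer costs.

    Returns (best_total_quality, chosen_read_ids).
    """
    n = len(reads)
    # best[i][b] = max total quality using the first i reads within budget b
    best = [[0] * (ram_budget + 1) for _ in range(n + 1)]
    for i, (_, cost, quality) in enumerate(reads, start=1):
        for b in range(ram_budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip this read
            if cost <= b:                # option 2: keep it if it fits
                best[i][b] = max(best[i][b], best[i - 1][b - cost] + quality)

    # Walk back through the table to recover which reads were kept.
    chosen, b = [], ram_budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            read_id, cost, _ = reads[i - 1]
            chosen.append(read_id)
            b -= cost
    chosen.reverse()
    return best[n][ram_budget], chosen


# Hypothetical reads: (id, memory cost in MB, quality score)
reads = [("r1", 4, 30), ("r2", 3, 20), ("r3", 2, 18), ("r4", 5, 35)]
print(select_reads(reads, ram_budget=7))  # (53, ['r3', 'r4'])
```

The table costs O(n × budget) time and memory, so the budget should be expressed in coarse units (for example MB rather than bytes).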

🛠️ Tech Stack

Backend Engine: Python 3, FastAPI, Uvicorn
Frontend Dashboard: React, TypeScript, Vite, Tailwind CSS
Design Pattern: Glassmorphism UI with real-time "Algorithm Explorer" and "Intelligence" visualizers.

🚀 Quick Start Guide

Prerequisites

Make sure you have Node.js (for the React frontend) and Python 3.8+ installed.

1. Start the Backend API

Navigate to the root directory and install the required dependencies (if you haven't already):

pip install fastapi uvicorn pydantic

Run the FastAPI server:

python api.py

(The server will start on http://localhost:8000)

2. Start the Frontend Dashboard

Open a new terminal window, navigate to the frontend folder, and run the Vite dev server:

cd frontend
npm install
npm run dev

(The frontend will be available at http://localhost:5173)

🧪 How to Use the Dashboard

Configure Parameters: On the left panel, select your target species (e.g., Human, Mouse, SARS-CoV-2) or input your own custom DNA sequence.
Adjust Simulation Specs: Tweak the genome length, sequencing coverage depth, and expected mutation count. Toggle "Simulate Cancer/Tumor" to observe aneuploidy and targeted gene mutations.
Launch the Pipeline: Click Launch HELIX Pipeline. The frontend will fetch data from the Python backend and instantly populate the results.
Explore the Tabs:
- Overview: View the reconstructed genome sequence and the sex-determination consensus result.
- Intelligence: View research-grade assembly metrics (N50, L50, GC Content), deep mutation analytics (SNPs vs INDELs), and human-readable genomic insights.
- Algorithm Explorer: Click on individual algorithms (like De Bruijn Graphs or Huffman Compression) to see exactly how your specific DNA data was mathematically processed.
- Comparisons: View the "Anytime Algorithm" charts, comparing Greedy, DP, and B&B time complexities.
- Advanced: View specific outputs for the Reliability DP, Job Sequencing algorithms, and N-Queens primer placement.
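
The assembly metrics named under the Intelligence tab have standard definitions, sketched below. N50 is the length of the contig at which the running total of contig lengths, taken longest first, reaches half the assembly size; L50 is how many contigs that takes; GC content is the fraction of G and C bases. The function names and sample values are hypothetical and do not reflect how HELIX computes these internally.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half:
            return length, count
    return 0, 0  # empty assembly


def gc_content(sequence):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


print(n50_l50([80, 70, 50, 40, 30, 20]))  # (70, 2)
print(gc_content("ATGC"))                 # 0.5
```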

🧠 Core Algorithms Implemented

| Category | Algorithm | Purpose in HELIX |
| --- | --- | --- |
| Graphs | De Bruijn Graph | Reconstructing DNA sequences via Eulerian paths. |
| Graphs | K-Coloring | Phasing haplotypes and estimating aneuploidy. |
| Dynamic Programming | Smith-Waterman | Optimal local sequence alignment. |
| Dynamic Programming | 0/1 Knapsack | Selecting the highest quality reads within a RAM budget. |
| Greedy | Huffman Coding | Compressing output DNA text efficiently. |
| Greedy | Job Sequencing | Prioritizing clinical cancer genes for processing. |
| Backtracking | N-Queens | Placing primers cleanly without overlapping repeat regions. |
| Backtracking | Hamiltonian Path | Used purely to demonstrate the inefficiency of NP-Complete approaches vs Eulerian paths. |
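
To make the Smith-Waterman entry concrete, here is a minimal sketch of local alignment with a linear gap penalty. The scoring values (match +2, mismatch -1, gap -2), the function name, and the sample sequences are assumptions chosen for illustration; alignment.py may use different scores or a different traceback.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table; scores are clamped at 0 so alignments can start anywhere.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub(i, j),  # match or substitution
                score[i - 1][j] + gap,            # gap in b (deletion)
                score[i][j - 1] + gap,            # gap in a (insertion)
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


print(smith_waterman("TTACGTAA", "GGACGTCC"))  # (8, 'ACGT', 'ACGT')
```

In the returned strings, a column with two different bases is a substitution and a column containing "-" is an insertion or deletion, which is the information a mutation flagger would read off the alignment.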

👨‍💻 Developer Notes

If you are modifying the Python engine (helix_main.py), note that api.py does not hot-reload by default. After changing any backend algorithm, stop the running server and start it again with python api.py so the frontend receives the new data.
