This is an example implementation of running multiple AI models using Temporal activities and workers. The page uses WebSockets to communicate updates between our Flask server and the client browser. By leveraging Temporal we get automatic job queuing and execution, in addition to retries, timeout management, question status tracking, and more.
The application supports two AI models:
- SmolLM3-3B: Local Hugging Face model for fast, lightweight inference
- gpt-oss-20b: Via Ollama integration for alternative model responses
┌───────────────────┐    ┌──────────────────┐    ┌────────────────────────┐
│     Flask App     │    │     Temporal     │    │         Worker         │
│                   │◄──►│      Server      │◄──►│      (AI Models)       │
│                   │    │                  │    │                        │
│ • Web Interface   │    │ • Workflow Queue │    │ • SmolLM3-3B           │
│ • Model Selection │    │ • Job Management │    │ • gpt-oss-20b (Ollama) │
│ • WebSockets      │    │ • Retry Logic    │    │ • Inference            │
│ • Status Updates  │    │                  │    │ • Model Routing        │
└───────────────────┘    └──────────────────┘    └────────────────────────┘
- This project has three major components that communicate with each other.
- The Flask app serves our frontend with model selection, pushes updates to the browser using WebSockets, and creates new workflows to be coordinated by our Temporal server (a sketch of starting a workflow follows this list).
- The Temporal server assigns work to our worker, manages the queue, handles retries, and more.
- Our worker executes the selected AI model (SmolLM3-3B via transformers or gpt-oss-20b via Ollama) and returns the results of our questions.
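To make that flow concrete, here is a minimal, hypothetical sketch of how the Flask side could hand a question to Temporal using the Python SDK. The workflow class name, task queue name, and argument shape are illustrative assumptions, not necessarily what app.py uses.

```python
# Hypothetical sketch only -- workflow name, task queue, and arguments are assumptions.
import asyncio
import uuid

from temporalio.client import Client

from workflows import AskQuestionWorkflow  # assumed workflow class name


async def submit_question(question: str, model: str) -> str:
    # Connect to the local dev server started with `temporal server start-dev`.
    client = await Client.connect("localhost:7233")
    handle = await client.start_workflow(
        AskQuestionWorkflow.run,
        args=[question, model],
        id=f"question-{uuid.uuid4()}",
        task_queue="ai-questions",  # must match the worker's task queue
    )
    # Wait for the worker to finish inference and return the answer.
    return await handle.result()


# From a synchronous Flask route you might bridge into asyncio, e.g.:
# answer = asyncio.run(submit_question("What is Temporal?", "smollm3"))
```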
The specific files include:
- Flask Application (app.py): Web server handling HTTP requests and WebSocket connections.
- Temporal Workflows (workflows.py): Defines the workflow structure with retry policies.
- Activities (activities.py): Handles the actual AI model inference logic, with routing between SmolLM3-3B and gpt-oss-20b (see the sketch after this list).
- Worker (run_worker.py): Temporal worker that processes queued workflows.
- Frontend (templates/, static/): Leverages the Orbit CSS framework to create the circular layout, plus client-side JavaScript that communicates with the Flask server over WebSockets to update the page as information is returned from our worker.
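For orientation, here is a minimal sketch of how workflows.py and activities.py could fit together with the Temporal Python SDK. The class and function names, retry settings, token limits, and model IDs below are assumptions for illustration, not a copy of the repository's code.

```python
# Hypothetical sketch -- names, retry settings, and model IDs are assumptions.
from datetime import timedelta

from temporalio import activity, workflow
from temporalio.common import RetryPolicy


@activity.defn
async def ask_model(question: str, model: str) -> str:
    """Route the question to the selected model and return the generated text."""
    if model == "gpt-oss-20b":
        import ollama  # Ollama Python client; gpt-oss:20b must already be pulled
        resp = ollama.chat(
            model="gpt-oss:20b",
            messages=[{"role": "user", "content": question}],
        )
        return resp["message"]["content"]
    # Default: local SmolLM3-3B via transformers (in practice the pipeline would be cached).
    from transformers import pipeline
    generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM3-3B")
    return generator(question, max_new_tokens=256)[0]["generated_text"]


@workflow.defn
class AskQuestionWorkflow:
    @workflow.run
    async def run(self, question: str, model: str) -> str:
        # Temporal retries and times out the activity according to this policy.
        return await workflow.execute_activity(
            ask_model,
            args=[question, model],
            start_to_close_timeout=timedelta(minutes=5),
            retry_policy=RetryPolicy(maximum_attempts=3),
        )
```

Keeping inference inside an activity rather than the workflow itself is what lets Temporal retry or time out a stuck generation without losing the question.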
To deploy this project you'll need the following installed on your system:
- Temporal CLI
- Python 3
- Ollama (for gpt-oss-20b model support)
Given that we're downloading a Hugging Face model and executing it locally, the better your CPU and memory, the faster it'll perform. On my M4, each question takes about 10 seconds to generate a response.
- Clone this repository.
- Create a virtual environment and install the dependencies from requirements.txt.
- This project was built using Python 3.13.5, so default to that version if you encounter any issues.
- Install and configure Ollama for gpt-oss-20b support:
- Install Ollama from ollama.ai
- Pull the gpt-oss model:
ollama pull gpt-oss:20b
This is a roughly 13 GB model, so expect the download to take a while.
- Start a local Temporal server.
temporal server start-dev
- Create at least one Temporal worker (a sketch of what run_worker.py sets up follows these steps).
python run_worker.py
- Start the Flask server.
python app.py
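For reference, a worker script along these lines is the kind of thing python run_worker.py would execute; the imports, task queue name, and registered classes below are illustrative assumptions rather than the repository's exact code.

```python
# Hypothetical sketch of a worker entry point -- imports and names are assumptions.
import asyncio

from temporalio.client import Client
from temporalio.worker import Worker

from workflows import AskQuestionWorkflow   # assumed workflow class
from activities import ask_model            # assumed activity function


async def main() -> None:
    client = await Client.connect("localhost:7233")
    worker = Worker(
        client,
        task_queue="ai-questions",           # must match what the Flask app submits to
        workflows=[AskQuestionWorkflow],
        activities=[ask_model],
    )
    # Poll the task queue and execute workflows/activities until interrupted.
    await worker.run()


if __name__ == "__main__":
    asyncio.run(main())
```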
This will start up a Flask app on port 5000 with a UI for asking the AI models questions. The application supports two models: SmolLM3-3B for fast local inference and gpt-oss-20b via Ollama for more powerful responses. You can select between models using the dropdown in the interface. The first time you use each model, there will be some initial setup time for downloading and caching it.
You're able to ask as many questions as you'd like and can navigate to the Temporal UI running on port 8233 to view the status of the queue. You can navigate to a specific task queue by clicking on the question in our Flask UI. From the Temporal UI, you can restart, terminate, or simply view the status of any specific workflow.


