CogStack-NiFi

Introduction

This repository proposes a possible next step in the evolution of free-text data processing originally implemented in CogStack-Pipeline, moving towards a more modular, Platform-as-a-Service (PaaS) approach.

CogStack-NiFi demonstrates how to use Apache NiFi as the central data workflow engine for clinical document processing, integrating services such as text extraction and natural language processing (NLP). Each component runs as a standalone service, with NiFi handling data routing between components and data sources/sinks.

All NLP services are expected to implement a uniform RESTful API, so any NLP application that follows this API can be integrated into existing pipelines without changes to the rest of the stack.
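To illustrate why a uniform API makes services interchangeable, here is a minimal sketch in Python. It does not reproduce the project's actual API: the payload shape (`content.text` in the request, `result.annotations` in the response), the function names, and the stand-in `keyword_service` are all assumptions made for illustration. Consult the official documentation for the real contract.

```python
import json
from typing import Callable, Dict, List


def build_request(text: str) -> str:
    """Serialise a document into a (hypothetical) uniform request body."""
    return json.dumps({"content": {"text": text}})


def parse_response(body: str) -> List[Dict]:
    """Extract annotations from a (hypothetical) uniform response body."""
    return json.loads(body).get("result", {}).get("annotations", [])


def keyword_service(request_body: str) -> str:
    """Stand-in for an NLP service: marks the first mention of 'aspirin'.

    A real service would sit behind an HTTP endpoint; here it is a plain
    function so the sketch runs without any network access.
    """
    text = json.loads(request_body)["content"]["text"]
    annotations = []
    start = text.lower().find("aspirin")
    if start != -1:
        annotations.append(
            {"label": "DRUG", "start": start, "end": start + len("aspirin")}
        )
    return json.dumps({"result": {"text": text, "annotations": annotations}})


def annotate(service: Callable[[str], str], text: str) -> List[Dict]:
    """Call any service that honours the same request/response contract."""
    return parse_response(service(build_request(text)))


if __name__ == "__main__":
    print(annotate(keyword_service, "Patient takes aspirin daily."))
```

Because `annotate` depends only on the request and response shapes, a different NLP service can be substituted for `keyword_service` without touching the calling code, which is the property the uniform API is meant to provide.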


⚠️ Important Notice

This project is under active development. New features or services may impact existing deployments. Please review the release notes and documentation before upgrading.


💬 Asking Questions

Need help? Feel free to:


🗂️ Project

Folder      Description
nifi        Custom Apache NiFi Docker image with workflows, configs, drivers, and user resources.
security    Scripts for generating SSL certificates and other security-related tools.
services    NLP and auxiliary services, each with its own configs and resources.
deploy      Example deployment setup, combining NiFi and related services.
scripts     Helper scripts (e.g., setup tools, sample DB ingestion, Elasticsearch ingestion).
data        Place any test or ingested data here.

📚 Documentation & Getting Started

Prerequisites:

  • Docker (mandatory)
  • Basic knowledge of Python and Linux/UNIX systems

📖 Official documentation: cogstack-nifi.readthedocs.io

🚀 New to the project? Start with the deployment guide for example setups and workflows.

🐞 For troubleshooting or bug reports, consult the Known Issues section before opening a ticket.


🛑 Important Updates

Check the IMPORTANT_NEWS section regularly for:

  • Major changes to project structure or configuration
  • Security advisories or vulnerabilities affecting deployments