Enterprise AI includes XIM (Xeon Inference Microservice) and a scalable cloud-native framework, which is part of OPEA (Open Platform for Enterprise AI).
Xeon Inference Microservice (XIM) is a scalable, stateless container service exposing standard RESTful APIs. It uses Intel accelerators to optimize the inference engine and customized models for AIGC workloads.
| Layer name | Description |
|---|---|
| Accelerators | A XIM can be optimized with Intel accelerators such as AMX, VNNI, and AVX-512 |
| Optimized Engine | Intel provides several engines for different purposes, such as oneAPI, xFT, and IPEX |
| Models | A model can be converted to the xFT format with different quantization schemes, such as BF16, INT8, and FP4 |
| Microservices | Container services with a stateless design to support scalable orchestration |
| API | LangChain/LlamaIndex and existing vendors like OpenAI provide industry-standard RESTful APIs to expose the service |
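The API layer above follows the OpenAI-style RESTful convention. As a minimal client sketch (the `/v1/chat/completions` path follows the OpenAI API convention; the base URL, port, and model name below are illustrative assumptions, not values documented here):

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def post_chat(base_url: str, payload: dict, timeout: float = 10.0) -> dict:
    """POST the payload to an OpenAI-compatible endpoint and return the JSON reply."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

payload = build_chat_request("vicuna-7b-v1.3", "What is AMX?")
# post_chat("http://localhost:8000", payload)  # requires a running XIM endpoint
```

Because the service is stateless, any replica behind a load balancer can answer this request.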
Please refer here for more details.
For more business pipelines, please refer to OPEA's GenAIExamples.
| Name | Description | Registry |
|---|---|---|
| ASR (whisper) | Automatic Speech Recognition | registry.cn-hangzhou.aliyuncs.com/kenplusplus/whisper-server |
| ASR + Diarize (whisperx) | Speech Recognition + Speaker Recognition | registry.cn-hangzhou.aliyuncs.com/kenplusplus/whisperx-server |
| ASR (fast-whisper) | Accelerated ASR | registry.cn-hangzhou.aliyuncs.com/kenplusplus/faster-whisper-server |
| FastChat | AMX-optimized, IPEX-based LLM serving | registry.cn-hangzhou.aliyuncs.com/kenplusplus/fastchat-server |
| TTS (OpenVoice) | Text to Speech | registry.cn-hangzhou.aliyuncs.com/kenplusplus/openvoice-server |
| TTS (OpenTTS) | Text to Speech | registry.cn-hangzhou.aliyuncs.com/kenplusplus/opentts-server |
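Since each microservice is a stateless HTTP container, an orchestrator can probe a replica before routing traffic to it. A minimal liveness-check sketch (the host and port below are placeholders; the actual listening ports of these images are not documented here):

```python
import urllib.error
import urllib.request

def is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the service answers an HTTP request at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status < 500
    except (urllib.error.URLError, OSError):
        return False

# Example: probe a locally running whisper-server replica (placeholder port).
# is_up("http://localhost:8080")
```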
The following models are used:
| Name | Size | Micro Services | Description |
|---|---|---|---|
| THUDM/chatglm2-6b | 12G | FastChat | LLM model |
| Trelis/Llama-2-7b-chat-hf-shared-bf16 | 25G | FastChat | LLM model using BF16 for AMX |
| lmsys/vicuna-7b-v1.3 | 13.5G | FastChat | LLM model using INT8 for VNNI |
| Systran/faster-whisper-tiny | 75M | faster-whisper | Speech Recognition model |
| pyannote/speaker-diarization-3.1 | 14M | whisperx-server | Speaker diarization |
| pyannote/segmentation-3.0 | 5.8M | whisperx-server | Speech segmentation |
| jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn | 2.4G | whisperx-server | Chinese speech-to-vector model |
| pyannote/wespeaker-voxceleb-resnet34-LM | 51M | whisperx-server | Speaker embedding extraction |
| silero-vad | 17M | openvoice-server | Voice Activity Detector |
| whisper(small) | 244M | whisper-server | OpenAI whisper model |
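The table lists BF16 and INT8 model variants. As an illustration of what symmetric per-tensor INT8 quantization (the scheme commonly paired with VNNI) does to a weight tensor, here is a minimal numpy sketch; it is conceptual, not the actual xFT conversion code:

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor INT8 quantization: map [-max|w|, max|w|] onto [-127, 127]."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the INT8 values."""
    return q.astype(np.float32) * scale

w = np.array([0.05, -1.27, 0.64], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)  # close to w, within one quantization step
```

INT8 halves the memory footprint again relative to BF16, at the cost of a bounded per-weight rounding error of at most `scale / 2`.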
TBD



