Add Jan provider support #932

Merged
merged 2 commits into from Dec 11, 2024
7 changes: 5 additions & 2 deletions docs/src/content/docs/getting-started/configuration.mdx
@@ -1184,58 +1184,61 @@

Follow [this guide](https://huggingface.co/blog/yagilb/lms-hf) to load Hugging Face models into LMStudio.

## Jan

The `jan` provider connects to the [Jan](https://jan.ai/) local server.
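Based on the connection settings added in this PR (default base URL `http://localhost:1337/v1`, overridable through a `JAN_API_BASE` environment variable), a minimal sketch of targeting a Jan-served model from a script looks like this; the model id mirrors the `jan:llama3.2-3b-instruct` example added to `ModelType` in this PR:

```js
script({
    // "jan" is the provider id added in this PR; the model segment should name
    // a model loaded in the local Jan server
    model: "jan:llama3.2-3b-instruct",
})

$`Summarize the project README in one paragraph.`
```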

## LocalAI

[LocalAI](https://localai.io/) acts as a drop-in replacement REST API that’s compatible
with the OpenAI API specification for local inferencing. It uses free, open-source models
and runs on CPUs.

Because LocalAI stands in for OpenAI, it relies on a [model name mapping](https://localai.io/basics/container/#all-in-one-images)
inside the container; for example, `gpt-4` is mapped to `phi-2`.

<Steps>

<ol>

<li>

Install Docker. See the [LocalAI documentation](https://localai.io/basics/getting_started/#prerequisites) for more information.

</li>

<li>

Update the `.env` file and set the api type to `localai`.

```txt title=".env" "localai"
OPENAI_API_TYPE=localai
```

</li>

</ol>

</Steps>

To start LocalAI in Docker, run the following commands:

```sh
# pull and run the all-in-one CPU image, exposing the API on port 8080
docker run -p 8080:8080 --name local-ai -ti localai/localai:latest-aio-cpu
# restart the named container on later runs
docker start local-ai
# confirm the container is up and watch its resource usage
docker stats
echo "LocalAI is running at http://127.0.0.1:8080"
```
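As a sketch of how the mapping above plays out, assuming the all-in-one image and `OPENAI_API_TYPE=localai` from the steps above, a script can keep addressing the OpenAI model name while the container answers with the mapped local model:

```js
script({
    // with OPENAI_API_TYPE=localai, "gpt-4" is served by whatever model the
    // container maps it to (phi-2 in the all-in-one image)
    model: "openai:gpt-4",
})

$`Say hello from LocalAI.`
```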

## Llamafile

[https://llamafile.ai/](https://llamafile.ai/) is a single-file desktop application
that lets you run an LLM locally.

The provider is `llamafile` and the model name is ignored.
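A minimal sketch, assuming a llamafile is already running locally; the model segment below is a placeholder, since the model name is ignored by this provider:

```js
script({
    // any value works after the colon; the llamafile provider ignores it
    model: "llamafile:default",
})

$`Which model are you running?`
```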

## Jan, LLaMA.cpp
## LLaMA.cpp

Check warning on line 1240 in docs/src/content/docs/getting-started/configuration.mdx (GitHub Actions / build):

The section on Jan is out of place and does not fit with the flow of the document. It should be moved to a more appropriate location or removed if it is not necessary.

generated by pr-docs-review-commit structure


[Jan](https://jan.ai/),
[LLaMA.cpp](https://github.com/ggerganov/llama.cpp/tree/master/examples/server)
also allow running models locally or interfacing with other LLM vendors.

18 changes: 11 additions & 7 deletions packages/cli/src/nodehost.ts
@@ -96,13 +96,16 @@ class ModelManager implements ModelService {
): Promise<ResponseStatus> {
const { trace } = options || {}
const { provider, model } = parseModelIdentifier(modelid)
if (provider === MODEL_PROVIDER_OLLAMA) {
if (this.pulled.includes(modelid)) return { ok: true }
if (this.pulled.includes(modelid)) return { ok: true }

if (!isQuiet) logVerbose(`ollama pull ${model}`)
if (provider === MODEL_PROVIDER_OLLAMA) {
logVerbose(`${provider}: pull ${model}`)
try {
const conn = await this.getModelToken(modelid)
const res = await fetch(`${conn.base}/api/pull`, {

let res: Response
// OLLAMA
res = await fetch(`${conn.base}/api/pull`, {
method: "POST",
headers: {
"User-Agent": TOOL_ID,
@@ -115,13 +118,14 @@ class ModelManager implements ModelService {
),
})
if (res.ok) {
const resp = await res.json()
const resj = await res.json()
//logVerbose(JSON.stringify(resj, null, 2))
}
if (res.ok) this.pulled.push(modelid)
return { ok: res.ok, status: res.status }
} catch (e) {
logError(`failed to pull model ${model}`)
trace?.error("pull model failed", e)
logError(`${provider}: failed to pull model ${model}`)
trace?.error(`${provider}: pull model failed`, e)
return { ok: false, status: 500, error: serializeError(e) }
}
}
16 changes: 15 additions & 1 deletion packages/core/src/connection.ts
@@ -32,8 +32,9 @@ import {
MISTRAL_API_BASE,
MODEL_PROVIDER_LMSTUDIO,
LMSTUDIO_API_BASE,
MODEL_PROVIDER_JAN,
JAN_API_BASE,
} from "./constants"
import { fileExists, readText, writeText } from "./fs"
import {
OpenAIAPIType,
host,
@@ -481,6 +482,19 @@ export async function parseTokenFromEnv(
}
}

if (provider === MODEL_PROVIDER_JAN) {
const base = findEnvVar(env, "JAN", BASE_SUFFIX)?.value || JAN_API_BASE
if (!URL.canParse(base)) throw new Error(`${base} must be a valid URL`)
return {
provider,
model,
base,
token: "lmstudio",
type: "openai",
source: "env: JAN_API_...",
}
}

if (provider === MODEL_PROVIDER_TRANSFORMERS) {
return {
provider,
9 changes: 9 additions & 0 deletions packages/core/src/constants.ts
@@ -154,6 +154,7 @@ export const ALIBABA_BASE =
"https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
export const MISTRAL_API_BASE = "https://api.mistral.ai/v1"
export const LMSTUDIO_API_BASE = "http://localhost:1234/v1"
export const JAN_API_BASE = "http://localhost:1337/v1"

export const PROMPTFOO_CACHE_PATH = ".genaiscript/cache/tests"
export const PROMPTFOO_CONFIG_DIR = ".genaiscript/config/tests"
@@ -184,6 +185,7 @@ export const MODEL_PROVIDER_TRANSFORMERS = "transformers"
export const MODEL_PROVIDER_ALIBABA = "alibaba"
export const MODEL_PROVIDER_MISTRAL = "mistral"
export const MODEL_PROVIDER_LMSTUDIO = "lmstudio"
export const MODEL_PROVIDER_JAN = "jan"

export const TRACE_FILE_PREVIEW_MAX_LENGTH = 240

@@ -210,6 +212,8 @@ export const DOCS_CONFIGURATION_OLLAMA_URL =
"https://microsoft.github.io/genaiscript/getting-started/configuration/#ollama"
export const DOCS_CONFIGURATION_LMSTUDIO_URL =
"https://microsoft.github.io/genaiscript/getting-started/configuration/#lmstudio"
export const DOCS_CONFIGURATION_JAN_URL =
"https://microsoft.github.io/genaiscript/getting-started/configuration/#jan"
export const DOCS_CONFIGURATION_LLAMAFILE_URL =
"https://microsoft.github.io/genaiscript/getting-started/configuration/#llamafile"
export const DOCS_CONFIGURATION_LITELLM_URL =
@@ -311,6 +315,11 @@ export const MODEL_PROVIDERS: readonly {
detail: "LM Studio local server",
url: DOCS_CONFIGURATION_LMSTUDIO_URL,
},
{
id: MODEL_PROVIDER_JAN,
detail: "Jan local server",
url: DOCS_CONFIGURATION_JAN_URL,
},
{
id: MODEL_PROVIDER_ALIBABA,
detail: "Alibaba models",
10 changes: 0 additions & 10 deletions packages/core/src/promptdom.ts
@@ -17,22 +17,12 @@ import { YAMLStringify } from "./yaml"
import {
DEFAULT_FENCE_FORMAT,
MARKDOWN_PROMPT_FENCE,
MODEL_PROVIDER_ALIBABA,
MODEL_PROVIDER_ANTHROPIC,
MODEL_PROVIDER_AZURE_OPENAI,
MODEL_PROVIDER_AZURE_SERVERLESS_MODELS,
MODEL_PROVIDER_AZURE_SERVERLESS_OPENAI,
MODEL_PROVIDER_LLAMAFILE,
MODEL_PROVIDER_LMSTUDIO,
MODEL_PROVIDER_OLLAMA,
MODEL_PROVIDER_OPENAI,
PROMPT_FENCE,
PROMPTY_REGEX,
SANITIZED_PROMPT_INJECTION,
TEMPLATE_ARG_DATA_SLICE_SAMPLE,
TEMPLATE_ARG_FILE_MAX_TOKENS,
} from "./constants"
import { parseModelIdentifier } from "./models"
import {
appendAssistantMessage,
appendSystemMessage,
1 change: 1 addition & 0 deletions packages/core/src/types/prompt_template.d.ts
@@ -150,6 +150,7 @@ type ModelType = OptionsOrString<
| "anthropic:claude-2.1"
| "anthropic:claude-instant-1.2"
| "huggingface:microsoft/Phi-3-mini-4k-instruct"
| "jan:llama3.2-3b-instruct"
| "google:gemini-1.5-flash"
| "google:gemini-1.5-flash-latest"
| "google:gemini-1.5-flash-8b"
148 changes: 0 additions & 148 deletions packages/vscode/src/lmaccess.ts
@@ -1,141 +1,13 @@
/* eslint-disable @typescript-eslint/naming-convention */
import * as vscode from "vscode"
import { ExtensionState } from "./state"
import {
MODEL_PROVIDER_OLLAMA,
MODEL_PROVIDER_LLAMAFILE,
MODEL_PROVIDER_AICI,
MODEL_PROVIDER_AZURE_OPENAI,
MODEL_PROVIDER_LITELLM,
MODEL_PROVIDER_OPENAI,
MODEL_PROVIDER_CLIENT,
MODEL_PROVIDER_GITHUB,
TOOL_NAME,
MODEL_PROVIDER_AZURE_SERVERLESS_MODELS,
MODEL_PROVIDER_AZURE_SERVERLESS_OPENAI,
DOCS_CONFIGURATION_URL,
MODEL_PROVIDER_GOOGLE,
MODEL_PROVIDER_ALIBABA,
MODEL_PROVIDER_LMSTUDIO,
} from "../../core/src/constants"
import { OpenAIAPIType } from "../../core/src/host"
import { parseModelIdentifier } from "../../core/src/models"
import { ChatCompletionMessageParam } from "../../core/src/chattypes"
import { LanguageModelChatRequest } from "../../core/src/server/client"
import { ChatStart } from "../../core/src/server/messages"
import { serializeError } from "../../core/src/error"
import { logVerbose } from "../../core/src/util"
import { renderMessageContent } from "../../core/src/chatrender"

async function generateLanguageModelConfiguration(
state: ExtensionState,
modelId: string
) {
const { provider } = parseModelIdentifier(modelId)
const supportedProviders = [
MODEL_PROVIDER_OLLAMA,
MODEL_PROVIDER_LLAMAFILE,
MODEL_PROVIDER_AICI,
MODEL_PROVIDER_AZURE_OPENAI,
MODEL_PROVIDER_AZURE_SERVERLESS_OPENAI,
MODEL_PROVIDER_AZURE_SERVERLESS_MODELS,
MODEL_PROVIDER_LITELLM,
MODEL_PROVIDER_LMSTUDIO,
MODEL_PROVIDER_GOOGLE,
MODEL_PROVIDER_ALIBABA,
]
if (supportedProviders.includes(provider)) {
return { provider }
}

const languageChatModels = await state.languageChatModels()
if (Object.keys(languageChatModels).length)
return { provider: MODEL_PROVIDER_CLIENT, model: "*" }

const items: (vscode.QuickPickItem & {
model?: string
provider?: string
apiType?: OpenAIAPIType
})[] = []
if (isLanguageModelsAvailable()) {
const models = await vscode.lm.selectChatModels()
if (models.length)
items.push({
label: "Visual Studio Language Chat Models",
detail: `Use a registered LLM such as GitHub Copilot.`,
model: "*",
provider: MODEL_PROVIDER_CLIENT,
})
}
items.push(
{
label: "OpenAI",
detail: `Use a personal OpenAI subscription.`,
provider: MODEL_PROVIDER_OPENAI,
},
{
label: "Azure OpenAI",
detail: `Use a Azure-hosted OpenAI subscription.`,
provider: MODEL_PROVIDER_AZURE_OPENAI,
apiType: "azure",
},
{
label: "Azure AI OpenAI (serverless deployment)",
detail: `Use a Azure OpenAI serverless model deployment through Azure AI Studio.`,
provider: MODEL_PROVIDER_AZURE_SERVERLESS_OPENAI,
apiType: "azure_serverless",
},
{
label: "Azure AI Models (serverless deployment)",
detail: `Use a Azure serverless model deployment through Azure AI Studio.`,
provider: MODEL_PROVIDER_AZURE_SERVERLESS_MODELS,
apiType: "azure_serverless_models",
},
{
label: "GitHub Models",
detail: `Use a GitHub Models with a GitHub subscription.`,
provider: MODEL_PROVIDER_GITHUB,
},
{
label: "Alibaba Cloud",
detail: "Use Alibaba Cloud models.",
provider: MODEL_PROVIDER_ALIBABA,
},
{
label: "LocalAI",
description: "https://localai.io/",
detail: "Use local LLMs instead OpenAI. Requires LocalAI and Docker.",
provider: MODEL_PROVIDER_OPENAI,
apiType: "localai",
},
{
label: "Ollama",
description: "https://ollama.com/",
detail: "Run a open source LLMs locally. Requires Ollama",
provider: MODEL_PROVIDER_OLLAMA,
},
{
label: "AICI",
description: "http://github.com/microsoft/aici",
detail: "Generate AICI javascript prompts.",
provider: MODEL_PROVIDER_AICI,
}
)

const res: { model?: string; provider?: string; apiType?: OpenAIAPIType } =
await vscode.window.showQuickPick<
vscode.QuickPickItem & {
model?: string
provider?: string
apiType?: OpenAIAPIType
}
>(items, {
title: `Configure a Language Model for ${modelId}`,
})

return res
}

async function pickChatModel(
state: ExtensionState,
model: string
@@ -165,26 +37,6 @@ async function pickChatModel(
return chatModel
}

export async function pickLanguageModel(
state: ExtensionState,
modelId: string
) {
const res = await generateLanguageModelConfiguration(state, modelId)
if (res === undefined) return undefined

if (res.model) return res.model
else {
const configure = "Configure..."
vscode.window.showWarningMessage(
`${TOOL_NAME} - model connection not configured.`,
configure
)
if (configure)
vscode.env.openExternal(vscode.Uri.parse(DOCS_CONFIGURATION_URL))
return undefined
}
}

export function isLanguageModelsAvailable() {
return (
typeof vscode.lm !== "undefined" &&