nicolay-r/bulk-chain

bulk-chain 1.0.1

Third-party providers hosting↗️
👉demo👈

A no-strings-attached framework for your LLM that applies Chain-of-Thought-like prompt schemas to massive text collections using custom third-party providers ↗️.

Main Features

  • No-strings: you're free of LLM dependencies and can flexibly customize your venv.
  • Supports schema descriptions for the Chain-of-Thought concept.
  • Provides an iterator over an infinite number of input contexts.

Installation

From PyPI:

pip install --no-deps bulk-chain

or latest version from here:

pip install git+https://github.com/nicolay-r/bulk-chain@master

Chain-of-Thought Schema

To declare a Chain-of-Thought (CoT) schema, this project uses the JSON format. The format has a name field for declaring the schema name and a schema field that holds a list of CoT instructions for the Large Language Model.

Each step is a dictionary with prompt and out keys that correspond to the input prompt and the output variable name, respectively. All variable names are expected to be written in {}.

Below is an example of how to declare your own schema:

{
"name": "schema-name",
"schema": [
    {"prompt": "Given the question '{text}', let's think step-by-step.", 
     "out": "steps"},
    {"prompt": "For the question '{text}' the reasoning steps are '{steps}'. What would be an answer?",
     "out": "answer"}
]
}
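
To illustrate how the steps of such a schema feed into one another, here is a minimal sketch in plain Python. It does not use bulk-chain itself: `run_schema` and `fake_llm` are hypothetical names introduced for illustration, and the library may resolve variables differently.

```python
import json

SCHEMA_JSON = """
{
  "name": "schema-name",
  "schema": [
    {"prompt": "Given the question '{text}', let's think step-by-step.",
     "out": "steps"},
    {"prompt": "For the question '{text}' the reasoning steps are '{steps}'. What would be an answer?",
     "out": "answer"}
  ]
}
"""


def fake_llm(prompt):
    # Hypothetical stand-in for a real model call.
    return f"<response to a prompt of {len(prompt)} characters>"


def run_schema(schema, record, ask):
    # Copy the input so the caller's dictionary is not mutated.
    state = dict(record)
    for step in schema["schema"]:
        # Fill the {variables} from the input and from earlier step outputs.
        prompt = step["prompt"].format(**state)
        # Store the response under the name given by "out".
        state[step["out"]] = ask(prompt)
    return state


schema = json.loads(SCHEMA_JSON)
result = run_schema(schema, {"text": "Why is the sky blue?"}, fake_llm)
print(sorted(result))
```

Each step's output becomes a variable that later prompts can reference, which is why `{steps}` is available to the second prompt.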

Usage

🤖 Prepare

  1. schema
  2. LLM model from the Third-party providers hosting↗️.
  3. Data (iter of dictionaries)
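
As a sketch of item 3, the data can be any iterator that yields dictionaries whose keys match the variables used in the schema prompts (here `text`, as in the schema example above). The helper name `iter_records` and the sample questions are illustrative assumptions, not part of the library.

```python
def iter_records(questions):
    # Lazily yield one dictionary per input; the keys must match the
    # {variables} referenced in the schema prompts.
    for question in questions:
        yield {"text": question}


YOUR_DATA_IT = iter_records(["What is 2 + 2?", "Why is the sky blue?"])
first = next(YOUR_DATA_IT)
print(first)
```

Because a generator produces records on demand, the same pattern works for inputs read line by line from a large file.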

🚀 Launch

API: For more details see the related Wiki page

from bulk_chain.core.utils import dynamic_init
from bulk_chain.api import iter_content

content_it = iter_content(
    # 1. Your schema.              
    schema="YOUR_SCHEMA.json",
    # 2. Your third-party model implementation.
    llm=dynamic_init(class_filepath="replicate_104.py", class_name="Replicate")(api_token="<API-KEY>"),
    # 3. Your iterator of dictionaries
    input_dicts_it=YOUR_DATA_IT)
    
for content in content_it:
    # Handle your LLM responses here ...
    print(content)

Embed your LLM

All you have to do is implement the BaseLM class, which includes:

  • __init__ -- for setting up batching mode support and (optionally) the model name;
  • ask(prompt) -- run inference on your model with the given prompt.
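
The shape of such a class might look like the sketch below. To keep it runnable without the package installed, it defines its own stand-in `BaseLM`. The stand-in's constructor signature and the `EchoLM` class are assumptions made for illustration, so check the actual base class in bulk-chain before adapting it.

```python
class BaseLM:
    # Hypothetical stand-in for bulk-chain's BaseLM; the real constructor
    # arguments may differ.
    def __init__(self, name=None, **kwargs):
        self.name = name

    def ask(self, prompt):
        raise NotImplementedError


class EchoLM(BaseLM):
    """Toy model that echoes the prompt instead of calling a provider."""

    def __init__(self, prefix="echo: ", **kwargs):
        super().__init__(name="echo-lm", **kwargs)
        self.prefix = prefix

    def ask(self, prompt):
        # A real implementation would send the prompt to a provider API here.
        return self.prefix + prompt


llm = EchoLM()
print(llm.ask("Hello"))
```

In a real provider class, `ask` would wrap the provider's client call and return the generated text as a string.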

See examples with models at nlp-thirdgate 🌌.