feat: Dynamic Service creation #4498
Open · holzweber wants to merge 29 commits into bentoml:main from holzweber:dynamic-service-creation
Changes shown from 4 of 29 commits.
Commits:

- `8b167aa` Added add_api functionalty with example | draft (holzweber)
- `a28891d` working but unclean version of dynamic services (holzweber)
- `8299fd5` run pre-commit hook on changed files (holzweber)
- `fda3a5f` Missing checks (holzweber)
- `b346b6c` fix(sdk): current directory for built bentos (#4505) (bojiang)
- `1c52c7b` chore(cloud cli): rename cluster to region (#4508) (bojiang)
- `259383d` doc: Add the lcm lora use case doc (#4510) (Sherlock113)
- `ad0d485` fix(sdk): clean bentoml version (#4511) (bojiang)
- `6c2ac38` fix: bug: Dataframes not serializing correctly in the new API (#4491) (frostming)
- `89adfb2` docs: Update the get started docs (#4513) (Sherlock113)
- `da08bfa` fix(sdk): incorrect bento_path if not provided (#4514) (bojiang)
- `71c483f` docs: Add client code examples without context manager (#4512) (Sherlock113)
- `2ed9108` docs: Update docs (#4515) (Sherlock113)
- `b1557bf` docs: Add authorization docs (#4517) (Sherlock113)
- `504ff63` docs: Change sample input to one line (#4518) (Sherlock113)
- `b64ce64` docs: Update ControlNet use case docs (#4519) (Sherlock113)
- `ed91f8a` docs: Update the distributed services and get started docs (#4521) (Sherlock113)
- `7b0b0e6` refactor(cli): make CLI commands available as modules (#4487) (frostming)
- `69b8a29` docs: Refactor BentoCloud docs (#4525) (Sherlock113)
- `b7169c7` Added add_api functionalty with example | draft (holzweber)
- `cc86bc5` working but unclean version of dynamic services (holzweber)
- `6fc21a1` run pre-commit hook on changed files (holzweber)
- `cc20f20` Missing checks (holzweber)
- `7f8a01f` Merge branch 'dynamic-service-creation' of https://github.com/holzweb… (holzweber)
- `0ed2ab5` added dynamic service (holzweber)
- `05fa5a1` ci: auto fixes from pre-commit.ci (pre-commit-ci[bot])
- `3bc355d` added services with type() (holzweber)
- `c2fbcfc` fix merge issues (holzweber)
- `9a42ea0` ci: auto fixes from pre-commit.ci (pre-commit-ci[bot])
README (new file):

# BentoML Sklearn Example: document classification pipeline

0. Install dependencies:

```bash
pip install -r ./requirements.txt
```

1. Train a document classification pipeline model:

```bash
python ./train.py
```

2. Run the service:

```bash
bentoml serve service.py:svc
```

3. Send test requests.

Test the `/predict` endpoint:

```bash
curl -X POST -H "content-type: application/text" --data "hello world" http://127.0.0.1:3000/predict_model_0
```

Test the `/predict_proba` endpoint:

```bash
curl -X POST -H "content-type: application/text" --data "hello world" http://127.0.0.1:3000/predict_proba_model_0
```

4. Build the Bento:

```bash
bentoml build
```

5. Build a Docker image:

```bash
bentoml containerize doc_classifier:latest
```
bentofile.yaml (new file):

```yaml
service: "service.py:svc"
include:
  - "service.py"
  - "requirements.txt"
python:
  requirements_txt: "./requirements.txt"
```
File renamed without changes.
service.py (new file):

```python
from typing import Any

import bentoml
from bentoml import Runner
from bentoml.io import JSON
from bentoml.io import Text

"""The following example is based on the sklearn/pipeline example.

The concept revolves around dynamically constructing service endpoints:

Imagine you have n models ready for production.
When creating your Bento, you may not know in advance which models will be served.
Therefore, you create an endpoint for every available model that can be deployed.

Scenario: You trained hundreds of models.
While they are still in the training pipeline, you want to begin serving your first models already in production.

When constructing Bentos, you require a predefined service.py file. However, the number of endpoints is unknown
during construction of this file. You aim to reuse the same file each time you create a new Bento, without the need
to alter the service definitions repeatedly. Each model should ideally have a route with a unique running index,
for instance."""


def wrap_service_methods(runner: Runner, targets: Any):
    """Pass Runner and target names, as they are needed in both methods.

    Note: Only passed arguments are available in the methods below; the
    enclosing scope is not captured implicitly.
    """

    async def predict(input_doc: str):
        predictions = await runner.predict.async_run([input_doc])
        return {"result": targets[predictions[0]]}

    async def predict_proba(input_doc: str):
        predictions = await runner.predict_proba.async_run([input_doc])
        return predictions[0]

    return predict, predict_proba


available_model_set = set()
# Add all unique variations of twenty_news_group to the service
for available_model in bentoml.models.list():
    if "twenty_news_group" in available_model.tag.name:
        available_model_set.add(available_model.tag.name)

model_runner_list: list[Runner] = []
target_names: list = []

for available_model in available_model_set:
    bento_model = bentoml.sklearn.get(f"{available_model}:latest")
    target_names.append(bento_model.custom_objects["target_names"])
    model_runner_list.append(bento_model.to_runner())

svc = bentoml.Service("doc_classifier", runners=model_runner_list)

for idx, (model_runner, target_name) in enumerate(
    zip(model_runner_list, target_names)
):
    path_predict = f"predict_model_{idx}"
    path_predict_proba = f"predict_proba_model_{idx}"
    fn_pred, fn_pred_proba = wrap_service_methods(
        runner=model_runner, targets=target_name
    )

    svc.add_api(
        input=Text(),
        output=JSON(),
        user_defined_callback=fn_pred,
        name=path_predict,
        doc=None,
        route=path_predict,
    )
    svc.add_api(
        input=Text(),
        output=JSON(),
        user_defined_callback=fn_pred_proba,
        name=path_predict_proba,
        doc=None,
        route=path_predict_proba,
    )
```
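The `wrap_service_methods` factory above exists because closures defined directly inside a loop would all share the loop variable. A minimal pure-Python sketch of that pitfall (no BentoML involved; the names are illustrative):

```python
def make_handler(model_id: int):
    # model_id is bound per factory call, not shared across iterations
    def handler():
        return f"predict_model_{model_id}"
    return handler


# Without a factory: every closure sees the final value of i (late binding)
naive = [lambda: f"predict_model_{i}" for i in range(3)]
# With the factory: each handler keeps its own index
wrapped = [make_handler(i) for i in range(3)]

print([f() for f in naive])    # all three report index 2
print([f() for f in wrapped])  # indices 0, 1, 2 as intended
```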
train.py (new file):

```python
import logging
from pprint import pprint
from time import time

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

import bentoml

# Display progress logs on stdout
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

# Load some categories from the training set
categories = [
    "alt.atheism",
    "talk.religion.misc",
]

# Uncomment the following to do the analysis on all the categories
# categories = None

print("Loading 20 newsgroups dataset for categories:")
print(categories)

data = fetch_20newsgroups(subset="train", categories=categories)
print("%d documents" % len(data.filenames))
print("%d categories" % len(data.target_names))
print()

# Define a pipeline combining a text feature extractor with a simple classifier
pipeline = Pipeline(
    [
        ("vect", CountVectorizer()),
        ("tfidf", TfidfTransformer()),
        ("clf", SGDClassifier(loss="log_loss")),
    ]
)

# Parameters to use for grid search. Uncommenting more parameters will give
# better exploring power but will increase processing time in a combinatorial
# way
parameters = {
    "vect__max_df": (0.5, 0.75, 1.0),
    # 'vect__max_features': (None, 5000, 10000, 50000),
    "vect__ngram_range": ((1, 1), (1, 2)),  # unigrams or bigrams
    # 'tfidf__use_idf': (True, False),
    # 'tfidf__norm': ('l1', 'l2'),
    "clf__max_iter": (20,),
    "clf__alpha": (0.00001, 0.000001),
    "clf__penalty": ("l2", "elasticnet"),
    # 'clf__max_iter': (10, 50, 80),
}

# Find the best parameters for both the feature extraction and the
# classifier
grid_search = GridSearchCV(pipeline, parameters, n_jobs=-1, verbose=1)

print("Performing grid search...")
print("pipeline:", [name for name, _ in pipeline.steps])
print("parameters:")
pprint(parameters)
t0 = time()
grid_search.fit(data.data, data.target)
print("done in %0.3fs" % (time() - t0))
print()

print("Best score: %0.3f" % grid_search.best_score_)
best_parameters = grid_search.best_estimator_.get_params()
best_parameters = {
    param_name: best_parameters[param_name] for param_name in sorted(parameters.keys())
}
print(f"Best parameters set: {best_parameters}")

bento_model = bentoml.sklearn.save_model(
    "twenty_news_group",
    grid_search.best_estimator_,
    signatures={
        "predict": {"batchable": True, "batch_dim": 0},
        "predict_proba": {"batchable": True, "batch_dim": 0},
    },
    custom_objects={
        "target_names": data.target_names,
    },
    metadata=best_parameters,
)
print(f"Model saved: {bento_model}")

# Test running inference with BentoML runner
test_runner = bentoml.sklearn.get("twenty_news_group:latest").to_runner()
test_runner.init_local()
assert test_runner.predict.run(["hello"]) == grid_search.best_estimator_.predict(
    ["hello"]
)

bento_model = bentoml.sklearn.save_model(
    "twenty_news_group_second",
    grid_search.best_estimator_,
    signatures={
        "predict": {"batchable": True, "batch_dim": 0},
        "predict_proba": {"batchable": True, "batch_dim": 0},
    },
    custom_objects={
        "target_names": data.target_names,
    },
    metadata=best_parameters,
)
print(f"Model saved: {bento_model}")

# Test running inference with BentoML runner
test_runner = bentoml.sklearn.get("twenty_news_group_second:latest").to_runner()
test_runner.init_local()
assert test_runner.predict.run(["hello"]) == grid_search.best_estimator_.predict(
    ["hello"]
)
```
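train.py saves two models whose tag names both contain `twenty_news_group`, which is exactly what the discovery loop in service.py keys on. A pure-Python sketch of that filtering step, using illustrative tag strings in place of real `bentoml.models.list()` output:

```python
# Hypothetical stand-ins for the tags a model store might hold
saved_tags = [
    "twenty_news_group:abc123",
    "twenty_news_group_second:def456",
    "iris_clf:xyz789",
]

# Keep the unique tag names that match, mirroring available_model_set in service.py
available_model_set = {
    tag.split(":")[0] for tag in saved_tags if "twenty_news_group" in tag.split(":")[0]
}
print(sorted(available_model_set))  # both twenty_news_group variants, iris_clf excluded
```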
Empty file.
Hi holzweber. We are deprecating the Runner API from 1.2, but I see the value of the dynamic service example. Would you like to update this once we finish the 1.2 docs?
Thus, may I mark this as merge-hold?
@bojiang I just checked the new documentation, but I don't know if my idea will still work, as I cannot access the service object with the new service-annotation approach. I am afraid we would really need to generate a Python file for dynamic services, which is kind of an ugly solution...
Any other idea how to get this working with 1.2?
Dynamic service in 1.2 will still work, as OpenLLM has to do something similar to this.
You can probably hijack directly into the service object, since `bentoml.Service` will return the new service object, which treats every runnable as a normal Python class. I think one huge difference here is that the lifecycle is just a class, so it's probably a lot simpler.
Will send more examples once I finish the OpenLLM revamp 😄
@aarnphm ... Any more updates on this?
I was trying to do something like this, but it does not work, as only the last service ends up in the service list afterwards (based on what I see in the OpenAPI spec)...
Edit: I added the methods via the locals() function. Can you check and resolve this conversation if that is okay?
You can probably use types.new_class here, or even type(), to construct a subclass.
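As a rough illustration of that suggestion (pure Python; `DynamicService` and its method names are made up, not BentoML API), `type()` can assemble a class with one method per model index:

```python
def make_method(idx: int):
    # Bind idx per call so each generated method keeps its own index
    def method(self, doc: str):
        return f"model_{idx}: {doc}"
    return method


# One entry per model; the class body is just a dict of attributes
namespace = {f"predict_model_{i}": make_method(i) for i in range(3)}
DynamicService = type("DynamicService", (object,), namespace)

svc = DynamicService()
print(svc.predict_model_1("hello"))  # model_1: hello
```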
True, I tried it with type() and it seems to work. Updated in my latest push, so it is visible now :)