
Releases: aws/sagemaker-python-sdk

v3.1.0

03 Dec 23:50
0443307


Model Fine-Tuning Support in SageMaker Python SDK V3

We’re excited to introduce comprehensive model fine-tuning capabilities in the SageMaker Python SDK V3, bringing state-of-the-art fine-tuning techniques to production ML workflows. Fine-tune foundation models with enterprise features including automated experiment tracking, serverless infrastructure, and integrated evaluation—all with just a few lines of code.

What's New

The SageMaker Python SDK V3 now includes four specialized Fine-Tuning Trainers, one per technique. Each trainer is optimized for specific use cases and follows established research and industry best practices:

SFTTrainer - Supervised Fine-Tuning

Fine-tune models with labeled instruction-response pairs for task-specific adaptation.

from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl"
)

training_job = trainer.train()

DPOTrainer - Direct Preference Optimization

Align models with human preferences using the DPO algorithm. Unlike traditional RLHF, DPO eliminates the need for a separate reward model, simplifying the alignment pipeline while achieving comparable results. Use cases: preference alignment, safety tuning, and style adaptation.

from sagemaker.train import DPOTrainer

trainer = DPOTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-dpo-models",
    training_dataset="s3://bucket/preference_data.jsonl"
)

training_job = trainer.train()

RLAIFTrainer - Reinforcement Learning from AI Feedback

Leverage AI-generated feedback as reward signals using Amazon Bedrock models. RLAIF offers a scalable alternative to human feedback while maintaining quality.

from sagemaker.train import RLAIFTrainer

trainer = RLAIFTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-rlaif-models",
    reward_model_id="anthropic.claude-3-5-haiku-20241022-v1:0",
    reward_prompt="Builtin.Helpfulness",
    training_dataset="s3://bucket/rlaif_data.jsonl"
)

training_job = trainer.train()

RLVRTrainer - Reinforcement Learning from Verifiable Rewards

Train with custom, programmatic reward functions for domain-specific optimization.

from sagemaker.train import RLVRTrainer

trainer = RLVRTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-rlvr-models",
    custom_reward_function="arn:aws:sagemaker:region:account:hub-content/.../evaluator/1.0",
    training_dataset="s3://bucket/rlvr_data.jsonl"
)

training_job = trainer.train()

Key Features

Parameter-Efficient Fine-Tuning

  • LoRA (Low-Rank Adaptation): Default, memory-efficient approach
  • Full Fine-Tuning: Train all model parameters for maximum performance (see the sketch below)
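
A minimal sketch of selecting the technique at trainer construction. TrainingType.LORA is confirmed by the examples in this release; the name of the full fine-tuning member is an assumption, so check the TrainingType enum in your installed SDK.

from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,   # memory-efficient default
    # training_type=TrainingType.FULL, # assumed member name for full fine-tuning
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl"
)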

Built-in MLflow Integration

Automatic experiment tracking with intelligent defaults:

  • Auto-resolves MLflow tracking servers
  • Domain-aware server selection
  • Automatic experiment and run management
  • Provides ongoing visibility into performance and loss metrics during training (see the client sketch below)
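
Because runs are recorded in a standard MLflow tracking server, the stock MLflow client can inspect them. A minimal sketch, assuming placeholder values for the tracking server ARN and experiment name; neither is a value the SDK guarantees:

import mlflow

# Point the client at the tracking server the SDK resolved (placeholder ARN)
mlflow.set_tracking_uri("arn:aws:sagemaker:region:account:mlflow-tracking-server/my-server")

# List runs for the experiment created during training (placeholder name)
runs = mlflow.search_runs(experiment_names=["my-fine-tuning-experiment"])
print(runs[["run_id", "status"]])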

Dynamic Hyperparameter Management

Discover and customize training hyperparameters with built-in validation:

# View available hyperparameters
trainer.hyperparameters.get_info()

# Customize training
trainer.hyperparameters.learning_rate = 0.0001
trainer.hyperparameters.max_epochs = 3
trainer.hyperparameters.lora_alpha = 32

Continued Fine-Tuning

Build on previously fine-tuned models for iterative improvement:

from sagemaker.core.resources import ModelPackage

# Use a previously fine-tuned model
base_model = ModelPackage.get(
    model_package_name="arn:aws:sagemaker:region:account:model-package/..."
)

trainer = SFTTrainer(
    model=base_model,  # Continue from fine-tuned model
    training_type=TrainingType.LORA,
    model_package_group_name="my-models-v2"
)

Flexible Dataset Support

Multiple input formats with automatic validation, illustrated in the sketch after this list:

  • S3 URIs: s3://bucket/path/data.jsonl
  • SageMaker AI Registry Dataset ARNs
  • DataSet objects with validation
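
For illustration, each form is passed through the same training_dataset argument. Only the S3 URI form is confirmed by the examples above; the ARN shape and the DataSet import path are assumptions, shown commented out:

from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    # 1. Plain S3 URI (confirmed by the examples above)
    training_dataset="s3://bucket/path/data.jsonl"
    # 2. Registry Dataset ARN as a string (illustrative shape):
    # training_dataset="arn:aws:sagemaker:region:account:hub-content/..."
    # 3. DataSet object (assumed import path; check your SDK):
    # training_dataset=DataSet(...)
)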

Serverless Training

No infrastructure management required—just specify your model and data:

  • Automatic compute provisioning
  • Managed training infrastructure
  • Pay only for training time

Enterprise-Ready

Production-ready security features (see the sketch after this list):

  • VPC support for secure training
  • KMS encryption for outputs
  • IAM role management
  • EULA acceptance for gated models
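
A speculative sketch of wiring these in at construction time. Every commented keyword below is a placeholder name rather than a confirmed trainer parameter; consult the V3 API reference for the real signature.

from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl",
    # role="arn:aws:iam::123456789012:role/SageMakerRole",  # IAM role (placeholder name)
    # kms_key_id="arn:aws:kms:region:account:key/my-key",   # output encryption (placeholder name)
    # accept_eula=True,                                     # gated models (placeholder name)
)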

Model Evaluation

Comprehensive evaluation framework with three evaluator types:

  • BenchMarkEvaluator: Standard benchmarks (MMLU, MMLU_PRO, BBH, GPQA, MATH, STRONG_REJECT, IFEVAL, GEN_QA, MMMU, LLM_JUDGE, INFERENCE_ONLY)
  • CustomScorerEvaluator: Built-in metrics (PRIME_MATH, PRIME_CODE) or custom evaluators
  • LLMAsJudgeEvaluator: LLM-based evaluation with explanations

See the "Evaluating Fine-Tuned Models" section below for detailed examples.

Evaluating Fine-Tuned Models

Evaluate your fine-tuned models using standard benchmarks, custom metrics, or LLM-based evaluation.

Benchmark Evaluation

Evaluate against 11 standard benchmarks including MMLU, BBH, GPQA, MATH, and more.

Discover available benchmarks

from sagemaker.train.evaluate import BenchMarkEvaluator, get_benchmarks, get_benchmark_properties

Benchmark = get_benchmarks()
print(list(Benchmark))
# [MMLU, MMLU_PRO, BBH, GPQA, MATH, STRONG_REJECT, IFEVAL, GEN_QA, MMMU, LLM_JUDGE, INFERENCE_ONLY]

Get benchmark details

props = get_benchmark_properties(Benchmark.MMLU)
print(props['description'])
print(props['subtasks'])

Run evaluation

evaluator = BenchMarkEvaluator(
    benchmark=Benchmark.MMLU,
    model_package_arn="arn:aws:sagemaker:...",
    base_model="meta-llama/Llama-2-7b-hf",
    output_s3_location="s3://bucket/eval-results/",
    mlflow_resource_arn="arn:aws:sagemaker:..."
)

execution = evaluator.evaluate(subtask="college_mathematics")
execution.wait()
execution.show_results()

Custom Scorer Evaluation

Use built-in metrics or custom evaluators:

from sagemaker.train.evaluate import CustomScorerEvaluator, get_builtin_metrics

# Discover built-in metrics
BuiltInMetric = get_builtin_metrics()
print(list(BuiltInMetric))
# [PRIME_MATH, PRIME_CODE]

# Using built-in metric
evaluator = CustomScorerEvaluator(
    evaluator=BuiltInMetric.PRIME_MATH,
    dataset="s3://bucket/eval-data.jsonl",
    model_package_arn="arn:aws:sagemaker:...",
    mlflow_resource_arn="arn:aws:sagemaker:..."
)

execution = evaluator.evaluate()
execution.wait()
execution.show_results()

LLM-as-Judge Evaluation

Leverage large language models for nuanced evaluation with explanations:

from sagemaker.train.evaluate import LLMAsJudgeEvaluator

evaluator = LLMAsJudgeEvaluator(
    judge_model_id="anthropic.claude-3-5-haiku-20241022-v1:0",
    evaluation_prompt="Rate the helpfulness of this response",
    dataset="s3://bucket/eval-data.jsonl",
    model_package_arn="arn:aws:sagemaker:...",
    mlflow_resource_arn="arn:aws:sagemaker:..."
)

execution = evaluator.evaluate()
execution.wait()

# Show first 5 results
execution.show_results()

# Show next 5 with explanations
execution.show_results(limit=5, offset=5, show_explanations=True)

Deploying Fine-Tuned Models

Flexible deployment options for production inference. Deploy your fine-tuned models to SageMaker endpoints or Amazon Bedrock.

Deploy to SageMaker Endpoint

Deploy from training job

from sagemaker.core.resources import TrainingJob
from sagemaker.serve import ModelBuilder

training_job = TrainingJob.get(training_job_name="my-training-job")
model_builder = ModelBuilder(model=training_job)
model = model_builder.build()
endpoint = model_builder.deploy(endpoint_name="my-endpoint")

Deploy from model package

from sagemaker.core.resources import ModelPackage

model_package = ModelPackage.get(model_package_name="arn:aws:sagemaker:...")
model_builder = ModelBuilder(model=model_package)
model_builder.build()
endpoint = model_builder.deploy(endpoint_name="my-endpoint")

Deploy from trainer

trainer = SFTTrainer(...)
training_job = trainer.train()
model_builder = ModelBuilder(model=trainer)
endpoint = model_builder.deploy()

Deploy Multiple Adapters to Same Endpoint

Deploy multiple fine-tuned adapters to a single endpoint for cost-efficient serving:

# Deploy base model
model_builder = ModelBuilder(model=training_job)
model_builder.build()
endpoint = model_builder.deploy(endpoint_name="my-endpoint")

# Deploy adapter to same endpoint
model_builder2 = ModelBuilder(model=training_job2)
model_builder2.build()
endpoint2 = model_builder2.deploy(
    endpoint_name="my-endpoint", # Same endpoint
    inference_component_name="my-adapter" # New adapter
)

Deploy to Amazon Bedrock

from sagemaker.serve.bedrock_model_builder import BedrockModelBuilder

training_job = TrainingJob.get(training_job_name="my-training-job")
bedrock_builder = BedrockModelBuilder(model=training_job)

deployment_result = bedrock_builder.deploy(
    job_name="my-bedrock-job",
    imported_model_name="my-bedrock-model",
    role_arn="arn:aws:iam::..."
)

Find Endpoints Using a Base Model

model_builder = ModelBuilder(model=training_job)
endpoint_names = model_builder.fetch_endpoint_names_for_base_model()
# Returns: set of endpoint name...

v3.0.1

03 Dec 18:37
869f3ee


What's Changed

Full Changelog: v3.0.0...v3.0.1

Note

This release was created retroactively for code deployed on Thu Nov 20 2025.
All changes in this release are already live in production.

SageMaker V3 Release

03 Dec 18:11
fa30a6d


❗🔥 SageMaker V3 Release
Version 3.0.0 represents a significant milestone in our product's evolution. This major release introduces a modernized architecture, enhanced performance, and powerful new features while maintaining our commitment to user experience and reliability.

Important: Please review these breaking changes before upgrading.

Older interfaces such as Estimator, Model, and Predictor, along with all their subclasses, are not supported in V3.

Please see our V3 examples folder for example notebooks and usage patterns.

Migrating to V3
Upgrading to 3.x
To upgrade to the latest version of SageMaker Python SDK 3.x:

pip install --upgrade sagemaker

If you prefer to downgrade to the 2.x version:

pip install sagemaker==2.*
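
To stay on a major line explicitly, standard pip version specifiers work as well (nothing SDK-specific here):

pip install "sagemaker>=3,<4"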

See SageMaker V2 Examples for V2 documentation and examples.

Key Benefits of 3.x

Modular Architecture: Separate PyPI packages for core, training, and serving capabilities

Unified Training & Inference: Single classes (ModelTrainer, ModelBuilder) replace multiple framework-specific classes

Object-Oriented API: Structured interface with auto-generated configs aligned with AWS APIs

Simplified Workflows: Reduced boilerplate and more intuitive interfaces

Training Experience
V3 introduces the unified ModelTrainer class to reduce the complexity of initial setup and deployment for model training. It replaces the V2 Estimator class and framework-specific classes (PyTorchEstimator, SKLearnEstimator, etc.).

This example shows how to train a model using a custom training container with training data from S3.

SageMaker Python SDK 2.x:

from sagemaker.estimator import Estimator
estimator = Estimator(
    image_uri="my-training-image",
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/output"
)
estimator.fit({"training": "s3://my-bucket/train"})

SageMaker Python SDK 3.x:

from sagemaker.train import ModelTrainer
from sagemaker.train.configs import InputData

trainer = ModelTrainer(
    training_image="my-training-image",
    role="arn:aws:iam::123456789012:role/SageMakerRole"
)

train_data = InputData(
    channel_name="training",
    data_source="s3://my-bucket/train"
)

trainer.train(input_data_config=[train_data])

See more examples: SageMaker V3 Examples

Inference Experience
V3 introduces the unified ModelBuilder class for model deployment and inference. This replaces the V2 Model class and framework-specific classes (PyTorchModel, TensorFlowModel, SKLearnModel, XGBoostModel, etc.).

This example shows how to deploy a trained model for real-time inference.

SageMaker Python SDK 2.x:

from sagemaker.model import Model
from sagemaker.predictor import Predictor
model = Model(
    image_uri="my-inference-image",
    model_data="s3://my-bucket/model.tar.gz",
    role="arn:aws:iam::123456789012:role/SageMakerRole"
)
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge"
)
result = predictor.predict(data)

SageMaker Python SDK 3.x:

from sagemaker.serve import ModelBuilder
model_builder = ModelBuilder(
    model="my-model",
    model_path="s3://my-bucket/model.tar.gz"
)
model_builder.build()
endpoint = model_builder.deploy()
result = endpoint.invoke(...)

See more examples: SageMaker V3 Examples

SageMaker V3 Examples

Training Examples

Inference Examples

ML Ops Examples

Looking for V2 Examples? See SageMaker V2 Examples below.

Note

This release was created retroactively for code deployed on Thu Nov 20 2025.
All changes in this release are already live in production.

v2.255.0

03 Dec 20:46
227d4ac


What's Changed

  • Extracts the reward Lambda ARN from Nova recipes and passes it as a training job hyperparameter
  • Added LLMFT recipe support with standardized recipe handling
  • Enhanced recipe validation and multi-model type compatibility

v2.254.1

31 Oct 02:54


Bug Fixes and Other Changes

  • update get_execution_role to directly return the ExecutionRoleArn if it is present in the resource metadata file
  • [hf] HF PT Training DLCs

v2.254.0

29 Oct 15:23


Features

  • Triton v25.09 DLC

Bug Fixes and Other Changes

  • Add Numpy 2.0 support
  • add HF Optimum Neuron DLCs
  • [Hugging Face][Pytorch] Inference DLC 4.51.3
  • [hf] HF Inference TGI

v2.253.1

14 Oct 17:37


Bug Fixes and Other Changes

  • Update instance type regex to also include hyphens
  • Revert the change "Add Numpy 2.0 support"
  • [hf-tei] add image uri to utils
  • add TEI 1.8.2

v2.253.0

10 Oct 17:32


Features

  • Added condition to allow eval recipe.
  • add model_type hyperparameter support for Nova recipes

Bug Fixes and Other Changes

  • Fix for a failed slow test: numpy fix
  • Add numpy 2.0 support
  • chore: domain support for eu-isoe-west-1
  • Adding default identity implementations to InferenceSpec
  • djl regions fixes #5273
  • Fix flaky integ test

v2.252.0

29 Sep 18:39


Features

  • change S3 endpoint env name
  • add eval custom lambda arn to hyperparameters

Bug Fixes and Other Changes

  • merge rba without the iso region changes
  • handle trial component status message longer than API supports
  • Add nova custom lambda in hyperparameter from estimator
  • add retryable option to emr step in SageMaker Pipelines
  • Feature/js mlops telemetry
  • latest tgi

v2.251.1

29 Aug 23:56


Bug Fixes and Other Changes

  • chore: onboard tei 1.8.0