Releases: huggingface/kernels
v0.10.4
What's Changed
- start to build xpu kernels from torch 2.7 by @sywangyi in #165
- fix: kernels upload to a repo branch by @sayakpaul in #168
- feat: allow get_kernel to log telemetry. by @sayakpaul in #167
- Set version to 0.10.4.dev0 by @danieldk in #169
Full Changelog: v0.10.3...v0.10.4
v0.10.3
New features
kernels check
This release adds a check subcommand to the kernels command. This subcommand can be used to check the ABI compatibility of a kernel on the Hub. For example:
```
$ kernels check kernels-community/activation
Checking variant: torch29-cxx11-cu130-x86_64-linux
Dynamic library activation/_activation_beeaae6.abi3.so:
  🐍 Python ABI 3.9 compatible
  🐧 manylinux_2_28 compatible
[...]
Checking variant: torch29-cxx11-cu126-x86_64-linux
Dynamic library activation/_activation_beeaae6.abi3.so:
  🐍 Python ABI 3.9 compatible
  🐧 manylinux_2_28 compatible
```
Upload a kernel to a branch
kernels upload now has an additional --branch option to upload a kernel to a branch.
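A hypothetical invocation (the branch name here is made up; `--repo_id` follows the same form as the upload example under v0.10.2):

```shell
$ kernels upload . --repo_id="username/kernelname" --branch="my-feature-branch"
```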
What's Changed
- Add support for NPU kernelize/layers by @zheliuyu in #155
- Only run staging tests in one configuration by @danieldk in #156
- Add a `Makefile` to run formatting one-shot by @sayakpaul in #157
- Add the `kernels check` subcommand by @danieldk in #158
- Add a note on `torch.compile` by @sayakpaul in #159
- Link local kernel and local/locked kernel API docs by @danieldk in #160
- Bump torch version in runner by @MekkCyber in #162
- feat: allow kernels to be uploaded to a revision by @sayakpaul in #161
- Set version to 0.10.3.dev0 by @danieldk in #164
Full Changelog: v0.10.2...v0.10.3
v0.10.2
New Features
XPU support
This release adds full support for Intel XPU devices, including kernel layers. XPU variants use the form: torch<torch-version>-cxx<C++-ABI>-xpu<OneAPI-version>-x86_64-linux.
kernel upload utility
Upload kernels to the Hub in a single command. For example, to upload the kernel in the current directory:
```
$ kernels upload . --repo_id="username/kernelname"
```
The repository will also be created (publicly) if it does not exist yet. For more information, see the documentation.
What's Changed
- Add support for XPU layer repositories by @danieldk in #142
- [feat] add an uploading utility. by @sayakpaul in #138
- Improve errors for layer validation by @danieldk in #145
- Describe the `get_kernel`/`LayerRepository` version argument by @danieldk in #147
- Removing unexisting link in README by @MekkCyber in #148
- Fix some spelling errors to check docs CI is working by @danieldk in #120
- Document the `to-wheel` subcommand by @danieldk in #149
- Bump huggingface_hub upper bound <2.0 by @Wauplin in #151
- faq: why only replace `forward` methods? by @danieldk in #153
- [tests] turn the `kernels upload` tests to be staging tests by @sayakpaul in #152
- Set version to 0.10.2.dev0 by @danieldk in #154
New Contributors
- @sayakpaul made their first contribution in #138
- @Wauplin made their first contribution in #151
Full Changelog: v0.10.1...v0.10.2
v0.10.1
v0.10.0
New features
Before this release, `get_local_kernel` only worked with the top-level kernel directory (the one containing `build`). The function now accepts the top-level directory (`mykernel`), the build directory (`mykernel/build`), and a build variant directory (`mykernel/build/torch28-cxx11-cu128-x86_64-linux`).
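A rough sketch of how the three accepted layouts can be told apart (illustrative only — this is not the library's actual detection logic, and the helper name is made up):

```python
from pathlib import Path

def classify_kernel_path(path: str) -> str:
    """Guess which accepted layout `path` points at (illustrative helper)."""
    p = Path(path)
    if (p / "build").is_dir():
        return "top-level"   # mykernel/
    if p.name == "build":
        return "build"       # mykernel/build/
    if p.parent.name == "build":
        return "variant"     # mykernel/build/torch28-cxx11-cu128-x86_64-linux/
    raise ValueError(f"unrecognized kernel directory layout: {path}")
```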
Breaking API changes
The default for the mode argument of kernelize is removed.
Before this change, the default mode was Mode.TRAIN | Mode.COMPILE. This had the benefit that by default, kernelize would use kernels that support all use cases. However, it would skip e.g. inference-only kernels, which degrades performance when the user forgets to set mode when using kernels for inference.
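The mode flags compose bitwise; a stdlib `enum.Flag` sketch (not the library's actual `Mode` class — member names are taken from these notes) illustrates why the old default skipped inference-only kernels:

```python
from enum import Flag, auto

class Mode(Flag):
    # Illustrative stand-in for kernels' Mode, not the real definition.
    INFERENCE = auto()
    TRAIN = auto()
    COMPILE = auto()

# The old implicit default combined training and compile support:
old_default = Mode.TRAIN | Mode.COMPILE

# A kernel registered for the default covers training...
assert Mode.TRAIN in old_default
# ...but an inference-only kernel never matched it, so it was skipped:
assert Mode.INFERENCE not in old_default
```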
What's Changed
- Small markup fixes of the local kernel repo example by @danieldk in #127
- feat: improve get local kernel importing by @drbh in #129
- fix: add get local tests by @drbh in #134
- `cpu` is not (yet) a supported device type by @danieldk in #132
- Remove default for `mode` argument of `kernelize` by @danieldk in #136
- Set version to v0.10.0.dev0 by @danieldk in #137
v0.9.0
New features
Initial ROCm support
This release adds the rocm device type. For instance, to register a kernel that supports both CUDA and ROCm, you can use:

```python
kernel_layer_mapping = {
    "SiluAndMul": {
        "cuda": LayerRepository(
            repo_id="kernels-community/activation",
            layer_name="SiluAndMul",
        ),
        "rocm": LayerRepository(
            repo_id="kernels-community/activation",
            layer_name="SiluAndMul",
        ),
    }
}
register_kernel_mapping(kernel_layer_mapping)
```
Support for loading local kernel layers
For development and debugging it can often be useful to load kernel layers from a local directory. This is supported by the new LocalLayerRepository class. You can directly use the output of kernel-builder. For example:
```python
kernel_layer_mapping = {
    "SiluAndMul": {
        "cuda": LocalLayerRepository(
            repo_path="/home/daniel/kernels/activation",
            package_name="activation",
            layer_name="SiluAndMul",
        )
    }
}
register_kernel_mapping(kernel_layer_mapping)
```
What's Changed
- Fix typo in layers documentation by @shadeMe in #116
- Update documentation for compatibility with doc-builder by @danieldk in #117
- Test examples in docstrings using mktestdocs by @danieldk in #118
- Add doc build to CI by @danieldk in #119
- Log when using fallback layer by @danieldk in #121
- Add `LocalLayerRepository` to load from a local repo by @danieldk in #123
- Run black check by @danieldk in #124
- Nix: go back to hf-nix main by @danieldk in #125
- Add ROCm device discovery by @ahadnagy in #122
- Set version to 0.9.0.dev0 by @danieldk in #126
Full Changelog: v0.8.1...v0.9.0
v0.8.1
New features
Kernel version bounds
get_kernel adds a version argument, which you can use to fetch the latest version compatible with the given version specifier:
```python
activation = kernels.get_kernel("kernels-community/activation", version=">=0.0.3,<0.1")
```
Version bounds are now also supported when registering layers:
```python
kernel_layer_mapping = {
    "SiluAndMul": {
        "cuda": LayerRepository(
            repo_id="kernels-community/activation",
            version=">=0.0.3,<0.1",
            layer_name="SiluAndMul",
        )
    }
}
```
Kernel layer locking
Layers can now also use version locks from kernels.lock by using LockedLayerRepository:
```python
kernel_layer_mapping = {
    "SiluAndMul": {
        "cuda": LockedLayerRepository(
            repo_id="kernels-community/activation",
            layer_name="SiluAndMul",
        )
    }
}
```
See the kernel locking documentation for more information.
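The version bounds above are Python-style specifiers; assuming they follow the usual PEP 440 grammar, the third-party `packaging` library (used here purely for illustration) shows how such a bound selects the latest compatible version:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

spec = SpecifierSet(">=0.0.3,<0.1")
available = [Version(v) for v in ["0.0.2", "0.0.3", "0.0.4", "0.1.0"]]

# Keep only versions matching the bound, then take the newest one.
compatible = [v for v in available if v in spec]
best = max(compatible)
print(best)  # 0.0.4
```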
What's Changed
- `get_kernel`: allow Python-style version specifiers by @danieldk in #111
- triton based kernel could also run in xpu by @sywangyi in #112
- Add version support to `LayerRepository` by @danieldk in #113
- Add support for project-wide locking of layers by @danieldk in #114
- Set version to 0.8.1.dev0 by @danieldk in #115
Full Changelog: v0.8.0...v0.8.1
v0.8.0
New features
Kernel mode fallbacks
Before this release, when using kernelize with a mode, it would only look up the exact match in the kernel mapping. Starting with this release, kernelize will fall back to other compatible modes. For instance, when a model is kernelized as
```python
model = kernelize(model, mode=Mode.INFERENCE)
```
kernelize will try the following kernel mappings, in order:

1. `Mode.INFERENCE`
2. `Mode.INFERENCE | Mode.TORCH_COMPILE`
3. `Mode.TRAINING`
4. `Mode.TRAINING | Mode.TORCH_COMPILE`
5. `Mode.FALLBACK`

All these modes are compatible with inference. See the kernel modes documentation for more information and a list per mode of the possible fallbacks.
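A minimal sketch of this lookup order (illustrative only — the real mapping and selection live inside kernelize, and the `Mode` class below is a stand-in):

```python
from enum import Flag, auto

class Mode(Flag):
    # Illustrative stand-in for kernels' Mode, not the real definition.
    INFERENCE = auto()
    TRAINING = auto()
    TORCH_COMPILE = auto()
    FALLBACK = auto()

# The fallback chain tried for Mode.INFERENCE:
INFERENCE_CHAIN = [
    Mode.INFERENCE,
    Mode.INFERENCE | Mode.TORCH_COMPILE,
    Mode.TRAINING,
    Mode.TRAINING | Mode.TORCH_COMPILE,
    Mode.FALLBACK,
]

def select_kernel(mapping, chain):
    """Return the first kernel registered along the fallback chain."""
    for mode in chain:
        if mode in mapping:
            return mapping[mode]
    return None

# Only a training kernel is registered; inference still finds it:
mapping = {Mode.TRAINING: "training_kernel"}
assert select_kernel(mapping, INFERENCE_CHAIN) == "training_kernel"
```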
Support for registering kernels by compute capability
It is now possible to register multiple CUDA kernels with different capabilities. This will allow you to provide e.g. different kernels for Ada, Hopper, and Blackwell GPUs. See the docs on Registering kernels for specific CUDA capabilities for more information.
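A hypothetical sketch of capability-interval matching (the function, repo names, and data layout are invented for illustration; the actual registration API is described in the linked docs):

```python
def pick_repo(registrations, capability):
    """Pick the first repo whose (min, max) capability interval matches."""
    for (lo, hi), repo in registrations:
        if lo <= capability <= hi:
            return repo
    return None

# Hypothetical registrations keyed by (major, minor) compute capability:
registrations = [
    (((8, 9), (8, 9)), "example/ada-kernel"),         # Ada: SM 8.9
    (((9, 0), (9, 0)), "example/hopper-kernel"),      # Hopper: SM 9.0
    (((10, 0), (12, 0)), "example/blackwell-kernel"), # Blackwell range
]

assert pick_repo(registrations, (9, 0)) == "example/hopper-kernel"
```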
API-breaking changes
Mode.DEFAULT has been renamed to Mode.FALLBACK for clarity.
What's Changed
- Fix macOS tests by marking some CUDA-only tests by @danieldk in #105
- Support registering layers with a range of CUDA capabilities by @danieldk in #106
- Improve mode handling by @danieldk in #108
- Log kernel layer selection by @danieldk in #109
- Set version to 0.8.0.dev0 by @danieldk in #110
Full Changelog: v0.7.0...v0.8.0
v0.7.0
API changes
This version contains an API change to the kernelize function that makes it possible to use different kernels for inference/training/torch.compile. This requires a small adjustment to how kernelize is called, see the kernelize documentation for more information. In short, to kernelize a model for inference, use:
```python
model = MyModel(...)
model = kernelize(model, mode=Mode.INFERENCE)
```
For training:

```python
model = MyModel(...)
model = kernelize(model, mode=Mode.TRAINING)
```
What's Changed
- Add `get_local_kernel` function by @danieldk in #102
- Support registering inference/training-specific layers by @danieldk in #103
- Set version to 0.7.0.dev0 by @danieldk in #104
Full Changelog: v0.6.2...v0.7.0