1.5.0
We are excited to announce the release of aleph-vm version 1.5.0!
This release adds a GPU reservation system and the ability to reboot a VM from inside the VM. It also fixes several bugs related to CI stability, GPU detection, network configuration, the VM lifecycle, and volume downloads. Finally, it upgrades Firecracker and `aleph-message`, and migrates the project to Pydantic V2.
New Features
- GPU Reservation System: Implemented a system for reserving GPUs. (PR #776)
- VM Reboot from Inside: Allowed users to reboot their VMs from within the VM itself. (PR #790)
- Operator Authentication Improvements: Enhanced operator authentication with better HTTP error reporting. (PR #788)
Bug Fixes
- Random CI Failures: Fixed random failures in the Continuous Integration (CI) pipeline and merged workflows. (PR #787)
- GPU `x-vga` Detection: Corrected the detection of `x-vga` for GPUs. (PR #786)
- `ndppd` Proxy Interface Loading: Fixed an issue where the `ndppd` proxy was not loading existing interfaces on supervisor restart. (PR #783)
- VM Lifecycle and Blocking Resources: Addressed issues related to the VM lifecycle and blocking resources. (PR #789)
- Volume Download from Connector: Fixed a problem with downloading volumes from the connector. (PR #796)
- Pydantic TypeAdapter Issue: Solved an issue related to Pydantic's `TypeAdapter`. (PR #794)
- Confidential VM Cleanup (ALEPH-517): Fixed an issue related to the cleanup of confidential VMs. (PR #797)
- Confidential and GPU Instance Stopping (ALEPH-518): Resolved a bug where confidential and GPU instances were stopping immediately after starting. (PR #798)
- Message Requirements Casting: Solved a casting issue with message requirements. (PR #800)
- Internet Check Failure: Fixed an issue causing the internet check to fail. (PR #799)
Improvements
- Hatch Formatting: Performed some cleanup using `hatch fmt`. (PR #782)
- Usage System: Removed reserved usage tracking for volumes in the usage system. (PR #784)
- Misc Stability Improvements: Implemented various stability improvements (OL ALEPH-166). (PR #719)
- Firecracker Upgrade: Upgraded Firecracker from version 1.5.0 to 1.7.0. (PR #771)
- Pydantic Migration: Migrated the project from Pydantic V1 to V2. (PR #791)
- Confidential VM Locale: Updated locale settings to `en_US UTF-8` during the setup of confidential VMs. (PR #792)
- `aleph-message` Upgrade: Upgraded the `aleph-message` dependency to version 1.0.0. (PR #795)
What's Changed
- CI: Fix random failures. Merge workflows by @olethanh in #787
- Bit of clean up with hatch fmt by @olethanh in #782
- Gpu Reservation system by @olethanh in #776
- Fix `x-vga` detection on GPUs by @nesitor in #786
- Fix: ndppd proxy not loading existing interfaces on supervisor restart by @nesitor in #783
- Usage system: remove reserved usage for volumes by @olethanh in #784
- Ol aleph 166 misc stability by @olethanh in #719
- Fix: Upgrade Firecracker 1.5.0 -> 1.7.0 by @hoh in #771
- Fix VM lifecycle and blocking resource. by @olethanh in #789
- enh: allow user to reboot from inside the VM by @olethanh in #790
- enh: Operator auth. Better http error by @olethanh in #788
- Migrate Pydantic V1 to V2 by @nesitor in #791
- Solve Pydantic TypeAdapter issue by @nesitor in #794
- Confidentials: Update locale settings to en_US UTF-8 during setup by @aliel in #792
- FIX downloading volume from connector by @olethanh in #796
- Upgrade `aleph-message==1.0.0` version dependency by @nesitor in #795
- Fix ALPEH-517 confidential clean up by @olethanh in #797
- ALEPH-518 Fix Confidential and GPU instance stopping after start by @olethanh in #798
- Solve message requirements casting issue by @nesitor in #800
- Fix internet check failure by @olethanh in #799
Full Changelog: 1.4.2...1.5.0
How to upgrade
1. Upgrade the packages
This part did not change: download and install the new package as usual.
On Debian 12 (Bookworm):
rm -f /opt/aleph-vm.debian-12.deb
wget -P /opt https://github.com/aleph-im/aleph-vm/releases/download/1.5.0/aleph-vm.debian-12.deb
apt install /opt/aleph-vm.debian-12.deb
On Ubuntu 22.04 (Jammy Jellyfish):
sudo rm -f /opt/aleph-vm.ubuntu-22.04.deb
sudo wget -P /opt https://github.com/aleph-im/aleph-vm/releases/download/1.5.0/aleph-vm.ubuntu-22.04.deb
sudo apt install /opt/aleph-vm.ubuntu-22.04.deb
On Ubuntu 24.04 (Noble Numbat):
sudo rm -f /opt/aleph-vm.ubuntu-24.04.deb
sudo wget -P /opt https://github.com/aleph-im/aleph-vm/releases/download/1.5.0/aleph-vm.ubuntu-24.04.deb
sudo apt install /opt/aleph-vm.ubuntu-24.04.deb
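The three command sets above differ only in the distribution tag embedded in the package filename. As a rough sketch of that pattern, the download URL can be assembled as follows. The helper `package_url` and the `SUPPORTED_DISTROS` tuple are hypothetical names used for illustration; they are not part of aleph-vm.

```python
# Illustrative helper, not part of aleph-vm: it rebuilds the release asset
# URL following the pattern used in the commands above.
RELEASE_BASE = "https://github.com/aleph-im/aleph-vm/releases/download"

# Distribution tags taken from the package names listed in this release.
SUPPORTED_DISTROS = ("debian-12", "ubuntu-22.04", "ubuntu-24.04")


def package_url(version: str, distro: str) -> str:
    """Return the .deb download URL for a release version and distribution tag."""
    if distro not in SUPPORTED_DISTROS:
        raise ValueError(f"unsupported distribution tag: {distro!r}")
    return f"{RELEASE_BASE}/{version}/aleph-vm.{distro}.deb"


if __name__ == "__main__":
    # Print one URL per distribution covered by these notes.
    for distro in SUPPORTED_DISTROS:
        print(package_url("1.5.0", distro))
```

Rejecting unknown tags avoids silently building a URL for a package that these notes do not list.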
2. Enable GPU support (optional)
In order to enable GPU support on your Compute Resource Node, you must:
- Ensure that your system has a compatible GPU card.
- Detach the GPU cards from their kernel module drivers and attach them to the QEMU `vfio` drivers.
- Enable GPU support in the `aleph-vm` configuration.
Please follow these instructions
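To confirm that a GPU was detached from its usual kernel driver and attached to `vfio`, one option is to inspect sysfs. The sketch below assumes a typical Linux host, where each PCI device directory under `/sys/bus/pci/devices` has a `driver` symlink pointing at the bound driver, and where the VFIO PCI driver is named `vfio-pci`. The function names and the example PCI address are hypothetical, not part of aleph-vm.

```python
import os
from pathlib import Path
from typing import Optional


def bound_driver(pci_address: str, sysfs_root: str = "/sys/bus/pci/devices") -> Optional[str]:
    """Return the name of the kernel driver bound to a PCI device, or None."""
    driver_link = Path(sysfs_root) / pci_address / "driver"
    if not driver_link.is_symlink():
        return None  # no driver is currently bound to this device
    # The symlink target ends with the driver's directory name.
    return os.path.basename(os.readlink(driver_link))


def is_bound_to_vfio(pci_address: str, sysfs_root: str = "/sys/bus/pci/devices") -> bool:
    """True when the device is bound to the vfio-pci driver."""
    return bound_driver(pci_address, sysfs_root) == "vfio-pci"
```

For example, `is_bound_to_vfio("0000:01:00.0")` checks a GPU at that (made-up) address; the real address of a card can be found with `lspci -D`.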
Enable GPU support in the `aleph-vm` configuration file, which is `/etc/aleph-vm/supervisor.env` by default. GPU support is not enabled by default yet.
ALEPH_VM_ENABLE_GPU_SUPPORT=True
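Before restarting the supervisor, it can help to check that the flag is really present in the file. The following is a minimal sketch that assumes the file contains plain `KEY=VALUE` lines with `#` comments. It only recognises the spelling `True` used above (case-insensitively), which may be narrower than what aleph-vm itself accepts. All function names are hypothetical.

```python
from pathlib import Path
from typing import Optional

GPU_FLAG = "ALEPH_VM_ENABLE_GPU_SUPPORT"


def read_env_value(env_text: str, key: str) -> Optional[str]:
    """Return the value assigned to `key` in KEY=VALUE text, or None if absent."""
    value = None
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, raw = line.partition("=")
        if sep and name.strip() == key:
            # Drop surrounding quotes; a later assignment overrides an earlier one.
            value = raw.strip().strip('"').strip("'")
    return value


def gpu_flag_is_set(env_text: str) -> bool:
    """True when the GPU flag is assigned the value 'true' (any letter case)."""
    value = read_env_value(env_text, GPU_FLAG)
    return value is not None and value.lower() == "true"


def gpu_flag_is_set_in_file(path: str = "/etc/aleph-vm/supervisor.env") -> bool:
    """Apply the same check to the configuration file on disk."""
    return gpu_flag_is_set(Path(path).read_text())
```

Calling `gpu_flag_is_set_in_file()` on the node reads the default path mentioned above; pass another path if your configuration lives elsewhere.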
After launching the server, you can check the endpoint `http://localhost:4020/status/config` or `https://<your-node-domain>/status/config` and verify that `ENABLE_GPU_SUPPORT` has the value `true`.
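The same verification can be scripted. The sketch below assumes the endpoint returns JSON; because these notes do not say where `ENABLE_GPU_SUPPORT` sits in that document, the code searches nested objects for the key rather than assuming a fixed location. The function names are hypothetical and not part of aleph-vm.

```python
import json
from urllib.request import urlopen


def find_key(data, key):
    """Depth-first search for `key` in nested dicts and lists.

    Returns the first non-None value found, or None when the key is missing.
    """
    if isinstance(data, dict):
        if key in data:
            return data[key]
        children = list(data.values())
    elif isinstance(data, list):
        children = data
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def gpu_support_enabled(config) -> bool:
    """True only when ENABLE_GPU_SUPPORT is present and is the JSON value true."""
    return find_key(config, "ENABLE_GPU_SUPPORT") is True


def fetch_config(url: str = "http://localhost:4020/status/config", timeout: float = 10.0):
    """Download and decode the configuration document (assumed to be JSON)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

On the node itself, `gpu_support_enabled(fetch_config())` performs the check against the local endpoint; pass the `https://<your-node-domain>/status/config` form of the URL to check remotely.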