
1.5.0

@nesitor nesitor released this 16 Apr 16:05

We are excited to announce the release of aleph-vm version 1.5.0!

This release adds a GPU reservation system and the ability to reboot a VM from inside the VM. It also fixes bugs in CI stability, GPU detection, network configuration, the VM lifecycle, and volume downloads. In addition, it upgrades Firecracker and aleph-message and migrates the project to Pydantic V2.

New Features

  • GPU Reservation System: Implemented a system for reserving GPUs. (PR #776)
  • VM Reboot from Inside: Allowed users to reboot their VMs from within the VM itself. (PR #790)
  • Operator Authentication Improvements: Enhanced operator authentication with better HTTP error reporting. (PR #788)

Bug Fixes

  • Random CI Failures: Fixed random failures in the Continuous Integration (CI) pipeline and merged workflows. (PR #787)
  • GPU x-vga Detection: Corrected the detection of x-vga for GPUs. (PR #786)
  • ndppd Proxy Interface Loading: Fixed an issue where the ndppd proxy was not loading existing interfaces on supervisor restart. (PR #783)
  • VM Lifecycle and Blocking Resources: Addressed issues related to the VM lifecycle and blocking resources. (PR #789)
  • Volume Download from Connector: Fixed a problem with downloading volumes from the connector. (PR #796)
  • Pydantic TypeAdapter Issue: Solved an issue related to Pydantic's TypeAdapter. (PR #794)
  • Confidential VM Cleanup (ALEPH-517): Fixed an issue related to the cleanup of confidential VMs. (PR #797)
  • Confidential and GPU Instance Stopping (ALEPH-518): Resolved a bug where confidential and GPU instances were stopping immediately after starting. (PR #798)
  • Message Requirements Casting: Solved a casting issue with message requirements. (PR #800)
  • Internet Check Failure: Fixed an issue causing the internet check to fail. (PR #799)

Improvements

  • Hatch Formatting: Performed some cleanup using hatch fmt. (PR #782)
  • Usage System: Removed reserved usage tracking for volumes in the usage system. (PR #784)
  • Misc Stability Improvements: Implemented various stability improvements (OL ALEPH-166). (PR #719)
  • Firecracker Upgrade: Upgraded Firecracker from version 1.5.0 to 1.7.0. (PR #771)
  • Pydantic Migration: Migrated the project from Pydantic V1 to V2. (PR #791)
  • Confidential VM Locale: Updated locale settings to en_US UTF-8 during the setup of confidential VMs. (PR #792)
  • aleph-message Upgrade: Upgraded the aleph-message dependency to version 1.0.0. (PR #795)

What's Changed

Full Changelog: 1.4.2...1.5.0

How to upgrade

1. Upgrade the packages

This part did not change: download and install the new package as usual.

On Debian 12 (Bookworm):

rm -f /opt/aleph-vm.debian-12.deb
wget -P /opt https://github.com/aleph-im/aleph-vm/releases/download/1.5.0/aleph-vm.debian-12.deb
apt install /opt/aleph-vm.debian-12.deb

On Ubuntu 22.04 (Jammy Jellyfish):

sudo rm -f /opt/aleph-vm.ubuntu-22.04.deb
sudo wget -P /opt https://github.com/aleph-im/aleph-vm/releases/download/1.5.0/aleph-vm.ubuntu-22.04.deb
sudo apt install /opt/aleph-vm.ubuntu-22.04.deb

On Ubuntu 24.04 (Noble Numbat):

sudo rm -f /opt/aleph-vm.ubuntu-24.04.deb
sudo wget -P /opt https://github.com/aleph-im/aleph-vm/releases/download/1.5.0/aleph-vm.ubuntu-24.04.deb
sudo apt install /opt/aleph-vm.ubuntu-24.04.deb
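The three variants above differ only in the distribution suffix of the package name. As a minimal sketch, the URL can be derived from the `ID` and `VERSION_ID` fields of /etc/os-release. The helper name `aleph_vm_package_url` is hypothetical and not part of aleph-vm; only the three package URLs listed above are taken from this release.

```shell
# Hypothetical helper: print the 1.5.0 package URL for a distribution.
# $1 = distro ID, $2 = VERSION_ID, as found in /etc/os-release.
aleph_vm_package_url() {
    case "$1-$2" in
        debian-12|ubuntu-22.04|ubuntu-24.04)
            echo "https://github.com/aleph-im/aleph-vm/releases/download/1.5.0/aleph-vm.$1-$2.deb"
            ;;
        *)
            # Only the three distributions listed in these notes are handled.
            echo "unsupported distribution: $1 $2" >&2
            return 1
            ;;
    esac
}

# Usage on a node (not executed here):
#   . /etc/os-release
#   url=$(aleph_vm_package_url "$ID" "$VERSION_ID") || exit 1
#   sudo wget -P /opt "$url"
#   sudo apt install "/opt/$(basename "$url")"
```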

2. Enable GPU support (optional)

In order to enable GPU support on your Compute Resource Node, you must:

  1. Ensure that your system has a compatible GPU card.
  2. Detach the GPU cards from their kernel module drivers and attach them to the QEMU vfio drivers.
  3. Enable GPU support in the aleph-vm configuration.

Please follow these instructions
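To check the result of step 2, one option is to inspect the output of `lspci -nnk`, which reports the kernel driver bound to each PCI device. The sketch below is not part of aleph-vm: the function name `gpus_not_on_vfio` is hypothetical, and it assumes GPUs show up with the class names "VGA compatible controller" or "3D controller" and that a correctly attached card reports the `vfio-pci` driver.

```shell
# Hypothetical helper: read `lspci -nnk` output on stdin and print every
# GPU whose bound kernel driver is something other than vfio-pci.
# A GPU with no driver bound at all has no "Kernel driver in use" line
# and is therefore not reported.
gpus_not_on_vfio() {
    awk '
        # Device lines start at column 0; detail lines are indented.
        /^[0-9a-f]/ {
            is_gpu = ($0 ~ /VGA compatible controller|3D controller/)
            dev = $0
        }
        is_gpu && /Kernel driver in use:/ && $NF != "vfio-pci" {
            print dev " -> " $NF
        }
    '
}

# Usage on a node (not executed here):
#   lspci -nnk | gpus_not_on_vfio
```

An empty result means every detected GPU that has a driver bound is using `vfio-pci`.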

Enable GPU support in the aleph-vm configuration file, located by default at /etc/aleph-vm/supervisor.env. GPU support is not enabled by default yet.

ALEPH_VM_ENABLE_GPU_SUPPORT=True
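If the file already contains this key, appending a second line would leave two conflicting entries. A minimal sketch of an idempotent update follows; the helper name `set_env_var` is hypothetical, and it assumes GNU `sed` (as shipped on Debian and Ubuntu) and a simple `KEY=VALUE` file format.

```shell
# Hypothetical helper: set KEY=VALUE in an env file, replacing an
# existing line for KEY if there is one, otherwise appending.
# $1 = env file, $2 = key, $3 = value
set_env_var() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file" 2>/dev/null; then
        # Key already present: rewrite that line in place (GNU sed -i).
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        # Key absent: append a new line.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage on a node (not executed here, needs root):
#   set_env_var /etc/aleph-vm/supervisor.env ALEPH_VM_ENABLE_GPU_SUPPORT True
```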

After launching the server, check the endpoint http://localhost:4020/status/config or https://<your-node-domain>/status/config and verify that ENABLE_GPU_SUPPORT has the value true.
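This check can also be scripted. The sketch below assumes the endpoint returns JSON in which the key `ENABLE_GPU_SUPPORT` appears with a boolean value. The exact layout of the response is not described here, so the pattern matches the key at any nesting level. The function name `gpu_support_enabled` is hypothetical.

```shell
# Hypothetical helper: read the /status/config response on stdin and
# succeed (exit status 0) only if ENABLE_GPU_SUPPORT is set to true.
gpu_support_enabled() {
    grep -Eq '"ENABLE_GPU_SUPPORT"[[:space:]]*:[[:space:]]*true'
}

# Usage on a node (not executed here):
#   if curl -s http://localhost:4020/status/config | gpu_support_enabled; then
#       echo "GPU support is enabled"
#   else
#       echo "GPU support is NOT enabled"
#   fi
```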