[WIP] import experiment-base #47

Draft
wants to merge 9 commits into
base: main
4 changes: 2 additions & 2 deletions .gitignore
@@ -13,8 +13,8 @@ venv/
*.egg-info/

# Plots and results (uncomment once results are stable, and publicly available)
-plots/
-results/
+#plots/
+#results/

# Kubernetes
.kube/
1 change: 1 addition & 0 deletions K8S_VERSION
@@ -0,0 +1 @@
1.28.5
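
The `K8S_VERSION` file is consumed through `get_k8s_version()`, imported from `tasks/util/version.py`, which is not part of this diff. A minimal sketch of how such a helper might read the file, assuming it sits in a known directory and holds a single version string; `read_version_file` and its `root` parameter are hypothetical names, not the PR's implementation:

```python
from os.path import join
from tempfile import TemporaryDirectory


def read_version_file(root, filename="K8S_VERSION"):
    """Return the version string stored in <root>/<filename>, without whitespace."""
    with open(join(root, filename), "r") as fh:
        return fh.read().strip()


if __name__ == "__main__":
    # Demonstrate against a throw-away directory rather than the real repo
    with TemporaryDirectory() as tmp:
        with open(join(tmp, "K8S_VERSION"), "w") as fh:
            fh.write("1.28.5\n")
        print(read_version_file(tmp))
```

Stripping whitespace matters because the value is later interpolated into a download URL and into `--kubernetes-version`.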
18 changes: 3 additions & 15 deletions README.md
@@ -1,22 +1,10 @@
# Granny Experiments

-This repo contains the experiments for the [Granny paper](
-https://arxiv.org/abs/2302.11358).
+This repo contains the experiments for the [Granny paper](https://arxiv.org/abs/2302.11358).

-When following any instructions in this repository, it is recommended to
-have two open terminals:
-* One on the [`experiment-base`](https://github.com/faasm/experiment-base) repo
-with the virtual environment activated (`source ./bin/workon.sh`). From now
-onward, we will refer to this shell by its venv name: `faasm-exp-base`.
-* One with this repo and the virtual environment activated
-(`source ./bin/workon.sh`). From now onward, we will refer to this shell by
-its venv name: `faasm-exp-faabric`.
+When following any instructions in this repository, we recommend using a dedicated terminal with this repo's virtual environment activated (`source ./bin/workon.sh`).

-The former is used to provision/deprovision K8s clusters on Azure (with AKS),
-and also to access low-level monitoring tools (we recommend `k9s`).
-
-The latter is used to deploy Faabric clusters, run the experiments, and plot
-the results.
+This virtual environment provides commands to provision and deprovision K8s clusters on Azure (with AKS), to access low-level monitoring tools (we recommend `k9s`), and to deploy Faabric clusters, run the experiments, and plot the results.

## Experiments in this repository
1 change: 1 addition & 0 deletions config/granny_aks_kubelet_config.json
@@ -0,0 +1 @@
{ "allowedUnsafeSysctls": ["net.*"] }
13 changes: 13 additions & 0 deletions config/granny_aks_os_config.json
@@ -0,0 +1,13 @@
{
    "sysctls": {
        "netCoreRmemMax": 16777216,
        "netCoreWmemMax": 16777216,
        "netIpv4TcpRmem": "4096 87380 16777216",
        "netIpv4TcpWmem": "4096 65536 16777216",
        "netCoreNetdevMaxBacklog": "30000",
        "netCoreRmemDefault": 16777216,
        "netCoreWmemDefault": 16777216,
        "netIpv4TcpMem": "16777216 16777216 16777216",
        "netIpv4RouteFlush": 1
    }
}
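
The `netIpv4TcpRmem` and `netIpv4TcpWmem` entries above are space-separated triplets (minimum, default, and maximum buffer size in bytes), which are easy to mistype. A small sanity check is sketched below, assuming the file keeps the layout shown in this diff; `check_tcp_triplets` is a hypothetical helper and is not part of the PR:

```python
import json

# Inline copy of the relevant keys from config/granny_aks_os_config.json
OS_CONFIG = """
{
    "sysctls": {
        "netIpv4TcpRmem": "4096 87380 16777216",
        "netIpv4TcpWmem": "4096 65536 16777216"
    }
}
"""


def check_tcp_triplets(config):
    """Return a list of problems found in the TCP buffer triplets."""
    sysctls = config["sysctls"]
    problems = []
    for key in ("netIpv4TcpRmem", "netIpv4TcpWmem"):
        values = [int(v) for v in sysctls[key].split()]
        if len(values) != 3:
            problems.append("{}: expected three values".format(key))
            continue
        low, default, high = values
        if not low <= default <= high:
            problems.append("{}: values are not in ascending order".format(key))
    return problems


if __name__ == "__main__":
    print(check_tcp_triplets(json.loads(OS_CONFIG)))
```

Running such a check before `cluster.provision` would catch a malformed triplet locally instead of during cluster creation.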
1 change: 1 addition & 0 deletions plots/.gitignore
@@ -0,0 +1 @@
0.27.0/*
Binary file added plots/elastic/elastic_speedup.png
Binary file added plots/kernels-mpi/mpi_kernels_slowdown.png
Binary file added plots/kernels-omp/openmp_kernels_slowdown.png
4 changes: 4 additions & 0 deletions tasks/__init__.py
@@ -1,7 +1,9 @@
from invoke import Collection

+from . import cluster
from . import docker
from . import format_code
+from . import k8s

import logging

@@ -20,8 +22,10 @@
logging.getLogger().setLevel(logging.DEBUG)

ns = Collection(
+    cluster,
    docker,
    format_code,
+    k8s,
)

ns.add_collection(elastic_ns, name="elastic")
143 changes: 143 additions & 0 deletions tasks/cluster.py
@@ -0,0 +1,143 @@
from invoke import task
from os.path import join
from subprocess import run
from tasks.util.env import (
    ACR_NAME,
    AKS_CLUSTER_NAME,
    AKS_NODE_COUNT,
    AKS_REGION,
    AKS_VM_SIZE,
    AZURE_PUB_SSH_KEY,
    AZURE_RESOURCE_GROUP,
    CONFIG_DIR,
    KUBECTL_BIN,
)
from tasks.util.version import get_k8s_version


# AKS commandline reference here:
# https://docs.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest
def _run_aks_cmd(name, az_args=None):
    cmd = [
        "az",
        "aks {}".format(name),
        "--resource-group {}".format(AZURE_RESOURCE_GROUP),
    ]

    if az_args:
        cmd.extend(az_args)

    cmd = " ".join(cmd)
    print(cmd)
    run(cmd, shell=True, check=True)


@task
def list(ctx):
    """
    List all AKS resources
    """
    _run_aks_cmd("list")


@task(optional=["sgx"])
def provision(
    ctx,
    nodes=AKS_NODE_COUNT,
    vm=AKS_VM_SIZE,
    location=AKS_REGION,
    name=AKS_CLUSTER_NAME,
    sgx=False,
    granny=True,
):
    """
    Provision the AKS cluster
    """
    k8s_ver = get_k8s_version()
    # `sgx` is an optional-value flag: a bare `--sgx` arrives as the bool
    # True, `--sgx=<value>` arrives as a string. Convert to str before
    # calling lower() so the bool case does not raise AttributeError.
    sgx = bool(sgx) and str(sgx).lower() != "false"
    granny_kubelet_config = join(CONFIG_DIR, "granny_aks_kubelet_config.json")
    granny_os_config = join(CONFIG_DIR, "granny_aks_os_config.json")

    if sgx and "Standard_DC" not in vm:
        print(
            "Error provisioning SGX cluster: only `Standard_DC` VMs are supported"
        )
        return

    _run_aks_cmd(
        "create",
        [
            "--name {}".format(name),
            "--node-count {}".format(nodes),
            "--node-vm-size {}".format(vm),
            "--os-sku Ubuntu",
            "--kubernetes-version {}".format(k8s_ver),
            "--ssh-key-value {}".format(AZURE_PUB_SSH_KEY),
            "--location {}".format(location),
            # Could not create a role assignment for ACR. Are you an Owner on this subscription?
            # "--attach-acr {}".format(ACR_NAME.split(".")[0]),
            "{}".format(
                "--kubelet-config {}".format(granny_kubelet_config)
                if granny
                else ""
            ),
            "{}".format(
                "--linux-os-config {}".format(granny_os_config)
                if granny
                else ""
            ),
            "{}".format(
                "--enable-addons confcom --enable-sgxquotehelper"
                if sgx
                else ""
            ),
        ],
    )


@task
def details(ctx):
    """
    Show the details of the cluster
    """
    _run_aks_cmd(
        "show",
        [
            "--name {}".format(AKS_CLUSTER_NAME),
        ],
    )


@task
def delete(ctx, name=AKS_CLUSTER_NAME):
    """
    Delete the AKS cluster
    """
    _run_aks_cmd(
        "delete",
        [
            "--name {}".format(name),
            "--yes",
        ],
    )


@task
def credentials(ctx, name=AKS_CLUSTER_NAME, out_file=None):
    """
    Get credentials for the AKS cluster
    """
    # Set up the credentials
    _run_aks_cmd(
        "get-credentials",
        [
            "--name {}".format(name),
            "--overwrite-existing",
            "--file {}".format(out_file) if out_file else "",
        ],
    )

    # Check we can access the cluster
    cmd = "{} get nodes".format(KUBECTL_BIN)
    print(cmd)
    run(cmd, shell=True, check=True)
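
`_run_aks_cmd` joins its arguments into one string and runs it through the shell, so optional flags that evaluate to `""` leave stray spaces in the printed command. One possible alternative, sketched here with a hypothetical `build_aks_cmd` helper, is to drop empty entries and build an argument list that `subprocess.run` can execute without `shell=True`. The resource group and cluster names below are placeholders, not values from `tasks/util/env.py`:

```python
import shlex


def build_aks_cmd(name, resource_group, az_args=None):
    """Build the argv list for `az aks <name>`, skipping empty options."""
    argv = ["az", "aks", name, "--resource-group", resource_group]
    for arg in az_args or []:
        # Each entry is a string such as "--name foo"; shlex.split("")
        # yields an empty list, so empty entries are dropped
        argv.extend(shlex.split(arg))
    return argv


if __name__ == "__main__":
    argv = build_aks_cmd(
        "get-credentials",
        "my-resource-group",  # placeholder resource group
        ["--name my-cluster", "--overwrite-existing", ""],
    )
    # shlex.join gives a copy-pasteable command line for logging
    print(shlex.join(argv))
    # To execute: subprocess.run(argv, check=True)
```

Without a shell, values are not subject to shell expansion (for example `~` or `$HOME`), so any path such as the SSH public key would need to be expanded beforehand.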
7 changes: 6 additions & 1 deletion tasks/elastic/README.md
@@ -1,4 +1,4 @@
-# Elastic Scaling Micro-Benchmark
+# Elastic Scaling Micro-Benchmark (Fig. 12)

In this experiment we measure the benefits of elastically scaling-up OpenMP
applications to benefit from idle resources. We run a pipe-lined algorithm
@@ -44,6 +44,11 @@ You may now plot the results using:
inv elastic.plot
```

+The plot will be available in [`/plots/elastic/elastic_speedup.pdf`](/plots/elastic/elastic_speedup.pdf); we also include it below:
+
+![Elastic Scaling Plot](/plots/elastic/elastic_speedup.png)
+
+
## Clean-Up

Finally, delete the Granny cluster:
91 changes: 91 additions & 0 deletions tasks/k8s.py
@@ -0,0 +1,91 @@
from invoke import task
from os.path import join, exists
from os import makedirs
from shutil import copy, rmtree
from subprocess import run

from tasks.util.env import (
    BIN_DIR,
    GLOBAL_BIN_DIR,
    K9S_VERSION,
)

from tasks.util.version import get_k8s_version


def _download_binary(url, binary_name):
    makedirs(BIN_DIR, exist_ok=True)
    cmd = "curl -LO {}".format(url)
    run(cmd, shell=True, check=True, cwd=BIN_DIR)
    run("chmod +x {}".format(binary_name), shell=True, check=True, cwd=BIN_DIR)

    return join(BIN_DIR, binary_name)


def _symlink_global_bin(binary_path, name):
    global_path = join(GLOBAL_BIN_DIR, name)
    if exists(global_path):
        print("Removing existing binary at {}".format(global_path))
        run(
            "sudo rm -f {}".format(global_path),
            shell=True,
            check=True,
        )

    print("Symlinking {} -> {}".format(global_path, binary_path))
    run(
        "sudo ln -s {} {}".format(binary_path, name),
        shell=True,
        check=True,
        cwd=GLOBAL_BIN_DIR,
    )


@task
def install_kubectl(ctx, system=False):
    """
    Install the k8s CLI (kubectl)
    """
    k8s_ver = get_k8s_version()
    url = "https://dl.k8s.io/release/v{}/bin/linux/amd64/kubectl".format(
        k8s_ver
    )

    binary_path = _download_binary(url, "kubectl")

    # Symlink for kubectl globally
    if system:
        _symlink_global_bin(binary_path, "kubectl")


@task
def install_k9s(ctx, system=False):
    """
    Install the K9s CLI
    """
    tar_name = "k9s_Linux_amd64.tar.gz"
    url = "https://github.com/derailed/k9s/releases/download/v{}/{}".format(
        K9S_VERSION, tar_name
    )
    print(url)

    # Download the TAR
    workdir = "/tmp/k9s-csg"
    makedirs(workdir, exist_ok=True)

    cmd = "curl -LO {}".format(url)
    run(cmd, shell=True, check=True, cwd=workdir)

    # Untar
    run("tar -xf {}".format(tar_name), shell=True, check=True, cwd=workdir)

    # Copy k9s into place
    binary_path = join(BIN_DIR, "k9s")
    copy(join(workdir, "k9s"), binary_path)

    # Remove the temporary working directory (tarball and extracted files)
    rmtree(workdir)

    # Symlink for k9s command globally
    if system:
        _symlink_global_bin(binary_path, "k9s")
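
`_symlink_global_bin` guards the removal with `exists(global_path)`, which returns `False` for a dangling symlink; in that case the following `ln -s` would fail because the link name is already taken. A minimal pure-Python sketch of the replace-then-link step follows, assuming write access to the target directory (the PR uses `sudo` for `GLOBAL_BIN_DIR`, which this sketch does not handle). `replace_symlink` is a hypothetical helper:

```python
import os
from os.path import join, lexists
from tempfile import TemporaryDirectory


def replace_symlink(binary_path, link_dir, name):
    """Point <link_dir>/<name> at binary_path, replacing any existing entry."""
    link_path = join(link_dir, name)
    # lexists is also True for dangling symlinks, unlike exists
    if lexists(link_path):
        os.remove(link_path)
    os.symlink(binary_path, link_path)
    return link_path


if __name__ == "__main__":
    with TemporaryDirectory() as tmp:
        target = join(tmp, "kubectl-real")
        open(target, "w").close()
        # Calling twice shows that an existing link is replaced cleanly
        replace_symlink(target, tmp, "kubectl")
        link = replace_symlink(target, tmp, "kubectl")
        print(os.readlink(link) == target)
```

In the shell-based version, a similar effect could be obtained by testing with `lexists` before calling `sudo rm -f`.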
19 changes: 9 additions & 10 deletions tasks/kernels_mpi/README.md
@@ -1,34 +1,34 @@
-# ParRes Kernels Experiment (MPI)
+# ParRes Kernels Experiment - MPI (Fig. 9b)

This experiment runs a set of the [ParRes Kernels](https://github.com/ParRes/Kernels)
as a microbenchmark for Granny's MPI implementation.

## Start AKS cluster

-In the `experiment-base` terminal, run:
+Create a new cluster:

```bash
-(faasm-exp-base) inv cluster.provision --vm Standard_D8_v5 --nodes 3 cluster.credentials
+inv cluster.provision --vm Standard_D8_v5 --nodes 3 cluster.credentials
```

## Granny

Deploy the cluster:

```bash
-(faasm-exp-faabric) faasmctl deploy.k8s --workers=2
+faasmctl deploy.k8s --workers=2
```

Upload the WASM file:

```bash
-(faasm-exp-faabric) inv kernels-mpi.wasm.upload
+inv kernels-mpi.wasm.upload
```

and run the experiment with:

```bash
-(faasm-exp-faabric) inv kernels-mpi.run.wasm
+inv kernels-mpi.run.wasm
```

finally, delete the Granny cluster:
@@ -63,15 +63,14 @@ To plot the results, just run:
inv kernels-mpi.plot
```

-the plot will be available in [`./plots/kernels-mpi/mpi_kernels_slowdown.pdf`](
-./plots/kernels-mpi/mpi_kernels_slowdown.pdf), we also include it below:
+The plot will be available in [`/plots/kernels-mpi/mpi_kernels_slowdown.pdf`](/plots/kernels-mpi/mpi_kernels_slowdown.pdf); we also include it below:

-![MPI Kernels Slowdown Plot](./plots/kernels-mpi/mpi_kernels_slowdown.png)
+![MPI Kernels Slowdown Plot](/plots/kernels-mpi/mpi_kernels_slowdown.png)

## Clean-up

Finally, delete the AKS cluster:

```bash
-(faasm-exp-base) inv cluster.delete
+inv cluster.delete
```