
Installation And Usage Instruction #2

Open
stephenstubbs opened this issue Jul 23, 2020 · 9 comments

Comments

@stephenstubbs

Hi,

I would really like to test this with microk8s. How do I install and use it?

@stephenstubbs
Author

I only need backup and restore for dqlite.

@ktsakalozos
Owner

Hi @sstubbs,

This migrator utility is packaged in the snap, so to take a backup you run:

sudo /snap/microk8s/current/bin/migrator --mode backup-dqlite --db-dir db --debug

To restore the backup:

sudo /snap/microk8s/current/bin/migrator --mode restore-to-dqlite --db-dir db --debug

There are a few known issues that I am fixing right now:

  • If restore fails on an entry it does not retry. You may need to re-issue the same restoration command to work around this.
  • Spurious failure messages when the database directories already exist.
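Until the retry fix lands, the re-issue workaround above can be wrapped in a small helper (a sketch, assuming the migrator exits with a non-zero status when it fails on an entry; `retry` is a hypothetical function, not part of the snap):

```shell
#!/usr/bin/env bash
# Re-run a command up to N times until it succeeds.
retry() {
  local attempts=$1; shift
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    echo "attempt $i of $attempts failed, retrying..." >&2
  done
  return 1
}

# e.g. retry 3 sudo /snap/microk8s/current/bin/migrator \
#          --mode restore-to-dqlite --db-dir db --debug
```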

We also want to expose this utility as a command, e.g. microk8s dbctl. What do you think about that?

There is also the need to set expectations right. A backup & restore cycle will restore the datastore, but it will not immediately restore your cluster.
As the newly restored cluster runs under different certificates, you may want to reapply the CNI with:

microk8s.kubectl delete -f /var/snap/microk8s/current/args/cni-network/cni.yaml
microk8s.kubectl apply -f /var/snap/microk8s/current/args/cni-network/cni.yaml

The service account keys (I think) that get restored from the backup will not be valid on the new cluster so they need to be reissued.

Addons that update the kubernetes service arguments (e.g. dns) or download binaries (e.g. helm) will need to be disabled and re-enabled.

Some of the above are reasonable and may be handled by the microk8s dbctl command.

Your feedback is much appreciated. Thank you.

@stephenstubbs
Author

That is really great news. I will test it now. A microk8s dbctl command would be awesome.

@stephenstubbs
Author

stephenstubbs commented Jul 24, 2020

I've just created a new microk8s cluster and enabled the following:

microk8s enable ha-cluster dns rbac prometheus

then backed it up and uninstalled the snap. Then I reinstalled the snap and restored the backup. It works really well in terms of the backup and restore; however, you're right about the service account tokens not being able to be mounted. Is there anything else I can do or test with this?

This would be really great to have, as I'm using OpenEBS to manage local PVs which are on different drives and mounted on the hostpath. Ideally I would like to be able to reinstall microk8s without having to restore them all from external backups in the future. I am backing them up, so it's not the end of the world, but if it's possible to do this that would be awesome.

I deleted and reapplied the cni.yaml but then calico goes into a crashloop after I stop and start microk8s

@ktsakalozos
Owner

Have a look at this comment kubernetes/kubernetes#91070 (comment) .

In general, the complete recovery of a k8s cluster from a backup is not a totally automatic process.

@stephenstubbs
Author

ok, thanks for the help. I will try this and different variations of disabling and re-enabling plugins and the OpenEBS chart and see. For my use it's mainly the bundled microk8s addons and OpenEBS that I would like to recover, as the OpenEBS PVs use directories with unique IDs on the hostpath. Everything else I have deployed can quickly and easily be redeployed anyway, but if I didn't have to redeploy everything else that would be an added bonus.

@stephenstubbs
Author

So I've been trying a recovery with the steps in various orders. I can't get this to work, as coredns and calico just seem to cause too many problems with the previous versions being present in the backup. The only way I can think of would be to omit any resources related to calico and coredns from the backup, so a person could manually enable the dns plugin before restoring the database if they were using it before.

This would be an amazing feature to have, but I think I'm just going to keep using external backups for PVCs when reinstalling microk8s for the time being.

Here are the steps where I can get some pods to run, but I can't think of a better way until there is some other fix.

Initial Cluster Install

sudo snap install microk8s --classic --channel=latest/edge/ha-preview && \
sudo usermod -a -G microk8s $USER && \
su - $USER

Encrypt kubernetes secrets

ENCRYPTION_KEY=$(cat 05-kubernetes/encryption.yaml) && ssh -t server-address \
"sudo -- bash -c 'echo \"$ENCRYPTION_KEY\" > /path/keyfile && chmod 0400 /path/keyfile'"

ulimit, secret encryption & max pods

sed -i 's/ulimit -n 65536/ulimit -n 1048576/g' /var/snap/microk8s/current/args/containerd-env && \
echo "ulimit -c unlimited || true" >> /var/snap/microk8s/current/args/containerd-env && \
echo "--max-pods=250" >> /var/snap/microk8s/current/args/kubelet && \
echo "--encryption-provider-config=/path/keyfile" >> /var/snap/microk8s/current/args/kube-apiserver && \
microk8s stop && \
microk8s start

encrypt all secrets

microk8s kubectl get secrets --all-namespaces -o json | microk8s kubectl replace -f -

addons

microk8s enable dns rbac prometheus
  • use microk8s config to get kubeconfig
  • run sudo vim /var/snap/microk8s/current/certs/csr.conf.template and add external ip

Backup

#!/usr/bin/env bash
sudo mkdir -p /backup && \
microk8s kubectl get sa --all-namespaces \
-o=jsonpath='{range .items[*]}--namespace={.metadata.namespace} serviceaccount/{.metadata.name}{"\n"}{end}' \
| sudo tee /backup/service-accounts.txt && \
sudo /snap/microk8s/current/bin/migrator --mode backup-dqlite --db-dir /backup/db --debug
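Before purging the snap it may be worth sanity-checking what the backup step produced (a sketch; `backup_ok` is a hypothetical helper, not part of microk8s, and it only checks that the dqlite dump directory and the service account list from the steps above exist):

```shell
# Return success only if the backup directory looks complete.
backup_ok() {
  local dir=$1
  # the dqlite dump directory must exist and the SA list must be non-empty
  [ -d "$dir/db" ] && [ -s "$dir/service-accounts.txt" ]
}

# e.g. backup_ok /backup || echo "backup incomplete, do not remove the snap yet"
```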

Restore

sudo snap remove microk8s --purge && \
sudo snap install microk8s --classic --channel=latest/edge/ha-preview && \
sudo usermod -a -G microk8s $USER && \
su - $USER

Encrypt kubernetes secrets

ENCRYPTION_KEY=$(cat 05-kubernetes/encryption.yaml) && ssh -t server-address \
"sudo -- bash -c 'echo \"$ENCRYPTION_KEY\" > /path/keyfile && chmod 0400 /path/keyfile'"

ulimit, secret encryption & max pods

sed -i 's/ulimit -n 65536/ulimit -n 1048576/g' /var/snap/microk8s/current/args/containerd-env && \
echo "ulimit -c unlimited || true" >> /var/snap/microk8s/current/args/containerd-env && \
echo "--max-pods=250" >> /var/snap/microk8s/current/args/kubelet && \
echo "--encryption-provider-config=/path/keyfile" >> /var/snap/microk8s/current/args/kube-apiserver && \
microk8s stop && \
microk8s start

encrypt all secrets

microk8s kubectl get secrets --all-namespaces -o json | microk8s kubectl replace -f -
  • use microk8s config to get kubeconfig
  • run sudo vim /var/snap/microk8s/current/certs/csr.conf.template and add external ip

Restore dqlite

microk8s kubectl delete -f /var/snap/microk8s/current/args/cni-network/cni.yaml
sudo /snap/microk8s/current/bin/migrator --mode restore-to-dqlite --db-dir /backup/db --debug
microk8s disable dns
microk8s kubectl delete -n kube-system replicaset coredns-588fd544bf
  • manually remove hanging coredns pods
microk8s kubectl delete -f /var/snap/microk8s/current/args/cni-network/cni.yaml
  • manually remove hanging calico pods.
microk8s kubectl apply -f /var/snap/microk8s/current/args/cni-network/cni.yaml
sudo cat /backup/service-accounts.txt | xargs -L 1 microk8s kubectl patch --type=merge -p '{"secrets":[]}'
microk8s stop && \
microk8s start
microk8s enable dns

@ktsakalozos
Owner

Here is a suggestion you could try out by itself or in combination with what you already have.

To take a backup do the following:

sudo /snap/microk8s/current/bin/migrator --mode backup-dqlite --db-dir /backup/db --debug
sudo cp -R /var/snap/microk8s/current/credentials /backup/
sudo cp -R /var/snap/microk8s/current/args /backup/
sudo cp -R /var/snap/microk8s/current/certs /backup/

To restore:

sudo cp -R /backup/certs /var/snap/microk8s/current/
sudo cp -R /backup/credentials /var/snap/microk8s/current/
sudo cp -R /backup/args /var/snap/microk8s/current/
sudo /snap/microk8s/current/bin/migrator --mode restore-to-dqlite --db-dir /backup/db --debug
sudo microk8s stop ; sudo microk8s start

The idea is that the "state of the cluster" is not only on the datastore but also on the arguments of the running services, the certificates issued and the credentials used by each component.
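Those four pieces of state (datastore dump plus the args, certs and credentials directories) can be gathered by one small helper (a sketch; `copy_state` is a hypothetical function, and the datastore itself is still dumped separately with the migrator as shown above):

```shell
# Copy the non-datastore cluster state from a microk8s snap
# directory into a backup directory.
copy_state() {
  local src=$1 dst=$2
  mkdir -p "$dst"
  local d
  for d in credentials args certs; do
    cp -R "$src/$d" "$dst/" || return 1
  done
}

# e.g. (as root): copy_state /var/snap/microk8s/current /backup
```

Running the same helper with the arguments swapped covers the restore direction.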

@stephenstubbs
Author

stephenstubbs commented Jul 26, 2020

This seems to work just about perfectly. All I had to do was delete the hanging calico pod; all services say "repairing" and then start working too, it seems. Will do some more testing. Really impressed though. Thanks!
