
Disaster recovery

  1. Back up etcd data
  2. Recover from lost master hosts
  3. Restoring back to a previous cluster state
  4. Recover from expired control plane certificates


Back up etcd data

Follow these steps to back up etcd data by creating a snapshot. This snapshot can be saved and used at a later time if you need to restore etcd.

Prerequisites

  • SSH access to a master host.

Procedure

  1. Access a master host as the root user.

  2. Run the etcd-snapshot-backup.sh script and pass in the location where the etcd snapshot will be saved.
      $ sudo /usr/local/bin/etcd-snapshot-backup.sh ./assets/backup/snapshot.db

    In this example, the snapshot is saved to ./assets/backup/snapshot.db on the master host.
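    As an optional sanity check, you can inspect the snapshot before relying on it. This is a sketch, not part of the backup script itself, and it assumes an etcdctl binary (version 3.x) is available on the host:

      $ ETCDCTL_API=3 etcdctl snapshot status ./assets/backup/snapshot.db -w table   # prints hash, revision, total keys, and size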


Recover from lost master hosts

This section describes the process to recover from the complete loss of one or more master hosts. This includes situations where a majority of master hosts have been lost, leading to etcd quorum loss and the cluster going offline.

At a high level, the procedure is to:

  1. Restore etcd quorum on a remaining master host.
  2. Create new master hosts.
  3. Correct DNS and load balancer entries.
  4. Grow etcd to full membership.

If the majority of master hosts have been lost, you will need a backed up etcd snapshot to restore etcd quorum on the remaining master host.


Follow these steps to recover from the loss of one or more master hosts.

Prerequisites

  • Access to the cluster as a user with the cluster-admin role.
  • SSH access to a remaining master host.
  • A backed up etcd snapshot, if you are recovering from the loss of a majority of masters.

Procedure

  1. Restore etcd quorum on the remaining master.

    Note

    This step is only necessary if a majority of the masters have failed. You can skip this step if a majority of the masters are still available.

    1. Copy the etcd snapshot file to the remaining master host.

      This procedure assumes that you have copied a snapshot file called snapshot.db to the /home/core/ directory of the master host.

    2. Access the remaining master host.

    3. Set the INITIAL_CLUSTER variable to the list of members in the format of <name>=<url>. This variable will be passed to the restore script, and in this procedure, it is assumed that there is only a single member at this time.
        [core@ip-10-0-143-125 ~]$ export INITIAL_CLUSTER="etcd-member-ip-10-0-143-125.ec2.internal=https://etcd-0.clustername.devcluster.openshift.com:2380"

    4. Run the etcd-snapshot-restore.sh script.

      Pass in two parameters to the etcd-snapshot-restore.sh script: the path to the backed up etcd snapshot file and the list of members, which is defined by the INITIAL_CLUSTER variable.

        [core@ip-10-0-143-125 ~]$ sudo /usr/local/bin/etcd-snapshot-restore.sh /home/core/snapshot.db $INITIAL_CLUSTER
        Create asset directory ./assets
        Downloading etcdctl binary..
        etcdctl version: 3.3.10
        API version: 3.3
        Backing up /etc/kubernetes/manifests/etcd-member.yaml to ./assets/backup/
        Stopping all static pods..
        ..stopping kube-scheduler-pod.yaml
        ..stopping kube-controller-manager-pod.yaml
        ..stopping kube-apiserver-pod.yaml
        ..stopping etcd-member.yaml
        Stopping etcd..
        Waiting for etcd-member to stop
        Stopping kubelet..
        Stopping all containers..
        bd44e4bc942276eb1a6d4b48ecd9f5fe95570f54aa9c6b16939fa2d9b679e1ea
        d88defb9da5ae623592b81619e3690faeb4fa645440e71c029812cb960ff586f
        3920ced20723064a379739c4a586f909497a7b6705a5b3cf367d9b930f23a5f1
        d470f7a2d962c90f3a21bcc021970bde96bc8908f317ec70f1c21720b322c25c
        Backing up etcd data-dir..
        Removing etcd data-dir /var/lib/etcd
        Restoring etcd member etcd-member-ip-10-0-143-125.ec2.internal from snapshot..
        2019-05-15 19:03:34.647589 I | pkg/netutil: resolving etcd-0.clustername.devcluster.openshift.com:2380 to 10.0.143.125:2380
        2019-05-15 19:03:34.883545 I | mvcc: restore compact to 361491
        2019-05-15 19:03:34.915679 I | etcdserver/membership: added member cbe982c74cbb42f [https://etcd-0.clustername.devcluster.openshift.com:2380] to cluster 807ae3bffc8d69ca
        Starting static pods..
        ..starting kube-scheduler-pod.yaml
        ..starting kube-controller-manager-pod.yaml
        ..starting kube-apiserver-pod.yaml
        ..starting etcd-member.yaml
        Starting kubelet..
        

      Once the etcd-snapshot-restore.sh script completes, the cluster has a single-member etcd cluster, and API services begin restarting. This might take up to 15 minutes.

      In a terminal that has access to the cluster, run the following command to verify that the master is ready:

        $ oc get nodes -l node-role.kubernetes.io/master
        NAME                                         STATUS   ROLES    AGE   VERSION
        ip-10-0-143-125.us-east-2.compute.internal   Ready    master   46m   v1.13.4+db7b699c3
      Note

      Be sure that all old etcd members being replaced are shut down. Otherwise, they might try to connect to the new cluster and will report errors like the following in the logs:

        2019-05-20 15:33:17.648445 E | rafthttp: request cluster ID mismatch (got 9f5f9f05e4d43b7f want 807ae3bffc8d69ca)
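      If a failed master is still reachable over SSH, one way to keep its etcd member down is to move the static pod manifest aside so that the kubelet stops the pod. This is a sketch, not part of the documented recovery scripts, and the host name is a placeholder:

        [core@old-master ~]$ sudo mv /etc/kubernetes/manifests/etcd-member.yaml /tmp/
        [core@old-master ~]$ sudo crictl ps --name etcd-member   # expect no running etcd-member container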

  2. Create new master hosts.

    If the cluster has its Machine API enabled and functional, then when the OpenShift machine-api Operator is restored, it will create the new masters. If the machine-api Operator is not enabled, create new masters using the same method that was used to originally create them.

    You will also need to approve the certificate signing requests (CSRs) for these new master hosts. Two pending CSRs are generated for each machine that was added to the cluster.

    1. In a terminal that has access to the cluster, run the following commands to approve the CSRs:

      1. Get the list of current CSRs.

          $ oc get csr

      2. Review the details of a CSR to verify it is valid.

          $ oc describe csr <csr_name> [1]

        [1] <csr_name> is the name of a CSR from the list of current CSRs.

      3. Approve each valid CSR.

          $ oc adm certificate approve <csr_name>

        Be sure to approve both the pending client and server CSR for each master that was added to the cluster.
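        If many CSRs are pending, you can approve them in bulk. This one-liner is a convenience sketch, not part of the documented procedure; review the output of oc get csr first, because it attempts to approve every listed request:

          $ oc get csr -o name | xargs oc adm certificate approve   # approves every listed CSR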

    2. In a terminal that has access to the cluster, run the following command to verify that your masters are ready:

        $ oc get nodes -l node-role.kubernetes.io/master
        NAME                                         STATUS   ROLES    AGE   VERSION
        ip-10-0-143-125.us-east-2.compute.internal   Ready    master   50m   v1.13.4+db7b699c3
        ip-10-0-156-255.us-east-2.compute.internal   Ready    master   92s   v1.13.4+db7b699c3
        ip-10-0-162-178.us-east-2.compute.internal   Ready    master   70s   v1.13.4+db7b699c3

  3. Correct the DNS entries.

    1. From the AWS console, review the etcd-0, etcd-1, and etcd-2 Route 53 records in the private DNS zone, and if necessary, update the value to the appropriate new private IP address. See Editing Records in the AWS documentation for instructions.

      You can obtain the private IP address of an instance by running the following command in a terminal that has access to the cluster.

        $ oc get node ip-10-0-143-125.us-east-2.compute.internal -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}'
        10.0.143.125

  4. Update load balancer entries.

    If you are using a cluster-managed load balancer, the entries will be updated for you automatically. If you are not, be sure to update your load balancer with the current addresses of the master hosts.

    If your load balancing is managed by AWS, see Register or Deregister Targets by IP Address in the AWS documentation for instructions on updating load balancer entries.
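    For example, with an AWS Network Load Balancer that uses IP targets, the registration might look like the following. The target group ARN and IP address are placeholders; substitute your own values:

      $ aws elbv2 register-targets \
          --target-group-arn arn:aws:elasticloadbalancing:us-east-2:123456789012:targetgroup/example/0123456789abcdef \
          --targets Id=10.0.156.255   # placeholder: the new master's private IP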

  5. Grow etcd to full membership.

    1. Set up a temporary etcd certificate signer service on the master host where you restored etcd.

      1. Access the original master, and log in to the cluster as a cluster-admin user.

          [core@ip-10-0-143-125 ~]$ oc login https://localhost:6443
          Authentication required for https://localhost:6443 (openshift)
          Username: kubeadmin
          Password:
          Login successful.

      2. Obtain the pull specification for the kube-etcd-signer-server image.

        [core@ip-10-0-143-125 ~]$ export KUBE_ETCD_SIGNER_SERVER=$(sudo oc adm release info --image-for kube-etcd-signer-server --registry-config=/var/lib/kubelet/config.json)

      3. Run the tokenize-signer.sh script.

        Be sure to pass in the -E flag to sudo so that environment variables are properly passed to the script.

          [core@ip-10-0-143-125 ~]$ sudo -E /usr/local/bin/tokenize-signer.sh ip-10-0-143-125 [1]
          Populating template /usr/local/share/openshift-recovery/template/kube-etcd-cert-signer.yaml.template
          Populating template ./assets/tmp/kube-etcd-cert-signer.yaml.stage1
          Tokenized template now ready: ./assets/manifests/kube-etcd-cert-signer.yaml

        [1] The host name of the original master you just restored, where the signer should be deployed.

      4. Create the signer Pod using the file that was generated.

          [core@ip-10-0-143-125 ~]$ oc create -f assets/manifests/kube-etcd-cert-signer.yaml
          pod/etcd-signer created

      5. Verify that the signer is listening on this master node.

          [core@ip-10-0-143-125 ~]$ ss -ltn | grep 9943
          LISTEN   0         128                       *:9943                   *:*

    2. Add the new master hosts to the etcd cluster.

      1. Access one of the new master hosts, and log in to the cluster as a cluster-admin user.

          [core@ip-10-0-156-255 ~]$ oc login https://localhost:6443
          Authentication required for https://localhost:6443 (openshift)
          Username: kubeadmin
          Password:
          Login successful.

      2. Export two environment variables required by the etcd-member-recover.sh script.

          [core@ip-10-0-156-255 ~]$ export SETUP_ETCD_ENVIRONMENT=$(sudo oc adm release info --image-for setup-etcd-environment --registry-config=/var/lib/kubelet/config.json)

          [core@ip-10-0-156-255 ~]$ export KUBE_CLIENT_AGENT=$(sudo oc adm release info --image-for kube-client-agent --registry-config=/var/lib/kubelet/config.json)

      3. Run the etcd-member-recover.sh script.

        Be sure to pass in the -E flag to sudo so that environment variables are properly passed to the script.

          [core@ip-10-0-156-255 ~]$ sudo -E /usr/local/bin/etcd-member-recover.sh 10.0.143.125 etcd-member-ip-10-0-156-255.ec2.internal [1]
          Downloading etcdctl binary..
          etcdctl version: 3.3.10
          API version: 3.3
          etcd-member.yaml found in ./assets/backup/
          etcd.conf backup already exists ./assets/backup/etcd.conf
          Trying to backup etcd client certs..
          etcd client certs already backed up and available ./assets/backup/
          Stopping etcd..
          Wait for etcd-member to stop
          etcd data-dir backup found ./assets/backup/etcd..
          etcd TLS certificate backups found in ./assets/backup..
          Removing etcd certs..
          Populating template /usr/local/share/openshift-recovery/template/etcd-generate-certs.yaml.template
          Populating template ./assets/tmp/etcd-generate-certs.stage1
          Populating template ./assets/tmp/etcd-generate-certs.stage2
          Starting etcd client cert recovery agent..
          Wait for certs to generate..
          Wait for certs to generate..
          Wait for certs to generate..
          Wait for certs to generate..
          Stopping cert recover..
          Wait for generate-certs to stop
          Patching etcd-member manifest..
          Updating etcd membership..
          Member 249a4b9a790b3719 added to cluster 807ae3bffc8d69ca

          ETCD_NAME="etcd-member-ip-10-0-156-255.ec2.internal"
          ETCD_INITIAL_CLUSTER="etcd-member-ip-10-0-143-125.ec2.internal=https://etcd-0.clustername.devcluster.openshift.com:2380,etcd-member-ip-10-0-156-255.ec2.internal=https://etcd-1.clustername.devcluster.openshift.com:2380"
          ETCD_INITIAL_ADVERTISE_PEER_URLS="https://etcd-1.clustername.devcluster.openshift.com:2380"
          ETCD_INITIAL_CLUSTER_STATE="existing"
          Starting etcd..

        [1] Specify both the IP address of the original master where the signer server is running, and the etcd name of the new member.

      4. Verify that the new master host has been added to the etcd member list.

        1. Access the original master and connect to the running etcd container.

            [core@ip-10-0-143-125 ~]$ id=$(sudo crictl ps --name etcd-member | awk 'FNR==2{ print $1}') && sudo crictl exec -it $id /bin/sh

        2. In the etcd container, export variables needed for connecting to etcd.

            sh-4.2# export ETCDCTL_API=3 ETCDCTL_CACERT=/etc/ssl/etcd/ca.crt ETCDCTL_CERT=$(find /etc/ssl/ -name '*peer*crt') ETCDCTL_KEY=$(find /etc/ssl/ -name '*peer*key')

        3. In the etcd container, execute etcdctl member list and verify that the new member is listed.

            sh-4.2#  etcdctl member list -w table
            
            +------------------+---------+-------------------------------------------+-----------------------------------------------------------+---------------------------+
            |        ID        | STATUS  |                   NAME                    |                        PEER ADDRS                         |       CLIENT ADDRS        |
            +------------------+---------+-------------------------------------------+-----------------------------------------------------------+---------------------------+
            |  cbe982c74cbb42f | started | etcd-member-ip-10-0-143-125.ec2.internal  | https://etcd-0.clustername.devcluster.openshift.com:2380  | https://10.0.143.125:2379 |
            | 249a4b9a790b3719 | started | etcd-member-ip-10-0-156-255.ec2.internal  | https://etcd-1.clustername.devcluster.openshift.com:2380  | https://10.0.156.255:2379 |
            +------------------+---------+-------------------------------------------+-----------------------------------------------------------+---------------------------+

          Note that it may take up to 10 minutes for the new member to start.
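          While you wait, you can poll the health of the local member from inside the same etcd container. This assumes the ETCDCTL_* variables exported above are still set, and it checks only the endpoint that etcdctl targets by default:

            sh-4.2# etcdctl endpoint health   # checks the local member endpoint only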

      5. Repeat these steps to add your other new master hosts until you have achieved full etcd membership.

    3. After all members are restored, remove the signer Pod because it is no longer needed.

      In a terminal that has access to the cluster, run:

        $ oc delete pod -n openshift-config etcd-signer


Restoring back to a previous cluster state

To restore the cluster to a previous state, you must have previously backed up etcd data by creating a snapshot. You can use this saved etcd snapshot to restore back to a previous cluster state.

Prerequisites

  • Access to the cluster as a user with the cluster-admin role.
  • SSH access to master hosts.
  • A backed up etcd snapshot.

Procedure

  1. Prepare each master host in the cluster to be restored.

    You should run the restore script on all of the master hosts within a short period of time so that the cluster members come up at about the same time and form a quorum. For this reason, it is recommended that you stage each master host in a separate terminal, so that the restore script can then be started quickly on each. A scripted alternative is sketched at the end of this step.

    1. Copy the etcd snapshot file to a master host.

      This procedure assumes that you have copied a snapshot file called snapshot.db to the /home/core/ directory of the master host.

    2. Access the master host.

    3. Set the INITIAL_CLUSTER variable to the list of members in the format of <name>=<url>. This variable will be passed to the restore script and must be exactly the same for each member.

        [core@ip-10-0-143-125 ~]$ export INITIAL_CLUSTER="etcd-member-ip-10-0-143-125.ec2.internal=https://etcd-0.clustername.devcluster.openshift.com:2380,etcd-member-ip-10-0-35-108.ec2.internal=https://etcd-1.clustername.devcluster.openshift.com:2380,etcd-member-ip-10-0-10-16.ec2.internal=https://etcd-2.clustername.devcluster.openshift.com:2380"

    4. Repeat these steps on your other master hosts, each in a separate terminal.
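    As a scripted alternative to separate terminals, you could stage and start the restore over SSH. This is a sketch only: the host names are placeholders, INITIAL_CLUSTER must be defined in the local shell, and it assumes SSH access as the core user with passwordless sudo:

      $ for host in ip-10-0-143-125 ip-10-0-35-108 ip-10-0-10-16; do   # placeholder host names
          ssh "core@${host}" "sudo /usr/local/bin/etcd-snapshot-restore.sh /home/core/snapshot.db '${INITIAL_CLUSTER}'" &   # one restore per host, in parallel
        done; wait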

  2. Run the restore script on all of the master hosts.

    1. Start the etcd-snapshot-restore.sh script on your first master host. Pass in two parameters: the path to the snapshot file and list of members, which is defined by the INITIAL_CLUSTER variable.

        [core@ip-10-0-143-125 ~]$ sudo /usr/local/bin/etcd-snapshot-restore.sh /home/core/snapshot.db $INITIAL_CLUSTER
        Create asset directory ./assets
        Downloading etcdctl binary..
        etcdctl version: 3.3.10
        API version: 3.3
        Backing up /etc/kubernetes/manifests/etcd-member.yaml to ./assets/backup/
        Stopping all static pods..
        ..stopping kube-scheduler-pod.yaml
        ..stopping kube-controller-manager-pod.yaml
        ..stopping kube-apiserver-pod.yaml
        ..stopping etcd-member.yaml
        Stopping etcd..
        Wait for etcd-member to stop
        Stopping kubelet..
        Stopping all containers..
        bd44e4bc942276eb1a6d4b48ecd9f5fe95570f54aa9c6b16939fa2d9b679e1ea
        d88defb9da5ae623592b81619e3690faeb4fa645440e71c029812cb960ff586f
        3920ced20723064a379739c4a586f909497a7b6705a5b3cf367d9b930f23a5f1
        d470f7a2d962c90f3a21bcc021970bde96bc8908f317ec70f1c21720b322c25c
        Backing up etcd data-dir..
        Removing etcd data-dir /var/lib/etcd
        Restoring etcd member etcd-member-ip-10-0-143-125.ec2.internal from snapshot..
        2019-05-15 19:03:34.647589 I | pkg/netutil: resolving etcd-0.clustername.devcluster.openshift.com:2380 to 10.0.143.125:2380
        2019-05-15 19:03:34.883545 I | mvcc: restore compact to 361491
        2019-05-15 19:03:34.915679 I | etcdserver/membership: added member cbe982c74cbb42f [https://etcd-0.clustername.devcluster.openshift.com:2380] to cluster 807ae3bffc8d69ca
        Starting static pods..
        ..starting kube-scheduler-pod.yaml
        ..starting kube-controller-manager-pod.yaml
        ..starting kube-apiserver-pod.yaml
        ..starting etcd-member.yaml
        Starting kubelet..

    2. Once the restore process starts on your first master host, start the script on your other master hosts.

  3. Verify that the Machine Configs have been applied.

    In a terminal that has access to the cluster as a cluster-admin user, run the following command:

      $ oc get machineconfigpool
      NAME     CONFIG                                             UPDATED   UPDATING
      master   rendered-master-50e7e00374e80b767fcc922bdfbc522b   True      False

    When the snapshot has been applied, the currentConfig of the master will match the ID from when the etcd snapshot was taken. The currentConfig name for masters is in the format rendered-master-<currentConfig>.
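    One way to read a master's current rendered config is through the Machine Config Daemon annotation on the node object. The annotation key below is the standard Machine Config Operator annotation; the node name is an example:

      $ oc get node ip-10-0-143-125.us-east-2.compute.internal \
          -o jsonpath='{.metadata.annotations.machineconfiguration\.openshift\.io/currentConfig}{"\n"}'   # standard MCO node annotation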

  4. Verify that all master hosts have started and joined the cluster.

    1. Access a master host and connect to the running etcd container.

        [core@ip-10-0-143-125 ~]$ id=$(sudo crictl ps --name etcd-member | awk 'FNR==2{ print $1}') && sudo crictl exec -it $id /bin/sh

    2. In the etcd container, export variables needed for connecting to etcd.

        sh-4.2# export ETCDCTL_API=3 ETCDCTL_CACERT=/etc/ssl/etcd/ca.crt ETCDCTL_CERT=$(find /etc/ssl/ -name '*peer*crt') ETCDCTL_KEY=$(find /etc/ssl/ -name '*peer*key')

    3. In the etcd container, execute etcdctl member list and verify that the three members show as started.

        sh-4.2#  etcdctl member list -w table
        
        +------------------+---------+-------------------------------------------+-----------------------------------------------------------+---------------------------+
        |        ID        | STATUS  |                   NAME                    |                        PEER ADDRS                         |       CLIENT ADDRS        |
        +------------------+---------+-------------------------------------------+-----------------------------------------------------------+---------------------------+
        | 29e461db6be4eaaa | started | etcd-member-ip-10-0-164-170.ec2.internal  | https://etcd-2.clustername.devcluster.openshift.com:2380  | https://10.0.164.170:2379 |
        |  cbe982c74cbb42f | started | etcd-member-ip-10-0-143-125.ec2.internal  | https://etcd-0.clustername.devcluster.openshift.com:2380  | https://10.0.143.125:2379 |
        | a752f80bcb0da3e8 | started | etcd-member-ip-10-0-156-2.ec2.internal    | https://etcd-1.clustername.devcluster.openshift.com:2380  |   https://10.0.156.2:2379 |
        +------------------+---------+-------------------------------------------+-----------------------------------------------------------+---------------------------+

      Note that it may take up to 10 minutes for each new member to start.


Recover from expired control plane certificates

Follow this procedure to recover from a situation where your control plane certificates have expired.

Prerequisites

  • SSH access to master hosts.

Procedure

  1. Access a master host with an expired certificate as the root user.

  2. Obtain the cluster-kube-apiserver-operator image reference for a release.

      # RELEASE_IMAGE=<release_image> [1]

    [1] An example value for <release_image> is quay.io/openshift-release-dev/ocp-release:4.1.0.

      # KAO_IMAGE=$( oc adm release info --registry-config='/var/lib/kubelet/config.json' "${RELEASE_IMAGE}" --image-for=cluster-kube-apiserver-operator )

  3. Pull the cluster-kube-apiserver-operator image.

      # podman pull --authfile=/var/lib/kubelet/config.json "${KAO_IMAGE}"

  4. Create a recovery API server.

      # podman run \
               -it \
               --network=host \
               -v /etc/kubernetes/:/etc/kubernetes/:Z \
               --entrypoint=/usr/bin/cluster-kube-apiserver-operator "${KAO_IMAGE}" recovery-apiserver create
      

  5. Run the export KUBECONFIG command that is printed in the output of the previous command. The KUBECONFIG variable is needed by the oc commands later in this procedure.

      # export KUBECONFIG=/<path_to_recovery_kubeconfig>/admin.kubeconfig

  6. Wait for the recovery API server to come up.

      # until oc get namespace kube-system 2>/dev/null 1>&2; do echo 'Waiting for recovery apiserver to come up.'; sleep 1; done

  7. Run the regenerate-certificates command. It fixes the certificates in the API, overwrites the old certificates on the local drive, and restarts static Pods to pick them up.

      # podman run \
               -it \
               --network=host \
               -v /etc/kubernetes/:/etc/kubernetes/:Z \
               --entrypoint=/usr/bin/cluster-kube-apiserver-operator "${KAO_IMAGE}" regenerate-certificates
      

  8. After the certificates are fixed in the API, use the following commands to force new rollouts for the control plane. The control plane will reinstall itself on the other nodes because the kubelet is connected to the API servers through an internal load balancer.

      # oc patch kubeapiserver cluster \
           -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' \
           --type=merge
      

      # oc patch kubecontrollermanager cluster \
           -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' \
           --type=merge
      

      # oc patch kubescheduler cluster \
           -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' \
           --type=merge
      

  9. Create a bootstrap kubeconfig with a valid user.

    1. Run the recover-kubeconfig.sh script and save the output to a file called kubeconfig.

        # recover-kubeconfig.sh > kubeconfig

    2. Copy the kubeconfig file to all master hosts and move it to /etc/kubernetes/kubeconfig.
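      For example, from the host where you generated the file, you could copy it over SSH. This is a sketch: the host name is a placeholder, and it assumes access as the core user with passwordless sudo:

        # scp kubeconfig core@<master_host>:/tmp/kubeconfig   # <master_host> is a placeholder
        # ssh core@<master_host> 'sudo mv /tmp/kubeconfig /etc/kubernetes/kubeconfig'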

    3. Get the CA certificate used to validate connections from the API server.

        # oc get configmap kube-apiserver-to-kubelet-client-ca \
             -n openshift-kube-apiserver-operator \
             --template='{{ index .data "ca-bundle.crt" }}' > /etc/kubernetes/ca.crt
        

    4. Copy the /etc/kubernetes/ca.crt file to all other master hosts and nodes.

    5. Add the machine-config-daemon-force file to all master hosts and nodes to force the Machine Config Daemon to accept this certificate update.

        # touch /run/machine-config-daemon-force

  10. Recover the kubelet on all masters.

    1. On a master host, stop the kubelet.

        # systemctl stop kubelet

    2. Delete stale kubelet data.

        # rm -rf /var/lib/kubelet/pki /var/lib/kubelet/kubeconfig

    3. Restart the kubelet.

        # systemctl start kubelet

    4. Repeat these steps on all other master hosts.

  11. If necessary, recover the kubelet on the worker nodes.

    After the master nodes are restored, the worker nodes might restore themselves. You can verify this by running the oc get nodes command. If the worker nodes are not listed, then perform the following steps on each worker node.

    1. Stop the kubelet.

        # systemctl stop kubelet

    2. Delete stale kubelet data.

        # rm -rf /var/lib/kubelet/pki /var/lib/kubelet/kubeconfig

    3. Restart the kubelet.

        # systemctl start kubelet
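    If you have many worker nodes, the same three commands can be scripted over SSH. This is a sketch with placeholder host names; it assumes SSH access as the core user with passwordless sudo:

      # for host in <worker_host_1> <worker_host_2>; do   # placeholder host names
          ssh "core@${host}" 'sudo systemctl stop kubelet; sudo rm -rf /var/lib/kubelet/pki /var/lib/kubelet/kubeconfig; sudo systemctl start kubelet'
        done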

  12. Approve the pending node-bootstrapper certificate signing requests (CSRs).

    1. Get the list of current CSRs.

        # oc get csr

    2. Review the details of a CSR to verify it is valid.

        # oc describe csr <csr_name> [1]

      [1] <csr_name> is the name of a CSR from the list of current CSRs.

    3. Approve each valid CSR.

        # oc adm certificate approve <csr_name>

      Be sure to approve all pending node-bootstrapper CSRs.
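      To list just the node-bootstrapper requests, you can filter CSRs by the requesting user. The service account below is the standard node-bootstrapper identity in OpenShift 4; this sketch also assumes jq is available on the host:

        # oc get csr -o json | jq -r '.items[] | select(.spec.username == "system:serviceaccount:openshift-machine-config-operator:node-bootstrapper") | .metadata.name'   # lists CSRs requested by the node-bootstrapper service account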

  13. Destroy the recovery API server because it is no longer needed.

      # podman run -it --network=host -v /etc/kubernetes/:/etc/kubernetes/:Z --entrypoint=/usr/bin/cluster-kube-apiserver-operator "${KAO_IMAGE}" recovery-apiserver destroy

    Wait for the control plane to restart and pick up the new certificates. This might take up to 10 minutes.
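    One way to watch the rollout, assuming the recovery KUBECONFIG is still exported, is to check the relevant cluster Operators; they should report Available without Degraded once the redeployments finish:

      # oc get clusteroperators kube-apiserver kube-controller-manager kube-scheduler   # watch for AVAILABLE=True, DEGRADED=False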

