Simple note-to-self about restoring a volume from a Longhorn backup.
This should be straightforward, but I have made a mess of it a few times, hence this short 'note-to-self' about restoring a Longhorn backup. Unlike snapshots, Longhorn backups are stored remotely on AWS S3/NFS.
Requirements: Longhorn installed, an existing backup, and a correctly working Longhorn backup target.
Disclaimer: I am not a Longhorn/K3s expert. Use your own judgment.
Backups have already been created. I cannot figure out how to create Longhorn backups from kubectl, so backups are made directly from the Longhorn GUI or by backup jobs. All very simple and well made.
Existing backups:
kubectl get backups.longhorn.io -n longhorn-system
NAME SNAPSHOTNAME SNAPSHOTSIZE SNAPSHOTCREATEDAT STATE LASTSYNCEDAT
backup-485e03e3292146d9 e5533596-6028-4f30-ae87-c486d9e0b6de 385875968 2023-04-05T14:14:13Z Completed 2023-04-05T14:14:17Z
And the volume from which backup backup-485e03e3292146d9 was taken:
kubectl get backupvolumes.longhorn.io -n longhorn-system
NAME CREATEDAT LASTBACKUPNAME LASTBACKUPAT LASTSYNCEDAT
pvc-d735df79-6107-42ff-be45-4a8e4336df66 2023-04-05T14:13:46Z backup-485e03e3292146d9 2023-04-05T14:14:13Z 2023-04-05T15:05:01Z
Notice that the Longhorn backup keeps the 'old' volume name (pvc-d735df79-6107-42ff-be45-4a8e4336df66).
The PVC, connecting the PVC name (wp-pv-claim) to the volume (pvc-d735df79-6107-42ff-be45-4a8e4336df66):
kubectl get persistentvolumeclaims -n wordpress
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
wp-pv-claim Bound pvc-d735df79-6107-42ff-be45-4a8e4336df66 5Gi RWO longhorn 109m
PVC 'wp-pv-claim' has created a volume 'pvc-d735df79-6107-42ff-be45-4a8e4336df66'. For this volume there exists at least one backup: 'backup-485e03e3292146d9'. All of the above data is also available from the Rancher/Longhorn GUI.
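The PVC -> volume -> backup lookup above can be chained into one small helper. This is only a sketch: the function name latest_backup_for_pvc is made up, it assumes Longhorn runs in the 'longhorn-system' namespace, that the backup volume carries the same name as the volume (as in the output above), and that the LASTBACKUPNAME column is read from .status.lastBackupName.

```shell
# Sketch: resolve PVC -> Longhorn volume -> latest backup name.
# latest_backup_for_pvc is a hypothetical helper, not part of Longhorn.
latest_backup_for_pvc() {
  local pvc="$1" ns="$2" vol backup
  # The name of the bound PV is stored in .spec.volumeName of the PVC.
  vol=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}') || return 1
  if [ -z "$vol" ]; then
    echo "PVC $pvc is not bound to a volume" >&2
    return 1
  fi
  # Assumption: the backup volume has the same name as the volume and
  # exposes its latest backup in .status.lastBackupName.
  backup=$(kubectl get backupvolumes.longhorn.io "$vol" -n longhorn-system \
    -o jsonpath='{.status.lastBackupName}') || return 1
  echo "$pvc -> $vol -> $backup"
}
```

Usage would be something like: latest_backup_for_pvc wp-pv-claim wordpress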
Restore backup
Longhorn cannot do in-place restores. I.e., the existing volume needs to be removed first, or the backup needs to be restored to a different volume name. Not a colossal problem, as Longhorn keeps all relevant data for the volume, such as volume name and namespace.
I have two types of restore jobs: A) Pods are running fine, but data is garbage, and B) All is gone: pods/workloads attached to the PVC have been deleted or are failing, and it does not matter whether data has been deleted or corrupted. You are left with the original helm charts/deployment files and the Longhorn backup. Basically, you start over from the namespace.
For A) Restoring a PV while the workloads are still intact is straightforward: delete the PVC, select the relevant backup set and run 'restore latest backup'.
Make absolutely sure the correct PVC/PV is deleted. In most cases you will know which workloads are using which PVCs, but to make sure, the information is available from the Longhorn GUI or by running kubectl describe pod pod-name -n namespace. When the PVC/PV is removed, the workload will stop working.
Find the relevant pod:
kubectl get pods -n wordpress
NAME READY STATUS RESTARTS AGE
wordpress-5c8796574f-ltxj9 1/1 Running 0 17h
wordpress-mysql-5696494775-trv5s 1/1 Running 0 18h
Pod resources (snip from output).
kubectl describe pod wordpress-5c8796574f-ltxj9 -n wordpress
Name: wordpress-5c8796574f-ltxj9
...
Volumes:
wordpress-persistent-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: wp-pv-claim
ReadOnly: false
The PVC for pod 'wordpress-5c8796574f-ltxj9' is 'wp-pv-claim'. Make sure the relevant backups are available for 'wp-pv-claim'.
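Instead of reading through the full 'kubectl describe' output, the claim names can be pulled straight from the pod spec. A minimal sketch; pvcs_for_pod is a hypothetical helper name, and it relies on kubectl's default behaviour of printing nothing for volumes that are not backed by a PVC.

```shell
# Sketch: list the PVC names a pod mounts, one per line.
# pvcs_for_pod is a hypothetical helper name.
pvcs_for_pod() {
  local pod="$1" ns="$2"
  # Volumes that are not PVCs (secrets, config maps, ...) yield empty
  # lines, which sed removes.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .spec.volumes[*]}{.persistentVolumeClaim.claimName}{"\n"}{end}' \
    | sed '/^$/d'
}
```

Usage would be something like: pvcs_for_pod wordpress-5c8796574f-ltxj9 wordpress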
PVC and volume name.
kubectl get persistentvolumeclaims -n wordpress
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
wp-pv-claim Bound pvc-59addab3-11d1-44cc-b81d-8315053b1df1 5Gi RWO longhorn-static 10m
Volume and backup name.
kubectl get backupvolumes.longhorn.io -n longhorn-system
NAME CREATEDAT LASTBACKUPNAME LASTBACKUPAT LASTSYNCEDAT
pvc-59addab3-11d1-44cc-b81d-8315053b1df1 2023-04-05T14:13:30Z backup-b54cb5cebd8b4091 2023-04-05T14:14:06Z 2023-04-06T09:34:00Z
A backup exists for PVC 'wp-pv-claim' and it is called: backup-b54cb5cebd8b4091. The PVC can therefore be deleted now.
Deleting PVC 'wp-pv-claim'.
kubectl delete pvc wp-pv-claim -n wordpress
persistentvolumeclaim "wp-pv-claim" deleted
The PVC will hang in the 'Terminating' state until the workload is re-deployed or the relevant pod is deleted. Check whether the PVC is actually gone:
kubectl get pvc -n wordpress
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
wp-pv-claim Terminating pvc-59addab3-11d1-44cc-b81d-8315053b1df1 5Gi RWO longhorn-static 18h
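The 'Terminating' state comes from the kubernetes.io/pvc-protection finalizer, which Kubernetes keeps on a PVC for as long as a pod is using it. The sketch below lists the pods that still reference a given PVC, so you know which pod to delete or re-deploy. pods_using_pvc is a hypothetical helper name.

```shell
# Sketch: print the pods in a namespace that still mount a given PVC.
# pods_using_pvc is a hypothetical helper name.
pods_using_pvc() {
  local pvc="$1" ns="$2"
  # Emit one line per pod: "<pod-name> <claim> <claim> ...", then keep
  # only the pods where one of the claims matches the PVC.
  kubectl get pods -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{range .spec.volumes[*]}{" "}{.persistentVolumeClaim.claimName}{end}{"\n"}{end}' \
    | awk -v pvc="$pvc" '{ for (i = 2; i <= NF; i++) if ($i == pvc) { print $1; break } }'
}

# Usage, with the names from this note:
#   kubectl get pvc wp-pv-claim -n wordpress -o jsonpath='{.metadata.finalizers}'
#   pods_using_pvc wp-pv-claim wordpress
#   kubectl delete pod <pod-name> -n wordpress
```

Once the last pod using the PVC is gone, the finalizer is removed and the PVC deletion completes.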
If you re-used the old volume name for the restore, the workload should have picked up the restored volume.
kubectl get persistentvolumeclaims -n wordpress
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
mysql-pv-claim Bound pvc-2a530ed0-2dfc-4f6e-a731-0eae79f246a7 5Gi RWO longhorn-static 19h
wp-pv-claim Bound pvc-59addab3-11d1-44cc-b81d-8315053b1df1 5Gi RWO longhorn-static 2m5s
Snip of 'kubectl describe pod'. The pod is again using the relevant PV:
kubectl describe pods wordpress-55c9ff4b54-rfnb7 -n wordpress
...
Volumes:
wordpress-persistent-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: wp-pv-claim
ReadOnly: false
For B) only the backups and the relevant helm chart, or the deployment, service and PVC files, are available.
After restoring namespace, secrets, service, PVC and deployments, the workload is running again. But it is just a blank/default WordPress website (in my case). For a WordPress site this would be the default 'Install WordPress' page. I need to restore all the stuff I have added - configurations, templates, images, stories.
I have two backed-up volumes: pvc-59addab3-11d1-44cc-b81d-8315053b1df1 (WordPress) and pvc-d735df79-6107-42ff-be45-4a8e4336df66 (MySQL).
kubectl get backupvolumes.longhorn.io -n longhorn-system
NAME CREATEDAT LASTBACKUPNAME LASTBACKUPAT LASTSYNCEDAT
pvc-59addab3-11d1-44cc-b81d-8315053b1df1 2023-04-05T14:13:30Z backup-b54cb5cebd8b4091 2023-04-05T14:14:06Z 2023-04-05T14:15:01Z
pvc-d735df79-6107-42ff-be45-4a8e4336df66 2023-04-05T14:13:46Z backup-485e03e3292146d9 2023-04-05T14:14:13Z 2023-04-05T14:15:01Z
kubectl get pvc -n wordpress
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
mysql-pv-claim Bound pvc-2a530ed0-2dfc-4f6e-a731-0eae79f246a7 5Gi RWO longhorn 82s
wp-pv-claim Bound pvc-b3da03f1-8e27-4c40-8f8d-f67c1fd3fa73 5Gi RWO longhorn 61s
So volume pvc-2a530ed0-2dfc-4f6e-a731-0eae79f246a7 is MySQL and pvc-b3da03f1-8e27-4c40-8f8d-f67c1fd3fa73 is WordPress.
My WordPress backup is bound to a (deleted) pod: wordpress-5d958b88df-cmqbt and a volume name: pvc-59addab3-11d1-44cc-b81d-8315053b1df1. So, I need to restore my WordPress backup as: pvc-b3da03f1-8e27-4c40-8f8d-f67c1fd3fa73 (the newly created volume name for the WordPress workload).
When you delete the volume, your workload will fail until the backup is in place.
1) Delete volume: pvc-b3da03f1-8e27-4c40-8f8d-f67c1fd3fa73
2) Restore backup pvc-6123db39-4997-4ca7-a0f2-8af8c8d9a061 and create a volume called: pvc-b3da03f1-8e27-4c40-8f8d-f67c1fd3fa73
Again, deleting the PVCs does not achieve much until the workloads are re-deployed. Both PVCs are stuck in 'Terminating'.
kubectl get pvc -n wordpress
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
mysql-pv-claim Terminating pvc-2a530ed0-2dfc-4f6e-a731-0eae79f246a7 5Gi RWO longhorn 12m
wp-pv-claim Terminating pvc-b3da03f1-8e27-4c40-8f8d-f67c1fd3fa73 5Gi RWO longhorn 11m
Now restore the backup to a volume with a new name (pvc-b3da03f1-8e27-4c40-8f8d-f67c1fd3fa73), which is the volume created by the latest deployment. This time you cannot use the option 'Use previous Name', since the previous name would point to a different volume than the one the deployment is actually using.
Instead use:
Name: pvc-b3da03f1-8e27-4c40-8f8d-f67c1fd3fa73 (volume name for the new deployment.)
'Number of replicas' and 'Access Mode' must also be set.
For this specific WordPress volume: Name: pvc-b3da03f1-8e27-4c40-8f8d-f67c1fd3fa73. Number of replicas: 2. Access Mode: ReadWriteOnce.
After restoring the backups, the volumes match the current deployment of the workloads. I.e., my WordPress/MySQL pods are using these volumes. (The pod names shown under 'Attached to' are nonsense: these are the deleted pods, and it is not a problem.)
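The same check can be done from the command line. The sketch below first confirms that the PVC points at the volume name used for the restore and then prints what Longhorn reports for that volume. check_restored_volume is a hypothetical helper name, and the field names (.status.state, .status.robustness, .spec.numberOfReplicas, .spec.accessMode) are my assumption about the Longhorn volume resource, so verify them against your Longhorn version.

```shell
# Sketch: verify a PVC is bound to the restored volume and print the
# state Longhorn reports for it. Hypothetical helper; the Longhorn
# field names are assumptions.
check_restored_volume() {
  local pvc="$1" ns="$2" expected="$3" bound
  bound=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}') || return 1
  if [ "$bound" != "$expected" ]; then
    echo "MISMATCH: $pvc is bound to '$bound', expected '$expected'" >&2
    return 1
  fi
  # Prints: <state> <robustness> <number of replicas> <access mode>
  kubectl get volumes.longhorn.io "$expected" -n longhorn-system \
    -o jsonpath='{.status.state}{" "}{.status.robustness}{" "}{.spec.numberOfReplicas}{" "}{.spec.accessMode}{"\n"}'
}
```

Usage would be something like: check_restored_volume wp-pv-claim wordpress pvc-b3da03f1-8e27-4c40-8f8d-f67c1fd3fa73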
Lost, but not for long
Heart and mind in search of hope
Found, and we're whole again