Proxmox: Restore Virtual Machines via ZFS Snapshots
Sometimes you wish you could go back in time when working with the virtual machines in your homelab. With Proxmox and ZFS snapshots, this is easy to achieve.
Introduction
Proxmox is really good at managing a personal virtual environment. For managing ZFS snapshots, however, I highly recommend the command line, because some very specific commands are required. So if you don’t like the command line, this tutorial may not be for you.
Use cases
Going back in time may be useful in the following use cases:
- Ransomware somehow managed to encrypt your data
- A Windows Update broke something
- You accidentally deleted a file
  - although in this case it may be easier to mount the snapshot and look for the file instead of restoring the whole VM
Prerequisites
- No fear of the command line
- Proxmox installed on a ZFS volume (encryption is also supported)
- A recent ZFS snapshot to go back to
- Optional: zfs-auto-snapshot for automatic snapshot management
Digression: Painlessly installing zfs-auto-snapshot
If you would like to add some features to your Proxmox server painlessly (e.g. zfs-auto-snapshot), you could take a look at these setup scripts:
- Xshok Scripts
  - For zfs-auto-snapshot specifically, see this part of the script: https://github.com/extremeshok/xshok-proxmox/blob/35c323b8fb2f7fefb0e6dc214426db1230cbea3a/install-post.sh#L351
- Tteck Scripts
Gathering the required information
Find out the disk identifier
- Go to the Proxmox web interface
- Select the virtual machine you would like to revert
- Go to Hardware and check the device part of the used Hard Disk (e.g. local-zfs:vm-501-disk-2,iothread=1,size=100G)
- Note down the disk identifier (e.g. vm-501-disk-2)
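If you would rather not copy the identifier by hand, it can be cut out of the hardware entry with plain shell parameter expansion. This is only a sketch: disk_id_from_entry is a hypothetical helper name, and it assumes the entry has the shape shown above (storage name, colon, volume name, then comma-separated options).

```shell
# Hypothetical helper: extract the ZFS volume name from a Proxmox disk entry.
disk_id_from_entry() {
    entry=$1
    volume=${entry%%,*}            # drop the options after the first comma
    printf '%s\n' "${volume##*:}"  # drop everything up to the last colon
}

# Example with the entry from the Hardware tab
disk_id_from_entry 'local-zfs:vm-501-disk-2,iothread=1,size=100G'
# prints: vm-501-disk-2
```

Because everything up to the last colon is removed, the helper also accepts a full config line that starts with a device key.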
List the existing snapshots
You need a snapshot to go back in time. If you configured zfs-auto-snapshot, you should be able to go back to hourly, daily, weekly, monthly, etc. snapshots. If not, you can only go back to manually created snapshots. Here is how to list them:
zfs list -t snapshot | grep vm-501-disk-2
Output may look like this:
rpool/data/vm-501-disk-2@zfs-auto-snap_daily-2023-06-21-0425 1.18G - 65.1G -
...
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1000 2.42M - 64.9G -
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1005 2.34M - 64.9G -
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1010 2.55M - 64.9G -
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1015 1.23M - 64.9G -
rpool/data/vm-501-disk-2@zfs-auto-snap_hourly-2023-06-25-1017 332K - 64.9G -
rpool/data/vm-501-disk-2@zfs-auto-snap_daily-2023-06-21-0425 1.18G - 65.1G -
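With many automatic snapshots, picking the right one by eye gets tedious. The sketch below selects the newest snapshot taken at or before a given point in time. It assumes the zfs-auto-snapshot naming shown above, where every name ends in a YYYY-MM-DD-HHMM timestamp; latest_snapshot_before is a hypothetical helper name, and the demo feeds it sample names instead of calling zfs.

```shell
# Hypothetical helper: read snapshot names on stdin (one per line) and print
# the newest one whose timestamp suffix is at or before the cutoff.
latest_snapshot_before() {
    cutoff=$1   # format: YYYY-MM-DD-HHMM
    awk -v cutoff="$cutoff" '
        {
            # the last 15 characters of the name are the timestamp
            ts = substr($1, length($1) - 14)
            if (ts <= cutoff && ts > best_ts) { best_ts = ts; best = $1 }
        }
        END { if (best != "") print best }
    '
}

# Demo with names taken from the listing above
latest_snapshot_before 2023-06-25-1012 <<'EOF'
rpool/data/vm-501-disk-2@zfs-auto-snap_daily-2023-06-21-0425
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1000
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1005
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1010
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1015
rpool/data/vm-501-disk-2@zfs-auto-snap_hourly-2023-06-25-1017
EOF
# prints: rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1010

# On the Proxmox host the names would come from zfs itself, for example:
#   zfs list -H -r -t snapshot -o name rpool/data/vm-501-disk-2 | latest_snapshot_before 2023-06-22-0000
```

Manually created snapshots with other names would need a different selection rule, since the comparison relies on the timestamp suffix.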
Clone the snapshot you want to go back to
Let’s say you had a ransomware attack and you don’t know exactly when it started. Rather than blindly rolling back to a snapshot you guessed (which deletes all newer snapshots of the same timeline), you should instead clone the snapshot to a new location. This way you can try things out without destroying any existing data or snapshots.
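To see what a rollback would cost, you can list the snapshots that are newer than your candidate before touching anything. The sketch below does this for a list of names in creation order; snapshots_newer_than is a hypothetical helper name, and the demo uses sample names rather than live zfs output.

```shell
# Hypothetical helper: read snapshot names on stdin in creation order and
# print every name that comes after the target. These are the snapshots a
# "zfs rollback -r" to the target would delete.
snapshots_newer_than() {
    target=$1
    awk -v target="$target" 'found { print } $0 == target { found = 1 }'
}

# Demo with sample names
snapshots_newer_than 'rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1005' <<'EOF'
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1000
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1005
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1010
rpool/data/vm-501-disk-2@zfs-auto-snap_frequent-2023-06-25-1015
EOF
# prints the 1010 and 1015 snapshots

# On the Proxmox host, feed it the real list sorted by creation time:
#   zfs list -H -r -t snapshot -o name -s creation rpool/data/vm-501-disk-2 | snapshots_newer_than <snapshot>
```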
To try the snapshot from 3 days ago (rpool/data/vm-501-disk-2@zfs-auto-snap_daily-2023-06-22-0425):
# clone an existing snapshot to a new disk (I use disk-9 as a convention for cloned snapshots)
zfs clone rpool/data/vm-501-disk-2@zfs-auto-snap_daily-2023-06-22-0425 rpool/data/vm-501-disk-9
# verify the cloned disk exists
zfs list | grep vm-501-disk
rpool/data/vm-501-disk-0 592K 1.42T 376K -
rpool/data/vm-501-disk-1 496K 1.42T 116K -
rpool/data/vm-501-disk-2 123G 1.42T 65.2G -
rpool/data/vm-501-disk-9 8K 1.42T 65.2G -
# verify the disk has the correct origin
zfs get origin rpool/data/vm-501-disk-9
NAME PROPERTY VALUE
rpool/data/vm-501-disk-9 origin rpool/data/vm-501-disk-2@zfs-auto-snap_daily-2023-06-22-0425
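If you repeat this for several candidate snapshots, deriving the clone name from the snapshot name avoids typos. The following is a sketch under the disk-9 convention used here; clone_name_for is a hypothetical helper name, and the commands are only printed (a dry run), not executed.

```shell
# Hypothetical helper: map a snapshot name to the clone name used in this
# tutorial by replacing the disk number with 9.
clone_name_for() {
    snapshot=$1
    dataset=${snapshot%@*}                     # strip "@snapshot-name"
    printf '%s\n' "${dataset%-disk-*}-disk-9"  # swap the disk number for 9
}

snap='rpool/data/vm-501-disk-2@zfs-auto-snap_daily-2023-06-22-0425'
clone=$(clone_name_for "$snap")

# Dry run: print the commands instead of running them
echo "zfs clone $snap $clone"
echo "zfs get -H -o value origin $clone"
```

The second printed command, when run for real, returns only the origin value, which makes it easy to compare against the snapshot name in a script.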
Booting the cloned snapshot
To use the new disk, you have to trick Proxmox into using the cloned disk instead of the original disk.
- Go to the Proxmox web interface
- Stop the virtual machine you are trying to restore
- Go to the command line and edit the config file manually
  # optional but recommended: back up the original config file (in case you break something)
  cp /etc/pve/nodes/proxmox/qemu-server/501.conf /root/501.conf-2023-06-25
  vi /etc/pve/nodes/proxmox/qemu-server/501.conf
  # change vm-501-disk-2 (original) to vm-501-disk-9 (the clone)
- Now boot the virtual machine and check if everything is working as expected
- If everything works as expected, stop the machine again
- Revert the changes to 501.conf (yes, revert the config changes you just made)
  vi /etc/pve/nodes/proxmox/qemu-server/501.conf
  # change vm-501-disk-9 (clone) back to vm-501-disk-2 (the original)
- Destroy the clone
  # destroy the clone, it is no longer needed
  zfs destroy -r rpool/data/vm-501-disk-9
- Roll back to the target snapshot (CAUTION: This will also delete ALL newer snapshots after the one you’re rolling back to)
  # roll back to the working state
  zfs rollback -r rpool/data/vm-501-disk-2@zfs-auto-snap_daily-2023-06-22-0425
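The two manual vi edits in the steps above can also be scripted. The following is a minimal sketch and not part of the original workflow: swap_disk is a hypothetical helper name, the scsi0 key in the sample line is an assumption, and the demo works on a temporary file rather than the real 501.conf.

```shell
# Hypothetical helper: replace one disk name with another in a VM config file
# after saving a backup copy. It only matches ":<name>," so that, for example,
# disk-2 does not also match disk-20.
swap_disk() {
    conf=$1
    from=$2
    to=$3
    backup=$4
    cp "$conf" "$backup" || return 1
    sed "s/:$from,/:$to,/" "$backup" > "$conf"
}

# Demo on a temporary file with an assumed config line
demo_conf=$(mktemp)
printf '%s\n' 'scsi0: local-zfs:vm-501-disk-2,iothread=1,size=100G' > "$demo_conf"
swap_disk "$demo_conf" vm-501-disk-2 vm-501-disk-9 "$demo_conf.orig"
cat "$demo_conf"
# prints: scsi0: local-zfs:vm-501-disk-9,iothread=1,size=100G
rm -f "$demo_conf" "$demo_conf.orig"
```

Calling the helper again with the two disk names swapped reverts the change, which corresponds to the revert step in the list.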
And… that’s it. Now boot up your virtual machine again and it is as if you actually went back in time.
If you prefer a more visual demonstration of the above process, you could take a look at this video: https://www.youtube.com/watch?v=D1JiI5MfavI&t=1175s
Have fun!