Category Archives: Proxmox

List all Proxmox VM snapshots on a given Ceph pool

It’s useful to occasionally review all of the snapshots that exist on your Ceph pool so that you can identify old ones that can probably be deleted – not just to free up space, but to ensure you are not wasting performance by having too many layers of unneeded snapshots hanging around.

The script below makes this quick and easy to do.

It is a refinement of Proxmox Forum user 43n12y’s excellent suggestion. Here it is in use, shown in human-readable table mode rather than the default JSON output. The JSON output is intended to be fed to some other script for automated follow-up actions – sending an email, deleting snapshots over a certain age automatically, and so on:

root@vm00:~/scripts# ./snapshots.sh -p MYCEPHPOOL -t
Node VMID VM Name Snapshot Dates
vm01 1033 cpanel1.XXXXXXX.net-CL8 "2024-06-07"
vm01 121 fw0.XXXXXX.YYYYYYY.net "2024-08-27","2024-08-27","2024-08-27","2024-08-27","2024-08-27"
vm00 2004 unifi.ZZZZZZZ.co.uk "2024-09-14"
vm01 2007 vm-XXXXX.YYYYY.net "2024-07-19"
#!/bin/bash

# Help Function
Help()
{
        echo
        echo "Show information about VM snapshots that exist on a given Ceph pool"
        echo
        echo "Syntax: $0 -p POOLNAME [-t]"
        echo
        echo "Required:"
        echo "-p NAME   Specify the pool name"
        echo
        echo "Options:"
        echo "-h        Print this help message"
        echo "-t        Print out as a table - i.e. don't output as the default JSON"
        echo
        echo "To get a list of pool names, you can use:"
        echo "ceph osd pool ls"
        echo
}

unset -v pool
unset -v text

while getopts hp:t opt; do
        case $opt in
                h) Help ; exit ;;
                p) pool=$OPTARG ;;
                t) text=true ;;
                *) Help ; exit ;;
        esac
done

: ${pool:?-p is required: You must specify the Ceph pool with -p NAME}

# Cache the cluster-wide VM inventory so we can look up each VMID's name and node
tmpfile=$(mktemp /tmp/snapshot-search-$pool.XXXXXX)

pvesh get /cluster/resources --type vm --output-format json-pretty >> ${tmpfile}

if [ $text ]; then
        unset -v tabledata
fi

# For each VMID that has at least one snapshot image on the pool, look up its name and node
for vmid in $(rbd ls -l -p $pool | grep "^vm-.*@" | cut -f 2 -d "-" | uniq); do
        vmname=$(jq -r ".[] | select ( .vmid == ${vmid}) | .name" ${tmpfile})
        node=$(jq -r ".[] | select ( .vmid == ${vmid}) | .node" ${tmpfile})
        filter=".[] | select ( .name != \"current\" ) + {\"vmname\": \"${vmname}\",\"vmid\": \"${vmid}\"} | .snaptime |= strftime(\"%Y-%m-%d\")"
        if [ $text ]; then
                unset -v snapdates
                snapdates=$(pvesh get "/nodes/$node/qemu/$vmid/snapshot" --output-format=json | jq -r '[.[] | select(.snaptime) | (.snaptime | tonumber | todate[:10])] | @csv');
                tabledata="$tabledata$node!$vmid!$vmname!$snapdates\n"
        else
                pvesh get /nodes/${node}/qemu/${vmid}/snapshot --output-format json-pretty | jq "${filter}"
        fi
done

if [ $text ]; then
        printf "%b" "$tabledata" | column -t -s "!" -N "Node,VMID,VM Name,Snapshot Dates" -T "Node","VM Name" -W "Snapshot Dates"
fi

rm ${tmpfile}
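
As a rough illustration of the kind of follow-up the JSON output is intended for, something like the below (a sketch only; the pool name and 90-day cutoff are just examples) would print every snapshot older than 90 days, ready to be reviewed, emailed, or handed to a deletion script:

./snapshots.sh -p MYCEPHPOOL | jq -r --arg cutoff "$(date -d '90 days ago' +%Y-%m-%d)" \
        'select(.snaptime < $cutoff) | "\(.vmname) (vmid \(.vmid)): snapshot \(.name) taken \(.snaptime)"'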

Mount a Proxmox Backup Server using a non-standard TCP port

Let’s say, for the sake of argument, that you have multiple PBS hosts behind a NAT router and, for whatever reason, you’re not going to (or don’t need to) run a VPN between every PVE host that needs to push backups to a PBS behind that NAT router.

By default, PBS runs on port tcp/8007 and is architected such that PVE hosts “push” (connect) to the PBS host.

If you want to mount a PBS backup location using a port other than tcp/8007, you will need to use the command line on the PVE host to do so.

Hypothetical Scenario

Let’s say we have a PBS host (simply called “pbs”) that is accessible – using a NAT port forward – via the IP 100.64.100.10 using port tcp/8099.

On that PBS host I have a datastore named NVME0, and I have created a namespace called “mynamespace”.

The PBS host fingerprint (viewable at Configuration > Certificates > and double click the cert, or Datastore > NVME0 > “Show Connection Information”) is a0:b1:c2:d3:e4:f5:a6:b7:c8:d9:e0:d1:e2:f3:a4:b5:c6:d7:e8:f9:a0:b1:c2:d3:e4:f5:a6:b7:c8:d9:e0:f1.

On that PBS host, I have created a user “myuser” with an api token named “backups” which gave me the token secret “12345678-1234-1234-abcd-1a2b3c4d5e6f” – and granted permissions to the namespace mynamespace on the datastore NVME0.

I want the PBS storage to appear as “pbs-NVME0-mynamespace” on the PVE interface.

Mounting your non-standard PBS storage

Using a root shell on the PVE host you want to mount the PBS storage on, use pvesm like so:

pvesm add pbs pbs-NVME0-mynamespace \
--fingerprint "a0:b1:c2:d3:e4:f5:a6:b7:c8:d9:e0:d1:e2:f3:a4:b5:c6:d7:e8:f9:a0:b1:c2:d3:e4:f5:a6:b7:c8:d9:e0:f1" \
--server 100.64.100.10 \
--port 8099 \
--datastore NVME0 \
--namespace mynamespace \
--username myuser@pbs\!backups \
--password 12345678-1234-1234-abcd-1a2b3c4d5e6f

(NB: swap in the fingerprint, server IP, port, datastore, namespace, username/token ID and token secret to suit your environment, and pay special attention to the use of a backslash “\” before the “!” token delimiter in the username, which prevents “!” being interpreted as a special character by the bash shell – you will need to add that backslash to your token username yourself.)
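
If all went well, the new storage should show up as active. A quick sanity check from the same shell (only the storage name here is specific to this example):

pvesm status --storage pbs-NVME0-mynamespace
pvesm list pbs-NVME0-mynamespace

The first command should report the storage as active along with usage figures, and the second should list any backups already present in the namespace.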

Grant a Proxmox Backup Server user and API token access only to a specific namespace

If, like us, you have multiple namespaces on a single PBS instance, you will want to create user and token rights that grant access only to the specific namespace that token actually needs, so as to properly follow the principle of least privilege.

Once you have created the user and the API token you’re going to authenticate with, you need to create the permissions that grant access only to the target namespace.

Let’s say you have a Datastore named “NVME0”. The user will need (non-propagated!) DatastoreAudit on the Datastore itself, and so will their API token.

You then need to add DatastoreBackup on the namespace. You will have to type the namespace in manually after the /datastore/NVME0 path, so if your namespace were called, say, “namespace”, then the permissions would be granted on /datastore/NVME0/namespace.
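
If you prefer the shell on the PBS host to the GUI, the same ACLs can be set with proxmox-backup-manager. The below is a sketch rather than something lifted from our systems: it reuses the example user “myuser@pbs” and token “backups” from the post above, and assumes the acl update flags mirror the ACL API parameters (check proxmox-backup-manager acl update --help):

# Non-propagated audit on the datastore itself, for both the user and their token
proxmox-backup-manager acl update /datastore/NVME0 DatastoreAudit --auth-id 'myuser@pbs' --propagate false
proxmox-backup-manager acl update /datastore/NVME0 DatastoreAudit --auth-id 'myuser@pbs!backups' --propagate false

# Backup rights only on the target namespace
proxmox-backup-manager acl update /datastore/NVME0/namespace DatastoreBackup --auth-id 'myuser@pbs' --propagate false
proxmox-backup-manager acl update /datastore/NVME0/namespace DatastoreBackup --auth-id 'myuser@pbs!backups' --propagate false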

You’re now ready to mount your namespace “namespace” directly on your PVE host using your API token.

(It’s probably worth mentioning that these permissions will *only* allow the PVE host to write new backups and restore from existing backups, but not to delete/prune backups that are on the PBS. We use scripts / policy on the PBS itself for deleting backups, to prevent an attacker who gains elevation / VM escape on the PVE cluster from being able to wipe the backups on the PBS systems, which run on separate hardware. If you are in an environment where this isn’t as important, you might grant more than “DatastoreBackup” on /datastore/NVME0/namespace to allow pruning/deletion to be managed directly from the PVE interface.)

Migrate a VM between separate Proxmox hosts/clusters

You want to move a VM between cluster A and cluster B, or standalone host A to standalone host B?

As of PVE 7.3:

qm remote-migrate

Now, this is documented here, but the documentation on how to specify the API token argument is honestly not at all clear. So here’s a quick howto on how to use the command, along with some gotchas you might trip over on the way, like I did.

Make an API token on the target host:

Datacenter -> Permissions -> API Tokens

I have not researched what the minimum privileges are, so here I am going with a non-privilege-separated root API token.

“Token ID” is just any text string; I have used TOKENID in this example to match my notes below.
You should consider setting an expiry date so that you can’t accidentally leave a ‘forever key’ with full root access enabled.

Proxmox will then present you with what I guess you would call your full token ID (the user and Token ID from the previous step combined together) and the token secret.
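
For reference, you can also create the token from the shell of the target host with pveum. The below is a sketch using the same TOKENID, no privilege separation and a 30-day expiry (the --expire value is a Unix timestamp):

pveum user token add root@pam TOKENID --privsep 0 --expire "$(date -d '+30 days' +%s)"

As in the GUI, the token secret is only shown once, so note it down.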

Then, on the shell of the source host, run the command below.

NB: the “apitoken” value you have to provide to the CLI tool should take the form of an entire HTTP Authorization header for whatever reason, and you combine the user, token ID and secret as shown below (the VM IDs, token, target IP, fingerprint, bridge and storage are what you need to set for your environment – I have constructed the API token below to match the example above so you can see where the elements came from).

qm remote-migrate SOURCEVMID TARGETVMID apitoken='Authorization: PVEAPIToken=root@pam!TOKENID=c362d275-5e68-4482-a4c1-a0114c2ea408',host=TARGETIP,port=8006,fingerprint='HOSTFINGERPRINT' --target-bridge=vmbr0 --target-storage=ZFS-Mirror --online=true

NB: I am doing an online migration because, as of writing (Proxmox 8.2.2):

  • Ceph source does not support offline migration, period (ERROR: no export formats for 'SOURCESTORAGE:vm-SOURCEVMID-disk-0' - check storage plugin support!), and
  • when I tried instead to use an NFS volume as the source, I found that a ZFS target does not support offline migration unless it is from a ZFS source: (ERROR: migration aborted (duration 00:00:01): error - tunnel command '{"export_formats":"qcow2+size","format":"qcow2","volname":"vm-SOURCEVMID-disk-0.qcow2","migration_snapshot":0,"storage":"local-lvm","allow_rename":"1","with_snapshots":1,"cmd":"disk-import"}' failed - failed to handle 'disk-import' command - unsupported format 'qcow2' for storage type lvmthin)
  • while I suppose I could probably have exported from NFS to LVM or another NFS store on the target, my default local-lvm storage volume on the target host does not have enough space for the VM; all of the storage is tied up in the ZFS-Mirror volume.

Then I hit a new problem:

ERROR: online migrate failure - error - tunnel command '{"migrate_opts":{"remote_node":"TARGETVMHOST","type":"websocket","spice_ticket":null,"migratedfrom":"SOURCEVMHOST","nbd":{"scsi0":{"drivestr":"TARGETSTORAGE:vm-TARGETVMID-disk-0,aio=native,cache=writeback,discard=on,format=raw,iothread=1,size=32G,ssd=1","volid":"TARGETSTORAGE:vm-TARGETVMID-disk-0","success":true}},"storagemap":{"default":"TARGETSTORAGE"},"network":null,"nbd_proto_version":1},"start_params":{"skiplock":1,"forcemachine":"pc-i440fx-9.0+pve0","statefile":"unix","forcecpu":null},"cmd":"start"}' failed - failed to handle 'start' command - start failed: QEMU exited with code 1

Not very descriptive, but fortunately you can get the reason QEMU exited with code 1 from the target host – checking the output of the ‘qmtunnel’ task in the task history on the target host, I could see that the combination of disk options set on my source Ceph storage is not valid with the target ZFS storage, and so the task was aborting:

QEMU: kvm: -drive file=/dev/zvol/TARGETSTORAGE/vm-TARGETVMID-disk-0,if=none,id=drive-scsi0,cache=writeback,aio=native,discard=on,format=raw,detect-zeroes=unmap: aio=native was specified, but it requires cache.direct=on, which was not specified.

To fix this, I switched aio back to the default io_uring on the source, restarted the VM to make that take effect, and then restarted the migrate.
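
For completeness, one way to do that from the source host’s shell is the sketch below. Note that setting a drive with qm set replaces its whole option string, so re-specify the rest of your existing drive options; alternatively, just change “Async IO” under the disk’s Advanced options in the web UI:

qm set SOURCEVMID --scsi0 SOURCESTORAGE:vm-SOURCEVMID-disk-0,aio=io_uring,cache=writeback,discard=on,iothread=1,ssd=1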

You see the copy progress in the shell window as it runs, but it also shows up in the Proxmox web UI as a regular migrate task, which you can monitor through the task viewer as normal.

Once the task is done, if you did not specify --delete=true then you will need to issue qm unlock SOURCEVMID to unlock the VM on the source host in order to be able to delete it.