Recently at work I ran into an odd issue with a customer and MSSQL backups. The customer had setup some maintenance plans to backup their databases according to a schedule, fulls one day a week followed by differential backups and then rolling over. Pretty common. But when the poop hit the fan and the customer needed to do a restore they found their diffs wouldn’t work. When the customer restored the full and then tried to restore the diffs they received an error:
This differential backup cannot be restored because the database has not been restored to the correct earlier state.
The reason this error is given is because the differential backup is not part of the time line, meaning another full backup was taken in between the time the last full and differential you are trying to restore (the logs were truncated). Ok, so let’s find out when that backup was taken. Below is a script to lookup the backup history for a specific database. Just replace DBNAME with the name of your database.
— Get Backup History for required database
SELECT TOP 100
CAST(CAST(s.backup_size / 1000000 AS INT) AS VARCHAR(14)) + ‘ ‘ + ‘MB’ AS bkSize,
s.backup_finish_date) AS VARCHAR(4)) + ‘ ‘ + ‘Seconds’ TimeTaken,
CAST(s.first_lsn AS VARCHAR(50)) AS first_lsn,
CAST(s.last_lsn AS VARCHAR(50)) AS last_lsn,
WHEN ‘D’ THEN ‘Full’
WHEN ‘I’ THEN ‘Differential’
WHEN ‘L’ THEN ‘Transaction Log’
END AS BackupType,
FROM msdb.dbo.backupset s
INNER JOIN msdb.dbo.backupmediafamily m ON s.media_set_id = m.media_set_id
WHERE s.database_name = DB_NAME() — Remove this line for all the database
ORDER BY backup_start_date DESC, backup_finish_date
The mystery unfolds. I found a full backup was taken by some mysterious device. I knew we had a DPM server running virtual machine snapshots, but it isn’t agent based. The DPM server is simply taking a snapshot of the virtual machine, not the SQL server itself. So it wouldn’t take a full backup right? Well I thought I was right since the time stamp of the full backup and the time the job ran in DPM were hours apart, even accounting for time zone differences. On top of that the full backups over the past month were all at different times.
But the device being used was a virtual device (ID 7) and it’s name was a guid. I couldn’t find anything else that was taking these backups on a regular schedule so it had to be DPM. Which is when I found this. The KB is a different version of DPM and the server OS, but issue #3 is what I was facing.
Consider the following scenario:
A virtual machine (VM) is being backed up on a server that is running Hyper-V.
At the same time, an application backup operation is being performed in the same VM.
In this scenario, some data is truncated from the application backup in the VM. Therefore, this behavior causes data loss.
The resolution is also in that KB aside from applying the hotfix (I did NOT apply the hotfix).
You can apply the following registry entry in a virtual machine to fix issue 3 for that virtual machine:
Value: 0 or 1
If this registry entry is created and its value is set to 1, application backup will not be affected by the virtual machine backup operation on the server that is running Hyper-V. If this registry entry does not exist, or if its value is 0, issue 3 occurs.
Voila, after creating that registry dword the backups DPM took no longer truncated the logs in the SQL server. So going forward, if you are using data protection manager to backup Hyper-V virtual machines you need to make sure you create that registry dword. If you do not the internal VSS on the virtual machine will run a full backup of the MSSQL database in response to DPM taking a snapshot. This will in turn break any backups you have configured in the server.