Master Database Failure After Applying Patch

I have a SQL Server instance running SQL 2008 R2 on a Hyper-V virtual Windows 2012 R2 server. After applying the security update for SQL Server 2008 R2 SP3 (KB4057113), the instance failed to start successfully. The SQL Error Log contained the following error messages:

2018-01-30 18:08:28.17 spid9s Exporting ##MS_AgentSigningCertificate## to S:\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\MS_AgentSigningCertificateEDA2C2C9-78C0-4500-8B20-000B27D64CA0.cer

2018-01-30 18:08:28.17 spid9s Error: 15240, Severity: 16, State: 1.

2018-01-30 18:08:28.17 spid9s Cannot write into file ‘S:\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\MS_AgentSigningCertificateEDA2C2C9-78C0-4500-8B20-000B27D64CA0.cer’. Verify that you have write permissions, that the file path is valid, and that the file does not already exist.

2018-01-30 18:08:28.19 spid9s Error: 912, Severity: 21, State: 2.

2018-01-30 18:08:28.19 spid9s Script level upgrade for database ‘master’ failed because upgrade step ‘sqlagent100_msdb_upgrade.sql’ encountered error 15240, state 1, severity 16. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the ‘master’ database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.

2018-01-30 18:08:28.20 spid9s Error: 3417, Severity: 21, State: 3.

2018-01-30 18:08:28.20 spid9s Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.

Interestingly, the program had successfully written a file to the same directory one second earlier:

2018-01-30 18:08:27.37 spid9s Created #InstmsdbAgentCertPath with value S:\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\MS_AgentSigningCertificate.cer

Manually, I granted full control to the S:\MSSQL10_50.MSSQLSERVER\MSSQL\DATA directory for the service account being used by the SQL instance (and the SQL Agent Service) and restarted the instance.

The patch installation completed without error and the instance started successfully.

I have applied this hotfix to multiple other servers, including servers on the same SQL and OS version as well as others, without incident.

UPDATE: This server failed a few days later due to disk issues, so in retrospect, I think the patch issues were probably tied to the disk issues and unrelated to the patch.

Master Database Failure After Applying Patch

Management Data Warehouse Collection Failures

I recently discovered that the Query Statistics and Server Activity collection sets were no longer working in the Management Data Warehouse on a SQL 2012 server. Both the SQL Agent Jobs and the Data Collection Sets seemed to be enabled and running but no data was being loaded. I disabled and re-enabled the collection sets several times, but that didn’t help.

The first clue as to the source of the issue came from the collection set log, which can be found by navigating to Management > Data Collection > System Data Collection Sets > Right clicking on Server Activity (or Query Statistics) and selecting View Logs. This log showed Success for the upload job, but every time the collection job ran, the following error message appeared in the log:

Failed to create kernel event for collection set: {GUID}. Inner Error ————-> Cannot create a file when that file already exists.

The source of the problem turned out to be dcexec.exe components that were running on the server from a previous launch of the data collection process. Dcexec.exe is the data collector run-time component that launches when a non-cached collection set is launched.

On a default configuration of the Management Data Warehouse, the Query Statistics and Server Activity collection sets are non-cached (Disk Usage is cached). In this case, there should be two instances of dcexec.exe running on the server. In the Task Manager, these are displayed as Background processes called Data Collector Execution Utility.

On my server, there were five instances of these processes running. This was a live production server, so before doing anything, I verified that the only thing that uses these processes is the Management Data Warehouse. Since that already wasn’t working, I felt it was safe to kill these processes on the production server.

In SSMS, I stopped all of the data collection sets. In Task Manager on the server, I ended all of the Data Collector Execution Utility tasks. Then, in SSMS, I started all of the data collection sets. They immediately started working.

This is a great source of information about how the Data Collector works under the hood:

Microsoft Technet Data Collector Architecture and Processing



Management Data Warehouse Collection Failures

Name Search Returns Syntax Error on Specific Names

I encountered an interesting bug in an older third-party application this week. The application includes a customer look-up feature by first and/or last name. Occasionally, instead of returning results, the application returns the following syntax error:

Select Error: SQLSTATE = 42000
Microsoft OLE DB Provider for SQL Server
Incorrect syntax near ‘Isaac’

The error seemed to be linked to specific names–each time the same name was entered, the application would return the same error. Querying the same view with the same parameters would not return an error in SSMS.

The key to the problem, and a workaround, came from the names themselves:

Isaac Newton
Isabella Rossellini
Robin Nottingham
Ishmael Reed

(These aren’t the exact same names, but they follow the same pattern.)

Notably, three of the first names start with “Is,” which is also a reserved keyword in SQL, but Robin Nottingham looked like an exception, until I realized that his last name also starts with the reserved keyword “Not.”

Somehow, the way that the application was passing the parameters to the SQL 2008 database was causing SQL to interpret the keywords independent of the parameter and causing the error.

The most obvious workaround seemed to be to enclose the search criteria in [brackets], “quotation marks”, or ‘apostrophes’. In this case, brackets failed but quotation marks worked! (We never tested apostrophes since quotation marks worked but I’ve included them here because that could be a workaround in a different application.)

Name Search Returns Syntax Error on Specific Names

Sysadmin permission removal causes login error

Recently, I set up a SQL 2014 development system for a new vendor database. Initially, I gave the developers sysadmin permission to perform the install. When the install was complete, I reviewed the setup, and saw that a new SQL login had been added for the application that had been granted sysadmin permission.

I removed the sysadmin permission from both the Windows Group that the developers were using to log in to the server and to the SQL login that had been added for the application.

And, I got calls that neither the developers nor the application could log in to the server. For the Windows Group, the following error was logged in the SQL Event Log:

Error: 18456, Severity: 14, State: 11

Login failed for user ‘Domain\DevGroup’. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors.

For the SQL login, this error was logged:

Error: 18456, Severity: 14, State: 12

Login failed for user ‘ApplicationUser’. Reason: Login-based server access validation failed with an infrastructure error. Check for previous errors.

I googled the errors, but the information that I turned up indicated that deleting the login and re-creating it would solve the problem. It did not.

Granting the following permissions explicitly did fix the issue:

GRANT CONNECT SQL TO [ApplicationUser];

As of SQL 2014, Microsoft no longer grants CONNECT ON ENDPOINT permission to the public role as it did in previous versions. This improves the default security of an instance, but as soon as I removed sysadmin permission, the users couldn’t connect to the instance at all.


Sysadmin permission removal causes login error

Trim Functions in the WHERE clause

It’s fairly well-known that functions should be avoided in T-SQL WHERE clauses, so when I recently spotted one running long on our server and delaying our overnight processing, I knew it could be easily fixed. Here’s how.

The code was long and complex, but in this case, only a couple lines of code needed to be updated to fix the query. In the portion of code I was working with, a column that contained the same data was defined differently in two different tables.

In Table_1, sku1 was defined as a varchar(26), but in Table_2, it was defined as a nchar(26). The sql code was loading some of the data from table1 into a cursor variable, which was also defined as a varchar(26). If there was a value in the sku1 cursor variable, some other attributes were queried by joining table2 to table3, trimming both the column value in table2 and the declared cursor variable. Here’s what the code looked like:

if @cursor_sku is not null and len(@cursor_sku) > 0
select A.sku_attributes
    from Attribute_Table A
left outer join Table_2 T2
    on T2.sku_id=A.sku_id
    where ltrim(rtrim(T2.sku2)) = ltrim(rtrim(@cursor_sku))

The ltrim and rtrim functions in the WHERE clause require that every row in the table is scanned to trim extra spaces before executing the join, so to get the attributes for a single row, the query is performing more than 4,000 logical reads per execution. Average run time is 931 ms per execution.

To fix this, change the cursor variable to match the data definition in Table_2:

declare @cursor_sku nchar(26);

Change the select statement that’s retrieving data to load the cursor to:

cast(sku1 as nchar(26))

Since we’re testing the length of the sku, we should trim before evaluating the length:

if @cursor_sku is not null and len(ltrim(rtrim(@cursor_sku))) > 0

Finally, remove the trim statements from the where clause:

where T2.sku2 = @cursor_sku

The query now does two index seeks, each of which returns 1 row. The query does 6 logical reads and executes in 1 ms.

Trim Functions in the WHERE clause

Seize Schema Master Fails with Access is Denied

On a Windows Server 2012 R2 domain controller in our sandbox, logged in as the Domain Administrator, I attempted to seize all of the fsmo roles using the following PowerShell command:

Move-ADDirectoryServerOperationMasterRole -Identity “Target_DC_Name” –OperationMasterRole 0,1,2,3,4 -force

I successfully seized the PDC Emulator, RID Master, and Infrastructure Master roles, but the seizure of Schema Master failed with an Access is Denied message.

The Administrator account was a member of Schema Admins and Enterprise Admins, so it had the correct permissions assigned. After much online searching, I found the suggestion to make Schema Admins the primary group for the account. The default was Domain Users. After making Schema Admins the primary group, the PowerShell command worked to seize the Schema Master role.

Seize Schema Master Fails with Access is Denied

A Couple PowerShell Tips

Yesterday, I created a PowerShell script to restore a development database on a nightly basis. After getting all the kinks worked out, I decided to copy the script to one of my development SQL servers and run it as a SQL Agent job on the server.

When I ran the script via the SQL Agent, it failed with the following message:

A job step received an error at line 3 in a PowerShell script. The corresponding line is ‘Add-Type -Assembly Microsoft.SqlServer.SMO’. Correct the script and reschedule the job. The error information returned by PowerShell is: ‘Could not load file or assembly ‘Microsoft.SqlServer.Smo, Version=, Culture=neutral, PublicKeyToken=89845dcd8080cc91’ or one of its dependencies. The system cannot find the file specified. ‘. Process Exit Code -1. The step failed.

The first several Google hits indicated that I needed to load additional components on the server, or possibly, load different components to match the server version. I downloaded and installed additional and newer components several times, but I kept getting the same error.

After more digging, I found a blogger who had gotten this error when he used the Add-Type command, but was able to run the script fine with the older, deprecated, method of loading assemblies. In his case, there were multiple different versions of SQL running on his development server. In my case, there was only one version of SQL, but I had installed the Feature Pack for SQL 2012 before SQL 2012 SP 2, so there had been multiple versions of the SMO installed.

I changed

Add-Type -Assembly Microsoft.SqlServer.SMO

for all of my assemblies to


and I moved on to a new error:

A job step received an error at line 3 in a PowerShell script. The corresponding line is ‘$space.ForegroundColor = $host.ui.rawui.ForegroundColor’. Correct the script and reschedule the job. The error information returned by PowerShell is: ‘Exception setting “ForegroundColor”: “Cannot convert null to type “System.ConsoleColor” due to invalid enumeration values. Specify one of the following enumeration values and try again. The possible enumeration values are “Black, DarkBlue, DarkGreen, DarkCyan, DarkRed, DarkMagenta, DarkYellow, Gray, DarkGray, Blue, Green, Cyan, Red, Magenta, Yellow, White”.” ‘. Process Exit Code -1. The step failed.

When a PowerShell is run as a SQL Agent Job, there is no console, so a Clear-Host command in the script causes this error. I commented out the Clear-Host, and the script is working perfectly!

A Couple PowerShell Tips