martincostello/sqllocaldb

Cannot use SqlLocalDb with code coverage in VS 2015

jabbera opened this issue · 44 comments

I don't know if this happened in previous visual studio versions, but creating a unit test with the following body and running with code coverage:

    [TestMethod]
    public void TestMethod1()
    {
        SqlLocalDbApi.CreateInstance("MikeTest");
    }

Results in the following error which I have been unable to work around. This has to a fairly common use case so I'm, hoping for some insight. I didn't see anything on the wiki about it.

Test Name: TestMethod1
Test FullName: UnitTestProject1.UnitTest1.TestMethod1
Test Source: c:\users\mbarry\documents\visual studio 2015\Projects\UnitTestProject1\UnitTestProject1\UnitTest1.cs : line 12
Test Outcome: Failed
Test Duration: 0:00:00.1879061

Result StackTrace:
at System.Data.SqlLocalDb.SqlLocalDbApi.InvokeThrowOnError(Func`1 func, Int32 traceEventId, String instanceName)
at System.Data.SqlLocalDb.SqlLocalDbApi.CreateInstance(String instanceName, String version)
at System.Data.SqlLocalDb.SqlLocalDbApi.CreateInstance(String instanceName)
at UnitTestProject1.UnitTest1.TestMethod1() in c:\users\mbarry\documents\visual studio 2015\Projects\UnitTestProject1\UnitTestProject1\UnitTest1.cs:line 13
Result Message:
Test method UnitTestProject1.UnitTest1.TestMethod1 threw exception:
System.Data.SqlLocalDb.SqlLocalDbException: Error occurred during LocalDB instance startup: SQL Server process failed to start.

Hi Mike,

I've seen this error a few times in the past, but not consistently enough to get me particularly worried as it seemed to be a tempremental environmental issue related to SQL LocalDB itself.

However the problem has now started happening in the CI builds despite the change I made seemingly having nothing to do with it.

It's interesting that you specifically mention code coverage, as that's not an angle which I've used to delve into this issue before. In the same environment, does this consistently fail when code coverage is enabled and succeed when code coverage is enabled?

Thanks,
Martin

Also, would you mind providing some environment details such as Windows, .NET, SQL LocalDB and Visual Studio versions that you have installed?

Details from the Windows Event Log for a failing test:

Log Name:      Application
Source:        Application Error
Date:          22/07/2015 07:13:01
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      **********
Description:
Faulting application name: sqlservr.exe, version: 2014.120.4213.0, time stamp: 0x55777a44
Faulting module name: unknown, version: 0.0.0.0, time stamp: 0x00000000
Exception code: 0xc0000005
Fault offset: 0x0000000000007518
Faulting process ID: 0x16b8
Faulting application start time: 0x01d0c445767c39a8
Faulting application path: C:\Program Files\Microsoft SQL Server\120\LocalDB\Binn\sqlservr.exe
Faulting module path: unknown
Report ID: b442ff5c-3038-11e5-bf00-c4850881a55a
Faulting package full name: 
Faulting package-relative application ID: 
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Application Error" />
    <EventID Qualifiers="0">1000</EventID>
    <Level>2</Level>
    <Task>100</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2015-07-22T06:13:01.000000000Z" />
    <EventRecordID>416356</EventRecordID>
    <Channel>Application</Channel>
    <Computer>**********</Computer>
    <Security />
  </System>
  <EventData>
    <Data>sqlservr.exe</Data>
    <Data>2014.120.4213.0</Data>
    <Data>55777a44</Data>
    <Data>unknown</Data>
    <Data>0.0.0.0</Data>
    <Data>00000000</Data>
    <Data>c0000005</Data>
    <Data>0000000000007518</Data>
    <Data>16b8</Data>
    <Data>01d0c445767c39a8</Data>
    <Data>C:\Program Files\Microsoft SQL Server\120\LocalDB\Binn\sqlservr.exe</Data>
    <Data>unknown</Data>
    <Data>b442ff5c-3038-11e5-bf00-c4850881a55a</Data>
    <Data>
    </Data>
    <Data>
    </Data>
  </EventData>
</Event>
Log Name:      Application
Source:        Application Error
Date:          22/07/2015 07:13:01
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      **********
Description:
Faulting application name: sqlservr.exe, version: 2014.120.4213.0, time stamp: 0x55777a44
Faulting module name: sqlmin.dll, version: 6.3.9600.17736, time stamp: 0x550f4336
Exception code: 0xc0000142
Fault offset: 0x00000000000ec180
Faulting process ID: 0x16b8
Faulting application start time: 0x01d0c445767c39a8
Faulting application path: C:\Program Files\Microsoft SQL Server\120\LocalDB\Binn\sqlservr.exe
Faulting module path: sqlmin.dll
Report ID: b461fe1c-3038-11e5-bf00-c4850881a55a
Faulting package full name: 
Faulting package-relative application ID: 
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Application Error" />
    <EventID Qualifiers="0">1000</EventID>
    <Level>2</Level>
    <Task>100</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2015-07-22T06:13:01.000000000Z" />
    <EventRecordID>416358</EventRecordID>
    <Channel>Application</Channel>
    <Computer>**********</Computer>
    <Security />
  </System>
  <EventData>
    <Data>sqlservr.exe</Data>
    <Data>2014.120.4213.0</Data>
    <Data>55777a44</Data>
    <Data>sqlmin.dll</Data>
    <Data>6.3.9600.17736</Data>
    <Data>550f4336</Data>
    <Data>c0000142</Data>
    <Data>00000000000ec180</Data>
    <Data>16b8</Data>
    <Data>01d0c445767c39a8</Data>
    <Data>C:\Program Files\Microsoft SQL Server\120\LocalDB\Binn\sqlservr.exe</Data>
    <Data>sqlmin.dll</Data>
    <Data>b461fe1c-3038-11e5-bf00-c4850881a55a</Data>
    <Data>
    </Data>
    <Data>
    </Data>
  </EventData>
</Event>
Log Name:      Application
Source:        SQLLocalDB 12.0
Date:          22/07/2015 07:13:01
Event ID:      528
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      **********
Description:
Windows API call WaitForMultipleObjects returned error code: 575. Windows system error message is: {Application Error}
The application was unable to start correctly (0x%lx). Click OK to close the application.
Reported at line: 3728. 
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="SQLLocalDB 12.0" />
    <EventID Qualifiers="51653">528</EventID>
    <Level>2</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2015-07-22T06:13:01.000000000Z" />
    <EventRecordID>416360</EventRecordID>
    <Channel>Application</Channel>
    <Computer>**********</Computer>
    <Security />
  </System>
  <EventData>
    <Data>WaitForMultipleObjects</Data>
    <Data>575</Data>
    <Data>{Application Error}
The application was unable to start correctly (0x%lx). Click OK to close the application.
</Data>
    <Data>3728</Data>
    <Data>
    </Data>
  </EventData>
</Event>

Based on the behaviour of the AppVeyor CI this appears to be unrelated to code coverage.

Since installing Visual Studio 2015, test runs now appear to hang after the [AssemblyCleanup] method is invoked.

No problems on a Windows 10 VM (Build 10240) with only Visual Studio 2015 Enterprise RTM installed.

On my local dev machine I'm getting different behaviours based on the version of the Visual Studio Test Runner being used.

Using the 2013 version causes various tests to fail due to being unable to start the SQL Server process used by SQL LocalDB.
Using the 2015 version doesn't have this error, but hangs at the end of the test run.

I'm answering these on my phone so it's going to be a series of small comments:

In the same environment, does this consistently fail when code coverage is enabled and succeed when code coverage is enabled?

Yes. This is 100% consistent.

Windows 7 64 bit
Visual studio 2013 and 2015 installed. (Shells for previous versions are installed for ssms.)
SQL express 11 and 12.

I'm having this issue in 2015 with the new test runner. It doesn't seem to matter 32 or 64 bit test runner.

Thanks @jabbera, I'll investigate further this evening.

No problem. When I get to work I'll try a fresh windows 7 vm with just 2015 as well as get you the pertinent event viewer details.

I'm guessing it have something to do with the clr profiler being used. The code coverage adapter uses the clr profiling API. I tried unsetting the environment variables that declare the coverage profiler before making the first API call but it didn't work.

OK great, thanks!

One last question: presumably this works fine in 2013, it's just 2015 that's got the problem?

I haven't tried that. Will let you know when I get to work.

2013 doesn't work either. Same event viewer details:

- <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
- <System>
  <Provider Name="SQLLocalDB 12.0" /> 
  <EventID Qualifiers="51653">528</EventID> 
  <Level>2</Level> 
  <Task>0</Task> 
  <Keywords>0x80000000000000</Keywords> 
  <TimeCreated SystemTime="2015-07-22T13:28:53.000000000Z" /> 
  <EventRecordID>66558</EventRecordID> 
  <Channel>Application</Channel> 
  <Computer>MBARRY-BOS-D.gmo.tld</Computer> 
  <Security /> 
  </System>
- <EventData>
  <Data>WaitForMultipleObjects</Data> 
  <Data>575</Data> 
  <Data>{Application Error} The application was unable to start correctly (0x%lx). Click OK to close the application.</Data> 
  <Data>3728</Data> 
  <Data /> 
  </EventData>
  </Event>

I'm reproducing this via the UI right clicking on the test and hitting analyze code coveage for selected tests.

I've narrowed the failure down farther to:

int hr = NativeMethods.CreateInstance("12.0", "MIKE", 0);
Console.WriteLine("{0:x}", hr);
Assert.AreEqual(0, hr);

@jabbera Cool thanks. I'll start looking into this shortly.

I'm currently looking at #8, but it might be that that issue is a different symptom of the underlying problem causing this issue.

My current working theory is that the code coverage profiler is injecting itself into the SQL LocalDB process, which is then causing a side-effect somewhere which manifests itself either as the SQL Server process failing to start or a hang in the Visual Studio Test Runner when it tries to shutdown (#8).

Assuming 83c87cb "fixes" #8 then I'll see if there's something I can add to the .runsettings file to stop the SQL Server process being profiled.

@jabbera If you run the tests in the BlogSample.sln solution do you have the problem there too? I appear to be manifesting this bug as a hang rather than SQL Server failing to start, but it's happening consistently for me with this repo's own tests. However the tests with the samples are fine.

I do have the same issue in that solution.

I took a look at this and changed the code to use IntPtr instead of ref int, but it hasn't resolved the problem on x86 or x64. Not using IntPtr should be fine though as the ref keyword should instruct the P/Invoke marshalling to treat the parameter as an input and output and copy/allocate memory accordingly.

I think the next thing I'll try is cloning this repo at the various tags starting from 1.0.0.0 and see if this is an issue introduced in a specific release which could help track it down, or whether it's a more general problem relating to the profiler rather than anything wrong in the assembly's code.

FYI I'm still looking into this. It's proving quite the brain-scratcher.

Some more errors that I've seen during investigation:

Log Name:      Application
Source:        SQLLocalDB 12.0
Date:          25/07/2015 17:08:04
Event ID:      512
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      **********
Description:
The "DataDirectory" registry value is missing in the LocalDB instance registry key: {954A8FD7-16F4-4987-952D-EF48C75B8BE3}
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="SQLLocalDB 12.0" />
    <EventID Qualifiers="35269">512</EventID>
    <Level>2</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2015-07-25T16:08:04.000000000Z" />
    <EventRecordID>438705</EventRecordID>
    <Channel>Application</Channel>
    <Computer>**********</Computer>
    <Security />
  </System>
  <EventData>
    <Data>{954A8FD7-16F4-4987-952D-EF48C75B8BE3}</Data>
  </EventData>
</Event>
Log Name:      Application
Source:        Application Error
Date:          25/07/2015 17:08:04
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      **********
Description:
Faulting application name: sqlservr.exe, version: 2014.120.4213.0, time stamp: 0x55777a44
Faulting module name: unknown, version: 0.0.0.0, time stamp: 0x00000000
Exception code: 0xc0000005
Fault offset: 0x0000000000007518
Faulting process ID: 0x218c
Faulting application start time: 0x01d0c6f4168789d5
Faulting application path: C:\Program Files\Microsoft SQL Server\120\LocalDB\Binn\sqlservr.exe
Faulting module path: unknown
Report ID: 544becfc-32e7-11e5-bf01-c4850881a556
Faulting package full name: 
Faulting package-relative application ID: 
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Application Error" />
    <EventID Qualifiers="0">1000</EventID>
    <Level>2</Level>
    <Task>100</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2015-07-25T16:08:04.000000000Z" />
    <EventRecordID>438706</EventRecordID>
    <Channel>Application</Channel>
    <Computer>**********</Computer>
    <Security />
  </System>
  <EventData>
    <Data>sqlservr.exe</Data>
    <Data>2014.120.4213.0</Data>
    <Data>55777a44</Data>
    <Data>unknown</Data>
    <Data>0.0.0.0</Data>
    <Data>00000000</Data>
    <Data>c0000005</Data>
    <Data>0000000000007518</Data>
    <Data>218c</Data>
    <Data>01d0c6f4168789d5</Data>
    <Data>C:\Program Files\Microsoft SQL Server\120\LocalDB\Binn\sqlservr.exe</Data>
    <Data>unknown</Data>
    <Data>544becfc-32e7-11e5-bf01-c4850881a556</Data>
    <Data>
    </Data>
    <Data>
    </Data>
  </EventData>
</Event>
sqlservr.exe - Application Error

The application was unable to start correctly (0xc0000142). Click OK to close the application.

Interesting observation from debugging/testing is that not specifying /EnableCodeCoverage to the Visual Studio Test Runner but having the code coverage data collector configured in the specified .runsettings file also causes the issue to occur.

Through various combinations of trial and error I'm fairly certain I've narrowed this down to an issue in something being loaded by SQL LocalDB not liking being run when code coverage is enabled, rather than being an issue in the code in this repo.

I'm currently trying to work out which module is causing the problem so I can provide you with an exclusion that can be added to a .runsettings file to workaround the problem.

I haven't narrowed down the specific faulting module yet, but adding something similar to the following to a .runsettings file appears to fix the problem for me:

<?xml version="1.0" encoding="utf-8"?>
<RunSettings>
  <DataCollectionRunSettings>
    <DataCollectors>
      <DataCollector friendlyName="Code Coverage" uri="datacollector://Microsoft/CodeCoverage/2.0" assemblyQualifiedName="Microsoft.VisualStudio.Coverage.DynamicCoverageDataCollector, Microsoft.VisualStudio.TraceCollector, Version=12.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a">
        <Configuration>
          <CodeCoverage>
            <ModulePaths>
              <Exclude>
                <ModulePath>.*system32.*</ModulePath>
              </Exclude>
          </CodeCoverage>
        </Configuration>
      </DataCollector>
    </DataCollectors>
  </DataCollectionRunSettings>
</RunSettings>

No problem.

It looks like the culprit, for me at least, is cryptbase.dll. If you find the same (or similar) I'll close this issue as external and add a note to the Wiki.

Unfortunately that did not solve the issue for me. I'll try systematically exluding libraries to see what I can make happen

OK thanks for letting me know.

I used Process Explorer to work out what modules SQL LocalDB was loading when it started up, and then went though those one by one until I got to cryptbase.dll, though narrowing it down to %System32% narrowed the field quite a bit as it ruled out the SQL-related modules.

I'm way more hosed then you are I believe. I changed to using the Include method found here:
https://msdn.microsoft.com/en-us/library/jj159530.aspx

to only include my library. It still crashes. Using an exclude of .* allows the test to run, but provides no CA. I think I'm going to have to break this down to a much smaller reproduction and submit it to MS. I don't this is a bug in your code, obviously.

I used the include method to fix the repo in 9b4fbee initially before I started working out which one was causing the problem in my case. At first I had a few problems with includes as I kept getting the pattern wrong, but I got there in the end.

If you add the link to your Connect bug here once you've logged it I'll up-vote it and comment if I've got anything worth adding.

Are you happy for me to close this as an external issue? Sorry I couldn't get you enough information to actually fix your problem!

Thanks! Turns out it repos even using the console...

https://connect.microsoft.com/VisualStudio/Feedback/Details/1602253

Well that definitely rules me out too then! Must be something about the way the profiler instruments child processes.

I've up-voted the bug and I think there's enough there for them to go on without me chipping in on the comments.

I hope you get your test code working again soon.

Still working through this with MSFT, but this seems to definitively work around the issue.

<?xml version="1.0" encoding="utf-8" ?>
<RunSettings>
  <DataCollectionRunSettings>
    <DataCollectors>
      <DataCollector friendlyName="Code Coverage" uri="datacollector://Microsoft/CodeCoverage/2.0" assemblyQualifiedName="Microsoft.VisualStudio.Coverage.DynamicCoverageDataCollector, Microsoft.VisualStudio.TraceCollector, Version=11.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a">
        <Configuration>
          <CodeCoverage>
            <CollectFromChildProcesses>False</CollectFromChildProcesses>
          </CodeCoverage>
        </Configuration>
      </DataCollector>
    </DataCollectors>
  </DataCollectionRunSettings>
</RunSettings>

@jabbera Ah nice - that's a lot cleaner than having to play "hunt the bad DLL"!

@jabbera Just clicked on the Connect bug link and it returns a 404. Have you made it private? No worries if so, just wondering.

@martincostello I didn't not. I received and email asking for feedback with the workaround provided. (They can not reproduce) when I clicked on the link to tell them it worked I can no longer access the item either!

@jabbera FYI the bug appears to have re-appeared on Connect.

I've been in communication with MSFT about this. Do you have citrix xen desktop installed on the machine that exhibits the behavior? Do you have any entries in: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs

These a dlls that are loaded into every process on the desktop.

Mike

Interesting. I'll take a look on my home machine when I get time in the next few days and see if I have either. AFAIK I don't have Citrix installed, and I've never edited that Registry key. I'll also see if I can find out whether those keys exist on AppVeyor CI VMs.