SQL Server & BI Blog by Andreas Wolter

Tag: "extended events"

Where is that Preemptive Wait coming from? Database Ownership and Performance: a journey through SQL Server internals with XEvents, Windbg and Wireshark

Feb 3rd

Posted by Andreas Wolter in Tracing & Monitoring

(EN)
In this article I will bring together several techniques for troubleshooting a performance- and security-related “phenomenon” I recently noticed when doing some tests with Natively Compiled Stored Procedures.

(DE)
In diesem Artikel zeige ich einige Techniken zum Troubleshooten eines Performance- und sicherheitsrelevanten „Phänomens“, das mir kürzlich aufgefallen ist, als ich ein paar Tests mit Natively Compiled Stored Procedures durchführte.

Background:

In SQL Server 2014, Natively Compiled Stored Procedures did not support EXECUTE AS CALLER but EXECUTE AS “SpecificUser” was required. This user could be specified with “Username”, “Self” or simply “Owner” – in which case the owner of the procedure usually reverts to the schema owner, which mostly reverts to the Database Owner altogether.

Hintergrund: In SQL Server 2014 unterstützten Natively Compiled Stored Procedures EXECUTE AS CALLER nicht, sondern es war EXECUTE AS “SpecificUser” erforderlich. Dieser Nutzer konnte mit “Username”, “Self” oder einfach “Owner” spezifiziert werden – in dem Fall kehrt der Besitzer der Prozedurnormalerweise zum „schema owner“ zurück, was meistens ganz auf den Database Owner zurückkehrt.

The phenomenon I encountered was that I noticed some pretty strange long execution times when running a workload consisting of a very basic stored procedure that does nothing more than insert one row of data into a small, unspectacular table.

The insert itself should run in less than a second for 1000 rows, but I could see up to 5 seconds.
The body of the procedure looks like this:

Das Phänomen, dem ich begegnete, war, dass ich einige ziemlich seltsam lange Ausführungszeiten bemerkte, als ich eine Workload laufen ließ, die aus einer sehr einfachen Stored Procedure bestand, die nichts weiter tat, als eine Datenreihe in eine kleine, unspektakuläre Tabelle einzufügen.
Das Insert selbst sollte weniger als eine Sekunde für 1000 Zeilen laufen, doch ich konnte bis zu 5 Sekunden sehen.
Der Inhalt der Prozedur sieht so aus:

SET NOCOUNT ON;
SET XACT_ABORT ON;

INSERT UserSchema.TableName
(4 columns, int and char)
VALUES
(Parameters);

RETURN 0;

The analysis:

Since neither the query plan nor “Show Statistics ON” showed anything unusual, I took a step back and started a top-down-analysis with the Waits and Queues Methodology (which, if you are new in this area, has been a proven methodology for performance analysis for over a decade and is explained for SQL Server probably for the first time in detail here: SQL Server 2005 Performance Tuning using the Waits and Queues)
Using the new session-level waits-DMV in SQL Server 2016 (sys.dm_exec_session_wait_stats) I saw the following distribution of waits:

Die Analyse:

Da weder der Abfrageplan noch “Show Statistics ON“ irgendetwas Ungewöhnliches zeigten, ging ich einen Schritt zurück und begann eine Top-Down-Analyse mit der Waits and Queues Methode (die, wenn ihr in diesem Bereich neu seid, eine seit über einem Jahrzehnt bewährte Methode zur Performance-Analyse ist und für SQL Server wahrscheinlich zum ersten Mal im Detail hier erklärt wird: SQL Server 2005 Performance Tuning using the Waits and Queues)
Bei der Anwendung der neuen Session-Level Waits-DMV im SQL Server 2016 (sys.dm_exec_session_wait_stats) sah ich die folgende Waits-Verteilung:

SELECT dm_exec_session_wait_stats.wait_type
        , SUM(dm_exec_session_wait_stats.wait_time_ms) - SUM(dm_exec_session_wait_stats.signal_wait_time_ms) AS resource_wait_time_ms
        , SUM(dm_exec_session_wait_stats.signal_wait_time_ms) AS signal_wait_time_ms
        , SUM(dm_exec_session_wait_stats.wait_time_ms) AS wait_time_ms
        , MAX(dm_exec_session_wait_stats.max_wait_time_ms) AS max_wait_time_ms
        , SUM(dm_exec_session_wait_stats.waiting_tasks_count) AS waiting_tasks_count
FROM sys.dm_exec_session_wait_stats AS dm_exec_session_wait_stats
        INNER JOIN sys.dm_exec_sessions AS dm_exec_sessions
                ON dm_exec_session_wait_stats.session_id = dm_exec_sessions.session_id
WHERE dm_exec_sessions.program_name ='Sarpedon-WorkoadSimulation-InsertProcs'
GROUP BY dm_exec_session_wait_stats.wait_type
ORDER BY max_wait_time_ms DESC;

From this list, Latch-contention, blocking and some waiting for IO is to be expected with a very concurrent workload (50 threads trying to insert on the last page).
The wait-types marked red are the ones that caught my attention.
As Preemptive waits are a quite different beast, and I initially thought maybe there was some authentication issue with the client, I decided to take a look at the CMEMTHREAD.

Von dieser Liste sind Latch-contention, Blocking und einiges Warten auf IO bei einer sehr simultanen Workload zu erwarten (50 Threads, die versuchen, sich auf der letzten Page einzufügen).
Die rot-markierten Wait-Typen sind diejenigen, auf die ich aufmerksam wurde.
Da Preemptive Waits ein spezielles Biest sind, und ich anfangs dachte, dass es vielleicht ein Authentifikationsproblem beim Client gab, entschied ich mich, einen Blick auf den CMEMTHREAD zu werfen.

The CMEMTHREAD is an indicator that many threads are contending simultaneously for a thread-safe memory object.
Now since we are running a highly concurrent workload, I could just leave it at that, but remember: “Never make quick decisions by confusing symptom with cause”. Keep looking for the real cause. (A good article by Paul Randal on exactly this issue: “Avoiding Knee-Jerk Performance Troubleshooting”)

So, the question is, which memory object is so special here.

Der CMEMTHREAD ist ein Indikator dafür, dass viele Threads gleichzeitig um ein thread-safe Memory-Object konkurrieren.
Nun, da wir eine höchst konkurrierende Workload laufen haben, könnte ich es einfach darauf beruhen lassen, doch denkt dran: „Niemals schnelle Entscheidungen treffen, indem man das Symptom mit der Ursache verwechselt.“ Sucht weiter nach der wahren Ursache. (Hier ein guter Artikel von Paul Randal zu genau diesem Thema: “Avoiding Knee-Jerk Performance Troubleshooting”)

Also ist die Frage, welches Memory-Object hier so besonders ist.

Time for Extended Events

One of the niftiest features of extended events is, that you can get a full stack dump of just a single thread, using the Action “sqlserver.create_dump_single_thread”. ”. Similarly, using “package0.callstack” you can get a callstack of the last 16 frames.
You can analyze the outcome with Windbg. – Or, using Trace Flag 3656, you can even query the Extended Event targets directly with XQuery from within Management Studio and the callstack will be materialized, provided you have loaded the correct symbols. (Nothing you should do on a production Server!)
(An example how to do that can be found here: Resolving DTC Related Waits and Tuning Scalability of DTC)

With the dump opened in Windbg the following function calls inside the SQL Server engine can be seen:

Zeit für Extended Events

Eine der raffiniertesten Features von Extended Events ist, dass man einen vollständigen Stack Dump von einem einzelnen Pfad erhalten kann, wenn man die Action “sqlserver.create_dump_single_thread” verwendet.
Auf ähnliche Weise kann man einen Callstack der letzten 16 Frames erhalten, wenn man “package0.callstack” verwendet.
Man kann das Ergebnis mit Windbg analysieren – oder, wenn man Trace Flag 3656 verwendet, kann man sogar die Extended Event Targets direkt mit XQuery aus Management Studio heraus abfragen und der Callstack wird materialisiert. (Nichts, das man auf einem Produktions-Server machen sollte!)

(Ein Beispiel dafür, wie das gemacht werden kann, ist hier: Resolving DTC Related Waits and Tuning Scalability of DTC)

Mit dem in Windbg geöffneten Dump können die folgenden Funktions-Calls in der SQL Server Engine gesehen werden:

The Stack dump rings a bell. Right before allocating the memory (CMemThread<CMemObj>::Alloc (red), several functions with very familiar names were called (orange):

Bei dem Stack Dump läuten doch einige Glocken. Kurz vor dem Zuteilen des Memory (CMemThread<CMemObj>::Alloc (rot) wurden mehrere Funktionen mit sehr bekannt klingenden Namen abgerufen (orange):

sqllang!CWindowsSecurityPrimaryInfo::GetNtGroupsViaAuthZ
sqllang!CreateLoginToken
sqllang!CreateLoginTokenForImpersonation

And those in turn must have been caused by the sqllang!CAutoExecuteAsContext::Set, sqllang!CMsqlExecContext::FExecute and so on (green).
At this point I pretty much knew where to look. But just to show you an example of how this can be done without even touching Windbg, here is an (accumulated and commented) result from the Extended Event callstack-Action:

Und diese wiederum müssen von sqllang!CAutoExecuteAsContext::Set, sqllang!CMsqlExecContext::FExecute verursacht worden sein, und so weiter (grün).
In diesem Moment wusste ich im Prinzip schon, wo ich suchen musste. Aber um euch ein Beispiel zu zeigen, wie man das auch machen kann, ganz ohne Windbg anzurühren, ist hier ein (gesammeltes und kommentiertes) Ergebnis der Extended Event Callstack-Action:

This callstack results from the PREEMPTIVE_OS_LOOKUPACCOUNTSID Wait-Type, which I noticed would happen exactly once per procedure call. When analyzing different callstacks together I could draw the conclusion that the following function calls lead to the respective Wait-Types:

Dieser Callstack resultiert vom PREEMPTIVE_OS_LOOKUPACCOUNTSID Wait-Type, was, wie ich bemerkte, genau einmal pro Procedure Call geschehen würde. Beim Analysieren verschiedener Callstacks zusammen konnte ich die Schlussfolgerung ziehen, dass die folgenden Funktions-Calls zu den jeweiligen Wait-Types führen:

sqllang.dll!LookupAccountNameInternal -> PREEMPTIVE_OS_LOOKUPACCOUNTSID
sqllang.dll!CWindowsSecurityPrimaryInfo::GetNtGroupsViaAuthZ -> PREEMPTIVE_OS_AUTHZINITIALIZERESOURCEMANAGER
sqllang.dll!CWindowsSecurityPrimaryInfo::Init -> PREEMPTIVE_OS_AUTHORIZATIONOPS

The Preemptive Waits read from the Extended Events file-target can be ordered in their occurrence (from top to bottom), leading to the following picture:

Die Preemptive Waits, aus dem Extended Events Datei-Ziel gelesen, können nach ihrem Auftreten sortiert werden (von oben nach unten), was folgendes Bild ergibt:

The here unreadable callstack for the call to AUTHORIZATIONOPS (undocumented) on the bottom.

Der hier unlesbare Callstack für den Call AUTHORIZATIONOPS (undokumentiert) unten.

First outcome:

From those calls one can deduct that not the client is authenticating at SQL Server, but actually the procedure call itself causes Windows API calls (LOOKUPACCOUNTSID, AUTHZINITIALIZERESOURCEMANAGER) for authentication purposes.
Now at the very latest it’s time to check the ownership-chain for the stored procedure.
So I checked the header of the proc, and indeed, just as I expected it contained a “WITH EXECUTE AS OWNER”.
That alone can’t be it, so the question is, who is the owner. I already suspected it and there it was: The database was owned by a LOCAL Windows account.
Now, that is rare, but it all makes sense now: For every execution of the procedure, the ownership had to be verified – and since it is a Windows-Account, it had to go out of SQL Server OS, using a Preemptive call to Windows, wait, and continue only after Windows returns control of this thread back to SQL Server.

Erstes Ergebnis:

Aus diesen Call lässt sich ableiten, dass es nicht der Client ist, der sich bei SQL Server authentifiziert, sondern dass es tatsächlich der Procedure Call selbst ist, der Windows API Calls (LOOKUPACCOUNTSID, AUTHZINITIALIZERESOURCEMANAGER) für Authentifizierungszwecke verursacht.
Spätestens jetzt ist es an der Zeit, die Besitzerkette für die Stored Procedure zu überprüfen. Also habe ich den Kopf der Prozedur geprüft, und tatsächlich, genau wie ich erwartet hatte, enthielt es ein “WITH EXECUTE AS OWNER”.
Das allein kann es nicht sein, also stellt sich die Frage, wer der Owner ist. Ich vermutete es bereits, und da war es: Die Datenbank gehörte zu einem LOKALEN Windows-Konto.
Nun, das ist eher selten, doch jetzt ergibt alles Sinn: Für jede Ausführung der Prozedur musste der Besitz verifiziert werden – und da es ein Windows-Konto ist, musste es raus aus SQL Server OS gehen, indem ein Preemptive Call an Windows verwendet wird, dann warten, und erst dann weitermachen, nachdem Windows die Kontrolle dieses Pfades an SQL Server zurückgegeben hatte.

Testing the improvement options:

Now, obviously, I did not stop here but rather made some more tests. Based on the fact that there are 3 possible types of owner for objects within SQL Server, I compared the execution times (and waits) between those 3 possible
Database owner-types:

1. SQL Account

2. Windows DomainAccount

3. Local Windows Account

- I did also test with Admin vs. non-Admin and NTLM vs. Kerberos but saw no difference.

Testen der Verbesserungsmöglichkeiten:

Natürlich hörte ich nicht hier auf, sondern machte noch weitere Tests. Ausgehend von der Tatsache, dass es mögliche Typen von Owner für Objekte innerhalb SQL Server gibt, verglich ich die Ausführungszeiten (und Wartezeiten) zwischen diesen 3 möglichen Datenbank-Owner-Types:

SQL Account
Windows DomainAccount
Local Windows Account

- Ich testete auch mit Admin vs. non-Admin und NTLM vs. Kerberos, stellte jedoch keinen Unterschied fest.

So here is the result of the Performance-Comparison with different Database-/Procedure-Owners:

Hier ist also das Ergebnis des Performance-Vergleichs mit unterschiedlichen Datenbank-/Procedure-Owners:

Interpretation:

1) The obvious: Using a local Windows Account results in a huge performance-penalty for such a simple INSERT-statement

2) Using a Domain Account I could notice that every 10 minutes the next execution would be a bit slower.

Interpretation:

Das Offensichtliche: Die Verwendung eines lokalen Windows-Kontos führt zu eine riesigen Performance-Impakt für ein solch simples INSERT-Statement
Beim Verwenden eines Domain-Kontos konnte ich sehen, dass die nächste Ausführung alle 10 Minuten etwas langsamer war.

Further analysis

When checking the Wait-stats again, I could see that usually the Windows-Domain-Account had the following simple waits:

Weitere Analyse

Beim erneuten Überprüfen der Warte-Statistiken konnte ich feststellen, dass der Windows-Domain-Account meistens die folgenden einfachen Waits hatte:

Pretty ok and nothing to worry about. But the first call after 10 minutes would always result in the same wait-types as I observed for the local Windows Account, except that the wait-times are much much lower. (You can compare the below picture with the one from the beginning of the article.)

Ziemlich in Ordnung und nichts, worüber man sich Sorgen machen müsste. Doch der erste Call nach 10 Minuten ergab immer dieselben Wait-Typen, die ich für das lokale Windows-Konto beobachtete, außer dass die Wartezeiten um Vieles geringer sind. (Ihr könnt das Bild unten mit dem am Anfang des Artikels vergleichen.)

Behind the Scenes: Network Tracing

To explain the difference, I ran a Network Trace using Wireshark
And exactly every 10 minutes I saw some additional KERBEROS-traffic going to the Domain Controller. Below you see the opened data frame with the Request for a Ticket Granting Service Ticket (TGS-REQ).

Hinter den Kulissen: Network Tracing

Um den Unterschied zu erklären, führte ich ein Network Trace mit Wireshark aus.
Und genau alle 10 Minuten sah ich einigen zusätzlichen KERBEROS-Traffic zum Domain Controller fließen. Unten seht ihr den geöffneten Datenrahmen mit dem Request for a Ticket Granting Service Ticket (TGS-REQ).

Background infos:

As the User, on whose behalf the service requests the service ticket, is identified using the user name and user realm/domain (SQL\SQLDBOwner in my case), the padata type PA-FOR-USER is used, as you can see in the screenshot. You can even see the Checksum added for protection.

The S4U2self extension of the PA-DATA structure allows a service to obtain a service ticket to itself on behalf of a user. The user is identified to the KDC using the user's name and realm.

Hintergrundinformationen:

Wenn der Nutzer, für den der Service das Service-Ticket anfordert, unter Verwendung des Nutzernamens und user realm/Domäne (SQL\SQLDBOwner in meinem Fall) identifiziert wird, wird der Padata Type PA-FOR-USER verwendet, wie im Screenshot zu sehen. Man kann sogar die Checksum sehen, die zum Schutz hinzugefügt wurde.

Die S4U2self Extension der PA-DATA-Struktur erlaubt einem Service, ein Service-Ticket im Auftrag eines Nutzers für sich selbst zu beziehen. Der Nutzer wird gegenüber dem KCD identifiziert, unter Verwendung des Nutzernamen und Realm.

Luckily the DC answers right away with a TGS-REP, containing the PA-DATA-structure with ticket for the service that was named in the TGS-REQ from above:

Zum Glück antwortet der DC sofort mit einem TGS-REP, der die PA-DATA-Struktur mit einem Ticket für den oben im TGS-REQ genannten Service enthält:

There is plenty literature available if you want to learn more about the Kerberos Protocol-Elements. Here is a documentation from Microsoft:

Es gibt eine Menge an Literatur, falls ihr mehr über die Kerberos-Protokollelemente erfahren wollt. Hier ist eine Dokumentation von Microsoft:

Kerberos Protocol Extensions: Service for User and Constrained Delegation Protocol

The 10-minutes puzzle:

1) After doing some extensive testing, I conclude that the 10-minute interval after which a new Ticket Granting Service Ticket-Request is initiated must be a SQL Server specific enhancement for Domain Accounts to avoid making this round trip for every SID-validation. The 10 minutes are consistent and independent from other workload influencers. The cause behind is not documented.

2) The much faster validation for the AD-Account is very fast thanks to some caching in Windows. (Thank you, Jack Richins from formerly SQL Security Team, for reminding me of this)

3) For the local Windows Account, there is no such performance improvement via caching which results in much slower response times.

Das 10-Minuten-Puzzle:

Nach ausgiebigem Testen schließe ich, dass das 10-Minuten-Intervall, nach dem ein neuer Ticket Granting Service Ticket Request eingeleitet wird, eine SQL-Server-spezifische Erweiterung für Domain-Accounts sein muss, um diese Rundreise nicht für jede SID-Validierung machen zu müssen. Die 10 Minuten sind konsistent und unabhängig von anderen Workload-Beeinflussern. Die Ursache dahinter ist nicht dokumentiert.
Die viel schnellere Validierung für das AD-Konto ist so schnell dank einigem Caching in Windows. (Danke, Jack Richins vom ehemaligen SQL Security Team, der mich daran erinnert hat)
Für das lokale Windows-Konto gibt es keine solche Performance-Verbesserung via Caching, was zu viel längeren Antwortzeiten führt.

Final results and learnings:

1) Using local users for SQL Server can create noticeable impact for short transactions. Yet another reason to stay away from local accounts.

2) For the SQL Account, naturally no Windows calls are necessary at all, but the performance-advantage compared to the AD-Account is marginal, even for high-speed scenarios.

3) Network latency matters even between SQL Server and DC. Not just for the initial Login-phase but even for ongoing validations from within SQL-statements.

4) I can stick to my recommendation for Database Ownerships (SQL Server Database Ownership: survey results & recommendations) :-)

Endergebnis und Gelerntes:

Die Verwendung von lokalen Nutzern für SQL Server kann deutliche Auswirkungen für kurze Transaktionen verursachen. Noch ein weiterer Grund dafür, sich von lokalen Konten fernzuhalten.
Für den SQL Account sind natürlich keine Windows-Calls notwendig, doch der Performance-Vorteil gegenüber dem AD-Account ist geringfügig, selbst für High-Speed-Scenarien.
Netzwerk-Latenz spielt selbst zwischen SQL Server und DC eine Rolle. Nicht nur für die anfängliche Login-Phase, sondern sogar für fortlaufende Validierungen innerhalb von SQL-Statements.
Ich kann meine Empfehlung für Datenbank-Ownerships aufrechterhalten (SQL Server Database Ownership: survey results & recommendations) :-)

I hope you found this an interesting read.

Ich hoffe, dies war eine interessante Lektüre für euch.

Andreas

PS: for the geeks among you:

The Stack Dump in Text format together with some comments:

P.S. Für die Geeks unter euch:

Der Stack Dump im Textformat mitsamt einigen Kommentaren:

Child-SP RetAddr Call Site

000000e9`2f0b79d8 00007ffe`f9e51118 ntdll!NtWaitForSingleObject+0xa

-> this function is actually deprecated (Hello, Microsoft?)

000000e9`2f0b79e0 00007ff7`04999fef KERNELBASE!WaitForSingleObjectEx+0x94
000000e9`2f0b7a80 00007ff7`04999d7d sqlservr!CDmpDump::DumpInternal+0x22f
000000e9`2f0b7b20 00007ff7`049a15b8 sqlservr!CDmpDump::DumpFilter+0xcd
000000e9`2f0b7c10 00007ffe`ef0ef2cb sqlservr!memset+0x1819
000000e9`2f0b7c40 00007ffe`fc98398d msvcr120!__C_specific_handler+0x93 [f:\dd\vctools\crt\crtw32\misc\amd64\chandler.c @ 162]
000000e9`2f0b7cb0 00007ffe`fc9493a7 ntdll!RtlpExecuteHandlerForException+0xd
000000e9`2f0b7ce0 00007ffe`fc948fbd ntdll!RtlDispatchException+0x197
000000e9`2f0b83b0 00007ffe`f9e58b9c ntdll!RtlRaiseException+0x18d
000000e9`2f0b8b70 00007ff7`04999c9c KERNELBASE!RaiseException+0x68
000000e9`2f0b8c50 00007ffe`f05602c6 sqlservr!CDmpDump::Dump+0x4c
000000e9`2f0b8c90 00007ffe`f105c411 sqllang!SQLDumperLibraryInvoke+0x1f6
000000e9`2f0b8cd0 00007ffe`f105ce94 sqllang!SQLLangDumperLibraryInvoke+0x161
000000e9`2f0b8d80 00007ffe`f102cd0b sqllang!CImageHelper::DoMiniDump+0x475
000000e9`2f0b8f90 00007ffe`f072e9c4 sqllang!stackTrace+0x9db
000000e9`2f0ba9b0 00007ffe`f072f6ae sqllang!XeSqlPkg::IsSystem+0x174
000000e9`2f0bab90 00007ffe`ef2e779a sqllang!XeSqlPkg::CreateDumpSingleThreadActionInvoke+0x1e

-> you can clearly see how this stack dump was created: XEvents

000000e9`2f0babc0 00007ffe`ef2a1b8e sqldk!XeSosPkg::wait_info_external::Publish+0x1a9

-> Now we will see our wait-type when we ask the DMVs or XEvents

000000e9`2f0bac20 00007ffe`ef2a63a4 sqldk!SOS_Scheduler::UpdateWaitTimeStats+0x596
000000e9`2f0bb530 00007ffe`f0d94fac sqldk!SOS_Task::PopWait+0xba
000000e9`2f0bb590 00007ffe`f0d9481e sqllang!CWindowsSecurityPrimaryInfo::GetNtGroupsViaAuthZ+0x75c

-> this will lead to the Preemptive Wait: PREEMPTIVE_OS_AUTHZINITIALIZERESOURCEMANAGER

000000e9`2f0bb890 00007ffe`f0d956ed sqllang!CWindowsSecurityPrimaryInfo::Init+0x2ce
000000e9`2f0bba00 00007ffe`f041a88b sqllang!GetWindowsSecurityPrimaryInfo+0xbe
000000e9`2f0bba60 00007ffe`f041cadb sqllang!CreateLoginToken+0x2d7
000000e9`2f0bbc50 00007ffe`f041dca8 sqllang!CreateLoginTokenForImpersonation+0xcb2

-> an Impersonation is happening (EXECUTE AS)

000000e9`2f0bc400 00007ffe`f0358342 sqllang!CAutoExecuteAsContext::Set+0xe2d
000000e9`2f0bc680 00007ffe`f001245e sqllang!CSECExecAsRuntimeServices::_Set+0x424
000000e9`2f0bc900 00007ffe`f0012041 sqllang!CMsqlExecContext::FExecute+0x336
000000e9`2f0bcc30 00007ffe`f0d1d83d sqllang!CSQLSource::Execute+0x983
000000e9`2f0bcdd0 00007ffe`f0d1d241 sqllang!CStmtExecProc::XretLocalExec+0x26e
000000e9`2f0bce50 00007ffe`f0d19f98 sqllang!CStmtExecProc::XretExecExecute+0x481
000000e9`2f0bd600 00007ffe`f00119ca sqllang!CXStmtExecProc::XretExecute+0x38
000000e9`2f0bd640 00007ffe`f0012933 sqllang!CMsqlExecContext::ExecuteStmts<1,1>+0x40d
000000e9`2f0bd780 00007ffe`f0012041 sqllang!CMsqlExecContext::FExecute+0xa9e
000000e9`2f0bdab0 00007ffe`f0cd3f6d sqllang!CSQLSource::Execute+0x983
000000e9`2f0bdc50 00007ffe`f0ce0e6c sqllang!ExecuteSql+0x93d
000000e9`2f0be7d0 00007ffe`f0ce1549 sqllang!CSpecProc::ExecuteSpecial+0x15c
000000e9`2f0be8d0 00007ffe`f001a82b sqllang!CSpecProc::Execute+0x299
000000e9`2f0bea00 00007ffe`f0021542 sqllang!process_request+0xe61
000000e9`2f0befd0 00007ffe`f00210a3 sqllang!process_commands_internal+0x2df
000000e9`2f0bf050 00007ffe`ef2a5bfd sqllang!process_messages+0x253
000000e9`2f0bf260 00007ffe`ef2a58f5 sqldk!SOS_Task::Param::Execute+0x231
000000e9`2f0bf860 00007ffe`ef2a554d sqldk!SOS_Scheduler::RunTask+0xaa
000000e9`2f0bf8d0 00007ffe`ef2cd7c8 sqldk!SOS_Scheduler::ProcessTasks+0x3cd
000000e9`2f0bf9b0 00007ffe`ef2cdb10 sqldk!SchedulerManager::WorkerEntryPoint+0x2a1
000000e9`2f0bfa80 00007ffe`ef2cdcd7 sqldk!SystemThread::RunWorker+0x8f
000000e9`2f0bfab0 00007ffe`ef2cd9f8 sqldk!SystemThreadDispatcher::ProcessWorker+0x2de
000000e9`2f0bfb60 00007ffe`fc6e13d2 sqldk!SchedulerManager::ThreadEntryPoint+0x1d8
000000e9`2f0bfc10 00007ffe`fc9603c4 kernel32!BaseThreadInitThunk+0x22
000000e9`2f0bfc40 00000000`00000000 ntdll!RtlUserThreadStart+0x34

EOF

Windbg, Wireshark, extended events, kerberos, performance analysis, security

How to import Extended Events session file target and parse deadlock-graph / Import von Extended Events Ereignisdateien und Parsen von Deadlock-Graphen

Mar 31st

Posted by Andreas Wolter in Extended Events

No comments

(DE)
Diesmal soll es um etwas gehen, das Extended Events direkt von Haus aus besser und leichter verfügbar machen, als sämtliche Techniken zuvor:
Speichern von aufgetretenen Deadlock-Ereignissen, und das Analysieren der entsprechenden Deadlock-Graphen.

(EN)
This time it’s about one of the things that Extended Events do better and more easy than all techniques before:
Saving of deadlock-events that occurred and analyzing the respective deadlock-graph.

Deadlocks, die man hier erläutert finden kann, treten zwischen mindestens 2 Arbeits-Threads auf, oft aber sind mehr als 2 beteiligt.
- Ich wähle „Thread“ absichtlich, da Prozess nicht genau genug wäre. Meistens treten Deadlocks zwar zwischen verschiedenen Sessions/Prozessen auf, aber es kann sich auch ein Prozess, der auf parallelen Threads ausgeführt wird, selbst blockieren.
Die Fehlernummer für Deadlocks ist die sicherlich allen SQL Profis bekannte „1205“.

Deadlock, explained here, occur between at least 2 threads but often more than 2 are involved.
- I am using the term “thread” on purpose, since process would not be sufficiently accurate. Mostly Deadlocks in fact do happen between different sessions/processes, but it is also possible that a process running with parallel threads blocks itself.
The error number for deadlocks is “1205” – probably known to all SQL professionals.

Um Deadlocks zu Tracen gab es vor SQL Server 2008 nur die folgenden Möglichkeiten:

1. Eine SQL Trace mit Profiler oder auch ohne GUI mit Protokollierung in trc-Datei oder live in der GUI

2. Trace Flags 1204, 1205 oder 1222, welche in das Errorlog des SQL Servers protokollieren – „wunderbar“ besonders auf „ereignisreichen“ Servern, und denkbar schwierig für die Analyse.

Manch einer mag sich noch an diese grausigen Einträge im Error-Log erinnern:

For tracing deadlocks before SQL Server 2008 there were merely the following ways:

1. A SQL Trace with Profiler or without GUI with logging in a trace-file or live inside the GUI

2. Trace Flags 1204, 1205 or 1222, which log into SQL Server’s errorlog – „great“ especially on “eventful” servers.

Some may remember those horrible entries in the error log:

3. Event Notifications, welche die Daten per Service Broker Queue in XML-Format zur Verfügung stellen.
Hier findet sich ein ausführliches Beispiel mit Speicherung in Tabelle und automatischem Mail-Versand mit Info über das Event.

4. Ein WMI-Alert, welcher die Daten ebenfalls per XML bereitstellt und mit einem Auftrag oder einer Benachrichtigung gekoppelt diesen protokollieren bzw. darüber informieren kann.

3. Event Notifications which provide the data via Service Broker Queue in XML-format.
Here you find an elaborate example with storing in table and automatic email with information about the event.

4. A WMI-Alert which also provides the data via XML und tied to a job can also log respectively inform about it.

All diese Ansätze, so unterschiedliche Vorteile sie auch haben mögen haben eines gemeinsam: Sie müssen erst eingerichtet werden. Im Zweifelsfalle also nachdem die ersten Deadlocks aufgetreten sind.

Seit SQL Server 2008 jedoch werden sämtliche Deadlocks von Hause aus protokolliert.

Das geschieht durch die automatisch laufende system_health Extended Event session.

Diese speichert neben Deadlocks noch weitere wichtige Server-Ereignis- und Fehler-Informationen ab. Hier finden sich Details dazu.

Seit SQL Server 2012 protokolliert die system_health session neben dem Memory-Ziel „Ring_Buffer“ auch in die Ereignisdatei, womit Deadlocks und weitere Fehler auch nach Serverneustart noch eine ganze Weile (bis zu mehreren Tagen) zur Verfügung stehen – bis der Rollover der 5 Dateien stattgefunden hat.

Für SQL Server 2008 Instanzen empfehle ich, das Datei-Ziel hinzuzufügen.

All those approaches, as different as their respective advantages may, have ine thing in common: They have to be set up. In case of doubt after the first deadlocks occurred.

But since SQL Server 2008 all deadlocks are being logged by default.

This is done by the automatically running system_health Extended Event session.

This, alongside deadlocks logs further important server events and error-information. Here you find details on it.

Since SQL Server 2012 the system_health session apart from the memory-target “ring_buffer” also logs into an event-file, so that deadlocks and other error stay available even after a server restart for quite a while (up to a few days) – until the rollover of the 5 files has occurred.

For SQL Server 2008 instances I recommend to add the file-target.

Wie kann man diese Daten auslesen?
Das geht per T-SQL relativ einfach, wenn man weiß, dass diese Daten über die DMV sys. dm_xe_session_targets verfügbar sind.
So liest man das Ring-Buffer-Ziel aus:

How can you read-out that data?
This works quite easily va T-SQL once you know that this data is available via the DMV sys. dm_xe_session_targets.
This is how you read-out the ring_buffer-Target.

SELECT

target_data

FROM

sys.dm_xe_session_targets AS dm_xe_session_targets

INNER JOIN sys.dm_xe_sessions AS dm_xe_sessions

ON dm_xe_session_targets.event_session_address = dm_xe_sessions.address

WHERE

dm_xe_sessions.name = 'system_health'

AND target_name = 'ring_buffer'

„Target_data“ ist XML-text gespeichert als unicode-text-Blob, welches man bei entsprechender Konvertierung zu XML sogar in SSMS direkt öffnen kann.
Um nur die Deadlock-Events dort heraus zu filtern, muss man die XQuery-Erweiterungen von T-SQL bemühen.

Das sieht dann vom Code her so aus:

„Target_data“ is XML-text stored as unicode-text-blob which can even be opened direct in SSMS after converting to XML.
To filter out only the deadlocks you need to use the XQuery-extensions of T-SQL.
And this is how that looks like in code plus results:

Wenn man einen solchen Deadlock-Report in XML-Format in SSMS anklickt, bekommt man ihn in einem neuen Fenster als XML-Dokument angezeigt:

When klicking at one of those Deadlock-Report reports in XML-format in SSMS it will be opened in a new window as XML-document:

- im Bild habe ich zur Übersicht einige Knoten zugeklappt.

Wichtig hier sind: die victim-list, welche bei einem Multi-Victim-Deadlock dann je Victim eine Prozess-Id enthält, die man wiederum in der process-list näher untersuchen kann.

Im executionStack findet man dann denn ausgeführten SQL-Befehl und ggfl. sogar direkt den Prozedurnamen.

Darunter folgt in der resource-list eine Liste der beteiligten (gesperrten) Ressourcen.

Sämtliche Nodes sind mit Details wie der Zeilennummer, Objektname, Lock-Modus, Wartezeit, SPID und natürlich Zeitstempeln gespickt.

- in the picture I collapsed some nodes for brevity.

Most important here are: the victim-list, which for a multi-victim-deadlock contains one process-id per victim, which again one can investigate more deeply in the process-list.

In the executionStack one can find the executes SQL-command and if applicable even the procedure name.

Below in the resource-list follows a list of all involved (blocked) resources.

All nodes are peppered with details such as row number, objectname, lock-mode, waiting time, spid and of course timestamps.

Wer hier direkt weiter ansetzen möchte, kann das XML natürlich weiter auseinandernehmen und einzelne Knoten versuchen relational in Spaltenform darzustellen.
Dafür muss man zunächst den Deadlock-Graphen extrahieren:

Who wants to start from here can of course take apart the XML and try to break individual nodes relationally into columns.
For that one has to first extract the deadlock-graph.

CAST(XEventData.XEvent.query('(data/value/deadlock)[1]') AS xml) AS DeadlockGraph

Von dort aus kann man leicht die einzelnen Knoten parsen.
An das Transaction Isolation Level kommt man z.B. mit der value-Methode, wenn man einmal weiß, auf welcher Ebene/in welchem Knoten des Graphen sie sich befindet:

From there one can parse the individual nodes easily.
The transaction isolation level for example can be reached with the value-method once you know on which level/in which node in the graph it resides.

XMLCol.value('(/deadlock/process-list/process)[1]/@isolationlevel', 'varchar(100)')

Auf die Weise kann man mit fast allen Attributen und Elementen verfahren.
Bei den Sperr-Informationen stößt man dabei jedoch an die Grenzen der statischen relationalen Darstellung, da dort je nach gesperrter Resource völlig andere Attribute auftauchen.

In this manner one can proceed with almost all attributes and elements.
For locking-information one will get to the limits of the static relational representation since depending on the locked resource there will be completely different attributes.

Das zweite Problem tritt dann auf, wenn man Multi-Victim-Deadlocks vor sich hat.

Potentiell kann es ja eine Unmenge an gesperrten und gekillten Victim-Prozessen geben, so dass man da kaum ein immer ausreichendes Schema aufbauen kann.

Hier ein Beispiel eines Multi-Victim-Deadlocks, dargestellt mit der XEvents GUI (Profiler beherrschte das, wie so vieles, nicht)

The second problem arises once you are dealing with a multi-victim-deadlock.

Potentially there can be a vast number of locked and killed victim-processes, so that one can hardly prepare an ever sufficient schema.

Here an example of a multi-victim-deadlock, displayed with the XEvents GUI (Profiler could not handle that as many other things)

Letzten Monat habe ich meinen SQL Deadlock-Collector & Parser bei codeplex frei downloadbar veröffentlicht.
- hier der direkte Link: SQL Deadlock Collector & Parser

- Hier zum Artikel mit ausführlicher Beschreibung:

freier Deadlock-Sammler & -Parser basierend auf Erweiterten Ereignissen

Dort habe ich mich so entschieden, immer die beiden ersten Prozesse gegenüberzustellen, was bei den meisten Deadlocks schon sehr weiterhilft.

Das sieht dann so aus:

Last month I published my SQL Deadlock-Collector & Parser freely downloadable at codeplex.
- here the direct link: : SQL Deadlock Collector & Parser

- here the article with detailed explanaition:

free Deadlock-Collector & -Parser based on Extended Events

There I decided always contrast the first two processes , which helps for most deadlocks a lot already.

This is how it looks like:

Wie kann man nun die ja ebenfalls von er system_health-session geschriebenen Event-Dateien einlesen?

Dafür gibt es die Systemfunktion sys.fn_xe_file_target_read_file.
Diese benötigt den Pfad zur xel-Datei und den Namen der ersten Datei, welche eingelesen werden soll.
Die Dateien liegen übrigens im Log-Verzeichnis der SQL Server Instanz.
Wenn man den Dateinamen nicht angibt, werden alle Dateien im Verzeichnis eingelesen.
Was die Funktion jedoch wirklich praktisch macht, ist, dass man bereits den Pfad mit einer Wildcard für den Dateinamen versehen kann, dass man den Dateinamen praktisch kaum noch benötigt.

So kann das dann aussehen:

How is it possible to read-in the eventfiles which are written to as well by the system_health-session?

For that the system-function sys.fn_xe_file_target_read_file is a provided.
It needs the path to the xel-file and the name of the first file which is to be read-in.
The files are located in the log-directory of the SQL Server instance by the way.
If the filename is not specified all files in the directory will be read-in.
What makes the function really handy is, that one can already arm the path with a wildcard, so that practically one does not even need the filename any more.

This is how that can look like:

Wer bereits den Deadlock-Collector zum Parsen und Speichern seiner Deadlocks verwendet, kann damit ebenfalls manuell Deadlocks von X-beliebigen Systemen in die SQL_Analysis_Data-Datenbank importieren.
Dafür verwendet man einfach die Prozedur Locking.ins_DeadLock in der Datenbank SQL_Analysis_Code mit dem Parameter @XESource.

Dieser ist standardmäßig auf die Verwendung des Ring_Buffers geschaltet.
Aber man kann auch einfach den Pfad + Dateinamen der zu importierenden xel-Datei übergeben:

Who is already using the Deadlock-Collector to parse and save ones deadlocks can also use it to import deadlocks from any system at will into the SQL_Analysis_Data-database.

For that simply use the procedure Locking.ins_DeadLock in the database SQL_Analysis_Code with the parameter @XESource.

It is set to use the ring_buffer by default.
But one can simply pass in the path + filename of the xel-file to import:

USE SQL_Analysis_Code;

DECLARE @DeadlocksCollected int;

EXECUTE Locking.ins_DeadLock

@XESource = 'D:\LogFiles\SQLServer\SQL2012Dev\system_health_*.xel'

, @DeadlocksCollected = @DeadlocksCollected OUTPUT;

SELECT @DeadlocksCollected AS [Number of new Deadlocks collected:];

Anschließend kann man die Deadlocks wie hier beschrieben über Sichten oder Ad-Hoc T-SQL bequem halbwegs in relationaler Struktur analysieren.

Afterwards one can comfortably analyse the deadlocks via views or ad-hoc T-SQL in a half-way relational structure as wie described here.

Hinweis:

Es ist möglich, dass Deadlocks aufreten, die nicht in den system_health_session-Speicherzielen, also weder in Ring_Buffer noch in der Eventdate zu finden sind. Das geschieht, wenn der Deadlock-Graph zu groß für den Puffer-Speicher ist.
Um sicherzustellen, dass auch größere Deadlocks gesammelt werden können, empfehle ich, den Puffer-Speicher der system_health-session zu erhöhen.

Note:

It is possible that deadlocks occur which do not appear in the system_health-session targets. Neither in the ring_buffer not in the filetarget.
This happens when the deadlock-graph is too big for the buffer-memory.

In order to make sure that larger deadlock graphs are also captured, I recommend to increase the maximal work memory of the system_health session.

Und wer jetzt immer noch auf Profiler & SQL Trace setzt (seit SQL Server 2012 aus gutem Grunde deprecated), hier noch ein Tipp:
So sieht ein MultiVictim-Deadlock im Profiler aus:

And who now is still using Profiler & SQL trace (deprecated since SQL Server 2012 for good reasons), here a tip:
This is how a multi-victim-deadlock looks in Profiler:

Das sieht eher wie ein „Multi-Winner-Deadlock“ aus ;-)

Und so stellt sich dieser in der XEvents GUI dar:

Looks more like a „multi-winner-deadlock” ;-)

And this is how the XEvents GUI displays it:

HappyXEvent Parsing

Andreas

deadlock, extended events, multi-victim-deadlock, parsing, profiler is dead, sql deadlock-collector, sql trace _ profiler, system_health, xquery

Free Deadlock-Collector & -Parser, based on Extended Events session system_health / freier Deadlock-Sammler & -Parser basierend auf Erweiterten Ereignissen

Feb 27th

Posted by Andreas Wolter in Locking & Blocking

11 comments

(en)
Today I redeemed a promise made long ago: To provide the deadlock-collector framework, which I developed for use at my customers’, as a Codeplex project.

Until now I have only given it out to attendees of my SQL Server Master-Classes and a few fellow Microsoft Certified Masters. From today on it is freely available to everyone.

(de)
Heute habe ich ein Versprechen eingelöst, das ich vor langer Zeit gemacht habe: Und zwar das Framework für den Deadlock-Collector, das ich zum Einsatz bei meinen Kunden entwickelt habe, als Codeplex-Projekt bereitzustellen.

Bisher habe ich es nur an Teilnehmer meiner SQL Server Master-Classes und einige Microsoft Certified Master-Kollegen herausgegeben. Ab heute ist es für jeden frei verfügbar.

If you just want to get it real quick, here is the link to the Codeplex project where you can download the necessary objects:

>> sqldeadlockcollector.codeplex.com <<

For more information on how to set up the solution and what kind of analysis it enables read below.

Wenn Sie ihn einfach schnell haben möchten, ist hier der Link zum Codeplex-Projekt, wo Sie die nötigen Objekte herunterladen können:

>> sqldeadlockcollector.codeplex.com <<

Weitere Informationen dazu, wie man die Lösung aufstellt und welche Art von Analyse sie ermöglicht, finden Sie im Folgenden.

Contents

Description
Technical Implementation
Supported SQL Server Versions & Editions
Setup High-level
Setup the Jobs
Integration of the Reports
Usage for Analysis
The Reports
Availability
Notes & Recommendations
Call to Action: Quick Poll

Inhalte

Beschreibung
Technische Umsetzung
Unterstützte SQL Server Versionen & Editionen
Setup High-level
Setup der Aufträge
Integration der Berichten
Nutzung zur Analyse
Die Berichte
Verfügbarkeit
Hinweise & Empfehlungen
Call to Action: Kurze Umfrage

Description

The SQL Server Deadlock-Collector collects information which can be read out from the ring buffer in the memory of the SQL Server, and stores this information in a database-table.

Aloof from the pure deadlock graph, the data is parsed, and the first 2 respective processes of the deadlock graph are contrasted in 2 columns, respectively, with information such as “process ID, SQL/procedure text, host name, application name” etc. (procedure name is only available from SQL Server 2012 onwards because of incomplete xml in the former version).

Beschreibung

Der SQL Server Deadlock-Collector sammelt Informationen, die aus dem Ring-Buffer im Speicher des SQL Server ausgelesen werden können und speichert diese Informationen in einer Datenbanktabelle.

Abseits vom reinen Deadlock-Graphen werden die Daten zerlegt und die ersten 2 jeweiligen Prozesse des Deadlock-Graphs werden in jeweils 2 Spalten gegenübergestellt, mit Informationen wie z.B. „Prozess-ID, SQL/Prozedur Text, Hostname, Anwendungsname“ etc. (Prozedur-Name ist aufgrund unvollständigem xml in der vorherigen Version erst ab SQL Server 2012 verfügbar).

This way, one obtains a tabular representation of the deadlock processes. Since there may be more than 2 involved processes, the complete graph is always also present. (red in the image below)

Auf diese Weise erhält man eine tabellarische Darstellung der Deadlock-Prozesse. Da es mehr als 2 beteiligte Prozesse geben kann, ist der komplette Graph auch immer präsent. (rot in der Abbildung unten)

With the aid of a few sample reports, administrators will obtain a quick overview on the frequency of the occurring deadlocks in the respective databases and be able to pass on the data for further analysis to the developers or external analysis even by simply providing a copy/backup of the database.

Developers will thus obtain a valuable and quickly interpretable version of the occurring deadlock combination and be able to implement prevention strategies with this information.

Mithilfe einiger Beispielberichte erhalten Administratoren eine schnelle Übersicht über die Häufigkeit der auftretenden Deadlocks in den jeweiligen Datenbanken, und können die Daten darüber sogar durch das einfache Bereitstellen einer Kopie/eines Backups der Datenbank an die Entwickler oder externe Analysten weitergeben.

Entwickler erhalten damit eine wertvolle und schnell interpretierbare Version der auftretenden Deadlock-Kombinationen, und können mit diesen Informationen Vermeidungsstrategien umsetzen.

Technical Implementation

The complete SQL Deadlock Collector & Parser -solution consists of 3 (small) databases, a SQL Server Agent job and 3 example Reporting Services Reports for Management Studio integration.

Through the utilization of the information already existing in memory by the Extended Events system_health-session, the SQL Deadlock Collector & Parser is absolutely lightweight and has scarcely any impact on a productive system. Solely at execution time of the data collection job (which is configurable at free will, every 6-24 hours are common), minimal resources are required.

Technische Umsetzung

Die komplette SQL Deadlock Collector & Parser – Lösung besteht aus 3 (kleinen) Datenbanken, einem SQL Server Agent Auftrag und 3 Beispiel- Reporting Services Berichten Management Studio Integration.

Durch die Nutzung der durch die system_health-Extended Events Session bereits im Arbeitsspeicher existierenden Informationen ist der SQL Deadlock Collector & Parser absolut leichtgewichtig und hat kaum eine Auswirkung auf ein produktives System. Lediglich zur Ausführungszeit des Datensammlungsauftrags (der frei konfigurierbar ist, alle 6-24 Stunden sind dabei gängig) werden minimale Ressourcen erfordert.

Supported SQL Server Versions & Editions

The SQL Deadlock Collector & Parser is designed to run on all SQL Server versions and editions from SQL Server 2008 SP2 onwards.

The provided example reports for Management Studio Integration only work with Management Studio 2012 onwards. Since SSMS is available in a free Express Edition, being mostly backwards compatible to SQL Server 2008 / R2, this is a minor constraint.

Here you can get SSMS Express in Version 2014:

www.microsoft.com/en-US/download/details.aspx?id=42299

Unterstützte SQL Server Versionen & Editionen

Der SQL Deadlock Collector & Parser ist so ausgelegt, dass er auf allen SQL Server Versionen und Ausgaben ab SQL Server 2008 SP2 läuft.

Die bereitgestellten Beispielberichte für Management Studio Integration funktionieren nur mit Management Studio 2012 und höher. Da SSMS als kostenlose Express Edition verfügbar ist, und dabei rückwärts kompatibel mit SQL Server 2008 / R2 ist, ist dies eine geringe Einschränkung.

Hier können Sie die SSMS Express als Version 2014 erhalten:

www.microsoft.com/de-DE/download/details.aspx?id=42299

Setup High-level

The SQL Deadlock Collector & Parser is available in the form of backup files of the respective databases which are 3 in total:

SQL_Analysis_Data
SQL_Analysis_Code
SQL_Analysis_Reporting

As can be seen by means of the denominations, in this relatively small project, too, it is being proceeded according to proven Sarpedon Quality Lab methodology of separating data, code and reporting. Thus, updates are being facilitated.

The 3 databases must be set up on the same server.

Setup High-level

Der SQL Deadlock Collector & Parser ist in Form von Backup-Dateien der jeweiligen Datenbanken verfügbar, insgesamt 3:

SQL_Analysis_Data
SQL_Analysis_Code
SQL_Analysis_Reporting

Wie anhand der Bezeichnungen zu erkennen ist, wird auch in diesem relativ kleinen Projekt nach bewährter Sarpedon Quality Lab Methodik der Trennung von Daten, Code und Reporting vorgegangen. So werden Updates erleichtert.

Die 3 Datenbanken müssen auf demselben Server eingerichtet werden.

Setup instructions:

Recover the 3 databases in the respective version (2008 or 2012) onto your Server

The “2008”-Version targets all Systems from SQL Server 2008 SP2 up to SQL Server 2008 R2 SP2
The “2012”-Version targets all Systems from SQL Server 2012 RTM CU3 up to SQL Server 2014 RTM CU6
All databases have to reside on the same monitored SQL Server Instance. If multiple Instances are target of collection, the deadlock collector’s databases and jobs have to be installed on each single instance. Multi-Instance-collection is not implemented

Set up the collection Job
Add a schedule for the collection
Repeat the same for the maintenance job that removes old collected deadlocks after a specified timeframe
Optional: integrate sample custom reports in SSMS for Database ”SQL_Analysis_Reporting”

If multiple Instances are to be monitored, repeat each step for each instance

Setup Anweisungen:

Stellen Sie die 3 Datenbanken in der jeweiligen Version (2008 oder 2012) auf Ihrem Server wieder her

Die “2008”-Version ist für alle Systeme ab SQL Server 2008 SP2 bis zum SQL Server 2008 R2 SP2 da
Die “2012”-Version ist für alle Systeme ab SQL Server 2012 RTM CU3 bis zum SQL Server 2014 RTM CU6 da
Alle Datenbanken müssen auf derselben überwachten SQL Server Instanz liegen. Wenn mehrere Instanzen Ziel der Datenerfassung sind, müssen die Datenbanken und Aufträge des Deadlock Collectors auf jeder einzelnen Instanz installiert werden. Multi-Instance-Collection ist nicht implementiert

Richten Sie den Sammler-Auftrag ein
Fügen Sie einen Zeitplan für die Datenerfassung hinzu
Wiederholen Sie das gleiche für den Maintenance Job, der alte erfasste Deadlocks nach einem festgesetzten Zeitrahmen entfernt
Optional: Integrieren der Beispiel-Custom Berichte in SSMS für die Datenbank ”SQL_Analysis_Reporting”

Wenn mehrere Instanzen überwacht werden sollen, wiederholen Sie alle Schritte je Instanz

Setup the Jobs

Subsequently, two SQL Server Agent Jobs should be installed which will be configured as follows:

Einrichten der Jobs

Danach sollten zwei SQL Server Agent Aufträge eingerichtet werden, die wie folgt konfiguriert werden:

Name: SQL_Analysis_Collect_Deadlocks

Description: Job that collects Deadlock-Graphs from system_health to store in SQL_Analysis_Data.

Step 1, Type T-SQL, Name: Exec Proc Locking ins_DeadLock

Database context: SQL_Analysis_Code

Command: DECLARE @DeadlocksCollected int;

EXECUTE Locking.ins_DeadLock

@DeadlocksCollected = @DeadlocksCollected OUTPUT;

SELECT @DeadlocksCollected AS [Number of new Deadlocks collected:];

Name: SQL_Analysis_Remove_Old_Deadlocks

Description: Job that removes collected Deadlocks older than X days from SQL_Analysis_Data

Step 1, Type T-SQL, Name: Exec Proc Locking del_DeadLock

Database context: SQL_Analysis_Code

Command: DECLARE @DeleteOlderThanDate date, @DeadlocksRemoved int;

SET @DeleteOlderThanDate = DATEADD(dd, -60, SYSDATETIME())

SELECT @DeleteOlderThanDate AS [LatestDateToKeep:]

Execute Locking.del_DeadLock

@DeleteOlderThanDateX = @DeleteOlderThanDate

, @DeadlocksRemoved = @DeadlocksRemoved OUTPUT;

SELECT @DeadlocksRemoved AS [Number of removed deadlocks/rows from table:];

Please note that the instructions in bold must be used exactly as stated above; all others are suggestions and can be changed to your according naming styles.

In addition to that, set up a time schedule (as always I recommend a dedicated but shared schedule). For instance, every 6-12 hours, depending on the number of errors in the system_health session which works in FIFO-mode.

Bitte beachten Sie, dass die fettgedruckten Anweisungen genau wie oben angegeben verwendet werden müssen; alle anderen sind Vorschläge und können Ihren Namensrichtlinien angepasst werden.

Stellen Sie zusätzlich dazu einen Zeitplan (ich empfehle wie immer einen dedizierten aber Shared Schedule). Zum Beispiel alle 6-12 Stunden, je nach Anzahl der Fehler in der system_health Session, die im FIFO-Modus arbeitet.

After installing the databases the first time, you will find a few rows of example deadlocks in the Date-Database – also accessible via the views from the Reporting-Database.
The sole purpose of those is to help understand how the SQL Deadlock Collector & Parser works.

Nachdem Sie die Datenbanken das erste Mal eingerichtet haben, werden Sie einige Zeilen an Beispiel-Deadlocks in der Data-Datenbank finden – ebenfalls erreichbar über die Sichten aus der Reporting-Datenbank.
Der alleinige Zweck dieser ist es, besser zu verstehen zu helfen, wie der SQL Deadlock Collector & Parser funktioniert.

You can delete those manually or by using the maintenance-procedure Locking.del_DeadLock as shown in the SQL Agent job-example.

Sie können diese manuell oder mithilfe der Wartungs-Prozedur Locking.del_DeadLock, wie im SQL-Agent Auftragsbeispiel gezeigt, löschen.

Integration of Reports

In SQL Server management Studio navigate to the database “SQL_Analysis_Reporting” and there in the menu go to Reports – Custom reports, to choose the 3 provided rdl-files (separate download) for integration.

Integration der Berichte

Navigieren Sie im SQL Server Management Studio zur Datenbank “SQL_Analysis_Reporting” und dort im Menü zu Berichte – Benutzerdefinierte Berichte, um die 3 bereitgestellten rdl-Dateien (gesonderter Download) zur Integration auszuwählen.

When the warning pops up you need to say “run” in order to execute the reports.

Wenn die Warnung eingeblendet wird, müssen Sie „Ausführen“ auswählen, um die Berichte auszuführen.

After that the 3 reports will show up in the list. For more information on Custom Reports in Management Studio read here: Custom Reports in Management Studio

Danach werden die 3 Berichte in der Liste angezeigt. Für mehr Informationen zu benutzerdefinierten Berichten in Management Studio können Sie hier weiterlesen: Custom Reports in Management Studio

Usage for Analysis

When deadlocks are collected, these can be analyzed in different ways with the help of sample views located in the database SQL_Analysis_Reporting. These range from general statistics down to detailed statements, resources and execution plans.

Verwendung zur Analyse

Wenn Deadlocks gesammelt werden, können diese auf verschiedenen Wegen mithilfe der Beispielsichten in der Datenbank SQL_Analysis_Reporting analysiert werden. Diese reichen von allgemeinen Statistiken bis zu detaillierten Statements, Ressourcen und Ausführungsplänen.

A special feature is the inclusion of the original query plans (provided that they are available in the cache at the moment of collection) of the first 2 involved processes – very handy for analysis of cause.

Ein besonderes Feature ist die Einbeziehung des originalen Ausführungsplanes (vorausgesetzt, dass diese zum Zeitpunkt der Sammlung im Cache verfügbar sind) der ersten beiden involvierten Prozesse – sehr praktisch bei der Ursachenanalyse.

A SQLHandle of 0xFF will be shown when the handle from the deadlock-graph cannot be resolved any more. This happens for trivial plans or when the plan has been evicted from cache since the deadlock occurred.

The complete deadlock graph can also be used to get a visual representation of the deadlock graph.

Ein SQLHandle von 0xFF wird angezeigt, wenn das Handle aus dem Deadlock-Graphen nicht mehr aufgelöst werden kann. Das geschieht bei trivialen Plänen oder wenn der Plan seit Auftreten des Deadlock aus dem Cache geworfen wurde.
Der komplette Deadlock-Graph kann auch verwendet werden, um eine visuelle Repräsentation des Deadlock Graphen zu erhalten.

Example of a simple deadlock with 2 processes.

Beispiel eines einfachen Deadlocks mit 2 Prozessen.

Example of a multi-victim-deadlock with 4 processes.
Since a deadlock can involve an unforeseeable number of processes, but a database table by nature has a strict design, I not only keep the original complete deadlock graph but also extract the list of victims, processes and execution stacks into one xml-field each for simplified analysis.

Beispiel eines Multi-Victim-Deadlocks mit 4 Prozessen.
Da ein Deadlock eine unvorhersehbare Anzahl an Prozessen involvieren kann, aber eine Datenbanktabelle von Natur aus ein striktes Design hat, behalte ich nicht nur den originalen kompletten Deadlock-Graphen sondern extrahiere auch die Liste der Victims/Opfer, Prozesse und Execution Stacks in jeweils ein XML-Feld zur vereinfachten Analyse.

Clicking on the victim-list for a deadlock with 4 victims opens a new xml-window containing just the node of victims with their respective process-id for further, manual matching with the execution stack, for example

Klickt man auf die Victim-Liste eines Deadlocks mit 4 Victims öffnet sich ein neues xml-Fenster, welches nur den Knoten der Victims mit ihrer jeweiligen Prozess-ID für weiteres manuelles Verknüpfen mit zum Beispiel dem Execution Stack.

<victim-list>

</victim-list>

The Reports

On top of the views one can implement custom reports for reoccurring analysis. 3 example reports are included in the project and can be integrated into Management Studio directly. (Version 2012 and 2014)

Die Berichte

Basierend auf den Sichten kann man nutzerdefinierte Berichte für wiederholte Analysen implementieren. 3 Beispielberichte sind in dem Projekt enthalten und können direkt in Management Studio integriert werden. (Version 2012 und 2014)

Availability

The SQL Deadlock Collector & Parser is available as open source project under Microsoft Public License (Ms-PL) at codeplex:

sqldeadlockcollector.codeplex.com

Verfügbarkeit

Der SQL Deadlock Collector & Parser ist als Open Source Project unter Microsoft Public License (Ms-PL) auf Codeplex verfügbar:

sqldeadlockcollector.codeplex.com

Notes & Recommendations

1) The event-time in SQL Server 2008 may strongly deviate (hours and days!) due to a mistake in the XML-code of the system_health session. Please also see the following Connect Item:

Hinweise & Empfehlungen

1) Die Event-Zeit unter SQL Server 2008 kann aufgrund eines Fehlers im XML-Code der system_health session stark abweichen (Stunden und Tage!). Siehe dazu das folgende Connect Item:

http://connect.microsoft.com/SQLServer/feedback/details/649362/extended-event-session-system-health-diferent-timestamp-datetime-and-datetime-on-server-getdate

2) In order to make sure that larger deadlock graphs are also captured, I recommend increasing the maximal memory of the system_health session as one sees fit.

3) The size of the Data-database (SQL_Analysis_Data) may greatly expand with time. Use the clean-up procedure (Locking.del_DeadLock) inside the Code-database to remove old deadlock entries.

2) Um sicherzustellen, dass auch größere Deadlock-Graphen erfasst werden, empfehle ich, den maximalen Speicher der system_health session nach eigenem Ermessen zu erhöhen.

3) Der Umfang der Data-Datenbank kann mit der Zeit stark wachsen. Verwenden Sie die Aufräum-Prozedur (Locking.del_DeadLock) in der Code-Datenbank, um alte Deadlock Einträge zu entfernen.

Call to Action: Quick Poll

I absolutely invite you to use my code free of charge and comment on any problems you may have or improvements you would like to see.

All I would love for you to do is send me a high level statistics of what kind of deadlocks you see on your systems.

Simply run the following query after you have collected new deadlocks, and send me the result, purely consisting of the Number of affected processes and number of deadlocks.

Call to Action: Kurze Umfrage

Ich lade Sie dazu ein, meinen Code kostenlos einzusetzen und Probleme, die Sie antreffen sollten zu kommentieren.

Alles, worüber ich mich freuen würde, wäre, wenn Sie mir eine allgemeine Statistik darüber zusenden, welche Art von Deadlock Sie auf Ihren Systemen sehen.

Dafür können Sie einfach, nachdem Sie neue Deadlocks gesammelt haben, die folgende Abfrage ausführen, und mir das Ergebnis zusenden, welches lediglich aus der Anzahl der betroffenen Prozesse und Anzahl Deadlocks besteht.

SELECT
        [AffectedProcesses]
      , COUNT(*) AS [Number of Deadlocks]
    FROM
        [SQL_Analysis_Reporting].[Locking].[v_DeadlockInfos]
    GROUP BY
        [AffectedProcesses]
    ORDER BY
        [AffectedProcesses] ASC

-->

AffectedProcesses Number of Deadlocks
----------------- -------------------
2 3
3 1

(2 row(s) affected)

You can use comments or also send me an Email if you have access to it. (due to spam bots I will not post it here openly though)

Thank you for all sendings!

Sie können Kommentare verwenden, oder mir eine E-Mail schicken, wenn Sie auf diese Zugriff haben (aufgrund der Spam-Bots werde ich meine Email-Adresse hier jedoch nicht öffentlich posten).

Ich bedanke mich für alle Zusendungen!

I hope you find my tool useful, and good luck with the analysis of your hopefully not too many deadlocks.

Ich hoffe, Sie können mein Tool gut verwenden und wünsche Ihnen viel Erfolg bei der Analyse Ihrer hoffentlich nicht allzu vielen Deadlocks.

Andreas

deadlock, extended events, system_health

New Extended Events for Tracing in SQL Server 2014

Apr 5th

Posted by Andreas Wolter in Extended Events

2 comments

Neue “Erweiterte Ereignisse” (XEvents) für Tracing in SQL Server 2014

(de)
Dieser Artikel enthält eine Liste der neuen Erweiterte Ereignisse (“Extended Events”), dem Tracing Framework in SQL Server in dem neuen 2014 Release.

Zunächst gibt es 3 neue Pakete („Packages“), die alle zu der neuen Arbeitsspeicher(„In-Memory“) -OLTP Engine: XTP, aufgeteilt auf 3 DLLs:

(en)
This article contains a list of new Extended Events, the tracing framework in SQL Server in the new 2014 release.

To begin with there are 3 new packages, all belonging to the new In-Memory OLTP Engine: XTP, split up into 3 dlls:

Package	description	Module name
XtpRuntime	Extended events for the XTP Runtime	{InstanceDir}\MSSQL\Binn\ hkruntime.dll
XtpEngine	Extended events for the XTP Engine	{InstanceDir}\MSSQL\Binn\ hkengine.dll
XtpCompile	Extended events for the XTP Compile	{InstanceDir}\MSSQL\Binn\ hkcompile.dll

Die Anzahl der Extended Events ist von 625 in SQL Server 2012 (SP1) auf 870 in SQL Server 2014 (RTM) angestiegen. Das sind 245 neue Events! – Reichlich Möglichkeiten, in Internas einzutauchen :-)
2 Events sind umbenannt worden:

recovery_incremental_checkpoint in

recovery_indirect_checkpoint

und hekaton_slow_param_passing in

natively_compiled_proc_slow_parameter_passing

Ja, richtig gelesen. Es gab bereits im SQL Server 2012 eine Anzahl an Events für die XTP Engine. Es kommt recht häufig vor, dass man Bits von zukünftigen Entwicklungen in der aktuellen Release findet.

Nebenbei: SQL Trace wurde nicht im geringsten angefasst. Die Anzahl von SQL Trace Events bleibt mit 180 seit SQL Server 2008 stabil. – Here ist ein Überblick von Extended Events gegen SQL Trace Events in den verschiedenen SQL Server Versionen: Extended Events vs SQL Trace im Vergleich – oder warum SQL Trace & Profiler einfach von gestern sind :-)

Weiter zu XEvents: es gibt keine neuen Prädikate, aber eine Anzahl neuer Maps, was natürlich nicht überrascht bei der Menge neuer Events.
Targets, Actions, Typen und Session Optionen sind unverändert.

Und hier ist die Liste der neuen Extended Events:

The number of Extended Events increased from 625 in SQL Server 2012 (SP1) to 870 in SQL Server 2014 (RTM). That makes 245 new events! – Plenty of opportunities to look into internals :-)
2 events have been renamed:

recovery_incremental_checkpoint into

recovery_indirect_checkpoint

and hekaton_slow_param_passing into

natively_compiled_proc_slow_parameter_passing

Yes, you read correctly. There have already been a couple of events for the new XTP engine within SQL Server 2012. It’s quite common that you can find bits of future developments within the current release.

By the way: SQL Trace has not been touched by the slightest bit. The number of SQL Trace events stays at 180 since SQL Server 2008. – here you find an overview of the number of Extended Events vs SQL Trace events in the various SQL Server versions: Comparing Extended Events vs SQL Trace – or why SQL Trace & Profiler are just a thing of the past :-)

Continuing with XEvents: There are no new predicates, but a couple more maps, which is of course not surprisingly considering the amount of new events.
Targets, actions, types and session options are unchanged.

And here is the list of all new extended events:

Package name	Event name	Description
qds	query_store_async_shutdown_failed	Fired when query store encounters an error during async query store shutdown
qds	query_store_background_cleanup_task_failed	Fired if the background task for cleanup of QDS stale data is not completed successfully
qds	query_store_background_task_creation_failed	Fired if the background task for QDS data persistence could not be created
qds	query_store_background_task_initialization_failed	Fired if the background task for QDS data persistence could not be initialized
qds	query_store_background_task_persist_failed	Fired if the background task for QDS data persistence is not completed successfully
qds	query_store_begin_persist_runtime_stat	Fired immediately before current runtime statistics for a query plan is persisted to the database.
qds	query_store_bloom_filter_false_positive	Fired if the Bloom filter for QDS statement texts gives a false positive result
qds	query_store_check_consistency_init_failed	Fired when check consistency task fails during initialization
qds	query_store_database_initialization_failed	Fired if initialization of the Query Store for database has failed. The Query Store will be disabled for this database
qds	query_store_db_cleanup__finished	Fired if cleanup of QDS stale data finished for particular database.
qds	query_store_db_cleanup__started	Fired if cleanup of QDS stale data started for particular database.
qds	query_store_disk_size_check_failed	Fired when a check against Query Store on-disk size limit fails
qds	query_store_disk_size_info	Fired when a check against QDS on-disk size is performed
qds	query_store_execution_runtime_info	Fired when runtime information is sent to the query store.
qds	query_store_execution_runtime_info_discarded	Fired when runtime information sent to the query store is discarded.
qds	query_store_execution_runtime_info_evicted	Fired when runtime information sent to the query store is evicted.
qds	query_store_flush_failed	Fired when query store failed to flush dirty data
qds	query_store_loaded	Fired when query store is loaded
qds	query_store_notify_dirty_shutdown_on_partition_startup	Fired when previous instance of query store for the partition is shutdown by force in order to allow creation of a new instance
qds	query_store_notify_force_failure_failed	Fired when query store failed to notify force failure
qds	query_store_persist_task_init_failed	Fired when persist task fails during initialization
qds	query_store_plan_forcing_failed	Occurs when forcing of plan from qds fail
qds	query_store_plan_persistence_failure	Fired if there's a failure to persist plan
qds	query_store_plan_removal	Fired when plan is removed
qds	query_store_query_persistence_failure	Fired if there's a failure to persist query
qds	query_store_read_write_failed	Fired if the read/write to Query Store internal tables failed
qds	query_store_statement_not_found	Fired in case when statement couldn't be found due to race condition or ambiguous user request.
qds	query_store_unloaded	Fired when query store is unloaded from memory
sqlos	nonpreemptive_long_syncio	record long sync io operation in nonpreemptive worker
sqlos	stuck_dispatcher_callback_executed	Stuck dispatcher callback executed
sqlos	wait_completed	Occurs when there is a wait completed on a SQLOS controlled resource. Use this event to track wait completion.
sqlserver	after_natively_compiled_proc_entry_removal_on_drop	Fired after the procedure cache entry is flushed when dropping a natively compiled procedure.
sqlserver	availability_replica_state	Occurs when the Availability Replica is starting or shutting down.
sqlserver	before_natively_compiled_proc_entry_removal_on_drop	Fired before the procedure cache entry is flushed when dropping a natively compiled procedure.
sqlserver	before_redo_lsn_update	Occurs just prior to the update of the EOL LSN
sqlserver	buffer_pool_eviction_thresholds_recalculated	Lazywriter and/or worker clock has wrapped the BUF array and thresholds are re-calculated.
sqlserver	buffer_pool_extension_pages_evicted	Page is evicted from the buffer pool extension cache.
sqlserver	buffer_pool_extension_pages_read	Page is read from the buffer pool extension cache.
sqlserver	buffer_pool_extension_pages_written	Page or contiguous set of pages evicted into the buffer pool extension cache.
sqlserver	check_phase_tracing	Occurs when DBCC CHECK enters a new phase of the checking. Use this event to trace the phases of DBCC CHECK process.
sqlserver	check_thread_message_statistics	Occurs when a phase of DBCC CHECK is finished. Use this event to collect the number of messages a DBCC CHECK thread has sent or received.
sqlserver	check_thread_page_io_statistics	Occurs when a phase of DBCC CHECK is finished. Use this event to collect the number of logical, physical, and read-ahead IOs a DBCC CHECK thread has performed.
sqlserver	check_thread_page_latch_statistics	Occurs when a phase of DBCC CHECK is finished. Use This event to collect the number and time of page latch and IO latch waits.
sqlserver	clustered_columnstore_index_rebuild	Occurs when clustered index on the table was rebuilt. This event is raised three times for ALTER index rebuild operation on CCSI. The event is raised when the operation takes lock on index rebuild resource, when lock is taken on the table and when S lock on the table is upgraded to SCH_M lock to switch indexes in metadata.
sqlserver	column_store_code_coverage	Code coverage Xevent for columnstore code.
sqlserver	column_store_index_build_low_memory	Occurs when Storage Engine detects low memory condition and the rowgroup size is reduced.
sqlserver	column_store_index_build_process_segment	Occurs when a segment is processed
sqlserver	column_store_index_build_throttle	Shows the statistics of columnstore index build parallelism throttling
sqlserver	columnstore_tuple_mover_begin_compress	Occurs when column store tuple mover starts compressing a deltastore.
sqlserver	columnstore_tuple_mover_end_compress	Occurs when column store tuple mover is done compressing a deltastore.
sqlserver	database_xml_deadlock_report	Produces a deadlock report for a victim, with information scoped to the victim's database.
sqlserver	db_lock_acquired_from_cache	Occurs when a DB lock is acquired from the XactWorkspace DB lock cache.
sqlserver	db_lock_released_from_cache	Occurs when a DB lock is released from the XactWorkspace DB lock cache.
sqlserver	ddl_with_wait_at_low_priority	A DDL statement was executed using the WAIT_AT_LOW_PRIORITY options
sqlserver	diag_quantum_end	Occurs when the diag is notified of quantum end.
sqlserver	dyn_throttle_checkpoint	Occurs when checkpointing has been dynamically throttled with a new Checkpoint rate.
sqlserver	feature_extension	Occurs when received and parses data for a feature in feature extension.
sqlserver	file_handle_in_use	Fired when a file handle we're trying to delete is in use and we don't expect it to be. The typical response is dumping all the handles in the system via SysInternals HANDLE.EXE
sqlserver	file_read_enqueued	File read enqueued
sqlserver	file_read_throttled	File read throttled
sqlserver	file_write_enqueued	File write enqueued
sqlserver	file_write_throttled	File write throttled
sqlserver	ghost_cleanup_task_packet_enqueue	A task packet is enqueued
sqlserver	ghost_cleanup_task_process_packet	A task packet is dequeued and processed
sqlserver	ghost_cleanup_task_process_pages_for_db_packet	Purge the pages that contain ghost records found in a database
sqlserver	ghost_cleanup_task_start	Ghost cleanup task start
sqlserver	ghost_cleanup_task_suspend	Ghost cleanup task suspend
sqlserver	hadr_tds_synchronizer_payload_skip	Hadron Tds Listener Synchronizer skipped a listener payload because there were no changes since the previous payload.
sqlserver	lock_request_priority_state	The priority state of a lock request
sqlserver	log_cache_write_block	Writing a log block to the log cache
sqlserver	metadata_ddl_add_column	Occurs when an ALTER TABLE ADD column operation is updating base index.
sqlserver	metadata_ddl_alter_column	Occurs when an ALTER TABLE ALTER column operation is updating base index.
sqlserver	metadata_ddl_drop_column	Occurs when an ALTER TABLE DROP column operation is updating base index.
sqlserver	mixed_extent_activation	Track mixed extent activation and deactivation operations.
sqlserver	mixed_extent_allocation	Track mixed extent allocation operations
sqlserver	mixed_extent_deallocation	Track mixed extent deallocation operations.
sqlserver	mixed_page_allocation	Track mixed page allocation operations
sqlserver	mixed_page_deallocation	Track mixed page allocation operations
sqlserver	mixed_page_scan_file	Track the activity of SGAM scans for mixed page allocation
sqlserver	mixed_page_scan_page	Track the activity of SGAM scans for mixed page allocation
sqlserver	mixed_page_skipextent	Track the activity of SGAM scans for mixed page allocation
sqlserver	natively_compiled_proc_execution_started	Fired before a natively compiled procedure is executed.
sqlserver	natively_compiled_proc_slow_parameter_passing	Occurs when a Hekaton procedure call dispatch goes to slow parameter passing code path
sqlserver	process_killed_by_abort_blockers	A process is killed by an ABORT = BLOCKERS DDL statement
sqlserver	query_execution_batch_hash_aggregation_finished	Occurs at the end of batch hash aggregation.
sqlserver	query_execution_batch_hash_children_reversed	Occurs each time when hash join reverses build and probe side while processing data spilled to disk.
sqlserver	query_execution_batch_hash_join_spilled	Occurs each time when hash join spills some data to disk in batch processing.
sqlserver	query_optimizer_estimate_cardinality	Occurs when the query optimizer estimates cardinality on a relational expression.
sqlserver	query_optimizer_force_both_cardinality_estimation_behaviors	Both traceflags 2312 and 9481 were enabled, attempting to force both old and new cardinality estimation behaviors at the same time. The traceflags were ignored. Disable one or both of the traceflags.
sqlserver	query_store_failed_to_capture_query	Fired if the Query Store failed to capture query. The Query Store will not track statistics for this query
sqlserver	query_store_failed_to_load_forced_plan	Fired if the query failed to load forced plan from QDS. Forcing policy will not be applied
sqlserver	query_store_persist_on_shutdown_failed	Occurs when SQL Server fails to store dirty entries in Query Store on database shutdown.
sqlserver	recovery_indirect_checkpoint	Chose to enqueue an indirect checkpoint as we are near our recovery target
sqlserver	remove_database_cache	Remove database cache
sqlserver	server_max_workers	Occurs when a request enqueue fails because a worker was not guaranteed.
sqlserver	session_recoverable_state_change	Occurs when the server determines a state change in term of session recovery on a connectionresilency-enabled connection.
sqlserver	sql_transaction_commit_single_phase	Occurs when a sql transaction is committed with single phase commit.
sqlserver	xfcb_blob_properties_obtained	Windows Azure Storage blob property is obtained from response header.
sqlserver	xfcb_failed_request	Failed to complete a request to Windows Azure Storage.
sqlserver	xfcb_header_obtained	Response header is obtained from request to Windows Azure Storage.
sqlserver	xfcb_read_complete	Read complete from Windows Azure Storage response.
sqlserver	xfcb_request_opened	A request is opened to Windows Azure Storage.
sqlserver	xfcb_send_complete	Request send to Windows Azure Storage is complete.
sqlserver	xfcb_write_complete	Request send to Windows Azure Storage is complete.
sqlserver	xtp_create_procedure	Occurs at start of XTP procedure creation.
sqlserver	xtp_create_table	Occurs at start of XTP table creation.
sqlserver	xtp_db_page_allocation_allowed	Indicates that page allocations for the database are allowed.
sqlserver	xtp_db_page_allocation_disallowed	Indicates that page allocations for the database are disallowed due to memory pressure.
sqlserver	xtp_deploy_done	Occurs at completion of XTP object deployment.
sqlserver	xtp_matgen	Occurs at start of MAT generation.
sqlserver	xtp_offline_checkpoint_scan_start	Fired by XTP offline checkpoint when the checkpoint thread begins.
sqlserver	xtp_offline_checkpoint_scan_stop	Fired by XTP offline checkpoint when the checkpoint thread stops.
sqlserver	xtp_recover_done	Occurs at completion of log recovery of XTP table.
sqlserver	xtp_recover_table	Occurs at start of log recovery of XTP table.
sqlserver	xtp_storage_table_create	Occurs at just before the XTP storage table is created.
ucs	ucs_connection_rejected_by_proxy_whitelist	After a connection attempt to the UCS proxy endpoint is rejected by whitelist check
ucs	ucs_proxy_connect_next_hop	UCS proxy next hop connection
ucs	ucs_proxy_receive_proxy_connect_message	UCS proxy receive proxy connect message
ucs	ucs_proxy_route_add	UCS proxy route added
ucs	ucs_proxy_route_disable	UCS proxy route disabled
ucs	ucs_proxy_route_refresh	UCS proxy route refreshed
ucs	ucs_proxy_send_proxy_connect_message	UCS proxy send proxy connect message
XtpCompile	cgen	Occurs at start of C code generation.
XtpCompile	invoke_cl	Occurs prior to the invocation of the C compiler.
XtpCompile	mat_export	Occurs at start of MAT export.
XtpCompile	pitgen_procs	Occurs at start of PIT generation for procedures.
XtpCompile	pitgen_tables	Occurs at start of PIT generation for tables.
XtpEngine	after_changestatetx_event	Fires after transaction changes state.
XtpEngine	alloctx_event
XtpEngine	attempt_committx_event	Is raised when a transaction is asked to commit.
XtpEngine	before_changestatetx_event	Fires before transaction changes state.
XtpEngine	dependency_acquiredtx_event	Raised after transaction takes a dependency on another transaction.
XtpEngine	endts_acquiredtx_event	Fires after transaction acquires an end timestamp.
XtpEngine	gc_base_generation_evaluation	Indicates that an evaluation of updating the GC base generation has been made.
XtpEngine	gc_base_generation_updated	Indicates that the oldest active transaction hint used for calculating the GC base generation has been updated.
XtpEngine	gc_cycle_completed	Indicates that a GC notification has been enqueued.
XtpEngine	gc_notification	Indicates that GC is processing a notification.
XtpEngine	redo_single_hk_record	Redo on a HK log record
XtpEngine	trace_add_delta_filter_begin	Adding delta filter.
XtpEngine	trace_add_duplicate_delta_filter	Adding duplicate delta filte.
XtpEngine	trace_adding_tx_filter	Adding tx filter.
XtpEngine	trace_begin_close_ckpt_processing	Beginning close checkpoint processing: checkpoint flush fence = $2 ($3 state).
XtpEngine	trace_cfd_entry_deleted	Cleaned up data CFD entry for file $2 $3.
XtpEngine	trace_cfd_rows_created	Successfully created CFD rows.
XtpEngine	trace_cfdtable_dump	CfdTable dump
XtpEngine	trace_cfp_handles_opened	Opened CFP handles for {data: $2 delta: $3} in section $4.
XtpEngine	trace_cfp_removed_from_freelist	Removed CFP from freelist for destination file.
XtpEngine	trace_checkpoint_file_flush	Indicating that a file has started a flush operation.
XtpEngine	trace_checkpoint_load_begin	Starting Checkpoint load with checkpoint timestamp=$2.
XtpEngine	trace_checkpoint_write_io	Logs an event indicating that an IO has started (when event_type == IoStarted) or that an IO has completed (when event_type == IoComplete).
XtpEngine	trace_checksum_validation_succeeded	Checksum validation for Hekaton checkpoint file during backup succeeded.
XtpEngine	trace_ckpt_close_begin	Closing checkpoint in active/passive state.
XtpEngine	trace_ckpt_close_signaled	Signaling checkpoint close. The checkpoint policy object is flagged to close the current checkpoint
XtpEngine	trace_ckpt_closed	Checkpoint closed
XtpEngine	trace_ckpt_load_thread_stats	Statistics of the checkpoint load thread.
XtpEngine	trace_ckpt_mrt_dump	Mrt dump
XtpEngine	trace_ckpt_serialization_state_dump	Checkpoint serialization state dump.
XtpEngine	trace_ckpt_stream_io_stats	Io statistics for this particular stream.
XtpEngine	trace_close_file_log_rec_created	Successfully created close file log record.
XtpEngine	trace_data_file_pages_txs	This event gives the number of pages and the transaction range for a data file.
XtpEngine	trace_data_from_page_deserialized	Deserialized data from page $2 in file $4.
XtpEngine	trace_data_row_modified	Data row modified/deleted.
XtpEngine	trace_delete_cfd_rows_begin	Matching MRT not found for the Data CFD. Deleting corresponding CFD Rows.
XtpEngine	trace_delete_DeletedObjectTable_row_begin	Deleting row from DeletedObjectTable in the context of transaction: $5.
XtpEngine	trace_delete_merged_source_file	Deleting merged source $7 file $5:$6. DeleteLSN = $2:$3:$4.
XtpEngine	trace_delete_mrtrow_for_target_begin	Deleting mrt row for target.
XtpEngine	trace_delta_entry_skipped	Skipping delta {$3, $4, $5} because it is beyond the snapshot checkpoint: $6.
XtpEngine	trace_delta_file_future_count_rows_pages_update	Updating future count pages and count rows for delta file.
XtpEngine	trace_delta_file_lastgood_count_rows_pages_update	Updating lastgood count pages and count rows for delta file.
XtpEngine	trace_delta_file_pages_txs	This event gives the number of pages and the transaction range while reading a delta file for a particular operations.
XtpEngine	trace_delta_file_processed	Completed processing of Delta file.
XtpEngine	trace_delta_file_set_clean_or_dirty	Delta file set clean or dirty.
XtpEngine	trace_delta_filter_removed	Removing delta filter from DeltaFilterMap due to transaction filter.
XtpEngine	trace_delta_watermark_removed	Removed delta watermark.
XtpEngine	trace_delta_watermark_updated	Updated delta watermark row for FileId: $4.
XtpEngine	trace_deserialize_chained_records	Info pertaining to the deserialized content of the mrtRecords chain.
XtpEngine	trace_deserialize_data_from_page_begin	Deserializing data from page $2 in file $4.
XtpEngine	trace_deserialize_source_delta_file_begin	Deserializing source delta file $4: page range [$2, $3).
XtpEngine	trace_dump_cfd_row	Dump cfd row
XtpEngine	trace_dump_mrt_row	Merge Request
XtpEngine	trace_file_close_skipped	Skipping file close for $2 as the cfp had been merged, It could not be found in the Storage array.
XtpEngine	trace_file_skipped	Skipping file $4 due to reason $5.
XtpEngine	trace_file_skipped_with_tx_range	Skipping file $5 with TX range { $3 -> $4 } due to MaxTxId == InvalidTxId.
XtpEngine	trace_first_page_header_flushed	Flushed first page header for file.
XtpEngine	trace_flush_io_operation_for_delta_file	This trace is trying to catch pages issued for flush that didn't make it to the file, or were lacking the correct DeltaWatermark Timestamp.
XtpEngine	trace_flush_skipped_for_closed_data_file	Skipping flush for data file ('$2') as it is closed.
XtpEngine	trace_future_count_pages_update	Updating future count pages.
XtpEngine	trace_initial_merge	Doing initial merge for destination $3 at CheckpointTs: $2.
XtpEngine	trace_insert_DeletedObjectTable_row_begin	Inserting row to DeletedObjectTable in the context of transaction: $5.
XtpEngine	trace_intialize_worker_file_begin	Processing file close.
XtpEngine	trace_mark_data_file_closed	Marking the active data file as closed to prevent the offline worker fom flushing them.
XtpEngine	trace_merge_cancelled	Merge was cancelled for destination $2.
XtpEngine	trace_merge_complete_log_rec_ignored	MergeCompleteLogRecord ignored for destination file $3.
XtpEngine	trace_merge_complete_log_rec_processed	Successfully processed MergeCompleteLogRecord.
XtpEngine	trace_mrt_dump	Mrt dump.
XtpEngine	trace_mrt_row_inplace_update	Mrt row updated in place.
XtpEngine	trace_mrtrow_not_found	Could not find row with Destination ID: $2 in MRT table.
XtpEngine	trace_new_delta_watermark_inserted	Inserted new delta watermark row for FileId $3.
XtpEngine	trace_offline_process_stale_merge_item	Offline processing of stale merge item {$2, $3} Destination $4.
XtpEngine	trace_pending_tx	Pending Transaction
XtpEngine	trace_populate_storage_array	Populating the storage array.
XtpEngine	trace_post_writefile	This event is fired after issuing IO on the proxy.
XtpEngine	trace_process_file_close_begin	Processing file close.
XtpEngine	trace_process_merge_complete_log_record	Process merge complete log record.
XtpEngine	trace_process_merge_request	Process Merge Request
XtpEngine	trace_process_source_file	Processing source $4 file $2.
XtpEngine	trace_process_target_file	Processing target $4 file $2.
XtpEngine	trace_queue_merge_work_item	Queue work item: {$2, $3} $5 - $6 with CheckpointTs as $4.
XtpEngine	trace_recovered_existing_file	Recovered exising files files allocated during forward processing that are not already part of the CCB.
XtpEngine	trace_resyncing_state_of_file	Resyncing state of file $4 ('$5') post recovery: $2 pages, $3 rows.
XtpEngine	trace_serialization_object_cleaned	Cleaned serialization object.
XtpEngine	trace_serialize_abort_transaction_begin	Serializing the abort transaction filter to the delta cache corresponding to its own transaction range.
XtpEngine	trace_serialize_tx_begin	Serialize TS begin for completing the checkpoint in active state.
XtpEngine	trace_serialize_tx_end	Serialize TS end for completing the checkpoint.
XtpEngine	trace_serialized_delta_rows_to_file	Serialized delta rows to file.
XtpEngine	trace_source_file_load_stats	Load stats for source files in merge.
XtpEngine	trace_storage_refreshed_for_file	Storage refreshed for file.
XtpEngine	trace_uninitialize_file	uninitializing_file
XtpEngine	trace_wrote_delta_watermark	Wrote delta watermark for file: $3, checkpointTs: $2.
XtpEngine	waiting_for_dependenciestx_event	Raised when we have explicitly waited for dependencies to clear.
XtpEngine	xtp_before_create_log_record	Fired before submitting a request to the host to serialize a non-transactional log record.
XtpEngine	xtp_checkpoint_file_flush	Indicates the point at which a given file has been flushed to disk.
XtpEngine	xtp_checkpoint_file_flush_complete	Indicates the point at which all in-flight buffers have been flushed to disk.
XtpEngine	xtp_checkpoint_file_read	Indicates reading of a file in XTP checkpoint recovery.
XtpEngine	xtp_checkpoint_worker_active_log	Indicates that the checkpoint worker has started processing the log records in the active portion of the log.
XtpEngine	xtp_checkpoint_worker_started	Indicates that the checkpoint worker has started processing the log records.
XtpEngine	xtp_checkpoint_worker_stopped	Indicates that the checkpoint worker has started processing the log records.
XtpEngine	xtp_checkpoint_write_io	Indicates that the checkpointing subsystem has issued or completed a write IO.
XtpEngine	xtp_complete_checkpoint	Indicates the begin and end of complete checkpoint processing.
XtpEngine	xtp_create_log_record	Fires when the XTP engine creates a log record.
XtpEngine	xtp_delta_marked_dirty	Indicates that a delete/update was serialized and marked a delta file dirty
XtpEngine	xtp_merge_complete_log_record	Indicates merge complete log record is posted to the log.
XtpEngine	xtp_merge_log_complete_log_record	Indicates merge log record was interpreted by the offline worker.
XtpEngine	xtp_merge_process_log_record	Indicates merge log record was interpreted by the offline worker.
XtpEngine	xtp_merge_request_log_record	Indicates merge request log record is posted to the log.
XtpEngine	xtp_merge_request_started	Indicates merge request has been picked up by the merge worker thread.
XtpEngine	xtp_root_deserialized	Indicates that the load of a checkpoint root is complete.
XtpEngine	xtp_root_serialized	Indicates that the write of the checkpoint root is complete.
XtpRuntime	bind_md	Occurs prior to binding metadata for a memory optimized table.
XtpRuntime	bind_tables	Occurs prior to binding tables for a natively compiled procedure.
XtpRuntime	create_table	Occurs prior to creating memory optimized table.
XtpRuntime	deserialize_md	Occurs prior to deserializing metadata.
XtpRuntime	load_dll	Occurs prior to loading the generated DLL.
XtpRuntime	recover_done	Occurs at completion of checkpoint recovery of a memory optimized table.
XtpRuntime	recover_table	Occurs at start of checkpoint recovery of a memory optimized table.
XtpRuntime	serialize_md	Occurs prior to serializing metadata.
XtpRuntime	unload_dll	Occurs prior to unloading the generated DLL.

Happy tracing,

Andreas

extended events, sql server 2014, sql trace _ profiler

Comparing Extended Events vs SQL Trace – or why SQL Trace & Profiler are just a thing of the past :-)

Nov 21st

Posted by Andreas Wolter in Extended Events

No comments

Extended Events vs SQL Trace im Vergleich – oder warum SQL Trace & Profiler einfach von gestern sind :-)

(de)
Zur Erinnerung: Extended Events sind seit SQL Server 2008 in SQL Server integriert. Und seit SQL Server 2012 SP1 sind alle Events verfügbar, die es in SQL Trace gibt. Zudem sind Extended Events seit SQL Server 2012 auch für Analysis Services verfügbar (Tracing Analysis Services (SSAS) with Extended Events – Yes it works and this is how).

Für alle, die noch mit dem alten Werkzeug SQL Server Profiler (Profiler ist das Frontend für SQL Trace, gestartet mit sp_trace_create) arbeiten, und sich noch nicht für die neue Technologie entscheiden konnten, hier eine kleine Entscheidungshilfe.

Was Extended Events (XEvents) besser als SQLTrace machen:

(en)
As a reminder: Extended Events have been integrated in SQL Server since SQL Server 2008. And since SQL Server 2012 SP1, all events existing in SQL Trace have been available. In addition, Extended Events have also been available for Analysis Services since SQL Server 2012 Tracing Analysis Services (SSAS) with Extended Events – Yes it works and this is how).

For those of you who are still working with the old tool SQL Server Profiler (Profiler is the frontend for SQL Trace, started with sp_trace_create) and have not quite been able to decide for the new technology, here is some decision guidance.

What Extended Events (XEvents) do better than SQL Trace:

Einzige Möglichkeit neue Features wie FileStream, FileTable, AlwaysOn, ColumnStore, Hekaton/XTP etc. zu Tracen
Viel mehr Events tracebar, auch bereits für ältere Releases (siehe *1, *2 unten)
Deutlich geringerer Beobachter-Overhead
siehe auch: Performance overhead of tracing with Extended Event targets vs SQL Trace under CPU Load
Performance/Last lässt sich konfigurieren & tunen
Events und Filter lassen sich live anpassen – während aktiver Session also
Ofiziell 2 µs/Event gg. 4ms/Event in SQLTrace
Event-Verlust konfigurierbar
Effiziente Filterung durch Architektur
Komplexe Prädikate wie z.B. Zähler oder last_error, less_than_min_datatype oder greater_than_max_datatype
Korrelation von Events möglich
Möglichkeit Events vom Client bis in die Datenbank zu verfolgen
Einfache Automatisierung
Direkt in Management Studio integriert
Viele Analysen direkt in der GUI möglich (Um diese noch zu verbessern, bitte hier bei Microsoft Connect voten: Extended Events UI Export Display Settings: include grouping)
Query_hash zum Identifizieren von identischen Abfragen verfügbar
Keine 10 Klicks zum Aufsetzen einer simplen Session inkl. Filter
Unterschiedliche Speicher-Ziele für EventDaten zur Auswahl (6)
„Ergebnisorientierte“ Ziele wie Counter und Histogramm
Multiple Ziele lassen sich für “On the fly - Top-Down Analysen” kombinieren
Möglichkeiten für ganz neue Einblicke in Interna der Datenbank-Engine (Latching, Spinlocks, Multi-victim-Deadlock, Wait_Infos per session/query, Caching-Vorgänge, Ghost-cleanup, Analyse von Page Splits, Page-Compression Vorgänge, um nur einige zu nennen)
Stack Tracen eines einzelnen Prozesses möglich – anstelle eines vollständigen Server Dumps
Definition mit Standard DDL-Statements
API zur Integration in eigene Tools verfügbar
PowerShell-Unterstützung
Last but not least: Endlich ein Grund, XML & XQuery zu lernen? ;-)

The only possibility of Tracing new features like FileStream, FileTable, AlwaysOn, ColumnStore, Hekaton/XTP etc.
Many more Events traceable, even for older releases (see *1, *2 below)
Significantly less Observer-Overhead, also see: Performance overhead of tracing with Extended Event targets vs SQL Trace under CPU Load
Performance/Overhead can be configured and tuned
Events and Filters can be adapted live – meaning during active session
Official 2 µs/Event vs. 4ms/Event in SQLTrace
Event-loss is configurable
Efficient filtering through architecture
Complex predicates such as Counter or last_error, less_than_min_datatype oder greater_than_max_datatype
Correlation of Events possible
Possibility of following Events from Client into the Database
Easy Automation
Directly integrated in Management Studio
Many analysis directly inside the GUI possible (in order to improve them, please vote at Microsoft Connect: Extended Events UI Export Display Settings: include grouping)
Query_hash for identification of identical queries available
No 10 clicks to set up a simple session including filter
Choice of different destinations for storing EventData (6)
“Goal-oriented” destinations such as Counter and Histogram
Multiple destinations can be combined for “On the fly – top-down analyses”
Possibilities for entirely new insights into internal matters of database engine (Latching, Spinlocks, Multi-victim-Deadlock, Wait_Infos per session/query, Caching-processes, Ghost-cleanup, analysis of Page Splits, Page-Compression processes, to name just a few)
Stack Tracing of a single process possible – instead of a complete Server Dump
Definition with standard DDL-Statements
API for integration into one’s own tools available
PowerShell support
Last but not least: Finally a good reason to learn XML & XQuery? ;-)

*1
Extended Events vs. SQL trace Events per Version

*2
As an example: For Service Broker there are:
15 Events in SQLTrace vs. 44 Events in XEvents (SQL Server 2012 SP1)

Ich hoffe, das hilft dem einen oder anderen, die alte Gewohnheit abzulegen, und die kleine Lernphase in Kauf zu nehmen.

Eine Liste aller Extended Events in SQL Server 2012 SP1 samt Ihrem Gegenstück in SQL Trace, welche man für Migrationszwecke (SQLTrace -> XEvent Trace) verwenden kann findet sich in dieser Seite. (Aufgrund einer Größenbeschränkung passte sie nicht mehr hier hinein.)

I hope this helps some of you to unlearn the old habit and accept the little learning phase.

A list of all Extended Events in SQL Server 2012 SP1 including its counter piece in SQL Trace that can be used for migration purposes (SQLTrace-> XEvent Trace) is available on this page. (Due to size restriction, it didn’t fit in here anymore.)

> Mapping Extended Events with sys.trace_xe_event_map to SQL Trace <

Happy better Tracing

Andreas

PS: Für 2014 befindet sich mit den SQL Server Master-Classes zum Thema „Tracing mit Extended Events“ auch die nächste Runde der Trainings zu diesem Thema in der Planung :-)

P.S. For 2014, the next round of training in this topic is being developed in conjunction with SQL Server Master-Classes on the topic “Tracing with Extended Events” :-)

extended events, sql trace _ profiler

SQL Server & BI Blog by Andreas Wolter

Tag: "extended events"

Where is that Preemptive Wait coming from? Database Ownership and Performance: a journey through SQL Server internals with XEvents, Windbg and Wireshark

How to import Extended Events session file target and parse deadlock-graph / Import von Extended Events Ereignisdateien und Parsen von Deadlock-Graphen

Free Deadlock-Collector & -Parser, based on Extended Events session system_health / freier Deadlock-Sammler & -Parser basierend auf Erweiterten Ereignissen

New Extended Events for Tracing in SQL Server 2014

Comparing Extended Events vs SQL Trace – or why SQL Trace & Profiler are just a thing of the past :-)

Recent (5) Articles

XML Feeds

Tags used in blog-articles