Categories: "SQL Server Performance"

Configuring a server side trace

When I'm after SQL Server performance  problems, SQL Server Profiler is still my number one tool. Allthough I know that extended events provide a mor lightweight solution, those are still a bit cumbersome to use (but I've seen that we can expect some improvements with SQL Server 2012).

When I'm using profiler to isolate performance issues, I try to configure server side traces, whenever possible. Fortunately, SQL Server Profiler will help you creating a script for a server side trace (File/Export/Script Trace Definition), so you don't have to figure out all the event- and column-codes. Very good!

As I was doing the same configuration again and again, I decided to separate the TSQL code for the configuration inside a stored procedure.

And here comes dbo.configureServerSideTrace:

if object_id('dbo.configureServerSideTrace', 'P') is not null
  drop procedure dbo.configureServerSideTrace
go

-- Example for:
-- Start Trace
--   declare @traceID int
--   exec dbo.configureServerSideTrace @traceStatus = 1
--                                    ,@traceID = @traceID output
--                                    ,@maxFileSize = 10000
--                                    ,@traceFileName = N'e:\VMITrace\Undo'
--                                    ,@spId = @@spid
--
-- End Trace
--   exec dbo.configureServerSideTrace @traceStatus = 0, @traceID = @traceID



create procedure dbo.configureServerSideTrace
                                 (@traceStatus   bit                 
-- 1 => Start Trace
                                                                     
-- 0 => Stop Trace
                                 ,@traceID       int output          
-- If the Trace is started, this param will return the TraceID
                                                                     
-- For stopping the trace, the param has to be provided
                                 ,@spId          int           = null
-- provide the @@spid, if you want to filter only events for this conection
                                                                     
-- Optional. If not provided => no filter. Not needed for stopping the trace
                                 ,@maxFileSize   bigint        = 5000
-- Maximum Trace File Size in Megabyte. Trace will be stopped, if the filesize is reached.
                                 ,@traceFileName nvarchar(200) = null
-- Name of the trace file (server side!)
                                                                     
-- Optional. Not neded for stoping the trace
                                                                     
-- Attention! If the file already exists, the SP will yield an error
                                                                     
-- and no trace is started.
                                 )
as
begin

if
(@traceStatus = 0
)
begin
   exec sp_trace_setstatus @TraceID,
0
  
exec sp_trace_setstatus @TraceID,
2
  
return
;
end

-- Create a Queue
declare @rc
int

exec
@rc = sp_trace_create @TraceID output, 0, @traceFileName, @maxfilesize, NULL
if (@rc != 0) goto error

-- Set the events
declare @on
bit
set
@on = 1
exec sp_trace_setevent @TraceID, 43, 15, @on
exec sp_trace_setevent @TraceID, 43, 48, @on
exec sp_trace_setevent @TraceID, 43, 1, @on
exec sp_trace_setevent @TraceID, 43, 34, @on
exec sp_trace_setevent @TraceID, 43, 35, @on
exec sp_trace_setevent @TraceID, 43, 51, @on
exec sp_trace_setevent @TraceID, 43, 4, @on
exec sp_trace_setevent @TraceID, 43, 12, @on
exec sp_trace_setevent @TraceID, 43, 13, @on
exec sp_trace_setevent @TraceID, 43, 14, @on
exec sp_trace_setevent @TraceID, 43, 22, @on
exec sp_trace_setevent @TraceID, 42, 1, @on
exec sp_trace_setevent @TraceID, 42, 14, @on
exec sp_trace_setevent @TraceID, 42, 22, @on
exec sp_trace_setevent @TraceID, 42, 34, @on
exec sp_trace_setevent @TraceID, 42, 35, @on
exec sp_trace_setevent @TraceID, 42, 51, @on
exec sp_trace_setevent @TraceID, 42, 4, @on
exec sp_trace_setevent @TraceID, 42, 12, @on
exec sp_trace_setevent @TraceID, 45, 16, @on
exec sp_trace_setevent @TraceID, 45, 48, @on
exec sp_trace_setevent @TraceID, 45, 1, @on
exec sp_trace_setevent @TraceID, 45, 17, @on
exec sp_trace_setevent @TraceID, 45, 18, @on
exec sp_trace_setevent @TraceID, 45, 34, @on
exec sp_trace_setevent @TraceID, 45, 35, @on
exec sp_trace_setevent @TraceID, 45, 51, @on
exec sp_trace_setevent @TraceID, 45, 4, @on
exec sp_trace_setevent @TraceID, 45, 12, @on
exec sp_trace_setevent @TraceID, 45, 13, @on
exec sp_trace_setevent @TraceID, 45, 14, @on
exec sp_trace_setevent @TraceID, 45, 22, @on
exec sp_trace_setevent @TraceID, 45, 15, @on
-- XML Statistics Profile
exec sp_trace_setevent @TraceID, 146, 1, @on
exec sp_trace_setevent @TraceID, 146, 51, @on
exec sp_trace_setevent @TraceID, 146, 4, @on
exec sp_trace_setevent @TraceID, 146, 12, @on

-- Filter: Log only events for the provided @@spid
if @spId is not
null
  exec sp_trace_setfilter @TraceID, 12, 0, 0, @spID

-- Set the trace status to start
exec sp_trace_setstatus @TraceID, 1

goto finish

error:
select ErrorCode=@rc

finish:
end

go

Some annotations:

  • For the parameters, see the comments.
  • Don't specify a filename extension for the trace file. .TRC will be added automatically.
  • Ensure that the output file does not already exist. Otherwise you'll get an error.
  • Very often I replace the code for starting and stopping the trace inside "interesting code" inside a stored procedure. That is, I'm wrapping some more or less awkward code by starting and stopping a trace like this:

declare @traceID int
exec
dbo.configureServerSideTrace @traceStatus =
1
                                
,@traceID = @traceID
output
                                 ,@maxFileSize =
10000
                                
,@traceFileName =
N'e:\MyTrace\Test'
                                 ,@spId =
@@spid

--
-- Code of interest
--


exec dbo.configureServerSideTrace @traceStatus = 0, @traceID = @traceID

Cheers.

Calculating SQL Server Data Compression Savings

SQL Server 2008 Enterprise edition comes with an opportunity for storing table or index data in a compressed format which may save huge amount of storage space and - much more important - IO requests and buffer pool utilization. There's two different options for data compression, namely Row and Page level compression. This blog post is not concerned with how these two work internally and will also not explain the differences between the two. If you like to know more about this, you find much of useful information on the internet - including links to further articles (e.g. here, here, and here).

Whether compression is worth or not isn't an easy question to answer. One aspect that has to be taken into account is certainly the amount of storage that may be saved by storing a distinct table or index in any of the two compressed formats. SSMS offers a Data Compression Wizard than can provide storage-saving estimates for row or page level compression. From the context menu for a table or index just open Storage/Manage Compression. In the Combo box at the top select the compression type (Row or Page) and press the Calculate Button at the bottom. Here's a sample of a calculated saving for an index:

image

Unfortunately, SSMS does not offer an option for calculating estimated savings for more than one table or index at once. If you, let's say, would like to know the estimated storage savings of page level compression for you largest 10 tables, there's no GUI support in SSMS that will assist you in finding an answer. This is, where the stored procedure sp_estimate_data_compression_savings comes in handy. This procedure - as you may have guessed from its name - provides estimated savings for row or page level compression for any table or index. You have to provide the table or index as a parameter to the procedure. In other words: The procedure will only calculate the estimations for one table or index at a time. If you want to retrieve the calculations of more than one table or index as a result set, there's some more work to do, since the procedure has to be invoked multiple times. Here's a script that calculates the estimated savings of page level compression for the database in context.

-- Determine the estimated impact of compression
-- NOTE: This script is only for SQL Server Enterprise and Developer edition.

set
nocount on

-- We create a temp table for the result
if (object_id('tempdb..#comp', 'U') is not
null)
drop table #comp
go
create table #comp
(
object_name
sysname
,schema_name
sysname
,index_id
int
,partition_number
int
,[size_with_current_compression_setting (KB)]
bigint
,[size_with_requested_compression_setting (KB)] bigint

,[sample_size_with_current_compression_setting (KB)] bigint

,[sample_size_with_requested_compression_setting (KB)]
bigint
)
go

-- Calculate estimated impact of page level compression for all
-- user-tables and indexes in all schemas.
-- NOTE:
-- 1) To get the estimated impact of row level compression change the last parameter
-- of sp_estimate_data_compression_savings to 'row' instead.
-- 2) We don't care about partitioning here. If this is important for you,
-- you have to modify forth parameter of sp_estimate_data_compression_savings.
-- Please refer to BOL.

declare @cmd nvarchar(max
)
set @cmd =
''
select @cmd =
@cmd
+
';insert #comp exec sp_estimate_data_compression_savings '''
+ schema_name(schema_id)+''','''
+ name + ''',null, null, ''page'''
from sys.
tables
where objectproperty(object_id, 'IsUserTable') = 1
exec (@cmd)

;
-- Do some further calculations for a more meaningful result
with
compressionSavings as
(
select quotename(schema_name) + '.' + quotename(object_name) as
table_name
,
index_id
,
[size_with_current_compression_setting (KB)]
,
[size_with_requested_compression_setting (KB)]
,cast(
case
when [size_with_current_compression_setting (KB)] = 0 then
0
else 100.0*(1.0-
1.0
*
[size_with_requested_compression_setting (KB)]
/[size_with_current_compression_setting (KB)]
)
end as decimal(6,2)) as
[Estimated Savings (%)]
from #comp
)
select cs.
table_name
,isnull(i.name, i.type_desc) as
index_name
,cs.
[size_with_current_compression_setting (KB)]
,cs.
[size_with_requested_compression_setting (KB)]
,cs.[Estimated Savings (%)]

from compressionSavings as
cs
left outer join sys.indexes as
i
on i.index_id = cs.
index_id
and i.object_id = object_id(cs.table_name, 'U'
)
order by cs.[Estimated Savings (%)]
desc

-- Get rid of the temp table
drop
table #comp
go

The script calculates the impact of Page level compression but may easily be adapted to consider Row level compression instead. Please read the comments inside the script. Also, please notice that the script will only run on SQL Server Enterprise and Developer edition. All other editions don't provide the opportunity for data compression.

Here's a partial result retrieved from running the script against the AdventureWorksDW2008R2 database.

image

If you execute the script, please be aware that it may produce some extensive I/O. Running the script against your production database at business hours wouldn't be a very good idea therefore.

SQL Server Start Time

Have you tried finding out the time, your SQL Server instance has been started? There are some sophisticated solutions, like the one from Tracy Hamlin (twitter), which takes advantage of the fact that tempdb is re-created every time, SQL Server starts. Her solution goes like this:

select create_date
from sys.databases where database_id=2

Another answer to the question, I've seen sometimes on the internet queries the login time for any of the system processes:

select login_time
from sys.dm_exec_sessions where session_id=1

This was my preferred way - until yesterday, when I discovered the following simple method:

select sqlserver_start_time
from sys.
dm_os_sys_info

Easy, isn't it? Interestingly though, every of the above three queries yields a different result. Here's a query with a sample output:

select (select sqlserver_start_time
from sys.dm_os_sys_info) as
sql_server_start_time
,(select
create_date
from sys.databases where database_id=2) as
tempdb_creation_time
,(select
login_time
from sys.dm_exec_sessions where session_id=1) as sysprocess_login_time

Result:

image

It seems the SQL Server service must be started first. Only after the service is running, tempdb is created followed by a subsequent start of all sysprocesses. I can't imagine that the diverse three times make any difference in practice, e.g. if you try finding out for how many hours your SQL Server instance is running. But out there may be existing applications that have to be aware of the difference.

Exploring SQL Server Blockings and Timeouts

Last Thursday I was giving a presentation about information collection and evaluation of SQL Server Blockings and Timeouts at the regional PASS chapter meeting in Munich.

You may download the presentation as well as the corresponding scripts here (German only).

Multiple statistics sharing the same leading column

SQL Server will not prevent you from creating identical indexes or statistics. You may think that everything is under your control, but have you ever added a missing index that has been reported by the query optimizer, or the DTA? If so, you will certainly have created duplicate statistics. Have a look at the following example:

-- create test table with 3 columns
-- filler is only nedded to widen the row

create
table T1
(

c1
int not null identity(1,1) primary
key
,c2 int not
null
,filler nchar(200) not null default
'#'
)
go

-- insert 500000 rows with random
-- values between 0 and 49999 for c2

insert T1(c2
)
select top(500000) abs(checksum(newid())) %
50000
from sys.fulltext_system_stopwords as
a
,sys.fulltext_system_stopwords as b

Before invoking the following SELECT command, allow the actual execution plan being displayed (Strg-M or selecting Query/Include Actual Execution Plan from the menu).

-- get some data
select * from T1 where c2 = 100

If you look at the execution plan, you see that the optimizer complains about a missing index on column c2. The prognosticated improvement is about 99%, so adding the index it's certainly a good idea. Let's do it:

-- add missing index
create nonclustered index ix_1 on t1(c2)

Perfect! The query runs much faster now and needs a lot fewer resources. But have a look at the table's statistics:

image

You see three statistics, one for the primary key, a second one for our created index ix_1, and a third one that was automatically created during execution plan generation for the first SELECT statement. This is the statistics, named _WA_Sys.. If the AUTO CREATE STATISTICS option is set to ON, the optimizer will add missing statistics automatically. In our little experiment, the optimizer had to generate this column statistics on column c2 in order to make some assumptions about the number of rows that had to be processed.

And here's the problem: When creating the index on column c2, a statistics on this column is also created, since every index has a corresponding linked statistics. That's just the way it works. At the time the index was added, the column statistics on c2 (that _WA_Sys. statistics) already existed. If you don't remove it manually, this statistics will remain there forever, although it is needless now. All it's good for is to increase maintenance efforts during statistics updates. You can safely remove this statistics by executing:

drop statistics t1._WA_Sys_...

If you didn't think about this before, there's a chance that you'll find some of those superfluous statistics duplicates inside your database(s). Here's a query that finds index-related and column-statistics that match on the first column. Looking for matches on the first column is sufficient here, since the optimizer only automatically adds missing single-column statistics.

with all_stats as
(
select s.object_id, s.name as stats_name, s.stats_id, s.
has_filter
,s.auto_created, sc.
stats_column_id
,sc.column_id, c.name as
column_name
from sys.stats as
s
inner join sys.stats_columns as
sc
on s.stats_id = sc.
stats_id
and s.object_id = sc.
object_id
and sc.stats_column_id =
1
inner join sys.columns as
c
on c.object_id = s.
object_id
and c.object_id = sc.
object_id
and c.column_id = sc.
column_id
where objectproperty(s.object_id, 'IsUserTable') = 1
)
select row_number() over(partition by s1.object_id, s1.column_name order by s1.column_name) as
stats_no
,object_name(s1.object_id) as
table_name
,s1.stats_name as
stats_name
,s2.stats_name as
identical_stats_name
,s2.
column_name
from all_stats as
s1
inner join all_stats as
s2
on s1.object_id = s2.
object_id
and s1.stats_id != s2.
stats_id
and s1.stats_column_id = s2.
stats_column_id
and s1.column_id = s2.
column_id
and s1.has_filter = s2.
has_filter
and s1.auto_created != s2.auto_created

With that query at hand, you may easily find redundant statistics. Here's a sample output:

image

If you'd like to find out more about SQL Server statistics, you may want to check out my series of two articles, published recently on the Simple-Talk platform.

Part 1: Queries, Damned Queries and Statistics
Part 2: SQL Server Statistics: Problems and Solutions

It's free, so you might want to give it a try. Feel free to vote, if you like it!

Did you know: Aggregate functions on floats may be non-deterministic

One day some of the report-users mentioned that, every time they run a report, they get different results. My first idea was that there were some undergoing data changes, probably from a different connection/user, so this would explain it. But it turned out that no modifications were made. Even setting the database to read only did not help. Numbers in reports differed by about 20% with every execution.

Delving into it, I could isolate the problem. It was a single SELECT statement that, when invoked, returned different results. The numbers differed by up to 20% in value without any data changes being performed!

Have a look at the following sample. We create a test table to demonstrate what I'm talking about:

use tempdb
go

if
(object_id('SumTest', 'U') is not null)
drop table SumTest
go
create table SumTest
(
floatVal float not null
,decimalVal decimal(20,4) not null
,filler nchar(300) not null default '#'
)
go

The table has three columns, where the third column only serves the purpose of filling up the row, so the table contains more data pages.

Let's now insert 600000 rows into our table:

declare @x float
set
@x = 1000000000000000.9999

-- Insert 300000 identical rows
insert SumTest(floatVal, decimalVal)
select top(300000) @x, @x
from sys.trace_event_bindings as b1
,sys.trace_event_bindings as b2

-- Again insert 300000 rows.
-- This time with negative sign
insert SumTest(floatVal, decimalVal)
select top(300000) -@x, -@x
from sys.trace_event_bindings as b1
,sys.trace_event_bindings as b2

The first INSERT statement adds 300000 rows with positive values for the two columns floatVal and decimalVal. After that, we insert another 300000 rows, this time with inverse signs. So in total, values for each of the two columns should add up to zero. Let's check this by invoking the summation over all rows a few times:

select sum(floatVal) as SumFloatVal
,sum(decimalVal) as SumDecimalVal
from SumTest
union
select sum(floatVal) as SumFloatVal
,sum(decimalVal) as SumDecimalVal
from SumTest
union
select sum(floatVal) as SumFloatVal
,sum(decimalVal) as SumDecimalVal
from SumTest
union
select sum(floatVal) as SumFloatVal
,sum(decimalVal) as SumDecimalVal
from SumTest
union
select sum(floatVal) as SumFloatVal
,sum(decimalVal) as SumDecimalVal
from SumTest
union
select sum(floatVal) as SumFloatVal
,sum(decimalVal) as SumDecimalVal
from SumTest
union
select sum(floatVal) as SumFloatVal
,sum(decimalVal) as SumDecimalVal
from SumTest
union
select sum(floatVal) as SumFloatVal
,sum(decimalVal) as SumDecimalVal
from SumTest
union
select sum(floatVal) as SumFloatVal
,sum(decimalVal) as SumDecimalVal
from SumTest

And here's the result:

image

As for the DECIMAL column, the outcome is as expected. But look at the totals for the FLOAT column. It's perfectly understandable, the sum will reveal some rounding errors. What really puzzled me is the difference between the numbers. Why isn't the rounding error the same for all executions?

I was pretty sure that I discovered a bug in SQL Server and posted a regarding item on MSFT's connect platform (see here).

Unfortunately nobody cared about my problem, and so I took the opportunity of talking to some fellows of the SQL Server CAT team on the occasion of the 2009 PASS Summit. After a while, I received an explanation which I'd like to repeat here.

The query is executed in parallel, as the plan reveals:

image

When summing up values, usually the summation sequence doesn't matter. (If you remember some mathematics from school that's what the commutative law of addition is about). Therefore, reading values in multiple threads and adding up the values in any arbitrary order is perfect, as the order doesn't have any influence on the result. Well, at least theoretically. When adding float values, there's floating point arithmetic rounding errors with every addition. These added-up rounding errors are the reason for the non-zero values of the float totals in our example. So that's ok, but why different results with almost every execution? The reason for this is parallel execution. Added-up rounding errors depend on the sequence, so the commutative law does not really apply to these errors. There's a chance that the sequence of rows changes with every execution, if the query is executed in parallel. And that's why the results change, dependent only on some butterfly wing movements at the other side of the world.

If we add the MAXDOP 1 query hint, only one thread is utilized and the results are the same for every execution, although rounding errors still remain present. So this query:

select sum(floatVal) as SumFloatVal
,sum(decimalVal) as SumDecimalVal
from SumTest option (maxdop 1)

will be executed by using the following (single thread) execution plan:

image

This time the result (and also the rounding error) is always the same.

Pretty soon after delivering the explanation, the bug was closed. Reason: the observed behavior is "by design".

I can understand that the problem originates from computer resp. processor architecture and MSFT has no chance of control therefore.

Although.

When using SSAS' write back functionality, SSAS will always create numeric columns of FLOAT data types. There's no chance of manipulating the data type; it's always float!

Additionally, SSAS more often than not inserts rows into write back tables with vastly large resp. small values. When looking an these rows, it appeared that they are created solely with the intention of summing up to zero. We discovered plenty of these rows containing inverse values that usually should nullify in total, but apparently don't. By the way that's why closing the bug with the "By Design" explanation makes me somewhat sad.

So, probably avoiding FLOATs is a good idea! Unfortunately, this is simply not possible in all cases and sometimes out of our control.

No automatically created or updated statistics for read-only databases

Ok, that's not really the latest news and also well documented. But keep in mind that database snapshots also fall into the category of read-only databases, so you won't see automatically maintained statistics for these as well.

I've seen the advice of taking a snapshot from a mirrored database for reporting purposes many times. Even books online contains a chapter Database Mirroring and Database Snapshots, where the scenario is explained.

The intention is to decouple your resource intensive reporting queries from the OLTP system which is generally a good idea. But keep in mind that:

  1. Reporting queries are highly unpredictable, and
  2. Reporting queries differ from OLTP queries.

So there's more than only a slightly a chance that query performance will suffer from missing statistics in your reporting snapshot, since the underlying OLTP database simply does not contain all statistics (or indexes) that your reporting applications could take advantage of. These missing statistics can't be added in the snapshot, because it's a read-only database. Additionally, you won't experience any automatic updates of stale statistics in the snapshot. And moreover, any added or updated statistics in the source database are not transferred into the snapshot, of course. This can affect your query performance significantly!

Have a look at the following example:

use master
go

-- Create test database
create database ssBase
go
use
ssBase
go
-- create test table and insert 10000 rows
create table t1
(
c1 int not null primary key
,c2 int not null
)
go
go
insert
t1(c1,c2)
select top 10000
row_number() over(order by current_timestamp)
,checksum(newid())
from sys.fulltext_system_stopwords
go

-- create snapshot of test database
-- NOTE: the filename has to be adjusted
-- for your own experimenting
create database ssBase_SS on
(
name = ssBase
,filename = 'e:\SqlTestData\ssBase_SS.ss'
)
as snapshot of ssBase;
go

The script above creates a test database with one table and adds some rows to the table before creating a snapshot of this database. As we have never used column c2 in any query (besides the INSERT), there won't be any statistics for column t1.c2.

Now let's query the snapshot and have a look at the actual execution plan. Here's the query:

-- Be sure to show the actual execution plan
-- for the following query
use ssBase_SS

select
* from t1
where c2 = 10

Here's the actual execution plan:

image

Clearly evident that the optimizer detects a missing statistics, although the options AUTO CREATE STATISTICS and AUTO UPDATE STATISTICS have both been set to ON. The plan reveals a noticeable difference between the actual and estimated number of rows and also a warning regarding the missing statistics.

So, keep that in mind when using snapshots for reporting applications.

If you'd like to find out more about SQL Server statistics, you may want to check out my series of two articles, published recently on the Simple-Talk platform.

Part 1: Queries, Damned Queries and Statistics
Part 2: SQL Server Statistics: Problems and Solutions

It's free, so you might want to give it a try. Feel free to vote, if you like it!