=tg= Thomas Grohser - Latest Comments on increasing the performance of count(*)

Christoph Ingenhaag in response to: increasing the performance of count(*)

Thu, 24 Feb 2011 18:25:46 +0000

An indexed view is another option. With 1,100,000 rows in MyTable, a select makes 957 logical reads on my system using the IX_ID index. A select on MyView (code follows) makes 2 logical reads:

    create view dbo.MyView with schemabinding as
    select count_big(*) as cnt from dbo.MyTable
    go
    create unique clustered index cuidx on MyView(cnt)
    go
    select cnt from MyView with (noexpand)

Interestingly, the noexpand hint is necessary with more than 1,000,000 rows in MyTable on my system (with Express editions you need this hint). Also, the inserts are faster without the IX_ID index, and the update of the indexed view costs almost nothing. To check this I used the numbers function from Steve Kass (http://stevekass.com/2006/06/03/how-to-generate-a-sequence-on-the-fly/) and this statement:

    insert into MyTable(Payload)
    select replicate('ABC', 100) from dbo.numbers(1, 100000)

Please check the plan. Maybe I have overlooked something.

admin in response to: increasing the performance of count(*)

Wed, 26 Jan 2011 19:33:36 +0000

LOL. Agreed, for tables of that size I guess it is really not worth departing from the "standard" way of doing things. :-)

tgrohser in response to: increasing the performance of count(*)

Wed, 26 Jan 2011 14:53:15 +0000

The actual problem involved tables of about 0 to 2000 rows, so nothing huge, and the exact count was not 100% relevant. We thought about querying the system tables too, but we found that for tables of this size it was less work for SQL Server to count the rows itself than to find the right object and the corresponding partitions and then read the result. Sure, there is overhead for the extra index, but the counting was done much more often than inserts. For large tables I totally agree: the system tables are the much better way to go.

admin in response to: increasing the performance of count(*)

Wed, 26 Jan 2011 13:22:49 +0000

If I had to perform a COUNT(*) constantly on a large table, I would probably revise this strategy and question the requirement altogether. Even with a tailored index just to support that query, the actual work still has to be carried out by SQL Server, and I wouldn't be surprised if the query still runs like a dog. However, if the accuracy of the COUNT(*) isn't important, say, if you want to use it for some kind of paging, to track growth over time, or for anything else where you can live with a more or less "good approximation", it might be an option to get the row count from the system tables such as sys.partitions. Of course, with the usual caveat that system tables can change over time...
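
A minimal sketch of the system-table approach, assuming the table is called dbo.MyTable (the name is only illustrative, borrowed from the indexed-view comment above). It sums the rows column of sys.partitions, which is documented as an approximate number; sys.dm_db_partition_stats exposes a similar row_count column if that view is preferred.

```sql
-- Approximate row count for one table, read from metadata instead of scanning.
-- dbo.MyTable is an assumed, illustrative table name.
SELECT SUM(p.rows) AS approx_row_count
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.MyTable')
  AND p.index_id IN (0, 1);  -- 0 = heap, 1 = clustered index; skips nonclustered indexes
```

The SUM matters for partitioned tables, where each partition has its own row in sys.partitions. Restricting index_id to 0 or 1 avoids counting the same rows once per nonclustered index.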