This article discusses clustered and non-clustered indexes in SQL Server 2005: how they work and why they are needed.
1. Introduction
We all know that data entered in tables is persisted on the physical drive in the form of database
files. Think about a table, say Customer (for a leading bank in India), that has around 16 million
records. When we try to retrieve records for two or three customers based on their customer id, a naive search reads all
16 million records and compares each one against the supplied customer ids to find a match.
Think about how much time that will take if it is a web application and there are 25 to 30 customers
that want to access their data through the internet. Does the database server do 16 million x 30
comparisons? The answer is no, because all modern databases use the concept of an index.
2. What is an Index
An index is a database object that can be created on one or more columns (a maximum of 16 columns
in combination). When the index is created, SQL Server reads the column(s) and builds a data structure that
minimizes the number of data comparisons. The index improves the performance of data retrieval
but adds some overhead to data modification. Whether an index pays off therefore depends on
how much data retrieval is performed on the table versus how many DML
(Insert, Delete and Update) operations.
In this article, we will see how to create an index. The next two sections are taken from my previous
article because they are required here. If your database already has the changes from those two sections, you can go directly
to section 5.
Note that there are no constraints at present on these tables. We will add the constraints one by one.
Now this column does not allow null values or duplicate values. You can try inserting values that
violate these conditions and see what happens. A table can have only one primary key. Multiple
columns can participate in the primary key. In that case, uniqueness is evaluated across all
the participating columns by combining their values.
5. Clustered Index
The primary key created on the StudId column will, by default, create a clustered index on
the StudId column. A table can have only one clustered index.
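As a rough T-SQL sketch of the same step: the table definition below is an assumption (only the column names StudId, Class, DateOfBirth and PlaceOfBirth appear in this article; the data types and the constraint name PK_Student are hypothetical). If your database already contains the Student table, skip the CREATE TABLE statement.

```sql
-- Hypothetical table definition; only the column names come from the article.
CREATE TABLE Student
(
    StudId       int         NOT NULL,
    Class        int         NULL,
    DateOfBirth  datetime    NULL,
    PlaceOfBirth varchar(50) NULL
);

-- Adding the primary key builds a clustered index on StudId by default.
-- CLUSTERED is spelled out here only to make that explicit.
ALTER TABLE Student
ADD CONSTRAINT PK_Student PRIMARY KEY CLUSTERED (StudId);

-- List the indexes on the table to confirm the clustered index exists.
EXEC sp_helpindex 'Student';
```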
When creating the clustered index, SQL Server 2005 reads the StudId column and forms a binary
tree on it. This binary tree information is then stored separately on disk. Expand the
table Student and then expand the Indexes folder. You will see the following index created for you when
the primary key is created:
With the binary tree, a search for a student based on StudId needs far fewer
comparisons. Let us assume that you had entered the following data in
the table Student:
The index will form the binary tree specified below. Note that a given parent has only one
or two children. The left side will always have a lesser value and the right side will always have a
greater value when compared to the parent. The tree can also be constructed the reverse way, that is,
left side higher and right side lower.
Without the index, the first query returns its value after the third comparison.
Without the index, the second query returns its value at the eighth comparison.
With the index, the first query returns its value at the first comparison.
With the index, the second query returns its value at the third comparison. Look below:
1. Compare 107 vs 103 : Move to right node
2. Compare 107 vs 106 : Move to right node
3. Compare 107 vs 107 : Matched, return the record
If the number of records is small, you will not notice a difference. Now apply this technique to
Yahoo email user accounts stored in a table called, say, YahooLogin. Let us assume there are 33
million users around the world that have a Yahoo email id stored in YahooLogin. When
a user logs in by giving the user name and password, the number of comparisons required is between 1 and 25 with the
binary tree, that is, the clustered index.
Look at the above picture and judge for yourself how fast you will reach level 25. Without the
clustered index, the number of comparisons required is between 1 and 33 million.
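The figure of 25 comes from the height of a balanced binary tree, which is the base-2 logarithm of the number of entries rounded up. A quick way to check the arithmetic, assuming a perfectly balanced tree (the column alias MaxComparisons is just an illustrative name):

```sql
-- LOG() is the natural logarithm, so divide by LOG(2) to get log base 2.
-- 2^24 = 16,777,216 is too small for 33 million entries; 2^25 = 33,554,432 is enough.
SELECT CEILING(LOG(33000000.0) / LOG(2.0)) AS MaxComparisons;   -- returns 25
```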
The above explanation is simplified for easy understanding. In practice, SQL Server uses a B-tree
structure to represent the clustered index.
Got the usage of the clustered index? Let us move on to the non-clustered index.
2. From the displayed dialog, type the index name as shown below and then click on the Add button to
select the column(s) that participate in the index. Make sure the Index type is Non-Clustered.
3. In the select column dialog, place a check mark next to the column Class. This tells SQL Server that we need a non-clustered index on the column Student.Class. You can also combine more than one column to
create the index. Once the column is selected, click on the OK button. You will return to the dialog
shown above with the selected column marked in blue. Our index has only one column. If you
selected more than one column, you can change the order of
the indexed columns using the MoveUp and MoveDown buttons. When you are using a combination of columns, always place the highly
repeated column first and the more unique columns further down the list. For example, let us assume the
correct order for creating the non-clustered index is: Class, DateOfBirth, PlaceOfBirth.
4. Click on the Indexes folder on the right side and you will see that the non-clustered index on the
column Class has been created for you.
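The dialog steps above have a T-SQL equivalent. The following is a minimal sketch; the index names are hypothetical, and the second statement simply follows the column order given in step 3.

```sql
-- Single-column index, equivalent to the dialog steps above.
CREATE NONCLUSTERED INDEX IX_Student_Class
ON Student (Class);

-- Multi-column variant using the column order given in the article.
CREATE NONCLUSTERED INDEX IX_Student_Class_Dob_Pob
ON Student (Class, DateOfBirth, PlaceOfBirth);
```

The order of the columns in the parentheses plays the same role as the MoveUp and MoveDown buttons in the dialog.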
the eligibility for a good book right? Once you are impressed, you want to see your favorite topic,
Regular Expressions, and how it is explained in the book. What will you do? I just peeped at you from
behind and recorded what you did, as below:
1. You went to the Index pages (the index has 25 pages in total). It is already sorted and hence you easily picked
out Regular Expression, which is listed on index page 17.
2. Next, you noted down the numbers displayed next to it, which are 407, 816, 1200-1220.
3. Your first target is page 407. You opened the book in the middle, and the page number was greater than 500.
4. Then you moved to a somewhat lower page, but it read 310.
5. Then you moved to a higher page. You are very lucky, you got exactly page 407. [Yes man you got it.
Otherwise I need to write more. OK?]
6. That's all, you started exploring what is written about Regular Expression on that page, keeping in
mind that you need to find page 816 also.
In the above scenario, the index pages are the non-clustered index and the page numbers are the clustered
index arranged in a binary tree. See how quickly you came to page 407. Your mind actually
traversed the binary tree left and right to reach page 407.
Here, the Class column with distinct values 1, 2, 3 .. 12 will store the clustered index column's value
along with it. For example, let us take only the Class value of 1. The index goes like this:
Performance Monitor
(Tools)
Overview
Windows Performance Monitor or PerfMon is another great tool to capture metrics for your
entire server. So far we discussed DMVs and Profiler which are great tools for finding out
what is occurring within your SQL Server instance, but sometimes there are issues outside
of SQL Server that may be causing performance issues. In this section we will take a look
at PerfMon and how this tool can be used.
Explanation
The Performance Monitor tool allows you to capture and graph many aspects of the
Windows server. There are counters for .NET, Disks, Memory, Processors, Network, etc.,
as well as several counters related to each instance of SQL Server on the box. If you have
multiple instances running on one server, the counters are available for each instance so
you can see what is occurring at the instance level.
Above is the default look and feel when you launch this tool. Here we can see there is one
counter "% Processor Time" that is being tracked. For this counter we can see the
following items:
Last - this is the last value that was captured for this counter
Average - this is the average value for the duration
Minimum - this is the minimum value for the duration
Maximum - this is the maximum value for the duration
Duration - this is the total collection time period and in this case it is 1:40 which is 1
minute and 40 seconds
From this we can tell when there are peaks for specific counters that may be causing
performance issues.
From this window we can select additional counters such as Memory, Physical Disk and SQL
Server specific counters. To add a counter select the counter and click the Add button. The
below screen shot shows multiple counters that have been selected. Click OK when you are
done to start capturing this data.
The other thing you will want to do is change your duration and frequency for collecting
data. By default it is set to sample the data every 1 second for a duration of 100
seconds. To change this, right click on the graph and select Properties and a new window
like the following will appear. If you click on the General tab you can set the sampling
settings as shown below. In addition there are several other properties you can modify in
this window.
Useful Counters
Once you start to explore all of the counters it can be overwhelming since there are so
many to choose from, so here are a few counters that would make sense to begin
capturing. Also, once you start collecting it is also difficult to tell if you have an issue or not
based on the values that are returned. Since there are no hard and fast rules for all
counters the best approach is to capture these values when your system is running fine, so
you can create a baseline. Then you can use these baseline numbers when you start to
capture data. You can find some information online about specific counters and threshold
values for each counter.
Memory
o Available MBytes
Physical Disk
o Avg. Disk sec/Read
o Avg. Disk sec/Write
Processor
o % Processor Time
SQL Server: Buffer Manager
o Page Life Expectancy
o Buffer cache hit ratio
SQL Server: SQL Statistics
o Batch Requests/sec
o Compilations/sec
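The SQL Server specific counters in this list can also be read from inside SQL Server through the sys.dm_os_performance_counters DMV, which may be handy for a quick look without opening PerfMon. This is a sketch rather than a replacement for PerfMon: the Memory, Physical Disk and Processor counters are Windows counters and are not exposed by this view.

```sql
-- Read the SQL Server counters listed above from the DMV.
-- Notes on interpreting cntr_value:
--   * 'Buffer cache hit ratio' must be divided by 'Buffer cache hit ratio base'
--     to get the percentage that PerfMon displays.
--   * The per-second counters are cumulative here, so take two samples and
--     divide the difference by the elapsed seconds.
SELECT RTRIM([object_name]) AS [object_name],
       RTRIM(counter_name)  AS counter_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE ([object_name] LIKE '%Buffer Manager%'
       AND counter_name IN ('Page life expectancy',
                            'Buffer cache hit ratio',
                            'Buffer cache hit ratio base'))
   OR ([object_name] LIKE '%SQL Statistics%'
       AND counter_name IN ('Batch Requests/sec',
                            'SQL Compilations/sec'));
```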
Additional Information
Here are some additional items related to Performance Monitor.
Automate Performance Monitor Statistics Collection for SQL Server and Windows
Setup Performance Monitor to always collect SQL Server performance statistics
Free Microsoft Tools to Help Setup and Maintain PerfMon for SQL Server
Video - Top PerfMon Counters for Analyzing SQL Server Performance Issues
Poster - SQL Server Perfmon Counters of Interest
Overview
Another way to get performance related information from SQL Server is to use the built-in
performance reports. The reports were first introduced with SQL Server 2005 as an add-on,
but are now standard with later versions. The reports provide useful information that can
assist you in determining where your performance bottlenecks may be. The data from these
reports is pulled from DMVs as well as the default trace that is always running.
Explanation
To access the reports, open SSMS, right click on the SQL Server instance name and
select Reports > Standard Reports as shown below.
There are several reports related to performance that can be used to see current activity as
well as historical activity. Here is a list of some of the available reports.
Server Dashboard
Scheduler Health
Memory Consumption
Activity - All Blocking Transactions
Activity - Top Sessions
Activity - Top Connections
Top Connections by Block Transactions Count
Top Transaction by Locks Count
Performance - Batch Execution Statistics
Performance - Object Execution Statistics
Performance - Top Queries by Average CPU Time
Performance - Top Queries by Average IO
Performance - Top Queries by Total CPU Time
Performance - Top Queries by Total IO
Take the time to explore these reports to determine which report best suits your
performance monitoring needs.
Explanation
The Query Execution Plans describe the steps and the order used to access or modify data
in the database. Once you have this information you can identify what parts of the query
are slow.
SQL Server can create execution plans in two ways:
Actual Execution Plan - (CTRL + M) - is created after execution of the query and
contains the steps that were performed
Estimated Execution Plan - (CTRL + L) - is created without executing the query and
contains an approximate execution plan
Execution plans can be presented in these three ways and each option offers benefits over
the other.
Text Plans
Graphical Plans
XML Plans
When beginning to work with execution plans, the graphical plan is usually the easiest place
to start. If your plan is very complex, the text plans are sometimes easier to read.
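Text and XML plans can be requested with SET options instead of the SSMS shortcuts. The sketch below uses the DimProduct table from this tutorial's example database; each SET SHOWPLAN statement has to be the only statement in its batch, which is why the GO separators are there.

```sql
-- Estimated plan as text: the query is compiled but not executed.
SET SHOWPLAN_TEXT ON;
GO
SELECT ProductKey, ProductSubcategoryKey
FROM AdventureWorksDW..DimProduct
WHERE ProductKey < 100;
GO
SET SHOWPLAN_TEXT OFF;
GO

-- Actual plan as XML: the query runs and the plan is returned with the results.
SET STATISTICS XML ON;
GO
SELECT ProductKey, ProductSubcategoryKey
FROM AdventureWorksDW..DimProduct
WHERE ProductKey < 100;
GO
SET STATISTICS XML OFF;
GO
```

SET SHOWPLAN_XML ON gives the estimated plan in XML form in the same way.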
Here is a simple query and its execution plan. To include the Actual Execution Plan press
CTRL + M in the query window and then execute the T-SQL code.
-- query 1
SELECT ProductKey,ProductSubcategoryKey
FROM AdventureWorksDW..DimProduct
WHERE ProductKey<100
-- query 2
SELECT ProductKey,ProductSubcategoryKey
FROM AdventureWorksDW..DimProduct
WHERE Color<>'Silver'
Here we can see that query 1 is doing an Index Scan and query 2 is doing a Clustered Index
Scan. We can also see that query 1 is 3% of the batch and query 2 is 97%. Also, we can
see that SQL Server is recommending that we add a new nonclustered index for query
2. So based on this output we know that query 2 is something that should be addressed.
So you can see that once we have identified what queries are taking a long time using
Profiler we can then look at the query execution plan to determine what needs to be tuned
to make the query perform better. As with most things the more you use execution plans
the easier it gets to identify the issue and what can be done to resolve the issue.
Note that not all execution plans are this simple and sometimes they are very difficult to
read and interpret, so for additional information read this tutorial Graphical Query Plan
Tutorial.
Overview
SQL Server also includes another performance tool called the Database Engine Tuning
Advisor or DTA. This tool allows you to have SQL Server analyze one statement or a batch
of statements that you captured by running a Profiler or server side trace. The tool will then
go through each statement to determine where improvements can be made and then
presents you with options for improvement.
Explanation
The Database Engine Tuning Advisor is basically a tool that helps you figure out whether additional
indexes, as well as partitioning, would be helpful. Here is a summary of the options:
In addition to identifying opportunities for improvement, DTA will also create a T-SQL script
that you can run to actually implement its recommendations.
Here is an example of a query and how we can use DTA to analyze the query and make
recommendations. From within a query window right click and select the DTA option as
shown.
After you select the specific options click on Start Analysis and this will run the DTA tool to
identify any potential improvements.
Here we can see that DTA recommends adding a new index for table DimProduct.
The Database Engine Tuning Advisor can also be launched from within SSMS by clicking on
Tools > Database Engine Tuning Advisor.
Performance Issues
Overview
There are several factors that can degrade SQL Server performance and in this section we
will investigate some of the common areas that can affect performance. We will look at
some of the tools that you can use to identify issues as well as review some possible
remedies to fix these performance issues.
We will cover the following topics:
Blocking
Deadlocks
I/O
CPU
Memory
Role of statistics
Query Tuning Bookmark Lookups
Query Tuning Index Scans
Troubleshooting Blocking
(Performance Issues)
Overview
In order for SQL Server to maintain data integrity for both reads and writes it uses locks, so
that only one process has control of the data at any one time. There are several types of
locks that can be used such as Shared, Update, Exclusive, Intent, etc., and each of these
has a different behavior and effect on your data.
When locks are held for a long period of time they cause blocking, which means one process
has to wait for the other process to finish with the data and release the lock before the
second process can continue. This is similar to deadlocking where two processes are
waiting on the same resource, but unlike deadlocking, blocking is resolved as soon as the
first process releases the resource.
Explanation
As mentioned above, blocking is a result of two processes wanting to access the same data
and the second process needs to wait for the first process to release the lock. This is how
SQL Server works all of the time, but usually you do not see blocking because the time that
locks are held is usually very small.
It probably makes sense that locks are held when updating data, but locks are also used
when reading data. When data is read a Shared lock is used, and when data is updated an Update lock is
taken first and converted to an Exclusive lock before the change is made. A Shared lock allows other
processes that use a Shared lock to access the data as well, but an Exclusive lock does not. Blocking
occurs when one process requests a lock that is incompatible with a lock another process already holds
on the same data.
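To see blocking happen, a small experiment with two query windows can help. The sketch below assumes the AdventureWorks sample database, a ContactID value of 1, and the default READ COMMITTED isolation level without the READ_COMMITTED_SNAPSHOT database option; the change is rolled back at the end, so no data is altered.

```sql
-- Session 1: modify a row inside an open transaction and do not commit yet.
USE AdventureWorks;
BEGIN TRANSACTION;
UPDATE Person.Contact
SET LastName = UPPER(LastName)   -- takes an exclusive lock on the row
WHERE ContactID = 1;

-- Session 2 (run this in a second query window while session 1 is still open):
-- the read requests a Shared lock on the same row and waits.
-- SELECT LastName FROM Person.Contact WHERE ContactID = 1;

-- Session 1: undo the change and release the lock so session 2 can continue.
ROLLBACK TRANSACTION;
```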
Here are various ways you can identify blocking for your SQL Server instance.
sp_who2
In a query window run this command:
sp_who2
This is the output that is returned. Here we can see the BlkBy column that shows SPID 60
is blocked by SPID 59.
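The same blocked-by relationship can be read from the sys.dm_exec_requests DMV, which also shows the wait details and the statement text. This is a sketch of one possible query, not the only way to write it:

```sql
-- List requests that are currently waiting on another session.
SELECT r.session_id,
       r.blocking_session_id,   -- the session holding the conflicting lock
       r.wait_type,
       r.wait_time,             -- milliseconds spent waiting so far
       r.wait_resource,
       t.text AS sql_text       -- text of the batch that is blocked
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```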
Activity Monitor
In SSMS, right click on the SQL Server instance name and select Activity Monitor. In the
Processes section you will see information similar to below. Here we can see similar
information as sp_who2, but we can also see the Wait Time, Wait Type and also the
resource that SPID 60 is waiting for.
Here is the output and we can see the blocking information along with the TSQL commands
that were issued.
Next Steps
If you are faced with a blocking situation, be sure to consider all of your options in the short and
long term. To resolve the immediate issue, you may need to KILL some SPIDs, but to resolve the
underlying issue you may need to change your database design, change your data access, add NOLOCK
hints to particular queries, etc.
Check out these tips to learn more about locking and blocking
o Understanding SQL Server Locking
o Understanding SQL Server Blocking
o Tip category: Locking and Blocking Tips
Overview
A common issue with SQL Server is deadlocks. A deadlock occurs when two or more
processes are waiting on the same resource and each process is waiting on the other
process to complete before moving forward. When this situation occurs and there is no way
for these processes to resolve the conflict, SQL Server will choose one of the processes as the
deadlock victim and roll back that process, so the other process or processes can move
forward.
By default when this occurs, your application may see or handle the error, but there is
nothing that is captured in the SQL Server Error Log or the Windows Event Log to let you
know this occurred. The error message that SQL Server sends back to the client is similar
to the following:
Msg 1205, Level 13, State 51, Line 3
Transaction (Process ID xx) was deadlocked on {xxx} resources with another process
and has been chosen as the deadlock victim. Rerun the transaction.
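Since the message asks the caller to rerun the transaction, one common way to handle error 1205 is a retry loop around the transaction. The following is a minimal sketch using TRY...CATCH; the retry limit of 3, the one second delay and the placeholder UPDATE statement are all assumptions made for illustration.

```sql
DECLARE @retries int;
DECLARE @msg nvarchar(2048);
SET @retries = 3;                 -- hypothetical retry limit

WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        -- Placeholder for the real work of the transaction.
        UPDATE Person.Contact SET LastName = LastName WHERE ContactID = 1;
        COMMIT TRANSACTION;
        BREAK;                    -- success, leave the loop
    END TRY
    BEGIN CATCH
        -- Roll back whatever is left of the failed transaction.
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;

        IF ERROR_NUMBER() = 1205 AND @retries > 1
        BEGIN
            -- Chosen as deadlock victim: wait briefly and try again.
            SET @retries = @retries - 1;
            WAITFOR DELAY '00:00:01';
        END
        ELSE
        BEGIN
            -- Not a deadlock, or out of retries: report the error and stop.
            SET @msg = ERROR_MESSAGE();
            RAISERROR('%s', 16, 1, @msg);
            BREAK;
        END
    END CATCH
END
```

The same pattern can be implemented in application code instead, by checking for error number 1205 and resubmitting the transaction.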
In this tutorial we cover what steps you can take to capture deadlock information and some
steps you can take to resolve the problem.
Explanation
Deadlock information can be captured in the SQL Server Error Log or by using Profiler /
Server Side Trace.
Trace Flags
If you want to capture this information in the SQL Server Error Log you need to enable one
or both of these trace flags.
1204 - this provides information about the nodes involved in the deadlock
1222 - returns deadlock information in an XML format
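The trace flags can be switched on with DBCC TRACEON, as sketched below. Flags enabled this way last only until the instance restarts; to keep them across restarts they would have to be added as -T startup parameters instead.

```sql
-- Turn both deadlock trace flags on for all sessions (-1 means global).
DBCC TRACEON (1204, 1222, -1);

-- Confirm which trace flags are currently active.
DBCC TRACESTATUS (-1);

-- Turn them off again when the investigation is finished.
DBCC TRACEOFF (1204, 1222, -1);
```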
Deadlock graph - Occurs simultaneously with the Lock:Deadlock event class. The
Deadlock Graph event class provides an XML description of the deadlock.
Lock: Deadlock - Indicates that two concurrent transactions have deadlocked each
other by trying to obtain incompatible locks on resources that the other transaction
owns.
Lock: Deadlock Chain - Is produced for each of the events leading up to the
deadlock.
Event Output
In the below image, I have only captured the three events mentioned above.
Click on the Events Extraction Settings and enable this option as shown below.
Overview
There are several things that you can do to improve performance by throwing more
hardware at the problem, but usually the place you get the most benefit from is when you
tune your queries. One common problem that exists is the lack of indexes or incorrect
indexes and therefore SQL Server has to process more data to find the records that meet
the query's criteria. These issues are known as Index Scans and Table Scans.
In this section we will look at how to find these issues and how to resolve them.
Explanation
An index scan or table scan is when SQL Server has to scan the data or index pages to find
the appropriate records. A scan is the opposite of a seek, where a seek uses the index to
pinpoint the records that are needed to satisfy the query. The reason you would want to
find and fix your scans is because they generally require more I/O and also take longer to
process. This is something you will notice with an application that grows over time. When
it is first released performance is great, but over time as more data is added the index
scans take longer and longer to complete.
To find these issues you can start by running Profiler or setting up a server side trace and
look for statements that have high read values. Once you have identified the statements
then you can look at the query plan to see if there are scans occurring.
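If a trace is not available, a rough alternative is to ask the plan cache which statements have done the most reads. Note that this swaps Profiler for the sys.dm_exec_query_stats DMV, so it only covers plans still in cache since the last restart, and the sql_text column holds the whole batch rather than the single statement.

```sql
-- Cached statements ordered by total logical reads.
SELECT TOP 10
       qs.total_logical_reads,
       qs.execution_count,
       qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
       st.text AS sql_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;
```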
Here is a simple query that we can run. First use Ctrl+M to turn on the actual execution
plan and then execute the query.
SELECT * FROM Person.Contact
Here we can see that this query is doing a Clustered Index Scan. Since this table has a
clustered index and there is not a WHERE clause SQL Server scans the entire clustered
index to return all rows. So in this example there is nothing that can be done to improve
this query.
In this next example I created a new copy of the Person.Contact table without a clustered
index and then ran the query.
SELECT * FROM Person.Contact2
Here we can see that this query is doing a Table Scan, so when a table has a Clustered
Index it will do a Clustered Index Scan and when the table does not have a clustered index
it will do a Table Scan. Since this table does not have a clustered index and there is not a
WHERE clause SQL Server scans the entire table to return all rows. So again in this
example there is nothing that can be done to improve this query.
Here we can see that we still get the Clustered Index Scan, but this time SQL Server is
letting us know there is a missing index. If you right click on the query plan and
select Missing Index Details you will get a new window with a script to create the missing
index.
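The missing index information shown in the query plan is also collected in a set of DMVs, so it can be reviewed for the whole database at once. The query below is one possible sketch; like other DMV data, the contents are reset when SQL Server restarts, and the suggestions should be reviewed rather than applied blindly.

```sql
-- Missing index suggestions recorded for the current database.
SELECT mid.statement        AS table_name,
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.user_seeks,        -- how often a seek could have used the index
       migs.avg_user_impact    -- estimated percentage improvement
FROM sys.dm_db_missing_index_details AS mid
INNER JOIN sys.dm_db_missing_index_groups AS mig
        ON mig.index_handle = mid.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats AS migs
        ON migs.group_handle = mig.index_group_handle
WHERE mid.database_id = DB_ID()
ORDER BY migs.avg_user_impact DESC;
```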
We can see that we still have the Table Scan, but SQL Server doesn't offer any suggestions
on how to fix this.
Another thing you could do is use the Database Engine Tuning Advisor to see if it gives you
any suggestions. If I select the query in SSMS, right click and select Analyze Query in
Database Engine Tuning Advisor, the tool starts up and I can select the options and
start the analysis.
Below is the suggestion this tool provides. We can see that it recommends creating a new
index, so using both tools can be beneficial.
Summary
By finding and fixing your Index Scans and Table Scans you can drastically improve
performance especially for larger tables. So take the time to identify where your scans may
be occurring and create the necessary indexes to solve the problem. One thing that you
should be aware of is that too many indexes also cause issues, so make sure you keep a
balance on how many indexes you create for a particular table.
Overview
When we were looking at the index scan and table scan section we were able to eliminate
the scan which was replaced with an index seek, but this also introduced a Key
Lookup which is something else you may want to eliminate to improve performance.
A key lookup occurs when data is found in a non-clustered index, but additional data is
needed from the clustered index to satisfy the query and therefore a lookup occurs. If the
table does not have a clustered index then a RID Lookup occurs instead.
In this section we will look at how to find Key/RID Lookups and ways to eliminate them.
Explanation
The reason you would want to eliminate Key/RID Lookups is because they require an
additional operation to find the data and may also require additional I/O. I/O is one of the
biggest performance hits on a server and any way you can eliminate or reduce I/O is a
performance gain.
So let's take a look at an example query and the query plan. Before we do this we want to
first add the nonclustered index on LastName.
USE [AdventureWorks]
GO
CREATE NONCLUSTERED INDEX [IX_LastName]
ON [Person].[Contact] ([LastName])
GO
Now we can use Ctrl+M to turn on the actual execution plan and run the select.
SELECT * FROM Person.Contact WHERE LastName = 'Russell'
If we look at the execution plan we can see that we have an Index Seek using the new
index, but we also have a Key Lookup on the clustered index. The reason for this is that the
nonclustered index only contains the LastName column, but since we are doing a SELECT *
the query has to get the other columns from the clustered index and therefore we have a
Key Lookup. The other operator we have is Nested Loops, which joins the results from the
Index Seek and the Key Lookup.
So if we change the query as follows and run this again you can see that the Key Lookup
disappears, because the index includes all of the columns.
SELECT LastName FROM Person.Contact WHERE LastName = 'Russell'
Here we can see that we no longer have a Key Lookup and we also no longer have the
Nested Loops operator.
If we run both of these queries at the same time in one batch we can see the improvement
by removing these two operators.
SELECT * FROM Person.Contact WHERE LastName = 'Russell'
SELECT LastName FROM Person.Contact WHERE LastName = 'Russell'
Below we can see that the first statement takes 99% of the batch and the second statement
takes 1%, so this is a big improvement.
This should make sense: since the index includes LastName and that is the only column
used for both the SELECTed columns and the WHERE clause, the index can
handle the entire query. Another thing to be aware of is that if the table has a clustered
index, we can include the clustered index column or columns as well without doing a Key
Lookup, because a nonclustered index stores the clustered index key along with its own columns.
The Person.Contact table has a clustered index on ContactID, so if we include this column in
the query we can still do just an Index Seek.
SELECT ContactID, LastName FROM Person.Contact WHERE LastName = 'Russell'
Here we can see that we only need to do an Index Seek to include both of these columns.
So that's great if that is all you need, but what if you need to include other columns such as
FirstName? If we change the query as follows then the Key Lookup comes back again.
SELECT FirstName, LastName FROM Person.Contact WHERE LastName = 'Russell'
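One way to remove the Key Lookup for this query is to make the index cover it by adding FirstName as an included column. The sketch below assumes the IX_LastName index created earlier in this section still exists, since DROP_EXISTING rebuilds it in place.

```sql
USE [AdventureWorks]
GO
-- Rebuild the existing index so it also carries FirstName at the leaf level.
CREATE NONCLUSTERED INDEX [IX_LastName]
ON [Person].[Contact] ([LastName])
INCLUDE ([FirstName])
WITH (DROP_EXISTING = ON)
GO
-- With FirstName available in the index, this query no longer needs the clustered index.
SELECT FirstName, LastName FROM Person.Contact WHERE LastName = 'Russell'
```

Included columns make the index larger and add maintenance work, so this is worth doing only for columns that queries need frequently.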
Overview
To ensure that data access can be as fast as possible, SQL Server like other relational
database systems utilizes indexing to find data quickly. SQL Server has different types of
indexes that can be created such as clustered indexes, non-clustered indexes, XML indexes
and Full Text indexes.
The benefit of having more indexes is that SQL Server can access the data quickly if an
appropriate index exists. The downside to having too many indexes is that SQL Server has
to maintain all of these indexes which can slow things down and indexes also require
additional storage. So as you can see indexing can both help and hurt performance.
In this section we will focus on how to identify indexes that exist, but are not being used
and therefore can be dropped to improve performance and decrease storage requirements.
Explanation
When SQL Server 2005 was introduced it added Dynamic Management Views (DMVs) that
allow you to get additional insight as to what is going on within SQL Server. One of these
areas is the ability to see how indexes are being used. There are two DMVs that we will
discuss. Note that these views store cumulative data, so when SQL Server is restarted the
counters go back to zero. Be aware of this when monitoring your index usage.
DMV - sys.dm_db_index_operational_stats
This DMV allows you to see insert, update and delete information for various aspects for an
index. Basically this shows how much effort was used in maintaining the index based on
data changes.
If you query the table and return all columns, the output may be confusing. So the query
below focuses on a few key columns. To learn more about the output for all columns you
can check out Books Online.
SELECT OBJECT_NAME(A.[OBJECT_ID]) AS [OBJECT NAME],
I.[NAME] AS [INDEX NAME],
A.LEAF_INSERT_COUNT,
A.LEAF_UPDATE_COUNT,
A.LEAF_DELETE_COUNT
FROM
SYS.DM_DB_INDEX_OPERATIONAL_STATS (db_id(),NULL,NULL,NULL ) A
INNER JOIN SYS.INDEXES AS I
ON I.[OBJECT_ID] = A.[OBJECT_ID]
AND I.INDEX_ID = A.INDEX_ID
WHERE OBJECTPROPERTY(A.[OBJECT_ID],'IsUserTable') = 1
Below we can see the number of Inserts, Updates and Deletes that occurred for each index,
so this shows how much work SQL Server had to do to maintain the index.
DMV - sys.dm_db_index_usage_stats
This DMV shows you how many times the index was used for user queries. Again there are
several other columns that are returned if you query all columns and you can refer to Books
Online for more information.
SELECT OBJECT_NAME(S.[OBJECT_ID]) AS [OBJECT NAME],
I.[NAME] AS [INDEX NAME],
USER_SEEKS,
USER_SCANS,
USER_LOOKUPS,
USER_UPDATES
FROM
SYS.DM_DB_INDEX_USAGE_STATS AS S
INNER JOIN SYS.INDEXES AS I ON I.[OBJECT_ID] = S.[OBJECT_ID] AND
I.INDEX_ID = S.INDEX_ID
WHERE OBJECTPROPERTY(S.[OBJECT_ID],'IsUserTable') = 1
AND S.database_id = DB_ID()
Here we can see seeks, scans, lookups and updates.
The seeks refer to how many times an index seek occurred for that index. A seek is
the fastest way to access the data, so this is good.
The scans refer to how many times an index scan occurred for that index. A scan is
when multiple rows of data had to be searched to find the data. Scans are
something you want to try to avoid.
The lookups refer to how many times the query required data to be pulled from
the clustered index or the heap (a table that does not have a clustered index). Lookups are also
something you want to try to avoid.
The updates refer to how many times the index was updated due to data changes,
which should correspond to the first query above.
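Building on the usage columns above, a query along the following lines could list candidate indexes that are being maintained but never read. It is a sketch under a few assumptions: it looks only at nonclustered indexes (index_id greater than 1), skips primary key and unique constraint indexes because they enforce rules as well as speed up reads, and relies on counters that are reset at every restart, so the server should have been up for a representative period before anything is dropped.

```sql
-- Nonclustered indexes with no seeks, scans or lookups since the last restart.
SELECT OBJECT_NAME(I.[object_id]) AS [OBJECT NAME],
       I.[name]                   AS [INDEX NAME],
       ISNULL(S.user_updates, 0)  AS USER_UPDATES
FROM sys.indexes AS I
LEFT JOIN sys.dm_db_index_usage_stats AS S
       ON S.[object_id] = I.[object_id]
      AND S.index_id    = I.index_id
      AND S.database_id = DB_ID()
WHERE OBJECTPROPERTY(I.[object_id], 'IsUserTable') = 1
  AND I.index_id > 1              -- nonclustered indexes only
  AND I.is_primary_key = 0
  AND I.is_unique_constraint = 0
  AND ISNULL(S.user_seeks, 0) + ISNULL(S.user_scans, 0) + ISNULL(S.user_lookups, 0) = 0
ORDER BY USER_UPDATES DESC;
```

The LEFT JOIN matters here: an index that has never been touched has no row at all in sys.dm_db_index_usage_stats.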
Overview
SQL Server is usually a high I/O activity process and in most cases the database is larger
than the amount of memory installed on a computer and therefore SQL Server has to pull
data from disk to satisfy queries. In addition, since the data in databases is constantly
changing these changes need to be written to disk. Another process that can consume a lot
of I/O is the TempDB database. The TempDB database is a temporary working area for SQL
Server to do such things as sorting and grouping. The TempDB database also resides on
disk and therefore depending on how many temporary objects are created this database
could be busier than your user databases.
Since I/O is such an important part of SQL Server performance you need to make sure your
disk subsystem is not the bottleneck. In the old days this was much easier to do, since
most servers had locally attached storage. These days most SQL Servers use SAN or NAS
storage and, to further complicate things, more and more SQL Servers are running in a
virtualized environment.
Explanation
There are several different methods that can be used to track I/O performance, but as
mentioned above, with SAN / NAS storage and virtualized SQL Server environments this is
getting harder and harder to track, and the rules have changed as far as what should
be tracked to determine if there is an I/O bottleneck. The advantage is that there are
several tools available at both the storage level and the virtual level to aid in performance,
but we will not cover these here.
There are basically two options that you have to monitor I/O bottlenecks: SQL Server DMVs
and Performance Monitor counters. There are other tools as well, but these two options
exist in every SQL Server environment.
DMV - sys.dm_io_virtual_file_stats
This DMV will give you cumulative file stats for each database and each database file,
including both the data and log files. Based on this data you can determine which file is the
busiest from a read and/or write perspective.
The output also includes I/O stall information for reads, writes and total. The I/O stall is the
total time, in milliseconds, that users waited for I/O to be completed on the file. By looking
at the I/O stall information you can see how much time users spent waiting for I/O to
complete.
The data that is returned from this DMV is cumulative, and the counters are reset each time
you restart SQL Server. Since the data is cumulative, you can run
this once and then run the query again in the future and compare the deltas for the two
time periods. If the I/O stalls are high compared to the length of that time period then
you may have an I/O bottleneck.
-- varchar(128) avoids truncating long database names (a bare varchar in CAST defaults to 30 characters)
SELECT
CAST(DB_NAME(a.database_id) AS varchar(128)) AS Database_name,
b.physical_name, *
FROM
sys.dm_io_virtual_file_stats(NULL, NULL) a
INNER JOIN sys.master_files b ON a.database_id = b.database_id AND a.file_id = b.file_id
ORDER BY Database_name
Here is partial output from the above command.
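The delta comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the temp table name #io_snapshot and the five-minute wait are arbitrary choices, not part of the original query.

```sql
-- First sample: save the cumulative counters in a temp table
-- (#io_snapshot is a hypothetical name).
SELECT database_id, file_id, num_of_reads, num_of_writes, io_stall
INTO #io_snapshot
FROM sys.dm_io_virtual_file_stats(NULL, NULL);

-- Wait for the measurement interval (five minutes here, an arbitrary choice).
WAITFOR DELAY '00:05:00';

-- Second sample: subtract the first sample so that only the activity
-- during the interval remains.
SELECT DB_NAME(cur.database_id) AS database_name,
       mf.physical_name,
       cur.num_of_reads - snap.num_of_reads AS reads_in_interval,
       cur.num_of_writes - snap.num_of_writes AS writes_in_interval,
       cur.io_stall - snap.io_stall AS io_stall_ms_in_interval
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS cur
INNER JOIN #io_snapshot AS snap
    ON snap.database_id = cur.database_id AND snap.file_id = cur.file_id
INNER JOIN sys.master_files AS mf
    ON mf.database_id = cur.database_id AND mf.file_id = cur.file_id
ORDER BY io_stall_ms_in_interval DESC;

DROP TABLE #io_snapshot;
```

Files at the top of the result accumulated the most stall time during the interval. Comparing io_stall_ms_in_interval with the 300,000 ms length of the interval gives a rough sense of how much waiting took place.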
Performance Monitor
Performance Monitor is a Windows tool that lets you capture statistics about SQL Server,
memory usage, I/O usage, etc. This tool can be run interactively using the GUI or you can
set it up to collect information behind the scenes which can be reviewed at a later
time. This tool is found in the Control Panel under Administrative Tools.
There are several counters related to I/O and they are located under Physical Disk and
Logical Disk. The Physical Disk performance object consists of counters that monitor hard
or fixed disk drives on a computer. The Logical Disk performance object consists of counters
that monitor logical partitions of hard or fixed disk drives. For the most part, they both
contain the same counters. In most cases you will probably use the Physical Disk
counters. Here is a partial list of the available counters.
Now that storage could be local, SAN, NAS, etc., these two counters are helpful to
see if there is a bottleneck:
Avg. Disk sec/Read is the average time, in seconds, of a read of data from the
disk.
Avg. Disk sec/Write is the average time, in seconds, of a write of data to the disk.
The recommendation is that the values for both of these counters be less than 20ms. When
you capture this data the values are displayed in seconds to three decimal places (0.000),
so a value of 0.050 equals 50ms.
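A comparable per-file average can be approximated from inside SQL Server by dividing the stall time in sys.dm_io_virtual_file_stats by the number of I/O operations. The sketch below is not the Performance Monitor counter itself: it averages over the whole period since SQL Server last started, and the column aliases are assumptions for illustration.

```sql
-- Average stall per read and per write, in milliseconds, for every file.
-- NULL is returned for files that have had no reads or no writes.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       CASE WHEN vfs.num_of_reads = 0 THEN NULL
            ELSE CAST(vfs.io_stall_read_ms AS float) / vfs.num_of_reads
       END AS avg_read_ms,
       CASE WHEN vfs.num_of_writes = 0 THEN NULL
            ELSE CAST(vfs.io_stall_write_ms AS float) / vfs.num_of_writes
       END AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
INNER JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
ORDER BY avg_read_ms DESC;
```

Files whose averages sit well above the 20ms guideline are reasonable candidates for a closer look with the Performance Monitor counters.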
Resource Monitor
Another tool that you can use is the Resource Monitor. This can be launched from Task
Manager or from the Control Panel.
Below you can see the Disk tab that shows current processes using disk, the active disk files
and storage at the logical and physical level. The Response Time (ms) is helpful to see how
long it is taking to service the I/O request.
Overview
SQL Server provides a great tool that allows you to see what statements are running on
your SQL Server as well as collect metrics such as duration, number of reads, number of
writes, the machine that ran the query, etc. This tool is known as Profiler.
Profiler is a GUI based tool that runs a SQL Server trace to capture the metrics listed above
as well as additional data. This data can then be used to determine which of your SQL Server
performance issues are related to your TSQL code. Running a trace without using Profiler is
known as a Server Side Trace. You can create and start the trace using TSQL commands
instead of having to use the GUI.
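A server side trace is built with the sp_trace_* system stored procedures. The script below is a minimal sketch rather than a recommended configuration: the output path C:\Traces\SlowBatches, the choice of the SQL:BatchCompleted event, the selected columns and the 500ms threshold are all assumptions made for illustration.

```sql
DECLARE @TraceID int;
DECLARE @maxfilesize bigint;
DECLARE @on bit;
DECLARE @minduration bigint;

SET @maxfilesize = 50;       -- maximum trace file size in MB
SET @on = 1;
SET @minduration = 500000;   -- 500 ms; trace Duration is stored in microseconds

-- Create the trace definition. The path is hypothetical and the folder must
-- already exist; SQL Server appends the .trc extension automatically.
EXEC sp_trace_create @TraceID OUTPUT, 0, N'C:\Traces\SlowBatches', @maxfilesize, NULL;

-- Event 12 = SQL:BatchCompleted.
-- Columns: 1 = TextData, 12 = SPID, 13 = Duration, 16 = Reads, 17 = Writes.
EXEC sp_trace_setevent @TraceID, 12, 1, @on;
EXEC sp_trace_setevent @TraceID, 12, 12, @on;
EXEC sp_trace_setevent @TraceID, 12, 13, @on;
EXEC sp_trace_setevent @TraceID, 12, 16, @on;
EXEC sp_trace_setevent @TraceID, 12, 17, @on;

-- Filter on Duration (column 13): 0 = AND, 4 = greater than or equal.
EXEC sp_trace_setfilter @TraceID, 13, 0, 4, @minduration;

-- Status 1 starts the trace.
EXEC sp_trace_setstatus @TraceID, 1;

-- Return the trace id so the trace can be stopped later.
SELECT @TraceID AS TraceID;
```

To stop the trace, call sp_trace_setstatus with the returned id and a status of 0, then call it again with a status of 2 to close the trace and remove its definition from the server.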
Explanation
Most people begin by using Profiler to run a trace, because the GUI makes it pretty easy to
get a trace set up and running. Once you understand the advantages of using a server side
trace you will begin to use these more frequently, unless you are troubleshooting an issue
that is occurring at that exact time.
Profiler
The Profiler tool can be launched in one of these ways:
In SSMS, select Tools > SQL Server Profiler from the menu
A Trace Properties window will open and you can click Run to start the trace with the
default settings
Events
A good starting place is to capture just these two events. These will show you all completed
batches and metrics related to the batch. A batch is basically a set of work, like a stored
procedure, that contains multiple statements.
Columns
As far as columns go, just select all columns, and once you see the data that is captured you
can reduce the number of columns you are capturing.
Filters
Filters allow you to further define what is captured. To set filters click on Column Filters.
So if you only want to capture data for a specific process you can filter on SPID as an
example. Another good starting point is to filter on Duration. I like to set the value to 500
to only show statements that take 500ms or longer. Again this is just a starting point.
Once you have the settings you want you can run the trace.
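If the trace output has been saved to a trace file, the same Duration threshold can also be applied afterwards by reading the file with sys.fn_trace_gettable. This is a sketch only; the file path is a hypothetical example and the column list is just one reasonable selection.

```sql
-- Read a saved trace file and keep only events that took 500 ms or longer.
-- Duration is stored in the file in microseconds, so 500 ms = 500000.
SELECT TextData,
       SPID,
       Duration / 1000 AS duration_ms,
       Reads,
       Writes
FROM sys.fn_trace_gettable(N'C:\Traces\SlowBatches.trc', DEFAULT)
WHERE Duration >= 500000
ORDER BY Duration DESC;
```

Passing DEFAULT as the second argument reads the named file and any rollover files that follow it.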
Additional Information
I/O issues may not always be a problem with your disk subsystem. Just because you see a
slowdown or I/O waits occurring does not mean the disks are at fault. There may be other
issues that you need to consider, such as missing indexes, poorly written queries,
fragmentation or out of date statistics. We will cover these topics as well in this tutorial.
Overview
SQL Server is a great platform to get your database application up and running fast. The
graphical interface of SQL Server Management Studio allows you to create tables, insert
data, develop stored procedures, etc. in no time at all. Initially your application runs great
in your production, test and development environments, but as use of the application
increases and the size of your database increases you may start to notice some
performance degradation or worse yet, user complaints.
This is where performance monitoring and tuning come into play. Usually the first signs of
performance issues surface from user complaints. A screen that used to load immediately
now takes several seconds. Or a report that used to take a few minutes to run now takes
an hour. As I mentioned these issues usually arise from user complaints, but with a few
steps and techniques you can monitor these issues and tune accordingly, so that your
database applications are always running at peak performance.
In this tutorial we will cover some of the common issues with performance such as:
deadlocks
blocking
missing and unused indexes
I/O bottlenecks
poor query plans
statistics
wait stats
fragmentation
We will look at basic techniques all DBAs and developers should be aware of to make sure
their database applications are performing at peak performance.
In this section we will look at the following tools to give you an introduction as to what they
are used for and how you can use them to collect performance related data.
Overview
With the introduction of SQL Server 2005, Microsoft introduced Dynamic Management Views
(DMVs) which allow you to get better insight into what is happening in SQL Server. Without
these new tools a lot of the information was unavailable or very difficult to obtain.
DMVs are a great tool to help troubleshoot performance related issues and once you
understand their power they will become a staple for your Database Administration.
Explanation
The DMVs were introduced in SQL 2005 and with each new release, Microsoft has been
adding additional DMVs to help troubleshoot issues. DMVs actually come in two flavors:
DMVs (dynamic management views) and DMFs (dynamic management functions). Together they are
sometimes classified as DMOs (dynamic management objects). The DMVs act just like any
other view where you can select data from them, and the DMFs require values to be passed
to the function just like any other function.
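The difference between the two flavors is easiest to see side by side. The following is a small sketch using sys.dm_exec_requests (a DMV) and sys.dm_exec_sql_text (a DMF); the column selection and aliases are assumptions for illustration.

```sql
-- DMV: queried like any other view.
SELECT session_id, status, command, wait_type, total_elapsed_time
FROM sys.dm_exec_requests
WHERE session_id <> @@SPID;   -- leave out the session running this query

-- DMF: needs an argument. CROSS APPLY passes the sql_handle of each
-- request to the function and returns the statement text for that request.
SELECT r.session_id, t.[text] AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID;
```

CROSS APPLY is the usual way to combine the two, because it calls the function once for each row returned by the view.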
The DMVs are broken down into the following categories:
Here are some of the more useful DMVs that you should familiarize yourself with:
Additional Information
Here are some additional articles about DMVs.
sys.dm_db_index_physical_stats (Transact-SQL)
sys.dm_db_missing_index_columns (Transact-SQL)
sys.dm_db_missing_index_group_stats (Transact-SQL)
sys.dm_db_fts_index_physical_stats (Transact-SQL)
sys.dm_db_persisted_sku_features (Transact-SQL)
sys.dm_db_task_space_usage (Transact-SQL)
sys.dm_db_uncontained_entities (Transact-SQL)
sys.dm_database_copies (Windows Azure SQL Database)
sys.dm_db_objects_impacted_on_version_change (SQL Database)
sys.dm_io_cluster_shared_drives (Transact-SQL)
sys.dm_io_virtual_file_stats (Transact-SQL)
sys.dm_os_memory_pools (Transact-SQL)
sys.dm_os_nodes (Transact-SQL)
sys.dm_os_performance_counters (Transact-SQL)
sys.dm_os_dispatcher_pools (Transact-SQL)
sys.dm_os_process_memory (Transact-SQL)
sys.dm_os_hosts (Transact-SQL)
sys.dm_os_latch_stats (Transact-SQL)
sys.dm_os_loaded_modules (Transact-SQL)
sys.dm_os_memory_brokers (Transact-SQL)
sys.dm_os_memory_cache_clock_hands (Transact-SQL)
sys.dm_os_schedulers (Transact-SQL)
sys.dm_os_stacks (Transact-SQL)
sys.dm_os_sys_info (Transact-SQL)
sys.dm_os_sys_memory (Transact-SQL)
sys.dm_os_tasks (Transact-SQL)
sys.dm_os_virtual_address_dump (Transact-SQL)
sys.dm_os_volume_stats (Transact-SQL)
sys.dm_os_wait_stats (Transact-SQL)
sys.dm_os_waiting_tasks (Transact-SQL)
sys.dm_os_windows_info (Transact-SQL)
sys.dm_os_workers (Transact-SQL)
The following SQL Server Operating System-related dynamic management views are identified
for informational purposes only. They are not supported, and future compatibility is not guaranteed.
sys.dm_os_function_symbolic_name
sys.dm_os_memory_allocations
sys.dm_os_worker_local_storage
sys.dm_os_ring_buffers
sys.dm_os_sublatches
sys.dm_exec_query_memory_grants (Transact-SQL)
sys.dm_exec_query_optimizer_info (Transact-SQL)
sys.dm_exec_query_plan (Transact-SQL)
sys.dm_exec_query_resource_semaphores (Transact-SQL)
sys.dm_exec_cached_plans (Transact-SQL)
sys.dm_exec_cached_plan_dependent_objects (Transact-SQL)
sys.dm_exec_connections (Transact-SQL)
sys.dm_exec_cursors (Transact-SQL)
sys.dm_exec_describe_first_result_set (Transact-SQL)
sys.dm_exec_describe_first_result_set_for_object (Transact-SQL)
sys.dm_exec_plan_attributes (Transact-SQL)
sys.dm_exec_procedure_stats (Transact-SQL)
sys.dm_exec_query_stats (Transact-SQL)
sys.dm_exec_requests (Transact-SQL)
sys.dm_exec_sessions (Transact-SQL)
sys.dm_exec_sql_text (Transact-SQL)
sys.dm_exec_text_query_plan (Transact-SQL)
sys.dm_exec_trigger_stats (Transact-SQL)
sys.dm_exec_xml_handles (Transact-SQL)
sys.dm_resource_governor_workload_groups (Transact-SQL)
sys.dm_repl_schemas
sys.dm_repl_tranhash
sys.dm_repl_traninfo
sys.dm_db_mirroring_auto_page_repair (Transact-SQL)
sys.dm_clr_loaded_assemblies
sys.dm_clr_tasks