Database Index

According to WIKIPEDIA, a database index is a data structure that improves the speed of data retrieval on a database table at the cost of additional writes and storage space to maintain the index. One of the most used data structure to implement index is B-tree. Generally, there are two types of index: clustered index and non-clustered index.

To look up certain records in a table, it would require O(N) to go through the entire table before finishing the search. With the help of index, it could achieve O(logN) lookup performance.

Clustered index

A clustered index defines how the data will be physically stored in a table by altering the physical orders of data blocks, therefore a table can have at most one clustered index. After declaring the clustered index, data will be sorted by it.

Relations to the primary key

The primary key enforces unique values in a specific column at the logical level. When declaring a primary key, SQL Server enforces this constraint at the physical level by creating a clustered index by default, though this could be changed by explicitly declaring using the non-clustered index.

CREATE TABLE T1 (
	ID INT NOT NULL PRIMARY KEY,
	LastName INT NOT NULL,
	FirstName INT NOT NULL
);

CREATE TABLE T2 (
	ID INT NOT NULL PRIMARY KEY NONCLUSTERED,
	LastName INT NOT NULL,
	FirstName INT NOT NULL
);

Non-clustered index

Non-clustered index stores column values that the index is created and the address of the record (clustered index)

As clustered index sorts the data physically in the table, a non-clustered index is stored in one place while table data is stored in another place. Inside the non-clustered index, only column values that the index is created on will be stored and sorted in a certain order, as well as the address of the record that the column value belongs to.

Since column values will be copied for each non-clustered index, indexing would consume extra space. Moreover, it would decrease the performance of inserts, updates, and deletes to maintain the index structure.

Note that, by default, if clustered is not specified, a non-clustered index will be created.

CREATE INDEX idx_lastname
ON Persons (LastName);

Single Index, Composite Index, and Covering Index

A single index uses one field to build the index, while a composite index uses a concatenation of multiple fields to build the index.

-- single index
CREATE INDEX idx_lastname ON Persons (LastName);
-- composite index
CREATE INDEX idx_lastname_firstname ON Persons (LastName,FirstName);

Note that, for the composite index, it follows the leftmost-prefix rule. Simply speaking, given an index as follows:

CREATE INDEX idx_a_b_c ON foo (a,b,c);

Only searching on a, a,b, and a,b,c could search the index efficiently, not any others.

However, there’s on exception: when a non-clustered index includes all columns in the query, SQL Sever will not need to perform additional lookup through the clustered index to retrieve the data requested. Therefore the following also works.

CREATE INDEX idx_a_b_c ON foo (a,b,c);
SELECT a FROM foo where b = xxx;

Reference

  • https://zhuanlan.zhihu.com/p/40820574
  • https://zhuanlan.zhihu.com/p/23624390
  • https://docs.microsoft.com/en-us/sql/relational-databases/indexes/create-nonclustered-indexes?view=sql-server-ver15#:~:text=By%20default%2C%20a%20nonclustered%20index,does%20not%20include%20XML%20indexes.
  • https://www.sqlshack.com/what-is-the-difference-between-clustered-and-non-clustered-indexes-in-sql-server/#:~:text=A%20clustered%20index%20defines%20the,index%20on%20that%20particular%20column.
  • https://dbadiaries.com/sql-server-covering-index-and-key-lookup#:~:text=A%20covering%20index%20is%20a,it%20is%20a%20faster%20operation.
Subscribe
Notify of
guest
2 Comments
Most Voted
Newest Oldest
Inline Feedbacks
View all comments