The introduction of ROW_NUMBER() function with SQL SERVER 2005 solves the problem of calculating row numbers. This is one of the exciting feature and eliminates the problem of creating additional temp tables to achieve the result.
ROW_NUMBER function generates unique incrementing integers from 1 and on. They are only allowed in the SELECT and ORDER BY clauses of a query and not allowed in the query WHERE clause.
Syntax:
ROW_NUMBER ( ) OVER ( [ PARTITION BY value_expression , ... [ n ] ] order_by_clause )
Basics:
- OVER – You define a set of rows for the function to work with per each underlying row or when using empty parentheses, the OVER clause represents the entire underlying query’s result set
- PARTITION BY – The argument to partition the query result set by the column name, the row number will reset based on the value changing in the columns supplied. It is like the using the GROUP BY clause
- ORDER BY – orders the rows in each partition by the column name
Coding Example
Simple Table
AnimalID | Name | Type |
100 | Camel | Mammal |
101 | Turkey | Bird |
102 | Cat | Mammal |
103 | Deer | Mammal |
104 | Pigeon | Bird |
105 | Duck | Bird |
Query: SELECT AnimalID, Name, Type, ROW_NUMBER() OVER (Order By Name) as RowID FROM Animal Result:
AnimalID | Name | Type | RowID |
100 | Camel | Mammal | 1 |
102 | Cat | Mammal | 2 |
103 | Deer | Mammal | 3 |
105 | Duck | Bird | 4 |
104 | Pigeon | Bird | 5 |
101 | Turkey | Bird | 6 |
Query: SELECT AnimalID, Name, Type, ROW_NUMBER() OVER(PARTITION BY Type ORDER BY Name) as RowID FROM Animal ORDER BY Type, Name Result:
AnimalID | Name | Type | RowID |
105 | Duck | Bird | 1 |
104 | Pigeon | Bird | 2 |
101 | Turkey | Bird | 3 |
100 | Camel | Mammal | 1 |
102 | Cat | Mammal | 2 |
103 | Deer | Mammal | 3 |
Difference between ROW_NUMBER(), RANK(), DENSE_RANK()
- ROW_NUMBER(): This one generates a new row number for every row, regardless of duplicates within a partition.
- RANK(): This one generates a new row number for every distinct row, leaving gaps between groups of duplicates within a partition.
- DENSE_RANK(): This one generates a new row number for every distinct row, leaving no gaps between groups of duplicates within a partition.
Coding Example
WITH T(StyleID, ID) AS (SELECT 1,1 UNION ALL SELECT 1,1 UNION ALL SELECT 1,1 UNION ALL SELECT 1,2) SELECT *, RANK() OVER(PARTITION BY StyleID ORDER BY ID) AS 'RANK', ROW_NUMBER() OVER(PARTITION BY StyleID ORDER BY ID) AS 'ROW_NUMBER', DENSE_RANK() OVER(PARTITION BY StyleID ORDER BY ID) AS 'DENSE_RANK' FROM T
StyleID | ID | RANK | ROW_NUMBER | DENSE_RANK |
---|---|---|---|---|
1 | 1 | 1 | 1 | 1 |
1 | 1 | 1 | 2 | 1 |
1 | 1 | 1 | 3 | 1 |
1 | 2 | 4 | 4 | 2 |
Filtering using the ROW_NUMBER()
As mentioned earlier in the post that you cannot have ROW_NUMBER() in the WHERE clause of your sql query but if you want to filter the query using the ROW_NUMBER() there is a workaround to create a nested query. Such as:
Select from ( Select row_number() over (partition by x order by x) SrNo,* from x )A where A.SrNo=1
Issues with ROW_NUMBER()
While it is exciting to use ROW_NUMBER() function but recently in one of my query it hurt my performance almost by 3 fold. ROW_NUMBER() are faster when using with simple queries with large number of rows. In my experience I was joining several tables with INNER JOIN and LEFT JOINS it cost me too much on performance. To solve this issue, I had to declare a temporary table and insert all the records and used temp tables to query.
Performance Result
Description | CPU Time | Elapsed Time | Rows |
Using ROW_NUMBER() | 32141 ms | 32608 ms | 837 |
Using Temp Table | 1169 ms | 1314 ms | 837 |
Happy Quering!!!