Using ROW_NUMBER() with MS SQL SERVER

The introduction of ROW_NUMBER() function with SQL SERVER 2005 solves the problem of calculating row numbers. This is one of the exciting feature and eliminates the problem of creating additional temp tables to achieve the result.

ROW_NUMBER function generates unique incrementing integers from 1 and on. They are only allowed in the SELECT and ORDER BY clauses of a query and not allowed in the query WHERE clause.

Syntax:

ROW_NUMBER ( )
 OVER ( [ PARTITION BY value_expression , ... [ n ] ] order_by_clause )

Basics:

  • OVER – You define a set of rows for the function to work with per each underlying row or when using empty parentheses, the OVER clause represents the entire underlying query’s result set
  • PARTITION BY – The argument to partition  the query result set by the column name, the row number will reset based on the value changing in the columns supplied. It is like the using the GROUP BY clause
  • ORDER BY – orders the rows in each partition by the column name

Coding Example

Simple Table

AnimalID Name Type
100 Camel Mammal
101 Turkey Bird
102 Cat Mammal
103 Deer Mammal
104 Pigeon Bird
105 Duck Bird
Query:
SELECT AnimalID, Name, Type, ROW_NUMBER() OVER (Order By Name) as RowID
FROM Animal

Result:
AnimalID Name Type RowID
100 Camel Mammal 1
102 Cat Mammal 2
103 Deer Mammal 3
105 Duck Bird 4
104 Pigeon Bird 5
101 Turkey Bird 6
Query: 
SELECT AnimalID, Name, Type, ROW_NUMBER() OVER(PARTITION BY Type ORDER BY Name) as RowID 
FROM Animal
ORDER BY Type, Name

Result:
AnimalID Name Type RowID
105 Duck Bird 1
104 Pigeon Bird 2
101 Turkey Bird 3
100 Camel Mammal 1
102 Cat Mammal 2
103 Deer Mammal 3

Difference between ROW_NUMBER(), RANK(), DENSE_RANK()

  • ROW_NUMBER(): This one generates a new row number for every row, regardless of duplicates within a partition.
  • RANK(): This one generates a new row number for every distinct row, leaving gaps between groups of duplicates within a partition.
  • DENSE_RANK(): This one generates a new row number for every distinct row, leaving no gaps between groups of duplicates within a partition.

Coding Example

WITH T(StyleID, ID)
     AS (SELECT 1,1 UNION ALL
         SELECT 1,1 UNION ALL
         SELECT 1,1 UNION ALL
         SELECT 1,2)
SELECT *,
       RANK() OVER(PARTITION BY StyleID ORDER BY ID)       AS 'RANK',
       ROW_NUMBER() OVER(PARTITION BY StyleID ORDER BY ID) AS 'ROW_NUMBER',
       DENSE_RANK() OVER(PARTITION BY StyleID ORDER BY ID) AS 'DENSE_RANK'
FROM   T

 

StyleID ID RANK ROW_NUMBER DENSE_RANK
 1  1  1  1  1
 1 1  1  2  1
 1  1  1  3  1
 1  2  4  4  2

Filtering using the ROW_NUMBER()

As mentioned earlier in the post that you cannot have ROW_NUMBER() in the WHERE clause of your sql query but if you want to filter the query using the ROW_NUMBER() there is a workaround to create a nested query. Such as:

Select from ( Select row_number() over (partition by x order by x) SrNo,* from x )A where A.SrNo=1

Issues with ROW_NUMBER()

While it is exciting to use ROW_NUMBER() function but recently in one of my query it hurt my performance almost by 3 fold. ROW_NUMBER() are faster when using with simple queries with large number of rows. In my experience I was joining several tables with INNER JOIN and LEFT JOINS it cost me too much on performance. To solve this issue, I had to declare a temporary table and insert all the records and used temp tables to query.

Performance Result

Description CPU Time Elapsed Time Rows
Using ROW_NUMBER() 32141 ms 32608 ms 837
Using Temp Table 1169 ms 1314 ms 837

 

Happy Quering!!!