Tuesday, July 3, 2007

Retrieving Distinct string values without using Distinct in T-SQL

I don’t know many developers to have not used Distinct in the queries at one time or the other. If you look up distinct in SQL Server Book it is defined as “Returns a set, removing duplicate tuples from a specified set.” True but not quite, when used for string column it also sorts the column alphabetically by default you like it or not.



Recently I came across a scenario where I needed to get distinct string values sorted by date time without sorting it in alphabetical order. I was dumb struck to find none of the SQL Server methods provided direct support to do so. I finally came up with the below mentioned process to achieve this, though would not say that it’s the best way forward, but till I find a better way, I am sharing the best I know.

DECLARE @Table1 TABLE
(
ID INT IDENTITY,
Value VARCHAR(50)
)

DECLARE @Table2 TABLE
(
ID INT IDENTITY,
Value VARCHAR(50)
)




INSERT INTO @Table1(Value)
Values('A')

INSERT INTO @Table1(Value)
Values('Z')

INSERT INTO @Table1(Value)
Values('B')

INSERT INTO @Table1(Value)
Values('F')

INSERT INTO @Table1(Value)
Values('A')

INSERT INTO @Table1(Value)
Values('F')

INSERT INTO @Table1(Value)
Values('N')

INSERT INTO @Table1(Value)
Values('C')

INSERT INTO @Table1(Value)
Values('N')

INSERT INTO @Table1(Value)
Values('B')

DECLARE @LOOP
INT
DECLARE @COUNT INT
SET @LOOP =1
SELECT @COUNT = Count(*) FROM @TABLE1
WHILE @COUNT>=@Loop
BEGIN
DECLARE @VALUE VARCHAR(50)
SELECT @VALUE = VALUE From @TABLE1 Where
mailto:ID=@Loop

IF NOT EXISTS (SELECT * FROM @TABLE2 Where VALUE = @VALUE)
BEGIN
INSERT INTO @TABLE2(VALUE)
VALUES (@VALUE)
END
SET @Loop = @Loop +1
END

SELECT * FROM @TABLE2




kick it on DotNetKicks.com


Please provide comments on the method defined. Also please enlighten me with any other approach that I have over looked.

2 comments:

EE wanna-be said...

Since you're using a temp table you can just delete the duplicates, using code similar to this blog entry:

http://blog.sqlauthority.com/2007/03/01/sql-server-delete-duplicate-records-rows/

It appears to be more efficient than copying the rows into a second temporary table. I did limited testing so I won't swear that it's always more efficient.

HemantG said...

I thought of that, but you see that does not work when you have no unique key and other wise that would lead to excesive looping, we will have to loop throught the table multiple times