Continue to Site

Eng-Tips is the largest engineering community on the Internet

Intelligent Work Forums for Engineering Professionals

  • Congratulations SDETERS on being selected by the Eng-Tips community for having the most helpful posts in the forums last week. Way to Go!

eliminate duplicate records

Status
Not open for further replies.

NTrade

Computer
Jan 10, 2007
7
I enjoy sql - but sometimes can't get my head around the puzzle; Have Table1 due to sloppy data entry have duplicates in the Table1:

OrderID Name Address Email
1 Joe xxxx 123@test.com
2 Joe S. x xxx 123@test.com
3 Fred yyy 456@this.com
4 Jim zzz ttt@you.com
5 Joey S x-xxx 123@test.com
6 Jimbo zzz ttt@you.com

The email field is being used as the basis of determining a duplicate and there is about 25 dups per 1000 records or so...(frequently more than 2 dups often up to 4 of the same).

It is presumed the higher OrderID value is the record to be carried forward. Thus I need to return:

OrderID Name Address Email
3 Fred yyy 456@this.com
5 Joey S x-xxx 123@test.com
6 Jimbo zzz ttt@you.com

have been scratching around some sql statements for awhile and would very much welcome input from someone a bit more versant than I...thanks in advance....


 
Replies continue below

Recommended for you

Depends on which SQL you're using. You will be looking for the syntax for TOP or LIMIT in the manual, and doing a query using ORDER BY DESC on the ID field

Good Luck
johnwm
________________________________________________________
To get the best from these forums read faq731-376 before posting

Steam Engine enthusiasts:
 
You might also want to see if "SELECT DISTINCT" may be useful.

Good Luck
--------------
As a circle of light increases so does the circumference of darkness around it. - Albert Einstein
 
thnks both - but am still struggling....would welcome more input....
 
am thinking about this:

SELECT * FROM Table1 As X WHERE X.Email = Table1.Email AND X.OrderID > Table1.OrderID

but it will return only comparable records:
5 Joey S x-xxx 123@test.com
6 Jimbo zzz ttt@you.com

it leaves out:
3 Fred yyy 456@this.com

because the OrderID is not > than any duplicate.....

and if I use >= it will return all records of the table
 
Found a relatively easy solution via 2 step;

SELECT Table1.Email, Max(Table1.[Order ID]) AS [MaxOfOrder ID]
FROM Table1
GROUP BY Table1.Email;

This gave me what I need because the nondups are by definition the max OrderID value of their group of 1...


The second step was just to join the full table with this where the OrderID were identical....
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor