Return all column names for a duplicate value in a table

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP



Return all column names for a duplicate value in a table



I would like to know is it possible to return all value for duplicate column with same ID value for oracle sql



My table design would below



Table A


Name ID Order Year
------ ------ ------- ------
JOHN 1 ORD123 2017
JAKE 2 ORD122 2018
JES 2 ORD111 2017
JOHN 3 ORD323 2012
NICK 4 ORD133 2011
AMY 4 ORD222 2010
MUS 4 ORD132 2010



I want the result of the query to be as below


Name ID Order Year
------ ------ ------- ------
JAKE 2 ORD122 2018
JES 2 ORD111 2017
NICK 4 ORD133 2011
AMY 4 ORD222 2010
MUS 4 ORD132 2010





Sure thing it is possible, you could do it with a subselect or a group by having count('X') > 1. Anyway this is a common problem and is pretty much already answered in stackoverflow
– bradbury9
Aug 10 at 7:26


group by having count('X') > 1





I would suggest you marking the question that you think is best as accepted If you hadn't tried I would encourage give mt0's answer a go. ATM all three answers should solve your problem.
– bradbury9
Aug 10 at 9:06






yeah u right alll was working... i could only choose 1 option that kind sad... as all 3 was good. thanks guys
– james
Aug 10 at 9:09





IMHO, the purpose of the accepted answer is flag an answer as "most helpful/the best answer to the original poster/OP". That way is easy to differentiate from a group of related questions those with a working answer and go directly to the best approach/answer. If a question has multiple working/useful answers or not usefull answers, those can be detected with the upvotes or downvotes.
– bradbury9
Aug 10 at 10:19




3 Answers
3



You can use an analytic function to do it in a single table scan:



SQL Fiddle



Oracle 11g R2 Schema Setup:


CREATE TABLE TableA ( Name, ID, "Order", Year ) AS
SELECT 'JOHN', 1, 'ORD123', 2017 FROM DUAL UNION ALL
SELECT 'JAKE', 2, 'ORD122', 2018 FROM DUAL UNION ALL
SELECT 'JES', 2, 'ORD111', 2017 FROM DUAL UNION ALL
SELECT 'JOHN', 3, 'ORD323', 2012 FROM DUAL UNION ALL
SELECT 'NICK', 4, 'ORD133', 2011 FROM DUAL UNION ALL
SELECT 'AMY', 4, 'ORD222', 2010 FROM DUAL UNION ALL
SELECT 'MUS', 4, 'ORD132', 2010 FROM DUAL;



Query 1:


SELECT Name, ID, "Order", Year
FROM (
SELECT t.*,
COUNT(*) OVER ( PARTITION BY id ) AS num_duplicates
FROM tableA t
)
WHERE num_duplicates > 1



Results:


| NAME | ID | Order | YEAR |
|------|----|--------|------|
| JAKE | 2 | ORD122 | 2018 |
| JES | 2 | ORD111 | 2017 |
| NICK | 4 | ORD133 | 2011 |
| AMY | 4 | ORD222 | 2010 |
| MUS | 4 | ORD132 | 2010 |



Using IN ( SELECT ... GROUP BY id HAVING COUNT(*) > 1 ) or an aggregation in a self-join will require two table/index scans.


IN ( SELECT ... GROUP BY id HAVING COUNT(*) > 1 )



Update



if i have two condition, is it posisble also, maybe id and year?



Query 2: Just add the additional columns to the PARTITION BY clause:


PARTITION BY


SELECT Name, ID, "Order", Year
FROM (
SELECT t.*,
COUNT(*) OVER ( PARTITION BY id, year ) AS num_duplicates
FROM tableA t
)
WHERE num_duplicates > 1



Results:


| NAME | ID | Order | YEAR |
|------|----|--------|------|
| AMY | 4 | ORD222 | 2010 |
| MUS | 4 | ORD132 | 2010 |





if i have two condition, is it posisble also, maybe id and year?
– james
Aug 10 at 8:47






Adding the year into the equation it would be COUNT(*) OVER ( PARTITION BY id, YEAR) AS num_duplicates. That way the analytic function would reset (would recount) for each ID-YEAR pair.
– bradbury9
Aug 10 at 8:55


COUNT(*) OVER ( PARTITION BY id, YEAR) AS num_duplicates





@james yes, updated with an example
– MT0
Aug 10 at 8:57





thanks for all you in making me understand and helping me
– james
Aug 10 at 8:59



Have a sub-query to return id's that exists more than once.


select *
from tableA
where id in (select id from tableA
group by id
having count(*) > 1)





performance wise, would this be better than inner join?
– bradbury9
Aug 10 at 7:28





@bradbury9, I did ask myself the same question. Actually I don't know if Oracle optimizes the queries different or not. Someone can perhaps check the execution plan.
– jarlh
Aug 10 at 7:32





I think it could be pretty similar or the same, one table scan for the duplicated ID's and the table scan for the data retrieval. For readability I prefer a join.
– bradbury9
Aug 10 at 7:37






can i know why count(*) instead of coun(id)?
– james
Aug 10 at 7:41





@james, count(*) counts rows (for each id). count(id) counts non-null id values (for each id.) Same result in this case.
– jarlh
Aug 10 at 7:43


count(*)


count(id)



You can do it with either a subquery in the where clause or with a join between the duplicated ID's and the main table. For the subquery approach check jarlh's answer.



I think that for large tables you shuld make sure the table is correctly indexed.


select
a.*
from
(
select id from tableA group by id having count(*) > 1
) dupes
inner join TableA a dupes on a.id = dupes.id





can i know why count(*) instead of coun(id)?
– james
Aug 10 at 7:41





this is better for performance part since it a large amount very large amount of recrods in the table?
– james
Aug 10 at 7:42






@james Long time ago it did matter, nowadays it is not a problem performance wise. You could also count('X') with the same results.
– bradbury9
Aug 10 at 7:43


count('X')





ic... so the count no matter can be anything right?
– james
Aug 10 at 7:44






@james the result could vary if the column admits null values. Performance wise, doesnt matter. I like to count('X') or count(*) to avoid referencing columns that could change its name in the future and break the SQL
– bradbury9
Aug 10 at 7:47


count('X')


count(*)






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

Firebase Auth - with Email and Password - Check user already registered

Dynamically update html content plain JS

How to determine optimal route across keyboard