How to delete heavy records from a table in the most performant way?




This is the case: I have a parent table with 16,000 rows and a child table with 4,000,000 rows. The parent table has a column that holds a lot of data (a WKT string, used for geometry). I need to clean up the data periodically, and at the moment I need to delete 5,685 parent rows along with 1,400,000 child rows. I'm struggling to write the most performant query to achieve this. My current method is:



1) Get the Ids of all parent rows that need to be deleted.



SELECT Id, ValidTo FROM ParentTable WHERE ValidTo < someDate;



2) For each Id found, I execute the following commands:



DELETE FROM ChildTable WHERE ParentId = IdFromStepOne;



DELETE FROM ParentTable WHERE Id = IdFromStepOne;



This takes 15 minutes for 95-100 records, so the full run would take about 14 hours. Can this be written to perform better?
For your information, I'm coding in .NET Core and using Entity Framework.



Thanks in advance!





Which dbms are you using?
– jarlh
Aug 10 at 7:35





Do the DELETEs in reasonably sized transactions, perhaps 1,000 or 10,000 rows per transaction.
– jarlh
Aug 10 at 7:36
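
A minimal sketch of the batching idea from the comment above, in Python with the standard sqlite3 module. SQLite is only a stand-in here because the question does not name the DBMS; `make_demo_db`, `delete_in_batches`, and the index name are hypothetical, and the default batch size of 500 is chosen to stay under SQLite's bound-parameter limit on older builds rather than taken from the thread.

```python
import sqlite3

def make_demo_db():
    """Tiny in-memory stand-in for the real tables (10 parents, 3 children each)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ParentTable (Id INTEGER PRIMARY KEY, ValidTo TEXT)")
    conn.execute("CREATE TABLE ChildTable (Id INTEGER PRIMARY KEY, ParentId INTEGER)")
    conn.execute("CREATE INDEX IX_Child_ParentId ON ChildTable (ParentId)")
    for i in range(1, 11):
        conn.execute("INSERT INTO ParentTable VALUES (?, ?)", (i, f"2020-01-{i:02d}"))
        conn.executemany("INSERT INTO ChildTable (ParentId) VALUES (?)", [(i,)] * 3)
    conn.commit()
    return conn

def delete_in_batches(conn, cutoff, batch_size=500):
    """Delete expired parents and their children, one transaction per batch."""
    total = 0
    while True:
        # Pick the next batch of expired parent Ids.
        ids = [row[0] for row in conn.execute(
            "SELECT Id FROM ParentTable WHERE ValidTo < ? LIMIT ?",
            (cutoff, batch_size))]
        if not ids:
            return total
        marks = ",".join("?" * len(ids))
        with conn:  # commits on success, rolls back on error
            # Children first, then the parents they belong to.
            conn.execute(f"DELETE FROM ChildTable WHERE ParentId IN ({marks})", ids)
            conn.execute(f"DELETE FROM ParentTable WHERE Id IN ({marks})", ids)
        total += len(ids)

conn = make_demo_db()
deleted = delete_in_batches(conn, "2020-01-06", batch_size=2)
parents_left = conn.execute("SELECT COUNT(*) FROM ParentTable").fetchone()[0]
children_left = conn.execute("SELECT COUNT(*) FROM ChildTable").fetchone()[0]
print(deleted, parents_left, children_left)
```

Each batch commits on its own, so a failure part-way through leaves the earlier batches deleted and the run can simply be restarted.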




2 Answers



Your query shows that you are looping through each Id and deleting the child and parent rows one at a time.



Use an IN clause to do it for multiple values at once.


DELETE FROM ChildTable WHERE ParentId in (SELECT Id From ParentTable Where ValidTo < someDate)

DELETE FROM ParentTable WHERE Id in (SELECT Id From ParentTable Where ValidTo < someDate)
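
The two statements have to run in this order: the subquery reads ParentTable, so deleting the parents first would leave nothing for the child delete to match. A rough sketch of running both in one transaction, using Python's sqlite3 as a stand-in for the unnamed DBMS (`make_demo_db` and `delete_expired` are hypothetical helper names, and the demo data is made up):

```python
import sqlite3

def make_demo_db():
    """Tiny in-memory stand-in for the real tables (10 parents, 3 children each)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ParentTable (Id INTEGER PRIMARY KEY, ValidTo TEXT)")
    conn.execute("CREATE TABLE ChildTable (Id INTEGER PRIMARY KEY, ParentId INTEGER)")
    conn.execute("CREATE INDEX IX_Child_ParentId ON ChildTable (ParentId)")
    for i in range(1, 11):
        conn.execute("INSERT INTO ParentTable VALUES (?, ?)", (i, f"2020-01-{i:02d}"))
        conn.executemany("INSERT INTO ChildTable (ParentId) VALUES (?)", [(i,)] * 3)
    conn.commit()
    return conn

def delete_expired(conn, cutoff):
    """Run both set-based deletes in a single transaction, children first."""
    with conn:  # commits both statements together, or rolls both back
        children = conn.execute(
            "DELETE FROM ChildTable WHERE ParentId IN "
            "(SELECT Id FROM ParentTable WHERE ValidTo < ?)", (cutoff,)).rowcount
        parents = conn.execute(
            "DELETE FROM ParentTable WHERE Id IN "
            "(SELECT Id FROM ParentTable WHERE ValidTo < ?)", (cutoff,)).rowcount
    return parents, children

conn = make_demo_db()
parents_deleted, children_deleted = delete_expired(conn, "2020-01-06")
print(parents_deleted, children_deleted)
```

This replaces thousands of round trips with two statements. Whether one large transaction is acceptable for 1,400,000 child rows depends on the DBMS and its log settings.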



Because rows must be deleted from two tables, you need two queries. The SELECT subquery only needs to return the Id column, not ValidTo.





I would write these queries:


DELETE FROM ChildTable ct
WHERE EXISTS (SELECT pt.Id FROM ParentTable pt WHERE ct.ParentId = pt.Id AND pt.ValidTo < someDate);

DELETE FROM ParentTable
WHERE ValidTo < someDate;



Using PL/SQL, you should be able to select the ParentTable Ids to delete only once and reuse that result for both deletes.


Query1 => SELECT Id FROM ParentTable WHERE ValidTo < someDate
Query2 => DELETE FROM ChildTable WHERE ParentId IN [results of Query 1]
Query3 => DELETE FROM ParentTable WHERE Id IN [results of Query 1]
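
One way to approximate the three-query outline without PL/SQL is to snapshot the Ids into a temporary table and point both deletes at it. This is only a sketch under assumptions: Python's sqlite3 stands in for the unnamed DBMS, and `ExpiredIds`, `make_demo_db`, and `delete_with_id_snapshot` are hypothetical names that do not come from the answer.

```python
import sqlite3

def make_demo_db():
    """Tiny in-memory stand-in for the real tables (10 parents, 3 children each)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ParentTable (Id INTEGER PRIMARY KEY, ValidTo TEXT)")
    conn.execute("CREATE TABLE ChildTable (Id INTEGER PRIMARY KEY, ParentId INTEGER)")
    conn.execute("CREATE INDEX IX_Child_ParentId ON ChildTable (ParentId)")
    for i in range(1, 11):
        conn.execute("INSERT INTO ParentTable VALUES (?, ?)", (i, f"2020-01-{i:02d}"))
        conn.executemany("INSERT INTO ChildTable (ParentId) VALUES (?)", [(i,)] * 3)
    conn.commit()
    return conn

def delete_with_id_snapshot(conn, cutoff):
    """Evaluate the ValidTo filter once, then reuse the Ids for both deletes."""
    conn.execute("CREATE TEMP TABLE ExpiredIds (Id INTEGER PRIMARY KEY)")
    with conn:  # one transaction for the snapshot and both deletes
        # Query 1: run the ValidTo filter a single time.
        snapshot = conn.execute(
            "INSERT INTO ExpiredIds SELECT Id FROM ParentTable WHERE ValidTo < ?",
            (cutoff,)).rowcount
        # Query 2: children of the snapshotted parents.
        conn.execute("DELETE FROM ChildTable WHERE ParentId IN (SELECT Id FROM ExpiredIds)")
        # Query 3: the parents themselves.
        conn.execute("DELETE FROM ParentTable WHERE Id IN (SELECT Id FROM ExpiredIds)")
    conn.execute("DROP TABLE ExpiredIds")
    return snapshot

conn = make_demo_db()
snapshot_size = delete_with_id_snapshot(conn, "2020-01-06")
remaining_children = conn.execute("SELECT COUNT(*) FROM ChildTable").fetchone()[0]
print(snapshot_size, remaining_children)
```

Because the parent delete no longer depends on the ValidTo filter, the snapshot also fixes the set of rows being removed even if new rows expire while the cleanup runs.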





Is using a subquery performant enough? Won't it then check every record of the child table (4,000,000 records) for each parent record?
– SanderSoetaert
Aug 10 at 8:02





It will check every record unless you have an index on the ValidTo column.
– Chocolord
Aug 10 at 9:01
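
To see the effect of such an index, you can compare the query plan before and after creating it. The sketch below uses SQLite through Python's sqlite3 purely as an illustration; the real DBMS is not named in the question, its planner may decide differently, and the index name `IX_Parent_ValidTo` and the `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ParentTable (Id INTEGER PRIMARY KEY, ValidTo TEXT, Wkt TEXT)")

def plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " | ".join(row[3] for row in rows)

query = "SELECT Id FROM ParentTable WHERE ValidTo < ?"

# Without an index, the ValidTo filter has to scan the whole table.
before = plan(conn, query, ("2020-01-06",))

# With an index on ValidTo, the planner can search a range of the index instead.
conn.execute("CREATE INDEX IX_Parent_ValidTo ON ParentTable (ValidTo)")
after = plan(conn, query, ("2020-01-06",))

print(before)
print(after)
```

Most database systems have an equivalent of EXPLAIN that shows whether a filter is served by a scan or by an index.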







I only have an index on the ParentId column of the child table.
– SanderSoetaert
Aug 10 at 10:18





Then the query that finds the parent Ids will check every record to test the ValidTo value. The PL/SQL alternative could help by running this query only once.
– Chocolord
Aug 10 at 10:32







