How to wisely use an update query

Sep 30, 2019 by - Category - Backend

As we all know Update is the most expensive SQL query, Hence it should be used wisely. As if you did not care about the data-set, optimization, and performance before writing the query you are definitely going to be screwed!!!
Here I have mentioned a recent case and a few experiments that I have gone through to improve the performance of my script that was actually updating millions of records in my DB.

So I am having classifieds table where I have millions of classified ads and I need to update all the ads which are from a specific city depends upon some metadata. Here I have mentioned the approaches that I followed and What I learned from those approaches

Simple update query for small data-set

Does the function look good?

I’ll say yes if your data-set is small I mean if the total number of rows being processed by your query is in thousands, not millions or so.

But if your data-set is large and the number of rows being affected by your data set is in million you will definitely gonna be in trouble, as your script will either take minutes of time to run or you will get out of memory error as the number of update queries will increase

So I have modified the function above as follows

Reduced number of update queries

Does the function look good? 

No doubt the function above reduced the number of updates queries But the problem about big data-set is still there

So here comes another approach that is also called chunking

Reduced number of update queries and increased performance with chunking for large data-set

Hence we are passing the data to the update query in chunks like 5000 records at a time so if we have 1,00,000 records to be affected by this we will not run 1,00,000 update queries (approach one). That will badly increase the run time of the code.
we are not even running one update query with millions of keys in WHERE IN clause (approach two) that will give out of memory error.
But we are executing the query in chunks of 5000 which will not only increase the speed of the query but it leaves the change of out of memory error.

So the chunking and WHERE IN clause finally helped us to handle update query for large data-set.

Finally, we have a health and fast running update query for our program !!!

Leave a Reply

Your email address will not be published. Required fields are marked *

fourteen − 3 =