Hi This is me!
The phlegmatic wannabe!!
Today was one of the worst days of my life. Everything unexpected happened today. Somehow my teammate, my team lead and I didn't lose our cool. But I know we are standing at the edge of a cliff right now, and one strong breeze can throw us to either side of it. I am praying that it throws us back to where we were earlier.
Every month, in the first week, our Unix scripts that load data into the data warehouse run. We need to monitor them constantly, and if we find that a few scripts are not working properly, we have to fix the problem and re-run them.
Our clients send us their data on the various plans they offer to their employees, customers and so on. We load this data into our data warehouse after the ETL process. Both the ETL and the loading are done by our scripts. This time, because of several issues with the server, our jobs didn't run properly and left us in a dilemma. We were not at all sure whether things had gone all right. We began with a strategy, and somewhere down the line, after three days of fixing, we began to realize that our data warehouse was running short of data. I mean the data simply was not there. We maintain 3 years of history data in our data warehouse, and all of it had been deleted by some mishap that we could not figure out at all. We were assuming that it was the fault of the server migration, but when we took the matter up to our team lead, he came over and we began to explore the probable causes.
All of a sudden, he opened a file that had a few SQL queries in it. We had been using it to delete unwanted and incorrect data for the last 2 days. The queries looked like this:
delete from prsn_clnt_afltn where clnt_id in (01834) and btch_cyc_id = 120 and em_modl_id = 0 or em_modl_id is null;
This is just one of the 39 queries in that file. That means data was being deleted from 39 tables in all, and all of it at that. I saw that query and was shell-shocked. I knew at a glance that this was the one deadly query that had caused havoc for our work and jobs. I told my team lead that I was sure it was our fault, even though he was not convinced yet. But I was sure, because I could see that "or" part hanging alone.
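For anyone wondering why one lonely "or" can do so much damage: in SQL, "and" binds tighter than "or", so the condition above is read as (client and batch and em_modl_id = 0) or (em_modl_id is null), and the second half matches every client and every batch. Here is a small sketch of that, not our actual setup. It assumes SQLite in memory (our warehouse is a different database, but I believe this precedence rule is standard), the rows are made up for illustration, and I wrote the client id as 1834 without the leading zero.

```python
import sqlite3

# Made-up rows: (clnt_id, btch_cyc_id, em_modl_id). Not real warehouse data.
ROWS = [
    (1834, 120, 0),     # meant to be deleted
    (1834, 120, None),  # meant to be deleted
    (1834, 120, 5),     # same client and batch, valid model id: keep
    (1834, 119, None),  # same client, older batch (history): keep
    (9999, 120, None),  # another client: keep
    (9999, 100, None),  # another client, older batch: keep
    (9999, 100, 7),     # another client, valid model id: keep
]

# The condition as it was in our file, with the "or" hanging alone.
BAD_WHERE = ("clnt_id in (1834) and btch_cyc_id = 120 "
             "and em_modl_id = 0 or em_modl_id is null")

# The same condition with the "or" wrapped in parentheses.
GOOD_WHERE = ("clnt_id in (1834) and btch_cyc_id = 120 "
              "and (em_modl_id = 0 or em_modl_id is null)")


def rows_deleted(where_clause):
    """Load the sample rows into a fresh table, run the delete, return rows removed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table prsn_clnt_afltn "
        "(clnt_id integer, btch_cyc_id integer, em_modl_id integer)"
    )
    conn.executemany("insert into prsn_clnt_afltn values (?, ?, ?)", ROWS)
    cur = conn.execute("delete from prsn_clnt_afltn where " + where_clause)
    deleted = cur.rowcount  # number of rows the delete removed
    conn.close()
    return deleted


print(rows_deleted(BAD_WHERE), rows_deleted(GOOD_WHERE))  # prints "5 2"
```

With these sample rows the unparenthesized condition removes 5 of the 7 rows, including other clients and older batches, while the parenthesized one removes only the 2 rows that were meant to go.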
Had the query been:
delete from prsn_clnt_afltn where clnt_id in (01834) and btch_cyc_id = 120 and (em_modl_id = 0 or em_modl_id is null);
everything would have been fine, and we would have been cherishing our life as always. But this careless mistake has given us the lesson of our lives. I don't know how things are going to turn out or how my team lead will solve the problem, but I know it's time to wake up!! I feel very sorry for my hardworking teammate, Gagandeep sir. He's a gem, and really one of the most respectable people around me. The best part is that he's a master in his own right, but because of this mistake even he looked tired of it all.
Well, thanks to Udit, Ankush and Sumit sir, who made our life easier by cracking jokes (our very own PJs and GJs) and making the moments a bit lighter. I really love the way God sent these people into my life. I know we won't all be together for eternity, but I would like to say these guys have brought me back to myself.
We also had Ashutosh sir on our team, an extremely intelligent, decent, caring and fun-loving person. He left Hewitt only 3 weeks ago, and we are sure that had he been here, everything would have gone fine. Right now most of the load is falling on Gagan sir, who is taking great care of it, but in chaos mistakes are bound to happen, and the server migration, database bounces and so on have caused nothing but ultimate chaos for us.
Hopefully it will all be fine soon, and we will be on track again. But till then, fingers crossed, just wishing for the best :D