On the 7th day of SQL…

Gaston - arrogant dude from Beauty and the Beast. Because I can

Following in Tim Ford’s (blog|twitter) footsteps, I also went non-technical with my favorite post for this awesome twelve days of SQL, which is one of Brent Ozar’s (blog|twitter) collection of bad great ideas that I’m very honored to be a part of. I struggled to pick a favorite for this series since, well, there are many of them and it’s hard for me to pick just one, but I hold this particular one near and dear to my heart since it is about a simple fact that we all know but sometimes overlook (me included).

This post serves to remind me that, as a Production DBA, I need to be aware of everything that touches my servers and databases. I’m not talking about a fancy Change Management process (that could be a separate post on its own) but about the awareness that other groups sometimes don’t have, or hey, that we as DBAs sometimes forget too.

I’m talking about maintaining a centralized log of every change that happens on the server.

 

How many times have we modified an index on the fly and forgotten to document that change? It’s not really a code change, is it? Well, for a Production DBA on a very busy server, every change matters.

One day I noticed an anomaly in our monitoring software, a spike where there shouldn’t be one. Red letters said something had changed and I needed to look to see what it was. It was not impacting customers, but it could if I let it go. That is why I set my monitoring thresholds so that I know something needs attention before they do. Anyway, rather than go through thorough diagnostics and log analysis and alert the team to start combing through code changes, I simply popped open the list of changes that had been made to the server. Even though it wasn’t in red and flashing, the root cause popped right out, and sure enough, if we didn’t take action, customers would know. Having a running log of even the subtle changes made to the server allowed me to quickly deploy a fix to a couple of indexes and have the devs increase our cache time, and voila, the site sailed through record Thanksgiving traffic with 100% availability and response times that may make others jealous.
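For anyone who wants to try this, here is a minimal sketch in Python of what looking up recent changes in such a log could look like. Everything in it is hypothetical: the `ChangeEntry` fields, the `changes_before` helper, the 48-hour window, and the sample entries are made up for illustration and are not the log I actually keep.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeEntry:
    """One row in a hypothetical centralized change log."""
    when: datetime
    who: str
    target: str        # server, database, or object that was touched
    description: str


def changes_before(log, anomaly_time, window_hours=48):
    """Return entries made within window_hours before the anomaly, newest first."""
    start = anomaly_time - timedelta(hours=window_hours)
    recent = [e for e in log if start <= e.when <= anomaly_time]
    return sorted(recent, key=lambda e: e.when, reverse=True)


# Illustrative entries only; none of these come from a real server.
log = [
    ChangeEntry(datetime(2010, 11, 1, 9, 0), "dba",
                "SalesDB.Orders", "rebuilt an index"),
    ChangeEntry(datetime(2010, 11, 20, 14, 30), "dba",
                "SalesDB.Orders", "changed index columns on the fly"),
    ChangeEntry(datetime(2010, 11, 21, 8, 15), "netadmin",
                "SQLPROD01", "moved NIC to a different switch port"),
]

# The monitoring spike shows up on the afternoon of the 21st.
suspects = changes_before(log, datetime(2010, 11, 21, 16, 0))
for entry in suspects:
    print(entry.when, entry.who, entry.target, entry.description)
```

The point is not the tooling. A spreadsheet or a table in a utility database works just as well, as long as every change, big or small, lands in one place with a timestamp.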

Don’t get me wrong, this is not just about keeping track of the things we DBAs and our developers do. We also need to be aware of other teams’ small tweaks to the server or network, such as, oh I don’t know, replacing the network card or swapping the processor, or something silly like that. Even a subtle difference like changing the network cable or plugging it into a different switch port can have a huge impact, and I need to know! For the network or sysadmin teams, that might be nothing, but it could have a huge impact on our DB performance if we are not aware of it. Ever try to troubleshoot a port speed mismatch without the support of the netadmin? Or that pesky (but well-intentioned) SAN admin who thinks you can do just fine with RAID5 instead of RAID10? Not to mention that the tools are almost designed so that notification is an afterthought: that SAN admin can do the whole thing online without your knowledge and with no downtime, huh. Did you ever think the marketing hype of “no down time” would be abused? How many times have you heard an admin claim “no one will know,” only to hear the help desk phones start ringing without warning? Subtle changes can make a big difference in the big picture, and don’t even get me started on load testing in the test environment. Anyway…

My favorite post was written by Jonathan Kehayias (blog|twitter). He is a co-author of this awesome book, Professional SQL Server 2008 Internals and Troubleshooting (every SQL professional should have this book, and no, Brent didn’t make me say this). I got to know him during SQL PASS 2010 when he dropped by my office with Brent and whipped out some fancy extended events code to help me identify some of the performance issues that I was experiencing. He’s a well-respected member of the SQL community, an intelligent fella and an MVP, and I would love to pick his brain anytime I can.

 

Without further ado, here’s the post:

http://sqlblog.com/blogs/jonathan_kehayias/archive/2010/08/03/there-is-no-such-thing-as-a-small-change-to-a-production-database.aspx

Coming next is a dear friend of mine whom I met in person during another of Brent’s bad ideas, SQL Cruise. Karen Lopez (blog|twitter) is incredibly talented and has an ocean of knowledge about database design/architecture. She’s very active in the community, a renowned speaker, an icon for Women in IT and an awesome person to hang out with (even at a character dining in Disney). Please check her blog tomorrow to see what her favorite post of the year is.

Enjoy

Free Conference at SQL PASS

I was invited to an event coordinated by Brent Ozar (Blog | @BrentOzar) called Free-Con. It is free (yes, really) and it is the most unique event I have ever been part of. I sat in the same room with 15 other people that I truly admire and adore. You can read about the event itself on Brent’s post here, so I won’t go over what it’s all about in detail. What I do want to mention here is what I personally got out of it.

First of all, I feel honored. Initially, I also felt very intimidated. I looked around the room, and I saw the author of one of my favorite SQL books, Grant Fritchey (Blog | @GFritchey). I had to buy his book, SQL Server 2008 Query Performance Tuning Distilled (Expert’s Voice in SQL Server), twice since the pages of the first copy got ripped out from me flipping through it so many times, and today I got to shake his hand.

So what do I get out of it?

See, I am a production DBA. Well, I manage a small Production DBA team. When I say small, I mean there is only one other DBA on my team, so I still get to do all the dirty fun stuff we DBAs have to do. I’m not a consultant. I do, however, want to be a Rockstar DBA and an expert within my own company. Brent Ozar covers this very topic in one of his posts here, and Free-Con today gave me more ammo to achieve that. One of the most important things that I took home from it was how to communicate like a consultant to the business as a production DBA. Since I have no desire to be a consultant (at least right now), I need to be able to apply everything I learned today, from the presentation and the discussion within the audience, to my role, and for me that’s my missing piece of the puzzle.

Let me ask you something. How many of you have ever been in a situation where you were so frustrated because you knew the root cause of some major issue, but when you presented it to the business and made a recommendation, they totally dismissed it? The business went ahead and hired a consultant, an expert, who came back with the very same recommendation you had. Sound familiar now?

I am very fortunate that my employer is by far the most awesome employer I have ever had. They support me in ways that would make most production DBAs out there drool, and they actually respect me like I am an expert. However, I’m still far from it and I still need a whole lot of learning, but I believe that I’m on the right track to get there. I know what more I need to do, such as building my brand and being able to project everything that I do around that brand, including how I communicate with the business owners within my company. I need to do more product reviews or white papers to promote myself within my own employer, more than I already have.

This presents a huge challenge for me that I am happily accepting and looking forward to. Thank you, Brent, for letting me be part of it. It scores a 10 on my awesome scale, and yes, that’s a 1-10 scale.

 

I’m still around

Wow. Where did the time go? It’s almost the end of September and I haven’t posted anything technical since, um, eons ago. This blog is intended to be my technical blog, sharing some of the cool, fun SQL stuff that I run across or have to deal with. By the way, I have tons of that, but for some reason I have a hard time translating it into blog posts. So it is all safely written in my little spiral notebook (yes, I’m so old-fashioned that I still write in a spiral notebook with a pen) and none of it is out here for the public to see.

I know I need to get better at this. Share my experiences, my SQL-fun issues and resolutions, or heck, maybe just write regularly about my day. But then again, I wonder who will read my blog if I just write about my day at work. Nothing really exciting compared with those awesome blogs out there that I religiously read every day.

Then it suddenly hit me.

There is always someone who can get some use out of my posts. That someone might just be my wonderful husband, getting an idea about my day before he even asks me about it, but I should not have to worry about that. I have an awesome job and I work with a great team and great managers. I have plenty to share.

As most of you already know, I manage a small DBA team at a very cool company in Seattle. Our peak season is around the holidays, mostly Thanksgiving and Christmas, and apparently this year we are about to break our own records.

So my team’s focus for the next few months is only one thing. Performance tuning. Anything that we can do to improve web site performance is a prime candidate for a makeover. This is one chance that I can actually ask my developer team to rewrite some of the code instead of just doing index tuning.

To start the process, I ran Profiler for 24 hours, filtering on duration for anything that runs for 250ms or longer. I used the tuning TSQL template but I also added a few columns that I think are necessary. Then I imported the trace files into an application called ClearTrace. This application is pretty awesome since it parses your trace files and aggregates them by CPU, Reads, Writes and Duration, and also calculates the averages for you. ClearTrace also parses the SQL text so that you see just the code names, and it gives you sample parameters if you need them for testing.
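To make that aggregation step a bit more concrete, here is a rough sketch in Python of the general idea: drop anything under the 250ms threshold, group similar statements together, then total and average the numbers. This is not how ClearTrace works internally. The `normalize` and `summarize` helpers, the row layout, and the sample rows are all made up for illustration, and I am assuming the duration values are already in milliseconds.

```python
import re
from collections import defaultdict

MIN_DURATION_MS = 250  # same threshold as the Profiler filter described above


def normalize(sql_text):
    """Crudely replace string and numeric literals so similar calls group together."""
    text = re.sub(r"'[^']*'", "'?'", sql_text)   # string literals
    text = re.sub(r"\b\d+\b", "?", text)         # standalone numbers
    return " ".join(text.split()).lower()


def summarize(rows):
    """Total CPU, reads, writes and duration per normalized statement."""
    totals = defaultdict(
        lambda: {"calls": 0, "cpu": 0, "reads": 0, "writes": 0, "duration": 0}
    )
    for row in rows:
        if row["duration"] < MIN_DURATION_MS:
            continue  # below the threshold, not worth our attention
        t = totals[normalize(row["text"])]
        t["calls"] += 1
        for metric in ("cpu", "reads", "writes", "duration"):
            t[metric] += row[metric]
    summary = [
        {"text": text, **t, "avg_duration": t["duration"] / t["calls"]}
        for text, t in totals.items()
    ]
    # Ranking by total duration is a choice; CPU or reads would work too.
    return sorted(summary, key=lambda s: s["duration"], reverse=True)


# Made-up trace rows, purely for illustration.
rows = [
    {"text": "EXEC dbo.GetOrders @CustomerId = 42",
     "cpu": 120, "reads": 5000, "writes": 0, "duration": 400},
    {"text": "EXEC dbo.GetOrders @CustomerId = 7",
     "cpu": 100, "reads": 4000, "writes": 0, "duration": 300},
    {"text": "EXEC dbo.GetCart @SessionId = 'abc'",
     "cpu": 30, "reads": 200, "writes": 5, "duration": 900},
    {"text": "EXEC dbo.Ping",
     "cpu": 1, "reads": 2, "writes": 0, "duration": 10},
]

top = summarize(rows)
for item in top:
    print(item["text"], item["calls"], item["duration"], item["avg_duration"])
```

Slicing the result (for example `summarize(rows)[:10]`) gives a short worst-offenders list to start the conversation with.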

Then the fun process begins. As a DBA, I want to work on the top 50 of all those bad code strings (the ones that have long duration, high CPU, and/or high reads), but I have limited time and limited resources, so I have to work closely with my development team to understand what each piece of code is doing, how the code is being called, and so on and so on.

This whole process is not new to me as a Senior DBA, but it sure is new to me as the person who now manages the DBA team. I now have to work not only with the development team, but also with our QA team and our PMO (Project Management Office) to coordinate the effort and cherry-pick the ‘bad code’ that we are going to spend resources optimizing. My DBA side is screaming ‘what do you mean we are leaving bad code here and not doing anything about it’, but in my current role as the one who manages the team and has to play politics and be nice with others, I totally understand the business decision behind it.

Our development cycle and change management process require resources from outside our own team whenever we modify our legacy code, and as much as I feel like I could rewrite the whole list, I know that I can’t play cowgirl DBA anymore like I did years ago. I need resources from other teams and priorities have to be set, and as a result, some of the stuff will get shuffled to the bottom of the list. There are processes, procedures, and checks and balances in place to make sure that we don’t end up with technical debt or hotfixes that require more hotfixes.

Another light bulb moment for me. We DBAs want to fix the world, or at least our legacy database code, and optimize it so it performs at lightning speed, but sometimes we have to be able to triage our code and leave the lightly injured ones alone and unattended for the moment.

Let me know if you have had a similar experience!