disaster recovery | SQL DBA In Training

To start the ball rolling, get a routine. If there isn’t a routine, it will be hard to determine when the routine is off.Having a routine makes it easier to train others coming up the ranks or just have someone fill in for vacation days. Personally, I find a routine helps me focus my early morning energy on the mundane and opens the afternoon to work out issues found in the morning, or better yet, the projects that take a back burner when dealing with emergency issues. Speaking of emergency issues, without a routine, everything feels like an emergency.

There are a number of sites that have a top 10 and I don’t claim to have a special spin on the tasks at hand other than to underline the importance of making the time to these tasks.

Back Ups!
There is no argument that back ups are important the general question is what kind and how often. Surprising, not all (so called) DBAs treasure a SQL server database. As a DBA don’t let anyone talk you out of doing your own database backup. A server backup is not the same and does not give you options or flexibility when recovering. I have discovered that a restore process is required more often because of end user error than an actual disaster event such as hardware failure.
My Preference on a DB recovery plan is to use a FULL recovery model with nightly backups, differentials twice during the day at 10 am and 3 pm. (We operate on a normal 8-5 business day), and log files every hour. Not all servers and databases are created equal. As time and space requirements impend on my backup schedule, this will get adjusted as needed. I should also note that I like consistency, so if I need to change for one database on the server, all of the database back processes change for the whole server. This just makes it easier to administer, maintain and remember about the back up process as a whole.One last note on server backups, they are needed. I just prefer the server backup everything except my ldf, mdf, and ndf files. Most server backup software allows for exceptions.
Restore Backups Periodically.
It is not enough to perform a backup job and never test to make sure that the backup is recoverable. This can be a time consuming process but one that will help you sleep better. More than likely your backup process is automated and if you are like me have another automated process to check to make sure that it completed successfully. One thing is for certain and that is technology is designed to fail. With that in mind, take a day out of the week to test one back up from one server. If nothing else it will keep you familiar with the process and during a real DR event, you will be glad that perfect practice makes perfect.
Monitor Logs.
Microsoft has done a lot to make alerts and notifications available in more than one format. For the minimalist, there are the raw logs and event viewer. For the technical savvy alerts and messages can be forwarded or even emailed. I recently read an article for creating a special job to push the log information to a SQL Server table so that they could be queried and reports produced. In other words, there is no reason why the logs remain an untapped resource for proactively responding to server issues.
Manage SQL Server Jobs
Now that we are familiar with the logs and alerts, it is natural to discuss the need to monitor our automated jobs. Jobs that run successfully for weeks, months, or even years will inevitably fail once in a while. I see all to often that a jr DBA will create a job and with great confidence never test or check to see how successfully it runs. Personally, I like to track my job outcomes in a monitoring database. This allows me easy access to the job history for all of the servers I maintain. I can easily tell by one query which servers need immediate attention.
Database Integrity.
A lot can be found on the need for running integrity checks on the database periodically. This has really proven beneficial for older and larger databases. Best practice states that the DBCC commands should run before full backups. I don’t disagree with the statement, but I don’t always follow it. DBCC can be time consuming and resource intensive. Even though we are operate 8-5 Monday thru Friday, there are several off hour processing jobs that run on several of our servers that limit our time frame for nightly backups.Our method is to run the integrity checks over the weekend. This job does produce a report and should be evaluated after the run. Some issues will cause the job to fail, if it fails the report will guide you to the underlying issue or issues.
Maintain Indexes.
As the database grows by inserts or shrinks by deletes, the indexes will get fragmented and need maintenance. The best thing to automate this process is Smart Maintenance. This will include scripts to backup and do integrity checks. (Thanks Derrick B. for turning me onto the Smart Maintenance)
Create Documentation.
The best ally in time of a melt down is good documentation. It is sad that not everyone sees the need for documenting what they are doing either to help them to remember or for the poor sap who comes after them. Out of all of the items mentioned here, this one is the easiest to start and requires the least amount of skill to do. Start small, but plan big. In other words, start by documenting all of the SQL Servers you maintain include things like service log ins, the version of SQL server to the point release. Include maintenance jobs and their schedules. You may consider tracking purchase and license information or at least a pointer to it such a a purchase order number. What I mean by planning big is to consider this the start of your disaster recovery plan document. Understanding what your current environment looks like will go a long way if you ever need to rebuild from scratch.
Performance Monitor.
If you are a DBA then I know you have heard, “The database is slow.” at least once. We know (with a wink) that it is usually the application code or a poorly written query that is to blame. So how can we defend ourselves? Monitor the server’s performance. There are several tools available within the OS and SQL Server to at least get us the needed ammunition to defend the honor of SQL Server. The common tools to use are Performance Monitor (perfmon), SQL Profiler, Execution Plans, Index Tuning, DB Tuning advisor. There are tons of articles to explain how to use these. Generally the best place to start is with the performance monitor. This will help to at least give an over-all view of the server. I specifically look at CPU, Buffer Pools, Disk and Network. If there is a problem, you will see it in one of these areas first.
Security
This item gets over looked a lot. The story goes along the lines that a manager requests an employee access to a database. So a “new user” is created at the database level. If your lucky, it was a network credential and not a sql login. Now the employee is either no longer employed or their job duties have changed an no longer require access, but no one has told the DBA area to revoke access.
If your company uses AD, I highly recommend moving toward administering SQL groups as opposed to individual logins. This will be the first step towards turning off mixed mode authenticity. Clean up the users by moving them into groups and allowing them access via the groups. The hard sell will probably be your developers who use and “application” or “sql” login. Start now by talking and learning about Kerberos security. Meet your AD administer and start talking Kerberos. Before you know it, your AD manager will be on your side asking the development manager why they are creating security holes in the network.
Disaster Recovery Planning
Out of the ten items this will be the most comprehensive and possibly the most time consuming. But remember the journey of a thousand miles starts with the first step. Start by documenting how things look right now. This will become your baseline. Next, begin documenting your day-to-day issues that crop up like a failed backup job. You are right, a failed job isn’t exactly a DR event, but this will get your team thinking about the process and documenting their process. These notes can then be compiled and published so that you have a resource for your jr DBAs.This will naturally grow and should in compass issues that can occur at the server, instance, database, and table level. Finally a plan of priorities need to be established. This list should be reviewed by all areas of IT and business. You don’t want one area to have a high priority on an application server that uses a database that is staged on a low priority database server.

There is a lot here, and even more on line. With this I will leave you with a story. A farmer was looking for a hired hand and a rather rough and aged looking man approaches the farmer. The man asks about the job and the farmer asks for his qualifications to which the man replies, “I sleep soundly during a storm.” To this the farmer smiled timidly not knowing what he meant, but the farmer desperate for help hired the man. About a month into the job a terrible storm awoke the farmer about 3 a.m. The farmer in a panic runs out and finds the hired man fast asleep. The farmer tries to wake the man and is shouting. The farmer says, “wake up! wake up! The fence needs mending, the cows might escape!” The hired hand in a gruff half asleep voice said, “No worries. The cows are in the barn and the fence is fixed.” The farmer still panicking says “What about the chicken and horses? We need to make sure they are in the barn too.” Again the hired man says “Not to worry, they are all tucked away in the barn. All is good.” The farmer retorts, “Well! what about the barn roof, it leaks! We can’t sleep knowing that the barn roof leaks!” But the irritated hired man rolls over and calmly stated, “Sir, when you hired me I told you my qualification was that I like to sleep soundly through a storm. I took care of the roof, the chickens, the horses, and the cows all before the storm came so that I could sleep through the storm.”

The moral of the story is that now when the sun is shinning, is the time to prepare for when the storm comes. Be ready with good documentation, a good back up and a great all around maintenance plan and you too will be able to sleep soundly during the storm.

Posts Tagged With: disaster recovery

Top Priority: Backups

DBA Top 10 To Do List

Recent Posts

Archives

Categories

Meta