Thursday, November 19, 2009

SVN Verify Crond Monitor Script, Born From Misery.

Over the last few weeks I've had the displeasure of dealing with an imploded SVN FSFS repository. In the process of learning what was wrong and how to go about fixing it (or working around the problem) I discovered that I could have mitigated some of this by using 'svnadmin verify' to monitor my SVN repositories for problems.

This wouldn't, of course, have helped me fix the problem, but it would have identified it long before I did, making the overall data loss far smaller.  As it stands we lost about 25 revisions from one of our major SVN repositories, as we had to remove everything at and after the damaged revision and recommit it on top of the undamaged remnants.  This means we lost a lot of history but not necessarily a lot of code.  Still, it was an extremely time-consuming fix that would have been better avoided altogether.

So I've written a little Python script designed to catch any errors generated by the verify command; it is meant to be run from a crontab. There are a number of things you'll have to edit before it'll be truly useful to you. It's been written this way to avoid many of the major pitfalls of running scripts from cron, which has a severely restricted execution environment.

Specifically you must update a few variables to reflect what you would like them to do or point at.

The first and most obvious is the '#!/usr/bin/python' line at the top of the script, which must point at your actual Python interpreter. I strongly recommend typing 'which python' and pasting the result in place of that path; if you just put 'python' in there instead, you're beholden to the path available to crond.

After that it's a simple matter of updating the variables under the 'Setup' comment. I will list each one and explain what you need to put in that position.

svnadmin_path: this is the string returned from 'which svnadmin'. It should be an absolute path, and the line under it runs it through os.path.abspath for good measure.

repository_path: this value serves one of two functions: either it's the location of the single repository you wish to keep an eye on, or it's the root directory holding several repositories you wish to keep an eye on.  In either case the path should be absolute.

recursive_path: this is a bool flag with possible values of True or False. If it is set to True, the program will treat the repository path as the root of a group of repositories instead of as the target repository itself.

verbose: this is another bool flag with the same possible values.  If set to True it will output short status notes at specific segments throughout the program.
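To make the recursive mode a little more concrete, here is a minimal sketch of how the list of repositories under a root could be built. It is not what the script below does (that simply takes every subdirectory); the helper name find_repositories is made up for illustration, and it assumes a repository directory can be recognised by a 'format' file and a 'db' directory at its top level. It is written for Python 3.

```python
import os

def find_repositories(root):
    """Return sorted absolute paths of subdirectories of root that look like SVN repositories."""
    found = []
    for name in sorted(os.listdir(root)):
        candidate = os.path.join(root, name)
        # Assumption: a repository directory holds a 'format' file and a 'db' directory.
        if (os.path.isdir(candidate)
                and os.path.isfile(os.path.join(candidate, 'format'))
                and os.path.isdir(os.path.join(candidate, 'db'))):
            found.append(os.path.abspath(candidate))
    return found
```

Filtering like this would keep stray directories under the root from being handed to svnadmin verify and showing up as errors in the nightly mail.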

I recommend running the program every day; more often is possible but seems like overkill unless you're committing at a rapid pace.  As such, my crontab entry:

0 0 * * * /root/verify_svn.py


The Blue Pill:
First and foremost, this is by no means a 'perfect' piece of software.  First, it has no error checking whatsoever, so it doesn't fail gracefully; on the other hand, the worst it's going to do is attempt comparisons on things that make no sense and generate some nonsense output.  Second, it relies on the fact that if you generate stdout output in a crond execution, that output is captured and emailed to the owner of the crontab (in my case root, who also owns SVN and the repositories).
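Since the script does no checking of its own, a small pre-flight check on the Setup values would catch the most common mistakes before svnadmin is ever run. The following is only a sketch of that idea: check_settings is a hypothetical helper that is not part of the script below, and it is written for Python 3.

```python
import os

def check_settings(svnadmin_path, repository_path, recursive_path):
    """Return a list of readable problems; an empty list means the settings look usable."""
    problems = []
    if not os.path.isabs(svnadmin_path):
        problems.append('svnadmin_path is not absolute: ' + svnadmin_path)
    elif not (os.path.isfile(svnadmin_path) and os.access(svnadmin_path, os.X_OK)):
        problems.append('svnadmin_path is not an executable file: ' + svnadmin_path)
    if not os.path.isabs(repository_path):
        problems.append('repository_path is not absolute: ' + repository_path)
    elif not os.path.isdir(repository_path):
        problems.append('repository_path is not a directory: ' + repository_path)
    if not isinstance(recursive_path, bool):
        problems.append('recursive_path should be True or False')
    return problems
```

Printing whatever this returns and stopping early would fit the cron model described above, since anything printed ends up in the mail.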

The Red Pill:
That all said, it works, and pretty well, if a bit slowly.  The svnadmin verify command is by no means quick even on a small repository; on a large one it takes as long as ten minutes to execute.  However, it does its job, which is to run a verify on all of our repositories every night and notify us if there are any issues (of a nature that verify can detect).  Two possible improvements that I've considered but haven't acted on are allowing command line arguments and making it work with Nagios somehow.  Command line arguments are pretty easy to get up and running, but they undo the compact, everything-in-one-place quality I like about the current implementation.  On the other hand, they will be a must if I want Nagios support.  Nagios support is not nearly so easy; I'm not sure how I would even go about it, but it would prove supremely useful in our overall monitoring strategy where I work.
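One way the Nagios idea might be approached is to follow the usual Nagios plugin convention: print a single status line and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The sketch below is an illustration rather than a tested plugin. The names check_repository, the --svnadmin option and the injectable run parameter are all made up here, it assumes svnadmin verify exits with a non-zero status when verification fails, and it is written for Python 3.

```python
import argparse
import subprocess

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def check_repository(svnadmin_path, repository_path, run=subprocess.run):
    """Return (exit_code, one-line status message) for a single repository."""
    try:
        # Merge stderr into stdout so progress and error lines arrive in one stream.
        result = run([svnadmin_path, 'verify', repository_path],
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                     universal_newlines=True)
    except OSError as exc:
        return UNKNOWN, 'SVN UNKNOWN - could not run svnadmin: %s' % exc
    if result.returncode != 0:
        # Assumption: the last line of output is the most useful error detail.
        lines = result.stdout.strip().splitlines()
        detail = lines[-1] if lines else 'no output'
        return CRITICAL, 'SVN CRITICAL - %s: %s' % (repository_path, detail)
    return OK, 'SVN OK - %s verified' % repository_path

def main(argv=None):
    """Parse the command line, run the check, print the status line, return the exit code."""
    parser = argparse.ArgumentParser(description='Nagios-style svnadmin verify check (sketch)')
    parser.add_argument('--svnadmin', default='/usr/local/bin/svnadmin',
                        help='absolute path to svnadmin')
    parser.add_argument('repository', help='absolute path to the repository')
    args = parser.parse_args(argv)
    code, message = check_repository(args.svnadmin, args.repository)
    print(message)
    return code
```

To use it as a plugin, the file would end with the usual __main__ guard calling sys.exit(main()). Given how long a verify can take on a large repository, the check would probably need a generous timeout on the Nagios side.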

Entire code:

#!/usr/bin/python
#SVN Repository Verify.
#The script is released under the MIT License, because I really don't care what
#you do with it, so long as you don't claim you wrote it.  This code is released
#with no warranty, and free to use under the license at the bottom of the file.
#It should be further noted that I probably will not spend a huge amount of time updating
#this script unless I get a compulsion to add some sort of Nagios support to it,
#or find a tragic bug in need of fixing.

#This script was written under the shadow of a fairly severe and irrecoverable
#error that one of the SVN repos I'm responsible for experienced.
#This error, one of the many 'line length' errors, is detected by the svnadmin
#verify command, but goes undetected between commits until you hit
#the issue again.  So I've just put together a short script to run every night (or whenever)
#using the svnadmin verify command and reporting any errors.

import os #import os tools.
from platform import python_version #get the python version tool

#this part is messy due to the fact that 'getoutput' is moved from 'commands' to 'subprocess' in python 3.0
#as such I test for the version of python, and adjust the import of 'getoutput' accordingly.
pyversion = python_version()
pyversion = pyversion.split('.')
if int(pyversion[0]) < 3: #check major version (int() is all that is needed here, no eval)
 from commands import getoutput #retrieve getoutput from commands if py version less than 3.
else:
 from subprocess import getoutput #retrieve getoutput from subprocess if py version 3 or greater.


def main():
 # Setup
 # -The paths asked for below should NOT be relative.
  
 #Provide the path for svnadmin (this could be replaced with a 'which' command,
 # but I'd rather leave nothing to chance with 'cron', which never behaves well when assuming paths.)
 svnadmin_path = '/usr/local/bin/svnadmin'
 svnadmin_path = os.path.abspath(svnadmin_path) #path cleanup
 
 #This is either your single repository path, or the root of multiple repositories.
 repository_path = '/location/of/repository/' 
 repository_path = os.path.abspath(repository_path) #path cleanup
 
 #would you like to use the repo path as a recursive repository root with subdir repositories or as a single repository
 recursive_path = False
 
 #Do you want lots of useful output, or just a bit at the end?
 verbose = True
 
 paths = [] #establish the empty array of path strings.
 
 #if the path is recursive, generate a list of the paths to be verified.
 if recursive_path:
  
  if verbose:
   print('Recursive Mode Directories:') #if verbose, output recursive mode notification (call form works in python 2 and 3)
  
  ls_of_repo = getoutput('ls ' + repository_path) #dump contents into a local string
  ls_of_repo = ls_of_repo.splitlines() #split the string into an array of strings.
  
  for item in ls_of_repo:
   temp = os.path.join(repository_path,item) #generate an actual path for every item in the repository root
   if os.path.isdir(temp): #if the item is a directory add it to the list of paths.
    paths.append(temp)
    if verbose:
     print(temp) #if verbose, output path

 #else use the repository path on its own.
 else:
  paths.append(repository_path)
 
  if verbose:
   print('Single Directory Mode: ' + repository_path) #if verbose, output repository path

 #create a basic array to hold errors.
 errors = []
 
 for path in paths:
  if verbose:
   print('Currently running svnadmin verify on: ' + path) # if verbose, output current verify target
  
  log = getoutput(svnadmin_path + ' verify ' + path) #run svnadmin verify on the path
  log = log.splitlines()
  #errors.append(process_log(path, log, verbose))
  errors += (process_log(path, log, verbose))
 
 
 #I'm using this as a cron job, so all I actually have to do is print to stdout and 
 #the owner of the cron job will get an output email.
 
 if errors: #an empty list is false, so this only fires when there is something to report
  print('Current SVN Verification Errors')
  for error in errors:
   print(error)
 
 return 0
 
def process_log(path,log,verbose):
 #this function's entire job is to return a list of error strings from a provided log.
 #the path value is simply a header; it could be anything.
 errors = []
 for line in log: #process log line by line
  if not line.strip(): #skip blank lines, they carry nothing to report
   continue
  temp = line.split(" ") #split each line to test for the normal 'verified' response
  second = temp[1] if len(temp) > 1 else '' #one-word lines have no second field; avoid an IndexError
  if temp[0] != '*' and second != 'Verified':
   errors.append( path + ": " + line) # append the path and error message to the error log if it fails.
 
 return errors
 
if __name__ == '__main__':
 main()
 
# Copyright (c) <2009> <Garrett McGrath>

 # Permission is hereby granted, free of charge, to any person
 # obtaining a copy of this software and associated documentation
 # files (the "Software"), to deal in the Software without
 # restriction, including without limitation the rights to use,
 # copy, modify, merge, publish, distribute, sublicense, and/or sell
 # copies of the Software, and to permit persons to whom the
 # Software is furnished to do so, subject to the following
 # conditions:

 # The above copyright notice and this permission notice shall be
 # included in all copies or substantial portions of the Software.

 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
 # OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
 # WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
 # OTHER DEALINGS IN THE SOFTWARE.
