November | 2007

Today I encountered my second database corruption or crash on my development machine. The problem is that my IDE drive sucks, an old IBM 75GXP. For those of you who don’t remember, these drives were notorious for having all sorts of problems when they were released. Most of the problems revolved around dying after six months. Regardless, I’m still using this thing for another month.

You can tell if your database became corrupted by restarting the postmaster or typing:

sudo /etc/init.d/posrgresql-8.2 restart

The result of the command will look similar to the following:

2007-11-26 13:14:48 CST LOG:  server process (PID 16312) was terminated by signal 11
2007-11-26 13:14:48 CST LOG:  terminating any other active server processes
2007-11-26 13:14:48 CST LOG:  all server processes terminated; reinitializing
2007-11-26 13:14:48 CST LOG:  database system was interrupted at 2007-11-26 13:14:47 CST
2007-11-26 13:14:48 CST LOG:  checkpoint record is at 0/965E4130
2007-11-26 13:14:48 CST LOG:  redo record is at 0/965E4130; undo record is at 0/0; shutdown TRUE
2007-11-26 13:14:48 CST LOG:  next transaction ID: 0/73629; next OID: 10952256
2007-11-26 13:14:48 CST LOG:  next MultiXactId: 1; next MultiXactOffset: 0
2007-11-26 13:14:48 CST LOG:  database system was not properly shut down; automatic recovery in progress
2007-11-26 13:14:48 CST LOG:  record with zero length at 0/965E4178

Before initiating any recovery options, I recommend backing up your $PGDATA directory. The following directions will attempt to reset the transaction log, and you may loose the latest changes to your databases. The first option for recovering from a crash is to open your ‘/etc/postgresql/8.2/main/postgresql.conf’ file and turn on the following option:

...
zero_damaged_pages = true

After saving the changes, type the following to restart postgresql:

sudo /etc/init.d/postgresql-8.2 restart

If you still see error messages, you may need to restart the entire computer in order to get postgresql to start.

After restarting postgresql, reopen the ‘/etc/postgresql/8.2/main/postgresql.conf’ file and comment out the zero_damaged_pages option. The line should look similar to the following:

#zero_damaged_pages = true

After reading a few posts, it seems that IDE drives in general should not be used when working with databases because of corruption issues. If you are using an IDE drive, I recommend adding the following options to the ‘/etc/postgresql/8.2/main/postgresql.conf’ file.

fsync = on				# turns forced synchronization on or off
wal_sync_method = fsync

The problem with turning on fsync is that it slows down the performance of Postgresql, however, it should prevent further corruption issues with your database.

You can find more information about these configuration options in the Postgresql 8.2 documentation.

Reference:

pg_resetxlog – Tool used as a last resort for resetting the transaction log

The other day I had a nice chat with a colleague about the benefits and pitfalls of using an open source platform. I know this subject has been discussed deeply on various forums articles and blogs. I’ve personally done research for Alcatel-Lucent and the University of Texas at Dallas on the subject. For businesses, the bottom line is cost. Their are certain scenarios in a business where switching platforms, whether its open source or not, does not make sense. The scenario I discussed with my colleague revolved around the cost of switching from a Microsoft platform to a Linux platform for its custom server applications.

My colleague’s business faces two main hurdles before considering a migration: the time it will take to migrate custom applications and the expenses incurred during the migration. If the current system is meeting customer needs and does not need further changes, then migrating over to an open source platform may not be the best choice, especially if the lifetime of the system is not close to expiring. Open source software is best used when continuous scaling is demanded. Because of the licensing fees incurred when scaling can become astronomical, development of the system needs to consider the benefits given to a proprietary solution over a open source solution. For example, Microsoft’s Visual Studio has excellent tools for building .NET applications. If the time saved from developing on this platform is justifiable and the scaling needs do not supercede the cost saved using an open source platform, then using a Microsoft solution may be warranted.

Despite my overall excellent experience developing on the Microsoft platform, I still have a hard time recommending it as the platform of choice, mainly because of cost and time savings. These savings are not obvious to people who have not used an open platform before. One way to explain the cost and time savings incurred is the notion of barriers or hurdles towards completing a goal. When developing or deploying a system, one of the hurdles towards completion is paying for licenses, which incurs some amount of expense in time and money. If the entire application stack is free, then you eliminate that expense altogether for the rest of the system’s lifetime. This allows the complete application stack to be cloned or deployed multitudes of times, whether in a test system, developer system, or production system, without incurring the expense of time and money you would encounter in a purchased product. Removing this hurdle has changed the way entire systems are deployed and general deployment of systems in both open source and commercial projects. One simple example is Debian’s software distribution, ‘apt’. A user can script out the default software configuration for a server with one command line. For example, installing a web server, office suite, browser, a couple of games, a couple of compilers, and a IDE with one shell command. Removing the purchasing barrier also paves the way towards completely automating the scaling of a system’s infrastructure. For the small software vendor, you just can’t do that on a Windows platform.

Blog

by Alex Kuo

Monthly Archives: November 2007

PostgreSQL 8.2: Recovering from a Corrupted Database

When Not to Run on an Open Source Platform