Logging method

There are two methods of logging that DB2 supports: Circular and Archive. Other RDBMSes have similar modes.

Circular

The default that DB2 uses if you don’t change anything is Circular logging. Circular logging is more often appropriate for databases where you don’t care about the data (I have seen it used for databases supporting Tivoli monitoring and other vendors) or for Data Warehousing and Decision Support databases where you have extremely well defined data loading processes that can easily be re-done on demand. You must also be willing to take regular outages for database backups because Circular logging does not allow you to take online backups. Circular logging also does not allow you to rollforward or back through transaction logs to reach a point in time – any restores are ONLY to the time a backup was taken.

On most new builds, I move away from circular logging. I just don’t find it appropriate for most databases, especially transaction processing databases where requirements often include very high availability and the ability to recover from all kinds of disasters with no data loss.

Archive

So why isn’t archive logging the default? Well, it requires proper management of transaction log files. Which can really get you in trouble if you don’t know what you’re doing. If you compress or delete an active transaction log, you will crash your database and have to restore from a backup. I’ve seen it happen, and it’s not fun. The highest frequency of OS level backups you’re willing to do should be applied to the directories holding transaction log files.

I ensure that my archive logs are always on a separate path from the active ones so I, and whoever gets paged out when a filesystem is filling up, can easily see which is which. They should preferably be on a separate filesystem or archived to a location other than the database server itself. TSM, NetBackup, NetWorker, and other third-party tools have interfaces for this.

Scripts work well to manage trasaction log files. A backup script can be a good place to do this. How long you keep them depends on your restore requirements and your overall backup/restore strategy.

To compress archived transaction logs, you can either use the LOGARCHCOMPR1 database configuration parameter, or use a simple cron job to find files in the archive log path older than a certain time frame (1 day or 3 days is most common) and compress them. A nice safe way to delete logs is the prune logs command.

This is one of the areas where it is critical for DBAs to have an excruciatingly high level of attention to detail.

Logging Settings

Ok, ready for the most complicated part?

All the settings discussed here are in the db cfg.

LOGRETAIN

The LOGRETAIN parameter was discontinued in DB2 10.1. The LOG_RETAIN_STATUS parameter can be used to understand what the other settings do, but you cannot set it directly.

To look at the value of this, use this command and look at the first two rows of output:

$ db2 get db cfg for sample |grep -i log
 Log retain for recovery status                          = NO
 User exit for logging status                            = NO

If ‘Log retain for recovery status’ is set to ‘NO’, then you have circular logging. If it is set to ‘Recovery’ then you have archive logging. To enable archive logging, you simply update LOGARCHMETH1 to a valid value.

LOGARCHMETH1

This parameter specifies the separate path for your archive logs to be sent to. It can be a location on DISK, TSM or other VENDOR library.

http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.config.doc/doc/r0011448.html

WTH is this USEREXIT thing?

I undoubtedly have some newer DBAs wondering about this. The LOGARCHMETH1 parameter and others that dictate the location of archive logs was only introduced in DB2 8 (or was it 7?). Before that, we had these nasty things called userexit programs that we had to locate C compilers to compile and be aware of the location of the uncompiled versions to make changes if needed. And the compiled file had to be in the right place with the right permissions. Really, I hated working with them. But the functionality is still in DB2 to use them. I imagine they could do things you can’t do natively, but the parameters are so good that it’d be a rare situation that you need them.

LOGFILSIZ

This is the size of each transaction log file. Generally my default for OLTP databases is 10000, but I’ve gone higher – it’s not unusual to go up to 40,000. Other values may be valid, depending on circumstances

LOGPRIMARY

This determines the number of log files of the size LOGFILSZ that compose the database’s active log files. These are all created on database activation (which happens on first connection), so you don’t want to go too large. But you do want to generally have the space here to handle your active logs.

LOGSECOND

This determines the number of additional active logs that can be allocated as needed. LOGPRIMARY + LOGSECOND cannot exceed 255. The nice thing about LOGSECOND is that these are not allocated on database activation, but only as needed. You can keep database activation faster by having a lower number for LOGPRIMARY and a higher number for LOGSECOND, if database activation time matters. The other awesome thing here is that LOGSECOND can be increased online – one of the few logging parameters that can be. I usually start with 50, but increase if there’s a specific need for more. Remember, these should not be used on an ongoing basis – just to handle spikes.

All the Others

There are all kinds of nifty things you can do with logging. Infinite logging, mirrored logging, logging to a raw device, etc. I’m not going to cover all the logging parameters there are in this post.

Potential issues

Deleting or Compressing an Active Log File

The best case if you delete or compress an active log file is that DB2 is able to recreate it. This may affect your ability to take online backups. The worst (and more likely) case is that your database ceases functioning and you have to restore from backup. Keep your active and archive logs in separate directories to help prevent this, and educate anyone who might try to alleviate a filesystem full issue – System Adminstrators or Ops or even developers. If you do get an error on an online backup referencing the inability to include a log file, take an offline backup just as soon as you can – you will be unable to take online backups until you do.

Filling up a Filesystem Due to Not Managing Log Files

If your archive log filesystem is separate and fills up, it doesn’t hurt anything. If the filesystem your active log path is on fills up, your database will be inaccessible until you clear up the filesystem full. The moment the filesystem is no longer full, the database will function, so there is no need to restore. I recommend filesystem-level monitoring for any filesystems involved in transaction logging.

Deleting Too Many Log Files and Impacting Recovery

If you’re on anything before DB2 9.5, make absolutely sure that you use the “include logs” keyword on the backup command. If you don’t, you may end up with a backup that is completely useless, because you MUST have at least one log file to restore from an online backup. When you delete log files, keep in mind your backup/recovery strategy. There’s very little worse than really needing to restore but being unable to do so because you’re missing a file. I recommend backing up your transaction logs to tape or through other OS level methods as frequently as you can.

Deleting Recent Files and Impacting HADR

Sometimes HADR needs to access archive log files – especially if HADR is behind and needs to catch up. If you run into this situation, you have to re-set-up HADR using a database restore. If you’re using HADR, it is important to monitor HADR so you can catch failures as soon as possible and reduce the need for archive logs.

Log Files Too Small

Tuning the size of your log files may be a topic for another post, but I’ll cover the highlights. Large deletes are the most likely to chew through everything you’ve got. The best solution is to break up large units of work into smaller pieces, especially deletes. Where that’s not possible, you’ll need to increase any of LOGFILSZ, LOGPRIMARY, or LOGSECOND. Only LOGSECOND can be changed without recycling the database.

Log File Saturation

This one confuses the heck out of new DBAs. You get what looks like a log file full, yet the disk is not full and a snapshot says there’s plenty of log space available. The problem here is that with archive logging, log files and each spot in those log files must be used sequentially – even if there are things that have already been committed. Normally the database is rolling through the logs, with the same number of files active at once, but constantly changing which files.

Sometimes an old connection is sitting out there hanging on to a page in the log file with an uncommitted unit of work. Then the connection becomes idle and stays that way, sometimes for days. Then DB2 gets to the point where it has to open another log file, and it can’t because that would be more than it is allowed to allocate. So it throws an error that looks pretty similar to log file full. In that case, you must force off the old idle connection. Details are written to the diag log, and you can also use a database snapshot to get the id of the connection holding the oldest log file.

Often, this is a developer’s connection. Many applications are re-using connections, and manage them fairly well. If a database has these kinds of issues, I like to have a db2 governor running that forces off connections that are IDLE for more than a specified time. For Websphere Commerce databases, 4 hours works well because they are constantly using the connections. Other applications may have different thresholds.