I can’t remember the last time a ulimit bit me, but the time has come again. Everyone is used to removing all the limits for the instance ID at this point, I hope, but have you ever considered the ulimits for your fenced user id?


The Problem

In the DB2 diagnostic log, I saw error messages like this for 3 out of 4 databases on a fairly new server, occurring multiple times a day:

2016-01-12-17.25.31.588169-300 E302959A3363         LEVEL: Error (OS)
PID     : 36372558             TID : 4627           PROC : db2fmp (C) 0
INSTANCE: db2inst1             NODE : 000           DB   : SAMPLE
APPID   : 192.0.2.0.42856.151231085622
HOSTNAME: server1
EDUID   : 4627                 EDUNAME: db2fmp (C) 0
FUNCTION: DB2 UDB, SQO Memory Management, sqloLogMemoryCondition, probe:100
CALLED  : OS, -, malloc
OSERR   : ENOMEM (12) "There is not enough memory available now."
MESSAGE : Private memory and/or virtual address space exhausted, or data ulimit
          exceeded
DATA #1 : Soft data resource limit, PD_TYPE_RLIM_DATA_CUR, 8 bytes

Two details in this message matter for the problem I’m describing: the ENOMEM error, and the fact that it comes from the db2fmp process. It is also critical that this server is not experiencing memory pressure or memory misconfiguration problems. If the error comes from a different process, the issue may be different.
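To get a feel for how often this is happening, something like the following can count the matching entries. This is only a rough sketch: it assumes the entries in db2diag.log are separated by blank lines, and both the function name count_fmp_enomem and the log path in the example are placeholders of mine, not anything DB2 provides.

```shell
# Count db2diag.log records that report ENOMEM from a db2fmp process.
# Assumption: records are separated by blank lines, so awk's
# "paragraph mode" (RS = "") treats each record as one unit.
count_fmp_enomem() {
    awk 'BEGIN { RS = ""; n = 0 }
         /PROC *: *db2fmp/ && /ENOMEM/ { n++ }
         END { print n }' "$1"
}

# Example (the path is a placeholder for your own diagnostic directory):
# count_fmp_enomem /db2home/db2inst1/sqllib/db2dump/db2diag.log
```

A count of zero for db2fmp, with ENOMEM entries from other processes, would suggest a different cause than the one covered here.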

A little research led me to conclude I was seeing scenario number 12 from this technote: http://www-01.ibm.com/support/docview.wss?uid=swg21470035

In that scenario, the fenced user id has a restrictive ulimit for data.

Resolving the Issue

Finding Fenced User ID

If you do not already know what your fenced user id is, you can determine it using any of these methods:

Method 1

==> cat /db2home/db2inst1/sqllib/ctrl/.fencedID
db2fenc1

In the above, ‘/db2home/db2inst1/’ would be replaced with the home directory of the DB2 instance owner.

Method 2

==> ps -ef | grep -i [db2]fmp
db2fenc1 10617056 65863810   0   Jan 10      -  0:00 db2fmp
 cogadmf 13631718 11599922   0   Jan 02      -  0:01 db2fmp
...

In this method, there may be many processes, and you can see that I have two DB2 instances on this server, so I get two fenced ids. The parent process id is the process id of db2sysc for the instance, so I could use that to map back which fenced id goes with which instance.
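When there are many db2fmp processes, the ps output can be boiled down to one line per fenced id and parent process. This is a minimal sketch; fenced_users is a hypothetical helper name, and it assumes the user and parent process id are the first and third columns, as in the ps -ef output above.

```shell
# Read `ps -ef` output on stdin and print each distinct
# "fenced-user parent-pid" pair that owns a db2fmp process.
# Assumption: column 1 is the user and column 3 is the parent PID.
fenced_users() {
    awk '/db2fmp/ && !/grep/ { print $1, $3 }' | sort -u
}

# Example:
# ps -ef | fenced_users
```

Running ps -fp against one of the parent process ids printed here should then show the db2sysc process and the instance owner it belongs to.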

Method 3

==> db2pd -fmp
Database Member 0 -- Active -- Up 3 days 22:27:04 -- Date 2016-01-13-17.34.26.321410
FMP:
Pool Size:       11
Max Pool Size:   200 ( Automatic )
Keep FMP:        YES
Initialized:     YES
Trusted Path:    /db2home/db2inst1/sqllib/function/unfenced
Fenced User:     db2fenc1
...

This will output information about all of the fenced processes, so the list may be long, but the fenced user is listed near the top.

Looking at ulimits

Once you know the fenced user, log in as that user or su to it. Then run ulimit -a to list the limits for the user:

$ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         131072
stack(kbytes)        32768
memory(kbytes)       32768
coredump(blocks)     2097151
nofiles(descriptors) 2000
threads(per process) unlimited
processes(per user)  unlimited

In this case, the data limit is what is causing the problem.
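If you only care about the data limit, you can ask for it directly rather than reading the whole list. The sketch below checks both the soft and the hard value while logged in as the fenced user; describe_data_limit is a made-up helper name, and the KB unit is an assumption based on the data(kbytes) line above.

```shell
# Describe a data-segment limit as printed by `ulimit -d`.
# Assumption: numeric values are in KB, as in the listing above.
describe_data_limit() {
    case "$1" in
        unlimited) echo "data limit: unlimited" ;;
        *)         echo "data limit: $1 KB (restricted)" ;;
    esac
}

# Soft and hard data limits for the current user.
describe_data_limit "$(ulimit -Sd)"
describe_data_limit "$(ulimit -Hd)"
```

With the limits shown above, the soft value would be reported as 131072 KB, which is only 128 MB.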

Changing the ulimit

Depending on the division of responsibilities, you may only need to request that your System Administrator change the data limit for that user to unlimited. If you have root access and are expected to make the change yourself, you can do this as root:

chuser data=-1 data_hard=-1 db2fenc1

After making this change, you will have to log in as the user again to see the changes. Always verify the changes took effect as expected.
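One way to verify from the root side, without logging in as the fenced user, is to read the attributes back with lsuser. This is a sketch under the assumption that lsuser -a prints a single line of the form "user attr=value attr=value"; limits_unlimited is a hypothetical helper, not an AIX command.

```shell
# Read one line of `lsuser -a data data_hard <user>` output on stdin
# and succeed only if both attributes are -1 (unlimited).
# Assumption: the output looks like "db2fenc1 data=-1 data_hard=-1".
limits_unlimited() {
    awk '{ for (i = 2; i <= NF; i++)
               if ($i == "data=-1" || $i == "data_hard=-1") ok++ }
         END { exit (ok == 2 ? 0 : 1) }'
}

# Example, as root on AIX:
# lsuser -a data data_hard db2fenc1 | limits_unlimited && echo "limits removed"
```

This only confirms what is stored for the user. Processes that are already running keep the limits they started with.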

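If you have the DBM CFG parameter KEEPFENCED set to YES (which is the default), you will need to stop and start the DB2 instance before the changes will take effect. With that setting, the existing db2fmp processes stay alive and keep the limits they were started with.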
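To check the current setting before scheduling a restart, the value can be pulled out of the DBM CFG output. This is a minimal sketch that assumes the parameter appears as "(KEEPFENCED) = value" in the output of db2 get dbm cfg; keepfenced_value is a helper name I made up.

```shell
# Read `db2 get dbm cfg` output on stdin and print the KEEPFENCED value.
# Assumption: the line looks like " Keep fenced process   (KEEPFENCED) = YES".
keepfenced_value() {
    awk -F'= *' '/\(KEEPFENCED\)/ { print $2 }'
}

# Example, as the instance owner:
# db2 get dbm cfg | keepfenced_value
#
# If it prints YES, recycle the instance in a maintenance window:
# db2stop
# db2start
```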

Note that all instructions here are for AIX because that is the OS where I ran into this issue.