Diagnosing uPortal on a Production System

You will need to be able to check the following things:

portal.log

The portal.log file is written to the directory that was the current directory when you started your JVM. Alternatively, you can configure /properties/Logger.properties to place it in a specific location.

catalina.out

This file is located in the logs directory under Tomcat's install directory.

web server request log

apachectl status

Running apachectl status prints information about Apache's client connections. Note that having every worker slot busy does not by itself mean you need to raise the limit. Usually it indicates an underlying problem with Tomcat not responding quickly enough. For instructions on how to enable the status report in Apache, see: http://httpd.apache.org/docs/2.0/mod/mod_status.html

Apache Server Status for localhost

   Server Version: Apache/2.0.46 (Red Hat)

   Server Built: Aug 1 2006 09:25:45

     ----------------------------------------------------------------------

   Current Time: Monday, 18-Sep-2006 07:06:26 CDT

   Restart Time: Sunday, 17-Sep-2006 04:02:54 CDT

   Parent Server Generation: 4

   Server uptime: 1 day 3 hours 3 minutes 31 seconds

   18 requests currently being processed, 8 idle workers

 WWWWWW_WWW..W._W...W_WW___W__..............................W....
 .W...W..........................................................
 ................................................................
 ................................................................

   Scoreboard Key:
   "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
   "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
   "C" Closing connection, "L" Logging, "G" Gracefully finishing,
   "I" Idle cleanup of worker, "." Open slot with no current process

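The scoreboard is easier to watch over time if you reduce it to a few numbers. The following is a minimal sketch, not part of Apache or uPortal: scoreboard_summary is a made-up helper name, and it assumes you feed it only the scoreboard rows (the lines made of W, _ and . characters) on standard input. It treats "_" as idle, "." as an open slot, and every other key character as busy.

```shell
# Hypothetical helper: summarize an Apache mod_status scoreboard.
# Reads scoreboard rows on stdin, prints "busy=N idle=N open=N".
scoreboard_summary() {
    awk '
    {
        gsub(/[ \t]/, "")                 # drop indentation
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "_")      nidle++    # waiting for a connection
            else if (c == ".") nopen++    # slot with no process
            else               nbusy++    # any other state counts as busy
        }
    }
    END { printf "busy=%d idle=%d open=%d\n", nbusy, nidle, nopen }'
}
```

Fed the four scoreboard rows from the sample above, the busy and idle counts come out to 18 and 8, which matches the "18 requests currently being processed, 8 idle workers" line.
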
garbage collection log

You will need to set a JVM option to enable the garbage collection log (-Xloggc names the log file). The following setting is what we used for Java 1.4.1:

export CATALINA_OPTS="-Xms256m -Xmx512m -Xloggc:/usr/local/tomcat4/logs/tomcat_gc.log"

Example snippet (your output may look different depending on JVM and options):

262274.805: [GC 256470K->242332K(260224K), 0.0389740 secs]
262286.923: [GC 258588K->242892K(260224K), 0.0208520 secs]
262299.466: [GC 259141K->245443K(261760K), 0.0377540 secs]
262299.504: [Full GC 245443K->69440K(261760K), 0.5614360 secs]
262303.010: [GC 85685K->70542K(260224K), 0.0092740 secs]
262303.191: [GC 86798K->71190K(260224K), 0.0160590 secs]
262317.124: [GC 87445K->73472K(260224K), 0.0302200 secs]
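
A quick way to see how much time the JVM spends collecting is to total the pause times in the log. The sketch below is our own illustration and assumes the log lines look like the snippet above, where the pause in seconds is the next-to-last whitespace-separated field. gc_summary is a hypothetical helper name, and other JVMs or options may need a different pattern.

```shell
# Hypothetical helper: count collections and total pause time in a GC log.
# Assumes lines like: 262299.504: [Full GC 245443K->69440K(261760K), 0.5614360 secs]
gc_summary() {
    awk '
    /\[Full GC/ { full++;  full_secs  += $(NF-1); next }   # full collections
    /\[GC/      { minor++; minor_secs += $(NF-1) }         # minor collections
    END {
        printf "minor=%d (%.3fs) full=%d (%.3fs)\n", minor, minor_secs, full, full_secs
    }'
}

# Usage, with the log path from the CATALINA_OPTS example:
#   gc_summary < /usr/local/tomcat4/logs/tomcat_gc.log
```

For the seven sample lines above this prints minor=6 (0.153s) full=1 (0.561s). Frequent Full GC lines, or a heap that stays near its maximum right after a Full GC, are usually the things worth a closer look.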

thread dump of the JVM

On Linux you can use the following to create a thread dump of all the running Java threads.
The thread dump will be written to Tomcat's catalina.out file.
This little script assumes you have only one process called java.

#force the JVM to do a stack dump to stdout which ends up in catalina.out
#pgrep finds the matching PIDs by itself; take the first one
PID=$(pgrep java | head -n 1)
kill -3 "$PID"

On Windows you can hit CTRL-BREAK in Tomcat's console window to get a thread dump.

cpu usage

On Linux you can look at the load average and the CPU percent used using the top command.
If these are high, Don't Panic: test how long it takes to load a page in uPortal.

network connections

Network connections can help in diagnosing problems with database connection pools.

netstat -e >> file.out

Output example:

Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       User       Inode
tcp        0      0 portalsvr.foo.edu:48787     oraclesvr1.foo.edu:1521     ESTABLISHED tomcat     3065471
tcp        0      0 portalsvr.foo.edu:45708     oraclesvr2.foo.edu:1521     ESTABLISHED tomcat     3040316
tcp        0      0 portalsvr.foo.edu:45715     oraclesvr2.foo.edu:1521     ESTABLISHED tomcat     3040360

... snip...
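
To compare the number of open database connections against your pool size, it helps to count ESTABLISHED connections per remote address. This is a sketch under the assumption that the output has the Linux netstat column layout shown above (protocol in column 1, foreign address in column 5, state in column 6). connections_by_peer is a name invented for this example.

```shell
# Hypothetical helper: count ESTABLISHED TCP connections per foreign address.
# Reads netstat output on stdin, prints "COUNT ADDRESS" lines, busiest first.
connections_by_peer() {
    awk '$1 == "tcp" && $6 == "ESTABLISHED" { count[$5]++ }
         END { for (peer in count) print count[peer], peer }' |
    sort -rn
}

# Usage:
#   netstat -e | connections_by_peer
```

For the three sample rows above it prints 2 for oraclesvr2.foo.edu:1521 and 1 for oraclesvr1.foo.edu:1521.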

memory usage reported by the OS

Memory usage reported by the OS can be misleading when running Java. Don't Panic: pay more attention to the garbage collection log than to the memory reported by the OS.

Tips:

Leave debug information compiled in. In our performance tests (at Texas Tech University) we found that removing debug information only increased performance by a small factor, yet having the line number information helped diagnose problems (even performance problems) much more quickly.

Set the LogLevel to INFO. DEBUG output is too verbose for production, where you should mostly be looking at error-level messages and exceptions.

Let Apache serve static content such as images and HTML.

Tune your SQL indexes.