Diagnosing uPortal on a Production System
You will need to be able to check the following things:
portal.log
The portal.log file is written to the directory that was the current directory when you started the JVM. To place it somewhere specific, configure /properties/Logger.properties.
catalina.out
This file is located in the logs directory under Tomcat's install directory.
web server request log
Apachectl status
Running apachectl status prints information about Apache's client connections. Note that having every connection slot in use does not by itself mean you need to increase the limit. Usually it indicates an underlying problem with Tomcat not responding quickly enough. For instructions on how to enable this in Apache, see: http://httpd.apache.org/docs/2.0/mod/mod_status.html
Apache Server Status for localhost

Server Version: Apache/2.0.46 (Red Hat)
Server Built: Aug 1 2006 09:25:45
----------------------------------------------------------------------
Current Time: Monday, 18-Sep-2006 07:06:26 CDT
Restart Time: Sunday, 17-Sep-2006 04:02:54 CDT
Parent Server Generation: 4
Server uptime: 1 day 3 hours 3 minutes 31 seconds
18 requests currently being processed, 8 idle workers

WWWWWW_WWW..W._W...W_WW___W__..............................W....
.W...W..........................................................
................................................................
................................................................

Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process
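The scoreboard can be tallied by script instead of by eye. The sketch below defines a hypothetical helper, scoreboard_summary, that counts busy workers, idle workers, and open slots in a scoreboard string. The function name, and the choice to treat every character other than "_" and "." as busy, are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarize an Apache mod_status scoreboard string.
# Assumption: any character other than "_" (waiting for connection) and
# "." (open slot) is counted as a busy worker.
scoreboard_summary() {
    board=$1
    total=${#board}
    # Keep only the characters of interest, then measure what is left.
    idle_only=$(printf '%s' "$board" | tr -cd '_')
    open_only=$(printf '%s' "$board" | tr -cd '.')
    idle=${#idle_only}
    open=${#open_only}
    busy=$((total - idle - open))
    echo "busy=$busy idle=$idle open=$open"
}

# Example: the first scoreboard row from the status page above.
scoreboard_summary 'WWWWWW_WWW..W._W...W_WW___W__..............................W....'
```

If the busy count stays near the total for long stretches, that fits the pattern described above, where Tomcat is not answering quickly enough.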
garbage collection log
You will need to set a JVM option to enable the garbage collection log. The following is the setting we used for Java 1.4.1.
export CATALINA_OPTS="-Xms256m -Xmx512m -Xloggc:/usr/local/tomcat4/logs/tomcat_gc.log"
Example snippet (your output may look different depending on JVM and options):
262274.805: [GC 256470K->242332K(260224K), 0.0389740 secs]
262286.923: [GC 258588K->242892K(260224K), 0.0208520 secs]
262299.466: [GC 259141K->245443K(261760K), 0.0377540 secs]
262299.504: [Full GC 245443K->69440K(261760K), 0.5614360 secs]
262303.010: [GC 85685K->70542K(260224K), 0.0092740 secs]
262303.191: [GC 86798K->71190K(260224K), 0.0160590 secs]
262317.124: [GC 87445K->73472K(260224K), 0.0302200 secs]
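A log in this format can be summarized with awk to see how much time goes to minor and full collections. This is a minimal sketch, assuming one collection per line and the pause time in the second-to-last field, as in the snippet above. The helper name gc_summary is hypothetical, and other JVMs or options may produce a layout this does not parse.

```shell
#!/bin/sh
# Hypothetical helper: summarize a -Xloggc log in the format shown above.
# Assumption: each collection is on its own line and the pause time is the
# second-to-last whitespace-separated field ("0.5614360 secs]").
gc_summary() {
    awk '
        /\[Full GC / { full++; full_secs += $(NF - 1); next }
        /\[GC /      { minor++; minor_secs += $(NF - 1) }
        END {
            printf "minor=%d minor_secs=%.3f full=%d full_secs=%.3f\n",
                   minor, minor_secs, full, full_secs
        }
    ' "$@"
}

# Example using three lines from the snippet above.
# prints: minor=2 minor_secs=0.047 full=1 full_secs=0.561
gc_summary <<'EOF'
262299.466: [GC 259141K->245443K(261760K), 0.0377540 secs]
262299.504: [Full GC 245443K->69440K(261760K), 0.5614360 secs]
262303.010: [GC 85685K->70542K(260224K), 0.0092740 secs]
EOF
```

To run it against a real log, pass the file name instead of a here-document, for example gc_summary /usr/local/tomcat4/logs/tomcat_gc.log.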
thread dump of the jvm
On Linux you can use the following to create a thread dump of all the running Java threads.
The thread dump will be written to Tomcat's catalina.out file.
This little script assumes you have only one process called java.
# Force the JVM to write a stack dump to stdout, which ends up in catalina.out.
# pgrep looks up the PID itself, so no ps pipeline is needed.
PID=$(pgrep java | head -n 1)
kill -3 "$PID"
On Windows you can hit CTRL-BREAK in Tomcat's console window to get a thread dump.
cpu usage
On Linux you can look at the load average and the CPU percent used using the top command.
If these are high, don't panic. Test how long it takes to load a page in uPortal.
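One way to time a page load from the command line is curl's write-out option. The sketch below assumes curl is installed. The helper name time_page and the URL in the usage comment are placeholders, not part of uPortal.

```shell
#!/bin/sh
# Hypothetical helper: print the total time, in seconds, that curl takes to
# fetch one page. The response body is discarded.
# Assumption: curl is installed and the URL passed in is your own portal's.
time_page() {
    curl -s -o /dev/null -w '%{time_total}\n' "$1"
}

# Usage (the URL is a placeholder, not a real host):
#   time_page http://portalsvr.foo.edu/uPortal/
```

Running it a few times in a row gives a rough sense of whether the high load is actually slowing down page delivery.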
network connections
Network connections can help in diagnosing problems with database connection pools.
netstat -e >> file.out
Output example:
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       User   Inode
tcp        0      0 portalsvr.foo.edu:48787 oraclesvr1.foo.edu:1521 ESTABLISHED tomcat 3065471
tcp        0      0 portalsvr.foo.edu:45708 oraclesvr2.foo.edu:1521 ESTABLISHED tomcat 3040316
tcp        0      0 portalsvr.foo.edu:45715 oraclesvr2.foo.edu:1521 ESTABLISHED tomcat 3040360
... snip...
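To see how many connections are open to each database server, the netstat output can be grouped by foreign address. This is a minimal sketch, assuming the column layout shown above, with the foreign address in field 5 and the state in field 6. The helper name count_established is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count ESTABLISHED TCP connections per foreign address
# from `netstat -e`-style output (read from stdin or from named files).
# Assumption: columns are laid out as in the sample above.
count_established() {
    awk '$1 ~ /^tcp/ && $6 == "ESTABLISHED" { n[$5]++ }
         END { for (addr in n) print n[addr], addr }' "$@" |
        sort -k1,1nr -k2,2
}

# Example using the three connections from the sample above.
# prints:
#   2 oraclesvr2.foo.edu:1521
#   1 oraclesvr1.foo.edu:1521
count_established <<'EOF'
tcp 0 0 portalsvr.foo.edu:48787 oraclesvr1.foo.edu:1521 ESTABLISHED tomcat 3065471
tcp 0 0 portalsvr.foo.edu:45708 oraclesvr2.foo.edu:1521 ESTABLISHED tomcat 3040316
tcp 0 0 portalsvr.foo.edu:45715 oraclesvr2.foo.edu:1521 ESTABLISHED tomcat 3040360
EOF
```

On a live system the same function can be fed directly, as in netstat -e | count_established. A count that sits at the configured maximum of a connection pool may point to pool exhaustion.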
memory usage reported by the OS
Memory usage reported by the OS can be misleading when running Java. Don't panic: pay more attention to the garbage collection log than to the memory reported by the OS.
Tips:
Leave debug information compiled in. In our performance tests (at Texas Tech University) we found that removing debug information only increased performance by a small factor, yet having the line number information helped diagnose problems (even performance problems) much more quickly.
Set the LogLevel to INFO. The debug info is too verbose for production. In production you should be looking at error level and exceptions most of the time.
Allow Apache to serve static content such as images and HTML.
Tune your SQL indexes.