...
Even though we are running Tomcat 5, Apache Tomcat 6 by Wrox has been particularly helpful in tuning our Tomcat 5.5.25 uPortal JVM (just make sure you double-check the syntax; I ran into one instance where it is slightly different). I'm sure their Tomcat 5 book is good too, but what makes this book so good is that, toward the back, it includes a good chapter on JVM tuning. I won't try to duplicate the chapter here by any means, but I learned several things that I suspected but didn't know for sure. By default Tomcat is set up in development mode. What this means is that by default it does far more page-compilation work than necessary: in fact, it is set to check each JSP for changes, and recompile it if needed, as often as every 4 seconds. Apache's suggested production settings say that setting modificationTestInterval to a high value will improve performance a lot.
Heartbeat Monitoring
We perform heartbeat monitoring in several ways. We use an online service called Siteuptime to hit our portal every few minutes and notify us when our JVMs go down and when they come back up.
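For anyone who wants a local check alongside a hosted service, here is a minimal Python sketch of the same idea: request the portal and report only when the state changes. It is not how Siteuptime works internally. The URL, the function names and the "HTTP 200 means up" rule are all assumptions made for illustration.

```python
import contextlib
import urllib.request

# Hypothetical portal address; replace with your own.
PORTAL_URL = "http://portal.example.edu/uPortal/"


def is_up(url, fetch=urllib.request.urlopen, timeout=10):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with contextlib.closing(fetch(url, timeout=timeout)) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def transition(was_up, now_up):
    """Return "down" or "up" when the state changes, otherwise None."""
    if was_up and not now_up:
        return "down"
    if now_up and not was_up:
        return "up"
    return None
```

Run from cron every few minutes, a script like this would store the previous result (for example in a small file) and send a notification only when transition() returns something other than None.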
Health Monitoring
At Heartland Community College, we use a free app called [Lambda Probe|http://www.lambdaprobe.org/d/index.htm] to monitor our heap. It's just another application that can sit alongside your portal and portlets. [View a live demo|http://demo.lambdaprobe.org/probe/index.htm] by using "demo" for both username and password. Essentially it is a glorified Tomcat manager application that lets you see stats on database connection pools, threads, HTTP and AJP connections, and memory usage. Obviously it isn't going to be much help when Tomcat hangs, because it is itself dependent on Tomcat, but it is a great tool for performance monitoring. Just drop the application into your webapps directory and, by default, any account with the manager role in tomcat-users.xml will be able to log in. Of course, it is very easy to change the role used for authentication or to configure the application to use a custom [tomcat realm|http://tomcat.apache.org/tomcat-5.5-doc/realm-howto.html] to authenticate against LDAP. This application has given us the additional information we needed to tune our JVMs. Be sure to add the Java option -Dcom.sun.management.jmxremote to Tomcat to enable the detailed memory breakdown.
Update: PsiProbe (mentioned on the parent page) is the community-driven fork of Lambda Probe.
We are also running a custom Linux script that runs automatically every 30 minutes and screen-scrapes the Tomcat manager app status page for memory levels. We can then graph this data to get trend information that helps us predict when we should restart the JVM.
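The scraping step can be sketched in Python roughly as follows. This is not our actual script. It assumes the status page reports the JVM heap on one line in the form "Free memory: … MB Total memory: … MB Max memory: … MB"; check the page your Tomcat version produces and adjust the pattern if it differs. The function names are made up for the example.

```python
import re

# Assumed layout of the JVM memory line on the manager status page.
MEMORY_PATTERN = re.compile(
    r"Free memory:\s*([\d.]+)\s*MB\s*"
    r"Total memory:\s*([\d.]+)\s*MB\s*"
    r"Max memory:\s*([\d.]+)\s*MB"
)


def parse_memory(status_html):
    """Return (free, total, max) in MB, or None if the line is not found."""
    match = MEMORY_PATTERN.search(status_html)
    if match is None:
        return None
    free_mb, total_mb, max_mb = (float(group) for group in match.groups())
    return free_mb, total_mb, max_mb


def used_percent(free_mb, total_mb, max_mb):
    """Heap currently in use, as a percentage of the maximum heap."""
    return 100.0 * (total_mb - free_mb) / max_mb
```

Fetching the page itself is left out, since it needs credentials for an account with the manager role. Appending a timestamp and the used_percent() value to a file on each run gives a series that is easy to graph.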