Overview
Utilizing Java memory (i.e., heap) and performance analysis tools (e.g., YourKit), an overall profile of uPortal can be developed to provide insight into application server performance related problems. Most of our work has centered around heap-leaks in the effort to track down the out-of-memory situation currently being experienced by myRutgers in production. Finding heap-leaks is an iterative process. As part of that process, this page documents issues related to Excessive Object Creation (EOC).
Creating objects in Java is trivial. In fact, it can be described as being too easy to create objects. When a user logs into uPortal for the first time, over 40MB of objects are created (depending on channel content). The resulting number of objects is much lower than 40MB, but creating over 40MB of objects (most being temporary) in a few seconds is an expensive process. Though the Java Garbage Collector (GC) has made great strides in performance over the past few years, especially in regards to temporary object creation, reducing the large number of objects created can only increase overall application performance.
isDebugEnabled()
Tracing is commonly found in application code. It will usually look like:
for (Enumerator e = users.getEnumeration(); e.hasMoreElements(); ) { User user = (User) e.nextElement(); log.debug("user: " + user + ", account: " + accounts.get(user) + ", classes: " + classes.get(user)); // business logic... }
There is no debate surrounding the value in using a tracing framework in a production application. It is invaluable in debugging problems. However, production applications are not normally run in debug tracing mode. The amount of tracing information generated is overwhelming and would just be deleted.
But, even if the production application is running in non-debug tracing mode, the above code fragment will still go through the (possibly) expensive steps of creating String objects to pass into the debug()
method who will drop the request.
A more GC-friendly approach is to wrap all log requests (not just debug) with a conditional:
for (Enumerator e = users.getEnumeration(); e.hasMoreElements(); ) { User user = (User) e.nextElement(); if (log.isDebugEnabled()) { log.debug("user: " + user + ", account: " + accounts.get(user) + ", classes: " + classes.get(user)); } // business logic...
By always using this strategy, the number of objects created, and therefore the amount of work for the GC, can be greatly reduced.
HTMLdtd.fromChar(int)
Entities are generally used in XML (and HTML) to allow special characters to exist within an XML document. For example:
<html> <body> This is my XML document: &lt;xmldoc&gt; &lt;data/&gt; &lt;/xmldoc&gt; </body> </html>
If "<xmldoc>" had been placed in the body of the document, it would have caused a parsing exception since "<xmldoc>" is not a valid HTML element. The "<" and corresponding ">" are necessary for successful parsing.
For uPortal, the org.jasig.portal.serialize.HTMLdtd
class is responsible for filtering all output XML and replacing entity characters with entities so only valid XML is output from uPortal. It does this by loading the HTMLEntities.res resource file which looks like:
# Character entity references for markup-significant # quot 34 amp 38 lt 60 gt 62 nbsp 160 # # Character entity references for ISO 8859-1 characters # iexcl 161 cent 162 pound 163 ...
When loading the file, all entities are stored in a HashMap keyed on an Integer object of the entity value. So, map.get(new Integer(160))
will return a String object value "nbsp" and when writing the character, would be replaced by " " (done by org.jasig.portal.serialize.BaseMarkupSerializer.printEscaped(int)
.
Though a simple algorithm, the map lookup is performed on every character of the generated file leaving uPortal. An excessive number of objects were being created since a new Integer(int))
was being performed on every character. Since one page is easily 25,000 characters, 25,000 Integer objects were being created to resolve entity mappings.
Our solution was to replace the Integer object keyed map with a String[], indexed by the entity value. The array grows dynamically on initialization to accommodate existing (and new) entity values. Though sparsely populated, it is the faster implementation for resolving an entity String value since now it is just an index into an array. Now new Java objects are created during normal processing of entity replacement.
SubstitutionIntegerFilter.java
The org.jasig.portal.utils.SubstitutionIntegerFilter.java
class is primarily used by org.jasig.portal.ResponseSubstitutionWrapper.getWriter()
to replace current session tags with new tags in documents output by uPortal. It will be given a Writer
object (either a WriteableOutputStreamWrapper
or WriteableWriterWrapper
class) who will receive the substituted integers and write the content to another Writer
.
So, when data is written, it will follow the flow of:
SubstitutionWriter -> SubstitutionIntegerFilter -> WriteableWriterWrapper -> Writer
When writing output data, an "integer" value is buffered in an int[]
in the SubstitutionIntegerFilter
(while a pattern is matched and replaced). When full, it is flushed to the underlying writers. The WriteableWriterWrapper
will allocate a new char []
, then copy each int []
element to the new char []
.
This WriteableWriterWrapper
is overhead, consuming extra memory to convert each int
to char
.
Instead, the SubstitutionIntegerFilter
was modified to be char
based instead of int
based. This removes the need for Writeable*Wrapper
classes without changing functionality.
Files changed include:
org.jasig.portal.utils.SubstitutionIntegerFilter.java
org.jasig.portal.utils.SubstitutionServletOutputStream.java
org.jasig.portal.utils.SubstitutionWriter.java
As part of this change, the following files are no longer used. They were not removed since they may be used again in the future:
org.jasig.portal.utils.WriteableOutputStreamWrapper.java
org.jasig.portal.utils.WriteableWriterWrapper.java
MediaManager.java
The MediaManager
class is a utility class used for loading and resolving media, mime and serializer properties. Though configurable to load different property files, the code-base utilized the class to load property files:
media.properties
- HTTP request user-agent portion to client browser typemime.properties
- client browser type to mime typeserializer.properties
- client browser type to serializer name
Though the property data does not change during production operation, instances of MediaManager
's were being created. In addition to the objects created during utilization of a MediaManager
there is the additional overhead of opening and parsing the three property files repeatedly. Therefore, this class became a target of performance optimization.
The org.jasig.portal.MediaManager
was modified to be a singleton. The following classes were modified to make use of the singleton:
org.jasig.portal.StandaloneChannelRenderer
org.jasig.portal.UserInstance
org.jasig.portal.channels.CError
org.jasig.portal.channels.webproxy.CWebProxy
org.jasig.portal.tools.ChannelServlet
org.jasig.portal.wrsp.ChannelInstanceManager