uPortal Performance - Excessive Object Creation

Overview

Utilizing Java memory (i.e., heap) and performance analysis tools (e.g., YourKit), an overall profile of uPortal can be developed to provide insight into application server performance related problems. Most of our work has centered around heap-leaks in the effort to track down the out-of-memory situation currently being experienced by myRutgers in production. Finding heap-leaks is an iterative process. As part of that process, this page documents issues related to Excessive Object Creation (EOC).

Creating objects in Java is trivial. In fact, it can be described as being too easy to create objects. When a user logs into uPortal for the first time, over 40MB of objects are created (depending on channel content). The resulting number of objects is much lower than 40MB, but creating over 40MB of objects (most being temporary) in a few seconds is an expensive process. Though the Java Garbage Collector (GC) has made great strides in performance over the past few years, especially in regards to temporary object creation, reducing the large number of objects created can only increase overall application performance.

The following image shows a few hotspots captured from the uPortal 2.4.2 quickstart:

The following steps were performed to capture the data:

  1. start the server and attach YourKit profiler
  2. login once and exit
  3. have YourKit start collecting object counts
  4. login once and exit
  5. snapshot
  6. login once and exit
  7. snapshot
  8. compare snapshots

There is much data generated by YourKit, but the image shows the excessive number of objects created by two uPortal-code methods (addressed below). The interesting number is the 240k objects created to render the two pages (on login and logout). Of these, almost 110k were created by the two methods shown (approx 2MB out of 14.5MB).

isDebugEnabled()

Tracing is commonly found in application code. It will usually look like:

for (Enumerator e = users.getEnumeration(); e.hasMoreElements(); ) {
    User user = (User) e.nextElement();
    log.debug("user: " + user +
                   ", account: " + accounts.get(user) +
                   ", classes: " + classes.get(user));
    // business logic...
}

There is no debate surrounding the value in using a tracing framework in a production application. It is invaluable in debugging problems. However, production applications are not normally run in debug tracing mode. The amount of tracing information generated is overwhelming and would just be deleted.

But, even if the production application is running in non-debug tracing mode, the above code fragment will still go through the (possibly) expensive steps of creating String objects to pass into the debug() method who will drop the request.

A more GC-friendly approach is to wrap all log requests (not just debug) with a conditional:

for (Enumerator e = users.getEnumeration(); e.hasMoreElements(); ) {
    User user = (User) e.nextElement();
    if (log.isDebugEnabled()) {
        log.debug("user: " + user +
                       ", account: " + accounts.get(user) +
                       ", classes: " + classes.get(user));
    }
    // business logic...

By always using this strategy, the number of objects created, and therefore the amount of work for the GC, can be greatly reduced.

HTMLdtd.fromChar(int)

Entities are generally used in XML (and HTML) to allow special characters to exist within an XML document. For example:

<html>
       <body>
           This is my XML document:
               &amp;lt;xmldoc&amp;gt;
                    &amp;lt;data/&amp;gt;
               &amp;lt;/xmldoc&amp;gt;
       </body>
    </html>

If "<xmldoc>" had been placed in the body of the document, it would have caused a parsing exception since "<xmldoc>" is not a valid HTML element. The "<" and corresponding ">" are necessary for successful parsing.

For uPortal, the org.jasig.portal.serialize.HTMLdtd class is responsible for filtering all output XML and replacing entity characters with entities so only valid XML is output from uPortal. It does this by loading the HTMLEntities.res resource file which looks like:

# Character entity references for markup-significant
#
quot 34
amp 38
lt 60
gt 62
nbsp 160
#
# Character entity references for ISO 8859-1 characters
#
iexcl 161
cent 162
pound 163
...

When loading the file, all entities are stored in a HashMap keyed on an Integer object of the entity value. So, map.get(new Integer(160)) will return a String object value "nbsp" and when writing the character, would be replaced by " " (done by org.jasig.portal.serialize.BaseMarkupSerializer.printEscaped(int).

Though a simple algorithm, the map lookup is performed on every character of the generated file leaving uPortal. An excessive number of objects were being created since a new Integer(int)) was being performed on every character. Since one page is easily 25,000 characters, 25,000 Integer objects were being created to resolve entity mappings.

Our solution was to replace the Integer object keyed map with a String[], indexed by the entity value. The array grows dynamically on initialization to accommodate existing (and new) entity values. Though sparsely populated, it is a faster implementation for resolving an entity String value since now it is just an index into an array. No new Java objects are created during normal processing of entity replacement.

SubstitutionIntegerFilter.java

The org.jasig.portal.utils.SubstitutionIntegerFilter.java class is primarily used by org.jasig.portal.ResponseSubstitutionWrapper.getWriter() to replace current session tags with new tags in documents output by uPortal. It will be given a Writer object (either a WriteableOutputStreamWrapper or WriteableWriterWrapper class) who will receive the substituted integers and write the content to another Writer.

So, when data is written, it will follow the flow of:

SubstitutionWriter -> SubstitutionIntegerFilter -> WriteableWriterWrapper -> Writer

When writing output data, an "integer" value is buffered in an int[] in the SubstitutionIntegerFilter (while a pattern is matched and replaced). When full, it is flushed to the underlying writers. The WriteableWriterWrapper will allocate a new char [], then copy each int [] element to the new char [].

This WriteableWriterWrapper is overhead, consuming extra memory to convert each int to char.

Instead, the SubstitutionIntegerFilter was modified to be char based instead of int based. This removes the need for Writeable*Wrapper classes without changing functionality.

Files changed include:

  1. org.jasig.portal.utils.SubstitutionIntegerFilter.java
  2. org.jasig.portal.utils.SubstitutionServletOutputStream.java
  3. org.jasig.portal.utils.SubstitutionWriter.java

As part of this change, the following files are no longer used. They were not removed since they may be used again in the future:

  1. org.jasig.portal.utils.WriteableOutputStreamWrapper.java
  2. org.jasig.portal.utils.WriteableWriterWrapper.java

MediaManager.java

The MediaManager class is a utility class used for loading and resolving media, mime and serializer properties. Though configurable to load different property files, the code-base utilized the class to load property files:

  1. media.properties - HTTP request user-agent portion to client browser type
  2. mime.properties - client browser type to mime type
  3. serializer.properties - client browser type to serializer name

Though the property data does not change during production operation, instances of MediaManager's were being created. In addition to the objects created during utilization of a MediaManager there is the additional overhead of opening and parsing the three property files repeatedly. Therefore, this class became a target of performance optimization.

The org.jasig.portal.MediaManager was modified to be a singleton. The following classes were modified to make use of the singleton:

  1. org.jasig.portal.StandaloneChannelRenderer
  2. org.jasig.portal.UserInstance
  3. org.jasig.portal.channels.CError
  4. org.jasig.portal.channels.webproxy.CWebProxy
  5. org.jasig.portal.tools.ChannelServlet
  6. org.jasig.portal.wrsp.ChannelInstanceManager