March 2005 uPortal Developers Meeting Minutes

Revision of April 24, 2005, 23:45 EDT

 

Monday, March 21

 

 


 

09:00am

Introduction

@Eric Dalquist

09:15am

uPortal 2.4 Performance and Memory

@ScottS

09:55am

uPortal 2.4.x, Future Releases, Branch Maintenance

@Andrew Petro

10:10am

uPortal 2.x and Spring

@Andrew Petro

10:30am

Morning Break

 

10:45am

DLM Integration

@Mark Boyd

11:25am

uPortal 2.5 Release Issues

@Eric Dalquist

12:00pm

Lunch Break

 

01:00pm

Clearinghouse Involvement and Direction

@Eric Dalquist

01:20pm

Documentation

@John Fereira

01:50pm

Sakai Integration

@Andrew Petro and Chuck Severance

02:40pm

Afternoon Break

 

02:45pm

uPortal 2.6

@Andrew Petro

03:25pm

Permissions Framework

@Keith Stacks

04:00pm

End of day

 

 

 

 

 

Tuesday, March 22

 

09:00am

uPortal 3 Introduction

@Peter Kharchenko

09:10am

Release QA

@Eric Dalquist

10:00am

Morning Break

 

10:20am

Rendering Components and Contexts

@Peter Kharchenko

11:20am

Pluto Integration

@Eric Dalquist

11:45am

WSRP Producer / Consumer

@Michael Ivanov

12:20pm

Lunch Break

 

01:20pm

Portlet Development

@Eric Dalquist/@Cris Holdorph

02:20pm

uPortal Leadership Responsibilities

@Eric Dalquist

03:10pm

Afternoon Break

 

03:25pm

Layout Managers

@Peter Kharchenko

03:35pm

Portal Navigation, Back Button Support

@Peter Kharchenko

03:45pm

Authentication

@Peter Kharchenko

03:55pm

uPortal 2 Service Integration

@Peter Kharchenko

04:00pm

End of day

 

uPortal 2.4 performance and memory

by Scott Battaglia, Rutgers (on behalf of Bill Thompson)

(These notes are based in part on the slides, which have more details.)

we have 10,000 unique users per day, with over 30,000 total logins. We load balance per session to 4 machines.
we are seeing a bad memory leak.
eventually uportal stops working properly and hangs
we didn't see this in QA
seen in production since November 2004

Yale, U LA Lafayette, U CA Irvine, Cornell also see this

we bounce portal when available memory goes below 5%
– may be too aggressive
– people lose their session
– if we don't catch it soon enough Apache also goes down

this just makes it barely livable

fixes they've done so far:
– removed caching of IPerson

– CError and CSecureInfo now pass events to wrapped channels (they did not before; this meant the channels would hang on to objects)
– restrict access to ChannelFactory's channel cache, and sync instantiateChannel method
– guest sessions created on time out
– AbstractMultithreadedChannels were not cleaning out their channel state maps
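
The last fix in the list above (per-session state maps that were never cleaned out) can be illustrated with a minimal sketch. It is not the Rutgers patch and does not use uPortal's classes: PerSessionStateChannel, render, and sessionDone are names invented here, assuming only that the portal tells the channel when a session ends.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a multithreaded channel; these names are
// invented for illustration and are not uPortal's classes.
class PerSessionStateChannel {
    // One state object per session id. Every entry is a strong reference,
    // so anything left in this map can never be garbage collected.
    private final Map<String, StringBuilder> stateBySession = new ConcurrentHashMap<>();

    // Called on each request; lazily creates state for a new session.
    void render(String sessionId, String input) {
        stateBySession.computeIfAbsent(sessionId, id -> new StringBuilder()).append(input);
    }

    // Called when the portal reports that the session ended (logout or
    // timeout). Skipping this call leaves the entry reachable forever.
    void sessionDone(String sessionId) {
        stateBySession.remove(sessionId);
    }

    int trackedSessions() {
        return stateBySession.size();
    }

    public static void main(String[] args) {
        PerSessionStateChannel channel = new PerSessionStateChannel();
        channel.render("session-1", "hello");
        channel.sessionDone("session-1");
        System.out.println("sessions still tracked: " + channel.trackedSessions());
    }
}
```

The point of the sketch is only that removal has to be an explicit step tied to the end of the session; the map itself will never shrink on its own.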

3 months later –
– leaks still exist
– the search continues

now looking for more leaks

– retooled load tests
– production snapshots
– incremental updates
– reaffirm that loadtest system matches production system

retooling: attempt to mimic more closely what a user does in prod
– use more custom layouts in test environment
– fewer people logging out (only 25 percent formally do that)
– hitting more popular channels more aggressively

try to match the production throughput
– imitate average user session length
– determine rate at which users access system

bought test system with same specs as prod system
– ensure database optimizations are the same
– make sure configuration is the same

take production snapshots
– JVM heap size initially 2 GB
lowered JVM heap size to 128 MB on that machine; this allows us to compare snapshots on our developer machines, which have only 1 GB (larger snapshots can't be compared there)

when memory reaches 10% take that one production server out of load balancing rotation
garbage collect
capture snapshot
wait past session timeout (15 minutes)
take another snapshot
compare

– what objects are still in memory
– how much memory they are using

– how much memory things they reference are using
– they are using YourKit to take the snapshots and compare them

– YourKit reports incoming and outgoing references
– totals for objects of each type
– how much memory they consume
– allows us to compare snapshots, showing the details of each object type

understanding the snapshots

– name
– objects
– shallow size (object itself)
– retained size (plus referenced objects)

can trace the path to the root of the garbage collector
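
The comparison idea can be sketched in a few lines. YourKit does this work itself; the code below is only an illustration on made-up per-class instance counts, and SnapshotComparison and percentRetained are hypothetical names. A class whose instances mostly survive past the session timeout is a candidate for a closer look.

```java
import java.util.Map;
import java.util.TreeMap;

class SnapshotComparison {
    // For every class seen in the first snapshot, returns the percentage of
    // its instances still present in the second snapshot (integer division).
    static Map<String, Long> percentRetained(Map<String, Long> first, Map<String, Long> second) {
        Map<String, Long> retained = new TreeMap<>();
        for (Map.Entry<String, Long> entry : first.entrySet()) {
            long before = entry.getValue();
            long after = second.getOrDefault(entry.getKey(), 0L);
            retained.put(entry.getKey(), before == 0 ? 0L : after * 100 / before);
        }
        return retained;
    }

    public static void main(String[] args) {
        // Instance counts are invented sample data, not real measurements.
        Map<String, Long> first = new TreeMap<>();
        first.put("ChannelRenderer", 500L);
        first.put("java.lang.String", 10000L);
        Map<String, Long> second = new TreeMap<>();
        second.put("ChannelRenderer", 480L);
        second.put("java.lang.String", 2000L);
        System.out.println(percentRetained(first, second));
    }
}
```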

another practice to get a handle on the problem: incremental updates:

instead of doing massive patches (either Rutgers' local fixes, or from JA-SIG), we do one little fix at a time, first on the loadtester, then in production
(Scott gave a couple examples of the minimal fixes)

there is a flurry of discussion on JASIG-DEV about memory issues
– one thing being discussed is to backport the concurrent threadpool
– there are two other issues in the discussion as potential causes

The Concurrent Thread Library, because the threadpool has some problems.

Aaron Hamid at Cornell wrote a patch to replace the thread library

Rutgers manually applied the patch to 2.4.1 and put it into production.
This was during spring break.
It seems to have improved performance rather than fixing the leaks.

AuthorizationImpl
– retains references to principals
– no explicit removal of principal from cache
– whenever a new principal is created it copies the map

– Rutgers has a patch for this that they are going to loadtest then place in production

– then they will put it in the JA-SIG CVS HEAD

– introduced a CacheFactory (generic), an interface. they are trying out WhirlyCache
– -- allows for declaring cache settings and policy in XML
– -- allows for fine-grained caching strategies for each part of uPortal.
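
As a rough illustration of why a bounded cache with explicit removal helps here, the sketch below puts a small cache interface in front of an LRU implementation. It is neither the Rutgers CacheFactory nor the WhirlyCache API; SimpleCache and LruCache are names made up for this example, and the size limit is passed in code rather than declared in XML.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface; not the actual CacheFactory or WhirlyCache API.
interface SimpleCache<K, V> {
    V get(K key);
    void put(K key, V value);
    void remove(K key);
    int size();
}

// Least-recently-used cache with a hard upper bound on entries.
class LruCache<K, V> implements SimpleCache<K, V> {
    private final Map<K, V> entries;

    LruCache(final int maxEntries) {
        // accessOrder=true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict once the bound is exceeded, so the map cannot grow forever.
                return size() > maxEntries;
            }
        };
    }

    public synchronized V get(K key) { return entries.get(key); }
    public synchronized void put(K key, V value) { entries.put(key, value); }
    public synchronized void remove(K key) { entries.remove(key); }
    public synchronized int size() { return entries.size(); }

    public static void main(String[] args) {
        SimpleCache<String, String> principals = new LruCache<>(2);
        principals.put("alice", "principal-a");
        principals.put("bob", "principal-b");
        principals.put("carol", "principal-c"); // evicts the eldest entry
        System.out.println("entries cached: " + principals.size());
    }
}
```

Callers that know a principal is no longer needed can still call remove directly; the bound is a safety net for the cases where nobody does.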

General principles

– implementing a finalizer should be a last resort
– consider using an open source solution

– be aware of proper caching (where it is needed vs. not, weak and soft references, etc.)
– avoid circular references whenever possible

Adam: (heard this at last year's JavaOne) Now that Sun has 3 different heaps in the JVM, caching is not as important as it used to be for temporary objects.

Eric: Rutgers is contributing things to HEAD. Looks good. The rest of us should be careful not to make the same mistakes.

Peter K.: One problem is you can't replicate this well? QA does not find it? Do you do snapshots in QA? You see ChannelRenderer instances are being held?

Scott: Yes, by the finalizer. We are probably going to change so the thread pool uses explicit calls rather than finalizers.
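
A minimal sketch of the "explicit calls rather than finalizers" idea follows. It uses java.util.concurrent from JDK 5, which is an assumption on our part and not necessarily what the patch uses; ExplicitPoolRelease and renderAll are hypothetical names. The pool is released in a finally block, so cleanup never waits on the garbage collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ExplicitPoolRelease {
    // Renders each named channel on a small pool and returns the results in order.
    static List<String> renderAll(List<String> channelNames) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String name : channelNames) {
                pending.add(pool.submit(() -> "rendered:" + name));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                // Bounded wait, so a hung channel cannot block forever.
                results.add(future.get(5, TimeUnit.SECONDS));
            }
            return results;
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
            }
            throw new IllegalStateException("rendering failed", e);
        } finally {
            // Explicit release: no finalize() method is involved.
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("news");
        names.add("bookmarks");
        System.out.println(renderAll(names));
    }
}
```

A real portal would keep one long-lived pool and shut it down when the web application stops; the per-call pool here only keeps the example short.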

Peter K.: How many requests are necessary to saturate the server? Find that out. Use 'last render time' per user to get an equivalent to 'logout'.

Scott: We have four machines in the production pool.

Peter K: Be sure to track how often channels timeout. When you have a huge memory most of that is objects that can be GC'd. But if you have only 128M for your experiment, there will be very little. Most of Peter's comments here relate to trying to figure out why the QA machine does not experience the problem as clearly, (because if it did the problem would be easier to solve).

Eric and Adam questioned whether 128M is enough to realistically run the portal. Make the heap bigger and see if you can fit it on your developer system. Scott replied that 128M is big enough to help diagnose the memory leaks, which is their purpose in the runs in question.

Adam: You should try visualgc, free from Sun (it is now available only for 1.5, no longer for 1.4).

Scott: (Adam asked what JVM and parameters they are using.) Solaris; don't know if we're running in 64-bit. They are using the default GC algorithm, and using Apache. Someone asked how many AJP threads; Scott didn't know. Asked whether they are overriding the default stack size per thread (the -Xss JVM option), Scott didn't know.

Nick from U. Wisconsin – been looking into storing sessions into database (for another application), but Andrew says the uportal session is not fully serializable AND we need more stuff than just the session (static caches etc.), so this isn't doable.

Nick gave a few recommendations

uPortal 2.4.x, Future Release, Branch maintenance

by Andrew Petro, Yale

(This session regards how the uPortal project uses its CVS archive)

(The notes are taken in part from the Slides.)

This will cover
– what we currently do
– our deployer community
– what we could do

what we've done

– patches for only the current release
– in up2: One active patches branch, one Head.

before 2.4 was released, there was:

– the current release (2.3.4)
– the 2-3 patches branch (would become 2.3.5)
– the HEAD – would become uportal 2.4
– the SANDBOX – uportal 3 pre-M1

the 2.4 release

– head was tagged as 2.4

– head continues and becomes uPortal 2.5
– created branch for 2-4-patches
– stopped adding to 2-3-patches, and stopped releasing further 2-3 releases.

right now:

– the current release (uportal 2.4.2)
– the 2-4-patches branch
– the HEAD
– the SANDBOX

Andrew made the point that a lot of schools are deployed on older releases.

– we need timely new releases adding functionality we discover we need and can achieve
– high quality of releases
– patches and support for older versions
– -- there are still questions, problems in uP 2.1
– -- back-porting of fixes.

Andrew wants to keep multiple active patches branches.

Problem – too many places where progress can be made, the small developer community becomes diluted.

Andrew says – the idea would be, the older patches branches remain open, but commits and releases will be only in response to contributions, rather than official.

Patches branches would dry up naturally when the contributor community has moved on.

Adam: Closing older branches was an inducement to encourage people to keep up to date.

Jim: Many sites are highly modified, such that it's a big chore to move up. The original decision was a pragmatic one; it was all that he (Ken) could manage to do.

Peter K: The way we did this encourages development and fixes on newer releases. If we keep the older branches open how would we encourage people to commit to later ones? (especially without people riding herd on it like Ken did). If someone commits a patch to an old branch, the committer should very clearly state that in a particular place. (so that others can consider the patch for later branches) Open a Jira issue on the head and latest patches?

John F: The problem would be, how would they test the newer release if they have no instance of it? It may actually already be fixed in a later release, and such a person might not know.

Andrew: The most appropriate way to track patches against older releases is to put it in CVS.

Spring and uPortal 2.x

by Andrew Petro, Yale

(These notes are based in part on the Slides, which have more details.)

The problem that the Spring framework effort addresses is dependencies.

We developers are addicted to creating dependencies. The question is how a particular class finds all the other things it needs to do its work.

Traditionally we use lookup: JNDI, the servlet context, a static object, or a factory.

The more recently introduced way is "Dependency Injection"

– your object knows what it needs
– exposes constructor arguments/ setter methods to fulfill these dependencies
– (some other package goes around and resolves them)
– (therefore,) the object assumes that dependencies have been fulfilled before it tries to do anything
– bonus points: fail gracefully if they have not
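
The bullets above can be sketched in plain Java, with no container involved. GreetingSource and WelcomeChannel are invented for this example and are not uPortal or Spring types; in a Spring setup the constructor argument would be supplied from the XML configuration instead of by hand.

```java
import java.util.Objects;

// Hypothetical collaborator interface, invented for this sketch.
interface GreetingSource {
    String greetingFor(String userId);
}

class WelcomeChannel {
    private final GreetingSource greetings;

    // The dependency arrives through the constructor; the class never
    // looks it up through JNDI, a static service, or a factory.
    WelcomeChannel(GreetingSource greetings) {
        // Fail early and clearly if the wiring was forgotten.
        this.greetings = Objects.requireNonNull(greetings, "greetings must be injected");
    }

    // By the time this runs, the dependency is known to be present.
    String render(String userId) {
        return "<p>" + greetings.greetingFor(userId) + "</p>";
    }

    public static void main(String[] args) {
        // Hand wiring stands in for whatever injector resolves dependencies.
        WelcomeChannel channel = new WelcomeChannel(userId -> "Hello, " + userId);
        System.out.println(channel.render("jdoe"));
    }
}
```

Because the class only sees the interface, a test can pass in a stub and a deployment can pass in a different implementation without touching WelcomeChannel.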

Spring as a dependency injector.
You have an XML configuration of the dependencies.
Our IChannel instances, for example, are not Spring-configured.

In JSR-168, Spring Portlet MVC can be used for portlets.

Our traditional solution is to use portal.properties.

PersonDirectory reimplemented for uPortal 2.5: it uses Spring so it can gather up the 'way it works' from various pieces.

"Using Spring for the sake of using Spring" is not what this is about. Instead, this is about producing loosely coupled configurable objects.

A Yale use case: At Yale, there's an XML file available on the web that defines, e.g., student support people. How can Yale bring that into the IPerson? By wiring in more code as another source. So instead of having a local hack, the Spring approach allows you to plug in a wide variety of implementations.

PersonDirectory becomes a façade in front of the Spring-configured person attribute Data Access Object (DAO) that allows the old code to call PersonDirectory in the traditional way.

If there were any other Spring-configured objects, they would not need the façade or the static services, they could have Spring inject their dependencies. Non-Spring-configured objects can use the static way.
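
The façade arrangement can be sketched as follows. This is not the uPortal 2.5 PersonDirectory code: PersonAttributeDao, MergingPersonAttributeDao, and PersonDirectoryFacade are hypothetical names, and the attribute sources are stubs standing in for something like a directory lookup plus the Yale XML file.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; not the uPortal 2.5 PersonDirectory API.
interface PersonAttributeDao {
    Map<String, String> attributesFor(String userId);
}

// Combines several sources; later sources win when keys clash.
class MergingPersonAttributeDao implements PersonAttributeDao {
    private final List<PersonAttributeDao> sources;

    MergingPersonAttributeDao(List<PersonAttributeDao> sources) {
        this.sources = sources;
    }

    public Map<String, String> attributesFor(String userId) {
        Map<String, String> merged = new HashMap<>();
        for (PersonAttributeDao source : sources) {
            merged.putAll(source.attributesFor(userId));
        }
        return merged;
    }
}

// Static facade so that legacy callers keep their old call style.
final class PersonDirectoryFacade {
    private static volatile PersonAttributeDao dao;

    // In a Spring deployment the container would supply the configured DAO.
    static void setDao(PersonAttributeDao configured) {
        dao = configured;
    }

    static Map<String, String> getAttributes(String userId) {
        PersonAttributeDao current = dao;
        if (current == null) {
            throw new IllegalStateException("no person attribute DAO configured");
        }
        return current.attributesFor(userId);
    }

    public static void main(String[] args) {
        PersonAttributeDao directory = id -> Collections.singletonMap("mail", id + "@example.edu");
        PersonAttributeDao xmlFile = id -> Collections.singletonMap("role", "student-support");
        setDao(new MergingPersonAttributeDao(Arrays.asList(directory, xmlFile)));
        System.out.println(getAttributes("jdoe"));
    }
}
```

Adding another attribute source then means adding one more entry to the list of sources, with no change to the callers of the façade.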

a proposed way forward:

more services being Spring-configured when they need to be significantly touched

Packaging:

Peter K recommended that this could be spun out as a micro-project producing a jar
This might ease maintenance:
– multiple codebases can use this code.
Dan E: Good idea. For example, that would make migration of PAGS (into 3.0) easier.

Distributed Layout Manager integration with base code

by Mark Boyd of SunGard SCT

SCT has their own unique layout manager; although it has another name internally, when discussed for JA-SIG purposes, it's called by its old name, the Distributed Layout Manager, called here DLM.

This will be provided in the base code as an alternative to JA-SIG's Aggregated Layout Manager (ALM).

This project has two phases.

First, the DLM will be added to the base code with the capabilities it has at present in Luminis' code. This has only pushed fragments, configured with an XML file rather than a GUI. Each fragment is associated with a userid set aside for the purpose. To configure the contents of the fragment, the fragment owner logs in to that userid and sets up the tabs, columns, and so on.

– this much will be in 2.5. (his part will be done in about 2 weeks from this meeting)

Second, a user interface for creating and editing fragments will be created for the DLM, based on the code for the present ALM. Pulled fragments will also be implemented.

In the XML, the 'audience declaration' describes the authorization (What is the audience for this fragment?). Keith Stacks of SunGard SCT is working on the authorization part (see another session elsewhere in these minutes).

Mark demonstrated in Luminis (SCT's portal offering) how their DLM layout works. It has finer-grained control over what is locked down in a fragment than the ALM does. So a user can add columns or channels to a pushed fragment if the owner permits it. Preferences can be pushed along with the fragment.

It indicates locked down channels, columns, or tabs by graying out delete buttons and/or leaving out moving buttons.

for 2.6: they will have their part done in 2 or 3 months
both pushed and pulled fragments
define fragments via UI
new permissions enhancements for defining audiences, differentiating publishing capabilities.

Nick (U. Wisconsin) – they want to roll out with 2.4 but wish they could use DLM anyhow.

2.5 will be out in two or three weeks
2.6 will be out about 6 months later.

uPortal 2.5 release issues

facilitated by Adam Rybicki
(release schedule or release content)

what's new?

– portlet adapter updates
the adapter was portlet API compliant but there were some bugs. e.g. in processAction, one can now programmatically change the window state of the portlet.
redirects from inside a portlet now work.
we now have expiration caching. (optional feature)
enhancements to Exclusive Window State implementation (uPortal enhancement)
e.g. to support downloads via a portlet.

– RDBMServices refactoring
the ant tasks do the db stuff; JNDI can be used only by the portal itself, and the rdbm.properties are used by the ant tasks.
we need pooled connections – Eric wrote RDBMServices to create its own pooled resource when it can't get to JNDI, so it is able to check out pooled connections outside of Tomcat. It will now give you metadata about the nature of the JDBC connection.
– Person Directory Spring re-write had a whole session on that (see above)
– moved to JAXP XML APIs
Sun has backported these APIs from 1.5 to 1.4 so the API can look the same.
This does not conflict with the old 1.4 APIs.
– JDK1.4 is now a minimum requirement
don't use 1.3 any more, period
– all exceptions are chained
all through the code.
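
For readers unfamiliar with the last item, exception chaining (available since JDK 1.4) means passing the original exception as the cause of the one you throw, so the root cause stays in the stack trace. A minimal sketch, with SketchPortalException and parseTimeout invented for the example:

```java
class ChainedExceptions {
    // Hypothetical exception type, named to avoid implying a real uPortal class.
    static class SketchPortalException extends Exception {
        SketchPortalException(String message, Throwable cause) {
            super(message, cause); // the cause is kept and printed as "Caused by:"
        }
    }

    // Parses a timeout setting, wrapping the low-level failure with context.
    static int parseTimeout(String raw) throws SketchPortalException {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // Chain rather than swallow: the original exception travels along.
            throw new SketchPortalException("Invalid channel timeout: " + raw, e);
        }
    }

    public static void main(String[] args) {
        try {
            parseTimeout("abc");
        } catch (SketchPortalException e) {
            System.out.println(e.getMessage() + " / cause: " + e.getCause());
        }
    }
}
```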

release topics

release management responsibility
– Rutgers did 2.4.2, can we count on them again for 2.5? Scott from Rutgers says Bill (who was not able to attend) was kind of expecting this.
what remains to be completed
– rewrite of the thread pool – Doug Lea's actual package
– about 55 issues in Jira that are targeted to 2.5. are there critical issues that block the release?
– memory leaks (is it a critical enough issue?)

release date – tentatively April 18th

there was a discussion about whether 2.5 should be held up for the memory leak fixes because Rutgers and others really want that in the release, but some people don't want to wait for it to come out because if it delays too long they couldn't commit it to a fall rollout.

Dan: What is the real issue here? Are we worried about selling this to our respective management, or is it a developer concern, we need to do this by such a date etc.

Mark Boyd: Why don't the people who have strong feelings about this work out the issue?

Cris H (Unicon): From our point of view, it's best to have uPortal work well on Java 1.5, because what if the memory leak problem has been fixed in 1.5? So, I recommend we go forward.

LUNCH BREAK

uPortal Clearinghouse

with Patty Gertz and Paul Lynn from Princeton (via speakerphone)

Clearinghouse has two meanings:

The JA-SIG Clearinghouse application (which has much material related to uPortal, and some other things)

and the Clearinghouse machine that the application runs on, which was acquired by a grant from Sun.
Most of Patty's questions today are about the machine.

It's a Sun Fire 280R, 64 GB disk, 8 GB RAM, 2 processors
Princeton charges an annual fee for hosting a server – includes nightly backups, 24 hour monitoring, OS patching. The first year (well, 18 months) of service was donated by Princeton, but it is not clear as yet whether this will be an ongoing contribution.

This is the home of Jira and Confluence. If someone wants to put up some other app and take full responsibility for maintaining it, that's okay.

There are 3 chunks of filesystem space: a production clearinghouse, a dev clearinghouse, and postgres. Jira and Confluence are both on the development disk. Right now there's only one Tomcat. She wants there to be two Tomcats, and to move Jira and Confluence to the production partition. The second Tomcat (dev) is NOT brought up automatically if it dies or the host reboots.

Who will do the work? Jason at Rutgers probably.

Domain names and certs for this server. What do we want to do in terms of standardizing the JA-SIG domain names? Until this week, they didn't have incorporation papers.

Patty prefers
www.ja-sig.org/something
Someone else assumed something.ja-sig.org
Patty points out that the latter would make it more difficult to get security certificates (for SSL and site identification).
The group agreed with Patty to do it the first way.

Paul at Princeton asked if we should add more partitions. Our group had no strong feeling about this.

Patty: If new tools want to be added, be sure not to put up tools that require maintenance or licensing fees.
Each tool needs an owner.
Please notify the JA-SIG board (e.g. Patty) that you want to put it on the machine.
Provide the license agreement and if it's open source, put the source itself into the clearinghouse, so that if the tool owner moves on to other things people will still know why it is there.
Andrew pointed out that the CAS (Central Authentication Service, from Yale) project might choose to use the machine as their home.

Patty thought Eric Dalquist was wanting to move other uportal stuff onto the clearing house machine (but Eric is still not present at this meeting.)
Delaware which hosts the CVS archive (people at first surmised that was Eric's concern) doesn't look like they will pull out.
Adam found Eric's notes: he was suggesting that the CVS monitor (provides statistics about the use of a CVS instance) should be moved over to the Clearinghouse; it is currently hosted at Unicon, but it's on a low-powered machine with no provision for regular updating of the database. JA-SIG has a license for FishEye, but that would have to run on the same machine as the CVS.
The group approved of the idea in principle.

What is the legal entity for JA-SIG? Patty says it is now a non-profit corporation in New Jersey.

"JA-SIG, Inc." is the official name.

uPortal Documentation

facilitated by John Fereira

Most of the documentation is now being created on the Wiki (Confluence), so do we keep putting things on the main website or mostly on the Wiki?

Recently there has been less opportunity for people to write the formal documentation. Easier to write briefer pieces – how to solve a specific problem.

John thought that the things on the Wiki would be early drafts of things that would later be turned into formal docs. But that process is not yet happening.

Some months ago John demonstrated running a tool called Anakia on the CVS material to produce more nicely formatted and organized docs. He suggests that this sort of thing should be done.

At present, the entire www.uportal.org IS NOTHING BUT a live view on the docs subdirectory of the CVS project.

Doug G.:
(a) CVS has a way to run a script on check-in that could e.g. generate Anakia documentation if needed.
(b) Wouldn't it be better to have a way to 'digest' the Confluence material, which is inherently in a state of flux, to make more static documentation for releases?

John has a module in CVS called uportal_documentation that contains a snapshot of all the documentation in XML format. The Anakia ant task transforms the XML documentation into HTML pages, creating "web" and "print" versions.

uPortal-Sakai integration

Andrew Petro did most of this presentation, with Charles Severance present (he arrived during lunchtime)

(This presentation is taken in part from the Slides.)

Sakai had navigation requirements. Originally David Haines at Sakai modified up2.4 to make possible what Sakai wanted, but this proved too drastic.

The new plan is to introduce IFrames and WSRP to Sakai, and look at Sakai through uPortal that way. That's phase I. The second phase will be to make Sakai tools available as JSR-168 portlets that can run in uPortal.

Sakai is a Learning Management System, not a portal as such. A portal should be customizable, many things available at the same time. An LMS is not usually highly customizable by its end users, and users want to work with only one thing at a time.

integration goals:

– Sakai TPP Tools will run as JSR-168 portlets.
– An entire Sakai site can be included at some point in an enterprise portal
IFrames - separate sign on
WSRP – shared sign on.
– Sakai sites, tools, or pages, can be aggregated to produce a personal federated view for an individual – moves toward a personal learning and research environment.

these are NOT goals:
– uPortal administration replaces Sakai administration. No, Sakai has its own.
– uPortal navigation supports Sakai's every Learning and Collaboration environment need. No.

Integration directions

Summer 2004 – decided that Sakai will be at first portal-agnostic

Outline
1.0 – integral aggregator with IFrames 10/04
1.5 – IFrames 02/05 just came out
2.0 – minimal WSRP (05/05)
Post 2.0 – Sakai tools available as JSR-168 portlets that run in uPortal

Phase I

Sakai and uPortal deployed separately based on IFrames in uPortal pointing at Sakai or WSRP pointing at Sakai

This integration will work between Sakai and any portal product which supports IFrames
Management and administration will be done separately for the two.

Phase II

– Get uPortal and Sakai in the same JVM with uPortal handling navigation and layout for Sakai portlets.
– Sakai portlets will be JSR-168 adapted.
– In phase II Sakai will continue to interoperate with other portals using WSRP and IFrames, with uPortal being WSRP producer.
– There are many technical challenges.

Charles did part of the presentation:
Sakai integration phase I through Sakai 2.0
– parts can appear as IFrames within uportal
– adding a WSRP producer to Sakai so portals can show things thereby.
(Charles showed what this looks like)
Charles wants to use a portal to federate among many different instances of Sakai.

Phase 1 tasks:
WSRP Consumer in uportal

IFrame producer in Sakai (1.5)
WSRP producer in Sakai (2.0)
Working on this: Beth Kirshner (UM), Vishal Prashant (SunGard)
May start effort to build a better WSRP consumer for uPortal if Sakai's producer becomes "better".

Vishal Goenka: I've been comparing WSRP and JSR-168 APIs to see what commonality there is. Whether it is easier to go through the JSR-168 adapter to do a particular function or to write it natively.

Phase II

Sakai tools will appear in uPortal as JSR-168 channels

Administered by uPortal admin tools; that is, with respect to publishing, subscribing, positioning in layout.
Run inside same JVM
uPortal will render all navigation (as now)

Sakai tools use Velocity and JSF (JavaServer Faces) view technologies.

Sakai's internals are being rearchitected to have less in its "kernel" – the environment-specific part. This will be the only part that will have to be modified to work with other view technologies such as JSR-168.