uPortal IRC Logs-2008-11-18

[05:07:16 EST(-0500)] * mad (n=chatzill@pcit-8752.HIG.SE) has joined ##uportal
[09:48:37 EST(-0500)] * athena7 (n=athena7@adsl-99-130-147-23.dsl.wlfrct.sbcglobal.net) has joined ##uportal
[09:53:22 EST(-0500)] * EricDalquist (n=dalquist@bohemia.doit.wisc.edu) has joined ##uportal
[10:16:26 EST(-0500)] * lennard1 (n=sparhk@ip68-98-56-21.ph.ph.cox.net) has left ##uportal
[10:31:23 EST(-0500)] * lennard1 (n=sparhk@wsip-98-174-242-39.ph.ph.cox.net) has joined ##uportal
[10:31:40 EST(-0500)] * holdorph (n=holdorph@wsip-98-174-242-39.ph.ph.cox.net) has joined ##uportal
[10:45:19 EST(-0500)] * holdorph (n=holdorph@wsip-98-174-242-39.ph.ph.cox.net) has joined ##uportal
[10:47:26 EST(-0500)] <lennard1> hey... pearson is starting to lean towards some sort of a reporting portlet again. Something that would allow them to monitor usage of the portal.
[10:47:50 EST(-0500)] <EricDalquist> well
[10:47:56 EST(-0500)] <EricDalquist> 3.1 will have database stats logging
[10:47:57 EST(-0500)] <lennard1> any chance something like that already exists... and isn't a guaranteed performance killer?
[10:48:09 EST(-0500)] <EricDalquist> and I'm just finishing up a spring batch tool to aggregate that data
[10:48:17 EST(-0500)] <EricDalquist> all we're missing is reporting off of those tables
[10:48:42 EST(-0500)] <lennard1> what is recorded?
[10:49:57 EST(-0500)] <EricDalquist> so the portal records a whole bunch of stuff
[10:50:11 EST(-0500)] <EricDalquist> you can look at the type hierarchy for PortalEvent
[10:50:18 EST(-0500)] <EricDalquist> our aggregator does:
[10:51:07 EST(-0500)] <EricDalquist> channel render count, average render time, max render time, action count, avg action time, max action time, targeted count, rendered from cache count
[10:51:15 EST(-0500)] <EricDalquist> tab render count, avg render time, max render time
[10:51:24 EST(-0500)] <EricDalquist> concurrent users
[10:51:31 EST(-0500)] <EricDalquist> login frequency
[10:51:40 EST(-0500)] <EricDalquist> total and unique logins
[10:52:07 EST(-0500)] <EricDalquist> each of those can be aggregated at any number of intervals (minute, 5minute, hour, day, week, month, quarter, academic term, year)
[10:52:18 EST(-0500)] <EricDalquist> and each of those is tracked globally and on a per-group basis
[10:52:32 EST(-0500)] <EricDalquist> which groups are tracked is also configurable
[10:52:40 EST(-0500)] <EricDalquist> for a reference on #s
[10:53:00 EST(-0500)] <lennard1> the portal keeps track of the numbers and then periodically writes the data to the db?
[10:53:16 EST(-0500)] <EricDalquist> so the portal itself just writes 'raw' stats data to a database
[10:53:30 EST(-0500)] <lennard1> how often?
[10:53:36 EST(-0500)] <EricDalquist> so one row per 'event' such as login, logout, channel render, tab render, etc
[10:53:42 EST(-0500)] <EricDalquist> the code uses JPA/Hib
[10:53:56 EST(-0500)] <EricDalquist> and batches those raw stats to the DB once per second
[10:53:59 EST(-0500)] <EricDalquist> then
[10:54:04 EST(-0500)] * lennard1 now has perf concerns... but can understand why that decision was made.
[10:54:18 EST(-0500)] <EricDalquist> there is almost no perf overhead
[10:54:30 EST(-0500)] <EricDalquist> the stats framework in the portal has its own thread pool
[10:54:53 EST(-0500)] <lennard1> ok... and that thread handles writing the data.
[10:54:53 EST(-0500)] <EricDalquist> so the 3.0 framework code always generates all stats
[10:55:03 EST(-0500)] <EricDalquist> what this DB version does is use a concurrent queue
[10:55:09 EST(-0500)] <EricDalquist> portal threads write to that
[10:55:24 EST(-0500)] <EricDalquist> then a background thread fires every second and writes out all queued events
[10:55:32 EST(-0500)] <EricDalquist> the portal NEVER waits for stats
[10:55:42 EST(-0500)] <EricDalquist> the whole stats storing process can fail horribly
[10:55:46 EST(-0500)] <EricDalquist> and the portal will keep on chugging
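The queue-and-flush design described above can be sketched roughly as follows. This is a minimal illustration, not the uPortal code: the class name `QueuedStatsRecorder`, the use of plain strings as events, and the `batchWriter` callback standing in for the JPA/Hibernate write are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch of a non-blocking stats recorder; not a uPortal class. */
class QueuedStatsRecorder {
    // Portal threads only ever touch this lock-free queue.
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    // Assumed stand-in for whatever actually writes a batch to the database.
    private final Consumer<List<String>> batchWriter;
    private final ScheduledExecutorService flusher =
            Executors.newSingleThreadScheduledExecutor();

    QueuedStatsRecorder(Consumer<List<String>> batchWriter) {
        this.batchWriter = batchWriter;
    }

    /** Called by request threads: enqueue the event and return immediately. */
    void record(String event) {
        pending.add(event);
    }

    /** Drains everything queued so far into one batch; returns the number drained. */
    int flush() {
        List<String> batch = new ArrayList<>();
        String event;
        while ((event = pending.poll()) != null) {
            batch.add(event);
        }
        if (batch.isEmpty()) {
            return 0;
        }
        try {
            batchWriter.accept(batch);
        } catch (RuntimeException e) {
            // A failed write drops this batch but never propagates to portal threads.
            System.err.println("stats batch dropped: " + e.getMessage());
        }
        return batch.size();
    }

    /** Starts the background thread that flushes once per second. */
    void start() {
        flusher.scheduleWithFixedDelay(this::flush, 1, 1, TimeUnit.SECONDS);
    }

    void stop() {
        flusher.shutdownNow();
    }
}
```

In this sketch a request thread pays only the cost of `record`, and a database outage shows up as dropped batches rather than slow page renders, which matches the behaviour described in the conversation.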
[10:56:00 EST(-0500)] <EricDalquist> then we have an external aggregation process
[10:56:05 EST(-0500)] <EricDalquist> just finishing this part up
[10:56:07 EST(-0500)] <EricDalquist> uses spring batch
[10:56:15 EST(-0500)] <EricDalquist> reads in data from the raw stats tables
[10:56:18 EST(-0500)] <EricDalquist> and generates aggregates
[10:57:12 EST(-0500)] <EricDalquist> if you're tracking aggregates at a per-minute level 2mil raw stats rows generates about 600k aggregate rows (channel request aggr being 90% of that)
[10:57:19 EST(-0500)] <EricDalquist> we're aggregating at the 5 minute level
[10:57:46 EST(-0500)] <EricDalquist> and look to be translating about 2mil raw stats rows into about 180k rows of aggregates
[10:58:03 EST(-0500)] <EricDalquist> or about 20MB of table/index data per day in our aggregates
[10:58:15 EST(-0500)] <EricDalquist> and we keep about 2 weeks of raw stats which we can use to debug problems if needed
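As a rough illustration of why a coarser interval produces fewer aggregate rows, the sketch below folds raw render events into fixed-width time buckets and keeps a count, average, and maximum per bucket. It is not the Spring Batch aggregator discussed here; `RenderAggregator`, `Bucket`, and the millisecond-based inputs are hypothetical names and assumptions chosen for the example.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of interval aggregation; names are illustrative only. */
class RenderAggregator {
    /** Running totals for one interval bucket. */
    static final class Bucket {
        long count;
        long totalMillis;
        long maxMillis;

        double average() {
            return count == 0 ? 0.0 : (double) totalMillis / count;
        }
    }

    private final long intervalMillis;
    // Keyed by the start timestamp of each interval, kept in time order.
    private final Map<Long, Bucket> buckets = new TreeMap<>();

    RenderAggregator(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /** Folds one raw render event into the bucket covering its timestamp. */
    void add(long timestampMillis, long renderMillis) {
        long bucketStart = Math.floorDiv(timestampMillis, intervalMillis) * intervalMillis;
        Bucket b = buckets.computeIfAbsent(bucketStart, k -> new Bucket());
        b.count++;
        b.totalMillis += renderMillis;
        b.maxMillis = Math.max(b.maxMillis, renderMillis);
    }

    Map<Long, Bucket> buckets() {
        return buckets;
    }
}
```

Constructing the aggregator with `5 * 60 * 1000L` instead of `60 * 1000L` means up to five one-minute buckets collapse into one, which is the general effect behind the drop from about 600k to about 180k aggregate rows quoted above.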
[10:58:16 EST(-0500)] <lennard1> just thinking about how that will scale for a user like pearson...
[10:58:30 EST(-0500)] <EricDalquist> yeah
[10:58:35 EST(-0500)] <EricDalquist> it depends on what they want to track
[10:58:50 EST(-0500)] <lennard1> nearing 6 million users and growing by an insane percentage every year.
[10:58:55 EST(-0500)] <EricDalquist> if you disable per-channel raw event logging that would drop the majority of the events
[10:59:02 EST(-0500)] <EricDalquist> how many concurrent users?
[10:59:16 EST(-0500)] <EricDalquist> and honestly, this is probably your best bet
[10:59:22 EST(-0500)] <EricDalquist> not sure what else you could do
[10:59:29 EST(-0500)] <lennard1> that is the kicker... I happen to think many students just create a new account rather than use their old one.
[10:59:31 EST(-0500)] <EricDalquist> the reality is for large installs ... stats is A LOT of data
[10:59:38 EST(-0500)] <lennard1> peak... probably 150k concurrent users.
[10:59:50 EST(-0500)] <EricDalquist> over what time range?
[10:59:55 EST(-0500)] <EricDalquist> is that 150k in a 5 minute window?
[11:00:33 EST(-0500)] <lennard1> well... for specifics have to look at a report I can't lay my hand on right now.
[11:00:53 EST(-0500)] <EricDalquist> ok
[11:00:55 EST(-0500)] <lennard1> 150k active sessions would be the safe bet.
[11:01:00 EST(-0500)] <EricDalquist> wow
[11:01:15 EST(-0500)] <lennard1> as to how active they are over a given minute or so... that varies.
[11:01:25 EST(-0500)] <lennard1> you can see why perf is a concern(smile)
[11:02:11 EST(-0500)] <EricDalquist> yeah we see between 500 & 800 concurrent users in a 5 minute window (a user counts as concurrent if they generated a stats event in that window)
[11:02:30 EST(-0500)] <EricDalquist> so the raw stats storage code is in trunk if you want to take a look
[11:02:54 EST(-0500)] <EricDalquist> we are planning on having this aggregation tool out by the end of the year but could put it out there earlier if folks are really interested
[11:03:00 EST(-0500)] <EricDalquist> we're still missing reporting tools though
[11:03:35 EST(-0500)] <lennard1> right now they are talking about tracking the 'community' functionality we have in the portal.
[11:03:51 EST(-0500)] <lennard1> that is only used by a much smaller subset of users (instructors only)
[11:04:13 EST(-0500)] <EricDalquist> so right now on the portal side you can filter stats by event type
[11:04:14 EST(-0500)] <lennard1> Pearson might be 'really interested'
[11:04:19 EST(-0500)] <lennard1> am in a meeting to find out now
[11:04:26 EST(-0500)] <EricDalquist> but it wouldn't be a stretch to add additional filtering options
[11:04:37 EST(-0500)] <EricDalquist> like only write out events that are from a specific group
[11:05:07 EST(-0500)] * lennard1 nods
[11:06:15 EST(-0500)] <EricDalquist> some other random #s on this stuff
[11:06:29 EST(-0500)] <EricDalquist> with that user load we generate about 50 events/second peak
[11:06:46 EST(-0500)] <EricDalquist> our current aggregator code looks like it can consistently process about 500 events/second
[11:07:00 EST(-0500)] <EricDalquist> database space & management is a BIG part of all of this
[11:07:21 EST(-0500)] <EricDalquist> and the aggregator code right now is oracle specific
[11:08:36 EST(-0500)] <EricDalquist> one thing I know some schools are using is google analytics
[11:08:54 EST(-0500)] <EricDalquist> which can provide a lot of the page/browser/remote host type metrics
[11:10:26 EST(-0500)] <EricDalquist> we would love to be able to collaborate with someone on reporting tools though
[11:41:19 EST(-0500)] <lennard1> do you track average length of sessions?
[11:41:28 EST(-0500)] <EricDalquist> no
[11:41:35 EST(-0500)] <EricDalquist> well not yet
[11:41:42 EST(-0500)] <EricDalquist> it may get added to our aggregates eventually
[11:42:01 EST(-0500)] <EricDalquist> oh and some updated numbers for processing
[11:42:22 EST(-0500)] <EricDalquist> doing 5 minute intervals as our smallest instead of 1 minute we're processing around 1000 events/second
[11:44:31 EST(-0500)] * anastasiac (n=team@142.150.154.160) has joined ##uportal
[11:46:03 EST(-0500)] <EricDalquist> athena7: I had an idea for some uPortal work that isn't major refactoring
[11:46:15 EST(-0500)] <athena7> ooh
[11:46:16 EST(-0500)] <athena7> what's that?
[11:46:17 EST(-0500)] <EricDalquist> cleaning up the uPortal error JSP
[11:46:26 EST(-0500)] <athena7> oh, what's wrong w/ it currently?
[11:46:38 EST(-0500)] <EricDalquist> that is the page that says uPortal Error
[11:46:46 EST(-0500)] <EricDalquist> in black text on a white background
[11:46:49 EST(-0500)] <athena7> ah
[11:47:01 EST(-0500)] <EricDalquist> perhaps we change that into a better designed 'outage' page
[11:47:08 EST(-0500)] <athena7> yes that's not a very friendly page
[11:47:11 EST(-0500)] <athena7> yeah that makes sense
[11:47:13 EST(-0500)] <EricDalquist> hard code it to use the skin's CSS
[11:47:30 EST(-0500)] <athena7> actually i wonder whether we could get some input from the UI types over here too
[11:47:32 EST(-0500)] <athena7> i'll check on that
[11:47:33 EST(-0500)] <EricDalquist> and just clean up the JSP so it is easier for deployers to customize
[11:47:35 EST(-0500)] <EricDalquist> yeah
[11:47:42 EST(-0500)] <athena7> yeah, that sounds like a great idea
[11:47:44 EST(-0500)] <EricDalquist> I was thinking it would be a good thing for MattP
[11:47:51 EST(-0500)] <athena7> my thoughts too (smile)
[11:47:55 EST(-0500)] <EricDalquist> you could work on that with him to get him more involved in uPortal
[11:47:55 EST(-0500)] <athena7> i can ask him if he has time
[11:47:58 EST(-0500)] <EricDalquist> great
[11:48:03 EST(-0500)] <athena7> actually he's looking to be involved
[11:48:07 EST(-0500)] <EricDalquist> yeah
[11:48:15 EST(-0500)] <athena7> he's done some great work that needs to get committed
[11:48:20 EST(-0500)] <athena7> it'd be nice to see it get in before 3.1
[11:48:34 EST(-0500)] <EricDalquist> yeah it will
[11:48:42 EST(-0500)] <athena7> some neat features for things like handling if the tabs overrun the page width, etc.
[11:49:01 EST(-0500)] <athena7> by the way, will you thank nick for me for the calendar portlet work he did?
[11:49:05 EST(-0500)] <athena7> i think things are shaping up well
[11:49:25 EST(-0500)] <EricDalquist> great (smile)
[11:49:52 EST(-0500)] <athena7> actually, i wrote a new adapter for CalDAV access
[11:49:54 EST(-0500)] <athena7> which is pretty cool
[11:50:02 EST(-0500)] <athena7> i'm not sure what to do w/ the dependency though
[11:50:20 EST(-0500)] <athena7> i had to make a small modification to get it to not require ETags on all calendar items
[11:50:41 EST(-0500)] <EricDalquist> neat
[11:51:37 EST(-0500)] <athena7> yes
[11:51:49 EST(-0500)] <athena7> but what do we want to do w/ the caldav dependency?
[11:52:03 EST(-0500)] <athena7> it's a snapshot version of the 0.4 release, with a small mod, basically
[11:52:09 EST(-0500)] <athena7> does it make sense to put it in the jasig repo?
[11:52:19 EST(-0500)] <athena7> and if so, do we need to label it so people know it's modified?
[11:52:29 EST(-0500)] <EricDalquist> yeah, or talk to the caldav devs to get the change into their next release
[11:52:45 EST(-0500)] <athena7> their next release seems to be pretty significantly different
[11:53:01 EST(-0500)] <athena7> and the trunk doesn't currently seem to be in a very working state
[11:53:09 EST(-0500)] <athena7> so i'm kind of hesitant to wait for the next release
[11:53:19 EST(-0500)] <EricDalquist> ah
[11:53:20 EST(-0500)] <athena7> i'm looking forward to it, but it seems like it's not to a point where we can use it yet
[11:53:32 EST(-0500)] <EricDalquist> yeah then rename the artifact appropriately and stick it in the jasig repo
[11:53:35 EST(-0500)] <athena7> so we may need to rely on something else until the trunk gets cleaned up for the 0.5 release
[11:53:36 EST(-0500)] <athena7> ok
[11:53:57 EST(-0500)] <athena7> got any suggestions? 0.4-jasig?
[11:54:18 EST(-0500)] <EricDalquist> caldav-jasig-0.4
[11:54:33 EST(-0500)] <EricDalquist> it can be problematic to include non version data in the version string
[11:54:45 EST(-0500)] <athena7> i guess that makes sense
[11:54:55 EST(-0500)] <athena7> so change the artifact id to include jasig?
[11:56:27 EST(-0500)] <athena7> caldav-jasig / version 0.4-snapshot ? or do we need 0.4 in the artifact string as well?
[11:56:57 EST(-0500)] <EricDalquist> yeah so I'd set the groupid to a org.jasig Id
[11:57:04 EST(-0500)] <EricDalquist> then add a bit in the jasig info
[11:57:31 EST(-0500)] <EricDalquist> and keep the version in sync with the source
[11:58:41 EST(-0500)] <athena7> ok, that makes sense
[11:59:07 EST(-0500)] <athena7> if i send you the jar would you be able to deploy it? i'm pretty sure i don't have permissions to the right directory on the server
[11:59:17 EST(-0500)] <EricDalquist> ah yeah I can
[11:59:25 EST(-0500)] <EricDalquist> you may have to poke me a few times to get it done
[11:59:27 EST(-0500)] <EricDalquist> :/
[11:59:34 EST(-0500)] <athena7> ok, thanks (smile)
[11:59:38 EST(-0500)] <athena7> i'll send it over this afternoon
[11:59:40 EST(-0500)] <athena7> no major rush
[11:59:55 EST(-0500)] <athena7> i'll just make sure to wait to commit the adapter until it's available from the jasig repo
[12:00:05 EST(-0500)] <EricDalquist> ok
[12:00:09 EST(-0500)] <athena7> thanks!
[12:00:35 EST(-0500)] <athena7> by the way, does that error page still use all the scriptlet-based stuff?
[12:02:45 EST(-0500)] <EricDalquist> probably
[12:02:52 EST(-0500)] <EricDalquist> so it could use a JSTL kick
[12:07:28 EST(-0500)] <athena7> yeah
[12:07:30 EST(-0500)] <athena7> sounds reasonable
[13:41:42 EST(-0500)] <lennard1> So now pearson is asking questions about deep linking into the portal again.
[13:41:54 EST(-0500)] <lennard1> they ask this question every 6 months or so...
[13:42:17 EST(-0500)] <EricDalquist> so we have a very concrete requirement to allow a search engine to index a portlet in a guest view in uPortal
[13:42:28 EST(-0500)] <lennard1> we have the ability to take a user to a specific tab, now they want to take the user to a specific portlet
[13:42:37 EST(-0500)] <EricDalquist> that requires deep linking
[13:42:53 EST(-0500)] <EricDalquist> yeah
[13:43:24 EST(-0500)] <EricDalquist> my thinking is we can have the current tab and the targeted portlet as part of the URL
[13:43:33 EST(-0500)] <EricDalquist> with the current parameters for that portlet on the end
[13:43:42 EST(-0500)] <EricDalquist> portal-side state would still be needed for the rest of the portlets
[13:43:44 EST(-0500)] * lennard1 nods
[13:43:50 EST(-0500)] <EricDalquist> but that would allow direct navigation to a portlet on a tab
[13:44:02 EST(-0500)] <EricDalquist> part of the question is how human-friendly that URL is
[13:44:06 EST(-0500)] <EricDalquist> but we don't have a requirement there
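The idea of carrying the current tab, the targeted portlet, and that portlet's parameters in the URL could look something like the sketch below. The URL layout (`/portal/tab/.../portlet/...`), the class `DeepLinkBuilder`, and the example keys are purely hypothetical; the conversation does not settle on any actual URL syntax.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch of a bookmarkable portlet URL; the layout is an assumption. */
class DeepLinkBuilder {
    /** Builds a path naming the tab and portlet, with the portlet's parameters as the query. */
    static String build(String tabKey, String portletWindowId, Map<String, String> params) {
        StringBuilder url = new StringBuilder("/portal/tab/")
                .append(enc(tabKey))
                .append("/portlet/")
                .append(enc(portletWindowId));
        if (!params.isEmpty()) {
            StringJoiner query = new StringJoiner("&", "?", "");
            for (Map.Entry<String, String> e : params.entrySet()) {
                query.add(enc(e.getKey()) + "=" + enc(e.getValue()));
            }
            url.append(query);
        }
        return url.toString();
    }

    // Form-style encoding (Java 10+ overload); adequate for this sketch, not a full path encoder.
    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }
}
```

Only the targeted portlet's state travels in the URL here; as noted above, state for the other portlets on the tab would still have to live on the portal side.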
[13:44:13 EST(-0500)] <lennard1> would likely be very ugly
[13:44:17 EST(-0500)] <EricDalquist> well
[13:44:27 EST(-0500)] <EricDalquist> for tabs we can introduce a fname like key
[13:44:29 EST(-0500)] <lennard1> but... pearson at least could care less about that part at this point.
[13:44:38 EST(-0500)] <EricDalquist> the harder one would be a
[13:44:41 EST(-0500)] <lennard1> but pretty urls would be nicer.
[13:44:46 EST(-0500)] <EricDalquist> 'nice' portlet window id
[13:44:54 EST(-0500)] <EricDalquist> but even that could be partially accomplished
[13:45:06 EST(-0500)] <EricDalquist> the framework for doing this was put in place with 3.0
[13:45:22 EST(-0500)] <EricDalquist> now that URLs are more separated from the framework
[13:45:33 EST(-0500)] <EricDalquist> especially if you worry less about channels
[13:46:43 EST(-0500)] <EricDalquist> so right now we (well I) am on the hook for this feature for early 09
[13:49:16 EST(-0500)] <lennard1> (smile)
[13:50:03 EST(-0500)] <EricDalquist> plus the additional work of making the guest view work without having a Session
[13:50:13 EST(-0500)] <EricDalquist> (tongue)
[13:51:11 EST(-0500)] <lennard1> yup
[13:52:33 EST(-0500)] <EricDalquist> much of this is easier if I conveniently ignore channels (smile)
[13:52:40 EST(-0500)] <lennard1> (smile)
[13:52:56 EST(-0500)] <athena7> ok, jar sent
[13:53:11 EST(-0500)] <lennard1> pearson decided long ago to go with portlets, and don't care a bit about channels.
[13:53:34 EST(-0500)] <lennard1> think we use one channel right now... but could easily replace that with something else.
[13:53:39 EST(-0500)] <EricDalquist> yeah
[13:53:48 EST(-0500)] <EricDalquist> the only channels we use are iframe and the admin UIs
[13:53:52 EST(-0500)] <lennard1> actually... we don't even use that channel anymore.
[13:54:17 EST(-0500)] <lennard1> yeah... we wouldn't want to deep link to the admin ui.
[13:54:28 EST(-0500)] <EricDalquist> exactly
[13:54:38 EST(-0500)] <EricDalquist> most requirements like this don't count for UI customization or admin tools
[13:54:45 EST(-0500)] <athena7> what happened to our friendly urls requirement? did that ever happen?
[13:54:55 EST(-0500)] <EricDalquist> that is kind of what we're talking about
[13:55:08 EST(-0500)] <EricDalquist> really before we get friendly URLs we need 'bookmarkable' URLs
[13:55:17 EST(-0500)] <EricDalquist> that represent much more state on the URL than we do now
[13:55:22 EST(-0500)] <athena7> right
[13:55:31 EST(-0500)] <EricDalquist> this would also let a user have multiple windows open to one portal instance
[13:55:37 EST(-0500)] <athena7> i remembered that we'd talked about it a few times, even at a pretty concrete level
[13:55:42 EST(-0500)] <athena7> ah.
[13:55:42 EST(-0500)] <EricDalquist> though I think some caching havoc might happen then
[13:55:45 EST(-0500)] <athena7> yeah
[13:55:46 EST(-0500)] <EricDalquist> yeah
[13:55:55 EST(-0500)] <EricDalquist> we have it on our early 09 roadmap here
[13:56:19 EST(-0500)] <athena7> gotcha
[16:20:22 EST(-0500)] <athena7> EricDalquist: does the current trunk require jdk 1.6?
[16:42:45 EST(-0500)] <EricDalquist> ah I don't think so
[17:12:09 EST(-0500)] <lennard1> you'll at least need 1.5 though
[17:12:19 EST(-0500)] <EricDalquist> yes
[17:12:22 EST(-0500)] <EricDalquist> and servlet 2.5
[19:27:49 EST(-0500)] * lennard1 (n=sparhk@wsip-98-174-242-39.ph.ph.cox.net) has left ##uportal