[11:27:20 CST(-0600)] <jwennmacher1> EricDalquist: Good morning. Drew suggested I contact you about working on some statistics reporting.

[11:28:25 CST(-0600)] <EricDalquist> hi

[11:28:27 CST(-0600)] <EricDalquist> awesome

[11:29:21 CST(-0600)] <jwennmacher1> I am at a very early stage; I was just glancing at the statistics that are being collected right now. Initial observation is there are two sets currently collected not being reported on. Portlet/folder added/deleted/removed from layout. I was thinking maybe of starting with one of those (probably portlet). Thoughts?

[11:30:10 CST(-0600)] <EricDalquist> well, let me do a little overview of the whole stats/aggregation/reporting system

[11:30:51 CST(-0600)] <EricDalquist> opening up uPortal source just a minute ....

[11:33:09 CST(-0600)] <EricDalquist> so forgive me if you're familiar with parts of this already but I figure a full picture is good

[11:33:32 CST(-0600)] <EricDalquist> uportal uses an extension of the spring application context event apis

[11:33:38 CST(-0600)] <jwennmacher1> yep

[11:33:56 CST(-0600)] <EricDalquist> one of the event handlers sticks the events onto a concurrent queue

[11:34:08 CST(-0600)] <EricDalquist> and a background thread periodically flushes them out to the db via JpaPortalEventStore

[11:34:16 CST(-0600)] <EricDalquist> these are what we call raw events

[11:34:21 CST(-0600)] <EricDalquist> and are actually stored as JSON CLOBs
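
The queue-then-flush flow described above can be sketched roughly as follows. This is a minimal illustration, not uPortal's code: the class `EventBufferSketch`, its methods, and the sample event payloads are all hypothetical, and an in-memory list stands in for the database that JpaPortalEventStore writes to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: names are illustrative, not uPortal's real classes.
class EventBufferSketch {
    // Request threads append here without blocking one another.
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    // Stand-in for the raw event table (one JSON string per row).
    private final List<String> stored = new ArrayList<>();

    // Called from the event handler: cheap, no database work.
    void onEvent(String eventJson) {
        pending.add(eventJson);
    }

    // Called periodically by a single background thread: drain and persist.
    int flush() {
        int count = 0;
        String event;
        while ((event = pending.poll()) != null) {
            stored.add(event);
            count++;
        }
        return count;
    }

    List<String> stored() {
        return stored;
    }

    public static void main(String[] args) {
        EventBufferSketch buffer = new EventBufferSketch();
        // Made-up payloads; the real raw event JSON layout is not shown in this log.
        buffer.onEvent("{\"type\":\"login\",\"user\":\"jdoe\"}");
        buffer.onEvent("{\"type\":\"tabRender\",\"tab\":\"Home\"}");
        System.out.println("flushed " + buffer.flush() + " events");
    }
}
```

The point of the split is that the request path only pays for a queue append, while the slower persistence work happens off to the side.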

[11:35:32 CST(-0600)] <EricDalquist> then there is a background process that runs on one machine in the cluster that periodically aggregates that raw event data into some form that is easier to report on / process

[11:35:46 CST(-0600)] <EricDalquist> PortalEventProcessingManagerImpl is essentially the entry point for all of that logic

[11:36:45 CST(-0600)] <EricDalquist> that uses all instances of IPortalEventAggregator it finds in the app context, so we have sort of a pluggable api for doing this aggregation work

[11:37:26 CST(-0600)] <EricDalquist> right now we have aggregators that track: concurrent users, logins (unique & total), tab renders, & portlet executions

[11:37:40 CST(-0600)] <EricDalquist> this code is really finicky to write

[11:37:45 CST(-0600)] <EricDalquist> and VERY performance sensitive

[11:38:01 CST(-0600)] <EricDalquist> since you need to make sure the aggregator can handle processing the data faster than it is created
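
As a rough picture of the pluggable aggregation idea, the sketch below offers every raw event to every registered aggregator. The `EventAggregator` interface here is a deliberately simplified stand-in; the real IPortalEventAggregator signature is not shown in this conversation and will differ. All names in the block are assumptions.

```java
import java.util.List;

// Hypothetical sketch of a pluggable aggregator API; not uPortal's actual interfaces.
class AggregatorSketch {
    // Simplified stand-in for IPortalEventAggregator.
    interface EventAggregator {
        boolean supports(String eventType);
        void aggregate(String eventType);
    }

    // Counts how many events of one type were seen.
    static class CountingAggregator implements EventAggregator {
        private final String type;
        private int count;

        CountingAggregator(String type) {
            this.type = type;
        }

        public boolean supports(String eventType) {
            return type.equals(eventType);
        }

        public void aggregate(String eventType) {
            count++;
        }

        int count() {
            return count;
        }
    }

    // Offer each raw event to every registered aggregator that supports it.
    static void process(List<String> rawEventTypes, List<? extends EventAggregator> aggregators) {
        for (String eventType : rawEventTypes) {
            for (EventAggregator aggregator : aggregators) {
                if (aggregator.supports(eventType)) {
                    aggregator.aggregate(eventType);
                }
            }
        }
    }
}
```

Because the inner loop runs once per event per aggregator, any slow aggregator slows the whole pass, which is one way to read the performance warning above.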

[11:39:08 CST(-0600)] <EricDalquist> for example looking at our logs at UW right now

[11:39:13 CST(-0600)] <EricDalquist> the aggregator is falling behind a bit

[11:39:20 CST(-0600)] <EricDalquist> Aggregated 10000 events created at 16.4745 events/second between 2012-12-11T09:54:51.359-06:00 and 2012-12-11T10:04:59.202-06:00 in 1108885ms - 9.0181 e/s a 0.5474x speedup.

[11:39:27 CST(-0600)] <EricDalquist> but we track A LOT of data

[11:39:41 CST(-0600)] <EricDalquist> and this seems to happen as our DB index statistics slowly get out of data

[11:39:46 CST(-0600)] <EricDalquist> out of date*

[11:39:54 CST(-0600)] <EricDalquist> so just something to keep in mind
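
The figures in that log line can be checked with a little arithmetic. The sketch below assumes the "speedup" is simply the processing rate divided by the creation rate, which is consistent with the numbers quoted but is not confirmed anywhere in this log; the class and method names are made up for illustration.

```java
// Hypothetical helper for reading the aggregation log line; not part of uPortal.
class ThroughputSketch {
    // Events handled per second over the given wall-clock time.
    static double eventsPerSecond(long events, long elapsedMillis) {
        return events / (elapsedMillis / 1000.0);
    }

    // Ratio of processing rate to creation rate; below 1.0 means falling behind.
    static double speedup(double processingRate, double creationRate) {
        return processingRate / creationRate;
    }

    public static void main(String[] args) {
        // Values taken from the log line: 10000 events in 1108885 ms,
        // created at a reported 16.4745 events/second.
        double processingRate = eventsPerSecond(10000, 1108885);
        double ratio = speedup(processingRate, 16.4745);
        System.out.printf("%.4f e/s, %.4fx speedup%n", processingRate, ratio);
    }
}
```

A ratio under 1.0, as here, means events are being created faster than they are aggregated, so the backlog grows until the rate recovers.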

[11:40:44 CST(-0600)] <EricDalquist> then there is the reporting piece

[11:40:59 CST(-0600)] <EricDalquist> we currently have two of those, LoginTotalsStatisticsController and ConcurrentUsersStatisticsController

[11:41:15 CST(-0600)] <EricDalquist> these are what let us get nice graphs/reports out of the aggregated data

[11:41:24 CST(-0600)] <EricDalquist> ok … does that all make sense?

[11:42:05 CST(-0600)] <jwennmacher1> yes. Good to know, especially the performance requirement

[11:43:13 CST(-0600)] <EricDalquist> so for a place to start

[11:43:28 CST(-0600)] <EricDalquist> I think what may actually be the best spot are the reporting portlets

[11:43:46 CST(-0600)] <EricDalquist> we are currently collecting a ton of data about tab renders and portlet executions

[11:44:01 CST(-0600)] <jwennmacher1> yes. I see there are aggregators already written for portlet execution and tab mapping. I haven't checked to see if they are used yet. Would these be good candidates to consider since some of the foundation work appears to be present? I'm still somewhat of a newbie; I've done a bit of portlet work but not uPortal yet. I only have a few days to contribute before I'm off to another project for a while.

[11:44:27 CST(-0600)] <jwennmacher1> Reporting portlets are what Drew and I discussed.

[11:45:06 CST(-0600)] <EricDalquist> yeah these would be the best place to start

[11:45:14 CST(-0600)] <EricDalquist> I'd probably start with tabs first

[11:45:20 CST(-0600)] <EricDalquist> as they have the simpler of the two data models

[11:45:32 CST(-0600)] <jwennmacher1> Sounds good.

[11:46:05 CST(-0600)] <jwennmacher1> Have the aggregators had adequate performance testing or will I need to be concerned about that?

[11:46:25 CST(-0600)] <EricDalquist> yeah they have had a lot of performance testing

[11:46:28 CST(-0600)] <EricDalquist> many hours with a profile

[11:46:32 CST(-0600)] <EricDalquist> profiler*

[11:46:42 CST(-0600)] <EricDalquist> what I would recommend is starting with LoginTotalsStatisticsController

[11:46:45 CST(-0600)] <EricDalquist> copying that

[11:47:03 CST(-0600)] <EricDalquist> and reworking it to work against the TabRenderAggregationDao

[11:47:45 CST(-0600)] <EricDalquist> so the tab renders have one more "dimension" than logins to

[11:48:06 CST(-0600)] <EricDalquist> logs have: date&time & group

[11:48:17 CST(-0600)] <EricDalquist> logins have* (sorry for all the typos this morning)

[11:48:31 CST(-0600)] <EricDalquist> tab renders have: date&time, group & tab name

[11:48:38 CST(-0600)] <EricDalquist> so that is a little bit of added complexity

[11:49:20 CST(-0600)] <EricDalquist> ConcurrentUsersStatisticsController and LoginTotalsStatisticsController are good examples to get you started though

[11:49:47 CST(-0600)] <EricDalquist> the reporting portlet should just "auto detect" any other controllers that implement BaseStatisticsReportController

[11:49:52 CST(-0600)] <EricDalquist> and show it in the report list

[11:50:13 CST(-0600)] <EricDalquist> so just a copy and paste of LoginTotalsStatisticsController and then reworking for tab renders will be a good first step

[11:50:42 CST(-0600)] <EricDalquist> once you get that working and are more comfortable we can talk about additional report uis

[11:52:20 CST(-0600)] <EricDalquist> since tab renders track render count and then a bunch of data about the render time: sum of squares, population variance, geometric mean, sum of logs, mean, variance, standard deviation, max, min, and sum

[11:52:26 CST(-0600)] <EricDalquist> so lots of time to render data

[11:52:43 CST(-0600)] <EricDalquist> which could turn into some interesting reports

[11:52:46 CST(-0600)] <EricDalquist> even non-graph reports
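
To show how several of the listed statistics relate to one another, here is a minimal sketch that derives them from a handful of running totals. It uses the textbook formulas directly and assumes strictly positive render times (so the logarithm is defined); it is not how uPortal computes or stores these values, and the class name is hypothetical.

```java
// Hypothetical accumulator for render-time statistics; not uPortal's implementation.
class TimingStatsSketch {
    private long count;
    private double sum;
    private double sumOfSquares;
    private double sumOfLogs;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    // Record one render time; assumes millis > 0 so Math.log is defined.
    void add(double millis) {
        count++;
        sum += millis;
        sumOfSquares += millis * millis;
        sumOfLogs += Math.log(millis);
        min = Math.min(min, millis);
        max = Math.max(max, millis);
    }

    double mean() {
        return sum / count;
    }

    // Population variance: E[x^2] - (E[x])^2. Simple, but can lose precision on large data.
    double populationVariance() {
        return sumOfSquares / count - mean() * mean();
    }

    // Sample variance divides by (n - 1) instead of n.
    double sampleVariance() {
        return (sumOfSquares - sum * sum / count) / (count - 1);
    }

    double standardDeviation() {
        return Math.sqrt(sampleVariance());
    }

    // Geometric mean recovered from the sum of logs.
    double geometricMean() {
        return Math.exp(sumOfLogs / count);
    }

    double min() { return min; }
    double max() { return max; }
}
```

Keeping only running totals means two partial aggregates can be merged by adding their fields, which is convenient when rolling small time intervals up into larger ones.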

[11:53:05 CST(-0600)] <EricDalquist> like for the portlet execution side (which tracks the same timing data) we'd love to have a "slowest portlet" report

[11:53:21 CST(-0600)] <EricDalquist> like I can log into the portal and see which portlets are taking the longest to render over the last 5 minutes

[11:53:44 CST(-0600)] <EricDalquist> ok … I think I'm done with my wall of text

[11:53:50 CST(-0600)] <EricDalquist> I'll be around all day/week

[11:54:03 CST(-0600)] <EricDalquist> so just poke me if you have questions or even want to chat about report ideas

[11:57:11 CST(-0600)] <jwennmacher1> Thanks. good idea on slowest portlet. For tabs what are the 'groups' you mentioned as another dimension? It's same as normal groups (everyone, students, etc.)?

[11:57:47 CST(-0600)] <EricDalquist> yes, but to insulate from portal config changes the event aggregation has its own group, tab and portlet lookup tables

[11:58:23 CST(-0600)] <EricDalquist> AggregatedGroupLookupDao, AggregatedTabLookupDao, AggregatedPortletLookupDao

[11:58:40 CST(-0600)] <EricDalquist> these capture the group/tab/portlet data from the primary uPortal daos the first time it is seen

[11:58:48 CST(-0600)] <EricDalquist> and the stats data actually refers to these

[11:58:54 CST(-0600)] <EricDalquist> this is so that if, say, a tab or portlet is deleted

[11:58:59 CST(-0600)] <EricDalquist> you don't lose the stats data about it
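
The "capture on first sight" behaviour of the lookup tables might look something like the sketch below. It is only an in-memory illustration under assumed names (`TabLookupSketch`, `getOrCreate`, `nameFor`); the real AggregatedTabLookupDao is database-backed and its methods are not shown in this log.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for a lookup DAO; not uPortal's actual API.
class TabLookupSketch {
    private final Map<String, Long> idsByTabKey = new ConcurrentHashMap<>();
    private final Map<Long, String> namesById = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    // Returns a stable id for the tab, recording its name only the first time it is seen.
    long getOrCreate(String tabKey, String currentName) {
        return idsByTabKey.computeIfAbsent(tabKey, key -> {
            long id = nextId.getAndIncrement();
            namesById.put(id, currentName);
            return id;
        });
    }

    // Still answers after the tab has been renamed or deleted in the portal itself.
    String nameFor(long id) {
        return namesById.get(id);
    }
}
```

Statistics rows would then reference the lookup id rather than the live tab, so deleting the tab in the portal leaves the historical data readable.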

[11:59:18 CST(-0600)] <EricDalquist> note that you may well run into areas where there are missing APIs

[11:59:30 CST(-0600)] <EricDalquist> like no way to get a list of all the tabs in the lookup dao

[11:59:38 CST(-0600)] <EricDalquist> this is simply due to nothing needing that api yet

[11:59:49 CST(-0600)] <EricDalquist> so you or I will need to add those APIs when you find the holes

[12:01:22 CST(-0600)] <jwennmacher1> Ahh I see what you mean about insulating. Gotcha. Thanks for the overview. That helps me quite a bit.

[12:01:44 CST(-0600)] <jwennmacher1> I'm sure I'll have tons of questions as I dig into it (smile)

[12:04:18 CST(-0600)] <EricDalquist> (smile)

[12:06:42 CST(-0600)] <EricDalquist> drewwills: you have a few minutes to talk about person directory?

[12:06:48 CST(-0600)] <EricDalquist> on a more abstract level?

[12:30:34 CST(-0600)] <drewwills> i will EricDalquist, sure

[12:31:04 CST(-0600)] <EricDalquist> so looking at PD on a higher level with various features

[12:31:12 CST(-0600)] <EricDalquist> what do you think of the current sql/ldap query templating

[12:31:28 CST(-0600)] <EricDalquist> where you stick a {0} in where you want the search/restrictions to appear
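
For readers unfamiliar with the `{0}` style of templating, the sketch below shows the general idea using `java.text.MessageFormat` from the JDK. Whether Person Directory performs the substitution this way is an assumption; the table and column names are invented for the example.

```java
import java.text.MessageFormat;

// Hypothetical illustration of positional query templating; not Person Directory's code.
class QueryTemplateSketch {
    // Replaces the {0} placeholder with the generated restriction clause.
    static String expand(String template, String whereClause) {
        return MessageFormat.format(template, whereClause);
    }

    public static void main(String[] args) {
        // Made-up schema: "person", "uid", "mail" and "netid" are placeholders.
        String template = "SELECT uid, mail FROM person WHERE {0}";
        System.out.println(expand(template, "netid = ?"));
    }
}
```

A single positional slot keeps simple queries easy to configure, but it leaves no room to place individual values elsewhere in the statement, which is the flexibility concern raised in the messages that follow. Note also that `MessageFormat` treats single quotes specially, so templates containing SQL string literals would need extra care.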

[13:21:14 CST(-0600)] <drewwills1> sorry EricDalquist... i was on a call

[13:21:26 CST(-0600)] <EricDalquist> no problem

[13:21:59 CST(-0600)] <drewwills1> the issue i run into with that is that sometimes i need more flexibility

[13:22:22 CST(-0600)] <EricDalquist> yeah

[13:22:25 CST(-0600)] <EricDalquist> that is my thought as well

[13:22:29 CST(-0600)] <drewwills1> i may need to "select foo from bar where netId = {username}"

[13:22:29 CST(-0600)] <EricDalquist> I'm not sure what a solution is though

[13:22:58 CST(-0600)] <EricDalquist> since that works great for simple queries but doesn't work for attribute sources that can have a more flexible search done

[13:23:07 CST(-0600)] <drewwills1> one sec...

[13:23:07 CST(-0600)] <EricDalquist> I'm open to all ideas here (smile)
