uPortal IRC Logs-2012-02-29

[09:29:55 CST(-0600)] <brandon> hello

[09:30:07 CST(-0600)] <Arvids> hi

[09:30:15 CST(-0600)] <EricDalquist> hi

[09:30:42 CST(-0600)] <brandon> i'm using the email preview portlet 1.1.2, over imap and we are seeing timeout issues for users will large inboxes

[09:30:53 CST(-0600)] <brandon> are there any strategies for dealing with large inboxes?

[09:32:46 CST(-0600)] <Arvids> I won´t be able to help - not familiar with that portlet

[09:32:49 CST(-0600)] <brandon> we have the connectionTimeout and timeout settings set to 25000 but it seems to be due to large inbox size

[09:33:00 CST(-0600)] <EricDalquist> could just be a bug in the portlet

[09:33:10 CST(-0600)] <EricDalquist> and it needs a better strategy for loading the summary

[09:33:19 CST(-0600)] <EricDalquist> or it could be an intrinsic shortcoming in the javax.mail API

[09:33:37 CST(-0600)] <EricDalquist> but I'm guessing a good solution will require digging into the portlet's code

[09:34:22 CST(-0600)] <brandon> ok

[09:35:38 CST(-0600)] <Arvids> EricDalquist, have you heard of anyone who has 4.0 in production?

[09:35:50 CST(-0600)] <EricDalquist> yes

[09:35:56 CST(-0600)] <EricDalquist> though I can't remember who

[09:36:11 CST(-0600)] <EricDalquist> I think there was a recent user email list post from someone

[09:36:14 CST(-0600)] <Arvids> hmm... i´m curious to find out about their deployment size

[09:36:20 CST(-0600)] <EricDalquist> we're ~5 weeks away from go-live

[09:36:33 CST(-0600)] <EricDalquist> and we'll be deploying to ~ 120k+ users

[09:36:53 CST(-0600)] <Arvids> have you done some stress testing?

[09:37:00 CST(-0600)] <EricDalquist> not yet

[09:37:07 CST(-0600)] <EricDalquist> we'll be doing that in a week or two

[09:37:15 CST(-0600)] <Arvids> our experience shows that portal is highly CPU-intensive

[09:37:45 CST(-0600)] <EricDalquist> you're on 3.2 right now right?

[09:37:57 CST(-0600)] <Arvids> our current production is 3.1

[09:38:05 CST(-0600)] <Arvids> but we were benchmarking 4.0 deployment

[09:38:49 CST(-0600)] <EricDalquist> same hardware?

[09:39:22 CST(-0600)] <Arvids> exactly the same

[09:39:45 CST(-0600)] <EricDalquist> have you tried running it with yourkit watching?

[09:39:54 CST(-0600)] <Arvids> yes

[09:40:02 CST(-0600)] <Arvids> ... on my local computer

[09:40:49 CST(-0600)] <Arvids> my laptop is i7, hence it could handle 20 users without big problems... at that point jMeter machine started to hang up (i3 laptop)

[09:41:21 CST(-0600)] <EricDalquist> what is your server hardware?

[09:42:45 CST(-0600)] <Arvids> quad core xeon (E5420) with 4G ram

[09:43:18 CST(-0600)] <Arvids> ... and dedicated DB server

[09:43:30 CST(-0600)] <EricDalquist> if you'd like I could take a look at the yourkit cpu profile data

[09:44:25 CST(-0600)] <Arvids> will results from my laptop be sufficient?

[09:45:08 CST(-0600)] <EricDalquist> they would be a good first start

[09:46:14 CST(-0600)] <EricDalquist> you want a place to upload the profile data?

[09:46:18 CST(-0600)] <Arvids> now i must figure out which snapshot would give the best overview...

[09:46:34 CST(-0600)] <Arvids> ahh.. place won´t be a problem (smile)

[09:54:57 CST(-0600)] <Arvids> http://arvids.lv/Tomcat-2012-02-28.snapshot

[09:55:25 CST(-0600)] <EricDalquist> nice bandwidth (smile)

[09:56:00 CST(-0600)] <Arvids> strangely enoug Latvia has one of the best average brandwidth (smile)

[09:56:40 CST(-0600)] <Arvids> btw, i´m using sampling, not tracing - i hope that´s ok

[09:57:57 CST(-0600)] <EricDalquist> yeah, that's fine

[10:00:32 CST(-0600)] <EricDalquist> interesting

[10:00:43 CST(-0600)] <Arvids> i´ll now try to set up performance comparision between MSSQL and Postgres

[10:00:51 CST(-0600)] <EricDalquist> looks like deleting from the up_event_session_groups table is the top SQL time user

[10:01:00 CST(-0600)] <EricDalquist> buy a huge margin

[10:02:04 CST(-0600)] <Arvids> that does not concern me so much since it is in aggregator thread wich does not impact rendering a lot

[10:02:32 CST(-0600)] <EricDalquist> well if its chewing CPU time it could be a problem

[10:03:22 CST(-0600)] <Arvids> but it should chew DB CPU time

[10:03:26 CST(-0600)] <EricDalquist> yeah

[10:03:51 CST(-0600)] <Arvids> but our DB server hardly notices those stress tests

[10:03:58 CST(-0600)] <EricDalquist> that's good

[10:05:29 CST(-0600)] <Arvids> ... and CPU consumption should not be caused by our portlets since they´re mostly DB-intensive

[10:06:03 CST(-0600)] <Arvids> i´m somewhat worried about those XSL transformations

[10:06:39 CST(-0600)] <EricDalquist> yeah but they are pretty much the same XSL transforms that uPortal has always used

[10:10:31 CST(-0600)] <EricDalquist> so one potential hotspot I see is PortletWindowRegistryImpl.getAllLayoutPortletWindows(HttpServletRequest request)

[10:11:25 CST(-0600)] <EricDalquist> actually

[10:12:10 CST(-0600)] <EricDalquist> looks like it might be the AuthZ check in org.jasig.portal.portlet.registry.PortletEntityRegistryImpl.checkPortletDefinitionRenderPermissions(IUserInstance, IPortletDefinition)

[10:13:27 CST(-0600)] <EricDalquist> hrm

[10:14:47 CST(-0600)] <EricDalquist> Arvids: you still there?

[10:14:54 CST(-0600)] <EricDalquist> I think org.jasig.portal.security.provider.AuthorizationImpl.doesPrincipalHavePermission(IAuthorizationPrincipal, String, String, String) may be at fault

[10:15:17 CST(-0600)] <Arvids> yes

[10:15:36 CST(-0600)] <Arvids> i´m preparing for another benchmark round

[10:15:44 CST(-0600)] <EricDalquist> actually org.jasig.portal.security.provider.AuthorizationImpl.doesPrincipalHavePermission(IAuthorizationPrincipal, String, String, String, IPermissionPolicy)

[10:16:03 CST(-0600)] <EricDalquist> so there appears to be no caching on that permission check resolution

[10:16:27 CST(-0600)] <EricDalquist> I'd figure out how to get a hold of an Ehcache instance in that method

[10:16:38 CST(-0600)] <Arvids> sounds promising

[10:16:42 CST(-0600)] <EricDalquist> and cache the result of policy.doesPrincipalHavePermission

[10:17:23 CST(-0600)] <EricDalquist> but that one method call appears to be responsible for over 50% of the time spent in the rendering pipeline

[10:17:35 CST(-0600)] <EricDalquist> the cache timeout there will determine how long permissions changes take to propegate

[10:20:09 CST(-0600)] <Arvids> permissions are not changed too frequently, hence even pretty large values can be used there

[10:20:59 CST(-0600)] <EricDalquist> yeah

[10:21:28 CST(-0600)] <EricDalquist> I'd also look at adding @RequestCache to org.jasig.portal.layout.dlm.RDBMDistributedLayoutStore.getFragmentNodeInfo(String)

[10:23:56 CST(-0600)] <EricDalquist> some more smaller stuff:

[10:24:24 CST(-0600)] <EricDalquist> in org.jasig.portal.portlet.dao.jpa.ThreadContextClassLoaderAspect it does this.getClass().getClassLoader() every time an intercepted method is invoked

[10:24:35 CST(-0600)] <EricDalquist> that will never change and should be done in the constructor

[10:24:45 CST(-0600)] <EricDalquist> and the ClassLoader ref should be stored in a private final field

[10:27:32 CST(-0600)] <EricDalquist> yeah so of the 1232 seconds spent rendering the user's layout 74 seconds was spent in XSLT

[10:28:10 CST(-0600)] <EricDalquist> so about 6%

[10:28:53 CST(-0600)] <Arvids> hmm... that won´t be the place where to get noticeable improvements

[10:29:00 CST(-0600)] <EricDalquist> nope

[10:29:11 CST(-0600)] <EricDalquist> I'm pretty sure that permissions check is the biggest culprit

[10:30:08 CST(-0600)] <EricDalquist> yeah

[10:30:20 CST(-0600)] <EricDalquist> every expensive path I trace in the rendering comes back to that permissions check

[10:32:14 CST(-0600)] <EricDalquist> actually its even less, 56 seconds was spent on XSLT

[10:32:32 CST(-0600)] <EricDalquist> so 4.5%

[10:32:44 CST(-0600)] <Arvids> i believe other rendering engines won´t do much better

[10:32:53 CST(-0600)] <Arvids> ... even if not worse

[10:32:57 CST(-0600)] <EricDalquist> yeah

[10:33:07 CST(-0600)] <EricDalquist> and most of the XSLT work only happens once per user

[10:33:12 CST(-0600)] <EricDalquist> then they are just hitting cache

[10:33:20 CST(-0600)] <EricDalquist> but those perm checks happen every time

[10:33:25 CST(-0600)] <EricDalquist> and multiple times per portlet on the page

[10:33:33 CST(-0600)] <Arvids> :O

[10:33:38 CST(-0600)] <EricDalquist> it would be good if we could add @RequestCache to those too

[10:33:46 CST(-0600)] <EricDalquist> but that permissions code is all the old static init stuff

[10:33:50 CST(-0600)] <EricDalquist> it might be doable

[10:33:56 CST(-0600)] <EricDalquist> but just adding the Ehcache may be good enough

[10:35:48 CST(-0600)] <EricDalquist> hrm

[10:35:54 CST(-0600)] <EricDalquist> another thing that could be changed

[10:36:52 CST(-0600)] <EricDalquist> actually ... never mind (tongue)

[10:37:49 CST(-0600)] <EricDalquist> so yeah

[10:37:56 CST(-0600)] <EricDalquist> if you want to do that handfull of changes and try it again

[10:41:24 CST(-0600)] <Arvids> ok, thanks for valuable hints

[10:41:31 CST(-0600)] <EricDalquist> no problem

[10:41:43 CST(-0600)] <EricDalquist> thanks for the time spent gathering the yourkit data and making the changes

[10:41:56 CST(-0600)] <EricDalquist> feel free to send me any new result data to look at

[10:42:16 CST(-0600)] * EricDalquist loves the details the yourkit provides

[11:08:51 CST(-0600)] <Arvids> somehow i must learn to read those details, too

[11:09:27 CST(-0600)] <Arvids> there´s so much information that i can´t get my head around all of it

[11:09:36 CST(-0600)] <EricDalquist> I wonder if skype for linux does screen sharing

[11:09:44 CST(-0600)] <EricDalquist> next one you send me I could show you what I'm looking at

[11:10:11 CST(-0600)] <EricDalquist> looks like it does

[11:10:18 CST(-0600)] <Arvids> yes, it does

[11:10:20 CST(-0600)] <EricDalquist> so if you are interested I could do that

[11:10:29 CST(-0600)] <Arvids> ok, maybe tomorrow

[11:10:37 CST(-0600)] <EricDalquist> well maybe next week

[11:10:41 CST(-0600)] <Arvids> (smile)

[11:10:47 CST(-0600)] <EricDalquist> I'm going to play at a water park for the next 3 days (sad))

[11:10:48 CST(-0600)] <EricDalquist> (smile)

[11:11:39 CST(-0600)] <Arvids> have to go now, see you later

[11:11:52 CST(-0600)] <EricDalquist> later

[11:58:01 CST(-0600)] <holdorph> Google+ Hangouts do screensharing now, fwiw.

[11:59:11 CST(-0600)] <EricDalquist> oh nice

[11:59:25 CST(-0600)] <EricDalquist> maybe I'll advertise it (smile)

[11:59:36 CST(-0600)] <holdorph> it was announced when they added it.

[12:00:12 CST(-0600)] <holdorph> works pretty well for up to 10 people. But hangouts are limited to 10 or fewer right now. so after that you need something else.

[12:01:02 CST(-0600)] <holdorph> but it works in Linux where nearly all the other screensharing options (at least all I've tried) are failing right now

[12:01:15 CST(-0600)] <EricDalquist> nice

[12:01:18 CST(-0600)] <holdorph> it's a shame too, because some of them, like webex and adobe connect used to work in the past.

[12:01:32 CST(-0600)] <EricDalquist> well I'll coordinate with arvids and maybe announce something on the up-dev list for Monday morning

[12:01:40 CST(-0600)] <EricDalquist> we can go over how to use yourkit to trace

[12:02:37 CST(-0600)] <holdorph> in the future, they're going to add direct-record-to-youtube integration with the google+ hangout. it's already available for some celebrities, or people with a lot of 'followers'

[12:33:51 CST(-0600)] <brandon> hello

[12:34:15 CST(-0600)] <brandon> on uportal 3.2.4, is there any way to set the default column layout for the Welcome tab?

[12:34:35 CST(-0600)] <brandon> so that it is 2 equal width columns for every present user and future user?

[12:35:53 CST(-0600)] <athena> it'll probably take an update to the UpdatePreferencesServlet

[12:38:14 CST(-0600)] <brandon> the actual source file?

[12:45:27 CST(-0600)] <athena> yes

[12:45:39 CST(-0600)] <athena> there's no configuration option for that, but you could implement it as a custom code change, i think

[13:28:44 CST(-0600)] <EricDalquist> athena: I see you tracked down that layout bug

[13:33:18 CST(-0600)] <athena> yes!

[13:33:24 CST(-0600)] <athena> turned out to be really simple

[13:33:27 CST(-0600)] <athena> though took me a while to find it

[13:33:31 CST(-0600)] <EricDalquist> yeah I saw that

[13:33:53 CST(-0600)] <athena> glad to have that tracked down - feeling pretty good about the mobile capabilities now that that's resolved

[13:34:13 CST(-0600)] <EricDalquist> great

[13:34:26 CST(-0600)] <EricDalquist> arvids is supplying me with yourkit snapshots

[13:34:30 CST(-0600)] <EricDalquist> so we're tracking down some perf issues

[13:34:34 CST(-0600)] <athena> oh terrific!

[13:34:47 CST(-0600)] <EricDalquist> he asked about a howto for yourkit

[13:34:56 CST(-0600)] <EricDalquist> so I think I'm going to try to setup a g+ hangout for Monday morning

[13:35:02 CST(-0600)] <EricDalquist> and do a demo for uP devs

[13:36:01 CST(-0600)] <EricDalquist> in his logs AuthorizationService was responsible for over 50% of the CPU usage (tongue)

[13:38:36 CST(-0600)] <athena> nice!

[13:38:38 CST(-0600)] <athena> ok, bbiab . . .