uPortal IRC Logs-2008-08-14

[08:17:33 EDT(-0400)] * dstn (n=dstn@unaffiliated/dstn) has joined ##uportal
[09:00:04 EDT(-0400)] * athena7 (n=athena7@adsl-99-149-83-32.dsl.wlfrct.sbcglobal.net) has joined ##uportal
[09:08:50 EDT(-0400)] * colinclark (n=colin@bas1-toronto09-1279543344.dsl.bell.ca) has joined ##uportal
[09:24:01 EDT(-0400)] * EricDalquist (n=dalquist@bohemia.doit.wisc.edu) has joined ##uportal
[09:32:26 EDT(-0400)] * athena7 (n=athena7@adsl-99-184-128-151.dsl.wlfrct.sbcglobal.net) has joined ##uportal
[09:37:42 EDT(-0400)] * anastasiac (n=team@142.150.154.160) has joined ##uportal
[09:45:39 EDT(-0400)] * agherna (n=agherna@panache.ci.uiuc.edu) has joined ##uportal
[10:12:26 EDT(-0400)] * bulloche (n=bulloche@134.250.4.77) has joined ##uportal
[10:19:39 EDT(-0400)] * colinclark (n=colin@142.150.154.101) has joined ##uportal
[10:46:59 EDT(-0400)] <EricDalquist> well adding a robots.txt to exclude fisheye from search engine crawling seems to have fixed our need to reboot
[10:47:31 EDT(-0400)] <EricDalquist> though it is annoying that the fisheye content won't be searchable other than through its own UI
[10:51:20 EDT(-0400)] <athena7> oh, interesting
[10:51:50 EDT(-0400)] <EricDalquist> fisheye was seeing around a total of 50 requests/minute just from robots
[10:51:57 EDT(-0400)] <athena7> wow.
[10:52:01 EDT(-0400)] <EricDalquist> lots of pages to index
[10:52:02 EDT(-0400)] <athena7> poor fisheye
[10:52:03 EDT(-0400)] <athena7> yeah
[10:52:08 EDT(-0400)] <EricDalquist> think of every revision of every file
[10:52:12 EDT(-0400)] <athena7> yeah
[10:52:12 EDT(-0400)] <EricDalquist> plus all of the diffs
[10:52:15 EDT(-0400)] <athena7> yeah
[10:52:16 EDT(-0400)] <athena7> (sad)
[10:52:30 EDT(-0400)] <EricDalquist> so if you see it get unresponsive again let ScottB or I know
[10:52:39 EDT(-0400)] <EricDalquist> but right now the crond restart is off
[10:52:42 EDT(-0400)] <athena7> by the way, i'm having some weird problems with the ajax channel browser in the uportal trunk
[10:52:48 EDT(-0400)] <EricDalquist> yay
[10:52:49 EDT(-0400)] <athena7> i figured i'd broken something w/ the namespacing and fluid work
[10:52:55 EDT(-0400)] <athena7> but the post looks correct
[10:53:51 EDT(-0400)] <EricDalquist> what are the problems?
[10:54:05 EDT(-0400)] <athena7> http://uportal.pastebin.com/m13b2eeb1
[10:56:09 EDT(-0400)] <athena7> firing up my debugger
[11:05:30 EDT(-0400)] <athena7> oh ok, i think it's just that i happened to pick a broken portlet
[11:05:35 EDT(-0400)] <EricDalquist> ah
[11:05:43 EDT(-0400)] <EricDalquist> which would fail to init
[11:06:03 EDT(-0400)] <athena7> yes (smile)
[11:06:14 EDT(-0400)] <athena7> should probably do something about that, but at least it's not really broken
[11:20:39 EDT(-0400)] <athena7> so did i mention yesterday the IE drag and drop issue in up3?
[11:20:51 EDT(-0400)] <EricDalquist> yeah
[11:20:56 EDT(-0400)] <athena7> ok
[11:20:58 EDT(-0400)] <athena7> wanted to make sure
[11:20:59 EDT(-0400)] <EricDalquist> that it probably won't get fixed until 3.1?
[11:21:04 EDT(-0400)] <athena7> well
[11:21:14 EDT(-0400)] <athena7> it will get fixed as soon as we migrate to the fluid reorderer
[11:22:10 EDT(-0400)] <EricDalquist> well that is the big feature pegged for 3.1
[11:22:19 EDT(-0400)] <athena7> yeah
[11:22:38 EDT(-0400)] <athena7> if they can find a workaround for the document.write issues that'd be so awesome
[11:22:42 EDT(-0400)] <athena7> the google portlet crashes my browser so hard
[11:22:55 EDT(-0400)] <EricDalquist> is that a related issue?
[11:27:01 EDT(-0400)] <athena7> well
[11:27:22 EDT(-0400)] <athena7> it's the result of an unfixed jquery bug, so it exists with both the present uportal drag and drop and with the fluid reorderer
[11:27:31 EDT(-0400)] <athena7> but fluid seems interested in trying to find a workaround
[11:35:48 EDT(-0400)] <athena7> so jQuery UI changed their packaging such that you can decide which components you'd like included and get back one big minified file
[11:36:02 EDT(-0400)] <athena7> which leaves the question of which components we think we'd like
[11:37:39 EDT(-0400)] <athena7> do we want to just include everything? try and target what the portal needs?
[11:37:57 EDT(-0400)] <EricDalquist> hrm
[11:38:25 EDT(-0400)] <EricDalquist> if namespacing worked well I would say minimal
[11:38:46 EDT(-0400)] <EricDalquist> but could that cause problems with other portlets using jquery?
[11:39:26 EDT(-0400)] <athena7> well, potentially
[11:39:33 EDT(-0400)] <athena7> it'd be best to include everything we need
[11:39:36 EDT(-0400)] <athena7> but nothing extra
[11:39:40 EDT(-0400)] <athena7> from a performance standpoint
[11:39:47 EDT(-0400)] <athena7> but different people may have different needs
[11:40:12 EDT(-0400)] <EricDalquist> yeah ...
[11:40:36 EDT(-0400)] <athena7> so we could include a default that probably works
[11:40:45 EDT(-0400)] <athena7> and let people replace it if necessary
[11:40:46 EDT(-0400)] <EricDalquist> for the portal world we really need a JS library that can be loaded from named files and loaded via passing in a namespace string
[11:41:06 EDT(-0400)] <athena7> but we want to make sure it has a unique name so that it can get perma-cached like we talked about
[11:41:09 EDT(-0400)] <EricDalquist> but I don't know enough JS to have any clue if that is possible or would work
[11:41:17 EDT(-0400)] <EricDalquist> yeah
[11:41:39 EDT(-0400)] <athena7> well we could go back to having separate files for each component
[11:41:49 EDT(-0400)] <athena7> better from a performance standpoint to combine it all, probably though
[11:42:39 EDT(-0400)] <EricDalquist> hrm
[11:42:43 EDT(-0400)] <athena7> from the yahoo documentation, which tuy mentioned before: http://developer.yahoo.com/performance/rules.html#num_http
[11:42:50 EDT(-0400)] <EricDalquist> yeah
[11:42:55 EDT(-0400)] <EricDalquist> it would be fewer requests
[11:43:00 EDT(-0400)] <EricDalquist> and less http overhead
[11:43:09 EDT(-0400)] <athena7> i think it probably makes sense to just take a stab at what's likely necessary
[11:43:15 EDT(-0400)] <EricDalquist> I think we need to go to the dev list for this ... I really don't know
[11:43:18 EDT(-0400)] <athena7> and document it thoroughly
[11:43:23 EDT(-0400)] <athena7> yes, that's probably a good idea
[11:44:14 EDT(-0400)] <athena7> hm
[12:12:00 EDT(-0400)] <athena7> how many active branches do we have right now? is it just the trunk and rel-3-0-patches?
[12:12:11 EDT(-0400)] <EricDalquist> yeah
[12:12:13 EDT(-0400)] <athena7> ok
[12:12:15 EDT(-0400)] <EricDalquist> there was a gap branch
[12:12:19 EDT(-0400)] <athena7> and that's gone?
[12:12:20 EDT(-0400)] <EricDalquist> but it is kind of dead
[12:12:33 EDT(-0400)] <EricDalquist> I need to delete and recreate it once someone is actually going to do work on it
[12:13:03 EDT(-0400)] <athena7> ah ok
[12:13:12 EDT(-0400)] <athena7> so we don't need to worry about it for now anyway
[12:14:55 EDT(-0400)] <athena7> so when i do the fluid commits i guess i should block them from going into the 3-0-patches branch?
[12:15:09 EDT(-0400)] <EricDalquist> yeah
[12:15:16 EDT(-0400)] <athena7> ok
[12:15:20 EDT(-0400)] <EricDalquist> you've seen how to do that with svnmerge?
[12:15:30 EDT(-0400)] <athena7> i haven't done it, but i saw the documentation on it
[12:16:16 EDT(-0400)] <EricDalquist> similar to merge I think
[12:16:16 EDT(-0400)] <athena7> i think we can move over the namespacing and css stuff i did though
[12:16:22 EDT(-0400)] <EricDalquist> svnmerge block -r
[12:16:35 EDT(-0400)] <EricDalquist> sounds good, just be sure to do it in different commits
[12:16:43 EDT(-0400)] <athena7> yeah
[12:16:52 EDT(-0400)] <EricDalquist> or just plan on doing the commit directly on the patches branch
[12:16:54 EDT(-0400)] <EricDalquist> either works
[12:17:12 EDT(-0400)] <athena7> yeah
[12:17:26 EDT(-0400)] <athena7> for future stuff post-fluid integration i may have to just do separate commits
[12:17:27 EDT(-0400)] <athena7> we'll see
[12:17:46 EDT(-0400)] <EricDalquist> yup
[12:17:52 EDT(-0400)] <athena7> i added in the drag and drop bug and gave it appropriate affects and fix versions
[12:17:58 EDT(-0400)] <EricDalquist> eventually stuff diverges to where it can't be easily merged
[12:17:59 EDT(-0400)] <athena7> so at least we'll have it documented for the 3.0 branch
[12:18:02 EDT(-0400)] <athena7> yeah
[12:18:03 EDT(-0400)] <EricDalquist> the poms are already like that
[12:18:06 EDT(-0400)] <athena7> i'm sure
[12:46:14 EDT(-0400)] * athena7 (n=athena7@adsl-99-149-83-32.dsl.wlfrct.sbcglobal.net) has joined ##uportal
[13:23:20 EDT(-0400)] * glenda (n=ggonzale@uni1.unicon.net) has joined ##uportal
[13:28:09 EDT(-0400)] <EricDalquist> FYI .. the URL object is EVIL
[13:28:46 EDT(-0400)] <EricDalquist> it does blocking DNS lookups when you call .equals or .hashcode on it
[13:28:55 EDT(-0400)] <EricDalquist> and those values can CHANGE depending on what the DNS resolves to
[13:31:45 EDT(-0400)] <athena7> yeah that whole class is evil
[13:31:52 EDT(-0400)] <athena7> that's um, great though
[13:32:05 EDT(-0400)] <EricDalquist> yeah
[13:32:10 EDT(-0400)] <athena7> doesn't some of the java dns stuff get cached inappropriately long as well?
[13:32:17 EDT(-0400)] <EricDalquist> forever
[13:32:23 EDT(-0400)] <athena7> yeah that's what i thought
[13:32:32 EDT(-0400)] <EricDalquist> used a Map<URL, Document> as my cache
[13:32:42 EDT(-0400)] <athena7> oh.
[13:32:44 EDT(-0400)] <EricDalquist> it was more expensive to use the URL as the key than just re-create the document
[13:32:50 EDT(-0400)] <athena7> wow.
[13:32:53 EDT(-0400)] <athena7> that's horrible!
[13:32:55 EDT(-0400)] <EricDalquist> because of the expense involved with calling .hashCode and .equals
[13:32:57 EDT(-0400)] <EricDalquist> yup
[13:33:06 EDT(-0400)] <EricDalquist> and there is neat stuff like:
[13:33:09 EDT(-0400)] <athena7> can you key it on the string value of the URL or something instead, or is that hideous as well?
[13:33:27 EDT(-0400)] <EricDalquist> new URL("http://foo.com").equals(new URL("http://bar.com"));
[13:33:34 EDT(-0400)] <EricDalquist> that is true if they resolve to the same IP
[13:33:40 EDT(-0400)] <athena7> WHYYYYY
[13:33:41 EDT(-0400)] <athena7> (sad)
[13:33:44 EDT(-0400)] <EricDalquist> yeah
[13:34:00 EDT(-0400)] <EricDalquist> so I'm going to use the String used to create the URL as the key
[13:34:17 EDT(-0400)] <athena7> yeah that makes sense
[13:36:18 EDT(-0400)] <EricDalquist> and of course Sun can never fix URL
[13:36:24 EDT(-0400)] <EricDalquist> since people may depend on that behavior
[13:36:33 EDT(-0400)] <EricDalquist> great example of why API design is important (tongue)
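A minimal Java sketch of the workaround EricDalquist settles on above: key the cache on the String the URL was created from, so lookups never call `URL.hashCode()` or `URL.equals()`. The class name `LocationCache` and its `loader` parameter are hypothetical, and any value type can stand in for the cached Document.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper for illustration; not uPortal or Cernunnos code.
final class LocationCache<V> {
    // Keys are the original location strings, so comparing them never triggers a DNS lookup.
    private final Map<String, V> entries = new ConcurrentHashMap<>();

    // Returns the cached value for the location, loading it once on first use.
    V get(String location, Function<String, V> loader) {
        return entries.computeIfAbsent(location, loader);
    }

    // Number of distinct locations cached so far.
    int size() {
        return entries.size();
    }
}
```

With String keys, two host names that resolve to the same IP remain separate entries because the comparison is purely textual. `java.net.URI` is another possible key type, since its `equals` compares components without resolving the host.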
[13:43:37 EDT(-0400)] * glendago (n=ggonzale@uni1.unicon.net) has left ##uportal
[13:44:07 EDT(-0400)] <athena7> yeah
[14:13:11 EDT(-0400)] * colinclark (n=colin@142.150.154.101) has joined ##uportal
[14:49:12 EDT(-0400)] <EricDalquist> hey agherna, is Drew anywhere near you?
[14:49:34 EDT(-0400)] <agherna> i think he may be coming back from lunch soon
[14:54:00 EDT(-0400)] <EricDalquist> could you ask him if he'd be willing to jump into irc for a few minutes?
[14:54:50 EDT(-0400)] <agherna> yes
[14:54:58 EDT(-0400)] <EricDalquist> thanks
[14:56:17 EDT(-0400)] <agherna> np
[14:57:56 EDT(-0400)] * lennar1 (n=sparhk@wsip-98-174-242-39.ph.ph.cox.net) has joined ##uportal
[14:58:59 EDT(-0400)] <len> wonder why it won't let me use 'lennard' am always assigned 'lennar1'
[14:59:28 EDT(-0400)] <EricDalquist> probably someone else on freenode using lennard
[14:59:35 EDT(-0400)] <EricDalquist> yup
[14:59:41 EDT(-0400)] <EricDalquist> do '/whois lennard'
[15:00:16 EDT(-0400)] <athena7> or /ns info lennard
[15:01:06 EDT(-0400)] <athena7> unfortunately people seem to be able to squat on accounts here for a while
[15:01:32 EDT(-0400)] <EricDalquist> well lennard is actually on right now
[15:13:35 EDT(-0400)] <len> has anyone of you heard of uPortal being deployed in JBoss AS?
[15:14:52 EDT(-0400)] <athena7> yeah i think "athena" hasn't actually been used in almost a year
[15:17:28 EDT(-0400)] * awills (n=awills@ras117.admin.uillinois.edu) has joined ##uportal
[15:17:59 EDT(-0400)] * awills tries to stop his ears from ringing
[15:18:05 EDT(-0400)] <EricDalquist> (smile)
[15:18:10 EDT(-0400)] <awills> hey folks
[15:18:21 EDT(-0400)] <len> (smile)
[15:18:22 EDT(-0400)] <EricDalquist> hey
[15:18:41 EDT(-0400)] <EricDalquist> my head is hurting from ideas/issues with this crn object caching stuff
[15:18:58 EDT(-0400)] <EricDalquist> now I'm contemplating threadlocals for objects that are re-usable but not threadsafe
[15:19:05 EDT(-0400)] <EricDalquist> instead of an object pool
[15:19:13 EDT(-0400)] <awills> oh my
[15:19:17 EDT(-0400)] <EricDalquist> just because the TLs would be easier to code
[15:19:25 EDT(-0400)] <EricDalquist> like reading the dom4j docs
[15:19:42 EDT(-0400)] <EricDalquist> their Element is not necessarily threadsafe
[15:20:05 EDT(-0400)] <awills> ahhh, yeppers
[15:20:08 EDT(-0400)] <EricDalquist> neither is a ScriptEngine depending on the return value of an attribute
[15:20:13 EDT(-0400)] <EricDalquist> neither is a Transformer
[15:20:19 EDT(-0400)] <EricDalquist> but all are really freaking expensive to create
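A rough Java sketch of the ThreadLocal idea being weighed here: each thread lazily builds its own `Transformer` and reuses it, so the expensive object is created once per thread and never shared. The class name `PerThreadTransformer` is hypothetical, the identity transformer is a stand-in for a real stylesheet, and `ThreadLocal.withInitial` assumes Java 8 or later (the JDK 5/6 equivalent would override `initialValue()`).

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

// Hypothetical holder for illustration; not Cernunnos code.
final class PerThreadTransformer {
    // One Transformer per thread, created on that thread's first get().
    private static final ThreadLocal<Transformer> LOCAL = ThreadLocal.withInitial(() -> {
        try {
            // Identity transformer as a stand-in for an expensive compiled stylesheet.
            return TransformerFactory.newInstance().newTransformer();
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException("could not create Transformer", e);
        }
    });

    // Returns the calling thread's instance; safe without locking because it is never shared.
    static Transformer get() {
        return LOCAL.get();
    }

    // Drops the calling thread's instance, which matters when threads are pooled and long-lived.
    static void release() {
        LOCAL.remove();
    }
}
```

Calling `release()` in a `finally` block is one way to keep pooled threads from holding on to instances they no longer need.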
[15:22:43 EDT(-0400)] <awills> would we have better performance creating these objects once and synchronizing on their use then?
[15:23:26 EDT(-0400)] <EricDalquist> for import/export
[15:23:27 EDT(-0400)] <EricDalquist> yes
[15:23:29 EDT(-0400)] <EricDalquist> well no
[15:23:44 EDT(-0400)] <EricDalquist> because one of the eventual steps is to thread these things
[15:23:53 EDT(-0400)] <awills> yes, exactly
[15:23:54 EDT(-0400)] <EricDalquist> and for like the Element how do you sync?
[15:23:58 EDT(-0400)] <EricDalquist> phrase returns an object
[15:24:01 EDT(-0400)] <EricDalquist> then other stuff acts on it
[15:24:05 EDT(-0400)] <EricDalquist> so there is no way to sync it
[15:24:33 EDT(-0400)] <awills> <synchronized on="${myObj}"><do-stuff/></synchronized>
[15:24:36 EDT(-0400)] <EricDalquist> yeah
[15:24:55 EDT(-0400)] <EricDalquist> that would defeat the use of threading the tasks for added performance
[15:25:19 EDT(-0400)] <EricDalquist> if it is a command line scripting tool like we use it for uPortal ThreadLocals make sense
[15:25:35 EDT(-0400)] <EricDalquist> there aren't going to be many threads and the app has a finite run time so cleanup isn't an issue
[15:25:49 EDT(-0400)] <EricDalquist> for using it in a long-running app that isn't really that great though
[15:25:55 EDT(-0400)] <EricDalquist> a real cache and object pools would be better
[15:26:02 EDT(-0400)] <EricDalquist> as they can be configured to drop unused objects
[15:26:16 EDT(-0400)] <awills> yeah
[15:26:17 EDT(-0400)] <EricDalquist> but they are a bit more overhead
[15:26:29 EDT(-0400)] <EricDalquist> and I still have no idea how we're going to do this in phrases
[15:26:54 EDT(-0400)] <EricDalquist> though I guess we could create <pool target="${phrase()}">
[15:27:06 EDT(-0400)] <awills> we might have to implement some of the more complex bits in tasks
[15:27:26 EDT(-0400)] <EricDalquist> on the plus side
[15:27:34 EDT(-0400)] <EricDalquist> my hackish solutions do good things
[15:27:52 EDT(-0400)] <EricDalquist> with just the scriptengine and dom caching I went from 47 seconds to 20 seconds for 286 layouts
[15:28:14 EDT(-0400)] <EricDalquist> that doesn't include XSLT caching or script compilation
[15:28:14 EDT(-0400)] <awills> yes, that's a big difference
[15:28:27 EDT(-0400)] <EricDalquist> which I think would at least halve it again
[15:30:42 EDT(-0400)] <awills> according to my quick math... at 28.6 layouts a second, that's 3,496 seconds for 100k layouts (58 minutes)
[15:30:51 EDT(-0400)] <EricDalquist> yup
[15:30:58 EDT(-0400)] <EricDalquist> getting much more reasonable
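The 28.6 layouts/second figure matches 286 layouts in 10 seconds, which reads as the measured 20-second run halved again as EricDalquist projected; that interpretation is an assumption. A small Java sketch of the arithmetic, with a hypothetical `ExportEstimate` class:

```java
// Hypothetical helper for illustration only.
final class ExportEstimate {
    // Seconds needed for totalLayouts at the rate observed in a sample run.
    static double estimateSeconds(long totalLayouts, long sampleLayouts, double sampleSeconds) {
        double layoutsPerSecond = sampleLayouts / sampleSeconds;
        return totalLayouts / layoutsPerSecond;
    }

    public static void main(String[] args) {
        // Measured: 286 layouts in 20 seconds. Projected: the same run halved to 10 seconds.
        double measured = estimateSeconds(100_000, 286, 20.0);
        double projected = estimateSeconds(100_000, 286, 10.0);
        System.out.printf("measured rate:  %.0f s (%.0f min)%n", measured, measured / 60);
        System.out.printf("projected rate: %.0f s (%.0f min)%n", projected, projected / 60);
    }
}
```

At the measured 14.3 layouts/second, 100k layouts would take roughly 6,993 seconds, a little under two hours; the 58-minute figure only holds once the further halving is achieved.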
[15:31:10 EDT(-0400)] <awills> 100k is a lot of layouts
[15:31:15 EDT(-0400)] <EricDalquist> yeah
[15:31:23 EDT(-0400)] <EricDalquist> we haven't done any user purging yet either
[15:31:32 EDT(-0400)] <EricDalquist> so our db actually has ~ 160k users in it
[15:31:49 EDT(-0400)] <EricDalquist> with layouts for those that have made customizations (don't have a total on that but we're assuming it is over 50%)
[15:31:59 EDT(-0400)] <EricDalquist> so
[15:32:07 EDT(-0400)] <awills> if you can get a feed of the users to remove, you can use crn-delete -Dtype=user to prune them
[15:32:16 EDT(-0400)] <EricDalquist> yeah
[15:32:19 EDT(-0400)] <EricDalquist> we're going to
[15:32:27 EDT(-0400)] <EricDalquist> actually I think we're going to modify the scripts a bit
[15:32:34 EDT(-0400)] <EricDalquist> so we can give it a list of 'removed' users
[15:32:37 EDT(-0400)] <EricDalquist> and just not export them
[15:32:47 EDT(-0400)] <awills> ah sure
[15:32:51 EDT(-0400)] <EricDalquist> once we're post-update we're supposed to be hooking up into the deactivation process
[15:32:52 EDT(-0400)] <awills> that works too
[15:33:12 EDT(-0400)] <EricDalquist> so are you still ok with having a top level CACHE request attribute?
[15:33:39 EDT(-0400)] <awills> like I talked about? set by the ScriptRunner?
[15:34:17 EDT(-0400)] <EricDalquist> yeah
[15:34:20 EDT(-0400)] <awills> that's something I've had my eye on for a few months at least
[15:34:28 EDT(-0400)] <EricDalquist> right now I have it looking at the incoming request
[15:34:39 EDT(-0400)] <EricDalquist> and only creating a Map for caching if one doesn't already exist
[15:35:08 EDT(-0400)] <awills> can you post an example to pastebin?
[15:35:10 EDT(-0400)] <EricDalquist> I was going to defer to you on how exactly to code that part to make it flexible so someone could change it to use something more complex like an EHCache if they wanted
[15:35:20 EDT(-0400)] <awills> (so i can follow you better)
[15:35:40 EDT(-0400)] <awills> yeah makes sense
[15:35:45 EDT(-0400)] <EricDalquist> http://uportal.pastebin.com/m6a01ff0b
[15:35:58 EDT(-0400)] <EricDalquist> that is in public TaskResponse run(Task k, TaskRequest req, TaskResponse res) {
[15:41:57 EDT(-0400)] <awills> i suspect this version would work just fine: http://uportal.pastebin.com/m75726f17
[15:42:23 EDT(-0400)] <awills> but it's a small difference
[15:42:38 EDT(-0400)] <EricDalquist> are you ensured you will have a RuntimeRequestResponse?
[15:42:51 EDT(-0400)] <EricDalquist> the blind cast just makes me nervous
[15:44:48 EDT(-0400)] <awills> yeah you're right – it's possible for someone to pass in something of their own... and future developments could make a CCE more likely in new ways
[15:49:12 EDT(-0400)] <awills> in that case how 'bout this: http://uportal.pastebin.com/mc33f826
[15:49:30 EDT(-0400)] <awills> it just seems odd to wrap it twice (wink)
[15:51:14 EDT(-0400)] <awills> I think this approach – just putting in a Map, for now, if it's not already there – is a decent one to start with. If we need more sophistication in the default CACHE in the future, we can add it.
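A minimal Java sketch of the "put in a Map if one is not already there" approach, written against a plain attribute map rather than the real TaskRequest API (which is not shown in this log). The class `CacheAttribute`, the `"CACHE"` string key, and the method name are all hypothetical; the `instanceof` check stands in for the blind cast EricDalquist was nervous about.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; the real code works on TaskRequest attributes.
final class CacheAttribute {
    // Stand-in for the Attributes.CACHE constant mentioned in the discussion.
    static final String CACHE = "CACHE";

    // Returns the request's cache Map, adding an empty one only if none is present.
    // A non-Map value under the same key is replaced rather than cast blindly.
    @SuppressWarnings("unchecked")
    static Map<Object, Object> ensureCache(Map<String, Object> requestAttributes) {
        Object existing = requestAttributes.get(CACHE);
        if (existing instanceof Map) {
            return (Map<Object, Object>) existing;
        }
        Map<Object, Object> created = new HashMap<>();
        requestAttributes.put(CACHE, created);
        return created;
    }
}
```

Because a caller-supplied Map is returned untouched, someone could pass in a more sophisticated implementation (for example one backed by a real cache) without changing this code.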
[15:55:09 EDT(-0400)] <EricDalquist> well it doesn't wrap it twice
[15:55:11 EDT(-0400)] <EricDalquist> just once
[15:55:20 EDT(-0400)] <EricDalquist> but the logic is in two places since either may or may not happen
[15:55:43 EDT(-0400)] <awills> yeah, i see now
[15:56:49 EDT(-0400)] <awills> but if there are potentially 2 things that can be added in that method, there's much more likelihood that we'll be using the wrapper... might as well be 100% likelihood
[15:57:05 EDT(-0400)] <EricDalquist> yeah
[15:59:28 EDT(-0400)] <EricDalquist> so a CacheTask and a PoolTask would be extensions of SetAttributeTask
[15:59:37 EDT(-0400)] <awills> I'm good with this enhancement... and furthermore: the technology-specific thread safety considerations you're looking at suggest that we're better off leveraging the CACHE in tasks/phrases like <xslt> and $, b/c these things know what they're dealing with and can make appropriate choices
[15:59:52 EDT(-0400)] <EricDalquist> brb
[16:02:27 EDT(-0400)] <awills> instead of a pool, what if the <xslt> task leveraged the CACHE and a ThreadLocal in combination?
[16:04:40 EDT(-0400)] <awills> *"in tasks/phrases like <xslt> and $ (as opposed to <with> and <with-attribute>)"
[16:08:31 EDT(-0400)] <EricDalquist> hrm
[16:08:44 EDT(-0400)] <EricDalquist> so again the concern with phrases is how do you toggle it?
[16:09:00 EDT(-0400)] <EricDalquist> is there a way to use $ without caching?
[16:10:25 EDT(-0400)] <awills> there are several ways to cross that bridge... it's a matter of choosing the most intuitive/elegant...
[16:10:37 EDT(-0400)] <EricDalquist> ok
[16:10:47 EDT(-0400)] <awills> I think always caching based on LOCATION strings could work for now
[16:10:50 EDT(-0400)] <EricDalquist> and what about the concern of memory usage with ThreadLocals?
[16:11:06 EDT(-0400)] <EricDalquist> with the tree architecture of CRN scripts there will be no good way to cleanup their contents
[16:11:11 EDT(-0400)] <awills> if LOCATIONs are equal, the cache can be used
[16:11:22 EDT(-0400)] <EricDalquist> so with long running apps we would slowly accumulate data which would never be removed
[16:11:43 EDT(-0400)] <EricDalquist> oh and I found a fun catch ... can't use URL for any cache keying
[16:11:52 EDT(-0400)] <awills> if we only cache the most recent, it will be small
[16:11:54 EDT(-0400)] <EricDalquist> it does full blocking DNS lookups on .equals and .hashcode
[16:12:06 EDT(-0400)] <EricDalquist> well using a thread local isn't the same as a cache
[16:12:13 EDT(-0400)] <awills> yeah i read about that earlier in the IRC logs... what a mess
[16:12:30 EDT(-0400)] <EricDalquist> or are you thinking a ThreadLocal that holds a cache for that thread
[16:12:38 EDT(-0400)] <EricDalquist> and then that cache can be cleared if needed
[16:12:46 EDT(-0400)] <awills> or even a cache that holds a ThreadLocal
[16:12:55 EDT(-0400)] <EricDalquist> that isn't how threadlocals work
[16:13:06 EDT(-0400)] <EricDalquist> well not how they work well
[16:13:22 EDT(-0400)] <awills> you can't put one in a cache? i thought it only cared what thread was invoking it
[16:13:31 EDT(-0400)] <EricDalquist> yeah
[16:13:34 EDT(-0400)] <EricDalquist> the problem is cleanup
[16:13:42 EDT(-0400)] <awills> the get() method
[16:13:50 EDT(-0400)] <EricDalquist> losing the reference to a ThreadLocal doesn't mean the objects attached to the thread for that local are cleaned up
[16:13:53 EDT(-0400)] <EricDalquist> so you can leak memory
[16:14:33 EDT(-0400)] <awills> as long as a thread survives, right? it's eligible for GC when the thread expires iirc
[16:14:39 EDT(-0400)] <EricDalquist> yeah
[16:14:43 EDT(-0400)] <EricDalquist> but what about a webapplication?
[16:14:47 EDT(-0400)] <EricDalquist> those threads are pooled
[16:14:53 EDT(-0400)] <EricDalquist> you would never reclaim those objects
[16:15:20 EDT(-0400)] <awills> not as long as the app ran, that's right
[16:15:44 EDT(-0400)] <awills> so some of these ideas are overkill perhaps
[16:15:54 EDT(-0400)] <EricDalquist> hrm
[16:16:02 EDT(-0400)] <EricDalquist> actually it looks like it may clean things up in JDK5 on
[16:16:14 EDT(-0400)] <awills> the CACHE will work well for things with certain threading characteristics, less well for others
[16:16:58 EDT(-0400)] <EricDalquist> yeah so never mind on the memory leak
[16:17:08 EDT(-0400)] <awills> ah cool
[16:17:18 EDT(-0400)] <EricDalquist> when you lose the reference to the ThreadLocal all values stored on the thread are eligible for collection
[16:17:30 EDT(-0400)] <awills> that's better then
[16:18:00 EDT(-0400)] <EricDalquist> *all values set by that ThreadLocal on the Thread are eligible for collection
[16:18:33 EDT(-0400)] <EricDalquist> so we're thinking that embedding this logic in the individual tasks and phrases is probably the best bet
[16:19:37 EDT(-0400)] <awills> yes, i think so, b/c the individual items know what APIs they're dealing with, specifically, and can therefore make choices that are 100% appropriate to each situation
[16:20:49 EDT(-0400)] <awills> I'd suggest picking the blockbuster items... leaving the others for the time being (wink)
[16:31:00 EDT(-0400)] <athena7> what would the best java library for soap be these days?
[16:31:19 EDT(-0400)] <athena7> axis? something else?
[16:31:37 EDT(-0400)] <EricDalquist> spring-ws
[16:31:42 EDT(-0400)] <athena7> actually, i think i only need the part to consume soap services, not serve them up
[16:32:05 EDT(-0400)] <EricDalquist> I think spring-ws will do that for you too
[16:32:09 EDT(-0400)] <agherna> axis is an ok choice
[16:32:15 EDT(-0400)] <agherna> i've used that successfully
[16:32:18 EDT(-0400)] <agherna> in the past...
[16:32:21 EDT(-0400)] <agherna> (smile)
[16:32:41 EDT(-0400)] <athena7> oh!
[16:32:46 EDT(-0400)] <athena7> i was missing spring-ws
[16:32:47 EDT(-0400)] <athena7> woo
[16:33:00 EDT(-0400)] * athena7 scampers off to springify EVERYTHING
[16:33:02 EDT(-0400)] <athena7> (wink)
[16:34:12 EDT(-0400)] <EricDalquist> lol
[16:35:19 EDT(-0400)] <athena7> seriously, how much time do we spend replacing Other Things with spring's version?
[16:35:35 EDT(-0400)] * colinclark (n=colin@142.150.154.101) has joined ##uportal
[16:35:48 EDT(-0400)] <EricDalquist> well I guess it depends if you're refactoring or writing new
[16:39:33 EDT(-0400)] <EricDalquist> one more question awills ...
[16:39:41 EDT(-0400)] <EricDalquist> we have a 'global' cache Map
[16:39:57 EDT(-0400)] <EricDalquist> there needs to be some sync around that to ensure safety
[16:40:22 EDT(-0400)] <EricDalquist> the minimal sync method would be for each phrase/task to have its own child Map in the global map that it uses
[16:40:43 EDT(-0400)] <EricDalquist> that reduces the sync to a short bit around the global and a longer bit around the per-phrase map
[16:40:56 EDT(-0400)] <EricDalquist> but that doesn't play as well with the global Map being a real cache (with expiration and stuff)
[16:41:18 EDT(-0400)] <EricDalquist> since it would only be holding higher level objects instead of like individual documents
[16:48:50 EDT(-0400)] <awills> what lvl of synch do ehcache-backed Map objects provide? in other words, what would be the synching characteristics/behavior if that global Map were ehcache-backed?
[16:49:18 EDT(-0400)] <EricDalquist> well they are thread safe
[16:49:28 EDT(-0400)] <EricDalquist> the sync is needed to ensure singleton creation of things like ThreadLocals
[16:49:38 EDT(-0400)] <EricDalquist> though now that I've started on it I don't think it will be too bad
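One way the "singleton creation of things like ThreadLocals" could be handled without an explicit lock is an atomic create-if-absent on a concurrent map. This is only a sketch under assumptions: `ThreadLocalRegistry` and `localFor` are hypothetical names, and `computeIfAbsent` requires Java 8 or later (on the JDKs of the time, `ConcurrentMap.putIfAbsent` would play the same role).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical registry for illustration; not the Cernunnos CACHE implementation.
final class ThreadLocalRegistry {
    private final ConcurrentMap<Object, ThreadLocal<Object>> locals = new ConcurrentHashMap<>();

    // ConcurrentHashMap.computeIfAbsent is atomic per key, so threads racing on the
    // same key all receive the same ThreadLocal instance.
    ThreadLocal<Object> localFor(Object key, Supplier<Object> initialValue) {
        return locals.computeIfAbsent(key, k -> ThreadLocal.withInitial(initialValue));
    }
}
```

The shared map then holds one ThreadLocal per task or phrase, while the expensive objects themselves stay private to each thread.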
[16:52:11 EDT(-0400)] <awills> well good (smile)
[16:56:14 EDT(-0400)] <EricDalquist> so then the other issue is with scripting
[16:56:24 EDT(-0400)] <EricDalquist> do we cache the ScriptEngine in the ScriptEnginePhrase
[16:56:31 EDT(-0400)] <EricDalquist> or in the ScriptTask/ScriptPhrase?
[16:56:45 EDT(-0400)] <EricDalquist> I ask because caching a ScriptEngine does have implications
[16:57:02 EDT(-0400)] <EricDalquist> it remembers the state of bound variables from one eval to the next
[16:57:11 EDT(-0400)] <EricDalquist> so having just one globally could cause problems
[16:58:54 EDT(-0400)] <EricDalquist> for the ScriptTask I could see adding an option like: cache-mode="singleton|thread-local|prototype"
[17:00:09 EDT(-0400)] <EricDalquist> or a step better: engine-scope="global|instance|invocation" cache-mode="shared|thread-local"
[17:00:30 EDT(-0400)] <EricDalquist> which would cover the options for script engine re-use and thread safety
[17:01:02 EDT(-0400)] <EricDalquist> we could default the phrase to global and shared since script phrases would be less likely to be complex
[17:05:37 EDT(-0400)] * chrisdoyle (n=chrisdoy@mtw160-1.ippl.jhu.edu) has joined ##uportal
[17:11:11 EDT(-0400)] <EricDalquist> also if you're there awills can I use Enum as a phrase return type?
[17:26:22 EDT(-0400)] * chrisdoyle (n=chrisdoy@mtw160-1.ippl.jhu.edu) has left ##uportal
[17:34:33 EDT(-0400)] <awills> I don't see why not... the return type on the method sig is Object... tasks and phrases are meant to coordinate on those sorts of things
[17:34:55 EDT(-0400)] <EricDalquist> well will it be able to take something like foo="bar" in the XML
[17:35:05 EDT(-0400)] <EricDalquist> and automagically return me MyEnum.bar ?
[17:35:12 EDT(-0400)] <awills> iirc, the JSR-223 API allows you to bind either engine-level or script-level... is that not right?
[17:35:14 EDT(-0400)] <EricDalquist> I'm guessing no
[17:35:33 EDT(-0400)] <EricDalquist> it allows you to bind at the engine factory level or the engine level
[17:35:47 EDT(-0400)] <awills> no, not yet... though I had a plan for that i hope to pursue at some point
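Since the automatic conversion is not available yet, a task could map the attribute string to an enum itself. A small Java sketch, assuming a hypothetical `CacheMode` enum modeled on the `cache-mode="shared|thread-local"` values floated earlier:

```java
import java.util.Locale;

// Hypothetical enum mirroring the cache-mode values discussed above.
enum CacheMode {
    SHARED, THREAD_LOCAL;

    // Maps an XML attribute value such as "thread-local" to THREAD_LOCAL.
    static CacheMode fromAttribute(String value) {
        String name = value.trim().replace('-', '_').toUpperCase(Locale.ROOT);
        // valueOf throws IllegalArgumentException for values that are not constants.
        return CacheMode.valueOf(name);
    }
}
```

Failing fast on an unknown value is a design choice here; a task could just as well fall back to a default mode.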
[17:35:53 EDT(-0400)] <EricDalquist> and I just realized a big problem with putting the engine caching in the ScriptTask and ScriptPhrase
[17:36:05 EDT(-0400)] <EricDalquist> they have no idea what type of ScriptEngine they are using
[17:36:10 EDT(-0400)] <awills> per-task engines?
[17:36:18 EDT(-0400)] <EricDalquist> so they can't know how to create a cache key
[17:36:31 EDT(-0400)] <EricDalquist> they don't know that they need a .groovy suffic
[17:36:36 EDT(-0400)] <EricDalquist> suffix*
[17:36:58 EDT(-0400)] <EricDalquist> not unless we ditch the ScriptEngineTask and move that logic into the ScriptTask and ScriptPhrase
[17:37:16 EDT(-0400)] <EricDalquist> but then it isn't really possible to have a nice configurable ScriptTask unless we do that
[17:37:18 EDT(-0400)] <awills> each script task or phrase will only ever work w/ one type of engine, i should think
[17:37:25 EDT(-0400)] <EricDalquist> yes
[17:37:29 EDT(-0400)] <EricDalquist> but how would a global cache work?
[17:37:41 EDT(-0400)] <EricDalquist> Say I'm fine with one groovy ScriptEngine for the whole run
[17:37:47 EDT(-0400)] <EricDalquist> how does a ScriptTask store that in the cache?
[17:38:01 EDT(-0400)] <EricDalquist> It has no idea what to look for in the cache
[17:38:22 EDT(-0400)] <EricDalquist> since it doesn't have a reference to the ScriptEngine and so can't figure out the type of engine it is supposed to be using
[17:39:37 EDT(-0400)] * agherna_ (n=agherna@ras53.admin.uillinois.edu) has joined ##uportal
[17:39:53 EDT(-0400)] <awills> perhaps the ScriptEnginePhrase could store it in the CACHE
[17:40:15 EDT(-0400)] <EricDalquist> yes
[17:40:19 EDT(-0400)] <EricDalquist> that is possible
[17:40:20 EDT(-0400)] <awills> looking quickly, it seems the ScriptTask always calls the ScriptEnginePhrase to get an engine
[17:40:29 EDT(-0400)] <EricDalquist> but then we lose all ability to configure how it is cached
[17:40:33 EDT(-0400)] <EricDalquist> we have to hard code that
[17:40:51 EDT(-0400)] <EricDalquist> instead of allowing the script writer using a ScriptTask to choose if they want their ScriptEngine cached or not
[17:42:22 EDT(-0400)] <awills> that's how it works now... there's no choice... i bet i can cross that bridge when I come to it
[17:42:32 EDT(-0400)] <EricDalquist> ok
[17:42:40 EDT(-0400)] <EricDalquist> I'll just implement the caching in ScriptEnginePhrase then
[17:43:26 EDT(-0400)] <awills> in that case, you might have to use the CACHE Map on a class- (instead of an instance-) level
[17:43:49 EDT(-0400)] <awills> which should be fine, as long as the details are minded
[17:43:55 EDT(-0400)] <EricDalquist> in which case?
[17:44:47 EDT(-0400)] <awills> maybe not actually... do all ScriptTasks reference the same ScriptEnginePhrase instance? they may
[17:45:17 EDT(-0400)] <EricDalquist> I'm not sure
[17:45:29 EDT(-0400)] <awills> me neither :? not atm
[17:45:56 EDT(-0400)] <awills> yes, i think they would
[17:46:21 EDT(-0400)] <awills> the instance of ScriptEnginePhrase would be created in the grammar and stored as a part of the grammar entry
[17:46:33 EDT(-0400)] <EricDalquist> ok
[17:46:53 EDT(-0400)] <EricDalquist> so can I use the Phrase reference in ScriptTask as part of my cache key?
[17:47:09 EDT(-0400)] <EricDalquist> since that reference should be to a ScriptEnginePhrase?
[17:47:36 EDT(-0400)] <awills> yes, except that ScriptPhrase will have a different instance i think
[17:48:03 EDT(-0400)] <EricDalquist> ah
[17:48:14 EDT(-0400)] <awills> i think you'd be ok using the ScriptEnginePhrase.class together with the engineName
[17:48:18 EDT(-0400)] <awills> somehow
[17:48:29 EDT(-0400)] <EricDalquist> but the ScriptTask doesn't know the engineName
[17:48:38 EDT(-0400)] <EricDalquist> if I just do the caching in ScriptEnginePhrase it is easy
[17:48:43 EDT(-0400)] <EricDalquist> but much less configurable
[17:49:22 EDT(-0400)] <EricDalquist> oh
[17:49:23 EDT(-0400)] <EricDalquist> if I do
[17:49:24 EDT(-0400)] <EricDalquist> final Map<Object, Object> sharedCache = (Map<Object, Object>) cache.evaluate(req, res);
[17:49:26 EDT(-0400)] <awills> yeah exactly... do the caching in the ScriptEnginePhrase, but use the ScriptEnginePhrase.class with the engineName as a cache key
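A hedged Java sketch of awills's suggestion: do the caching where the engine name is known and use the owning class together with the engine name as the cache key. `EngineCache`, `key`, and `engineFor` are hypothetical names, and the `factory` parameter stands in for whatever actually creates the engine (for JSR-223 that could be `new ScriptEngineManager().getEngineByName(name)`).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache for illustration; not the ScriptEnginePhrase implementation.
final class EngineCache {
    private final Map<List<Object>, Object> cache = new ConcurrentHashMap<>();

    // Composite key: List equality is element-wise, so (class, name) pairs compare by value.
    static List<Object> key(Class<?> owner, String engineName) {
        return Arrays.asList(owner, engineName);
    }

    // Returns the cached engine for (owner, engineName), creating it on first use.
    Object engineFor(Class<?> owner, String engineName, Function<String, Object> factory) {
        return cache.computeIfAbsent(key(owner, engineName), k -> factory.apply(engineName));
    }
}
```

Including the class in the key keeps these entries from colliding with whatever other tasks and phrases store in the same shared Map. A shared engine also carries its bound variables from one eval to the next, as noted earlier in the discussion.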
[17:49:32 EDT(-0400)] <EricDalquist> and there is no cache attribute in the request
[17:49:37 EDT(-0400)] <EricDalquist> does that throw an exception?
[17:51:07 EDT(-0400)] <awills> only if you both (1) specify 'AttributePhrase(Attributes.CACHE)' as the default of 'cache' and (2) you don't specify 'cache' manually in the xml
[17:51:25 EDT(-0400)] <EricDalquist> ok
[17:51:42 EDT(-0400)] <EricDalquist> will it be safe to assume there will always be a CACHE attribute?
[17:51:53 EDT(-0400)] <EricDalquist> or should these be coded to deal with not having a CACHE?
[17:52:01 EDT(-0400)] <awills> certainly it will be safe to do so if we add it to the TaskRunner
[17:52:11 EDT(-0400)] <EricDalquist> ok
[17:52:48 EDT(-0400)] <awills> and you can also do this: AttributePhrase(Attributes.CACHE, new LiteralPhrase(new HashMap<Object,Object>()))
[17:53:21 EDT(-0400)] <awills> the 2nd parameter is a default, if the specified request attr isn't present
[17:53:26 EDT(-0400)] <EricDalquist> ok
[17:54:36 EDT(-0400)] <awills> as it stands, Attributes.ORIGIN is guaranteed to be there by the CRN "spec" (such as it is)... we're choosing to treat Attributes.CACHE the same way
[17:55:04 EDT(-0400)] <awills> and like i said, i've been eyeing that enhancement for some time... i think it's good
[17:55:26 EDT(-0400)] <EricDalquist> ensuring it is there would make for simpler code
[17:55:37 EDT(-0400)] <awills> sounds good
[17:55:51 EDT(-0400)] <awills> i think a lot of items will use it in the end
[18:19:40 EDT(-0400)] <EricDalquist> so I have Document, ScriptEngine and Transformer caching enabled
[18:19:48 EDT(-0400)] <EricDalquist> now down to 11.5 seconds for 278 layouts
[18:31:45 EDT(-0400)] <EricDalquist> so the only other big hot-spot I see is looking at compiling scripts
[18:32:52 EDT(-0400)] <EricDalquist> I'll do some more cleanup and get a patch emailed tonight since I'm heading out on vacation for a week starting tomorrow
[18:33:34 EDT(-0400)] <EricDalquist> once the script compilation stuff is working I think we'll be on the order of about 25ms per layout
[18:33:39 EDT(-0400)] <EricDalquist> and that is single threaded
[18:38:40 EDT(-0400)] * awills (n=awills@ras117.admin.uillinois.edu) has left ##uportal
[20:01:13 EDT(-0400)] * lennar1 (n=sparhk@wsip-98-174-242-39.ph.ph.cox.net) has joined ##uportal
[20:51:51 EDT(-0400)] * chrisdoyle (n=chrisdoy@mtw160-1.ippl.jhu.edu) has joined ##uportal
[20:55:08 EDT(-0400)] * lennar1 (n=sparhk@wsip-98-174-242-39.ph.ph.cox.net) has left ##uportal
[21:13:46 EDT(-0400)] * chrisdoyle (n=chrisdoy@mtw160-1.ippl.jhu.edu) has left ##uportal