uPortal IRC Logs-2013-09-03

[08:35:07 CDT(-0500)] <jgribonvald> Hi here

[08:35:40 CDT(-0500)] <jgribonvald> I'm looking for some feedback about uPortal 4.0.12 in production

[08:36:25 CDT(-0500)] <jgribonvald> because I'm facing some strange problems with a lot of locked threads on one or two methods

[08:54:30 CDT(-0500)] <jgribonvald> anyone?

[09:02:35 CDT(-0500)] <tlev> which methods are they?

[09:04:50 CDT(-0500)] <jgribonvald> oh, sorry

[09:06:11 CDT(-0500)] <jgribonvald> it's on the getPortalUID method

[09:07:07 CDT(-0500)] <jgribonvald> our problem is that we have 6 Tomcat instances behind a load balancer with maxThreads set to 500, and all our Tomcat instances hit the maximum

[09:15:08 CDT(-0500)] <jgribonvald> tlev ?

[09:15:42 CDT(-0500)] <tlev> Sorry, hmmm, I am not familiar with this method.

[09:16:15 CDT(-0500)] <tlev> When we upgraded to .12 we did have an issue with the first startup, but other than that it has been running smoothly

[09:17:04 CDT(-0500)] <jgribonvald> the thing is that we have 250 threads locked on this method and 250 others in Hibernate

[09:19:53 CDT(-0500)] <jgribonvald> to be exact, we have 250 BLOCKED threads on org.jasig.portal.RDBMUserIdentityStore.getPortalUID (RDBMUserIdentityStore.java:265)

[09:20:18 CDT(-0500)] <tlev> Do you have any database issues?

[09:20:28 CDT(-0500)] <jgribonvald> not really

[09:20:33 CDT(-0500)] <jgribonvald> I didn't see any

[09:20:59 CDT(-0500)] <jgribonvald> the thread state is BLOCKED

[09:22:21 CDT(-0500)] <tlev> it looks like there is a locking mechanism in getPortalUID

[09:22:40 CDT(-0500)] <tlev> but that is by user

[09:22:49 CDT(-0500)] <jgribonvald> ah, we do have some, yes

[09:22:52 CDT(-0500)] <jgribonvald> SQL Error: 1205, SQLState: 41000

[09:22:55 CDT(-0500)] <jgribonvald> [netocentre1]ERROR [TP-Processor802-F110094m] sept./03 16:21:44,294 spi.SqlExceptionHelper.[] - Lock wait timeout exceeded; try restarting transaction

[09:24:09 CDT(-0500)] <tlev> to me it seems like an issue waiting on a database connection to return something

[09:24:48 CDT(-0500)] <jgribonvald> yes, to me too, but I don't understand why yet

[09:28:27 CDT(-0500)] <jgribonvald> this method tries to create a user in the database

[09:29:46 CDT(-0500)] <tlev> I believe it creates the user if they don't already exist

[09:30:01 CDT(-0500)] <tlev> if they do exist it just returns their uPortal key
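
The behavior described here (a per-user lock, create the user if missing, otherwise return the existing key) can be sketched roughly as below. This is a minimal illustration in Python, not the uPortal source: the `UserStore` class, its method names, and the in-memory dictionary standing in for the database are all hypothetical.

```python
import threading
from collections import defaultdict


class UserStore:
    """Hypothetical in-memory stand-in for a user identity store."""

    def __init__(self):
        self._ids = {}                       # user_name -> portal uid
        self._next_id = 1
        self._guard = threading.Lock()       # protects the lock table and the id counter
        self._user_locks = defaultdict(threading.Lock)

    def _lock_for(self, user_name):
        # Hand out exactly one lock object per user name.
        with self._guard:
            return self._user_locks[user_name]

    def _create(self, user_name):
        # Stand-in for the database insert (assumption: ids are sequential).
        with self._guard:
            uid = self._next_id
            self._next_id += 1
            self._ids[user_name] = uid
        return uid

    def get_portal_uid(self, user_name):
        # Callers for the same user are serialized; other users are not blocked.
        with self._lock_for(user_name):
            uid = self._ids.get(user_name)
            if uid is None:
                uid = self._create(user_name)
            return uid
```

A lock like this only serializes threads inside one process. With several portal instances in front of one database, two instances can still attempt the same insert at the same time.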

[09:38:37 CDT(-0500)] <jgribonvald> ok

[09:39:06 CDT(-0500)] <jgribonvald> otherwise, if we have a problem on the Apache proxy, that can increase the number of threads

[09:39:27 CDT(-0500)] <jgribonvald> do you have any production examples of a configuration with BalancerMember?

[09:39:43 CDT(-0500)] <jgribonvald> and for the Tomcat connector?

[09:42:41 CDT(-0500)] <tlev> I don't

[09:42:48 CDT(-0500)] <tlev> maybe someone else does?

[09:56:51 CDT(-0500)] <jgribonvald> tlev: we have one small improvement to push that can gain 1s on each connection

[09:57:10 CDT(-0500)] <jgribonvald> a logger that makes a bad call

[11:34:15 CDT(-0500)] <jgribonvald> so, some feedback on my problem, for those who are interested

[11:34:21 CDT(-0500)] <jgribonvald> tlev maybe

[11:34:38 CDT(-0500)] <jgribonvald> we have a deadlock on our database

[11:39:58 CDT(-0500)] <jgribonvald> I'll continue

[11:40:16 CDT(-0500)] <jgribonvald> we had too many new users in our database on the first day

[11:41:02 CDT(-0500)] <jgribonvald> we had too many inserts of new users on the same day

[11:41:28 CDT(-0500)] <jgribonvald> more than 20k, but inserts stopped at 7k after the first deadlock

[11:41:56 CDT(-0500)] <jwennmacher> Does the db tell you where its deadlock is?

[11:42:22 CDT(-0500)] <jgribonvald> the deadlock was caused by the insert of the same user from 2 different portal instances

[11:42:35 CDT(-0500)] <jgribonvald> yes, we had the queries that caused it

[11:43:20 CDT(-0500)] <jgribonvald> the insert into up_user of the same user_name by 2 portals at nearly the same time, a few seconds apart

[11:44:34 CDT(-0500)] <jwennmacher> Normally I'd say it seems like a) it shouldn't happen that way if load balancing is proper, and b) 1st should complete and then 2nd return uid of the one created by the first.
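
The expected behavior in point b) can be sketched at the database level: a unique constraint on the user name lets the losing insert fail cleanly, after which the caller reads back the id created by the winner. This is a minimal sketch using Python's `sqlite3` with an in-memory database, not the uPortal code. The `up_user` table and `user_name` column come from the discussion; the `user_id` column, the simplified schema, and the `get_or_create_user_id` helper are assumptions. It does not reproduce MySQL's lock-wait behavior (error 1205).

```python
import sqlite3


def get_or_create_user_id(conn, user_name):
    """Return the id for user_name, inserting a row only if none exists."""
    select_sql = "SELECT user_id FROM up_user WHERE user_name = ?"
    row = conn.execute(select_sql, (user_name,)).fetchone()
    if row is not None:
        return row[0]
    try:
        with conn:  # commits on success, rolls back if the insert raises
            cur = conn.execute(
                "INSERT INTO up_user (user_name) VALUES (?)", (user_name,)
            )
            return cur.lastrowid
    except sqlite3.IntegrityError:
        # Another writer inserted the same user_name first: reuse its id.
        return conn.execute(select_sql, (user_name,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
# Simplified schema (assumption): the UNIQUE constraint is what makes
# a duplicate insert fail instead of creating a second row.
conn.execute(
    "CREATE TABLE up_user ("
    "user_id INTEGER PRIMARY KEY, "
    "user_name TEXT NOT NULL UNIQUE)"
)

first = get_or_create_user_id(conn, "alice")
second = get_or_create_user_id(conn, "alice")
```

Both calls return the same id, and the table holds a single row for that user.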

[11:44:43 CDT(-0500)] <jgribonvald> Drew Wills enlightened me a bit about a similar earlier problem, also database-related, but on Oracle with uPortal 3.2

[11:44:58 CDT(-0500)] <jgribonvald> yes, normally

[11:45:10 CDT(-0500)] <jgribonvald> but database problems could cause it too

[11:45:23 CDT(-0500)] <jgribonvald> or a user connecting with 2 different browsers at the same time

[11:45:45 CDT(-0500)] <jgribonvald> we have some special users who can do such things (wink)

[11:45:52 CDT(-0500)] <jwennmacher> Do you know what caused #1 to not complete?

[11:46:09 CDT(-0500)] <jgribonvald> not really, only speculation

[11:46:40 CDT(-0500)] <jgribonvald> but I think it's a high rate of connections

[11:46:50 CDT(-0500)] <jgribonvald> on the first day of school

[11:47:35 CDT(-0500)] <jgribonvald> we have a highly configured MySQL database

[11:48:09 CDT(-0500)] <jgribonvald> and no performance problems were reported, only I/O wait at the time of the deadlock

[11:48:16 CDT(-0500)] <jgribonvald> but nothing before

[11:48:41 CDT(-0500)] <jgribonvald> so maybe an optimization problem, as there were too many inserts into up_user in those first hours

[11:50:27 CDT(-0500)] <jgribonvald> and as the table has many constraints, the time to optimize the database between each insert and check may cause the problem

[11:50:42 CDT(-0500)] <jgribonvald> but we didn't see anything unusual on our database

[11:50:54 CDT(-0500)] <jgribonvald> in monitoring

[11:51:00 CDT(-0500)] <jgribonvald> before the problem

[11:52:32 CDT(-0500)] <jgribonvald> jwennmacher: do you have any knowledge of Apache load-balancing configuration with Tomcat?

[11:52:44 CDT(-0500)] <jgribonvald> I'm looking for something to compare with our configuration

[11:53:31 CDT(-0500)] <jgribonvald> as it seems that, very occasionally, we have problems with the AJP link going into an error state for a few seconds

[11:53:32 CDT(-0500)] <jwennmacher> No I do not. @vcrowley might be able to help with that

[11:53:57 CDT(-0500)] <jgribonvald> ok thx

[11:54:09 CDT(-0500)] <jgribonvald> vcrowley: are you around ?

[11:55:21 CDT(-0500)] <jgribonvald> is there a private email address where I can send my conf or get one for comparison?

[12:16:18 CDT(-0500)] <tlev> Ah, that explains all the blocked threads then.

[13:44:00 CDT(-0500)] <obbo> hello

[13:44:41 CDT(-0500)] <obbo> does anyone know if it is possible to have per-announcement audience permissions instead of per-topic audience permissions in the announcement portlet?