Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 16 Next »

[09:23:27 CST(-0600)] <jwennmacher> Hi Eric.

[09:23:45 CST(-0600)] <EricDalquist> hello

[09:24:27 CST(-0600)] <jwennmacher> I was thinking about the event aggregation issue Anthony reported. I think both he and the person at the Apereo conference indicated their raw events table was growing at around 1M rows a day.

[09:25:35 CST(-0600)] <jwennmacher> I'm not sure if this is that there were 1M inserts or the net is a growth of 1M rows. I just did a rough calculation. If the aggregation is running every 20 minutes and it does at most 10,000 aggregations, it will aggregate at most 720,000 rows in a day.

[09:26:02 CST(-0600)] <EricDalquist> well it runs ever 60 seconds

[09:26:19 CST(-0600)] <EricDalquist> and it catch-up mode (like there were more events to aggregate than aggregated) it should run continueously

[09:26:25 CST(-0600)] <EricDalquist> the problem could be aggregation rate

[09:26:40 CST(-0600)] <EricDalquist> one thing I thought about is that I test this all on oracle

[09:26:50 CST(-0600)] <EricDalquist> which has different index behavior than some other DBs

[09:26:59 CST(-0600)] <EricDalquist> like FKs in oracle are indexes

[09:27:07 CST(-0600)] <EricDalquist> but in some other DBs explicit indexes are also needed

[09:27:21 CST(-0600)] <EricDalquist> that is why I'm interested in seeing a chunk from his 4.0.10 log

[09:27:30 CST(-0600)] <EricDalquist> to see if the aggr is running in catch up mode

[09:27:38 CST(-0600)] <EricDalquist> and if it is what the aggregation rate is

[09:28:05 CST(-0600)] <EricDalquist> next steps would be to filter some of the raw events before they persist

[09:28:11 CST(-0600)] <EricDalquist> or just truncate the raw events table

[09:28:20 CST(-0600)] <EricDalquist> and see if it can keep up with a fresh data set

[09:29:04 CST(-0600)] <jwennmacher> Makes sense.

[09:29:31 CST(-0600)] <jwennmacher> In the log he sent it looked like aggregation was running every 20 minutes or am I misinterpreting the log statements??

[09:30:03 CST(-0600)] <EricDalquist> my guess is that it was completing 10k events every 20 minutes

[09:30:13 CST(-0600)] <EricDalquist> it can't run concurrently

[09:30:23 CST(-0600)] <EricDalquist> so if it starts processing

[09:30:51 CST(-0600)] <EricDalquist> and it takes 20 minutes to process the 10k

[09:31:03 CST(-0600)] <EricDalquist> then you only see one aggregation message every 20 minutes

[09:31:50 CST(-0600)] <jwennmacher> How long does it take to process 10K events at your site?

[09:38:30 CST(-0600)] <EricDalquist> let me check ...

[09:39:59 CST(-0600)] <EricDalquist> we're all caught up so I'm trying to find the biggest interval

[09:40:11 CST(-0600)] <EricDalquist> right now its only doing about 1500 events aggregation run

[09:40:15 CST(-0600)] <EricDalquist> since that is all there is to process

[09:40:52 CST(-0600)] <jwennmacher> In the line Aggregated 10000 events created at 15.5280 events/second between 2013-01-10T22:02:11.000Z and 2013-01-10T22:12:55.595Z in 1311626ms - 7.6241 e/s a 0.4910x speedup

[09:41:02 CST(-0600)] <EricDalquist> Aggregated 2249 events created at 18.5868 events/second between 2013-02-19T13:52:37.960-06:00 and 2013-02-19T13:54:38.994-06:00 in 44113ms - 50.9827 e/s a 2.7430x speedup.

[09:41:11 CST(-0600)] <jwennmacher> what is the 15.5280 events/sec calculation mean?

[09:41:26 CST(-0600)] <jwennmacher> and the 7.6241 e/s?

[09:41:57 CST(-0600)] <EricDalquist> so that first number is 10000 / (2013-01-10T22:12:55.595Z - 2013-01-10T22:02:11.000Z)

[09:42:14 CST(-0600)] <EricDalquist> so for the time range the aggr ran events were created at that rate

[09:42:35 CST(-0600)] <EricDalquist> and then the 2nd number is just 10000/1311.626

[09:42:59 CST(-0600)] <EricDalquist> so the 1311626ms it took to process 10k events works out to 7.6241

[09:43:05 CST(-0600)] <EricDalquist> and this is the problem he has

[09:43:35 CST(-0600)] <EricDalquist> Aggregated 2718 events created at 14.9341 events/second between 2013-02-19T14:55:31.684-06:00 and 2013-02-19T14:58:33.908-06:00 in 46410ms - 58.5650 e/s a 3.9216x speedup.

[09:46:43 CST(-0600)] <jwennmacher> I see. Big difference in performance between yours and his.

[09:46:48 CST(-0600)] <EricDalquist> yes

[09:47:06 CST(-0600)] <EricDalquist> I think we might be at the point where he tries truncating the raw events table

[09:47:11 CST(-0600)] <EricDalquist> and seeing if that makes a difference

[09:47:19 CST(-0600)] <EricDalquist> he's on mysql right?

[09:47:29 CST(-0600)] <EricDalquist> I wonder if there are some other performance tweaks needed

[09:47:29 CST(-0600)] <jwennmacher> One interesting thing I see in Anthony's data is that the time reported for aggregation is fairly constant; e.g. 880s - 1311s when processing 10,000 or 2800 events. I could see that possibly being related to db performance.

[09:47:42 CST(-0600)] <EricDalquist> yeah

[09:47:58 CST(-0600)] <jwennmacher> I don't know if he's on mysql

[09:49:03 CST(-0600)] <EricDalquist> I believe he is

[09:49:07 CST(-0600)] <EricDalquist> since we had to fix a db bug

[09:49:11 CST(-0600)] <EricDalquist> for mysql and floats

  • No labels