niccokunzmann/python-recurring-ical-events

bug: rrule is very slow

Closed this issue · 13 comments

rrule is very slow. For a 1.7 MB ical with ~80 recurring events, evaluating 7 days of events with

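for dayidx in range(7):
    day = start_day + datetime.timedelta(days=dayidx)  # start_day: assumed first day of the queried week
    recurring_ical_events.of(calendar).at((day.year, day.month, day.day))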

I get the following profile on an i5 CPU:

         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            635    5.837    0.009    5.837    0.009 {method 'read' of '_ssl._SSLSocket' objects} <-- google calendar download 5.8sec
  966170/963055    5.584    0.000   10.756    0.000 rrule.py:774(_iter) <-- RRULE eval 5.6s
         925050    2.324    0.000    2.453    0.000 rrule.py:1276(ddayset) <-- RRULE eval 2.3s
          35101    1.395    0.000   14.259    0.000 rrule.py:1381(_iter) <-- RRULE eval 1.4s
1965581/1033825    0.844    0.000   12.126    0.000 {built-in method builtins.next}
         933067    0.798    0.000    0.798    0.000 {built-in method combine}
          71890    0.771    0.000    2.094    0.000 parser.py:321(parts)
         933058    0.619    0.000    0.619    0.000 {built-in method fromordinal}
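For anyone who wants to reproduce numbers like these: the table is in the format printed by Python's cProfile/pstats. A minimal sketch follows, where work() is a hypothetical stand-in for the real download-and-query code (it is not the reporter's script):

```python
import cProfile
import io
import pstats


def work():
    # Hypothetical stand-in for the calendar download and the 7-day query loop.
    return sum(i * i for i in range(100_000))


profiler = cProfile.Profile()
profiler.enable()
result = work()
profiler.disable()

# Sort by time spent inside each function itself ("tottime") and
# print the ten most expensive entries into a string buffer.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(10)
report = stream.getvalue()
print(report)
```

Sorting by "cumtime" instead shows which callers are responsible for the time, which can help tell the library's own overhead apart from time spent in dateutil.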

I have many old/ended recurring items.
Is ";UNTIL=" checked before iterating RRULE period?

Hi, could you provide the ICS file and the script, so that people trying this are talking about the same results as you?
This module uses dateutil, so I wonder whether this is relevant for them as well.

Hi, sorry I only have my private calendar.
Others are also saying builtin rrule can be very slow.
https://stackoverflow.com/questions/1336824/python-dateutil-rrule-is-incredibly-slow

Can we prefilter with the ;UNTIL= parameter before we pass anything over to rrule?
Otherwise, if I have a biweekly task from 2010, it will have to crawl over 10 years if it's not smart enough.

Reading the question, it seems that using the between function might be fast.

Also, using rrule.between() to get dates within a given interval is very fast.

Currently, we use the iteration, see

Maybe using between() would speed it up?
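For illustration, a minimal sketch comparing the two approaches on a made-up biweekly rule (the rule, the start date, and the query window are assumptions, not taken from the reporter's calendar). It only shows that plain iteration and dateutil's rrule.between() return the same occurrences; whether between() is actually faster is the open question here.

```python
from datetime import datetime
from itertools import takewhile

from dateutil.rrule import rrulestr

# Assumed example: a biweekly Thursday event that started in January 2010.
rule = rrulestr("FREQ=WEEKLY;INTERVAL=2;BYDAY=TH", dtstart=datetime(2010, 1, 7))

window_start = datetime(2020, 1, 1)
window_end = datetime(2020, 1, 31)

# Plain iteration: walk every occurrence from dtstart up to the end of the
# window and keep only those inside the window.
by_iteration = [
    occurrence
    for occurrence in takewhile(lambda d: d <= window_end, rule)
    if occurrence >= window_start
]

# rrule.between(): ask dateutil for the occurrences inside the window.
# inc=True includes occurrences that fall exactly on the window bounds.
by_between = rule.between(window_start, window_end, inc=True)

print(by_iteration == by_between)
```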

rrule only:

  • .at: 9.128 sec
  • .between: 9.094 sec

With prefiltering I would expect 0.1 sec.

How would prefiltering work?

Also, by "using between()" I meant rrule.between, not this module's between function.

e.g.
FREQ=WEEKLY;UNTIL=20191023;BYDAY=TH;WKST=SU

UNTIL part already parsed in the code:
rule_list = rule_string.split(";UNTIL=")
rule_list[1]

if UNTIL >= datetime.now():
    pass event/line over to rrule
else:
    ignore event
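The prefilter idea above could be sketched roughly as follows. parse_until and may_occur_on_or_after are hypothetical helper names, not part of recurring_ical_events or dateutil. The sketch compares UNTIL against the start of the queried window rather than datetime.now(), on the assumption that queries for past dates should still work. It ignores time zones and any RDATE entries, so a real implementation would need a safety margin.

```python
from datetime import datetime


def parse_until(rule_string):
    """Return the UNTIL value of an RRULE string as a naive datetime, or None."""
    for part in rule_string.split(";"):
        name, _, value = part.partition("=")
        if name.upper() == "UNTIL":
            # UNTIL is either a date (20191023) or a date-time
            # (20191023T120000, optionally with a trailing Z for UTC).
            value = value.rstrip("Z")
            fmt = "%Y%m%dT%H%M%S" if "T" in value else "%Y%m%d"
            return datetime.strptime(value, fmt)
    return None


def may_occur_on_or_after(rule_string, window_start):
    """Return False only if the rule has certainly ended before window_start."""
    until = parse_until(rule_string)
    if until is None:
        return True  # no UNTIL: the rule may still produce occurrences
    # Compare whole days so that a date-only UNTIL covers its entire day.
    return until.date() >= window_start.date()


rule_string = "FREQ=WEEKLY;UNTIL=20191023;BYDAY=TH;WKST=SU"
print(may_occur_on_or_after(rule_string, datetime(2020, 1, 1)))
print(may_occur_on_or_after(rule_string, datetime(2019, 10, 1)))
```

Events for which the check returns False could be skipped without handing them to rrule at all; everything else still goes through the normal evaluation.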

I think there are some optimizations we can make:

  • change the UNTIL parameter in the string
  • use the rrule.between() function instead of plain iteration (inc=True should be tested)

Also, having a test event would be great. Could you identify the event that takes so long and post it here together with the code that is slow? That way we can really optimize. At the moment, I am still not sure how to address this properly.
If you would like to contribute code, you can also start by adding a (failing) test and creating a pull request, see the issue template.

Please find a test case attached; querying 28 days takes 31 sec.

test.zip

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   206871    2.193    0.000    4.556    0.000 rrule.py:774(_iter)  <-- slowest 2sec
   133784    2.119    0.000   12.167    0.000 recurring_ical_events.py:131(__init__) <-- slow 2sec
   581423    2.023    0.000    2.023    0.000 {method 'replace' of 'datetime.datetime' objects} <-- slow 2sec
   238440    1.611    0.000    7.950    0.000 rrule.py:1381(_iter)
   612392    1.128    0.000    1.965    0.000 caselessdict.py:56(get)
   133784    1.071    0.000    2.799    0.000 recurring_ical_events.py:197(make_all_dates_comparable)
   238440    1.015    0.000   10.964    0.000 recurring_ical_events.py:228(__iter__)
   878548    0.833    0.000    2.481    0.000 recurring_ical_events.py:45(convert_to_datetime)
    34264    0.826    0.000    1.111    0.000 rrule.py:426(__init__)

@mrx23dot, I added your script as a benchmark in #43. If you like, you can help me find the bottlenecks again and we can optimize some more. I also adjusted your script so that caching can take place, which made it about a quarter faster.

Isn't that issue solved now?

old v0.1.18b0: [image: old]

head: [image: new]

I will close it until someone has an idea how to speed it up further. Thanks!