`VEVENT` duration not handled correctly in case of a DST change between `DTSTART` and `DTEND`
Closed this issue · 22 comments
The following test fails, because there is a DST change between DTSTART and DTEND, which is not handled correctly:
[Test]
public void TestGetOccurrence_DtEndExcluded()
{
var ical = """
BEGIN:VCALENDAR
VERSION:2.0
PRODID:Video Wall
BEGIN:VEVENT
UID:VW6
DTSTAMP:20240630T000000Z
DTSTART;TZID=Europe/London:20241001T000000
DTEND;TZID=Europe/London:20241128T000000
SUMMARY:New home speech.mp4
COMMENT:New location announcement; may need update before Thanksgiving
END:VEVENT
END:VCALENDAR
""";
var calendar = Calendar.Load(ical);
var occurrences = calendar.GetOccurrences(new CalDateTime("20241128T000000", "Europe/London"), new CalDateTime("20250101T000000", "Europe/London")).ToList();
Assert.That(occurrences, Is.Empty);
}
Related issues may appear in cases with recurring events where some of the occurrences do have DST changes between DTSTART and DTEND and others don't.
The following specification from RFC 5545 3.8.5.3 is also related to the topic, because it makes a distinction between the duration being specified via DTEND or via DURATION. Also related to #598
If the duration of the recurring component is specified with the
"DTEND" or "DUE" property, then the same exact duration will apply
to all the members of the generated recurrence set. Else, if the
duration of the recurring component is specified with the
"DURATION" property, then the same nominal duration will apply to
all the members of the generated recurrence set and the exact
duration of each recurrence instance will depend on its specific
start time. For example, recurrence instances of a nominal
duration of one day will have an exact duration of more or less
than 24 hours on a day where a time zone shift occurs. The
duration of a specific recurrence may be modified in an exception
component or simply by using an "RDATE" property of PERIOD value
type.
This is a follow-up of #656. While #656 discusses the case that DTSTART/DTEND are specified in floating time, this issue handles the slightly different case that DTSTART/DTEND specify a time zone.
No, it's not about local time, but floating time:.
The start is a floating a date-only:
var occurrences = calendar.GetOccurrences(new CalDateTime("20241128")).ToList();
A floating time can be converted to a zoned time,
The Value remains the same, only the timezone is set differently.
ToTimeZone
is called from GetOccurrences
.
public IDateTime ToTimeZone(string otherTzId)
{
// That makes the difference:
if (IsFloating) return new CalDateTime(_dateOnly, _timeOnly, otherTzId);
var zonedOriginal = DateUtil.ToZonedDateTimeLeniently(Value, TzId);
var converted = zonedOriginal.WithZone(DateUtil.GetZone(otherTzId));
return converted.Zone == DateTimeZone.Utc
? new CalDateTime(converted.ToDateTimeUtc(), UtcTzId)
: new CalDateTime(converted.ToDateTimeUnspecified(), otherTzId);
}
Does this make sense?
EDIT: Okay for here the value is date-only and remains floating. To convert, it would need a time.
I apologize for causing confusion, by updating the test case in the description multiple times. My assumption regarding #656 was, that it would be fixed with #658, but that it would be likely to persist, if timezones would be specified explicitly. So I intended to create a test, with no floating time involved, so we can analyze this independently. However, in the first version of the description the test case did contain floating time.
Given the updated test case without floating time involved, will it still be fixed by #658?
The test method is now updated as follows, and it fails with one event on CalendarEvent (10/01/2024 00:00:00 +01:00 Europe/London)
, while it should not according RFC 5545 Section 3.6.1,
The case is explicitly mentioned here:
https://github.com/axunonb/ical.net/blob/9870716d38690eec3123510554429df3cc0734fd/Ical.Net/Evaluation/RecurrenceUtil.cs?plain=1#L48C1-L51C1
Looks like something's wrong the endTime in the where section. Couldn't find out why yet.
[Edit]
[Test]
public void GetOccurrenceShouldExcludeDtEndZoned()
{
var ical = """
BEGIN:VCALENDAR
VERSION:2.0
PRODID:Video Wall
BEGIN:VEVENT
UID:VW6
DTSTAMP:20240630T000000Z
DTSTART;TZID=Europe/London;VALUE=DATETIME:20241001T000000
DTEND;TZID=Europe/London;VALUE=DATETIME:20241128T000000
SUMMARY:New home speech.mp4
COMMENT:New location announcement; may need update before Thanksgiving
END:VEVENT
END:VCALENDAR
""";
var calendar = Calendar.Load(ical);
var occurrences = calendar.GetOccurrences(new CalDateTime("20241128T000000", "Europe/London"), new CalDateTime("20250101T000000", "Europe/London")).ToList();
Assert.That(occurrences, Is.Empty); // fails
}
I assume it's not the 'where', it's that dtend is considered to be at 20241128T010000, which is caused by faulty calculations.
Omg, just noted, that the test was still wrong, due to the wrong value type. Fixed again, but probably doesn't make any difference.
Omg, I should have noticed this. But yes, it still fails because the faulty dtend (not the whole where clause)
In RecurringEvaluator
we have
public override HashSet<Period> Evaluate(IDateTime referenceDate, DateTime periodStart, DateTime periodEnd, bool includeReferenceDateInResults)
{
// ...
if (includeReferenceDateInResults && (Recurrable.RecurrenceRules == null || !Recurrable.RecurrenceRules.Any()))
{
rruleOccurrences.UnionWith(new[] { new Period(referenceDate), });
}
// ...
}
where includeReferenceDateInResults
is true
, and the start date gets included, while it should be false
for this test.
From DTSTART and DTEND we have have a change in DST.
Looks like there is no test for for start date <= event period end in `RecurringEvaluator.Evalute()
Why do we get 1 occurrence instead of none:
ical.net/Ical.Net/Evaluation/EventEvaluator.cs
Lines 41 to 65 in add58a9
HashSet<Period> periods
initially do not have theperiod.EndTime
set- The first
foreach
loop determines theperiod.EndTime
using the duration. 11/28/2024 00:00:00 +00:00 Europe/London minus 10/01/2024 00:00:00 +01:00 Europe/London = 58.01:00:00 (58 days and 1 hour) - Adding this duration to the
period.StartTilme
makesperiod.EndTime
1 hour later thanCalendarEvent.DtEnd
and also 1 hour later than thefrom
parameter ofcalendar.GetOccurrences(...)
. - As the
from
parameter date/time is beforeperiod.EndTime
, it is returned as an occurrence.
@minichma Looks like the cause is identified - not sure how a fix could look like.
@minichma If we subtract the difference in zoned time (aka Value
) instead of .AsUtc
this test (and all others we have) succeed.
It does make a certain sense to me, but Is it a viable solution, when DTSTART and DTEND are in different timezones? I guess, in such a case a DURATION must exist...
public virtual TimeSpan GetFirstDuration()
{
if (Properties.ContainsKey("DURATION"))
return Duration;
if (DtStart is not null)
{
if (DtEnd is not null)
{
// before (subtracts AsUtc): return DtEnd.Subtract(DtStart);
return DtEnd.Value - DtStart.Value; // new
}
else if (!DtStart.HasTime)
{
// For cases where a "VEVENT" calendar component
// specifies a "DTSTART" property with a DATE value type but no
// "DTEND" nor "DURATION" property, the event's duration is taken to
// be one day.
return TimeSpan.FromDays(1);
}
else
{
// For cases where a "VEVENT" calendar component
// specifies a "DTSTART" property with a DATE-TIME value type but no
// "DTEND" property, the event ends on the same calendar date and
// time of day specified by the "DTSTART" property.
return TimeSpan.Zero;
}
}
// This is an illegal state. We return zero for compatibility reasons.
return TimeSpan.Zero;
}
Proposal for CalendarEvent.GetFirstDuration()
public virtual TimeSpan GetFirstDuration()
{
if (Properties.ContainsKey("DURATION"))
return Duration;
if (DtStart is { HasTime: false } && DtEnd is null)
{
// For cases where a "VEVENT" calendar component
// specifies a "DTSTART" property with a DATE value type but no
// "DTEND" nor "DURATION" property, the event's duration is taken to
// be one day.
return TimeSpan.FromDays(1);
}
// we can't calculate a duration
if (DtStart == null || DtEnd == null) return default;
// For NodaTime, we need a time zone to calculate the duration
if (DtStart.IsFloating || DtEnd.IsFloating)
{
return DtEnd.Value - DtStart.Value;
}
// Handle time zones and DST transitions.
var startZoned = DateUtil.ToZonedDateTimeLeniently(DtStart.Value, DtStart.TzId);
var endZoned = DateUtil.ToZonedDateTimeLeniently(DtEnd.Value, DtEnd.TzId);
// Nominal Duration:
// This is the duration calculated based on the local date and time values, ignoring time
// zones and DST transitions. It represents the difference in calendar time.
var duration = NodaTime.Period.Between(startZoned.LocalDateTime, endZoned.LocalDateTime, PeriodUnits.Seconds).ToDuration();
// Actual Duration:
// This is the duration calculated based on the Instant values,
// which represent a specific point in time in UTC. It takes into account the time
// zones and DST transitions, providing the actual elapsed time between the two points
// in time in UTC.
// It takes into account the time zones and DST transitions.
// var duration = endZoned.ToInstant() - startZoned.ToInstant();
return duration.ToTimeSpan();
}
Yeah, that's a horrible topic. And its why TimeSpan
is not well suited for this purpose at all, because it cannot distinguish between the different parts of the time and not between nominal and exact durations (see the excerpt from the RFC in the description above), but that's a issue throughout the lib. The method GetFirstDuration()
is somewhat unclear too. We should have two methods, GetFirstDurationNominal
and GetFirstDurationExact
instead and decide which one to use where. And given the fact that the days and time part need to be treated differently, we'd even need different return types than Duration
.
I will try to comment in more detail the next days.
I think there are multiple issues involved here. The most basic one is that the current implementation of CalDateTime
results in (t + d - t != d)
if a DST change is involved.
public static IEnumerable<TestCaseData> TestSomethingTestCases { get; } = [
new TestCaseData(new CalDateTime(2024, 10, 27, 0, 0, 0, tzId: null), TimeSpan.FromHours(4)),
new TestCaseData(new CalDateTime(2024, 10, 27, 0, 0, 0, tzId: CalDateTime.UtcTzId), TimeSpan.FromHours(4)),
new TestCaseData(new CalDateTime(2024, 10, 27, 0, 0, 0, tzId: "Europe/Paris"), TimeSpan.FromHours(4)),
];
[Test, TestCaseSource(nameof(TestSomethingTestCases))]
public void TestSomething(CalDateTime t, TimeSpan d)
{
Assert.Multiple(() =>
{
Assert.That(t.Add(d).Subtract(d), Is.EqualTo(t));
Assert.That(t.Add(d).Subtract(t), Is.EqualTo(d));
});
}
My first thought would be that we should probably remove all the arithmetic operator overloads, because they are ambiguous, and replace the Add*
and Subtract*
with these methods:
AddNominal(TimeSpan)
AddExact(TimeSpan)
SubtractNominal(TimeSpan)
SubtractExact(TimeSpan)
SubtractNominal(IDateTime)
SubtractExact(IDateTime)
where the 'Nominal' variants can only be used if both values have the same tzId set.
The `GetFirstDuration´ should probably be replaced by something like
public struct EventDuration(TimeSpan nominalPart, TimeSpan exactPart)
{
public TimeSpan NominalPart => nominalPart;
public TimeSpan ExactPart => exactPart;
}
public EventDuration GetEffectiveDuration() {
...
}
The GetEffectiveDuration
method could then also account for the different handling required by the RFC if the duration is specified via DURATION vs via DTEND. But all in all would certainly require some additional considerations.
(t + d - t != d)
if a DST change is involved
Great finding
remove all the arithmetic operator overloads from
CalDateTime
Agree
'Nominal' operation variants can only be used if both values have the same tzId set.
Isn't this VEVENT
definition also valid and Nominal
should be used in case of GetFirstDuration
?
DTSTART;TZID=America/New_York;VALUE=DATETIME:20241001T000000
DTEND;TZID=Europe/London;VALUE=DATETIME:20241128T000000
'Nominal' operation variants can only be used if both values have the same tzId set.
Isn't this
VEVENT
definition also valid andNominal
should be used in case ofGetFirstDuration
?DTSTART;TZID=America/New_York;VALUE=DATETIME:20241001T000000 DTEND;TZID=Europe/London;VALUE=DATETIME:20241128T000000
Yes, this is valid, but to my understanding no, this is not supposed to use nominal time, but exact time according to the RFC
If the duration of the recurring component is specified with the
"DTEND" or "DUE" property, then the same exact duration will apply
to all the members of the generated recurrence set.
This is somewhat counterintuitive if the TZ is the same for DTSTART and DTEND, but if the TZ is different like in your example, then there is no meaningful way of calculating a nominal duration, so this probably is the reason for exact duration being specified in this case. GetFirstDuration
is not well defined regarding whether it returns an exact or a nominal duration, which should be improved. When introducing it, I basically just preserved the previous behavior but this should certainly be improved, either by having two variants, one for nominal, one for exact time, or by returning something that can express both (see the drafted EventDuration
above).
no, this is not supposed to use nominal time, but exact time according to the RFC
This was what I not assumed, before looking around what others are doing.
The duration is always 5 hours
PHP
/* Run online with https://onlinephp.io/ */
<?php
$start = new DateTime('2024-10-01 00:00:00', new DateTimeZone('America/New_York'));
$occurrences = [];
$interval = new DateInterval('P1D'); // Assuming daily occurrences
for ($i = 0; $i < 60; $i++) { // Check for 60 days to cover the DST change
$occurrence_start = clone $start;
$occurrence_start->add(new DateInterval('P' . $i . 'D'));
$occurrence_end = clone $occurrence_start;
$occurrence_end->setTimezone(new DateTimeZone('Europe/London'));
$occurrences[] = [
'start' => $occurrence_start->format('Y-m-d H:i:s T'),
'end' => $occurrence_end->format('Y-m-d H:i:s T')
];
}
echo "Occurrences around DST change:\n";
foreach ($occurrences as $occurrence) {
echo "Start: " . $occurrence['start'] . " - End: " . $occurrence['end'] . "\n";
}
?>
Output:
Occurrences around DST change:
Start: 2024-10-04 00:00:00 EDT - End: 2024-10-04 05:00:00 BST
Start: 2024-10-05 00:00:00 EDT - End: 2024-10-05 05:00:00 BST
...
Start: 2024-10-27 00:00:00 EDT - End: 2024-10-27 04:00:00 GMT # DST ends in Europe
Start: 2024-10-28 00:00:00 EDT - End: 2024-10-28 04:00:00 GMT
...
Start: 2024-11-03 00:00:00 EST - End: 2024-11-03 05:00:00 GMT # DST ends in the US
Start: 2024-11-04 00:00:00 EST - End: 2024-11-04 05:00:00 GMT
Python:
from datetime import datetime, timedelta
import pytz
# Define start time with time zone
start = datetime(2024, 10, 1, 0, 0, 0, tzinfo=pytz.timezone('America/New_York'))
occurrences = []
interval = timedelta(days=1) # Assuming daily occurrences
for i in range(60): # Check for 60 days to cover the DST change
occurrence_start = start + i * interval
occurrence_end = occurrence_start.astimezone(pytz.timezone('Europe/London'))
occurrences.append({
'start': occurrence_start.astimezone(pytz.timezone('America/New_York')).strftime('%Y-%m-%d %H:%M:%S %Z%z'),
'end': occurrence_end.strftime('%Y-%m-%d %H:%M:%S %Z%z')
})
print("Occurrences around DST change:")
for occurrence in occurrences:
print(f"Start: {occurrence['start']} - End: {occurrence['end']}")
Output:
Occurrences around DST change:
Start: 2024-10-09 00:00:00 EDT-0400 - End: 2024-10-09 05:00:00 BST+0100
Start: 2024-10-10 00:00:00 EDT-0400 - End: 2024-10-10 05:00:00 BST+0100
...
Start: 2024-10-27 00:00:00 EDT-0400 - End: 2024-10-27 04:00:00 GMT+0000 # DST ends in Europe
Start: 2024-10-28 00:00:00 EDT-0400 - End: 2024-10-28 04:00:00 GMT+0000
...
Start: 2024-11-03 00:00:00 EST-0500 - End: 2024-11-03 05:00:00 GMT+0000 # DST ends in the US
Start: 2024-11-04 00:00:00 EST-0500 - End: 2024-11-04 05:00:00 GMT+0000
...
@minichma Forgive me for dealing with “nominal” and “exact” once again, and being verbose.
Nominal vs accurate in terms of duration
RFC 5545 3.3.6 says: "The format can represent nominal durations (weeks and days) and accurate durations (hours, minutes, and seconds)." So it is just description of the duration, that is accurate or nominal. dur-day
and dur-week
are inaccurate descriptions and often found in colloquial language.
So when we have
BEGIN:VEVENT
UID:uid1@example.com
DTSTAMP:20240331T120000Z
DTSTART;TZID=Europe/Berlin:20240328T010000
DURATION:P7W
RRULE:FREQ=DAILY;COUNT=5
SUMMARY:Weekly Reminder
END:VEVENT
although the duration is expressed nominal (in RFC terms), it still ends up in accurate occurrence durations that consider the actual start and end times, including any DST changes:
- Start: 2024-03-28T01:00:00 (Europe/Berlin, UTC+01:00)
- End: 2024-05-16T01:00:00 (Europe/Berlin, UTC+02:00) - Accurate duration includes DST change.
The method GetFirstDuration()
should use the difference End - Start
as the result (we should change the method name), because the applicable timezone (including DST changes) is set in the next step when determining occurrences.
We have such a helper method because of RFC 5545 3.6.1 "The "DTSTART" property for a "VEVENT" specifies the inclusive start of the event. For recurring events, it also specifies the very first instance in the recurrence set. (...)"
Currently, the problem comes from
that calculates the real-world, accurate duration, while is should just use the pure date/time difference. That's the cause of wrong occurrences.
This being said, my conclusion also was, that CalDateTime
arithmetic operations should always calculate the real-world, accurate duration (as they do now, except for the bug tracked with #670).
It could well be that we have talked past each other so far because of the word "duration" having different meanings in section 3.6.1 and GetFirstDuration()
and CalDateTime
arithmetic operations.
Does all this make sense to you?
Does all this make sense to you?
Yes absolutely. Probably I was somewhat unclear and we are also mixing two related but different topics.
-
The core of the problem described in this ticket is what is tracked in #670. It should be fixed by making sure that all
CalDateTime.Subtract
and.Add
variants do their calculations the same way, i.e. either in exact time (i.e. after converting to UTC) or in nominal time (i.e. calculating in the given timezone without any tz conversion). Whether we use exact or nominal doesn't matter in case of a single-date event like the one in the issue's description (except for ambiguous or non-existing timestamps). -
The other problem that we started discussing is the topic around how to deal with nominal vs exact in the case of recurring events. There it makes a difference whether calculations are done in UTC or in local time, if some recurrences span DST changes while others don't. That's not the problem described in this ticket though, so it should probably be tracked independently. The first step of dealing with it was taken in #574.