Date Slipper v1.0

The Problem This Solves

Predicting delivery dates is hard! The further out an estimate is, the less precise we can be about the time period in which it is delivered. Errors compound and the longer a project takes, the more delays affect it. If the project will take 1 hour, we might accurately estimate delivery within a handful of minutes. But if delivery will take 5 months, we cannot estimate to within the same handful of minutes. The solution is to decrease the precision of the estimated timeframe the later you go. But this is much harder than it looks…

Why This Is Difficult

As with most problems, setting up the problem in the right way is 80% of the solution. This is a hard problem to think about until you have the right framework in which to assemble a solution. While putting that framework together, you have to know to consider, and be able to address, at least the following concerns:

Loss of precision over time: the longer the timeframe, the less precise the estimate.
Delivery needs to be broken into discrete timeframes instead of simply ± some range; otherwise the audience naturally computes the middle of the range as a real date.
Projects are very rarely early. Lack of accuracy from an estimate almost always means the delivery will be later, not earlier.
Delivering early is possible and should probably happen more often.
Delivery estimates need to get increasingly more accurate as time goes on so that what was once a far-off estimate with low precision will eventually be a near estimate with high precision. That transition needs to happen smoothly and naturally.
If a delivery is projected to occur near the end of a certain timeframe, it is better to estimate that it will be delivered in the next timeframe. For example, a calculated delivery on June 30th should be estimated as Q3, not Q2.

This does not solve…

Deriving the original estimate.
Delivering your product.

Our Solution

Discrete problems:

Create “stepped” timeframes
Determine which timeframe to use for any particular estimate

Create Stepped Timeframes:

The final output we want to display is a textual explanation of the timeframe expected for delivery. Examples of this include “late August” or “mid Q1” or even “March 19”. To accomplish this, we set up arrays of the possible components of each textual timeframe to be combined in various ways:

portions = ["early ", "mid ", "late "]

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

quarters = ["Q1", "Q2", "Q3", "Q4"]

These components can be combined (with numerals) to represent all of the following types of timeframes:

Day: Mar 19

Partial-Month: early Jan

Partial-Quarter (or Month): late Q2

Quarter: Q3 2014

Year: 2015

We can think of these categories as an array of precision:

precision = [ Day, Partial-Month, Partial-Quarter, Quarter, Year ]

Each of these timeframes corresponds with a level of precision we want to use for estimating delivery. The daily timeframe suggests a level of precision within one day. The partial-month timeframe suggests a level of precision to within approximately 1/3 of a month (or a little more than a week). The same pattern continues through the coarser timeframes.

With the arrays described above (portions, months, and quarters) and numeric representation of days and years, we can assemble any combination of the timeframes described above. The next step is to calculate the proper assemblage.
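As a rough illustration of how these pieces might be assembled, here is a minimal Python sketch. The function name format_timeframe and the rules for choosing early/mid/late (thirds of a month by day number, and the month's position within its quarter) are assumptions made for illustration; the text above does not specify them.

```python
from datetime import date

portions = ["early ", "mid ", "late "]
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
quarters = ["Q1", "Q2", "Q3", "Q4"]


def format_timeframe(d, precision):
    """Build a label for date d at precision index 0..4 (hypothetical helper)."""
    month_idx = d.month - 1
    quarter_idx = month_idx // 3
    if precision == 0:  # Day
        return f"{months[month_idx]} {d.day}"
    if precision == 1:  # Partial-Month (assumed split: days 1-10, 11-20, 21+)
        return portions[min((d.day - 1) // 10, 2)] + months[month_idx]
    if precision == 2:  # Partial-Quarter (assumed split: month's position in the quarter)
        return portions[month_idx % 3] + quarters[quarter_idx]
    if precision == 3:  # Quarter
        return f"{quarters[quarter_idx]} {d.year}"
    return str(d.year)  # Year


print(format_timeframe(date(2014, 3, 19), 0))  # Mar 19
print(format_timeframe(date(2014, 6, 20), 2))  # late Q2
```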

Consider the following list of numbers:

0, 10, 21, 37, 59, 91, 137, 200, 283, 388

The important feature for our purposes is the distance between these numbers. If you take any number and subtract the number before it, the resulting differences fall into groups related to the timeframes we want to generate.

The first number (0) is a special case. There is nothing before it, and this number alone corresponds with the daily timeframe. Since the daily timeframe is the most exact timeframe possible, there is only one possible category for this timeframe: exact.

For the next three numbers (10, 21, 37), subtracting the number before each one gives 10, 11, 16 (these are the intervals between the numbers). These interval values, as numbers of days, correspond approximately with the number of days in each partial-month timeframe (~10).

The next three numbers (59, 91, 137) have intervals of: 22, 32, 46. These intervals correspond approximately with the number of days in each monthly or partial-quarter timeframe (~31).

The next three numbers (200, 283, 388) have intervals of: 63, 83, 105. These intervals correspond approximately with the number of days in each quarterly timeframe (~90).

Anything higher than those intervals corresponds with the yearly intervals, although after three yearly intervals, we skip to a final category and just admit that the timeframe is “probably never”.
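The grouping described above can be checked directly. This short sketch (the variable names are ours, chosen for illustration) computes the difference between each number and the one before it, then collects those differences in groups of three:

```python
series = [0, 10, 21, 37, 59, 91, 137, 200, 283, 388]

# Difference between each number and the one before it.
gaps = [b - a for a, b in zip(series, series[1:])]

# Group the differences in threes: partial-month, partial-quarter, quarter.
groups = [gaps[i:i + 3] for i in range(0, len(gaps), 3)]

print(groups)  # [[10, 11, 16], [22, 32, 46], [63, 83, 105]]
```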

This series of numbers is generated by plugging successive integer values of x, starting at 0, into the following equation and rounding the result to the nearest whole number:

y = 11x - 1.2x^2 + 0.53x^3

x = 0	y = 0
x = 1	y = 10
x = 2	y = 21
x = 3	y = 37
x = 4	y = 59
…	…

The result is the array of numbers listed above. The index of each item in this array is of special use to us in solving our stepped timeframe problem. The index of each number divided by 3 and rounded up (“ceiling”) is the index into the needed precision. (Note: a partial differential could also be used.)
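One way to reproduce the series, as a sketch: this assumes x starts at 0 and that each result is rounded to the nearest whole number, which is what matches the list of numbers given earlier. The name series_value is hypothetical.

```python
def series_value(x):
    """Evaluate y = 11x - 1.2x^2 + 0.53x^3, rounded to the nearest whole number."""
    return round(11 * x - 1.2 * x**2 + 0.53 * x**3)


# The first ten values (x = 0..9) reproduce the list in the text.
intervals = [series_value(x) for x in range(10)]
print(intervals)  # [0, 10, 21, 37, 59, 91, 137, 200, 283, 388]
```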

x (index)	ceil(x / 3)	precision
0	0	daily
1, 2, 3	1	partial-month
4, 5, 6	2	partial-quarter
7, 8, 9	3	quarter
10, 11, 12	4	year
> 12	5	never
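The lookup in the table above could be sketched like this. The list and function names are hypothetical, and clamping every index above 12 to the final category is our reading of the last table row.

```python
import math

precision = ["daily", "partial-month", "partial-quarter", "quarter", "year", "never"]


def precision_for(index):
    """Map an index into the series to a precision label: ceil(index / 3), capped at 'never'."""
    return precision[min(math.ceil(index / 3), len(precision) - 1)]


for index in (0, 2, 5, 7, 11, 13):
    print(index, precision_for(index))
```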

Now we have what we need to create an array of timeframes. Our implementations take a functional approach, so we wait to create the actual text until the delivery estimate is ready… which is the next step.

Determine Which Timeframe To Use:

Choosing the proper timeframe is also more difficult than it seems at first. The challenge lies primarily in smoothly slipping to the following timeframe when a delivery estimate falls near the very end of one particular timeframe. We accomplish this with a second function which runs near to the initial curve, but will slip to the next timeframe as the delivery estimate approaches the end of a timeframe.

y = 13x^{0.113} - 16.5

In this equation, x is the delivery estimate in days. Rounding y up to the next whole number (“ceiling”) and then raising any negative result to 0 (“max”) gives an index into our “intervals” array, the series of numbers generated in the previous section. The value at that index is another time in days close to the delivery estimate.

We can take the larger of the two numbers as the appropriate timeframe to use for the final estimate. The number that is used is the original delivery estimate in most cases, unless that would fall near the end of a calculated timeframe. If it is near the end, the larger value returned from “intervals” is used and the final result is the next timeframe.
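The two steps just described might be sketched as follows. This assumes the delivery estimate is a non-negative number of days and that “intervals” is the rounded series from the first equation; slip_index and slipped_days are hypothetical names.

```python
import math


def series_value(i):
    """Value of the series at index i (first equation, rounded)."""
    return round(11 * i - 1.2 * i**2 + 0.53 * i**3)


def slip_index(delivery_days):
    """Index into the series: ceiling of the second equation, never below 0."""
    y = 13 * delivery_days ** 0.113 - 16.5
    return max(0, math.ceil(y))


def slipped_days(delivery_days):
    """Larger of the original estimate and the series value at its index."""
    return max(delivery_days, series_value(slip_index(delivery_days)))


print(slip_index(20), slipped_days(20))    # 2 21
print(slip_index(700), slipped_days(700))  # 11 700
```

In the first call the series value (21) wins, so the estimate slips forward by a day; in the second the original estimate (700) is already larger than the series value (681) and is kept.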

For example:

delivery	y	intervals(y)	precision	final result
5	-0.907	0 days	daily	Feb 22
20	1.737	21 days	partial-month	early Mar
70	4.511	91 days	partial-quarter	mid Q2
160	6.568	200 days	quarter	Q3 2013
700	10.75	681 days	year	2015
1200	12.47	1105 days	never	probably never

The end result is a timeframe using a graduated precision which has the expected delivery date comfortably within that timeframe.
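To tie the pieces together, here is a hedged end-to-end sketch. The function name estimate_label is hypothetical, the early/mid/late rules are assumptions (thirds of a month by day number, and the month's position within its quarter), and the start date of Feb 17, 2013 is only a guess chosen because it is consistent with the example table; the text does not state it.

```python
import math
from datetime import date, timedelta

portions = ["early ", "mid ", "late "]
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
quarters = ["Q1", "Q2", "Q3", "Q4"]


def series_value(i):
    """Value of the series at index i (first equation, rounded)."""
    return round(11 * i - 1.2 * i**2 + 0.53 * i**3)


def estimate_label(start, delivery_days):
    """Turn a delivery estimate in days into a stepped timeframe label."""
    index = max(0, math.ceil(13 * delivery_days ** 0.113 - 16.5))
    level = math.ceil(index / 3)  # index into the precision levels
    if level >= 5:
        return "probably never"
    # Use the larger of the original estimate and the series value.
    d = start + timedelta(days=max(delivery_days, series_value(index)))
    m = d.month - 1
    if level == 0:
        return f"{months[m]} {d.day}"
    if level == 1:  # assumed split: days 1-10, 11-20, 21+
        return portions[min((d.day - 1) // 10, 2)] + months[m]
    if level == 2:  # assumed split: month's position in the quarter
        return portions[m % 3] + quarters[m // 3]
    if level == 3:
        return f"{quarters[m // 3]} {d.year}"
    return str(d.year)


start = date(2013, 2, 17)  # assumed start date, not given in the text
for days in (5, 20, 70, 160, 700, 1200):
    print(days, estimate_label(start, days))
```

Under those assumptions, the loop prints the same final results as the example table: Feb 22, early Mar, mid Q2, Q3 2013, 2015, and probably never.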

rrwright/date-slipper