Timezone not applied to timeBucket without any split
longweiquan opened this issue · 4 comments
Hello Team Plywood,
When I use timeBucket to split the results by PT1H with a timezone and no other splits, the timezone is not applied to the timestamp.
When I use timeBucket to split the results by PT1H with the same timezone together with other splits, the timezone is applied to the timestamp.
So timeBucket behaves differently depending on whether there is another split. Could you please look into this?
Hi! Could you post the query you're making please? Thanks!
Hello Team Plywood,
I cannot reproduce the problem with a static data source; with static data, the dates in the result are always in UTC.
Could you please tell me what the correct behaviour is? Does Plywood always return UTC, or the date formatted with the timezone?
Here is my test code; it might help you answer my question:
const plywood = require('plywood');
const timezone = require('chronoshift/lib/walltime/walltime-data');

const ply = plywood.ply;
const $ = plywood.$;
const Dataset = plywood.Dataset;
const chronoshift = plywood.Chronoshift;
const walltime = chronoshift.WallTime;

if (!walltime.rules) walltime.init(timezone.rules, timezone.zones);

const context = {
  data: Dataset.fromJS({
    data: [
      {
        time: new Date('2016-01-01T01:00:00.000Z').getTime(),
        ad: '1',
        type: 'imp'
      },
      {
        time: new Date('2016-01-01T02:00:00.000Z').getTime(),
        ad: '1',
        type: 'imp'
      },
      {
        time: new Date('2016-01-01T03:00:00.000Z').getTime(),
        ad: '1',
        type: 'imp'
      }
    ],
    attributes: [
      {name: 'time', type: 'TIME'},
      {name: 'ad', type: 'STRING'},
      {name: 'type', type: 'STRING'}
    ]
  })
};

function execute(ex, message, done) {
  console.log(message);
  ply()
    .apply('data', ex)
    .compute(context)
    .then((result) => {
      const data = result.data[0].data.data;
      data.forEach((row) => {
        console.log('start:' + row.hour.start);
        console.log('end:' + row.hour.end);
      });
      done();
    }).done();
}

execute($("data").split({
  ad: '$ad',
  hour: '$time.timeBucket("PT1H", "America/New_York")'
}), 'with ad split', () => {
  execute($("data").split({
    hour: '$time.timeBucket("PT1H", "America/New_York")'
  }), 'without ad split', () => {});
});
Hello Team Plywood,
I think I found the problem.
When timeBucket is the only split (or dimension), the request body sent to Druid is
...
queryType: 'timeseries',
granularity: { type: 'period', period: 'PT1H', timeZone: 'Europe/Paris' },
...
and when there is a timeBucket split together with a default split, the request body sent to Druid is
"queryType":"groupBy",
"granularity":"all",
"dimensions":[
  {"type":"default","dimension":"ad","outputName":"ad"},
  {"type":"extraction","dimension":"__time","outputName":"hour","extractionFn":
    {"format":"yyyy-MM-dd'T'HH':00","timeZone":"Europe/Paris","type":"timeFormat"}
  }
]
As you can see, the generated requests are different. The first one does not return hour,
so Plywood resolves the result from the timestamp. The second one returns hour
with the time shift applied by Druid.
For more information: in the second case, the following functions are called in Plywood. The function timeRangeInflaterFactory does not apply the timezone to the date.
external.timeFloorToExtraction (duration: Duration, timezone: Timezone)
druidExternal.timeRangeInflaterFactory(label: string, duration: Duration, timezone: Timezone): Inflater {
  return (d: any) => {
    var v = d[label];
    if ('' + v === "null") {
      d[label] = null;
      return;
    }
    var start = new Date(v); // new Date will use the timezone of client instead of input timezone
    d[label] = new TimeRange({ start, end: duration.shift(start, timezone) })
  };
}
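To illustrate the parsing issue described above, here is a minimal sketch of how a wall-clock string such as the one produced by the timeFormat extraction (for example "2016-01-01T02:00", with no UTC offset) could be interpreted in a named timezone rather than in the client's timezone. This is not Plywood's code and not necessarily the fix that was released; the helper names zoneOffsetMs and parseWallTime are hypothetical, and the sketch relies only on the standard Intl.DateTimeFormat API. It uses a single offset lookup, so it may be off by the DST shift for times that fall right around a DST transition.

```javascript
// Hypothetical helper: offset (in ms) of `timeZone` from UTC at the instant `date`.
function zoneOffsetMs(date, timeZone) {
  const fmt = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric', month: '2-digit', day: '2-digit',
    hour: '2-digit', minute: '2-digit', second: '2-digit'
  });
  const p = {};
  for (const part of fmt.formatToParts(date)) p[part.type] = part.value;
  // Re-read the zone's wall-clock fields as if they were UTC, then compare.
  const asUTC = Date.UTC(+p.year, p.month - 1, +p.day, +p.hour, +p.minute, +p.second);
  return asUTC - date.getTime();
}

// Hypothetical helper: parse "yyyy-MM-ddTHH:mm" as wall-clock time in `timeZone`.
function parseWallTime(str, timeZone) {
  const m = /^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})$/.exec(str);
  if (!m) throw new Error('Unexpected time format: ' + str);
  // First treat the fields as UTC, then shift by the zone's offset at that instant.
  const guess = Date.UTC(+m[1], m[2] - 1, +m[3], +m[4], +m[5]);
  return new Date(guess - zoneOffsetMs(new Date(guess), timeZone));
}

// new Date('2016-01-01T02:00') is interpreted in the client's local timezone,
// so its result depends on the machine running the code.
console.log(parseWallTime('2016-01-01T02:00', 'Europe/Paris').toISOString());
// 2016-01-01T01:00:00.000Z
console.log(parseWallTime('2015-12-31T20:00', 'America/New_York').toISOString());
// 2016-01-01T01:00:00.000Z
```

With a start date obtained this way, the end of the range could still be computed with duration.shift(start, timezone) as in the inflater above.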
Released in 0.15.13