google/lovefield

found a bug with groupBy clause

tim-tai opened this issue · 1 comments

I found this when I was trying to create a query that will return a list of task.

Here is the query

var query = db.select(task.id,
       		task.note,
	      	).
		from(item).
		innerJoin(task, task.id.eq(item.task_id)).
		where(item.auditor_id.eq(123)).
		groupBy(item.task_id);

Here are the sample data for both table, task(master) and item(detail).

var task_data = [{
      'id': 1,
      'initiator_id': 1,
      'schedule_id': 1,
      'status': 0,
      'description': 'Task Example A'
    }, {
      'id': 2,
      'initiator_id': 2,
      'schedule_id': 2,
      'status': 0,
      'description': 'Task Example B'
    }];

var item_data = [{
        'id': 1,
        'task_id': 1,
        'auditor_id': 123,
        'tag_id': 1,
        'asset_name': 'Item A',
        'asset_type_name': 'Type A',
        'description': 'example A'
      }, {
        'id': 2,
        'task_id': 1,
        'auditor_id': 123,
        'tag_id': 2,
        'asset_name': 'Item B',
        'asset_type_name': 'Type B',
        'description': 'example B'
      }
      /*,{
            'id': 3,
            'task_id': 2,
            'auditor_id': 123,
            'tag_id': 3,
            'asset_name': 'Item C',
            'asset_type_name': 'Type C',
            'description': 'example C'
          }*/
    ];

So with the Sample data above the query should just return one row of task id (1) and description (example A), but somehow it return twice.

The result will be correct when third data row(commented) is added, which will return two task id(1, 2) and two description(example A, example B).

Here is the example: jsfiddle

Execution plan:
project(task.id,task.description, groupBy(item.task_id))
-groupBy(item.task_id)
--join(type: inner, impl: index_nested_loop, join_pred(task.id eq item.task_id))
---select(value_pred(item.auditor_id eq 123))
----table_access(item)
---no_op_step(task)

Repro:
https://github.com/arthurhsu/lovefield-ts/tree/bugrepro