Node-DC/Node-DC-EIS

Wrong output data with the latest commit PR #21

Closed this issue · 4 comments

@egalli We are seeing totally difference performance numbers (2x more) compare to the old server code. After inspecting the change in employee_controller.js, I found three sets of changes,

  1. in exports.findAll = function(req, res)
    ...
    Employee.find({}, {
    address: 0,
    compensation: 0,
    family: 0,
    health: 0,
    photo: 0
    })
    ...

  2. exports.findByZipcode = function(req, res)
    ...
    if (!zipcode) {
    ...
    } else {
    Employee.find({'address.zipcode' : zipcode}, {
    compensation: 0,
    family: 0,
    health: 0,
    photo: 0
    })

and 3) exports.findByName = function findByName(req, res) {
Employee.find(query, {
address: 0,
compensation: 0,
family: 0,
health: 0,
photo: 0
})
...

With these three changes, effectively we are sending truncated/incomplete data to the client reducing the data transfer by 3x, which in turn improved the RPS but it's not correct behavior.

Please update the code to fix this issue and issue a new PR.

We were sending less data before the multi-schema to single-schema change. Which is the correct behavior?

I changed it to match the previews version because pug used to only care about a subset data. Should the pug version be changed to send the extra data?

@egalli Earlier we were sending data collected from all models using 'async' feature, there was not reason not too. The database schema was arranged to mimic relational model with employee_id as a foreign key. This is also the correct behavior. So when you query an employee data by 'id', 'last_name' or 'zipcode', expectation is to get the complete record set for the found employees with those criteria. Since there was no master-detail view while sending json object. This might be true in PUG case, but I don't see any need to change that for PUG.
One last thing, using 'last_name' and 'zipcode' ratios in the database loader we have already limited the record creation and distribution.
Please change the PUG version to match this as well.

I'm slightly confused by your statement. We only used async to query all schema in findById. findByZipcode and findByName did not use async, both just retrieved data from single schema (Address and Employee respectively). My point was that retrieving full objects during during listings was a breaking change. If you feel that is the correct approach, we can change it back. Although, it seems kind of odd to send that much data.

Regarding the pug implementation, do you want me to make it match or not?

@egalli You are right, I verified that in very early version of multi-schema mode, indeed the findByZipcode or findByName didn't pull in complete data set but that stage was still in the development phase waiting to be fixed. But then we made decision to convert into a single schema design after some feedback from Matteo@Nearform. So this error implicitly then got fixed with single schema design. But the intent was always to send complete data set for that query. So please go ahead, address this issue and submit a new PR. Thanks.