To Do List

Question

To Do List

bransonf opened this issue 5 years ago · 10 comments

Answer 1 · 2019-06-03T13:50:48.000Z

Nice! Progress looks great so far!

Answer 2 · 2019-06-30T01:42:17.000Z

@chris-prener Could you implement the continuous integration for me? I would rather not create travis/appveyor accounts and not sure of best practice implementation.

Answer 3 · 2019-06-30T12:09:02.000Z

yep - I'll do that now

Answer 4 · 2019-06-30T13:37:31.000Z

OK @bransonf - I cleaned up some code and some of the package infrastructure and added a bunch of items to the to-do list. Nothing major, I don't think! Nice work on getting these API calls sorted out.

A couple questions for cxy_geocode:

why is this named 'response'? is there a reason not to overwrite 'split'?
the left_join's performance could be improved if we used one variable instead of 4... can we use id?

A couple questions for cxy_geocoder:

what is the practical impact of using a warning in an internal function... I'm not sure about this
why call methods::as() instead of as.numeric()?

Answer 5 · 2019-06-30T13:49:27.000Z

Also, FWIW, appyveyor is failing because the new dplyr release's binaries aren't out yet and for some reason the source builds are failing. On the Travis side, it is failing on 3.3 and 3.4 because of the serialization issue I noted above.

Answer 6 · 2019-06-30T17:54:30.000Z

What do we do about the appveyor issue?

Reason to not overwrite split, as I understand (Could be wrong about this) is that initializing an empty vector pre-allocates memory for data to be put into. If the function re-wrote to the split object, it would put it in a new section of memory, and it would have to call the new object every iteration in the loop. (Causing an exponential slow down) Even if I'm wrong about that, the worst risk of using another object is using a bit more memory.

I'm not really happy about the left_join part at all. I never found a solution for unique id to row id. I would like to create a list of only unique addresses to send to the geocoder and then join that UID to a regular ID.

Dependency on methods and internal warning are remnants of early development, they've been removed. The error for all Non-Matches takes care of what the internal warning was meant for.

Answer 7 · 2019-06-30T18:23:41.000Z

And a note about the testing, we won't technically get full coverage without using more than 1000 addresses, but I'm confident that if something would fail in the batch process, it would also fail on the smaller data.

Answer 8 · 2019-06-30T20:59:43.000Z

OK - totally fine with split and glad you made those other changes. I made a couple of small tweaks to the unit test and fixed a documentation issued that R CMD check identified.

We're still failing one of the unit tests, which is connected to what happens when the data have a state column but it isn't included in the function call. This is something we need to address. Can you check that out?

That same solution is implemented in parts of both gateway and postmaster for the UID... do you want me to add that or do you want to explore how it works in those packages?

Answer 9 · 2019-06-30T21:01:29.000Z

RE: appveyor and Travis failing - the dplyr binaries should be available early this week and that should resolve any outstanding issues. As long as devtools::check() is doing well locally, I'm not concerned.

Answer 10 · 2019-07-04T20:12:29.000Z

OK - almost ready to fork and submit to CRAN - we just need to let the stack of CI builds wrap up!