invinst/chicago-police-data

Double-check data in /Clean/May2016 folder

Closed this issue · 8 comments

Data was created in pull request #17

@rajivsinclair and @ithinkidunno – do you have your own attempts at merged / clean data that we can check against @Yahwes's?

DGalt commented

(posted in slack, but figure it should go here too)
So I started looking at this as well - @Yahwes what did you do when there was no obvious match? E.g. when there is a nan in the Complaint_Number column, or when a Complaint_Number appears in one sheet of a pair but not the other (e.g. 2012 Parties has the complaint number 10521242, but this does not exist in the 2012 Incid sheet).

@DGalt yeah I think doing as much of the work as possible in public is 👍🏽👍🏽👍🏽
(In reply to DGalt's comment of Fri, Jun 10, 2016, 3:01 PM.)

(posted in slack, but figure it should go here too)
The 2012 CRID 1052142 is the only one that was in "parties" but not in "incid". In that case, the values I pulled in for the 13 columns that exist only in "incid" (Beat:Incident_Time_End, CLOSEDATIPRA_DATETIME, Report_Status:Penalty_Status) are left blank.
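For anyone who wants to reproduce that check, here is a minimal sketch of the "left blank when there is no match" behavior, assuming pandas. The two toy frames and their values are invented for illustration (only the column names `Complaint_Number`, `Beat`, and `Report_Status` come from the sheets), and this is not necessarily how the merge in the pull request was done.

```python
import pandas as pd

# Toy stand-ins for the "parties" and "incid" sheets; all values are invented.
parties = pd.DataFrame({
    "Complaint_Number": [1001, 1002, 1003],
    "Party_Type": ["Officer", "Complainant", "Officer"],  # hypothetical column
})
incid = pd.DataFrame({
    "Complaint_Number": [1001, 1002],
    "Beat": ["0111", "0423"],
    "Report_Status": ["Closed", "Open"],
})

# A left join keeps every "parties" row; columns that exist only in "incid"
# come back as NaN (blank) when no incident row matches.
merged = parties.merge(incid, on="Complaint_Number", how="left", indicator=True)

# Complaint numbers present in "parties" but missing from "incid".
unmatched = merged.loc[merged["_merge"] == "left_only", "Complaint_Number"].tolist()
print(unmatched)
```

With the real sheets, `unmatched` should list the complaint numbers that are in "parties" but not in "incid", which makes it easy to compare against another person's merge.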

DGalt commented

Going through @Yahwes's process I get the same thing

๐Ÿ‘๐Ÿฝ๐Ÿ‘๐Ÿฝ

Anything else we should document here?

@Yahwes what is a .feather file?
(In reply to DGalt's comment of Fri, Jun 10, 2016, 5:30 PM.)

It's the new Python/R dataframe format.
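On the Python side, a rough sketch of reading and writing one, assuming pandas with the `pyarrow` package installed (pandas relies on it for Feather support). The helper name `feather_round_trip` and the file path are made up for illustration.

```python
import pandas as pd

def feather_round_trip(df, path):
    """Write a DataFrame to a .feather file and read it straight back."""
    # to_feather expects a default (0..n-1) index; reset_index(drop=True) gives one.
    df.reset_index(drop=True).to_feather(path)
    return pd.read_feather(path)

# Example call (hypothetical file name):
# clean = feather_round_trip(pd.DataFrame({"Complaint_Number": [1001, 1002]}), "clean.feather")
```

The same file can then be opened from R, which is the point of the format.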

Gotcha, thanks! @Yahwes
(In reply to yahwes's comment of Fri, Jun 10, 2016, 5:52 PM.)