unitedstates/wish-list

Data release calendar

dwillis opened this issue · 13 comments

On the NICAR-L listserv we've been kicking around the idea of building a calendar of the annual data releases from government agencies that we can count on regularly. Some proposed metadata:

Dataset Name
Department or Agency
State or Federal
State (if applicable)
Month Released
Typical Release Date
Last Updated
Update Frequency
Formats Available
Link
Notes
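To make the proposal a bit more concrete, here is a rough sketch of that metadata as a single record type. The Python field names, the types, and the defaults are all assumptions for illustration; nothing here has been agreed on, and the example values are placeholders rather than a real dataset.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class DatasetRelease:
    """One calendar entry; fields mirror the proposed metadata list."""
    dataset_name: str
    agency: str                                  # Department or Agency
    level: str                                   # "State" or "Federal"
    state: Optional[str] = None                  # only when level == "State"
    month_released: Optional[int] = None         # 1-12 (assumed encoding)
    typical_release_date: Optional[str] = None   # free text for now
    last_updated: Optional[str] = None           # ISO date string (assumed)
    update_frequency: Optional[str] = None       # e.g. "annual"
    formats_available: Tuple[str, ...] = ()
    link: Optional[str] = None
    notes: str = ""


# Placeholder values only; this is not a real dataset.
example = DatasetRelease(
    dataset_name="Example dataset",
    agency="Example agency",
    level="Federal",
    month_released=3,
    update_frequency="annual",
    formats_available=("CSV",),
    link="https://example.invalid/dataset",
)

# The column order a flat CSV or YAML file could follow.
field_names = [f.name for f in fields(DatasetRelease)]
```

Keeping the record flat like this should make it easy to store as CSV or YAML, which fits the diff-friendly goal.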

A couple of questions:

  • Is this within the unitedstates project scope?
  • What is the best way to collect/store/publish this information?

I'm happy to move discussion of this here. I guess the first question has to be, is this the place for it? Or do we create our own separate repo for this?

I think there are some natural advantages to having it here, since the project has some other efforts that are essentially manual crowd-sourcing and collation. But curious what others think.

Yeah, I'm fine with keeping it here. But if prevailing wisdom pushes it to another repo, that's good by me too.

Nailing down the schema and a standardized means of crowdsourced input are what I see as the first two goals.

ghing commented

Over at @newsapps we've been talking about this internally. Would love to share in the lifting. Makes sense to do this in @unitedstates because it already has a per-state structure that would let orgs focus on their geographies of interest.

Great! @esagara suggested adding a methodology link to the schema as well, which I think makes sense.

Re: Collection. It strikes me that some sort of standardized form would probably work best for this. That way we can ensure the data coming in fits our needs.
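Whatever form ends up collecting submissions, the "fits our needs" check could be as small as the sketch below. The function name, the choice of which fields are required, and the use of a plain "State" key for "State (if applicable)" are all assumptions on my part, not anything the group has settled.

```python
# Assumed minimum for a usable entry; the real required set is undecided.
REQUIRED = ("Dataset Name", "Department or Agency", "State or Federal", "Link")


def validate_submission(record: dict) -> list:
    """Return a list of problems; an empty list means the record passes."""
    errors = []
    for key in REQUIRED:
        if not str(record.get(key, "")).strip():
            errors.append(f"missing: {key}")

    level = record.get("State or Federal")
    if level and level not in ("State", "Federal"):
        errors.append("State or Federal must be 'State' or 'Federal'")

    # A state-level dataset needs to say which state it covers.
    if level == "State" and not str(record.get("State", "")).strip():
        errors.append("missing: State")

    month = record.get("Month Released")
    if month is not None and month not in range(1, 13):
        errors.append("Month Released must be 1-12")
    return errors
```

Returning every problem at once, rather than stopping at the first, lets a form show a submitter the full list of fixes in one pass.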

Plenty of room to build, but if we're building proof of concept ahead of NICAR-L, a simple, user-friendly base is a must to me.

I think it's very much within the scope of @unitedstates and would love to support it here. I think @dwillis has Owner permissions and can do whatever, but let me know how I can help.

I also think this is a terrific project idea. A community calendar of data events spread across the government(s) would be useful even to the government itself. :)

While it's probably best to use a human writable diff-friendly format, I could also see a script that creates an iCal version of the end result, and/or uses Travis CI to publish the iCal version automatically to a permanent URL.
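A script along those lines might look roughly like this. It is a minimal sketch using only the standard library and writing iCalendar (RFC 5545) text by hand; the input keys `name` and `date`, the `PRODID` string, and the `example.invalid` UID domain are placeholders I made up. It also skips line folding for long lines, which a production version should handle, and it does not cover the Travis CI publishing step.

```python
from datetime import date, datetime, timezone


def escape_text(value: str) -> str:
    """Escape characters that are special in iCalendar TEXT values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def to_ical(releases, stamp: datetime) -> str:
    """Build a calendar with one all-day, yearly-repeating event per release.

    `releases` is a list of dicts with hypothetical keys "name" (str) and
    "date" (datetime.date). `stamp` should be timezone-aware.
    """
    dtstamp = stamp.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//data-release-calendar//EN",
    ]
    for i, rel in enumerate(releases):
        lines += [
            "BEGIN:VEVENT",
            f"UID:release-{i}@example.invalid",
            f"DTSTAMP:{dtstamp}",
            f"DTSTART;VALUE=DATE:{rel['date'].strftime('%Y%m%d')}",
            f"SUMMARY:{escape_text(rel['name'])}",
            "RRULE:FREQ=YEARLY",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"
```

Passing the timestamp in, instead of calling `datetime.now()` inside the function, keeps the output reproducible, so an automated build would only produce a diff when the underlying data changes.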

cc @philipashlock

I like the iCal idea. I'd love to put out the raw data feed as well as a simple ready-made display for those who want it.

So...what's next?

@dwillis You interested in starting a repo? Or was there a particular person on the NICAR list that was spearheading it?

@konklone Sure, I'll start one.

Hi, I'm in. iCal is cool. Will you want to include just US data, or is there the ability to add in worldwide datasets?

We haven't really discussed scope just yet. Would love to see it expand worldwide, but I think the proof of concept will likely be confined to the U.S. No reason we couldn't go there though.

Stephen Stirling
Staff Writer
Computer Assisted Reporting Team
The Star-Ledger
Office: 973.392.4174
Cell: 908.720.5363

For reference, I grabbed some of the Census data in iCal. Here's the field listing (field name: type):

Subject: Text
Start Date: Date/Time
Start Time: Date/Time
End Date: Date/Time
End Time: Date/Time
All day event: True/False
No End Time: True/False
Organizer: Text
Phone: Text
Email: Text
Categories: Text
Description: Text
Location: Text
Street: Text
City: Text
State: Text
Zip Code: Text
Country: Text
More Info: Text
User Id: Num
Event Id: Num
Repeat Id: Num

So I've gone ahead and made a working test case for NJ, for both Google Calendar and iCal. Here's the Google Calendar version:

bit.ly/1nyNRlZ

Now, I'm a disorganized mess and a calendar novice, so feel free to chime in and correct me here.

The main problem I ran into is that the formatting is super rigid, which limited my ability to add certain categories. I merged a few of them into the Description category, but the formatting is off.

Thoughts? Also, any further thoughts on how to accept/store live submissions?
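On the rigid formatting: one possible workaround is to fold the fields that have no calendar slot of their own into Description with a consistent "Label: value" layout, so the merged text at least looks the same for every entry. This is only a sketch; which fields count as "extra" is my assumption, and the function name is made up.

```python
# Fields assumed to have no native slot in the calendar export.
EXTRA_FIELDS = (
    "Update Frequency",
    "Formats Available",
    "Last Updated",
    "Link",
    "Notes",
)


def build_description(record: dict) -> str:
    """Join the extra metadata into one 'Label: value' line per field."""
    lines = []
    for label in EXTRA_FIELDS:
        value = record.get(label)
        if value:  # skip fields that are missing or empty
            lines.append(f"{label}: {value}")
    return "\n".join(lines)
```

Because every line follows the same pattern, a script could also split the Description back into separate fields later if the calendar ever needs to be converted to another format.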