Schema feedback
jpmckinney opened this issue · 7 comments
- Why are `needs` and `categories` arrays of objects? Why not simply use arrays of strings like `"categories": ["Community", "Education"]`? The current proposal is unnecessarily complicated by using objects.
- It is not immediately obvious what `bornAt` means. I would use a term that is not metaphorical. The PROV ontology has `wasGeneratedBy`. If you prefer, `originatingEvent` would be the most literal.
- `politicalEntity` is too narrow. What if I make an app for an NGO? Why not just use the term `targetOrganization`, which is agnostic to whether the organization is political or not?
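To make the first point concrete, here is a minimal Python sketch comparing the two shapes. The `category` key in the object form is an assumption for illustration; the key name the proposal actually uses may differ.

```python
# Hypothetical object form; the "category" key is assumed for illustration.
categories_as_objects = [{"category": "Community"}, {"category": "Education"}]

# Proposed string form, as in the first bullet above.
categories_as_strings = ["Community", "Education"]


def flatten_categories(items, key="category"):
    """Convert an array of single-key objects into an array of strings."""
    return [item[key] for item in items]


# Membership tests are direct on strings but need a loop over objects.
has_education_strings = "Education" in categories_as_strings
has_education_objects = any(
    c["category"] == "Education" for c in categories_as_objects
)
```

Both forms carry the same information; the string form just needs less code to read and write.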
If you want to reuse terms from existing vocabularies:
- Instead of `thumbnailUrl`, use `image` from Schema.org.
- Instead of `geography`, use `geographicArea` from Schema.org.
- Instead of `categories` or `topic`, use the singular `category` from Description of a Project, used by Python, Mozilla, Freshmeat, etc. Alternatively, use `subject`, from DCMI, the most widespread metadata standard.
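If the renames were adopted, existing records could be migrated mechanically. The following is a rough Python sketch, not part of any existing tool; the function name `rename_fields` and the sample record are hypothetical, and only the three renames from the list above are included.

```python
# Old field name -> suggested replacement, taken from the list above.
RENAMES = {
    "thumbnailUrl": "image",
    "geography": "geographicArea",
    "categories": "category",
}


def rename_fields(record, renames=RENAMES):
    """Return a copy of the record with old keys replaced by the new terms.

    Keys that are not in the mapping are kept unchanged.
    """
    return {renames.get(key, key): value for key, value in record.items()}


# Hypothetical record using the old names (placeholder values).
old_record = {
    "name": "Example project",
    "thumbnailUrl": "http://example.com/icon.png",
    "geography": "South Bend, IN",
    "categories": ["housing"],
}
new_record = rename_fields(old_record)
```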
Here is the current output of the civic-tech-movement api with some of civic.json's proposed extra fields and your schema.org changes added. Tell me what you think. I feel like there is a lot of duplicate info in there.
You'll notice the `name` and `description` in my example are different from the `github_details` `name` and `description`. We do this because we need to include redeploys. If no `name` or `description` is provided with the submitted project list, we use the ones from `github_details`.
We put all the `github_details` into its own attribute as we don't want to lock ourselves into GitHub projects, though admittedly that's all people seem to be using currently and what we recommend.
The `project_needs` attribute finds all the GitHub Issues labeled as `project_needs` and lists their titles.
The `participation` attribute is used to fill the graphs on Chicago's project page.
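The fallback for the two display fields could look roughly like this in Python. This is a sketch under assumptions, not the API's actual code: the function name `resolve_display_fields` is hypothetical, and it assumes an empty string counts as "not provided".

```python
def resolve_display_fields(project):
    """Pick name/description, falling back to the github_details values.

    A submitted value wins; a missing or empty one is replaced by the
    corresponding value from github_details, if there is one.
    """
    details = project.get("github_details") or {}
    return {
        field: project.get(field) or details.get(field)
        for field in ("name", "description")
    }


# A redeploy with its own name but no description of its own.
redeploy = {
    "name": "South Bend Voices",
    "github_details": {
        "name": "cityvoice",
        "description": "A place-based call-in system for gathering and sharing community feedback",
    },
}
fields = resolve_display_fields(redeploy)
```

Here `fields["name"]` stays "South Bend Voices", while `fields["description"]` comes from the repository.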
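As an illustration of the label filter, here is a small Python sketch. It assumes issues shaped like GitHub's issues API response, where each issue has a `title` and a `labels` list of objects with a `name`; the function name and the sample issues are hypothetical, and how the API actually collects them is not shown here.

```python
def project_needs_titles(issues, label="project_needs"):
    """Return the titles of issues that carry the given label."""
    return [
        issue["title"]
        for issue in issues
        if any(lbl.get("name") == label for lbl in issue.get("labels", []))
    ]


# Hypothetical issues, trimmed to the two fields the filter reads.
sample_issues = [
    {"title": "Need a designer for the landing page",
     "labels": [{"name": "project_needs"}]},
    {"title": "Fix typo in README", "labels": [{"name": "bug"}]},
    {"title": "Unlabeled issue", "labels": []},
]
needs = project_needs_titles(sample_issues)
```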
{
  name: "South Bend Voices",
  description: "A redeploy of CityVoice for South Bend, IN",
  brigade: "Code for America",
  wasGeneratedBy : "Code for America",
  targetOrganization : "South Bend, IN",
  geographicArea : "South Bend, IN",
  status : "Deployed",
  image : "http://www.southbendvoices.com/assets/cityvoice_icon-fc542c9a630409505811c10c9bb9a8b0.png",
  type: "web service",
  categories: ["community engagement", "housing"],
  link_url: "http://www.cityvoiceapp.com/",
  code_url: "https://github.com/codeforamerica/cityvoice",
  github_details: {
    contributors: [
      {
        avatar_url: "https://gravatar.com/avatar/bdd8cc46ae86e389388ae78dfc45effe?d=https%3A%2F%2Fidenticons.github.com%2F4b102bf6681e25c44a3c980791826c1f.png&r=x",
        contributions: 518,
        html_url: "https://github.com/daguar",
        login: "daguar",
        owner: false,
        url: "https://api.github.com/users/daguar"
      },
      ...
    ],
    contributors_url: "https://api.github.com/repos/codeforamerica/cityvoice/contributors",
    created_at: "2013-06-06T00:12:30Z",
    description: "A place-based call-in system for gathering and sharing community feedback",
    forks_count: 12,
    homepage: "http://www.cityvoiceapp.com/",
    html_url: "https://github.com/codeforamerica/cityvoice",
    id: 10515516,
    language: "Ruby",
    name: "cityvoice",
    open_issues: 37,
    owner: {
      avatar_url: "https://gravatar.com/avatar/ec81184c572bc827b72ebb489d49f821?d=https%3A%2F%2Fidenticons.github.com%2F190ee0a9502204c8340cee81293edbbe.png&r=x",
      html_url: "https://github.com/codeforamerica",
      login: "codeforamerica",
      type: "Organization"
    },
    participation: [
      ...
    ],
    project_needs: [ ... ],
    pushed_at: "2014-02-21T20:43:16Z",
    updated_at: "2014-02-21T20:43:16Z",
    watchers_count: 10
  }
}
It does seem a bit chatty. What’s the actual benefit of schema.org things here?
@ondrae Is `github_details` more or less identical to GitHub's API response, with the exception of `project_needs` and `participation`? Why not move those out of `github_details` to keep that block to only what GitHub's API returns?
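A minimal Python sketch of that restructuring, assuming the two derived attributes would simply move to the top level of the project record (the function name `split_github_details` is hypothetical):

```python
# Attributes computed by the API rather than returned by GitHub.
DERIVED_KEYS = ("project_needs", "participation")


def split_github_details(project):
    """Return a copy of the project with derived keys hoisted to the top level.

    The input is not modified; github_details in the result keeps only
    the remaining keys.
    """
    details = dict(project.get("github_details", {}))
    result = {k: v for k, v in project.items() if k != "github_details"}
    for key in DERIVED_KEYS:
        if key in details:
            result[key] = details.pop(key)
    result["github_details"] = details
    return result


# Trimmed, hypothetical record for illustration.
project = {
    "name": "South Bend Voices",
    "github_details": {
        "name": "cityvoice",
        "project_needs": ["placeholder need"],
        "participation": [0, 1, 2],
    },
}
restructured = split_github_details(project)
```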
Other notes:
- Not every civic tech project will be created by a CfA brigade. Why not rename `brigade` to `author` (which is also a Dublin Core metadata term)?
- `link_url` is not very descriptive. Do you mean something more like `project_url`? Or maybe just go with `url`?
To clarify:
- Shouldn't the value of `targetOrganization` be "City of South Bend"?
- Shouldn't the value of `wasGeneratedBy` be "Code for America Fellowship", the event that generated this app? In this case, creating a new term like `originatingEvent` may avoid confusion.
@migurski The Schema.org points are my last, least important points. That said, if you want to call something a "standard", then you should at least look at what existing standards are out there and reuse terms where possible. Reusing terms with the same meaning makes a new specification easier to understand and adopt. I could have proposed terms from other standards; Schema.org is simply very popular. There are benefits to standard reuse - what's the benefit to not reusing standards? (If you instead described the project as "a specification that we only intend to use at Code for America brigades, with no ambition for adoption by the wider civic tech community," then I wouldn't be here, as the decisions here wouldn't affect me.)
> Why not rename `brigade` to `author` (which is also a Dublin Core metadata term)?
+1. I think it should be generic as well.
Truthfully, a lot of these fields seem a bit low-value to me. I think the original thought of "the more you make people do the fewer people will do it" is actually spot-on, and so we should evaluate each field that cannot be filled in programmatically in a serious cost-benefit way.
To that end, here are fields I think could be removed reasonably:
wasGeneratedBy : "Code for America",
targetOrganization : "South Bend, IN",
geographicArea : "South Bend, IN",
That said, looking at the actual output from the API at the moment, those don't appear to be in there.
I'm starting to feel like what we need is a working draft table of all the fields, with a few attributes:
- Field name
- Example value
- Required? (yes/no)
- Auto-filled via GitHub if blank?
How do people feel about that idea?
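One way such a working-draft table could be kept machine-readable is sketched below in Python. Everything here is an assumption for illustration: the `FieldSpec` type, the `missing_required` helper, and especially the required/auto-fill flags on the rows, which are placeholders and not decisions made in this discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str              # Field name
    example: str           # Example value
    required: bool         # Required? (yes/no)
    github_fallback: bool  # Auto-filled via GitHub if blank?


# Placeholder rows only; the flags are NOT agreed values.
DRAFT_FIELDS = [
    FieldSpec("name", "South Bend Voices", required=True, github_fallback=True),
    FieldSpec("code_url", "https://github.com/codeforamerica/cityvoice",
              required=True, github_fallback=False),
    FieldSpec("status", "Deployed", required=False, github_fallback=False),
]


def missing_required(record, specs=DRAFT_FIELDS):
    """List required fields that are blank and cannot be filled from GitHub."""
    return [
        spec.name
        for spec in specs
        if spec.required
        and not spec.github_fallback
        and not record.get(spec.name)
    ]
```

With these placeholder rows, a record that only sets `status` would be flagged for `code_url` but not for `name`, since `name` is marked as auto-fillable.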
If the use cases and requirements haven't yet been defined or agreed upon, then I think building a table (maybe in the wiki?) will allow for better discussion. I think that just one of geographicArea and targetOrganization is needed; I figure that the use case for that field is to show the geographic distribution of app deployments. wasGeneratedBy seems like it just satisfies people's curiosity; I'm not sure what the use case for it would be.
This is a great discussion. Whatever the spec ends up being, it won't be too difficult for the three existing projects - Code for America, Open Gov Hack Night, and BetaNYC - to make a few changes in our JavaScript front ends to support it.
We're trying to push the Civic Tech Movement API live by the end of March, so whatever schema is decided on here we'll support in the /projects endpoint.