Citations
Currently TheBigDB has no way to attribute data to its source. Is this something that we want to be able to do?
- It would allow increased confidence in our data, because we can prove it came from an authoritative source.
- It seems that most other major data aggregators (Wolfram Alpha, Google Maps, Wikipedia, etc.) do this. Not to cargo-cult the matter, but if they cite their sources, shouldn't we?
EDIT: Never mind, I forgot something.
bobtwinkles: I was reluctant to have sources when I built it, but your arguments make sense. How should the sources of a statement be represented? As a line of free text of about 500 chars?
nickthename: It won't work that way, since you would just be putting a string inside an array of nodes, and a statement has more to it than that, such as periods. It will probably be a field called "source".
You're right, I forgot we had meta-data available. Citation data could be represented roughly the same as period data, right? You could either have several fields, as in:
["information", "type", "datum"] period: { from: "1993-01-20 12:00:00", to: "2001-01-20 11:59:59" } author: {bob smith} publish date: {2005} media type: {book} etc.
Or just combine it in one field, like:
["information", "type", "datum"] period: { from: "1993-01-20 12:00:00", to: "2001-01-20 11:59:59" } cite: {bob smith, United States Presidents, book, 2005}
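To make the two options above concrete, here is a rough Python sketch of a statement with several citation fields next to one with a single combined field. The key names (`nodes`, `author`, `title`, `media_type`, `publish_date`, `cite`) and the helper `combine_citation` are assumptions for illustration only, not TheBigDB's actual schema or API.

```python
# Hypothetical representation: key names are illustrative, not TheBigDB's schema.
PERIOD = {"from": "1993-01-20 12:00:00", "to": "2001-01-20 11:59:59"}

# Option 1: several separate citation fields alongside the period.
multi_field_statement = {
    "nodes": ["information", "type", "datum"],
    "period": PERIOD,
    "author": "bob smith",
    "title": "United States Presidents",
    "media_type": "book",
    "publish_date": "2005",
}


def combine_citation(statement):
    """Collapse the separate citation fields into one free-text string."""
    keys = ("author", "title", "media_type", "publish_date")
    return ", ".join(statement[key] for key in keys if key in statement)


# Option 2: the same information combined into a single "cite" field.
single_field_statement = {
    "nodes": ["information", "type", "datum"],
    "period": PERIOD,
    "cite": combine_citation(multi_field_statement),
}
```

Going from separate fields to one string is easy, as shown; going the other way means parsing free text, which is the main trade-off between the two options.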
I think there are too many types of sources (e.g. links, book, website article) to even try to represent them in any other way than free-text. I'm basically just wondering about the size of the text field.
The average citation on Wikipedia seems to be roughly 120 characters, with the longest normal one stretching to about 200 characters. However, #1 on this page http://en.wikipedia.org/wiki/Kenya#References demonstrates the need for a reasonably large size (500+ chars) to accommodate directly quoted portions of the text, etc.
Also, how will searching by citation work? Let's say I learn that a certain author was exposed as a liar, and I want to downvote every submission citing him. I guess this could be handled by Search(*).with(cite: { John Phoney * }), but it's important we consider how we want that to work.
Yep, that's probably what's going to happen. Will close this issue once it is implemented.
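As a rough illustration of the wildcard search on citations discussed above, the sketch below filters in-memory statements by a shell-style pattern on a free-text `cite` field. It uses Python's standard `fnmatch` module rather than any real TheBigDB search call. The function name `search_by_citation`, the `cite` key, and the sample statements are all hypothetical.

```python
from fnmatch import fnmatchcase


def search_by_citation(statements, pattern):
    """Return the statements whose 'cite' text matches a shell-style wildcard pattern.

    Statements without a 'cite' field never match.
    """
    return [
        statement
        for statement in statements
        if "cite" in statement and fnmatchcase(statement["cite"], pattern)
    ]


# Made-up sample data, for illustration only.
statements = [
    {"nodes": ["a", "b", "c"], "cite": "John Phoney, Some Dubious Book, book, 2005"},
    {"nodes": ["d", "e", "f"], "cite": "bob smith, United States Presidents, book, 2005"},
    {"nodes": ["g", "h", "i"]},  # no citation attached
]

# Everything citing the discredited author, ready to be reviewed or downvoted.
flagged = search_by_citation(statements, "John Phoney*")
```

A prefix match like this only works if the author's name is always written first and spelled the same way, which is one argument for agreeing on a citation order even when the field itself is free text.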
Should we also define a recommended citation format as a guideline, e.g. MLA, APA, etc.? Most of these formats are hard to get right, though.
The most logical approach seems to be a system in which each citation has a name, date, title, page number if needed, etc., in a set order, but flexible based on the citation type. Our controls probably won't be as tight as something like Wikipedia's, so if we want people to cite sources, the citation method has to be fairly intuitive and flexible. Something along the lines of:
"John Smith, A Book About a Topic, 2007, entropy house, pg 77."
With "Name, Title, Date, Publisher, Page/URL, etc" or something similar as a generic style.
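One possible way to produce the generic "Name, Title, Date, Publisher, Page/URL" style is sketched below. The function `format_citation`, its parameter names, and the 500-character cap (borrowed from the field size floated earlier in this thread) are assumptions for illustration, not an agreed format.

```python
# Assumed cap, taken from the "about 500 chars" suggestion earlier in the thread.
MAX_CITATION_LENGTH = 500

# Fixed output order; any field may be left out depending on the source type.
FIELD_ORDER = ("name", "title", "date", "publisher", "locator")


def format_citation(**fields):
    """Join the supplied fields in a fixed order, skipping any that are missing."""
    parts = [str(fields[key]).strip() for key in FIELD_ORDER if fields.get(key)]
    if not parts:
        raise ValueError("a citation needs at least one field")
    citation = ", ".join(parts) + "."
    if len(citation) > MAX_CITATION_LENGTH:
        raise ValueError("citation is longer than %d characters" % MAX_CITATION_LENGTH)
    return citation


book = format_citation(
    name="John Smith",
    title="A Book About a Topic",
    date=2007,
    publisher="entropy house",
    locator="pg 77",
)
```

With these inputs, `book` reproduces the example string above. A website citation could simply omit `publisher` and pass a URL as `locator`.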
Closing GitHub issues, as we got a brand new website (with discussions) to play with: http://thebigdb.com/ !