UW-xDD/app-template

Will this template force users to use Postgres? Or can it be designed so that users can work from a TSV dump of the sentences table?

Closed this issue · 4 comments

iross commented

The cobalt guys don't do any Postgres stuff with the TSV dumps we've been sending them. They just read/parse the TSV directly. So no, we won't force users to use Postgres. But apps that DO use Postgres should assume that there's a table called [appname]_sentences magically there and waiting for them. For these apps, there should be a setup.py that reads the sample TSV files into the [appname]_sentences table, which run.py will then read.

(edit) Obviously this is far from set in stone. The reason I'm thinking of it this way is that my gut feeling is that it'd be wasteful to do a (SELECT subset FROM Postgres master table) -> (TSV dump) -> (Postgres ingest), since the dumping and re-reading could be pretty slow and looks unnecessary to me.
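
For illustration, here is a minimal sketch of what such a setup.py step might look like. It is not the template's actual code. The function names, the generic col0..colN column names, and the n_columns parameter are all hypothetical, because the thread does not give the sentences schema. To keep the sketch runnable without a database server, sqlite3 stands in for Postgres; a real app would swap in a Postgres connection and that driver's placeholder style. Apps that read the TSV directly would only need the first function.

```python
import csv
import sqlite3


def read_sentences(tsv_path):
    """Yield each non-blank TSV line as a list of column strings."""
    with open(tsv_path, newline="", encoding="utf-8") as fid:
        # QUOTE_NONE keeps quote characters in sentence text as literal data.
        reader = csv.reader(fid, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if row:  # the csv module returns [] for blank lines
                yield row


def load_sentences(conn, app_name, rows, n_columns):
    """Create <app_name>_sentences if needed and bulk-insert the rows.

    Hypothetical helper: column names and count are placeholders, and
    sqlite3 is a stand-in for Postgres in this sketch.
    """
    if not app_name.isidentifier():
        raise ValueError("app_name must be a plain identifier")
    table = f"{app_name}_sentences"
    columns = ", ".join(f"col{i} TEXT" for i in range(n_columns))
    placeholders = ", ".join("?" for _ in range(n_columns))
    with conn:  # commit on success, roll back on error
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
        conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    return table
```

A run.py in this sketch would then query the returned table name rather than touching the TSV files again.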

Ah - I think I see now. So will all apps also have a directory called ./input that has the same data products as you have been sending folks (but at the full scale, in cases where the test_set < full_set)? In other words, would this code work: "with open('./input/bibjson') as fid"?

(Emailed reply to Ian Ross's comment of Mar 2, 2016, 9:48 AM.)

iross commented

Yup, that'll work. The bibjson will always be in ./input. And we can certainly dump all the data products there too, if that's what the user (you) wants.
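
As a rough sketch of an app relying on that layout: the snippet below opens ./input/bibjson the way the question proposes. It assumes the file holds JSON, which the thread does not state outright, and the load_bibjson helper and INPUT_DIR constant are hypothetical names.

```python
import json
from pathlib import Path

# Location confirmed in the thread; the constant name is illustrative.
INPUT_DIR = Path("./input")


def load_bibjson(input_dir=INPUT_DIR):
    """Parse the bibjson file in input_dir (assumed to contain JSON)."""
    with open(Path(input_dir) / "bibjson", encoding="utf-8") as fid:
        return json.load(fid)
```

Because the path is relative, the same call should work unchanged wherever an ./input directory is provided alongside the app.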

Yes, I think that makes a lot of sense, if it's not too much trouble. In other words, the environment that we (i.e. you) will create for the full app deployment will be as similar to the dev environment as possible.

(Emailed reply to Ian Ross's comment of Mar 2, 2016, 10:47 AM.)