swcarpentry/good-enough-practices-in-scientific-computing

Clarify 'make each row an observation'

Closed this issue · 12 comments

Make each row an observation: Data often comes in a wide format, because that facilitated data entry or human inspection. Imagine one row per field site and then columns for measurements made at each of several time points. Be prepared to gather such columns into a variable of measurements, plus a new variable for time point.

I don't really understand that last sentence...

Are we going to have any diagrams? Because something like this would help immensely here. This is borrowed from slides by @garrettgman and I suspect we can either get permission to use it or some equivalent.

[Screenshot: the same data shown in two layouts, gathered on the left and wide on the right]

The right-hand version here is what I mean by "wide". The left-hand version is what I mean by "gather into a variable of measurements, plus a new variable for time point" (year, in the case of the example). But we could make the diagram and the wording match exactly, obviously.
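To make the "gather" step concrete, here is a minimal sketch in plain Python. The site names, years, and values are made up for illustration, and `gather` is a hypothetical helper written for this example, not something from the paper or the slides.

```python
# Hypothetical "wide" data: one row per field site, one column per year.
wide = [
    {"site": "A", "2014": 3.1, "2015": 3.4},
    {"site": "B", "2014": 2.7, "2015": 2.9},
]


def gather(rows, id_col, var_name, value_name):
    """Turn every non-id column into its own row (wide -> long)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_col:
                continue  # the id column is repeated on each new row instead
            long_rows.append(
                {id_col: row[id_col], var_name: column, value_name: value}
            )
    return long_rows


# Each row of `tidy` is now one observation: a site, a year, a measurement.
tidy = gather(wide, id_col="site", var_name="year", value_name="measurement")
for row in tidy:
    print(row)
```

The two year columns become a `year` variable and a `measurement` variable, so two wide rows turn into four long ones. In practice a tool such as tidyr's `gather` or `pandas.melt` would do this job; the loop is only meant to show what the reshaping does.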

Lovin' it! I would swap the figures, though. Are more examples possible? I previously suggested looking at Data Carpentry's spreadsheet lesson, in particular http://www.datacarpentry.org/spreadsheet-ecology-lesson/01-format-data.html (and maybe http://www.datacarpentry.org/spreadsheet-ecology-lesson/02-common-mistakes.html).

Are there other diagrams or potential diagrams? If we're good with the general idea, I'll take responsibility for making or getting something like the above (yes @lexnederbragt you're correct about some optimization still being possible).

Re: the spreadsheet lesson. I just taught from that!

I think it's OK to keep a wide spreadsheet -- sometimes it really is easier for data entry and inspection. And then use your analytical tool to reshape. So I'd like to word this in such a way that either is OK but you need to be aware of the different formats and get facile in some way of moving between them.
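Moving in the other direction is just as mechanical. A minimal sketch in plain Python, assuming made-up site/year values and a hypothetical `spread` helper (neither comes from the paper):

```python
# Hypothetical "long" data: one row per (site, year) observation.
long_rows = [
    {"site": "A", "year": "2014", "measurement": 3.1},
    {"site": "A", "year": "2015", "measurement": 3.4},
    {"site": "B", "year": "2014", "measurement": 2.7},
    {"site": "B", "year": "2015", "measurement": 2.9},
]


def spread(rows, id_col, var_name, value_name):
    """Collect the rows for each id into one wide row (long -> wide)."""
    wide_by_id = {}
    for row in rows:
        # Start a new wide row the first time an id is seen.
        wide_row = wide_by_id.setdefault(row[id_col], {id_col: row[id_col]})
        # The value of the "variable" column becomes a column name.
        wide_row[row[var_name]] = row[value_name]
    return list(wide_by_id.values())


wide = spread(long_rows, id_col="site", var_name="year", value_name="measurement")
for row in wide:
    print(row)
```

This gives back one row per site with a column per year, which is the layout that tends to be easier for data entry and inspection. So the spreadsheet can stay wide and the analysis can still work on the long form.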

Hi guys,

There are more diagrams like Jenny's first example (and Jenny's example as well) sprinkled throughout this online chapter:

http://r4ds.had.co.nz/tidy-data.html

Garrett

Who'd like to push this over the line?

I can complete this.

@garrettgman Is it easy to provide source that would allow me to make diagrams like yours but swap left and right and match an existing example of ours exactly? I would probably make something like this in Keynote, like an animal. But I can certainly do that!

Or ... I could provide excruciating detail on what I'd like and you could make one diagram for us? We'd of course acknowledge your kind assistance.

But I can cope regardless.

Thanks @jennybc. I just sent the keynote to you via email.

Thanks @garrettgman, my fellow Keynote animal. At least I'll know exactly how to edit it!

Oh @gvwilson I just noticed I failed to snip out just the bit you need from that PNG. Are you in a position to do that or shall I?

I'm not sure what "just the bit you" is - can you please do that and go ahead and push the change directly (no need for a PR)?

@gvwilson OK done.