Quotes related to data science

Data Science quotes

File organization and naming are powerful weapons against chaos.

@JennyBryan

Dear past-Hadley: PLEASE COMMENT YOUR CODE BETTER. Love present-Hadley

@hadleywickham tweet

The attached is similar to the code we used.

— anonymous

Your closest collaborator is you six months ago, but you don't reply to emails.

@gonuke, after @kcranstn tweet quoting @mtholder

I will let the data speak for itself when it cleans itself.

@AllisonReichel tweet

If you can't make changes because you're afraid of breaking something, it's already broken.

Kara Woo, via @franciscoyira@techhub.social toot

Never check data when you are hungry, thirsty, or tired.

@GhazalGulati, via @AmeliaMN tweet

One accurate measurement is worth a thousand expert opinions.

— Grace Hopper, via @NMatasci tweet

Pick a license, any license.

@codinghorror blog post

If it hurts, do it more often.

@martinfowler, via @JennyBryan tweet

You won’t write tests, because they feel like make work, and then you’ll make yourself very sad, and so you’ll start writing tests. As far as I can tell, everyone does this.

@christinacaci blog post

Installation/setup/whatever is always harder and much more poorly documented than mere usage.

@JennyBryan tweet

Open source isn’t free like sunshine. It’s free like a puppy.

@sarahnovotny, via @bridgetkromhout tweet

Thou shalt get only as creative with names as thy own skill with regular expressions.

@JennyBryan tweet

The most important tool for Reproducible Research is the mindset, when starting, that the end product will be reproducible.

Keith Baggerly, via @kwbroman tweet

Big Data: (n): the belief that a big enough pile of horseshit will, with probability one, somewhere contain a pony.

@mlipsitch, via @callin_bull tweet

Working with data is not about rules to follow but about decisions to make.

@naupakaz, via @kwbroman tweet

I'm not worried about being scooped, I'm worried about being ignored.

@magnusnordborg, via @BaxterTwi tweet

needs more pvalue

@mikelove tweet

Batch effects are important and they will bollocks you up.

Keith Baggerly, via @kwbroman tweet

Classroom data are like teddy bears; real data are like a grizzly with salmon blood dripping out its mouth.

@JennyBryan, via @sgrifter tweet

like asking how to extract chocolate from meatloaf

@voovarb tweet

It's not that we don't test our code, it's that we don't store our tests so they can be re-run automatically.

@hadleywickham, testthat article

If you use software that lacks automated tests, you are the tests.

@JennyBryan tweet

and I’m still pretty sure some of the data is missing, but it could still be here, in this ONE HUNDRED SHEET excel file

@RallidaeRule tweet

Teach stats as you would cake baking: make a few before you delve into the theory of leavening agents.

@JennyBryan tweet after Joan Strassmann blog post

In theory there is no difference between theory and practice. In practice there is.

— Attributed to Jan L. A. van de Snepscheut; often misattributed to Yogi Berra

The opposite of “open” isn’t “closed”. The opposite of “open” is “broken”.

— John Wilbanks, "The unreasonable effectiveness of open data" (pdf slides)

Let's start the "titanic data" movement. Data too big to fail.

@neilfws with assist from @aaronquinlan, tweet

Behind every wildly successful tool there’s probably a very powerful abstraction.

@JennyBryan tweet

R is a datasmith's heaven-on-earth; I like Python, long term relationship with Excel, quite like Power Query, DAX's a keeper, but I love R.

@tggleeson tweet

Data cleaning code cannot be clean. It's a sort of sin eater.

@StatFact tweet

Well begun is half done.

— Attributed to Aristotle and Mary Poppins

You shouldn't feel ashamed about your code - if it solves the problem, it's perfect just the way it is. But also, it could always be better.

@hadleywickham, via @allimoberger tweet

Last week I told a collaborator to stick the files on a USB drive and walk the 100m across the road rather than figure out inter-institute file sharing.

@PeteHaitch tweet

I don't want it perfect. I want it Thursday.

I. Jack Good, via @SherlockPHolmes, via @kwbroman tweet

Le mieux est l’ennemi du bien (Perfect is the enemy of good)

Voltaire, via @SherlockPHolmes

The usethis package implements this important principle: Automate that which can be automated. Your computer was literally born to implement rote-but-fussy stuff for you.

@JennyBryan tweet

Of course someone has to write loops. It doesn’t have to be you.

@JennyBryan slides

that moment when you feel a small surge of satisfaction that something has gone right is the moment to commit

the message says why you're happy

@Corey_Yanofsky tweet