compute (now GNU Datamash)
GNU Datamash is a command-line program that performs textual, numerical, and statistical operations on text files.
What are the sum and mean of the values in field 1?
$ seq 10 | datamash sum 1 mean 1
55 5.5
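To make explicit what the two operations compute, here is a minimal cross-check that uses only awk, so it runs even without datamash installed. The helper name sum_mean is hypothetical and not part of datamash; the tab between the two results assumes datamash's default tab-separated output.

```shell
# Hypothetical helper (not part of datamash): sum and mean of field 1.
sum_mean() {
  # Accumulate the sum and the line count, then print "sum<TAB>mean".
  awk '{ sum += $1; n++ } END { if (n > 0) printf "%g\t%g\n", sum, sum / n }'
}

# Same input as the datamash example above.
seq 10 | sum_mean
```

This is only a sketch for checking results by hand; datamash itself offers many more operations and handles headers, missing values, and grouping.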
Given a file with three columns (Name, College Major, Score), what is the average score for each college major?
$ cat scores.txt
John Life-Sciences 91
Dilan Health-Medicine 84
Nathaniel Arts 88
Antonio Engineering 56
Kerris Business 82
...
Sort the input, group by column 2, and calculate the mean of column 3:
$ datamash --sort --group 2 mean 3 < scores.txt
Arts 68.9474
Business 87.3636
Health-Medicine 90.6154
Social-Sciences 60.2667
Life-Sciences 55.3333
Engineering 66.5385
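The grouping step can be sketched on a tiny input to show what "group by column 2, mean of column 3" produces. The three rows below are made-up placeholder data, not taken from scores.txt, and group_mean is a hypothetical awk-based helper used so that the sketch runs without datamash.

```shell
# Hypothetical helper (not part of datamash): mean of field 3 per value of field 2.
group_mean() {
  # Keep a running sum and count per group, print "group<TAB>mean" for each,
  # then sort the groups by name in a locale-independent order.
  awk '{ sum[$2] += $3; n[$2]++ }
       END { for (k in sum) printf "%s\t%g\n", k, sum[k] / n[k] }' | LC_ALL=C sort
}

# Placeholder rows (name, group, value), invented for illustration only.
group_mean <<'EOF'
a X 10
b X 20
c Y 30
EOF

# With datamash installed, a comparable call on whitespace-separated input
# would likely be (-W treats runs of spaces or tabs as the field delimiter):
#   datamash -W --sort --group 2 mean 3
```

Group X averages 10 and 20, and group Y has a single value, so the helper prints one line per group with its mean.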
- GitHub Mirror: http://github.com/agordon/datamash
- Main website: http://www.gnu.org/s/datamash
- Galaxy Tool Demo: http://computedemo.teamerlich.org/ (GNU Datamash is available on the Galaxy ToolShed).
- Download and installation: http://www.gnu.org/software/datamash/download/
- Send questions and bug reports to the GNU Datamash mailing list: bug-datamash@gnu.org (subscribe or search the archive of previous discussions).
Copyright (C) 2014 Assaf Gordon (assafgordon@gmail.com)
GPLv3 or later