AwkLib is a small Node.js library for processing record-oriented files such as CSV.
npm install awklib
Once the library is installed, you can start using it by loading it with the require function.
/*
sample.csv
xyz,abc,10
abc,abc,10
*/
var AwkLib = require("awklib");
var options = {
  files: [ "sample.csv" ],
  columns: ["first_name", "last_name", "age" ],
  header: false,
  delimiter: ','
};
var p = new AwkLib(options);
p.on("begin", function(obj){
  this.s.totalAge = 0;
});
p.on("line", function(obj){
  this.s.totalAge += parseInt(obj.crow.age, 10); // crow => current row.
});
p.on("end", function(obj){
  console.log("totalAge: " + this.s.totalAge); // output will be 20
});
p.process();
The options object is made up of four properties:
- files
An array of the file names you want to process.
- columns
An array of column names used to identify the fields in each line of the file. If you do not specify any column names, you can pick fields by their position: $0 for the first field, $1 for the second, $2 for the third, and so on.
- header
Indicates whether the file contains a header line. If set to true, the header line is skipped and no line event is fired for it. The default is false.
- delimiter
The character that separates fields in each line of the file. The default is a space.
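To make the columns and delimiter options concrete, here is a minimal sketch of how a line could be turned into a row object. The toRow helper is hypothetical and is not part of AwkLib; it only illustrates the naming scheme described above (column names when given, positional keys $0, $1, ... otherwise). How AwkLib builds its rows internally may differ.

```javascript
// Hypothetical helper, NOT part of AwkLib: splits a line on the delimiter
// and keys each field by column name or, failing that, by position.
function toRow(line, delimiter, columns) {
  var fields = line.split(delimiter);
  var row = {};
  fields.forEach(function (value, i) {
    // Use the column name when one is given, otherwise a positional key.
    var key = columns && columns[i] !== undefined ? columns[i] : "$" + i;
    row[key] = value;
  });
  return row;
}

var named = toRow("xyz,abc,10", ",", ["first_name", "last_name", "age"]);
var positional = toRow("india,100", ",");

console.log(named.age);      // field looked up by column name
console.log(positional.$1);  // second field looked up by position
```

Note that fields come back as strings, which is why the examples in this document pass them through parseInt before adding them up.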
Once the options object is built, you can use it to instantiate an AwkLib object. The options object is required to instantiate an AwkLib object. An AwkLib object is an EventEmitter and emits the following three events.
- begin
This event is emitted once before the first line event for each file.
- line
This event is emitted whenever a line is seen.
- end
This event is emitted when processing a file is finished.
Each of these event callbacks receives an object that is local to the file currently being processed and is shared across all three events for that file. This local object contains a property called crow, which represents the row being processed. You can reference columns or fields through this crow object. You can also set your own properties on the local object to gather per-file statistics such as totalAge, totalPopulation, totalRunsScored, or an average.
There is one more object called s (shared), which is visible across all files and is a property of the AwkLib object. You can use it to gather statistics across all the files.
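The difference between the per-file local object and the shared s object can be sketched with Node's built-in EventEmitter. The runSketch function below is a hypothetical stand-in for AwkLib that reads in-memory arrays instead of files; the file names and data are made up for illustration, and this is not AwkLib's actual implementation.

```javascript
var EventEmitter = require("events");

// Hypothetical stand-in for AwkLib, driven by in-memory data instead of files.
function runSketch(files, emitter) {
  emitter.s = {};                       // shared object: lives across all files
  Object.keys(files).forEach(function (name) {
    var local = { currentFile: name };  // local object: created fresh per file
    emitter.emit("begin", local);
    files[name].forEach(function (line) {
      var parts = line.split(",");
      local.crow = { $0: parts[0], $1: parts[1] };  // current row
      emitter.emit("line", local);
    });
    emitter.emit("end", local);
  });
}

var p = new EventEmitter();
var perFile = {};

p.on("begin", function (obj) {
  if (this.s.total === undefined) this.s.total = 0;  // initialise only once
  obj.count = 0;                                     // reset for every file
});
p.on("line", function (obj) {
  var n = parseInt(obj.crow.$1, 10);
  obj.count += n;     // per-file running sum
  this.s.total += n;  // running sum across all files
});
p.on("end", function (obj) {
  perFile[obj.currentFile] = obj.count;
});

runSketch({ "a.csv": ["x,1", "y,2"], "b.csv": ["z,3"] }, p);
console.log(perFile, p.s.total);
```

Because the local object is recreated for each file, obj.count starts from zero again in b.csv, while this.s.total keeps accumulating across both files.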
/*
population files.
asia.csv
india,100
china,200
srilanka,50
europe.csv
russia,70
germany,60
*/
var AwkLib = require("awklib");
var options = {
  files: [ "asia.csv", "europe.csv" ],
  header: false,
  delimiter: ','
};
var p = new AwkLib(options);
p.on("begin", function(obj){
  // this.s is shared across all files and across all events [begin, line, end].
  if(this.s["totalPopulation"] === undefined)
    this.s.totalPopulation = 0;
  // obj is local to the file currently being processed and is shared across its events [begin, line, end].
  obj.cpopulation = 0;
});
p.on("line", function(obj){
  var population = parseInt(obj.crow.$1, 10); // $1 => second field.
  obj.cpopulation += population;
  this.s.totalPopulation += population; // add each row to the shared total exactly once.
});
p.on("end", function(obj){
  // obj.currentFile holds the name of the file currently being processed.
  console.log("Total population for file: " + obj.currentFile + ": " + obj.cpopulation);
  if(obj.currentFile === "europe.csv")
    console.log("Total Population: " + this.s.totalPopulation);
});
p.process();