/csv-tool

Primary LanguageC++

HOW TO BUILD THE PROJECT:
-----------------------------

The source was built using gcc version 4.8.5, using g++ and -std=c++11 enabled.
To build the project one can simply use the Makefile and run make on the terminal.
BUILD COMMAND:
	make

PURPOSE:
------------------------------

This is a tool that serves the following purpose:
- Parses a CSV file, as per these rules: https://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules, and stored them as a Table. It assumes a valid CSV file, and does not check for it's validity while parsing.
- Select columns from a particular file, and view the stored table. For eg: '*' lets you view all columns of table
- Get statistic on a column from a table that contains numeric data - get median, max, min, average. (It does not take the null rows of the column into the statistic shown) 
- Perform an arithmetic operation on two columns from the same table, and output the result as a new column.
- Sort the rows of a table based on a particular column in ascending order
- Perform a join - inner, left outer, right outer, full outer - on two tables based on a join predicate or the ON clause in sql (eg: table1.col1 == table2.col2)
- Interactive tool, that could add a bunch of tables to the tool, remove table from the tool, and interactively builds the query that needs to be processed.


DESIGN OF PROJECT:
------------------------------

Tool.cpp Options.h: 
Holds the iteractive or client part of the tool which interactively builds the query or stores pointers to the backend tables.

Table.h Table.cpp:
Holds the class that parses a table, stores the table, builds metadata on top of columns when required and caches them for later use.The operations to perform joins and arithmetic ops are specified in this class. Joins -> use the sort, merge/join algorithm => complexity O(nlogn + mlogm) for sorting and  O(nm) for the join [Overall O(nm)] => where n and m are the rows of table1 and table2. 

Metadata.h Metadata.cpp:
Holds the class that defines the metadata for a column which could possibly sort them, and compare them. It also provides statistic over that column

utility.h:
Some utility functions, that contain the merge sort routine as well.


HOW TO RUN:
-------------------------------

As this is a single tool, after building the project, one can run it using the executable tool, and this is an interactive menu driven application.

RUN COMMAND:
	./tool

EXAMPLES:
--------------------------------



-----------------------------------------------------------
----------------TOOL MAIN MENU-----------------------------

Choose from the following options:
1 add new table
2 view table - (you could optionally select columns)
3 get table statistics
4 perform arithmetic operations [+,-,*,/] on two columns row-wise
5 perform join [inner, outer-left,right,full] on two tables
6 remove a table
7 sort rows in ascending order based on column values
8 quit application
-----------------------------------------------------------
Enter number (1-8): 1



Enter filename:	tbl1.csv



Does this table contain a header?(y/n) n



Table ID: 1	Filename: tbl1.csv


-----------------------------------------------------------
----------------TOOL MAIN MENU-----------------------------

Choose from the following options:
1 add new table
2 view table - (you could optionally select columns)
3 get table statistics
4 perform arithmetic operations [+,-,*,/] on two columns row-wise
5 perform join [inner, outer-left,right,full] on two tables
6 remove a table
7 sort rows in ascending order based on column values
8 quit application
-----------------------------------------------------------
Enter number (1-8): 2



-----------------------------------------------------------
List of all existing tables:
Total number of tables: 1
Table ID: 1	Filename: tbl1.csv
-----------------------------------------------------------
==>	-----------------------------------------------------------
==>	Enter table id:
==>	-----------------------------------------------------------
==>	1
==>	-----------------------------------------------------------



==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter space-separated column number or * for all: (ex: 1 2 3 or *)
==>	-----------------------------------------------------------
==>	*
Output result to file? (y/n) n
Rafferty	31	
Jones	33	
Heisenberg	33	
Robinson	34	
Smith	34	
Williams	[NULL]	


-----------------------------------------------------------
----------------TOOL MAIN MENU-----------------------------

Choose from the following options:
1 add new table
2 view table - (you could optionally select columns)
3 get table statistics
4 perform arithmetic operations [+,-,*,/] on two columns row-wise
5 perform join [inner, outer-left,right,full] on two tables
6 remove a table
7 sort rows in ascending order based on column values
8 quit application
-----------------------------------------------------------
Enter number (1-8): 1



Enter filename:	tbl2.csv



Does this table contain a header?(y/n) n



Table ID: 2	Filename: tbl2.csv


-----------------------------------------------------------
----------------TOOL MAIN MENU-----------------------------

Choose from the following options:
1 add new table
2 view table - (you could optionally select columns)
3 get table statistics
4 perform arithmetic operations [+,-,*,/] on two columns row-wise
5 perform join [inner, outer-left,right,full] on two tables
6 remove a table
7 sort rows in ascending order based on column values
8 quit application
-----------------------------------------------------------
Enter number (1-8): 2



-----------------------------------------------------------
List of all existing tables:
Total number of tables: 2
Table ID: 2	Filename: tbl2.csv
Table ID: 1	Filename: tbl1.csv
-----------------------------------------------------------
==>	-----------------------------------------------------------
==>	Enter table id:
==>	-----------------------------------------------------------
==>	2
==>	-----------------------------------------------------------



==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter space-separated column number or * for all: (ex: 1 2 3 or *)
==>	-----------------------------------------------------------
==>	2  
Output result to file? (y/n) n
Sales	
Engineering	
Clerical	
Marketing	


-----------------------------------------------------------
----------------TOOL MAIN MENU-----------------------------

Choose from the following options:
1 add new table
2 view table - (you could optionally select columns)
3 get table statistics
4 perform arithmetic operations [+,-,*,/] on two columns row-wise
5 perform join [inner, outer-left,right,full] on two tables
6 remove a table
7 sort rows in ascending order based on column values
8 quit application
-----------------------------------------------------------
Enter number (1-8): 3



-----------------------------------------------------------
List of all existing tables:
Total number of tables: 2
Table ID: 2	Filename: tbl2.csv
Table ID: 1	Filename: tbl1.csv
-----------------------------------------------------------
==>	-----------------------------------------------------------
==>	Enter table id:
==>	-----------------------------------------------------------
==>	1
==>	-----------------------------------------------------------



Enter the column id(s) to get a statistic on: 
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter space-separated column number or * for all: (ex: 1 2 3 or *)
==>	-----------------------------------------------------------
==>	2



Output result to file? (y/n) n



-----------------------------------------------------------
Stats for column: 2
-----------------------------------------------------------
Median: 33
Min: 31
Max: 34
Average: 33
-----------------------------------------------------------


-----------------------------------------------------------
----------------TOOL MAIN MENU-----------------------------

Choose from the following options:
1 add new table
2 view table - (you could optionally select columns)
3 get table statistics
4 perform arithmetic operations [+,-,*,/] on two columns row-wise
5 perform join [inner, outer-left,right,full] on two tables
6 remove a table
7 sort rows in ascending order based on column values
8 quit application
-----------------------------------------------------------
Enter number (1-8): 4



-----------------------------------------------------------
List of all existing tables:
Total number of tables: 2
Table ID: 2	Filename: tbl2.csv
Table ID: 1	Filename: tbl1.csv
-----------------------------------------------------------
==>	-----------------------------------------------------------
==>	Enter table id:
==>	-----------------------------------------------------------
==>	1
==>	-----------------------------------------------------------



Select column 1:
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter column number: (ex: 1)
==>	-----------------------------------------------------------
==>	1



Operator (ex: + or - or * or /):
==>	-----------------------------------------------------------
==>	+



Select column 2:
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter column number: (ex: 1)
==>	-----------------------------------------------------------
==>	1



Output result to file? (y/n) n



==>	Err: Column is not numeric!
-----------------------------------------------------------


-----------------------------------------------------------
----------------TOOL MAIN MENU-----------------------------

Choose from the following options:
1 add new table
2 view table - (you could optionally select columns)
3 get table statistics
4 perform arithmetic operations [+,-,*,/] on two columns row-wise
5 perform join [inner, outer-left,right,full] on two tables
6 remove a table
7 sort rows in ascending order based on column values
8 quit application
-----------------------------------------------------------
Enter number (1-8): 4



-----------------------------------------------------------
List of all existing tables:
Total number of tables: 2
Table ID: 2	Filename: tbl2.csv
Table ID: 1	Filename: tbl1.csv
-----------------------------------------------------------
==>	-----------------------------------------------------------
==>	Enter table id:
==>	-----------------------------------------------------------
==>	1
==>	-----------------------------------------------------------



Select column 1:
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter column number: (ex: 1)
==>	-----------------------------------------------------------
==>	2



Operator (ex: + or - or * or /):
==>	-----------------------------------------------------------
==>	+



Select column 2:
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter column number: (ex: 1)
==>	-----------------------------------------------------------
==>	2



Output result to file? (y/n) n



62
66
66
68
68
[NULL]


-----------------------------------------------------------
----------------TOOL MAIN MENU-----------------------------

Choose from the following options:
1 add new table
2 view table - (you could optionally select columns)
3 get table statistics
4 perform arithmetic operations [+,-,*,/] on two columns row-wise
5 perform join [inner, outer-left,right,full] on two tables
6 remove a table
7 sort rows in ascending order based on column values
8 quit application
-----------------------------------------------------------
Enter number (1-8): 7



-----------------------------------------------------------
List of all existing tables:
Total number of tables: 2
Table ID: 2	Filename: tbl2.csv
Table ID: 1	Filename: tbl1.csv
-----------------------------------------------------------
==>	-----------------------------------------------------------
==>	Enter table id:
==>	-----------------------------------------------------------
==>	1
==>	-----------------------------------------------------------



Select columns to output from table:
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter space-separated column number or * for all: (ex: 1 2 3 or *)
==>	-----------------------------------------------------------
==>	*



Select col from table based on which you want to sort rows on:
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter column number: (ex: 1)
==>	-----------------------------------------------------------
==>	2



Output result to file? (y/n) n



Williams	[NULL]	
Rafferty	31	
Jones	33	
Heisenberg	33	
Robinson	34	
Smith	34	


-----------------------------------------------------------
----------------TOOL MAIN MENU-----------------------------

Choose from the following options:
1 add new table
2 view table - (you could optionally select columns)
3 get table statistics
4 perform arithmetic operations [+,-,*,/] on two columns row-wise
5 perform join [inner, outer-left,right,full] on two tables
6 remove a table
7 sort rows in ascending order based on column values
8 quit application
-----------------------------------------------------------
Enter number (1-8): 5



Enter left table:
-----------------------------------------------------------
List of all existing tables:
Total number of tables: 2
Table ID: 2	Filename: tbl2.csv
Table ID: 1	Filename: tbl1.csv
-----------------------------------------------------------
==>	-----------------------------------------------------------
==>	Enter table id:
==>	-----------------------------------------------------------
==>	1
==>	-----------------------------------------------------------



Select columns to output from table1:
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter space-separated column number or * for all: (ex: 1 2 3 or *)
==>	-----------------------------------------------------------
==>	*  



Enter right table:
-----------------------------------------------------------
List of all existing tables:
Total number of tables: 2
Table ID: 2	Filename: tbl2.csv
Table ID: 1	Filename: tbl1.csv
-----------------------------------------------------------
==>	-----------------------------------------------------------
==>	Enter table id:
==>	-----------------------------------------------------------
==>	2
==>	-----------------------------------------------------------



Select columns to output from table2:
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter space-separated column number or * for all: (ex: 1 2 3 or *)
==>	-----------------------------------------------------------
==>	2



JOIN TYPE (1: inner or 2: left outer or 3: right outer or 4: full outer): (ex: 1 or 2 or 3 or 4)
==>	-----------------------------------------------------------
==>	2



Enter JOIN predicate columns: ON col1 == col2, where col1 is from table 1 and col2 is from table 2
Select col1 from table1:
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter column number: (ex: 1)
==>	-----------------------------------------------------------
==>	2
Select col2 from table2:
==>	-----------------------------------------------------------
==>	Select columns: 
==>	Number of columns in table: 2
==>	Enter column number: (ex: 1)
==>	-----------------------------------------------------------
==>	1



Output result to file? (y/n) n



Rafferty	31	Sales	
Jones	33	Engineering	
Heisenberg	33	Engineering	
Robinson	34	Clerical	
Smith	34	Clerical	
Williams	[NULL]	[NULL]	


-----------------------------------------------------------
----------------TOOL MAIN MENU-----------------------------

Choose from the following options:
1 add new table
2 view table - (you could optionally select columns)
3 get table statistics
4 perform arithmetic operations [+,-,*,/] on two columns row-wise
5 perform join [inner, outer-left,right,full] on two tables
6 remove a table
7 sort rows in ascending order based on column values
8 quit application
-----------------------------------------------------------
Enter number (1-8): 8