When you work with spreadsheets and casual users things do not always go as you want.
I worked on a project where users have to upload a spreadsheet with a lot of data about clients purchases. After 3 upload they call me because the system "did not work well".
The files look like these followiong.
File 1:
Name | Surname | Emails |
---|---|---|
John | Smith | john.smith@gmail.com |
John | Doe | john.doe@hotmail.com |
File 2:
Surname | Name | |
---|---|---|
Smith | John | john.smith@gmail.com |
Doe | John | john.doe@hotmail.com |
File 3:
Nombre | Apellido | |
---|---|---|
john.smith@gmail.com | John | Smith |
john.doe@hotmail.com | John | Doe |
3 files, 3 different structures. 3 different headers. WTF! How can i guess the position of the columns i'm looking for?
This gem try to solve the problem
Add to your Gemfile and run the bundle
command to install it.
gem "columns-matcher"
N.B. Requires Ruby 1.9.2 or later.
require "columns-matcher"
@matcher = ColumnsMatcher::Matcher.new
# the column that contains the name can be lebeled with "NAME", "NOME" or "NOMBRE"
@matcher.add_column("name", ["NAME", "NOME", "NOMBRE"])
# the column that contains the surname can be lebeled with "SURNAME", "COGNOME" or "APELLIDOS"
@matcher.add_column("cognome", ["SURNAME", "COGNOME", "APELLIDOS"])
# We suppose the header is ["COGNOME", "NOME", "INDIRIZZO"]
@matcher.set_header(header_loaded_from_spreadsheet)
@matcher.column_of("name") # return 1
@matcher.column_of("cognome") # return 0
First try is as exact match. If does not work it try with different case:
require "columns-matcher"
@matcher = ColumnsMatcher::Matcher.new
# the column that contains the name can be lebeled with "NAME", "NOME" or "NOMBRE"
@matcher.add_column("name", ["name", "nome", "nombre"])
# We suppose the header is ["APELLIDO", "NOMBRE", "ADDRESS"]
@matcher.set_header(header_loaded_from_spreadsheet)
@matcher.column_of("name") # return 1
If I can't find the column with exact match or different case match i can also use reg exp
require "columns-matcher"
@matcher = ColumnsMatcher::Matcher.new
# the column that contains the name can be lebeled with "NAME", "NOME" or "NOMBRE"
@matcher.add_column("email", ["[Ee]?[\-]*mail[s]*"])
# We suppose the header is ["Surname", "Name", "Emails"]
@matcher.set_header(header_loaded_from_spreadsheet)
@matcher.column_of("email") # return 2