MarkPflug/Sylvan.Data.Excel

change Excel schema when header has special character

IngridtRodriguez opened this issue · 2 comments

Been trying to change the schema of an excel file. I know the schema I'm suppose the get but for some reason, one of my files has a header with a line break.

File example:
"ID\n"|"Date"
252094|2021-01-26

Tried:

new ExcelSchema(true, Schema.Parse("ID\n>ClientId,Date>FromDate:datetime"));
new ExcelSchema(true, Schema.Parse("ID\\n>ClientId,Date>FromDate:datetime"));

Pd: The file can change column order at any given time. I just know the column names

The Schema.Parse method needs a bit of work. It doesn't have any mechanism for character escaping. I've been aware of this shortcoming, but haven't bothered to fix it yet. I'd probably go the route of using C# style escape sequences when I do get around to implementing it. This would allow you to write something like "ID\n>ClientId" (notice the double backslash). Might consider URL encoding instead, not sure.

As a workaround for now, you can use the Schema.Builder type to create the Schema instance:

using Sylvan.Data;
using Sylvan.Data.Excel;

var schema =
    new Schema.Builder()
        .Add("Id\n", "ClientId", typeof(string))
        .Add("Date", "FromDate", typeof(DateTime))
        .Build();

var excelSchema = new ExcelSchema(true, schema);


static class Ex
{
    // extension method to make building the schema a bit terser.
    public static Schema.Builder Add(this Schema.Builder builder, string? baseName, string name, Type type)
    {
        return
        builder.Add(
            new Schema.Column.Builder()
            {
                BaseColumnName = baseName,
                ColumnName = name,
                DataType = type
            }
        );
    }
}

I tested the workaround. It's perfect. Thank you