MrPowers/chispa

Give user control to customize output formatting

MrPowers opened this issue · 2 comments

As noted in this pull request (#68), we want to give the user the ability to control the formatting of the output.

The formatting should be easy to configure for a given test and also easy to set globally for the entire test suite.

Here are the main concepts we want to model:

  • formatting for matched rows, mismatched rows, matched cells, and mismatched cells
  • reprinting the DataFrame columns that don't match (for wide DataFrame comparisons). See this PR: #48.
  • displaying the diff DataFrame. See this PR: #35.

The formatting should let the user configure color, underline, and bold.
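On terminals that understand ANSI escape sequences, color, underline, and bold can all be combined into a single escape prefix. Here is a minimal sketch of that idea. The `apply_styles` helper and the `ANSI_CODES` table are hypothetical names for illustration, not part of chispa's API, and the list of styles is just the ones mentioned in this issue.

```python
# Hypothetical style table: maps style names to ANSI SGR parameter codes.
# Only the styles mentioned in this issue are listed; a real list would be longer.
ANSI_CODES = {
    "bold": "1",
    "underline": "4",
    "red": "31",
    "blue": "34",
    "white": "37",
}
RESET = "\033[0m"


def apply_styles(text, styles):
    """Wrap text in ANSI escape codes for the given style name(s).

    `styles` may be a single name ("blue") or a list (["red", "bold"]).
    """
    if isinstance(styles, str):
        styles = [styles]
    codes = ";".join(ANSI_CODES[name] for name in styles)
    if not codes:
        # No styles requested: return the text untouched.
        return text
    return f"\033[{codes}m{text}{RESET}"


print(apply_styles("mismatched row", ["red", "bold"]))
print(apply_styles("matched row", "blue"))
```

Accepting either a string or a list keeps the single-style case short while still allowing combinations.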

These settings should be globally applicable to all the interfaces in the project including schema comparisons, DataFrame comparisons, StructField comparisons, and column comparisons.

Something like this could work:

{
  "mismatched_rows": ["red", "bold"],
  "matched_rows": "blue",
  "mismatched_cells": ["white", "underline"],
  "print_dif": True,
  "print_mismatched_cols": True
}

The user should be able to set this globally and then override for a given test (they should be able to partially override).

The user should also be able to ignore this entirely and just rely on the built-in defaults.
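One way the layering could work is to start from the built-in defaults, apply the global settings, then apply the per-test settings, so each layer only needs to name the keys it changes. This is a rough sketch, not a proposed final API: `DEFAULT_FORMATS` and `resolve_formats` are hypothetical names, and the default values are taken from the example above.

```python
# Hypothetical built-in defaults, using the keys proposed in this issue.
DEFAULT_FORMATS = {
    "mismatched_rows": ["red", "bold"],
    "matched_rows": "blue",
    "mismatched_cells": ["white", "underline"],
}


def resolve_formats(global_formats=None, test_formats=None):
    """Merge defaults, global settings, and per-test overrides (later wins)."""
    resolved = dict(DEFAULT_FORMATS)      # copy so the defaults are never mutated
    resolved.update(global_formats or {}) # suite-wide settings
    resolved.update(test_formats or {})   # partial override for a single test
    return resolved


# No configuration at all: the built-in defaults apply.
print(resolve_formats())

# Global setting plus a partial per-test override.
print(resolve_formats(
    global_formats={"matched_rows": ["blue", "bold"]},
    test_formats={"mismatched_cells": "red"},
))
```

A shallow merge like this replaces a key's whole value, so overriding `"mismatched_rows"` swaps the full style list rather than appending to it.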

Hopefully we can make the outputs look good on both Mac and Windows machines.

Let's start with this:

{
  "mismatched_rows": ["red", "bold"],
  "matched_rows": "blue",
  "mismatched_cells": ["white", "underline"],
  "matched_cells": ["blue", "bold"]
}

We will need to document the supported colors and font styles.

Hi there,

Is there any way to turn formatting off or adjust it for Databricks? In Databricks we see raw ANSI escape codes in the output instead of nicely colored results.