twosigma/marbles

Format multiline local repr values better

leifwalsh opened this issue · 2 comments

Is your feature request related to a problem? Please describe.
If a local variable has a value whose repr is multiple lines, the first line is indented by a tab plus the variable name, but the rest of the lines are flush with the left margin of the terminal.

This is particularly annoying for tabular data, where the first line, usually column names, is offset from the data rows.

For example:

Source (/home/leif/git/marbles-demo-bikeshare/bikeshare/test_bikeshare.py):
     49 long_trips = _data[_data['tripduration'] > pd.Timedelta('24h')]
 >   50 self.assertDataFrameEmpty(long_trips)
     51 
Locals:
	long_trips=            tripduration               starttime                stoptime
1495     1 days 13:33:16 2015-01-08 01:06:37.000 2015-01-09 14:39:54.000
4064     1 days 21:25:08 2015-01-17 13:55:59.000 2015-01-19 11:21:08.000
7168     2 days 16:31:59 2015-01-26 17:14:12.000 2015-01-29 09:46:11.000
722      1 days 19:32:50 2015-02-06 15:31:02.000 2015-02-08 11:03:53.000
1388     1 days 02:19:46 2015-02-13 08:04:02.000 2015-02-14 10:23:49.000
5621     6 days 16:32:20 2015-03-17 16:26:54.000 2015-03-24 08:59:15.000
6453   191 days 14:29:48 2015-03-20 02:24:09.000 2015-09-27 16:53:58.000

Describe the solution you'd like
Multiline reprs should be indented uniformly, so that every line of the value lines up with the first.

Describe alternatives you've considered
We could also special-case types like pandas.DataFrame, but handling any value whose repr contains newlines seems better.
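To make the request concrete, here is a rough sketch of the generic approach, not marbles' actual implementation. The helper name `format_local` and the stand-in `Table` class are hypothetical, and using `repr` with a spaced `=` is an assumption about the desired output rather than current behavior. A multiline value starts on its own line and every line gets the same indent:

```python
import textwrap


def format_local(name, value, indent="\t"):
    """Render one 'name = value' entry for the Locals section (hypothetical helper)."""
    text = repr(value)
    if "\n" not in text:
        # Single-line values stay on the same line as the name.
        return f"{indent}{name} = {text}"
    # Multiline repr: start the value on the next line and indent every
    # line one level deeper than the name, so header and rows stay aligned.
    return f"{indent}{name} =\n{textwrap.indent(text, indent * 2)}"


class Table:
    """Stand-in for an object (e.g. a DataFrame) whose repr spans several lines."""

    def __repr__(self):
        return "   a  b\n0  1  2\n1  3  4"


print(format_local("count", 3))
print(format_local("table", Table()))
```

Because this only checks for newlines in the repr, it needs no pandas-specific code and would apply equally to numpy arrays or any user-defined type.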

Additional context
Additionally, values are currently rendered with str, not repr, so strings don't get quotes and don't look like strings. There is also no space around the equals sign separating the name and value.
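A small illustration of why this matters, using plain Python rather than marbles code. The two helper names are hypothetical. With str, an integer and a string holding the same digits render identically, while repr keeps them distinguishable:

```python
def render_with_str(name, value):
    # Mirrors the style described above: str() and no spaces around '='.
    return f"{name}={value!s}"


def render_with_repr(name, value):
    # Alternative style: repr() and spaces around '='.
    return f"{name} = {value!r}"


print(render_with_str("x", 1))     # x=1
print(render_with_str("x", "1"))   # x=1
print(render_with_repr("x", 1))    # x = 1
print(render_with_repr("x", "1"))  # x = '1'
```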

I'm not sure whether we want strings to have quotes or not. What do you think, @thejunglejane?

I think we do want quotes around strings; it will make them look like strings, and also make the locals copy-paste-able.

In the grander scheme of things, I think we'll want to provide type-specific reprs eventually. This also makes me wonder if there's a way to allow users to provide their own type-specific reprs. This could be useful, e.g., if a team wants to adopt a template for annotations. What do you think?

Good, I agree on quotes.

Jupyter has a pluggable display mechanism: for example, an object that implements _repr_html_ is rendered by Jupyter Notebook by calling that method instead of __repr__.

We could introduce a similar _repr_marbles_ hook that teams could use, falling back to standard __repr__.
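A minimal sketch of how such a hook could be looked up, assuming the proposed `_repr_marbles_` name. Nothing here exists in marbles today; `marbles_repr` and the example `Annotation` class are hypothetical:

```python
def marbles_repr(value):
    """Return display text for a local, preferring a _repr_marbles_ hook."""
    # Look the hook up on the type, the way Python resolves special methods,
    # so an instance attribute with the same name is not picked up by accident.
    hook = getattr(type(value), "_repr_marbles_", None)
    if callable(hook):
        return hook(value)
    # No hook defined: fall back to the standard repr.
    return repr(value)


class Annotation:
    """Example of a team-defined type that opts in to the hook."""

    def __init__(self, owner, note):
        self.owner = owner
        self.note = note

    def _repr_marbles_(self):
        return f"Annotation(owner={self.owner!r})\n  note: {self.note}"


print(marbles_repr("plain string"))
print(marbles_repr(Annotation("data-team", "check upstream feed")))
```

Types that don't define the hook are unaffected, so the hook could be introduced without changing existing output.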