scijava/scyjava

org.scijava.table.Table to pandas dataframe mangles text columns

Closed · 2 comments

scyjava mangles text columns when converting an org.scijava.table.Table to a pandas DataFrame. Below is a minimal example that uses test_image.tif from the repo.

import imagej

# initialize ImageJ
ij = imagej.init()
print(f"ImageJ version: {ij.getVersion()}")

compute_stats_script = """
#@ OpService ops
#@ net.imglib2.RandomAccessibleInterval image
#@output stats

statNames = new org.scijava.table.GenericColumn("Statistic")
statValues = new org.scijava.table.DoubleColumn("Value")
addRow = (n, v) -> { statNames.add(n); statValues.add(v.getRealDouble()) }

addRow("geometricMean", ops.stats().geometricMean(image))

stats = new org.scijava.table.DefaultGenericTable()
stats.add(statNames)
stats.add(statValues)
"""

image = ij.io().open('sample-data/test_image.tif')
result = ij.py.run_script("Groovy", compute_stats_script, args={"image" : image})
df = ij.py.from_java(result.getOutput("stats"))
print(df)

This outputs:

                                 Statistic      Value
0  (g, e, o, m, e, t, r, i, c, M, e, a, n)  595.44145

Instead of the string geometricMean, the Statistic cell contains what I assume is a sequence of individual chars.

I've tracked the bug to this line:

df = pd.DataFrame(data).T

This bug has been fixed in c715b93. The example script above now produces:

       Statistic      Value
0  geometricMean  595.44145

The problem came from creating the pandas DataFrame directly from a List of Java objects. Converting the contents of the List to Python objects before creating the DataFrame resolves the issue.
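
For readers who want to see the idea without a JVM, here is a minimal pure-Python sketch of the "convert first, then build the table" approach. It is not the code from c715b93. `FakeJavaString`, `to_python`, and `columns_to_python` are hypothetical names invented for this illustration; `FakeJavaString` only imitates an object that is iterable character by character without being a Python `str`.

```python
class FakeJavaString:
    """Hypothetical stand-in for a Java String proxy.

    It can be iterated character by character, but it is not a Python str.
    """

    def __init__(self, value):
        self._chars = list(value)

    def __iter__(self):
        return iter(self._chars)

    def __len__(self):
        return len(self._chars)

    def __str__(self):
        return "".join(self._chars)


def to_python(value):
    """Convert one cell to a plain Python object (hypothetical helper)."""
    if isinstance(value, FakeJavaString):
        return str(value)
    return value


def columns_to_python(columns):
    """Convert every cell of every column, keyed by column header."""
    return {
        header: [to_python(cell) for cell in cells]
        for header, cells in columns.items()
    }


# Column data as it might look before conversion (assumed layout).
raw = {
    "Statistic": [FakeJavaString("geometricMean")],
    "Value": [595.44145],
}

# Treating the unconverted cell as a generic sequence splits it into chars.
mangled_cell = tuple(raw["Statistic"][0])
print(mangled_cell[:3])  # ('g', 'e', 'o')

# Converting first leaves an ordinary Python str in the cell.
converted = columns_to_python(raw)
print(converted)  # {'Statistic': ['geometricMean'], 'Value': [595.44145]}
```

A dict of plain Python lists like `converted` can then be handed to `pd.DataFrame(...)`, where the text column holds ordinary strings.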