NPE in FrameColumnImpl.schema property
koperagen opened this issue · 2 comments
koperagen commented
The NPE is thrown by this line (FrameColumnImpl.kt:43 in the stack trace below):
values.mapNotNull { it.takeIf { it.nrow > 0 }?.schema() }.intersectSchemas()
I briefly looked into csv.kt and found that only the tryParseImpl method could create a FrameColumn and put a null into it. It still needs to be confirmed that this is actually possible.
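To illustrate the suspected mechanism, here is a minimal sketch of how a null can hide inside a list whose element type is declared non-null, and how an expression shaped like the failing one then throws. Frame, nrow, and all function names here are hypothetical stand-ins, not the library's types, and the "safe" variant is only an assumption about one possible guard, not the project's actual fix.

```kotlin
// Hypothetical stand-in for a data frame; not the library's DataFrame type.
class Frame(val rows: Int)

// Extension property with a non-null receiver, loosely mirroring `nrow`.
val Frame.nrow: Int get() = rows

// Same shape as the failing expression: the lambda assumes `it` is never null.
fun nonEmptyRowCounts(values: List<Frame>): List<Int> =
    values.mapNotNull { it.takeIf { it.nrow > 0 }?.nrow }

// Defensive variant (an assumption, not the actual fix): drop nulls first.
fun nonEmptyRowCountsSafe(values: List<Frame?>): List<Int> =
    values.filterNotNull().mapNotNull { it.takeIf { it.nrow > 0 }?.nrow }

// Builds a list whose static type hides a null element via an unchecked cast.
// Generic type arguments are erased at runtime, so the cast itself succeeds.
@Suppress("UNCHECKED_CAST")
fun listWithHiddenNull(): List<Frame> =
    listOf<Frame?>(Frame(2), null, Frame(0)) as List<Frame>
```

On the JVM, calling nonEmptyRowCounts(listWithHiddenNull()) fails with a NullPointerException as soon as nrow is evaluated on the hidden null, while nonEmptyRowCountsSafe(listWithHiddenNull()) skips it. mapNotNull does not help in the first version because it only filters the lambda's result, and the lambda throws before it returns.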
Another possible cause is the read method itself, which tries to parse the file as JSON, CSV, TSV, Excel, and other formats in turn until one succeeds. So if the file cannot be parsed as CSV, it moves on to the next format and can produce a strange result too.
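The "try each format until one succeeds" pattern can be sketched as follows. This is only an illustration of the general pattern under simplified assumptions: Format, strictCsv, lenientTsv, and readAny are hypothetical names, and the parsers are toy stand-ins rather than the library's readers.

```kotlin
// Hypothetical stand-in for a format reader; not the library's API.
class Format(val name: String, val parse: (String) -> List<List<String>>)

// Strict CSV stand-in: rejects rows whose cell count differs from the first row.
val strictCsv = Format("csv") { text ->
    val rows = text.lines().filter { it.isNotEmpty() }.map { it.split(",") }
    require(rows.all { it.size == rows.first().size }) { "ragged CSV" }
    rows
}

// Lenient TSV stand-in: accepts any text and splits on tabs only.
val lenientTsv = Format("tsv") { text ->
    text.lines().filter { it.isNotEmpty() }.map { it.split("\t") }
}

// Try each format in order and return the first one that does not throw.
fun readAny(text: String, formats: List<Format>): Pair<String, List<List<String>>> {
    for (format in formats) {
        val result = runCatching { format.parse(text) }
        if (result.isSuccess) return format.name to result.getOrThrow()
    }
    error("no format could parse the input")
}
```

With the input "a,b\n1,2,3\n", the strict CSV stand-in rejects the ragged row, so readAny silently falls through to the TSV stand-in and returns each line as a single cell. The original CSV error is swallowed, which is how a fallback chain can turn a parse failure into a surprising result.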
Full stack trace below
The problem is found in one of the loaded libraries: check library converters (fields callbacks)
java.lang.NullPointerException: Parameter specified as non-null is null: method org.jetbrains.kotlinx.dataframe.DataFrameKt.getNrow, parameter <this>
org.jetbrains.kotlinx.jupyter.exceptions.ReplLibraryException: The problem is found in one of the loaded libraries: check library converters (fields callbacks)
at org.jetbrains.kotlinx.jupyter.exceptions.CompositeReplExceptionKt.throwLibraryException(CompositeReplException.kt:50)
at org.jetbrains.kotlinx.jupyter.codegen.FieldsProcessorImpl.process(FieldsProcessorImpl.kt:68)
at org.jetbrains.kotlinx.jupyter.repl.impl.CellExecutorImpl$execute$1$1.invoke(CellExecutorImpl.kt:94)
at org.jetbrains.kotlinx.jupyter.repl.impl.CellExecutorImpl$execute$1$1.invoke(CellExecutorImpl.kt:93)
at org.jetbrains.kotlinx.jupyter.config.LoggingKt.catchAll(logging.kt:42)
at org.jetbrains.kotlinx.jupyter.config.LoggingKt.catchAll$default(logging.kt:41)
at org.jetbrains.kotlinx.jupyter.repl.impl.CellExecutorImpl.execute(CellExecutorImpl.kt:93)
at org.jetbrains.kotlinx.jupyter.repl.CellExecutor$DefaultImpls.execute$default(CellExecutor.kt:14)
at org.jetbrains.kotlinx.jupyter.ReplForJupyterImpl$evalEx$1.invoke(repl.kt:500)
at org.jetbrains.kotlinx.jupyter.ReplForJupyterImpl$evalEx$1.invoke(repl.kt:478)
at org.jetbrains.kotlinx.jupyter.ReplForJupyterImpl.withEvalContext(repl.kt:441)
at org.jetbrains.kotlinx.jupyter.ReplForJupyterImpl.evalEx(repl.kt:478)
at org.jetbrains.kotlinx.jupyter.messaging.ProtocolKt$shellMessagesHandler$2$res$1.invoke(protocol.kt:320)
at org.jetbrains.kotlinx.jupyter.messaging.ProtocolKt$shellMessagesHandler$2$res$1.invoke(protocol.kt:314)
at org.jetbrains.kotlinx.jupyter.JupyterExecutorImpl$runExecution$execThread$1.invoke(execution.kt:38)
at org.jetbrains.kotlinx.jupyter.JupyterExecutorImpl$runExecution$execThread$1.invoke(execution.kt:33)
at kotlin.concurrent.ThreadsKt$thread$thread$1.run(Thread.kt:30)
Caused by: java.lang.NullPointerException: Parameter specified as non-null is null: method org.jetbrains.kotlinx.dataframe.DataFrameKt.getNrow, parameter <this>
at org.jetbrains.kotlinx.dataframe.DataFrameKt.getNrow(DataFrame.kt)
at org.jetbrains.kotlinx.dataframe.impl.columns.FrameColumnImpl$schema$1.invoke(FrameColumnImpl.kt:43)
at org.jetbrains.kotlinx.dataframe.impl.columns.FrameColumnImpl$schema$1.invoke(FrameColumnImpl.kt:42)
at kotlin.SynchronizedLazyImpl.getValue(LazyJVM.kt:74)
at org.jetbrains.kotlinx.dataframe.impl.schema.UtilsKt.extractSchema(Utils.kt:92)
at org.jetbrains.kotlinx.dataframe.impl.schema.UtilsKt.extractSchema(Utils.kt:26)
at org.jetbrains.kotlinx.dataframe.api.SchemaKt.schema(schema.kt:17)
at org.jetbrains.kotlinx.dataframe.impl.codeGen.ReplCodeGeneratorImpl.process(ReplCodeGeneratorImpl.kt:50)
at org.jetbrains.kotlinx.dataframe.jupyter.Integration.updateAnyFrameVariable(Integration.kt:132)
at org.jetbrains.kotlinx.dataframe.jupyter.Integration.access$updateAnyFrameVariable(Integration.kt:73)
at org.jetbrains.kotlinx.dataframe.jupyter.Integration$onLoaded$4.invoke(Integration.kt:295)
at org.jetbrains.kotlinx.dataframe.jupyter.Integration$onLoaded$4.invoke(Integration.kt:290)
at org.jetbrains.kotlinx.jupyter.api.libraries.FieldHandlerFactory.createUpdateExecution$lambda$0(FieldHandlerFactory.kt:38)
at org.jetbrains.kotlinx.jupyter.codegen.FieldsProcessorImplKt.executeEx(FieldsProcessorImpl.kt:88)
at org.jetbrains.kotlinx.jupyter.codegen.FieldsProcessorImplKt.access$executeEx(FieldsProcessorImpl.kt:1)
at org.jetbrains.kotlinx.jupyter.codegen.FieldsProcessorImpl.process(FieldsProcessorImpl.kt:47)
... 15 more
koperagen commented
So indeed: a JSON value in one cell combined with a null value in another causes the issue when reading CSV.
koperagen commented
// Reproducer: the first cell holds a JSON array, the second is the literal null.
val df2 = DataFrame.readDelimStr("""name
"[""str""]"
null
""")