MolecularCartography/Optimus

KNIME & Python: AttributeError

Closed this issue · 6 comments

When trying to execute "Detect LC-MS Features" in KNIME v3.3.3 on either Mac OS v10.12.6 or Microsoft Windows Server 2008 R2, the following error results:

ERROR Python Script (1⇒1)  0:423:403  Execute failed: Traceback (most recent call last):
  File "F:\u0597274\downloads\knime_3.3.3.win32.win32.x86_64\knime_3.3.3\plugins\org.knime.python_3.3.0.v201611242050\py\PythonKernel.py", line 282, in execute
    exec(source_code, _exec_env, _exec_env)
  File "<string>", line 47, in <module>
AttributeError: 'numpy.int64' object has no attribute 'split'

Any idea what's going on? I'm using the latest Anaconda v2.7 python 64-bit (v4.4.0) with protobuf v2.6.1, pyopenms v2.0.1, and pyMSpec v0.1. I've also tried using the latest pyopenms on Mac OS with the same error (I haven't tried using the latest pyopenms on Windows because it requires python v3 and Optimus's documentation exclusively mentions python v2.7).

Hi @tantrev, can you please check which versions of pandas and numpy you have installed?
I suspect that you've got new versions that are incompatible with Optimus at the moment.
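A quick way to report those versions is to run a short script in the same Python environment that KNIME is configured to use. This is only a minimal sketch: the helper name `installed_versions` is made up for illustration and is not part of Optimus or KNIME.

```python
from __future__ import print_function  # keeps print() consistent on Python 2.7

import importlib


def installed_versions(module_names):
    """Map each module name to its __version__ string, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Some modules do not expose __version__; report that explicitly.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in sorted(installed_versions(["pandas", "numpy"]).items()):
    print(name, version or "not installed")
```

If the script is run with a different interpreter than the one selected in the KNIME Python preferences, the reported versions may not match what the node actually uses.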

Thanks for bringing up the latest pyopenms version! I had already given up hoping that the OpenMS team would ever support Python 3. That's actually the main reason why Optimus requires Python 2.7.

Currently, however, I guess you'd have to downgrade your pandas module to 0.19.0 in order to get Optimus working.

Also, you can upgrade pyopenms to 2.1.0, as it contains a fix for a bug that crashes the Python interpreter. I'll update the README description and installation scripts to change the pyopenms version.

Hi @iprotsyuk , sure thing!

As per your recommendation, I installed pyopenms v2.1.0 and pandas v0.19.0/numpy v1.1.3, and still receive the following error:

ERROR Python Script (1⇒1)  0:423:403  Execute failed: Traceback (most recent call last):
  File "F:\u0597274\downloads\knime_3.3.3.win32.win32.x86_64\knime_3.3.3\plugins\org.knime.python_3.3.0.v201611242050\py\PythonKernel.py", line 282, in execute
    exec(source_code, _exec_env, _exec_env)
  File "<string>", line 47, in <module>
AttributeError: 'numpy.int64' object has no attribute 'split'

If it makes any difference, my logfile for a fresh KNIME session may be viewed here (before I manually selected new stub files and used this experimental design file).

Thanks for the info about python v2.7. I was just a little confused because I initially tried to get the latest OpenMS to run on Windows and my Mac, but after having trouble (as documented here), I was told that v2.7 python support has apparently been dropped for future OpenMS versions?

Anywho, thanks again for the help. :)

Also, this is just more of a conceptual question (forgive me, I'm new to mass spectrometry), but can Optimus handle mixed MS-types of files within its experimental design? For example, one of the datasets I'm re-analyzing has ~50+ centroid-based MS1 spectra, then 2 pooled MS/MS samples, and another 2 pooled fully profiled MS1 spectra. I understand the experimental design documentation talks about "pooled" samples, I just wasn't sure if there was a way to specify specific subtypes.

@tantrev, the pooled samples in an Optimus experimental design are pooled QC samples: a mixture of all the samples in a study, injected periodically during your MS sequence. Optimus can use these samples to filter out features that are not reproducible across QC runs.
This filtering takes only MS1 information into account.

Regarding the error you're receiving, it seems the experimental design file is the cause. Perhaps the guidance here isn't clear enough. A few remarks on your design:

  • The spreadsheet doesn't have to be filled in completely. Actually, only the 1st column is essential. In the 2nd column you can mark blank samples (if you have them) with the "BLANK" keyword, which gives you a minimal but sensible experimental design.
  • Accordingly, you don't have to fill the 3rd and 4th columns with 1s and 0s. This is what is actually causing the error you're getting, as group names and replicate names are expected to be strings. I guess I need to enforce conversion to strings for these columns.
  • "POOLED_QC" must contain an underscore. However, you might reconsider using this keyword after my explanation in the previous comment of what Optimus assumes about pooled QC samples.
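To illustrate the second remark, here is a minimal sketch of how numeric group and replicate cells can produce this kind of AttributeError, and how reading every column as text avoids it. The column names, file names and the `split("_")` call are assumptions made up for the example; they are not taken from the Optimus source or its design template.

```python
from __future__ import print_function

import io

import pandas as pd

# Hypothetical design table; the headers and file names are illustrative only.
csv_text = (
    u"sample,type,group,replicate\n"
    u"run_01.mzXML,,1,0\n"
    u"run_02.mzXML,BLANK,1,1\n"
)

# Default parsing infers an integer dtype for the all-numeric columns.
design = pd.read_csv(io.StringIO(csv_text))
group_value = design["group"].iloc[0]
try:
    group_value.split("_")  # string method called on a NumPy integer
except AttributeError as error:
    print("numeric cell:", error)

# Reading every column as text keeps "1" a string; keep_default_na=False
# leaves empty cells as "" instead of converting them to NaN.
design_as_text = pd.read_csv(io.StringIO(csv_text), dtype=str, keep_default_na=False)
print(design_as_text["group"].iloc[0].split("_"))
```

Until such a conversion is enforced on the Optimus side, leaving those columns empty or using non-numeric labels should have the same effect.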

Hi @tantrev, feel free to reopen this issue if you stumble upon the problem again. I'm closing it for the time being.