yt-project/yt

BUG: Segfault in smoothing length calculation on Mac

nastasha-w opened this issue · 9 comments

Bug report

Bug summary

I get a segfault when loading the Tipsy 00300 galaxy into yt. From the output, this seems to occur when yt tries to calculate smoothing lengths.

The KDTree construction before the smoothing length computation seems to work. This issue occurred on my Mac both with numpy 1.26, and with the bleeding-edge numpy version. The error reminded me of #4953, but that issue was marked resolved after an upstream fix in numpy.

Code for reproduction
This issue was triggered when analyzing the Tipsy galaxy 00300 from the yt hub.

import yt
sampledatadir = "/Users/nastasha/code/ytdev_misc/sample_data/"
datasetn = sampledatadir + "TipsyGalaxy/galaxy.00300"
ds = yt.load(datasetn)

Actual outcome

yt : [INFO     ] 2024-08-01 16:53:46,619 Parameters: current_time              = 20.100000000000044
yt : [INFO     ] 2024-08-01 16:53:46,620 Parameters: domain_dimensions         = [1 1 1]
yt : [INFO     ] 2024-08-01 16:53:46,621 Parameters: domain_left_edge          = None
yt : [INFO     ] 2024-08-01 16:53:46,621 Parameters: domain_right_edge         = None
yt : [INFO     ] 2024-08-01 16:53:46,621 Parameters: cosmological_simulation   = 0.0
yt : [INFO     ] 2024-08-01 16:53:46,634 Loading KDTree from galaxy.00300.kdtree
Generate smoothing length:   0%|                                            | 0/76962 [00:00<?, ?it/s]zsh: segmentation fault  ipython
(ytdev) nastasha@dhcp-165-124-148-182 ytdev_misc % /Users/nastasha/opt/anaconda3/envs/ytdev/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Expected outcome

A loaded yt dataset, with smoothing lengths for the SPH particles.

Version Information

  • Operating System: M1 Mac, Sonoma 14.5 (23F79)
  • Python Version: 3.12.4 (also occurred in an older environment with python 3.11 and numpy 1.26)
  • yt version: 4.4.dev0 (main branch in my git repo, synced with yt main on 2024-08-01)
  • environment setup: (from the yt top directory)
    conda create -n ytdev python=3.12 pip 
    conda activate ytdev
    python -m pip install git+https://github.com/numpy/numpy.git
    python -m pip install -e ".[doc]"
    python -m pip install h5py pandas

update: I tried to run the same thing on the university linux cluster, and after many frustrations getting yt installed, I got the same error running the same code, except without any text after segmentation fault. This doesn't seem to be OS-specific.

Details of the linux setup:

  • OS: Red Hat Enterprise Linux 7.9 (according to the system webpage)
  • python: 3.12.3
  • numpy: 1.26.4
  • yt: 4.4.dev0 (main branch on 2024-08-01)
  • yt installation:
    • conda environment specifying python 3.12, from .yaml file:
      channels:
        - defaults
        - conda-forge
      dependencies:
        - python=3.12
        - astropy
        - numpy
        - pandas
        - scipy
        - h5py
        - ipython
        - matplotlib
        - more-itertools
        - cython
        - mpi4py
        - pytest
        - unyt
        - pip
      variables:
        MPLBACKEND: TkAgg
        # MPLBACKEND needs to be set on Quest; might not be an issue on other systems
      
    • yt from the git main branch
    • python -m pip install . in the yt top-level directory
    • for the pip installation, I had to load a non-default gcc version on this system. gcc/11.2.0 worked, the default gcc version 4.8.5 threw an error when I tried to pip install.

Interestingly, the following doesn't crash (at least on my system)

import yt

yt.load_sample("TipsyGalaxy")

logger output

yt : [INFO     ] 2024-08-02 08:43:29,130 Sample dataset found in '/Users/clm/dev/yt-project/test-data-dir/TipsyGalaxy/galaxy.00300'
yt : [INFO     ] 2024-08-02 08:43:29,244 Parameters: current_time              = 20.100000000000044
yt : [INFO     ] 2024-08-02 08:43:29,244 Parameters: domain_dimensions         = [1 1 1]
yt : [INFO     ] 2024-08-02 08:43:29,244 Parameters: domain_left_edge          = None
yt : [INFO     ] 2024-08-02 08:43:29,244 Parameters: domain_right_edge         = None
yt : [INFO     ] 2024-08-02 08:43:29,244 Parameters: cosmological_simulation   = 0.0
yt : [INFO     ] 2024-08-02 08:43:29,251 Allocating for 3.154e+05 particles

A seemingly crucial difference I see with your report is that the kdtree isn't loaded, so let's force it:

import yt

ds = yt.load_sample("TipsyGalaxy")
ds.index.kdtree

logger output:

yt : [INFO     ] 2024-08-02 08:44:58,063 Sample dataset found in '/Users/clm/dev/yt-project/test-data-dir/TipsyGalaxy/galaxy.00300'
yt : [INFO     ] 2024-08-02 08:44:58,146 Parameters: current_time              = 20.100000000000044
yt : [INFO     ] 2024-08-02 08:44:58,146 Parameters: domain_dimensions         = [1 1 1]
yt : [INFO     ] 2024-08-02 08:44:58,146 Parameters: domain_left_edge          = None
yt : [INFO     ] 2024-08-02 08:44:58,146 Parameters: domain_right_edge         = None
yt : [INFO     ] 2024-08-02 08:44:58,146 Parameters: cosmological_simulation   = 0.0
yt : [INFO     ] 2024-08-02 08:44:58,152 Allocating for 3.154e+05 particles
yt : [INFO     ] 2024-08-02 08:44:58,429 Loading KDTree from galaxy.00300.kdtree

still no crash. I made several attempts with numpy 2.0.1 and 1.26.4

You're using python -m pip install . without the -e/--editable flag; can you confirm that you're also not running your script from the top level of the repo? (Running from there is only safe with editable installs.)
I'm also tempted to suggest trying without conda, which may (or may not) be the decisive difference between our systems.

I got the same error earlier without the loaded KDTree. I can try removing it, but I'm pretty sure yt just put that together the first time and saved it before it crashed. (It was not with the downloaded files.)

Do you get a crash every time, or is it flaky?

It seems to crash every time in this setup. Also, I am not running this from the yt directory or its subdirectories, but from a separate directory with a few test scripts.

I just tried the following setup with venv/pip, also on my Mac. (I made sure I had all my conda environments deactivated first, and I checked that I was on the latest version of the yt main git branch.)

brew install python@3.12
/opt/homebrew/bin/python3.12 -m venv /Users/nastasha/code/venvs/ytdev_main
source /Users/nastasha/code/venvs/ytdev_main/bin/activate
cd ~/code/yt
python -m pip install -e ".[doc]"
python -m pip install h5py

I initially got the same error:

>>> import yt
>>> sampledatadir = "/Users/nastasha/code/ytdev_misc/sample_data/"
>>> datasetn = sampledatadir + "TipsyGalaxy/galaxy.00300"
>>> ds = yt.load(datasetn)

yielded

yt : [INFO     ] 2024-08-05 13:12:02,768 Parameters: current_time              = 20.100000000000044
yt : [INFO     ] 2024-08-05 13:12:02,768 Parameters: domain_dimensions         = [1 1 1]
yt : [INFO     ] 2024-08-05 13:12:02,769 Parameters: domain_left_edge          = None
yt : [INFO     ] 2024-08-05 13:12:02,769 Parameters: domain_right_edge         = None
yt : [INFO     ] 2024-08-05 13:12:02,769 Parameters: cosmological_simulation   = 0.0
yt : [INFO     ] 2024-08-05 13:12:02,777 Loading KDTree from galaxy.00300.kdtree
Generate smoothing length:   0%|                                            | 0/76962 [00:00<?, ?it/s]zsh: segmentation fault  python
(ytdev_main) nastasha@Nastashas-MacBook-Pro ytdev_misc % /opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

but after deleting the previously generated kdtree file, it did run ok:

yt : [INFO     ] 2024-08-05 13:15:27,323 Parameters: current_time              = 20.100000000000044
yt : [INFO     ] 2024-08-05 13:15:27,323 Parameters: domain_dimensions         = [1 1 1]
yt : [INFO     ] 2024-08-05 13:15:27,324 Parameters: domain_left_edge          = None
yt : [INFO     ] 2024-08-05 13:15:27,324 Parameters: domain_right_edge         = None
yt : [INFO     ] 2024-08-05 13:15:27,324 Parameters: cosmological_simulation   = 0.0
yt : [INFO     ] 2024-08-05 13:15:27,341 Allocating KDTree for 76962 particles
Generate smoothing length: 100%|███████████████████████████▉| 76961/76962 [00:00<00:00, 100284.59it/s]
yt : [INFO     ] 2024-08-05 13:15:28,170 Allocating for 3.154e+05 particles
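
For anyone who wants to repeat the "delete the cached KDTree" step, here is a minimal sketch. It assumes the cache sits next to the dataset and is named after it with a .kdtree suffix, which is what the log line "Loading KDTree from galaxy.00300.kdtree" suggests; remove_cached_kdtree is a hypothetical helper, not part of yt.

```python
from pathlib import Path


def remove_cached_kdtree(dataset_path):
    """Delete the cached KDTree sidecar next to a dataset, if present.

    Assumption: the cache is stored beside the dataset as
    "<dataset file name>.kdtree" (inferred from the log output only).
    Returns True if a file was removed, False otherwise.
    """
    dataset = Path(dataset_path)
    cache = dataset.with_name(dataset.name + ".kdtree")
    if cache.is_file():
        cache.unlink()
        return True
    return False
```

Calling it on the galaxy.00300 path before yt.load(...) should force the tree to be rebuilt on the next load, if the naming assumption holds.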

TL;DR: using venv + pip instead of conda + conda's pip seems to work, but in the conda version the problem may be a stale or incompatible KDTree file rather than the smoothing length calculation itself.
It is a shame conda doesn't seem to work well with yt (anymore) though; along with some colleagues, I'd been using miniconda installations in my home space on some clusters to get around issues with system python packages.

I certainly didn't mean to imply that conda wasn't supported, I was just wondering why I wasn't able to reproduce the problem.

So, it seems we've established that this is a build-time issue with something other than yt itself (since you're compiling/installing it with pip in both scenarios). The most likely suspect would be numpy, and I think we can deduce that the issue is that conda distributions are built against some broken C++ dependency, while wheels from PyPI are not.

We're making progress but there is still a lot of guessing. What we really need at this stage is a much simpler reproducer script, ideally one that doesn't use yt at all. This is typically a lot of work; I cannot promise that I would even attempt it (I'd like to avoid installing conda if I can avoid it).
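
As a possible first step towards a smaller reproducer, the crashing call can be run in a child interpreter with Python's standard faulthandler enabled, so that a traceback is written to stderr when the process segfaults. This is only a sketch; run_isolated is a hypothetical helper name, and nothing here is specific to yt.

```python
import subprocess
import sys


def run_isolated(code, timeout=600):
    """Run a Python snippet in a child interpreter with faulthandler on.

    With -X faulthandler, CPython dumps the Python-level traceback to
    stderr on fatal signals such as SIGSEGV. Returns (returncode, stderr).
    On POSIX, a negative return code is the number of the signal that
    killed the child (typically -11 for a segmentation fault).
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr
```

Passing the reproduction code from the report as a string (the import yt and yt.load(...) lines) would show which Python call was active when the crash happened, which narrows down what a yt-free reproducer needs to contain.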

Oh geez. Well, for the record, here are the installed packages and versions in venv + pip and conda + pip:

venv + pip

using homebrew's python 3.12.4

(ytdev_sph_proj_backend) nastasha@Nastashas-MacBook-Pro yt % pip list --local  
Package                       Version     Editable project location
----------------------------- ----------- -------------------------
alabaster                     0.7.16
appnope                       0.1.4
asttokens                     2.4.1
attrs                         24.1.0
Babel                         2.15.0
beautifulsoup4                4.12.3
bleach                        6.1.0
bottle                        0.12.25
certifi                       2024.7.4
charset-normalizer            3.3.2
cmyt                          2.0.0
comm                          0.2.2
contourpy                     1.2.1
cycler                        0.12.1
debugpy                       1.8.3
decorator                     5.1.1
defusedxml                    0.7.1
docutils                      0.20.1
ewah_bool_utils               1.2.1
executing                     2.0.1
fastjsonschema                2.20.0
fonttools                     4.53.1
h5py                          3.11.0
idna                          3.7
imagesize                     1.4.1
iniconfig                     2.0.0
ipykernel                     6.29.5
ipython                       8.26.0
ipywidgets                    8.1.3
jedi                          0.19.1
Jinja2                        3.0.3
jsonschema                    4.23.0
jsonschema-specifications     2023.12.1
jupyter_client                8.6.2
jupyter_core                  5.7.2
jupyterlab_pygments           0.3.0
jupyterlab_widgets            3.0.11
kiwisolver                    1.4.5
MarkupSafe                    2.1.5
matplotlib                    3.9.0
matplotlib-inline             0.1.7
mistune                       3.0.2
more-itertools                10.3.0
mpmath                        1.3.0
nbclient                      0.10.0
nbconvert                     7.16.4
nbformat                      5.10.4
nbsphinx                      0.9.4
nest-asyncio                  1.6.0
numpy                         2.0.1
packaging                     24.1
pandocfilters                 1.5.1
parso                         0.8.4
pexpect                       4.9.0
pillow                        10.4.0
pip                           24.0
platformdirs                  4.2.2
pluggy                        1.5.0
prompt_toolkit                3.0.47
psutil                        6.0.0
ptyprocess                    0.7.0
pure_eval                     0.2.3
Pygments                      2.18.0
pyparsing                     3.1.2
pytest                        8.3.2
python-dateutil               2.9.0.post0
PyX                           0.16
pyzmq                         26.1.0
referencing                   0.35.1
requests                      2.32.3
rpds-py                       0.19.1
six                           1.16.0
snowballstemmer               2.2.0
soupsieve                     2.5
Sphinx                        7.3.7
sphinx-bootstrap-theme        0.8.1
sphinx-rtd-theme              2.0.0
sphinxcontrib-applehelp       2.0.0
sphinxcontrib-devhelp         2.0.0
sphinxcontrib-htmlhelp        2.1.0
sphinxcontrib-jquery          4.1
sphinxcontrib-jsmath          1.0.1
sphinxcontrib-qthelp          2.0.0
sphinxcontrib-serializinghtml 2.0.0
stack-data                    0.6.3
sympy                         1.13.1
tinycss2                      1.3.0
tomli_w                       1.0.0
tornado                       6.4.1
tqdm                          4.66.5
traitlets                     5.14.3
unyt                          3.0.3
urllib3                       2.2.2
wcwidth                       0.2.13
webencodings                  0.5.1
widgetsnbextension            4.0.11
yt                            4.4.dev0    /Users/nastasha/code/yt

conda + pip

using conda 22.9.0, installing only python and pip from conda directly, and the rest from conda's pip. I think this is also the environment where I tried using bleeding-edge numpy.

(ytdev) nastasha@Nastashas-MacBook-Pro yt % conda list
# packages in environment at /Users/nastasha/opt/anaconda3/envs/ytdev:
#
# Name                    Version                   Build  Channel
alabaster                 0.7.16                   pypi_0    pypi
appnope                   0.1.4                    pypi_0    pypi
asttokens                 2.4.1                    pypi_0    pypi
attrs                     23.2.0                   pypi_0    pypi
babel                     2.15.0                   pypi_0    pypi
beautifulsoup4            4.12.3                   pypi_0    pypi
bleach                    6.1.0                    pypi_0    pypi
bottle                    0.12.25                  pypi_0    pypi
bzip2                     1.0.8                h6c40b1e_6  
ca-certificates           2024.7.2             hecd8cb5_0  
certifi                   2024.7.4                 pypi_0    pypi
charset-normalizer        3.3.2                    pypi_0    pypi
cmyt                      2.0.0                    pypi_0    pypi
comm                      0.2.2                    pypi_0    pypi
contourpy                 1.2.1                    pypi_0    pypi
cycler                    0.12.1                   pypi_0    pypi
debugpy                   1.8.2                    pypi_0    pypi
decorator                 5.1.1                    pypi_0    pypi
defusedxml                0.7.1                    pypi_0    pypi
docutils                  0.20.1                   pypi_0    pypi
ewah-bool-utils           1.2.1                    pypi_0    pypi
executing                 2.0.1                    pypi_0    pypi
expat                     2.6.2                hcec6c5f_0  
fastjsonschema            2.20.0                   pypi_0    pypi
fonttools                 4.53.1                   pypi_0    pypi
h5py                      3.11.0                   pypi_0    pypi
idna                      3.7                      pypi_0    pypi
imagesize                 1.4.1                    pypi_0    pypi
iniconfig                 2.0.0                    pypi_0    pypi
ipykernel                 6.29.5                   pypi_0    pypi
ipython                   8.26.0                   pypi_0    pypi
ipywidgets                8.1.3                    pypi_0    pypi
jedi                      0.19.1                   pypi_0    pypi
jinja2                    3.0.3                    pypi_0    pypi
jsonschema                4.23.0                   pypi_0    pypi
jsonschema-specifications 2023.12.1                pypi_0    pypi
jupyter-client            8.6.2                    pypi_0    pypi
jupyter-core              5.7.2                    pypi_0    pypi
jupyterlab-pygments       0.3.0                    pypi_0    pypi
jupyterlab-widgets        3.0.11                   pypi_0    pypi
kiwisolver                1.4.5                    pypi_0    pypi
libcxx                    14.0.6               h9765a3e_0  
libffi                    3.4.4                hecd8cb5_1  
markupsafe                2.1.5                    pypi_0    pypi
matplotlib                3.9.1                    pypi_0    pypi
matplotlib-inline         0.1.7                    pypi_0    pypi
mistune                   3.0.2                    pypi_0    pypi
more-itertools            10.3.0                   pypi_0    pypi
mpmath                    1.3.0                    pypi_0    pypi
nbclient                  0.10.0                   pypi_0    pypi
nbconvert                 7.16.4                   pypi_0    pypi
nbformat                  5.10.4                   pypi_0    pypi
nbsphinx                  0.9.4                    pypi_0    pypi
ncurses                   6.4                  hcec6c5f_0  
nest-asyncio              1.6.0                    pypi_0    pypi
numpy                     2.1.0.dev0               pypi_0    pypi
openssl                   3.0.14               h46256e1_0  
packaging                 24.1                     pypi_0    pypi
pandas                    2.2.2                    pypi_0    pypi
pandocfilters             1.5.1                    pypi_0    pypi
parso                     0.8.4                    pypi_0    pypi
pexpect                   4.9.0                    pypi_0    pypi
pillow                    10.4.0                   pypi_0    pypi
pip                       24.0            py312hecd8cb5_0  
platformdirs              4.2.2                    pypi_0    pypi
pluggy                    1.5.0                    pypi_0    pypi
prompt-toolkit            3.0.47                   pypi_0    pypi
psutil                    6.0.0                    pypi_0    pypi
ptyprocess                0.7.0                    pypi_0    pypi
pure-eval                 0.2.3                    pypi_0    pypi
pygments                  2.18.0                   pypi_0    pypi
pyparsing                 3.1.2                    pypi_0    pypi
pytest                    8.3.2                    pypi_0    pypi
python                    3.12.4               hcd54a6c_1  
python-dateutil           2.9.0.post0              pypi_0    pypi
pytz                      2024.1                   pypi_0    pypi
pyx                       0.16                     pypi_0    pypi
pyzmq                     26.0.3                   pypi_0    pypi
readline                  8.2                  hca72f7f_0  
referencing               0.35.1                   pypi_0    pypi
requests                  2.32.3                   pypi_0    pypi
rpds-py                   0.19.1                   pypi_0    pypi
setuptools                69.5.1          py312hecd8cb5_0  
six                       1.16.0                   pypi_0    pypi
snowballstemmer           2.2.0                    pypi_0    pypi
soupsieve                 2.5                      pypi_0    pypi
sphinx                    7.3.7                    pypi_0    pypi
sphinx-bootstrap-theme    0.8.1                    pypi_0    pypi
sphinx-rtd-theme          2.0.0                    pypi_0    pypi
sphinxcontrib-applehelp   2.0.0                    pypi_0    pypi
sphinxcontrib-devhelp     2.0.0                    pypi_0    pypi
sphinxcontrib-htmlhelp    2.1.0                    pypi_0    pypi
sphinxcontrib-jquery      4.1                      pypi_0    pypi
sphinxcontrib-jsmath      1.0.1                    pypi_0    pypi
sphinxcontrib-qthelp      2.0.0                    pypi_0    pypi
sphinxcontrib-serializinghtml 2.0.0                    pypi_0    pypi
sqlite                    3.45.3               h6c40b1e_0  
stack-data                0.6.3                    pypi_0    pypi
sympy                     1.13.1                   pypi_0    pypi
tinycss2                  1.3.0                    pypi_0    pypi
tk                        8.6.14               h4d00af3_0  
tomli-w                   1.0.0                    pypi_0    pypi
tornado                   6.4.1                    pypi_0    pypi
tqdm                      4.66.4                   pypi_0    pypi
traitlets                 5.14.3                   pypi_0    pypi
tzdata                    2024.1                   pypi_0    pypi
unyt                      3.0.3                    pypi_0    pypi
urllib3                   2.2.2                    pypi_0    pypi
wcwidth                   0.2.13                   pypi_0    pypi
webencodings              0.5.1                    pypi_0    pypi
wheel                     0.43.0          py312hecd8cb5_0  
widgetsnbextension        4.0.11                   pypi_0    pypi
xz                        5.4.6                h6c40b1e_1  
yt                        4.4.dev0                 pypi_0    pypi
zlib                      1.2.13               h4b97444_1 

This also gets python 3.12.4, but seems to include some packages like libcxx that the pip + venv version doesn't seem to need.
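
To compare the two listings above without reading them side by side, a small script along these lines could help. It is a sketch that assumes only the table part of each listing is pasted in (no shell prompt line); parse_listing and diff_listings are hypothetical helper names.

```python
import re


def parse_listing(text):
    """Map normalized package name -> version from `pip list` or `conda list` output."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blanks, conda's "#" comment rows, and pip's header and ruler rows.
        if len(parts) < 2 or parts[0].startswith(("#", "-")) or parts[0] == "Package":
            continue
        # Treat "-", "_" and "." as equivalent and ignore case,
        # so "pure_eval" and "pure-eval" compare as the same package.
        name = re.sub(r"[-_.]+", "-", parts[0]).lower()
        versions[name] = parts[1]
    return versions


def diff_listings(a, b):
    """Return {name: (version_in_a, version_in_b)} where they differ; None means absent."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

On the listings above this would flag, among others, numpy (2.0.1 vs 2.1.0.dev0) and tqdm (4.66.5 vs 4.66.4), along with the conda-only entries such as libcxx.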

Finally got around to trying on my Mac (which already had a miniconda installation) and I can't reproduce the failure. Tried a couple of different numpy versions, went back a bit in the yt development branch too. Since I do have conda, @neutrinoceros or @nastasha-w let me know if you've got any ideas for something else to try.

I'm on an M2 Mac running Sonoma 14.6.1. Installed everything into a fresh conda environment with Python 3.12.4 and using pip.

I suspect the real bug is in openblas or highway (both C/C++ dependencies of NumPy). I don't know how they are linked to numpy in conda-world, so it might be worth exploring whether this bug is reproducible with downgraded versions of those; if numpy is installed from PyPI, I think it includes its own copies.
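
One way to start checking this in each environment is to record which numpy is actually imported and how it was built. The sketch below uses only numpy.show_config() and module attributes; describe_numpy is a hypothetical helper, and the build report may not list every bundled C/C++ library.

```python
import importlib


def describe_numpy():
    """Return the version and install location of the importable numpy, or None."""
    try:
        np = importlib.import_module("numpy")
    except ImportError:
        return None
    return {"version": np.__version__, "location": np.__file__}


if __name__ == "__main__":
    info = describe_numpy()
    print(info)
    if info is not None:
        import numpy

        # Prints numpy's build information, including BLAS/LAPACK details.
        numpy.show_config()
```

Running this in both the conda and the venv environment and comparing the output would at least confirm whether the two numpy installs were built and linked differently.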