BUG: Segfault in smoothing length calculation on Mac
nastasha-w opened this issue · 9 comments
Bug report
Bug summary
I get a segfault when loading the Tipsy 00300 galaxy into yt. From the output, this seems to occur when yt tries to calculate smoothing lengths.
The KDTree construction before the smoothing length computation seems to work. This issue occurred on my Mac both with numpy 1.26, and with the bleeding-edge numpy version. The error reminded me of #4953, but that issue was marked resolved after an upstream fix in numpy.
Code for reproduction
This issue was triggered when analyzing the Tipsy galaxy 00300 from the yt hub.
import yt
sampledatadir = "/Users/nastasha/code/ytdev_misc/sample_data/"
datasetn = sampledatadir + "TipsyGalaxy/galaxy.00300"
ds = yt.load(datasetn)
Actual outcome
yt : [INFO ] 2024-08-01 16:53:46,619 Parameters: current_time = 20.100000000000044
yt : [INFO ] 2024-08-01 16:53:46,620 Parameters: domain_dimensions = [1 1 1]
yt : [INFO ] 2024-08-01 16:53:46,621 Parameters: domain_left_edge = None
yt : [INFO ] 2024-08-01 16:53:46,621 Parameters: domain_right_edge = None
yt : [INFO ] 2024-08-01 16:53:46,621 Parameters: cosmological_simulation = 0.0
yt : [INFO ] 2024-08-01 16:53:46,634 Loading KDTree from galaxy.00300.kdtree
Generate smoothing length: 0%| | 0/76962 [00:00<?, ?it/s]zsh: segmentation fault ipython
(ytdev) nastasha@dhcp-165-124-148-182 ytdev_misc % /Users/nastasha/opt/anaconda3/envs/ytdev/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
Expected outcome
A loaded yt dataset, with smoothing lengths for the SPH particles.
Version Information
- Operating System: M1 Mac, Sonoma 14.5 (23F79)
- Python Version: 3.12.4 (also occurred in an older environment with python 3.11 and numpy 1.26)
- yt version: 4.4.dev0 (main branch in my git repo, synced with yt main on 2024-08-01)
- environment setup: (from the yt top directory)
conda create -n ytdev python=3.12 pip
conda activate ytdev
python -m pip install git+https://github.com/numpy/numpy.git
python -m pip install -e ".[doc]"
python -m pip install h5py pandas
Update: I tried to run the same thing on the university Linux cluster, and after many frustrations getting yt installed, I got the same error running the same code, except without any text after "segmentation fault". This doesn't seem to be OS-specific.
Details of the linux setup:
- OS: Red Hat Enterprise Linux 7.9 (according to the system webpage)
- python: 3.12.3
- numpy: 1.26.4
- yt: 4.4.dev0 (main branch on 2024-08-01)
- yt installation:
- conda environment specifying python 3.12, from .yaml file:
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.12
  - astropy
  - numpy
  - pandas
  - scipy
  - h5py
  - ipython
  - matplotlib
  - more-itertools
  - cython
  - mpi4py
  - pytest
  - unyt
  - pip
variables:
  MPLBACKEND: TkAgg # MPLBACKEND needs to be set on Quest; might not be an issue on other systems
- yt from the git main branch: python -m pip install . in the yt top-level directory
- for the pip installation, I had to load a non-default gcc version on this system. gcc/11.2.0 worked; the default gcc version 4.8.5 threw an error when I tried to pip install.
Interestingly, the following doesn't crash (at least on my system)
import yt
yt.load_sample("TipsyGalaxy")
logger output
yt : [INFO ] 2024-08-02 08:43:29,130 Sample dataset found in '/Users/clm/dev/yt-project/test-data-dir/TipsyGalaxy/galaxy.00300'
yt : [INFO ] 2024-08-02 08:43:29,244 Parameters: current_time = 20.100000000000044
yt : [INFO ] 2024-08-02 08:43:29,244 Parameters: domain_dimensions = [1 1 1]
yt : [INFO ] 2024-08-02 08:43:29,244 Parameters: domain_left_edge = None
yt : [INFO ] 2024-08-02 08:43:29,244 Parameters: domain_right_edge = None
yt : [INFO ] 2024-08-02 08:43:29,244 Parameters: cosmological_simulation = 0.0
yt : [INFO ] 2024-08-02 08:43:29,251 Allocating for 3.154e+05 particles
A seemingly crucial difference I see with your report is that the kdtree isn't loaded, so let's force it:
import yt
ds = yt.load_sample("TipsyGalaxy")
ds.index.kdtree
logger output:
yt : [INFO ] 2024-08-02 08:44:58,063 Sample dataset found in '/Users/clm/dev/yt-project/test-data-dir/TipsyGalaxy/galaxy.00300'
yt : [INFO ] 2024-08-02 08:44:58,146 Parameters: current_time = 20.100000000000044
yt : [INFO ] 2024-08-02 08:44:58,146 Parameters: domain_dimensions = [1 1 1]
yt : [INFO ] 2024-08-02 08:44:58,146 Parameters: domain_left_edge = None
yt : [INFO ] 2024-08-02 08:44:58,146 Parameters: domain_right_edge = None
yt : [INFO ] 2024-08-02 08:44:58,146 Parameters: cosmological_simulation = 0.0
yt : [INFO ] 2024-08-02 08:44:58,152 Allocating for 3.154e+05 particles
yt : [INFO ] 2024-08-02 08:44:58,429 Loading KDTree from galaxy.00300.kdtree
Still no crash. I made several attempts with numpy 2.0.1 and 1.26.4.
You're using python -m pip install . without the -e/--editable flag; can you confirm that you're also not running your script from the top level of the repo? (This is only safe in editable installs.)
I'm also tempted to suggest trying without conda, which may (or may not) be the decisive difference between our systems.
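The concern about running from the top level of the repo can be checked directly. The following is a minimal sketch using only the standard library; the helper name module_origin_under is hypothetical and not part of yt. It reports whether a module would be picked up from inside a given directory, such as the current working directory.

```python
import importlib.util
from pathlib import Path


def module_origin_under(module_name, directory):
    """Return True if `module_name` would be imported from inside `directory`.

    Uses importlib.util.find_spec, which locates a top-level module without
    executing its code. Returns False if the module cannot be found, or if
    it has no file on disk (built-in, frozen, or namespace packages).
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.origin or spec.origin in ("built-in", "frozen"):
        return False
    origin = Path(spec.origin).resolve()
    return Path(directory).resolve() in origin.parents
```

Calling module_origin_under("yt", Path.cwd()) from the directory where the script is run would return True if the interpreter resolves yt to a source tree under that directory rather than to the installed copy.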
I got the same error earlier without the loaded KDTree. I can try removing it, but I'm pretty sure yt just put that together the first time and saved it before it crashed. (It did not come with the downloaded files.)
Do you get a crash every time, or is it flaky?
It seems to crash every time in this setup. Also, I am not running this from the yt directory or its subdirectories, but from a separate directory with a few test scripts.
I just tried the following setup with venv/pip, also on my Mac. (I made sure I had all my conda environments deactivated first, and I checked that I was on the latest version of the yt main git branch.)
brew install python@3.12
/opt/homebrew/bin/python3.12 -m venv /Users/nastasha/code/venvs/ytdev_main
source /Users/nastasha/code/venvs/ytdev_main/bin/activate
cd ~/code/yt
python -m pip install -e ".[doc]"
python -m pip install h5py
I initially got the same error:
>>> import yt
>>> sampledatadir = "/Users/nastasha/code/ytdev_misc/sample_data/"
>>> datasetn = sampledatadir + "TipsyGalaxy/galaxy.00300"
>>> ds = yt.load(datasetn)
yielded
yt : [INFO ] 2024-08-05 13:12:02,768 Parameters: current_time = 20.100000000000044
yt : [INFO ] 2024-08-05 13:12:02,768 Parameters: domain_dimensions = [1 1 1]
yt : [INFO ] 2024-08-05 13:12:02,769 Parameters: domain_left_edge = None
yt : [INFO ] 2024-08-05 13:12:02,769 Parameters: domain_right_edge = None
yt : [INFO ] 2024-08-05 13:12:02,769 Parameters: cosmological_simulation = 0.0
yt : [INFO ] 2024-08-05 13:12:02,777 Loading KDTree from galaxy.00300.kdtree
Generate smoothing length: 0%| | 0/76962 [00:00<?, ?it/s]zsh: segmentation fault python
(ytdev_main) nastasha@Nastashas-MacBook-Pro ytdev_misc % /opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
but after deleting the previously generated kdtree file, it did run ok:
yt : [INFO ] 2024-08-05 13:15:27,323 Parameters: current_time = 20.100000000000044
yt : [INFO ] 2024-08-05 13:15:27,323 Parameters: domain_dimensions = [1 1 1]
yt : [INFO ] 2024-08-05 13:15:27,324 Parameters: domain_left_edge = None
yt : [INFO ] 2024-08-05 13:15:27,324 Parameters: domain_right_edge = None
yt : [INFO ] 2024-08-05 13:15:27,324 Parameters: cosmological_simulation = 0.0
yt : [INFO ] 2024-08-05 13:15:27,341 Allocating KDTree for 76962 particles
Generate smoothing length: 100%|███████████████████████████▉| 76961/76962 [00:00<00:00, 100284.59it/s]
yt : [INFO ] 2024-08-05 13:15:28,170 Allocating for 3.154e+05 particles
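The step of deleting the previously generated kdtree file can be scripted. This is a minimal sketch, not yt's own API: the helper name remove_cached_kdtree is hypothetical, and it assumes the cache sits next to the dataset file and is named after it with a ".kdtree" suffix, as the log line "Loading KDTree from galaxy.00300.kdtree" suggests.

```python
from pathlib import Path


def remove_cached_kdtree(dataset_path):
    """Delete the sidecar KDTree cache for a dataset, if one exists.

    Assumption: the cache lives in the same directory as the dataset and is
    named "<dataset file name>.kdtree" (e.g. galaxy.00300.kdtree).
    Returns True if a file was removed, False otherwise.
    """
    dataset = Path(dataset_path)
    cache = dataset.with_name(dataset.name + ".kdtree")
    if cache.is_file():
        cache.unlink()
        return True
    return False
```

Running this on the galaxy.00300 path before calling yt.load should force the KDTree to be rebuilt in the current environment instead of being read from a file written by a different one.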
TL;DR: using venv + pip instead of conda + conda's pip seems to work, but there may be some issue or compatibility problem with the KDTree file rather than the actual smoothing length calculation in the conda version.
It is a shame conda doesn't seem to work well with yt (anymore) though; along with some colleagues, I'd been using miniconda installations in my home space on some clusters to get around issues with system python packages.
I certainly didn't mean to imply that conda wasn't supported, I was just wondering why I wasn't able to reproduce the problem.
So, it seems we've established that this is a build-time issue with something other than yt itself (since you're compiling/installing it with pip in both scenarios). The most likely suspect would be numpy, and I think we can deduce that the issue is that conda distributions are built against some broken C++ dependency, while wheels from PyPI are not.
We're making progress but there is still a lot of guessing. What we really need at this stage is a much simpler reproducer script, ideally one that doesn't use yt at all. This is typically a lot of work; I cannot promise that I would even attempt it (I'd like to avoid installing conda if I can avoid it).
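Short of a full yt-free reproducer, one cheap way to narrow things down is to find out which Python call is active when the segfault happens. The sketch below is a suggestion rather than something tried in this thread: it runs a script in a child process with the standard-library faulthandler enabled. The helper name run_with_faulthandler and the script name in the usage note are hypothetical.

```python
import subprocess
import sys


def run_with_faulthandler(script_args):
    """Run a Python script in a child process with faulthandler enabled.

    The "-X faulthandler" option makes the interpreter dump the Python-level
    traceback to stderr if the process receives a fatal signal such as
    SIGSEGV, so the crashing call can be identified.
    Returns (return code, captured stderr).
    """
    cmd = [sys.executable, "-X", "faulthandler", *script_args]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stderr
```

For example, run_with_faulthandler(["load_tipsy.py"]), where load_tipsy.py contains the yt.load call from the report. On POSIX systems a negative return code means the child was killed by a signal, and the captured stderr should then contain the traceback.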
Oh geez. Well, for the record, here are the installed packages and versions in venv + pip and conda + pip:
venv + pip
using homebrew's python 3.12.4
(ytdev_sph_proj_backend) nastasha@Nastashas-MacBook-Pro yt % pip list --local
Package Version Editable project location
----------------------------- ----------- -------------------------
alabaster 0.7.16
appnope 0.1.4
asttokens 2.4.1
attrs 24.1.0
Babel 2.15.0
beautifulsoup4 4.12.3
bleach 6.1.0
bottle 0.12.25
certifi 2024.7.4
charset-normalizer 3.3.2
cmyt 2.0.0
comm 0.2.2
contourpy 1.2.1
cycler 0.12.1
debugpy 1.8.3
decorator 5.1.1
defusedxml 0.7.1
docutils 0.20.1
ewah_bool_utils 1.2.1
executing 2.0.1
fastjsonschema 2.20.0
fonttools 4.53.1
h5py 3.11.0
idna 3.7
imagesize 1.4.1
iniconfig 2.0.0
ipykernel 6.29.5
ipython 8.26.0
ipywidgets 8.1.3
jedi 0.19.1
Jinja2 3.0.3
jsonschema 4.23.0
jsonschema-specifications 2023.12.1
jupyter_client 8.6.2
jupyter_core 5.7.2
jupyterlab_pygments 0.3.0
jupyterlab_widgets 3.0.11
kiwisolver 1.4.5
MarkupSafe 2.1.5
matplotlib 3.9.0
matplotlib-inline 0.1.7
mistune 3.0.2
more-itertools 10.3.0
mpmath 1.3.0
nbclient 0.10.0
nbconvert 7.16.4
nbformat 5.10.4
nbsphinx 0.9.4
nest-asyncio 1.6.0
numpy 2.0.1
packaging 24.1
pandocfilters 1.5.1
parso 0.8.4
pexpect 4.9.0
pillow 10.4.0
pip 24.0
platformdirs 4.2.2
pluggy 1.5.0
prompt_toolkit 3.0.47
psutil 6.0.0
ptyprocess 0.7.0
pure_eval 0.2.3
Pygments 2.18.0
pyparsing 3.1.2
pytest 8.3.2
python-dateutil 2.9.0.post0
PyX 0.16
pyzmq 26.1.0
referencing 0.35.1
requests 2.32.3
rpds-py 0.19.1
six 1.16.0
snowballstemmer 2.2.0
soupsieve 2.5
Sphinx 7.3.7
sphinx-bootstrap-theme 0.8.1
sphinx-rtd-theme 2.0.0
sphinxcontrib-applehelp 2.0.0
sphinxcontrib-devhelp 2.0.0
sphinxcontrib-htmlhelp 2.1.0
sphinxcontrib-jquery 4.1
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 2.0.0
sphinxcontrib-serializinghtml 2.0.0
stack-data 0.6.3
sympy 1.13.1
tinycss2 1.3.0
tomli_w 1.0.0
tornado 6.4.1
tqdm 4.66.5
traitlets 5.14.3
unyt 3.0.3
urllib3 2.2.2
wcwidth 0.2.13
webencodings 0.5.1
widgetsnbextension 4.0.11
yt 4.4.dev0 /Users/nastasha/code/yt
conda + pip
using conda 22.9.0, installing only python and pip from conda directly, and the rest from conda's pip. I think this is also the version where I tried using bleeding-edge numpy.
(ytdev) nastasha@Nastashas-MacBook-Pro yt % conda list
# packages in environment at /Users/nastasha/opt/anaconda3/envs/ytdev:
#
# Name Version Build Channel
alabaster 0.7.16 pypi_0 pypi
appnope 0.1.4 pypi_0 pypi
asttokens 2.4.1 pypi_0 pypi
attrs 23.2.0 pypi_0 pypi
babel 2.15.0 pypi_0 pypi
beautifulsoup4 4.12.3 pypi_0 pypi
bleach 6.1.0 pypi_0 pypi
bottle 0.12.25 pypi_0 pypi
bzip2 1.0.8 h6c40b1e_6
ca-certificates 2024.7.2 hecd8cb5_0
certifi 2024.7.4 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
cmyt 2.0.0 pypi_0 pypi
comm 0.2.2 pypi_0 pypi
contourpy 1.2.1 pypi_0 pypi
cycler 0.12.1 pypi_0 pypi
debugpy 1.8.2 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
defusedxml 0.7.1 pypi_0 pypi
docutils 0.20.1 pypi_0 pypi
ewah-bool-utils 1.2.1 pypi_0 pypi
executing 2.0.1 pypi_0 pypi
expat 2.6.2 hcec6c5f_0
fastjsonschema 2.20.0 pypi_0 pypi
fonttools 4.53.1 pypi_0 pypi
h5py 3.11.0 pypi_0 pypi
idna 3.7 pypi_0 pypi
imagesize 1.4.1 pypi_0 pypi
iniconfig 2.0.0 pypi_0 pypi
ipykernel 6.29.5 pypi_0 pypi
ipython 8.26.0 pypi_0 pypi
ipywidgets 8.1.3 pypi_0 pypi
jedi 0.19.1 pypi_0 pypi
jinja2 3.0.3 pypi_0 pypi
jsonschema 4.23.0 pypi_0 pypi
jsonschema-specifications 2023.12.1 pypi_0 pypi
jupyter-client 8.6.2 pypi_0 pypi
jupyter-core 5.7.2 pypi_0 pypi
jupyterlab-pygments 0.3.0 pypi_0 pypi
jupyterlab-widgets 3.0.11 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
libcxx 14.0.6 h9765a3e_0
libffi 3.4.4 hecd8cb5_1
markupsafe 2.1.5 pypi_0 pypi
matplotlib 3.9.1 pypi_0 pypi
matplotlib-inline 0.1.7 pypi_0 pypi
mistune 3.0.2 pypi_0 pypi
more-itertools 10.3.0 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
nbclient 0.10.0 pypi_0 pypi
nbconvert 7.16.4 pypi_0 pypi
nbformat 5.10.4 pypi_0 pypi
nbsphinx 0.9.4 pypi_0 pypi
ncurses 6.4 hcec6c5f_0
nest-asyncio 1.6.0 pypi_0 pypi
numpy 2.1.0.dev0 pypi_0 pypi
openssl 3.0.14 h46256e1_0
packaging 24.1 pypi_0 pypi
pandas 2.2.2 pypi_0 pypi
pandocfilters 1.5.1 pypi_0 pypi
parso 0.8.4 pypi_0 pypi
pexpect 4.9.0 pypi_0 pypi
pillow 10.4.0 pypi_0 pypi
pip 24.0 py312hecd8cb5_0
platformdirs 4.2.2 pypi_0 pypi
pluggy 1.5.0 pypi_0 pypi
prompt-toolkit 3.0.47 pypi_0 pypi
psutil 6.0.0 pypi_0 pypi
ptyprocess 0.7.0 pypi_0 pypi
pure-eval 0.2.3 pypi_0 pypi
pygments 2.18.0 pypi_0 pypi
pyparsing 3.1.2 pypi_0 pypi
pytest 8.3.2 pypi_0 pypi
python 3.12.4 hcd54a6c_1
python-dateutil 2.9.0.post0 pypi_0 pypi
pytz 2024.1 pypi_0 pypi
pyx 0.16 pypi_0 pypi
pyzmq 26.0.3 pypi_0 pypi
readline 8.2 hca72f7f_0
referencing 0.35.1 pypi_0 pypi
requests 2.32.3 pypi_0 pypi
rpds-py 0.19.1 pypi_0 pypi
setuptools 69.5.1 py312hecd8cb5_0
six 1.16.0 pypi_0 pypi
snowballstemmer 2.2.0 pypi_0 pypi
soupsieve 2.5 pypi_0 pypi
sphinx 7.3.7 pypi_0 pypi
sphinx-bootstrap-theme 0.8.1 pypi_0 pypi
sphinx-rtd-theme 2.0.0 pypi_0 pypi
sphinxcontrib-applehelp 2.0.0 pypi_0 pypi
sphinxcontrib-devhelp 2.0.0 pypi_0 pypi
sphinxcontrib-htmlhelp 2.1.0 pypi_0 pypi
sphinxcontrib-jquery 4.1 pypi_0 pypi
sphinxcontrib-jsmath 1.0.1 pypi_0 pypi
sphinxcontrib-qthelp 2.0.0 pypi_0 pypi
sphinxcontrib-serializinghtml 2.0.0 pypi_0 pypi
sqlite 3.45.3 h6c40b1e_0
stack-data 0.6.3 pypi_0 pypi
sympy 1.13.1 pypi_0 pypi
tinycss2 1.3.0 pypi_0 pypi
tk 8.6.14 h4d00af3_0
tomli-w 1.0.0 pypi_0 pypi
tornado 6.4.1 pypi_0 pypi
tqdm 4.66.4 pypi_0 pypi
traitlets 5.14.3 pypi_0 pypi
tzdata 2024.1 pypi_0 pypi
unyt 3.0.3 pypi_0 pypi
urllib3 2.2.2 pypi_0 pypi
wcwidth 0.2.13 pypi_0 pypi
webencodings 0.5.1 pypi_0 pypi
wheel 0.43.0 py312hecd8cb5_0
widgetsnbextension 4.0.11 pypi_0 pypi
xz 5.4.6 h6c40b1e_1
yt 4.4.dev0 pypi_0 pypi
zlib 1.2.13 h4b97444_1
This also gets python 3.12.4, but seems to include some packages like libcxx that the pip + venv version doesn't seem to need.
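Comparing the two listings by eye is error-prone, so here is a minimal sketch that diffs them. The function names are hypothetical. It assumes each listing is pasted as text with the shell prompt line removed, and it normalizes package names so that spellings like ewah_bool_utils and ewah-bool-utils compare equal.

```python
import re


def parse_package_list(text):
    """Map normalized package name -> version from `pip list` or `conda list` output."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, comment lines, and the header and separator rows.
        if len(parts) < 2 or parts[0].startswith(("#", "-")) or parts[0] == "Package":
            continue
        # Lowercase and collapse runs of "-", "_" and "." into a single "-".
        name = re.sub(r"[-_.]+", "-", parts[0]).lower()
        packages[name] = parts[1]
    return packages


def diff_package_lists(text_a, text_b):
    """Return (only_in_a, only_in_b, version_mismatches) for two listings."""
    a, b = parse_package_list(text_a), parse_package_list(text_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    mismatches = {n: (a[n], b[n]) for n in sorted(set(a) & set(b)) if a[n] != b[n]}
    return only_a, only_b, mismatches
```

Applied to the two listings above, this should flag numpy (2.0.1 in the venv versus 2.1.0.dev0 in the conda environment) among the version mismatches, and libcxx among the packages present only in the conda environment.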
Finally got around to trying on my Mac (which already had a miniconda installation) and I can't reproduce the failure. Tried a couple of different numpy versions, went back a bit in the yt development branch too. Since I do have conda, @neutrinoceros or @nastasha-w let me know if you've got any ideas for something else to try.
I'm on an M2 Mac running Sonoma 14.6.1. I installed everything into a fresh conda environment with Python 3.12.4, using pip.
I suspect the real bug is in openblas or highway (both C/C++ dependencies of NumPy). I don't know how they are linked to numpy in conda-world so it might be worth exploring if this bug is reproducible with downgraded versions of those; if numpy is installed from PyPI I think it includes its own copies.
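To see where a given numpy actually came from, the package metadata can be inspected with the standard library. This is a minimal sketch; the helper name describe_distribution is hypothetical. It assumes the installer recorded an INSTALLER file in the metadata (typically containing "pip" or "conda"), which may be absent.

```python
from importlib import metadata


def describe_distribution(name):
    """Report version, installer, and install location of an installed package.

    Returns None if the package is not installed in the current environment.
    The installer is read from the INSTALLER metadata file and is None when
    that file is missing.
    """
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    installer = dist.read_text("INSTALLER")
    return {
        "version": dist.version,
        "installer": installer.strip() if installer else None,
        "location": str(dist.locate_file("")),
    }
```

Running describe_distribution("numpy") in both environments would show whether each numpy was installed by pip or by conda. For the build details, numpy.show_config() prints the build information, including the BLAS/LAPACK libraries detected at build time.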