[BUG] Segmentation fault (core dumped)
celsofranssa opened this issue · 1 comments
First of all, thank you for this excellent library.
Describe the bug
Building TDF matrix: 100%|███████████████████████████████████████████████| 13905/13905 [00:34<00:00, 408.07it/s]
Building inverted index: 100%|███████████████████████████████████████| 148864/148864 [00:10<00:00, 14750.18it/s]
Batch search: 0%| | 0/13905 [00:00<?, ?it/s]
Segmentation fault (core dumped)
I am getting Segmentation fault (core dumped) when using bsearch in the Sparse Retriever.
Current environment
CUDA:
- GPU:
  - NVIDIA GeForce RTX 3090
- available: True
- version: 12.1
Packages:
- absl-py: 2.0.0
- accelerate: 0.24.1
- aiohttp: 3.8.6
- aiosignal: 1.3.1
- alembic: 1.12.1
- antlr4-python3-runtime: 4.9.3
- appdirs: 1.4.4
- async-timeout: 4.0.3
- attrs: 23.1.0
- autofaiss: 2.15.8
- beautifulsoup4: 4.12.2
- bleach: 6.1.0
- cachetools: 5.3.2
- cbor: 1.0.0
- cbor2: 5.5.1
- certifi: 2023.7.22
- charset-normalizer: 3.3.2
- click: 8.1.7
- colorlog: 6.7.0
- contourpy: 1.2.0
- cramjam: 2.7.0
- cycler: 0.12.1
- dill: 0.3.7
- docker-pycreds: 0.4.0
- embedding-reader: 1.5.1
- faiss-cpu: 1.7.4
- fastparquet: 2023.10.1
- filelock: 3.13.1
- fire: 0.4.0
- fonttools: 4.44.0
- frozenlist: 1.4.0
- fsspec: 2023.10.0
- gitdb: 4.0.11
- gitpython: 3.1.40
- google-auth: 2.23.4
- google-auth-oauthlib: 1.1.0
- greenlet: 3.0.1
- grpcio: 1.59.2
- huggingface-hub: 0.17.3
- hydra-core: 1.3.2
- idna: 3.4
- ijson: 3.2.3
- indxr: 0.1.5
- inscriptis: 2.3.2
- ir-datasets: 0.5.5
- jinja2: 3.1.2
- joblib: 1.3.2
- kaggle: 1.5.16
- keybert: 0.8.3
- kiwisolver: 1.4.5
- krovetzstemmer: 0.8
- lightning-utilities: 0.9.0
- llvmlite: 0.41.1
- lxml: 4.9.3
- lz4: 4.3.2
- mako: 1.3.0
- markdown: 3.5.1
- markdown-it-py: 3.0.0
- markupsafe: 2.1.3
- matplotlib: 3.8.1
- mdurl: 0.1.2
- mpmath: 1.3.0
- multidict: 6.0.4
- multipipe: 0.1.0
- multiprocess: 0.70.15
- networkx: 3.2.1
- nltk: 3.8.1
- nmslib: 2.1.1
- numba: 0.58.1
- numpy: 1.26.1
- nvidia-cublas-cu12: 12.1.3.1
- nvidia-cuda-cupti-cu12: 12.1.105
- nvidia-cuda-nvrtc-cu12: 12.1.105
- nvidia-cuda-runtime-cu12: 12.1.105
- nvidia-cudnn-cu12: 8.9.2.26
- nvidia-cufft-cu12: 11.0.2.54
- nvidia-curand-cu12: 10.3.2.106
- nvidia-cusolver-cu12: 11.4.5.107
- nvidia-cusparse-cu12: 12.1.0.106
- nvidia-nccl-cu12: 2.18.1
- nvidia-nvjitlink-cu12: 12.3.52
- nvidia-nvtx-cu12: 12.1.105
- oauthlib: 3.2.2
- omegaconf: 2.3.0
- oneliner-utils: 0.1.2
- optuna: 3.4.0
- orjson: 3.9.10
- packaging: 23.2
- pandas: 1.5.3
- pillow: 10.1.0
- pip: 23.3.1
- protobuf: 4.23.4
- psutil: 5.9.6
- pyarrow: 12.0.1
- pyasn1: 0.5.0
- pyasn1-modules: 0.3.0
- pyautocorpus: 0.1.12
- pybind11: 2.6.1
- pygments: 2.16.1
- pyparsing: 3.1.1
- pystemmer: 2.0.1
- python-dateutil: 2.8.2
- python-slugify: 8.0.1
- pytorch-lightning: 2.1.1
- pytorch-metric-learning: 2.3.0
- pytz: 2023.3.post1
- pyyaml: 6.0.1
- ranx: 0.3.18
- regex: 2023.10.3
- requests: 2.31.0
- requests-oauthlib: 1.3.1
- retriv: 0.2.3
- rich: 13.6.0
- rsa: 4.9
- safetensors: 0.4.0
- scikit-learn: 1.3.2
- scipy: 1.11.3
- seaborn: 0.13.0
- sentence-transformers: 2.2.2
- sentencepiece: 0.1.99
- sentry-sdk: 1.39.1
- setproctitle: 1.3.3
- setuptools: 68.2.2
- six: 1.16.0
- smmap: 5.0.1
- soupsieve: 2.5
- sqlalchemy: 2.0.23
- sympy: 1.12
- tabulate: 0.9.0
- tensorboard: 2.15.1
- tensorboard-data-server: 0.7.2
- termcolor: 2.3.0
- text-unidecode: 1.3
- threadpoolctl: 3.2.0
- tokenizers: 0.14.1
- torch: 2.1.0
- torchaudio: 2.1.0
- torchmetrics: 1.2.0
- torchvision: 0.16.0
- tqdm: 4.66.1
- transformers: 4.35.0
- trec-car-tools: 2.6
- triton: 2.1.0
- typing-extensions: 4.8.0
- unidecode: 1.3.7
- unlzw3: 0.2.2
- urllib3: 2.0.7
- wandb: 0.16.1
- warc3-wet: 0.2.3
- warc3-wet-clueweb09: 0.2.5
- webencodings: 0.5.1
- werkzeug: 3.0.1
- wheel: 0.41.2
- yarl: 1.9.2
- zlib-state: 0.1.6
System:
- OS: Linux
- architecture:
  - 64bit
  - ELF
- processor: x86_64
- python: 3.10.13
- release: 5.15.0-88-generic
- version: #98~20.04.1-Ubuntu SMP Mon Oct 9 16:43:45 UTC 2023
I had this issue before. In my experiment, the cause was that the query was too long.
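One way to check whether query length is the cause is to cap the query texts before calling bsearch and see if the crash goes away. The following is only a minimal sketch: the helper `truncate_queries` and the limit `MAX_QUERY_TOKENS` are hypothetical names, the value 256 is an arbitrary starting point, and the `{"id": ..., "text": ...}` query layout is assumed to match what bsearch is being given. It uses whitespace splitting, which may differ from the retriever's own tokenizer.

```python
from typing import Dict, List, Tuple

# Hypothetical limit: not a documented threshold, tune it for your data.
MAX_QUERY_TOKENS = 256


def truncate_queries(
    queries: List[Dict[str, str]], max_tokens: int = MAX_QUERY_TOKENS
) -> Tuple[List[Dict[str, str]], List[str]]:
    """Cap each query text at max_tokens whitespace-separated tokens.

    Returns the capped queries and the ids of the queries that were shortened.
    Whitespace inside each text is normalized to single spaces as a side effect.
    """
    capped, shortened_ids = [], []
    for query in queries:
        tokens = query["text"].split()
        if len(tokens) > max_tokens:
            shortened_ids.append(query["id"])
            tokens = tokens[:max_tokens]
        # Copy the query so the caller's original list is left untouched.
        capped.append({**query, "text": " ".join(tokens)})
    return capped, shortened_ids


# Assumed query layout: a list of dicts with "id" and "text" keys.
queries = [
    {"id": "q_1", "text": "short query"},
    {"id": "q_2", "text": "word " * 1000},
]
safe_queries, shortened = truncate_queries(queries, max_tokens=256)
print(shortened)  # ['q_2']
```

Passing `safe_queries` to bsearch in place of the original queries, and lowering `max_tokens` if needed, should show whether long queries trigger the segmentation fault. The ids in `shortened` point to the queries worth inspecting first.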