Issues
- Convert model.bin (fp32) to model.bin (int8) (#1761, 0 comments)
- Support for ARM64 on Windows (#1756, 2 comments)
- Docker images not published (#1754, 2 comments)
- CI failing in several recent PRs (#1753, 1 comment)
- Llama 3.1 support please? (#1745, 1 comment)
- Introduce a better format for Whisper models (#1741, 5 comments)
- Falcon-11B support (#1737, 1 comment)
- [Feature Request] Expose Profiler to Python (#1736, 2 comments)
- Add gemma2 support (#1735, 1 comment)
- [Feature request] Mixed quantizations (#1730, 3 comments)
- Download a ready-to-use model (#1729, 4 comments)
- Gemma model - help needed (#1728, 4 comments)
- Adding a layer to an existing model? (#1726, 0 comments)
- Qwen2 support? (#1721, 1 comment)
- Does the CT2/OpenNMT engine support Qualcomm SoCs? (#1720, 2 comments)
- T5 inference result is all <pad> (#1719, 18 comments)
- CTranslate2 exceeds the 20 GB PyPI project size limit (#1712, 9 comments)
- Converter not working for NLLB models (#1711, 3 comments)
- CUDA DeviceAllocate segfault (#1709, 2 comments)
- Support for Phi-3 Small, Medium, and Vision (#1707, 3 comments)
- Doesn't build without Docker: libiomp5 not found (#1703, 6 comments)
- Option --self_attn_type scaled-dot-flash is not supported (supported values are: scaled-dot) (#1702, 1 comment)
- CMake error when building CTranslate2 from source with CUDA support enabled on Windows (#1697, 2 comments)
- opus-mt-en-zh does not respect the end token (#1694, 5 comments)
- Can't hide GPUs from get_cuda_device_count() (#1693, 3 comments)
- How to compile from source on Windows 11? (#1692, 4 comments)
- target_prefix latency (#1689, 4 comments)
- [SOLVED] Running Llama 3 with CTranslate2 (#1688, 2 comments)
- Dynamic LoRA switching (#1686, 3 comments)