Pinned Repositories
conferences
defcon_grt_notebook: Quickstart notebook for the DEF CON 32 Generative Red-teaming Challenge
marque: Minimal workflows
paperstack: arXiv + Notion sync
parley: Tree of Attacks with Pruning (TAP) jailbreaking implementation
research: General research for Dreadnode
rigging: Lightweight LLM interaction framework
TensorRT-LLM: An easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines containing state-of-the-art optimizations for efficient inference on NVIDIA GPUs, along with components for creating Python and C++ runtimes that execute those engines
tensorrtllm_backend: The Triton TensorRT-LLM backend
transformers-neuronx
dreadnode's Repositories
dreadnode/rigging: Lightweight LLM interaction framework
dreadnode/parley: Tree of Attacks with Pruning (TAP) jailbreaking implementation
dreadnode/research: General research for Dreadnode
dreadnode/conferences
dreadnode/marque: Minimal workflows
dreadnode/paperstack: arXiv + Notion sync
dreadnode/defcon_grt_notebook: Quickstart notebook for the DEF CON 32 Generative Red-teaming Challenge
dreadnode/TensorRT-LLM: An easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines containing state-of-the-art optimizations for efficient inference on NVIDIA GPUs, along with components for creating Python and C++ runtimes that execute those engines
dreadnode/tensorrtllm_backend: The Triton TensorRT-LLM backend
dreadnode/transformers-neuronx
dreadnode/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs