• Used Spark for data cleaning
  • Used Elasticsearch as the vector database for semantic search
  • Ran the Mistral LLM with static quantization; inference runs on ONNX with the TensorRT runtime
  • Deployed with Docker and Kubernetes on GCP
  • Replacing Mistral with Gemma
  • Adding Bark (text-to-speech) and Whisper (speech-to-text)
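
Note on the static quantization item: a minimal sketch of the idea, assuming per-tensor asymmetric int8 quantization. The notes do not say which scheme or toolchain was actually used, and the function names (`calibrate`, `quantize`, `dequantize`) are hypothetical. "Static" here means the scale and zero-point are fixed ahead of time from calibration data, not recomputed per input at inference.

```python
from typing import List, Tuple


def calibrate(samples: List[float]) -> Tuple[float, int]:
    """Derive a fixed int8 scale and zero-point from calibration values."""
    # Force the range to include 0.0 so that zero maps to an exact integer.
    lo = min(min(samples), 0.0)
    hi = max(max(samples), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all-zero calibration set: any scale works
    zero_point = round(-128 - lo / scale)
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a float to int8 using the precomputed parameters, clamping outliers."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover an approximate float from its int8 code."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    # Hypothetical calibration activations, not data from the project.
    scale, zp = calibrate([-1.0, 0.0, 0.5, 2.0])
    for x in (-1.0, 0.0, 0.5, 2.0, 10.0):
        q = quantize(x, scale, zp)
        print(x, q, dequantize(q, scale, zp))
```

Values outside the calibrated range (such as 10.0 above) are clamped, which is why the calibration set should be representative of real inputs.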