KOSMOS-2 is designed to handle text and images simultaneously, and redefine the way we perceive and interact with multimodal data, KOSMOS-2 is built on a Transformer-based causal language model architecture, similar to other renowned models like LLaMa-2 and Mistral AI's 7b model.
inuwamobarak/KOSMOS-2
KOSMOS-2 is designed to handle text and images simultaneously, and redefine the way we perceive and interact with multimodal data, KOSMOS-2 is built on a Transformer-based causal language model architecture, similar to other renowned models like LLaMa-2 and Mistral AI's 7b model.
Jupyter Notebook