First Open-Source Industry-Specific Model for Semiconductors

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

SEMIKONG: The First Semiconductor Industry-Specific Large Language Model

SEMIKONG is an open-source large language model (LLM) built specifically for the semiconductor industry. It aims to address challenges particular to this domain, such as reasoning about the physics and chemistry of semiconductor devices and manufacturing processes, by incorporating domain-specific knowledge into the model.

Key Features

  • First industry-specific LLM for the semiconductor domain
  • Trained on a comprehensive semiconductor-related text corpus
  • Novel pre-training approach leveraging domain-specific knowledge
  • Superior performance compared to general-purpose LLMs on industry-relevant benchmarks
  • Serves as a valuable foundation for companies to build proprietary models tailored to their needs

Contributions

This project is the result of a collaborative effort involving multiple companies and individuals.

We would like to express our gratitude to the AI Alliance (https://thealliance.ai) for providing the impetus, resources, and platform for this work, and for its collaboration in open science. We also extend our thanks to the member organizations of the AI Alliance and to their researchers and engineers for their valuable contributions to this study, including:

  • Noritaka Yokomori (Tokyo Electron)
  • Anthony Annunziata (IBM Research)
  • Sean Hughes (ServiceNow)
  • Phong Nguyen (FPT Software, AI Center)

Their expertise, insights, and collaborative spirit have been instrumental in advancing our research.

Getting Started

To get started with SemiKong, please refer to the installation guide and usage instructions.

License

This project is licensed under the Apache License 2.0.

Citation

If you use SemiKong in your research or development work, please cite our paper:

@article{semikong2024,
  title={SemiKong: Curating, Training, and Evaluating A Semiconductor Industry-Specific Large Language Model},
  author={Nguyen, Christopher and others},
  journal={arXiv preprint arXiv:2024.xxxxx},
  year={2024}
}

Contact

For questions, suggestions, or collaborations, please contact the SemiKong team at semikong@aitomatic.com.