Analyzing the virtual weight connectomes in GPT-like transformers
Primary language: Jupyter Notebook. License: MIT.