A list of awesome AI in libraries, archives, and museum collections from around the world 🕶️

Awesome AI for LAM

A curated list of resources, projects, and tools for using Artificial Intelligence in Libraries, Archives, and Museums.

Introduction

This list is a collection of resources, tools, projects, and other materials for professionals and enthusiasts in the Libraries, Archives, and Museums (LAM) sector. You might also know this as the GLAM (galleries, libraries, archives and museums) or CHI (cultural heritage institutions) sector, or be more familiar with the term 'memory institutions'. However you describe the field, if you know of an AI, machine learning, big data or data science project, event or resource related to collections, please share it here!

This list is maintained by the AI4LAM community. Its aim is to support knowledge sharing, innovation, and collaboration in applying AI to LAM.

Learning Resources

Please note: the appearance of a resource on this list does not constitute an official endorsement by AI4LAM.

Introductions to AI

Computer vision

Natural language processing

Generative AI

AI in galleries, libraries, archives and museums

Other "awesome" lists in AI and ML

Tools and Frameworks

Note: datasets for training and testing are listed in a separate section of this document.

Document analysis, transcription, and labeling

  • Arkindex – open-source platform for managing & processing collections of digitized documents
  • Callico – open-source web platform for document annotation
  • Coconut Libtool – web-based textual analysis tool designed to assist social scientists, librarians, or anyone in data analysis
  • Distributed Annotation 'n' Enrichment (DANE) – compute task assignment & file storage for automatic annotation of content (CLARIAH, The Netherlands)
  • HTRFLOW demo and associated GitHub repo – explore AI models for Handwritten Text Recognition (Swedish National Archives)
  • Label Studio – data labeling platform to fine-tune LLMs, prepare training data, or validate AI models
  • OCR correction – OCR correction tools (Bibliothèque nationale, Luxembourg)
  • Surya – multilingual document OCR toolkit with line-level text detection
  • Text models from the National Library of Sweden – available on Hugging Face
  • Transkribus – transcription, recognition, & searching of historical documents

Audio and video analysis, transcription, and labeling

  • Acoustic models from the National Library of Sweden – available on Hugging Face
  • Annotorious – JavaScript image annotation library
  • Audiovisual Metadata Platform (AMP) – generation of metadata for discovery & use of digital audio & video collections (Indiana U., USA)
  • CAMPI – Computer-Aided Metadata Generation for Photo archives Initiative (Carnegie Mellon U., USA)
  • ELAN – adds textual annotations to audio and/or video recordings (Max Planck Institute for Psycholinguistics, The Netherlands)
  • inaFaceAnalyzer – Python toolbox for face-based description of gender representation in media (Institut National de l'Audiovisuel, France)
  • Newspaper Navigator – explore visual & textual content in the Chronicling America digitized newspaper collection (Library of Congress, USA)
  • Oodi – virtual information assistant (Helsinki Central Library)
  • ReTV – video analysis & summarization (Modul University, Austria)
  • VGG Image Annotator – manual annotation software for image, audio and video

Indexing and classification

  • Annif and associated tutorial – tool for automated subject indexing and classification (National Library of Finland)

Search and retrieval

Applications of Transformers, LLMs, and GPT

  • BERTopic – topic modeling technique that leverages Transformers and c-TF-IDF
  • Chatbot for Luxembourgish newspapers – uses ChatGPT and understands French, German and English (Bibliothèque nationale de Luxembourg)
  • Norwegian Transformer Model (NoTraM) – transformer model for Norwegian and Nordic languages (National Library of Norway)
  • Swedish BERT – BERT model for the Swedish language (Royal Library of Sweden)
  • Visual AI – open-world interpretable visual transformer (UK)

Datasets

Datasets available on Hugging Face

There are many (G)LAM-related datasets on Hugging Face. The following links will perform live searches directly in Hugging Face for datasets tagged with the given terms:
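A comparable search link can also be built programmatically. The sketch below is only an illustration: the helper name `dataset_search_url` and the example terms are hypothetical, and it uses the dataset browser's free-text `search` query parameter rather than tag filters, so results may differ from a tag-based search.

```python
from urllib.parse import urlencode

# Base URL of the Hugging Face dataset browser.
HF_DATASETS_URL = "https://huggingface.co/datasets"


def dataset_search_url(term: str) -> str:
    """Return a dataset-browser URL that runs a free-text search for `term`."""
    # Assumption: a free-text search is close enough for discovery purposes;
    # it is not the same as filtering by tag.
    return f"{HF_DATASETS_URL}?{urlencode({'search': term})}"


# Hypothetical example terms; substitute whichever terms you care about.
EXAMPLE_TERMS = ["glam", "lam", "cultural heritage"]

if __name__ == "__main__":
    for term in EXAMPLE_TERMS:
        print(dataset_search_url(term))
```

Opening a generated URL in a browser runs the search live, so the results reflect whatever is on Hugging Face at that moment.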

Datasets available elsewhere

Projects, Initiatives, and Case Studies

Project lists & directories

Select individual projects

Policies and recommendations

Statements by organizations and government bodies

Surveys of policies and recommendations

Frameworks

Conferences and Workshops

The annual Fantastic Futures conference is the main conference series for the AI4LAM community. Various other conferences and workshops are relevant to the community and may be included in the list below.

Upcoming Conferences and Workshops

👋🏻 Note: AI4LAM's conferences tracker Google sheet has a more complete list of events. The following is a list of larger and/or especially relevant events for AI4LAM.

Past Conferences and Workshops

  • Fantastic Futures 2018 – Dec. 5 at the National Library of Norway, Oslo, Norway.
  • Fantastic Futures 2019 – Dec. 4–6 at Stanford University, Stanford, California, USA.
  • Fantastic Futures 2021 – Dec. 8–10 at the Bibliothèque nationale de France, Paris, France.
  • Fantastic Futures 2022 – Nov. 30–Dec. 2 virtual event hosted by the British Library, London, England.
  • ai4Libraries Conference – Oct. 19 virtual event hosted by Georgia Tech Library, Atlanta, Georgia, USA.
  • Fantastic Futures 2023 – Nov. 15–17 at Internet Archive Canada Headquarters, Vancouver, British Columbia, Canada.
  • Fantastic Futures 2024 – Oct. 15–18 at The National Film and Sound Archive of Australia (NFSA) in Canberra, Australia.

Publications and News Sources

Journals and Magazines

News sources

Community

The AI4LAM community's home page is https://ai4lam.org. The secretariat and other contact addresses can be found at the About page.

Contributions

Your help and participation in enhancing this awesome list are very much welcome! Please use the issue ticket system to request additions or changes, or to make other contributions to this repository. For more information, please visit the guidelines for contributing.

License

The contents of this page are licensed under the Creative Commons CC0 1.0 Universal license. CC0 is a “no rights reserved” license; the authors relinquish copyright and similar rights to the contents of the Awesome AI for LAM list.