This repository contains the solution of one of the groups participating in the LDAC2023 hackathon.
When adding content to the bSDD, you can link your classes and definitions to existing resources. However, many content creators do not link their terms -- they may not even be aware of which similar terms have been defined elsewhere. In this solution we provide code to suggest related terms within bSDD in three ways:
- Within a term's description, identify potential hyperlinks to bSDD terms.
- Identify similar terms based on a term's `name` and `description`:
  - Semantic similarity, which relies on `sentence-transformers/all-mpnet-base-v2` embeddings
  - Span overlap similarity, following some work on Intelligent Regulatory Compliancy (iReC)
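
As a rough illustration of the hyperlink and span overlap ideas above, here is a minimal sketch in plain Python. It is not the code from this repository: the function names (`find_link_candidates`, `word_spans`, `span_overlap`) are hypothetical, the hyperlink step is assumed to be simple whole-word matching of known term names, and the overlap score is assumed to be a Jaccard index over word n-grams, which may differ from what the iReC work actually does.

```python
import re


def find_link_candidates(description, term_names):
    """Return sorted (start, end, term) tuples for known term names in a description.

    Assumption: a hyperlink candidate is a whole-word, case-insensitive
    occurrence of an existing term name.
    """
    hits = []
    for term in term_names:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, description, flags=re.IGNORECASE):
            hits.append((match.start(), match.end(), term))
    return sorted(hits)


def word_spans(text, max_len=3):
    """Return the set of word n-grams (n = 1..max_len) of the lowercased text."""
    words = re.findall(r"\w+", text.lower())
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    }


def span_overlap(a, b, max_len=3):
    """Jaccard index between the span sets of two texts (0.0 if both are empty)."""
    spans_a, spans_b = word_spans(a, max_len), word_spans(b, max_len)
    if not spans_a and not spans_b:
        return 0.0
    return len(spans_a & spans_b) / len(spans_a | spans_b)


if __name__ == "__main__":
    # Illustrative data only; these are not actual bSDD entries.
    description = "A fire door installed in an external wall."
    terms = ["fire door", "external wall", "roof"]
    for start, end, term in find_link_candidates(description, terms):
        print(f"{description[start:end]!r} at {start}-{end} could link to {term!r}")
    print(span_overlap("fire door", "fire exit door"))
```

In this sketch, multi-word spans that two names share raise the score more than isolated shared words would, because each shared n-gram counts separately in the intersection.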

- bSDD page: http://bsdd.buildingsmart.org/
- Documentation: https://github.com/buildingSMART/bSDD
- Data model: https://github.com/buildingSMART/bSDD/.../bSDD
- Search GUI: https://search.bsdd.buildingSMART.org
- API docs: https://app.swaggerhub.com/apis/buildingSMART/Dictionaries/v1
- Management platform: https://manage.bsdd.buildingsmart.org/

Same three on test server: