Hierarchical Classification Domain Dictionary Construction for Process Industry Based on Knowledge Graph
South China University of Technology & Pengcheng Laboratory
This project focuses on constructing a hierarchical classification domain dictionary for the process industry, leveraging knowledge graphs. The process industry, encompassing metallurgy, petroleum, chemicals, building materials, and electricity, is essential for economic and societal development but faces challenges such as insufficient automation and skilled labor shortages. Our method includes efficient entity recognition and relationship extraction, effective updating and expansion of industrial entities, and systematic mapping of entity categories to build a comprehensive dictionary. This provides robust technical support for industrial knowledge management and intelligent manufacturing.
- Hierarchical Classification: Organizes domain-specific terms and concepts in a hierarchical structure for efficient data management and retrieval.
- Knowledge Graph Integration: Utilizes knowledge graphs to interlink related concepts and provide context-aware information retrieval.
- Scalability: Designed to handle large-scale industrial data with the ability to scale as needed.
- Flexibility: Easily adaptable to different domains within the process industry.
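As a rough illustration of the hierarchical organization described above, the following minimal sketch stores each term under a single parent category and supports path and subtree lookups. The class name `HierarchicalDictionary`, its methods, and the sample terms are hypothetical and are not taken from the project code; the sketch also assumes the hierarchy contains no cycles.

```python
from collections import defaultdict


class HierarchicalDictionary:
    """Toy hierarchical dictionary: each term has at most one parent category."""

    def __init__(self):
        self.parent = {}                   # term -> parent category (None for roots)
        self.children = defaultdict(list)  # category -> direct children, in insertion order

    def add(self, term, parent=None):
        """Register a term, optionally under a parent category."""
        if parent is not None and parent not in self.parent:
            self.parent[parent] = None     # unseen parents start out as roots
        self.parent[term] = parent
        if parent is not None:
            self.children[parent].append(term)

    def path(self, term):
        """Return the category path from the root down to the term."""
        chain = []
        node = term
        while node is not None:
            chain.append(node)
            node = self.parent[node]
        return chain[::-1]

    def descendants(self, category):
        """Return every term below a category, depth-first."""
        found = []
        stack = list(reversed(self.children.get(category, [])))
        while stack:
            node = stack.pop()
            found.append(node)
            stack.extend(reversed(self.children.get(node, [])))
        return found


# Illustrative terms only; the real dictionary is built from the knowledge graph.
d = HierarchicalDictionary()
d.add("metallurgy", "process industry")
d.add("blast furnace", "metallurgy")
d.add("petroleum", "process industry")
d.add("catalytic cracking", "petroleum")

print(d.path("blast furnace"))
print(d.descendants("process industry"))
```

A single-parent tree keeps lookups simple. If a term can belong to several categories, the `parent` mapping would need to hold a list of parents instead.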
To install and set up this project, follow these steps:
- Clone the Repository
  git clone https://github.com/ecjtulrh/IndustryDictionary.git
  cd code
- Install Dependencies
  pip install -r requirements.txt
- Set Up Neo4j Database
  - Download and install Neo4j from the Neo4j Download Center
  - Start the Neo4j database and set up a new project
  - Import the provided data into Neo4j
- Configure Settings
  - Update the config.json file with your Neo4j database credentials and other configuration settings
- Run the Entity Recognition and Relationship Extraction
  python entity_recognition.py
- Update and Expand Entities
  python update_entities.py
- Build the Hierarchical Classification Dictionary
  python build_dictionary.py
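For the Configure Settings step, it can help to check config.json before running the scripts. The sketch below loads the file and fails early when a Neo4j setting is missing. The key names `neo4j_uri`, `neo4j_user`, and `neo4j_password` and the function `load_config` are assumptions made for illustration; check the config.json shipped with the repository for the actual keys.

```python
import json
from pathlib import Path

# Hypothetical key names; the real config.json may use different ones.
REQUIRED_KEYS = ("neo4j_uri", "neo4j_user", "neo4j_password")


def load_config(path="config.json"):
    """Read the JSON config and fail early if a Neo4j setting is missing or empty."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"config is missing: {', '.join(missing)}")
    return config
```

With the official Neo4j Python driver installed, the loaded values could then be passed to `GraphDatabase.driver(uri, auth=(user, password))`. Whether the project scripts connect this way is not stated here.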
For detailed experimental results, comparisons with baseline models, and error analysis, refer to the paper.
We welcome contributions to enhance this project. Please follow these steps:
- Fork the repository
- Create a new branch (git checkout -b feature-branch)
- Commit your changes (git commit -am 'Add new feature')
- Push to the branch (git push origin feature-branch)
- Create a new Pull Request
This project is licensed under the MIT License. See the LICENSE file for more details.
We thank all the contributors and the community for their valuable input and support.
KGDD supports two ways to configure the environment: manual setup or a Docker image. Choose whichever suits your setup.
conda create -n kgdd python=3.8 -y
conda activate kgdd
pip install torch==1.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install -r requirements.txt
Docker image: coming soon.
If you use the code and datasets, please cite the following paper (not yet published).