Many recent studies have examined whether graph convolutional networks (GCNs) can handle Natural Language Processing tasks, particularly text classification. Although text classification with GCNs is widely studied, the graph construction choices behind it, such as node and edge selection and feature representation, and the most suitable GCN learning mechanism for text classification remain largely unexplored.
This project aims to perform multi-label text classification by framing it as a node classification problem and solving it with a GCN.
- Graph Construction: Building graphs using both document-word edges and word-word edges.
- GCN Utilization: Employing GCNs for multi-label text categorization.
- Insightful Analysis: Examining how different graph node and edge construction methods affect GCN training and testing for text classification.
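
The graph construction listed above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual code. It assumes TF-IDF weights for document-word edges and positive PMI over sliding windows for word-word edges, which is a common choice for corpus-level text graphs; the weighting the project really uses is not stated here. The function name `build_text_graph`, the `window` parameter, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_text_graph(docs, window=3):
    """Return (nodes, edges) for a corpus-level text graph.

    ASSUMPTION: TF-IDF weights document-word edges and positive PMI
    weights word-word edges; the project may weight edges differently.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    # Document nodes come first, followed by word nodes.
    nodes = [f"doc_{i}" for i in range(len(docs))] + vocab
    edges = {}

    # Document-word edges, weighted by TF-IDF (smoothed IDF).
    n_docs = len(docs)
    doc_freq = Counter(w for toks in tokenized for w in set(toks))
    for i, toks in enumerate(tokenized):
        for word, count in Counter(toks).items():
            tf = count / len(toks)
            idf = math.log((1 + n_docs) / (1 + doc_freq[word])) + 1
            edges[(f"doc_{i}", word)] = tf * idf

    # Word-word edges, weighted by PMI over sliding windows.
    windows = []
    for toks in tokenized:
        if len(toks) <= window:
            windows.append(toks)
        else:
            windows.extend(
                toks[j:j + window] for j in range(len(toks) - window + 1)
            )
    n_win = len(windows)
    word_count = Counter(w for win in windows for w in set(win))
    pair_count = Counter(
        pair for win in windows for pair in combinations(sorted(set(win)), 2)
    )
    for (a, b), c_ab in pair_count.items():
        pmi = math.log(c_ab * n_win / (word_count[a] * word_count[b]))
        if pmi > 0:  # keep only positively associated word pairs
            edges[(a, b)] = pmi
    return nodes, edges


# Toy corpus for illustration only (not taken from the dataset).
docs = [
    "stocks rise as markets rally",
    "team wins the cup final",
    "markets fall as stocks slide",
]
nodes, edges = build_text_graph(docs)
print(len(nodes), "nodes,", len(edges), "edges")
```

Word-word edges are stored once per unordered pair, so a symmetric adjacency matrix would need each of them mirrored before being passed to a GCN.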
- Source: BBC News dataset from Kaggle.
- Articles: 2,225.
- Categories: Business, Entertainment, Politics, Sport, Tech.
- Columns: Text and Category.
- Node Categorization: Visualization of node categorization for each text column in the CSV file.
- Document Nodes: Nodes representing documents.
- Word Nodes: Nodes representing words.
- Confusion Matrix
- F1 Score (Macro): 0.9295
- F1 Score (Weighted): 0.9318
- Accuracy: 0.9318
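
As a reminder of how these metrics relate to the confusion matrix, here is a small sketch that derives accuracy, macro F1, and weighted F1 from a matrix of counts. The function name `scores_from_confusion` and the toy 3-class matrix are hypothetical; the numbers are not the project's results, and the project itself may well compute its scores with a library instead.

```python
def scores_from_confusion(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    f1s, supports = [], []
    for k in range(n_classes):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                              # missed samples of class k
        fp = sum(cm[i][k] for i in range(n_classes)) - tp  # wrongly predicted as k
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
        supports.append(sum(cm[k]))
    accuracy = sum(cm[k][k] for k in range(n_classes)) / total
    # Macro F1: plain average over classes.
    macro_f1 = sum(f1s) / n_classes
    # Weighted F1: average weighted by the number of true samples per class.
    weighted_f1 = sum(f * s for f, s in zip(f1s, supports)) / total
    return accuracy, macro_f1, weighted_f1


# Toy 3-class matrix for illustration only, NOT the project's results.
toy_cm = [
    [8, 1, 1],
    [0, 9, 1],
    [1, 0, 9],
]
accuracy, macro_f1, weighted_f1 = scores_from_confusion(toy_cm)
print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
```

Macro F1 treats every class equally, while weighted F1 follows the class distribution, so the two diverge when classes are imbalanced.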
| No. | Code Functionality | % Complete | Runs Without Problems (Y/N) | Minor Issues |
|---|---|---|---|---|
| 1 | Preprocessing | 100 | Y | None |
| 2 | Keyword Extraction | 100 | Y | None |
| 3 | Constructing Edges | 100 | Y | None |
| 4 | Building the Graph | 100 | Y | None |
| 5 | Model Building | 100 | Y | None |
| 6 | Training using GCN | 100 | Y | None |
| 7 | Evaluation | 100 | Y | None |
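
Steps 5 and 6 in the table rest on the graph convolution operation. The sketch below shows a single forward pass of the standard GCN propagation rule, `H' = ReLU(A_norm · H · W)` with `A_norm = D^-1/2 (A + I) D^-1/2`, in plain Python. It is meant only to illustrate the idea: the project's real model, framework, layer sizes, and training loop are not described here, and the toy graph, the one-hot features, and the weight values are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense, non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps its own features.
    a_hat = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [
        [inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Dense matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(a_norm, features, weights):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


# Toy graph: node 0 is a document, nodes 1 and 2 are words it contains.
adj = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one-hot node features
weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]  # hypothetical, untrained weights
a_norm = normalize_adjacency(adj)
hidden = gcn_layer(a_norm, features, weights)
print(hidden)
```

In a node classification setup, a final layer of this kind would output one score per category for every node, with the loss typically computed on the labelled document nodes only.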
- Building graphs using text data.
- Performing node classification.
- Graph node and edge construction.
- The role of nodes and edges in corpus-level textual graphs.
- The impact of node embeddings, edge construction, and GCN learning.
- Model Accuracy: Improving model accuracy beyond the current 93%.
- Graph Building Efficiency: Finding more efficient approaches to graph construction.