/Thesis_vis

Primary language: JavaScript

This is a web-based interactive network visualization of the master's theses dataset from NOVA Information Management School (NOVA IMS), created using Gephi.


It supports exploration, browsing, navigation, and zoom. The visualization incorporates the main results of Latent Dirichlet Allocation (LDA), clustering algorithms, and network analysis. It shows the clusters, the interlinkages between documents, and the subjects covered in each thesis. In addition, useful metadata and text statistics for the documents can be embedded in the visualization.
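As an illustration of how interlinkages between documents might be derived from LDA topic weights, here is a minimal sketch. It assumes that two theses are linked when the cosine similarity of their topic-weight vectors reaches a threshold. The source does not state how the edges were actually computed, so the sample weights, the threshold, and the function names `cosineSimilarity` and `buildEdges` are all hypothetical.

```javascript
// Hypothetical topic weights per thesis (illustrative values, not from the dataset).
const theses = [
  { id: "t1", topics: [0.8, 0.1, 0.1] },
  { id: "t2", topics: [0.7, 0.2, 0.1] },
  { id: "t3", topics: [0.05, 0.05, 0.9] },
];

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat an all-zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumed linking rule: connect every pair whose similarity is at least `threshold`.
function buildEdges(nodes, threshold) {
  const edges = [];
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      const weight = cosineSimilarity(nodes[i].topics, nodes[j].topics);
      if (weight >= threshold) {
        edges.push({ source: nodes[i].id, target: nodes[j].id, weight });
      }
    }
  }
  return edges;
}

const edges = buildEdges(theses, 0.9);
console.log(edges);
```

With these sample values only `t1` and `t2` are linked, because both weight the first topic most heavily. An edge list of this shape could then be loaded into a graph tool for layout and clustering.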

Similar nodes (theses) are located close to each other in the network. Most similar nodes have the same colour, which means that they belong to the same cluster. This shows some similarities with the results obtained using SOM and Ward clustering, even though the network was based on only three of the eight topics. Each cluster found is described below:


Cluster 0 (Red): It is characterized by the topics “Health Management Education”, “Land_cover/Maps/Urbanism/Populations/Environmental” and “GIS/Spatial/Smart_cities/Maps/Technology”. Most of the theses in this cluster are from “Geospatial Technologies”, “Geographic Information Systems and Science” and “Clinical Research Management”.


Cluster 1 (Blue): It is characterized by the topics “Customer_satisfaction_&_behaviour/Marketing/Products/Brands” and “Busines/Customers/Companies/Management/Technology”. Most of the theses in this cluster are from “Marketing Intelligence” and “Marketing Research and CRM”. The cluster also has some theses from “Information Systems and Technologies Management”, “Knowledge Management and Business Intelligence” and “Data Science and Advanced Analytics”.


Cluster 2 (Dark moderate magenta): It is characterized by the topic “Busines/Customers/Companies/Management/Technology”. Most of the theses in this cluster are from “Information Systems and Technologies Management” and “Knowledge Management and Business Intelligence”. The cluster also has some theses from “Data Science and Advanced Analytics”. The theses in this cluster have many connections to other theses and cover a diverse range of topics.


Cluster 3 (Yellow): It is characterized by the topics “GIS/Spatial/Smart_cities/Maps/Technology” and “Busines/Customers/Companies/Management/Technology”. Most of the theses in this cluster are from “Geospatial Technologies” and “Geographic Information Systems and Science”, with some theses from “Knowledge Management and Business Intelligence”.


Cluster 4 (Pink): It is characterized by the topic “Machine_learning_algorithms” and has some weight on “Busines/Customers/Companies/Management/Technology” and “GIS/Spatial/Smart_cities/Maps/Technology”. Most of the theses in this cluster are from “Data Science and Advanced Analytics” and “Geospatial Technologies”. The cluster also has some theses from “Knowledge Management and Business Intelligence” and “Risk Analysis and Management”.


Cluster 5 (Gray): It is characterized by the topics “Risk/Bank/Insurance/Investiments/Marks/Law” and “Busines/Customers/Companies/Management/Technology”. Most of the theses in this cluster are from “Risk Analysis and Management”, “Law and Financial Markets” and “Information Systems and Technologies Management”.
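Statements such as "most of the theses in this cluster are from ..." can be checked by tallying master's programmes per cluster. The sketch below assumes each node carries a `cluster` id and a `program` label. These field names, the sample nodes, and the helpers `programsByCluster` and `topPrograms` are hypothetical and are not taken from the repository.

```javascript
// Hypothetical node records (illustrative only, not real dataset entries).
const nodes = [
  { id: "t1", cluster: 0, program: "Geospatial Technologies" },
  { id: "t2", cluster: 0, program: "Geospatial Technologies" },
  { id: "t3", cluster: 0, program: "Clinical Research Management" },
  { id: "t4", cluster: 1, program: "Marketing Intelligence" },
];

// Count theses per programme within each cluster.
// Returns Map<cluster id, Map<programme name, count>>.
function programsByCluster(nodeList) {
  const counts = new Map();
  for (const { cluster, program } of nodeList) {
    if (!counts.has(cluster)) counts.set(cluster, new Map());
    const perProgram = counts.get(cluster);
    perProgram.set(program, (perProgram.get(program) || 0) + 1);
  }
  return counts;
}

// Return the n most frequent programmes in one cluster, most frequent first.
function topPrograms(counts, cluster, n) {
  const perProgram = counts.get(cluster) || new Map();
  return [...perProgram.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, n)
    .map(([program]) => program);
}

const counts = programsByCluster(nodes);
console.log(topPrograms(counts, 0, 2));
```

The same tally could be applied to a dominant-topic field instead of `program` to summarise which topics characterize each cluster.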