This repository applies a few network analysis methods, which are based on graph theory, to sample data. It does not provide:
- Deep academic or technical explanations
- Meaningful background (I just played with the data)
- Insightful results
$ pyenv --version
pyenv 1.2.18
$ python --version
Python 3.7.6
local
$ python -m venv .venv
$ source .venv/bin/activate
$ pip install -r requirements.txt
$ jupyter lab
docker
$ docker build -t <image-name> .
$ docker run -it -p 8888:8888 <image-name>
# Optional: also start a PlantUML server
# $ docker run -d --rm -p 8080:8080 plantuml/plantuml-server:jetty
Then open localhost:8888 in your browser. JupyterLab will probably ask for an access token, which is already printed on the console.
The data was extracted from collections.abc.
On the diagram only, I wrote down some methods to show which ones are declared as abstract methods and which ones are added by each class. Source is here.
Notes
- Abstract methods (`@abstractmethod`) are separated from regular ones by a horizontal line
- Return types are omitted (I'm not confident about them)
- `cls` means `@classmethod`
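As a rough illustration of how the class relationships and abstract methods shown on the diagram can be read directly from `collections.abc`, here is a minimal sketch. The helper name `describe_abcs` is hypothetical and is not taken from this repository; it relies only on the standard `__bases__` and `__abstractmethods__` attributes.

```python
import collections.abc
import inspect


def describe_abcs():
    """Map each class name in collections.abc to (direct base names, abstract method names)."""
    result = {}
    for name in collections.abc.__all__:
        cls = getattr(collections.abc, name)
        if not inspect.isclass(cls):
            continue
        # Direct parents only; `object` is dropped because it carries no information here.
        bases = [base.__name__ for base in cls.__bases__ if base is not object]
        # ABCMeta stores the names of still-abstract methods in __abstractmethods__.
        abstract = sorted(getattr(cls, "__abstractmethods__", ()))
        result[name] = (bases, abstract)
    return result


info = describe_abcs()
for name in ("Iterable", "Iterator", "Collection"):
    bases, abstract = info[name]
    print(name, "extends", bases, "abstract:", abstract)
```

Each (class, base) pair corresponds to one "extends" relationship, and the abstract method names are the ones drawn above the horizontal line.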
The data follows the Gremlin style because I aim to insert it into AWS Neptune. I expect to work with graph databases in real business situations, so I chose to store the data in this style with further improvements in mind.
$ head -n 5 data/vetices.csv
~id,name:String
v0,"Container"
v1,"Hashable"
v2,"Iterable"
v3,"Iterator"
$ head -n 5 data/edges.csv
~id,~from,~to,~label
e0,v3,v2,extends
e1,v4,v2,extends
e2,v5,v3,extends
e3,v8,v6,extends
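To make the file layout concrete, the following sketch parses the two CSV files with the standard `csv` module. It is an assumption about how one might read them, not the notebook's actual code. The function name `load_graph` is hypothetical, and the inline strings simply repeat the rows shown above so the example runs without the data files.

```python
import csv
import io

# The rows printed by `head` above, inlined so the sketch is self-contained.
VERTICES_CSV = '''~id,name:String
v0,"Container"
v1,"Hashable"
v2,"Iterable"
v3,"Iterator"
'''

EDGES_CSV = '''~id,~from,~to,~label
e0,v3,v2,extends
e1,v4,v2,extends
e2,v5,v3,extends
e3,v8,v6,extends
'''


def load_graph(vertices_file, edges_file):
    """Return ({vertex id: name}, [(from id, to id, label), ...])."""
    vertices = {row["~id"]: row["name:String"] for row in csv.DictReader(vertices_file)}
    edges = [(row["~from"], row["~to"], row["~label"]) for row in csv.DictReader(edges_file)]
    return vertices, edges


vertices, edges = load_graph(io.StringIO(VERTICES_CSV), io.StringIO(EDGES_CSV))

# Replace ids with names where the name is known; ids outside the sample stay as-is.
readable = [
    f"{vertices.get(source, source)} {label} {vertices.get(target, target)}"
    for source, target, label in edges
]
print(readable[0])
```

With the real files, the same function can be called on two handles opened with `open(..., newline="")`.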
If you are interested in AWS Neptune, see here.
I computed three centrality measures to analyze this network in detail. Centrality expresses how important a node is in a network, so the results show which nodes (= classes) are important.
Notebook: here
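To give a feel for what a centrality score looks like, here is a minimal pure-Python sketch of degree centrality. Whether degree centrality is one of the three measures used in the notebook is an assumption on my part, and the edge list is a simplified subset of the real `collections.abc` hierarchy (for example, the Reversible parent of Sequence is left out). The function name is hypothetical.

```python
from collections import Counter


def degree_centrality(nodes, edges):
    """Degree centrality: number of incident edges divided by (n - 1)."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    # Guard against division by zero for a single-node graph.
    scale = max(len(nodes) - 1, 1)
    return {node: degree[node] / scale for node in nodes}


# Simplified subset of the hierarchy: (subclass, parent) pairs.
nodes = ["Sized", "Iterable", "Container", "Collection", "Set", "Mapping", "Sequence"]
edges = [
    ("Collection", "Sized"),
    ("Collection", "Iterable"),
    ("Collection", "Container"),
    ("Set", "Collection"),
    ("Mapping", "Collection"),
    ("Sequence", "Collection"),
]

centrality = degree_centrality(nodes, edges)
for name, score in sorted(centrality.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

In this toy graph Collection touches every edge, so it gets the highest score. A graph library would normally be used for the real analysis; the hand-written version only shows the idea.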
By combining these three results, I made a ranking table. A smaller sum means a higher overall rank. As you know, Set, Mapping and Sequence correspond to set, dict and list. These types are essential for understanding and using Python in development, so the result looks good and matches our intuition and experience.
Class name | Sum of rank values |
---|---|
Collection | 3 |
Set | 8 |
Sequence | 12.5 |
MappingView | 16.5 |
Mapping | 23.5 |
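The fractional values in the table (12.5, 16.5, 23.5) suggest that tied classes share an averaged rank. That is my reading and is not confirmed by the notebook. Under that assumption, a sum-of-ranks table could be sketched as follows. The function names and the scores are hypothetical and are not the real results.

```python
def average_ranks(scores):
    """Rank scores in descending order (1 = highest); tied scores share the mean rank."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j to cover every item tied with the item at position i.
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for name in ordered[i:j + 1]:
            ranks[name] = mean_rank
        i = j + 1
    return ranks


def sum_of_ranks(*measures):
    """Add up each node's rank across several centrality measures."""
    totals = {}
    for scores in measures:
        for name, rank in average_ranks(scores).items():
            totals[name] = totals.get(name, 0) + rank
    return totals


# Hypothetical scores for two measures; B and C tie in the first one.
measure_a = {"A": 0.9, "B": 0.5, "C": 0.5}
measure_b = {"A": 0.3, "B": 0.2, "C": 0.1}

totals = sum_of_ranks(measure_a, measure_b)
for name, total in sorted(totals.items(), key=lambda item: item[1]):
    print(name, total)
```

Sorting by the total in ascending order puts the node that ranks best across all measures first.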
These resources were very helpful for this work. Arigatou!
Tools
Extensions
Libraries
UML
- PlantUML Overview
- Class Diagram
- UML Tutorial
- Easily building a blazing-fast UML preview environment with Visual Studio Code + PlantUML Server on Docker
Python
Network Analysis & Graph Theory