
Project measuring the proximity between cybersecurity technologies as time series, based on bibliometric data taken from OpenAlex.


Proximity-indices-between-technologies-applied-to-OpenAlex


Table of Contents
  1. About The Project
  2. Getting Started
  3. Contributing
  4. Contact
  5. Acknowledgments

About The Project

This project aims to build and track the evolution of proximity measures between cybersecurity technologies, expressed as time series, based on bibliometric (text mining) data taken from OpenAlex.

The project is divided into three parts: (i) the collection of raw data, (ii) the exploration and transformation (EDA) of the data, and (iii) the creation, visualization and forecasting of proximity indices (extracted from the raw data). Each part is contained in its own folder. Each folder contains a file called "directory_file"; running this file runs all the other files in the folder in the right order. Some libraries must be installed before running the directory file; they are listed under "Getting Started".
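The idea of a "directory_file" that runs the other files of a folder in a fixed order can be sketched as follows. This is only an illustration, not the repository's actual driver: it assumes the files are plain Python scripts (the repository may use notebooks instead), and the function name `run_in_order` and any script names passed to it are hypothetical.

```python
import runpy
from pathlib import Path


def run_in_order(folder, script_names):
    """Run each script in `folder` one after another, in the given order.

    Execution stops at the first script that is missing or raises an error,
    so later steps never run on incomplete data.
    """
    folder = Path(folder)
    completed = []
    for name in script_names:
        script = folder / name
        if not script.is_file():
            raise FileNotFoundError(f"Missing script: {script}")
        print(f"Running {script} ...")
        # run_path executes the file as if it had been launched directly.
        runpy.run_path(str(script), run_name="__main__")
        completed.append(name)
    return completed
```

Listing the scripts explicitly, rather than running whatever the folder happens to contain, keeps the execution order visible in one place.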

(back to top)

Built With

(back to top)

Structure of the repository

The repository contains several folders, each dedicated to a specific task of the project.

  • .idea
    This folder is generated automatically by JetBrains IDEs (such as PyCharm), not by GitHub. It is not needed to run the code.
  • creation_data_and_variables
    This folder contains all the files that create the different pandas dataframes of variables.
  • exploratory_analysis
    This folder contains all the files that explore and visualize the data generated in the previous folder.
  • indices_proximity
    This folder contains all the files that generate, explore, visualize, process, cluster and forecast the time series of technological proximity indices.

(back to top)

Getting Started

To get a local copy up and running, follow the steps below.

Prerequisites

The following libraries are required to run the code. Each entry gives the version constraint followed by an install command.

  • keybert~=0.6.0
    pip install keybert
  • tqdm~=4.64.1
    conda install -c conda-forge tqdm
  • nltk~=3.7
    conda install -c anaconda nltk
  • pandas~=1.4.4
    conda install -c anaconda pandas
  • numpy~=1.22.4
    conda install -c anaconda numpy
  • sktime~=0.14.1
    conda install -c conda-forge sktime
  • yellowbrick~=1.5
    conda install yellowbrick=1.5
  • torch~=1.13.1
    conda install -c pytorch pytorch
  • tslearn~=0.5.2
    conda install tslearn=0.5.2
  • darts~=0.21.0
    conda install -c conda-forge darts
  • scikit-learn~=1.0.2
    conda install scikit-learn=1.0.2
  • scipy~=1.9.3
    conda install scipy=1.9.3
  • seaborn~=0.12.0
    conda install seaborn=0.12.0
  • optuna~=2.10.1
    conda install optuna=2.10.1
  • matplotlib~=3.5.3
    conda install matplotlib=3.5.3
  • requests~=2.28.1
    conda install requests=2.28.1
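Before starting a run, it can help to check that the installed versions match the constraints listed above. The sketch below uses only the standard library (`importlib.metadata`). The helper names `is_compatible` and `check_environment` are illustrative, and the version comparison is a simplified approximation of the `~=` operator that ignores pre-release and other non-numeric suffixes.

```python
from importlib.metadata import PackageNotFoundError, version

# Version constraints copied from the list above (all use the ~= operator).
REQUIREMENTS = {
    "keybert": "0.6.0", "tqdm": "4.64.1", "nltk": "3.7", "pandas": "1.4.4",
    "numpy": "1.22.4", "sktime": "0.14.1", "yellowbrick": "1.5",
    "torch": "1.13.1", "tslearn": "0.5.2", "darts": "0.21.0",
    "scikit-learn": "1.0.2", "scipy": "1.9.3", "seaborn": "0.12.0",
    "optuna": "2.10.1", "matplotlib": "3.5.3", "requests": "2.28.1",
}


def _numeric_parts(ver):
    """Turn '1.13.1+cu117' into (1, 13, 1); stop at the first non-numeric piece."""
    parts = []
    for piece in ver.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_compatible(required, installed):
    """Approximate `~=`: same leading components, and installed >= required."""
    req = _numeric_parts(required)
    inst = _numeric_parts(installed)
    prefix = req[:-1]  # every component except the last one must match exactly
    return inst[:len(prefix)] == prefix and inst >= req


def check_environment(requirements=REQUIREMENTS):
    """Return a list of human-readable problems; an empty list means all is fine."""
    problems = []
    for name, required in requirements.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed (need ~={required})")
            continue
        if not is_compatible(required, installed):
            problems.append(f"{name}: found {installed}, need ~={required}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["All listed libraries look compatible."]:
        print(line)
```

Packages are looked up by their distribution name, which is assumed here to match the names in the list; a package installed under a different distribution name would be reported as missing.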

Installation and use

Follow the steps below to install and run the project. The libraries listed under Prerequisites must be installed first.

  1. Download all the folders of the repository to your machine.
  2. Manually download "m4_monthly.pkl" and "m4_monthly_scaled.pkl" from https://github.com/unit8co/amld2022-forecasting-and-metalearning/tree/main/data and put them in the folder called "indices_proximity". These files are used for the transfer learning part; they could not be downloaded automatically because of technical problems.
  3. Install all the libraries listed under Prerequisites.
  4. Once this is done, run the main directory file, which then runs all the other files. Note that the full computation may take approximately a week.
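Since the full run is long, it may be worth confirming that the two manually downloaded files from step 2 are in place before launching anything. The sketch below only checks for their presence and does not download them. The function name `missing_m4_files` is illustrative, and it assumes the script is run from the repository root.

```python
from pathlib import Path

# File names and target folder as given in step 2 above.
REQUIRED_FILES = ("m4_monthly.pkl", "m4_monthly_scaled.pkl")


def missing_m4_files(repo_root=".", folder="indices_proximity"):
    """Return the names of the required M4 files that are not in `folder`."""
    target = Path(repo_root) / folder
    return [name for name in REQUIRED_FILES if not (target / name).is_file()]


if __name__ == "__main__":
    missing = missing_m4_files()
    if missing:
        print("Missing from indices_proximity:", ", ".join(missing))
    else:
        print("Both M4 files are in place.")
```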

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

Contact

Alessandro Tavazzi
Email address - tavazale@gmail.com
LinkedIn - https://www.linkedin.com/in/alessandro-tavazzi-237bb1201/

Project Link: https://github.com/technometrics-lab/Proximity-indices-applied-to-OpenAlex

(back to top)