stat-6990-dataset-structure

This repository is a template for a STAT 6990 project assignment in which students prepare a network dataset. The project emphasizes data preparation, reproducibility, and documentation.

A simple network is used as an example.

Repository Structure

  • network.gml: The network itself, stored in a standard format (GML, GraphML, JSON, or CSV) with all necessary attributes. GML is used here as an example.
  • /reproduction_files: A subfolder with all original data and the scripts or notebooks needed to transform the original data into the final network structure.
  • network_card.json: The network card providing a concise summary of the dataset, located in the base folder.
  • README.md: This document, explaining the repository's structure and purpose.
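
As a quick sanity check before submitting, the layout above can be verified with a short script. This is only a sketch using the Python standard library: the function name `missing_items` is hypothetical, and it assumes the network file is always named `network` followed by one of the four listed extensions, which this template does not explicitly require.

```python
from pathlib import Path

# Assumption: the network file is named "network" plus one of these extensions.
NETWORK_EXTENSIONS = (".gml", ".graphml", ".json", ".csv")
REQUIRED_FILES = ("network_card.json", "README.md")
REQUIRED_DIRS = ("reproduction_files",)


def missing_items(repo_root):
    """Return the expected files or folders that are absent from repo_root."""
    root = Path(repo_root)
    missing = []
    # Any one of the accepted network formats is enough.
    if not any((root / f"network{ext}").is_file() for ext in NETWORK_EXTENSIONS):
        missing.append("network.<gml|graphml|json|csv>")
    missing.extend(name for name in REQUIRED_FILES if not (root / name).is_file())
    missing.extend(name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir())
    return missing


if __name__ == "__main__":
    problems = missing_items(".")
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("Layout looks complete.")
```

Run it from the base folder of the repository. It only checks that the files exist, not that their contents are valid.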

License

For data, we recommend using a license that facilitates sharing and reuse with proper attribution. Suitable licenses include, but are not limited to:

  • CC-BY: This license lets others distribute, remix, adapt, and build upon your work, even commercially, as long as they credit you for the original creation.
  • CC0: A way to waive all rights and put the data in the public domain.
  • ODC-BY: Similar to CC-BY but specifically designed for data.

Choose a license that aligns with your intentions for data sharing and reuse. Add a LICENSE file in your repository with the chosen license text.

Resources

Final notes

Remember to add @jg-you as a collaborator if your repository is private.

Structure your project according to this template and consult the provided links for guidance on licenses and documentation.