doctoral-research-publication

This repository contains data collected and analyzed as part of a doctoral research project that explores the improvement of the consistency of architectural diagrams of distributed systems through the use of system descriptors. The data stem from two distinct research methodologies: a survey and an empirical experiment, detailed as follows:

Survey Data:

Objective: To analyze the perception of software engineering practitioners regarding the quality and consistency of architectural diagrams of distributed systems within the software industry. This survey aims to shed light on the issue of architectural diagram inconsistency and highlight the most common problems faced by software architecture professionals.

Methodology: The survey was conducted online and combined closed and open-ended questions to collect both quantitative and qualitative responses.

Participants: 132 software architecture practitioners with varied levels of experience, from different geographical regions.

Data Included: Survey responses (anonymized), statistical analysis of quantitative responses, and a synthesis of qualitative responses.

Empirical Experiment Data:

Objective: To investigate the impact of using system descriptors, such as Docker Compose and Kubernetes, on the consistency of architectural diagrams of distributed systems.

Methodology: We compared two categories of software architectural diagrams: ad-hoc diagrams and descriptor-based diagrams. Ad-hoc diagrams are created with general-purpose design and authoring tools such as Visio or Draw.io, while descriptor-based diagrams can be generated automatically from system descriptors, such as Docker Compose or Kubernetes. To compare the effectiveness of these two diagram categories, we conducted an empirical study with software engineers and architects. Participants were given a brief description of a distributed system and a diagram from one of the two categories. They then answered a set of questions based on the system description and the presented diagram. We collected data on three variables: the score of correct answers (variable C), the time each participant took to answer the questions (variable T), and participants' perception of how easily they could extract information from the diagrams (variable P).

Participants: 26 professionals and advanced students from the information systems and computing area.

Data Included: Generated architectural diagrams, the system descriptors used, comparative analysis of the consistency of the diagrams, and participants' feedback on the experience.

Data Format:

The data are made available in accessible formats for analysis: CSV and XLS for quantitative data, PDF for architectural diagrams, and text documents for qualitative responses and feedback.
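As a starting point for working with the CSV files, the following minimal sketch averages the three experiment variables (C, T, P) for each diagram category. It is only an illustration: the column names (`diagram_category`, `C`, `T`, `P`) and the helper functions `load_rows` and `summarize_by_group` are assumptions and are not taken from the repository, so adjust them to the headers in the actual files.

```python
import csv
from collections import defaultdict
from statistics import mean

# Hypothetical column names; adjust them to the headers in the actual CSV files.
GROUP_COL = "diagram_category"  # e.g. "ad-hoc" or "descriptor-based"
VARIABLES = ("C", "T", "P")     # score, answer time, perceived ease


def load_rows(path):
    """Read a CSV file into a list of dictionaries, one per data row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def summarize_by_group(rows):
    """Return the mean of each variable for every diagram category."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for name in VARIABLES:
            grouped[row[GROUP_COL]][name].append(float(row[name]))
    return {
        group: {name: mean(values) for name, values in columns.items()}
        for group, columns in grouped.items()
    }
```

A call such as `summarize_by_group(load_rows("experiment.csv"))` would return one dictionary of means per category; `experiment.csv` is a placeholder file name, not the name of a file in this repository.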

Access and Use:

These data are made available under the MIT license, allowing for use, sharing, and adaptation for academic and research purposes, provided the original source is properly cited.

How to Cite:

To cite this dataset in publications, please use: DOI 10.5281/zenodo.10822839
