Note: This kit is currently unreleased and depends on v0.7 of the OEA framework.
This test data generation kit lets users generate randomized test data that can be used across all modules, schemas, and packages within the OEA framework. Use it to create temporary data for experimenting with any module or package. Because the generated data connects across modules, you can build robust dashboards on semi-realistic data without putting the privacy of an education system at risk.
The OEA test data generation kit uses five base-truth tables to artificially generate data for any module: it first creates the general data and then assigns the column names that the data source expects. The base-truth tables are defined in the test data generation class notebook and are described below.
Abbreviations
- SIS: School Information System
- UUID: Universally Unique Identifier
Column Name | Description |
---|---|
Gender | Student gender: M (male), F (female), or O (other) |
FirstName | Student first name |
MiddleName | Student middle name |
LastName | Student last name |
StudentID | SIS ID: UUID |
Birthday | Student birth date: YYYY-MM-DD |
School | School name |
SchoolID | SIS ID: UUID |
Grade | Student grade level (numerical) |
Performance | Student academic performance: high, avg (average), or low |
HispanicLatino | Student ethnicity: True or False |
Race | white (White), blackafricanamerican (Black or African American), americanindianalaskanative (American Indian or Alaska Native), asian (Asian), nativehawaiianpacificislander (Native Hawaiian or Other Pacific Islander), or twoormoreraces (Two or More Races) |
Flag | (Blank), FreeLunch, ReducedLunch, Homeless, or GiftedOrTalented |
Email | Student school email address: (FirstName)(LastName)@contoso.edu |
Phone | Student phone number |
Address | Student street address |
City | Student city |
State | Student state: CA |
Zipcode | Student zipcode: ##### |
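
As an illustration of the student columns above, the following is a minimal sketch of how one randomized student row could be produced with the Python standard library. It is not the kit's actual implementation: the helper `make_student`, the placeholder name pools, the `Email` key, the phone, address, and city formats, and the age-from-grade rule are all assumptions made for the example.

```python
import random
import uuid
from datetime import date, timedelta

RACES = [
    "white", "blackafricanamerican", "americanindianalaskanative",
    "asian", "nativehawaiianpacificislander", "twoormoreraces",
]
FLAGS = ["", "FreeLunch", "ReducedLunch", "Homeless", "GiftedOrTalented"]


def make_student(rng, school_name, school_id, grade, reference_year=2024):
    """Return one synthetic student row (hypothetical helper, not the kit's API)."""
    # Placeholder name pools; a real generator would draw from larger lists.
    first = rng.choice(["Alex", "Jordan", "Sam", "Taylor"])
    middle = rng.choice(["Lee", "Ray", "Quinn"])
    last = rng.choice(["Garcia", "Nguyen", "Smith", "Patel"])
    # Assumption: a student in grade g was born about g + 5 years before reference_year.
    birthday = date(reference_year - grade - 5, 1, 1) + timedelta(days=rng.randrange(365))
    return {
        "Gender": rng.choice(["M", "F", "O"]),
        "FirstName": first,
        "MiddleName": middle,
        "LastName": last,
        # Build the UUID from the seeded generator so runs are reproducible.
        "StudentID": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
        "Birthday": birthday.isoformat(),  # YYYY-MM-DD
        "School": school_name,
        "SchoolID": school_id,
        "Grade": grade,
        "Performance": rng.choice(["high", "avg", "low"]),
        "HispanicLatino": rng.choice([True, False]),
        "Race": rng.choice(RACES),
        "Flag": rng.choice(FLAGS),
        "Email": f"{first}{last}@contoso.edu",  # column name assumed
        "Phone": f"555-{rng.randrange(10000):04d}",  # placeholder format
        "Address": f"{rng.randrange(1, 9999)} Main St",  # placeholder format
        "City": "Sampletown",  # placeholder value
        "State": "CA",
        "Zipcode": f"{rng.randrange(100000):05d}",
    }


student = make_student(random.Random(42), "Contoso High School", str(uuid.uuid4()), grade=9)
```

Passing a seeded `random.Random` instance keeps the generated rows reproducible between runs, which is convenient when the same students need to appear in several module-specific tables.
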
Column Name | Description |
---|---|
SchoolName | School name |
SchoolID | SIS ID: UUID |
Column Name | Description |
---|---|
CourseName | Course name |
CourseID | Course information system ID: UUID |
SchoolName | School name where course is hosted |
SchoolID | School information system ID of school where course is hosted |
CourseSubject | English Language and Literature, Mathematics, Life and Physical Sciences, Social Sciences and History, Visual and Performing Arts, Physical Health and Safety Education, Information Technology, Communication and Audio Video Technology, Business and Marketing, Health Care Sciences, Architecture and Construction, Human Services, Engineering and Technology, World Language, Miscellaneous, or Non-Subject-Specific |
CourseGradeLevel | Numeric grade level (e.g., 9, 10, 11, 12) |
Column Name | Description |
---|---|
SectionName | Section name: (CourseName) ### |
SectionID | SIS ID: UUID |
CourseName | CourseName associated with section |
CourseID | CourseID associated with section |
SchoolName | SchoolName where section is hosted |
SchoolID | SchoolID of SchoolName |
SectionSubject | CourseSubject of related course |
SectionGradeLevel | CourseGradeLevel of related course |
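
The section columns above mostly repeat values from the parent course. A minimal sketch of that derivation follows, assuming a hypothetical helper `make_sections` and reading the `###` in the section name as a zero-padded three-digit number; neither is confirmed by the kit's notebooks.

```python
import uuid


def make_sections(course, count):
    """Derive `count` section rows from one course row (illustrative helper)."""
    sections = []
    for number in range(1, count + 1):
        sections.append({
            # Assumption: "###" means a zero-padded three-digit section number.
            "SectionName": f"{course['CourseName']} {number:03d}",
            "SectionID": str(uuid.uuid4()),
            # The remaining columns are copied from the parent course.
            "CourseName": course["CourseName"],
            "CourseID": course["CourseID"],
            "SchoolName": course["SchoolName"],
            "SchoolID": course["SchoolID"],
            "SectionSubject": course["CourseSubject"],
            "SectionGradeLevel": course["CourseGradeLevel"],
        })
    return sections


course = {
    "CourseName": "Algebra I",
    "CourseID": str(uuid.uuid4()),
    "SchoolName": "Contoso High School",
    "SchoolID": str(uuid.uuid4()),
    "CourseSubject": "Mathematics",
    "CourseGradeLevel": 9,
}
sections = make_sections(course, count=3)
```

Copying the course and school identifiers into each section row is what lets a section be joined back to its course and school later.
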
Column Name | Description |
---|---|
StudentName | Student first and last name |
StudentID | StudentID of StudentName |
SectionName | SectionName of section the student is enrolled in |
SectionID | SectionID of SectionName |
CourseName | CourseName associated with section |
CourseID | CourseID associated with section |
CourseGradeLevel | CourseGradeLevel associated with CourseName/CourseID |
SchoolName | School hosting the section that the student is enrolled in |
SchoolID | SchoolID of SchoolName |
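
To illustrate the second half of the approach (assigning a data source's column names to the general data), here is a minimal sketch that renames base-truth enrollment columns to a target schema. The helper `to_source_schema` and the target names in `EXAMPLE_ROSTER_MAP` are hypothetical; a real module defines its own column names.

```python
def to_source_schema(rows, column_map):
    """Keep only the mapped columns and rename them to the data source's names."""
    return [
        {target: row[base] for base, target in column_map.items()}
        for row in rows
    ]


# Hypothetical mapping from base-truth column names to a data source's names.
EXAMPLE_ROSTER_MAP = {
    "StudentID": "person_id",
    "SectionID": "section_id",
    "SchoolID": "school_id",
}

# One base-truth enrollment row with placeholder identifiers.
enrollment = [{
    "StudentName": "Alex Garcia",
    "StudentID": "student-1",
    "SectionName": "Algebra I 001",
    "SectionID": "section-1",
    "CourseName": "Algebra I",
    "CourseID": "course-1",
    "CourseGradeLevel": 9,
    "SchoolName": "Contoso High School",
    "SchoolID": "school-1",
}]

roster = to_source_schema(enrollment, EXAMPLE_ROSTER_MAP)
```

Because every module-specific table would be derived from the same base-truth rows, the identifiers stay consistent, which is what allows the generated data to connect across modules.
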
Preparation: This module currently depends on v0.7 of the OEA framework. Ensure that you have an Azure subscription with the proper credentials and that the OEA framework is set up, including v0.7 of the OEA Python class.
- Review the modules and data sources that are currently compatible (listed below), and choose which ones to apply this test data generator to.
- If you do not see a data source you wish to generate test data for, you will need to develop assets similar to the Insights module test data generator example.
- Import the general module test data generation class and demo notebooks, and run the demo notebook to create the base-truth tables. See more details and instructions under the notebook folder in this kit.
- Run the desired module-specific test data generation demo notebook.
- Verify that the test data was created and stored in stage1.
- Ingest the test data within the scope of that particular module or package. You can then use the generated test data in the Power BI dashboard for the relevant module, package, or use case.
This test data generation kit can currently be applied to the following OEA modules:
Module | Applicable Tables |
---|---|
Clever Module | Daily Participation and Resource Usage tables |
Microsoft Education Insights Module | M365 roster and activity tables |
See the Insights module test data generator assets under the Notebook resource for an example of a compatible module for this test data generation kit.
Out-of-the-box assets for this OEA test data generation kit include:
- Base-truth table generation notebooks:
  - test_data_generation_py: Main class for test data generation. Used by test_data_gen_demo to create the base-truth table files that support module test data generation.
  - test_data_gen_demo: Run this notebook in your OEA Synapse environment to generate the base-truth table files that can be used to create test data for any module.
- Module-specific table generation notebooks:
  - Clever module test data generation notebooks.
  - Insights module test data generation notebooks.
This Test Data Generation Kit welcomes contributions.
This module was developed by Kwantum Analytics. The architecture and reference implementation for all modules are built on Azure Synapse Analytics, with Azure Data Lake Storage as the storage backbone and Azure Active Directory providing the role-based access control.
Microsoft and any contributors grant you a license to the Microsoft documentation and other content in this repository under the Creative Commons Attribution 4.0 International Public License, see the LICENSE file, and grant you a license to any code in the repository under the MIT License, see the LICENSE-CODE file.
Microsoft, Windows, Microsoft Azure and/or other Microsoft products and services referenced in the documentation may be either trademarks or registered trademarks of Microsoft in the United States and/or other countries. The licenses for this project do not grant you rights to use any Microsoft names, logos, or trademarks. Microsoft's general trademark guidelines can be found at http://go.microsoft.com/fwlink/?LinkID=254653.
Privacy information can be found at https://privacy.microsoft.com/en-us/
Microsoft and any contributors reserve all other rights, whether under their respective copyrights, patents, or trademarks, whether by implication, estoppel or otherwise.