To run COPPTCHA, you must build a dataset containing the HPC profile and COPPA label of each app. Data collection requires the following tools:
- Python 2.7 or higher
- Play_scraper
- re
- Android Studio with SDK and NDK (simpleperf, monkey)
Here is a video that explains the data collection process.
- Edit scraper_for_app_list.py, replacing the dummy values with your chosen number of apps and age ranges, then run it. This will create the app list.
- Remove all apps not covered in the COPPA census repository.
- Remove all apps no longer available in the Google Play Store.
- Go to the app's profile in the COPPA repository.
- Copy the guide_to_reading_coppa_compliance_status_data for the app.
- For each feature, replace the violation entry with 1 if the violation exists and 0 if it does not. The resulting values form the y vector for the app.
- Download Android Studio and any necessary dependencies.
- Open Android Studio.
- Press the gear at the bottom right to enter configuration settings.
- Choose ‘SDK Manager’.
- Choose ‘SDK Tools’.
- Check the boxes for ‘Android SDK Tools’ and ‘NDK’. To set your install path, change the location shown above the list.
- Press apply to begin installation of components.
- Press Finish when done.
- Go to the Android SDK location (set during component installation).
- Locate the platform-tools folder and add it to your PATH.
- Connect your device with a USB cable, ensuring that Developer Options and USB Debugging are enabled.
- Run “adb devices” to check if your device is properly connected.
- Return to the SDK location.
- Enter ndk-bundle/simpleperf/bin/android/arm to locate the simpleperf binary.
- Copy the simpleperf binary to the same folder as the adb binary.
- Run “adb push simpleperf /data/local/tmp” to push the binary onto the device.
- Use “adb shell” followed by “su” to enter the shell and establish root privilege.
- Change the working directory to /data/local/tmp.
- Run ./simpleperf to make sure it works.
- Run ./simpleperf list for all supported events.
- Run ./simpleperf list hw for all supported hardware events.
- Create a list of all the events you want to test and split them into sets of 5 (or another number depending on your processor). Edit event_list_prep.py as needed.
- Ensure monkey works.
- Set a constant seed and duration value.
- Use the stat command for each event set. Sample Command: monkey -p com.jg.spongeminds.preschooldemo -v 1050 -s 42 ; ./simpleperf stat --app com.jg.spongeminds.preschooldemo --csv -e L1-icache-load-misses,LLC-loads,LLC-load-misses,LLC-stores,LLC-store-misses --duration 30 --interval 50 -o test/com.jg.spongeminds.preschooldemo_set2.csv
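The event-splitting and stat steps above can be sketched in Python as follows. This is only an illustration, not the contents of event_list_prep.py: the helper names `chunk_events` and `build_stat_command` are hypothetical, the event names are placeholders that may not be supported on your device, and the output naming simply mirrors the sample command.

```python
# Hypothetical helpers: split events into sets and build one stat command per set.

def chunk_events(events, size=5):
    """Split a flat list of event names into consecutive sets of at most `size`."""
    return [events[i:i + size] for i in range(0, len(events), size)]


def build_stat_command(package, event_set, set_index,
                       duration=30, interval=50, out_dir="test"):
    """Return a simpleperf stat command string shaped like the sample command."""
    out_file = "{}/{}_set{}.csv".format(out_dir, package, set_index)
    args = [
        "./simpleperf", "stat",
        "--app", package,              # full package name
        "--csv",
        "-e", ",".join(event_set),     # comma-separated, no spaces
        "--duration", str(duration),   # seconds
        "--interval", str(interval),   # milliseconds
        "-o", out_file,
    ]
    return " ".join(args)


# Placeholder event names; replace them with names from ./simpleperf list.
EVENTS = [
    "cpu-cycles", "instructions", "branch-misses", "cache-misses",
    "cache-references", "LLC-loads", "LLC-load-misses",
]
PACKAGE = "com.jg.spongeminds.preschooldemo"

for index, event_set in enumerate(chunk_events(EVENTS, 5), start=1):
    print(build_stat_command(PACKAGE, event_set, index))
```

Each printed line can be pasted into the device shell after the monkey command. The set size of 5 is only the default here and should match whatever your processor supports.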
Notes:
- Always include the full package name after --app.
- The process ID and thread options tend to be unreliable. You can use -a instead to count events system-wide.
- Add --csv so that the output file is in comma-separated format. Without it, you can still read the data, but not manipulate it easily.
- When listing multiple events, simpleperf requires each name to be exactly as shown in the output of ./simpleperf list. Additionally, the events must be separated by commas, with no spaces.
- Note that using -e $(cat events.txt) works so long as the text file follows the same guidelines.
- Measuring more than six events at a time results in multiplexing and should be avoided.
- You can use --group instead of -e to sync event monitoring.
- Optionally, include --duration to make simpleperf run for a specified number of seconds. Without it, you must stop simpleperf manually (CTRL-C).
- Including --interval prints statistics at the specified interval, in milliseconds. While this option results in a larger output file, it makes possible multiplexing easier to detect.
- Note that if you use this option, only the final counts for each event matter in the output file.
- If you do not choose a file name after -o, it will output all the results to perf.data.
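The formatting rules in these notes can be checked before a command is sent to the device. The following is a minimal sketch, assuming you have copied the supported event names from the output of ./simpleperf list into a Python set by hand. The function name `check_event_spec` and the constant `MAX_EVENTS` are hypothetical, and the limit of six comes from the note on multiplexing above.

```python
# Hypothetical pre-flight check for the string passed after -e.

MAX_EVENTS = 6  # more than six events at a time results in multiplexing


def check_event_spec(spec, supported):
    """Return a list of problems found in a comma-separated event string."""
    problems = []
    if any(ch.isspace() for ch in spec):
        problems.append("contains whitespace")
    names = [name.strip() for name in spec.split(",")]
    if "" in names:
        problems.append("empty event name (stray comma)")
    for name in names:
        # Names must match the ./simpleperf list output exactly.
        if name and name not in supported:
            problems.append("unknown event: " + name)
    if len([n for n in names if n]) > MAX_EVENTS:
        problems.append("more than {} events".format(MAX_EVENTS))
    return problems


# Assumed subset of supported events, for illustration only.
SUPPORTED = {"LLC-loads", "LLC-load-misses", "LLC-stores", "LLC-store-misses"}

print(check_event_spec("LLC-loads,LLC-stores", SUPPORTED))
print(check_event_spec("LLC-loads, llc-stores", SUPPORTED))
```

An empty list means the string follows the rules in the notes. It does not guarantee that the device can actually count every event in the set at once.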