/LibScout

LibScout: Third-party library detector for Java/Android apps

Primary LanguageJavaApache License 2.0Apache-2.0

LibScout

LibScout is a light-weight and effective static analysis tool to detect third-party libraries in Android/Java apps. The detection is resilient against common bytecode obfuscation techniques such as identifier renaming or code-based obfuscations such as reflection-based API hiding or control-flow randomization. Further, LibScout is capable of pinpointing exact library versions including versions that contain severe bugs or security issues.

LibScout requires the original library SDKs (compiled .jar/.aar files) to extract library profiles that can be used for detection on Android apps. Pre-generated library profiles are hosted at the repository LibScout-Profiles.

Unique detection features:

  • Library detection resilient against many kinds of bytecode obfuscation (e.g. obfuscations by ProGuard)
  • Capability of pinpointing the exact library version (in some cases to a set of 2-3 candidate versions)
  • Capability of handling dead-code elimination, by computing a similarity score against baseline SDKs
  • Library API usage analysis upon detection

Over time LibScout has been extended to perform additional analyses both on library SDKs and detected libraries in apps:

  • API compatibility analysis across library versions and lib developers adherence to semantic versioning.
  • Library updatability analysis to infer if and to which extent detected libraries in apps can be updated without code changes based on their API usage.

In addition, there is an Android Studio extension up2dep that integrates the API compatibility information into the IDE to help developers keeping their dependencies up-to-date (and more).

Library History Scraper (./scripts)

The scripts directory contains a library-scraper python script to automatically download original library SDKs including complete version histories from Maven Central, JCenter and custom mvn repositories. The original library SDKs can be used to generate profiles and to conduct library API compatibility analyses (see modules below). Use the library-profile-generator script to conveniently generate profiles at scale.

The scrapers need to be configured with a json config that includes metadata of the libraries to be fetched (name, repo, groupid, artefactid). The scripts/library-specs directory contains config files to retrieve over 100 libraries from maven central and a config to download Amazon libraries and Android libraries from Google's maven repository (350 libraries, including support, gms, ktx, jetpack, ..).

NEW (07/30/19): Added list of 45 ad/tracking libraries with currently 1182 versions (trackers.json).

Detecting (vulnerable) library versions

Ready-to-use library profiles and library meta-data can be found in the repository LibScout-Profiles.

LibScout has builtin functionality to report library versions with the following security vulnerabilities.
The pre-generated profiles for vulnerable versions are tagged with [SECURITY], patches with [SECURITY-FIX].
This information is encoded in the library.xml files that have been used to generate the profiles. We try to update the list/profiles whenever we encounter new security issues. If you can share information, please let us know.

Library Version(s) Fix Version Vulnerability Link
Airpush < 8.1 > 8.1 Unsanitized default WebView settings Link
Apache CC 3.2.1 / 4.0 3.2.2 / 4.1 Deserialization vulnerability Link
Dropbox 1.5.4 - 1.6.1 1.6.2 DroppedIn vulnerability Link
Facebook 3.15 3.16 Account hijacking vulnerability Link
MoPub < 4.4.0 4.4.0 Unsanitized default WebView settings Link
OkHttp 2.1 - 2.7.4
3.0.0- 3.1.2
2.7.5
3.2.0
Certificate pinning bypass Link
Plexus Archiver < 3.6.0 3.6.0 Zip Slip vulnerability Link
SuperSonic < 6.3.5 6.3.5 Unsafe functionality exposure via JS Link
Vungle < 3.3.0 3.3.0 MitM attack vulnerability Link
ZeroTurnaround < 1.13 1.13 Zip Slip vulnerability Link

Identified Issues

On our last scan of free apps on Google Play (05/25/2017), LibScout detected >20k apps still containing one of these vulnerable lib versions. The findings have been reported to Google's ASI program. Unfortunately, the report seemed to be ignored. In consequence, we manually notified many app developers.

Among others, McAfee published a Security Advisory for one of their apps.

LibScout 101

  • LibScout requires Java 1.8 or higher and can be build with Gradle.
  • Generate a runnable jar with the gradle wrapper gradlew (Linux/MacOS) or gradlew.bat (Windows), by invoking it with the build task, e.g. ./gradlew build
    The LibScout.jar is output to the build/libs directory.
  • Some less frequently changing options can be configured via LibScout's config file LibScout.toml.
  • Most LibScout modules require an Android SDK (jar) to distinguish app code from framework code (via the -a switch). Refer to https://developer.android.com/studio/ for download instructions.
  • By default, LibScout logs to stdout. Use the -d switch to redirect output to files. The -m switch disables any text output. Depending on the operation mode (see below), LibScout's results can be written to disk in JSON format or JAVA serialization.
  • LibScout repo structure in a nutshell:

|_ gradlew / gradlew.bat (gradle wrappers to generate runnable LibScout.jar)
|_ assets
|    |_ library.xml (Library meta-data template)
|_ config
|    |_ LibScout.toml (LibScout's config file)
|    |_ logback.xml (log4j configuration file)
|_ data
|    |_ app-version-codes.csv (Google Play app packages with valid version codes)
|_ lib
|    Android axml
|_ scripts
|    |_ library-specs (pre-defined library specs)
|    |_ library-scraper.py   (scraper for mvn-central, jcenter, custom mvn)
|    |_ library-profile-generator.sh (convenience profile generator)
|_ src
    source directory of LibScout (de/infsec/tpl). Includes some open-source,
    third-party code to parse AXML resources / app manifests etc.
  • LibScout supports different use cases implemented as modules (modes of operation). Below a detailed description for each module.

Library Profiling (-o profile)

This module generates unique library fingerprints from original lib SDKs (.jar and .aar files supported). These profiles can subsequently be used for testing whether the respective library versions are included in apps. Each library file additionally requires a library.xml that contains meta data (e.g. name, version,..). A template can be found in the assets directory. For your convenience, you can use the library scraper (./scripts) to download full library histories from Maven repositories. By default, LibScout generates hashtree-based profiles with Package and Class information (omitting methods).

java -jar LibScout.jar -o profile [-a android_sdk_jar] -x path_to_library_xml path_to_library_file

Library Detection (-o match)

Detects libraries in apps using pre-generated profiles. Optionally, LibScout also conducts an API usage analysis for detected libraries, i.e. which library APIs are used by the app or by other libraries (-u switch).
Analysis results can be written in different formats.

  1. the JSON format (-j switch), creates subfolders in the specified directory following the app package, i.e. *com.foo* will create *com/foo* subfolders. This is useful when coping with a large number of apps. For detailed information about the information stored, please refer to the JSON output specification.
  2. the serialization option (-s switch) writes stat files per app to disk (deprecated)
java -jar LibScout.jar -o match -p path_to_profiles [-a android_sdk_jar] [-u] [-j json_dir] [-m] [-d log_dir] path_to_app(s)  

Library API compatibility analysis (-o lib_api_analysis)

Analyzes changes in the documented (public) API sets of library versions.
The analysis results currently include the following information:

Compliance to Semantic Versioning (SemVer), i.e. whether the change in the version string between consecutive versions (expected SemVer) matches the changes in the respective public API sets (actual SemVer). Results further include statistics about changes in API sets (additions/removals/modifcations). For removed APIs, LibScout additionally tries to infer alternative APIs (based on different features).

For the analysis, you have to provide a path to the original library SDKs. LibScout recursively searches for library jars|aars (leaf directories are expected to have at most one jar|aar file and one library.xml file). For your convenience use the library scraper. Analysis results are written to disk in JSON format (-j switch).

java -jar LibScout.jar -o lib_api_analysis [-a android_sdk_jar] [-j json_dir] path_to_lib_sdks

Library Updatability analysis (-o updatability)

This mode is an extension to the match mode. It first detects library versions in the provided apps and conducts a library usage analysis (-u is implied). In addition, it requires library API compat data (via the -l switch) as generated in the lib_api_analysis mode . Based on the lib API usage in the app and the compat info, LibScout determines the highest version that is still compatible to the set of used lib APIs.
Note: The new implementation still lacks some features, e.g. the results are currently logged but not yet written to json. See the code comments for more information.

java -jar LibScout.jar -o updatability [-a android_sdk_jar] [-j json_dir] -l lib_api_data_dir path_to_app(s)

Scientific Publications

For technical details and large-scale evaluation results, please refer to our publications:

If you use LibScout in a scientific publication, we would appreciate citations using these Bibtex entries: [bib-ccs16] [bib-ccs17]