Java application that displays the top JS libraries used across sites. The application does the following:
- Prompts the user on the command line for a search value
- Searches Google with the given search value
- Spawns multiple crawlers and looks for JS libraries used
- Aggregates and displays top JS libraries used
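The aggregation step in the list above can be sketched as a simple frequency count. This is a minimal illustration only: the class name TopLibraries and the method top are hypothetical and do not come from the project, and the alphabetical tie-breaking is an assumption.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper (not part of the project): counts library names
// and returns the most frequent ones.
final class TopLibraries {

    // Returns up to `limit` library names, most frequent first.
    // Ties are broken alphabetically (an assumption made for determinism).
    static List<String> top(List<String> libraryNames, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String name : libraryNames) {
            counts.merge(name, 1, Integer::sum);
        }
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> found = Arrays.asList(
                "jquery", "react", "jquery", "lodash", "react", "jquery");
        System.out.println(top(found, 2)); // prints [jquery, react]
    }
}
```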
The GoogleSearch class initiates a search through the Google search API. It is configured to request only 10 results.
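As a rough illustration of a 10-result query, the sketch below builds a request URL. It assumes the Google Custom Search JSON API (endpoint and the key, cx, q and num parameters); the README does not say which Google API the project actually calls, and the class SearchUrlSketch and its method are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch: assumes the Custom Search JSON API, which may not be
// the API the project's GoogleSearch class uses.
final class SearchUrlSketch {

    static final String ENDPOINT = "https://www.googleapis.com/customsearch/v1";
    static final int RESULT_COUNT = 10; // matches the 10-result limit described above

    // Builds the request URL; apiKey and engineId are placeholders supplied by the caller.
    static String buildUrl(String apiKey, String engineId, String query) {
        return ENDPOINT
                + "?key=" + encode(apiKey)
                + "&cx=" + encode(engineId)
                + "&q=" + encode(query)
                + "&num=" + RESULT_COUNT;
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("KEY", "CX", "js charts"));
    }
}
```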
SearchResults and SearchItem classes are used to map the results returned from Google.
JSLibrarySearcher searches a page for script tags and tries to identify the libraries they load.
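One way to approach the script tag search is a regular expression over the page source. The sketch below is an assumption about the general technique, not the project's actual JSLibrarySearcher; the class ScriptTagScanner, its methods, and the file-name-based naming rule are all hypothetical. A regex is a simplification and will miss or misread some markup that a real HTML parser would handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch (not the project's JSLibrarySearcher).
final class ScriptTagScanner {

    // Matches <script ... src="..."> with single or double quotes, ignoring case.
    private static final Pattern SCRIPT_SRC = Pattern.compile(
            "<script\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns the src attribute value of every external script tag, in page order.
    static List<String> scriptSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher matcher = SCRIPT_SRC.matcher(html);
        while (matcher.find()) {
            sources.add(matcher.group(1));
        }
        return sources;
    }

    // Derives a library name from a script URL: the file name without query
    // string and without a trailing ".js" or ".min.js" (an assumed convention).
    static String libraryName(String src) {
        String path = src;
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query);
        }
        String file = path.substring(path.lastIndexOf('/') + 1).toLowerCase(Locale.ROOT);
        return file.replaceAll("(\\.min)?\\.js$", "");
    }

    public static void main(String[] args) {
        String html = "<script src=\"https://code.jquery.com/jquery-3.2.1.min.js\"></script>"
                + "<SCRIPT type='text/javascript' SRC='/static/react.js?v=2'></SCRIPT>"
                + "<script>var inline = 1;</script>";
        for (String src : scriptSources(html)) {
            System.out.println(libraryName(src));
        }
    }
}
```

Inline scripts without a src attribute are ignored, since they give no URL to identify a library from.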
CrawlerTask connects to a URL and initiates the search.
CrawlerManager initializes the threads and spawns the CrawlerTasks. It is configured to run 10 simultaneous threads.
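A fixed pool of 10 threads running one task per URL could look like the sketch below. It assumes java.util.concurrent's ExecutorService; the README does not state how CrawlerManager manages its threads, and the class CrawlerPoolSketch, the method crawlAll, and the Function-based task are hypothetical stand-ins for the real classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch (not the project's CrawlerManager).
final class CrawlerPoolSketch {

    static final int THREADS = 10; // matches the 10 simultaneous threads described above

    // Runs `task` once per URL on a fixed pool and collects the results in URL order.
    // Tasks that throw are logged and skipped so one bad page does not abort the crawl.
    static <R> List<R> crawlAll(List<String> urls, Function<String, R> task) {
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        List<R> results = new ArrayList<>();
        try {
            List<Callable<R>> jobs = new ArrayList<>();
            for (String url : urls) {
                jobs.add(() -> task.apply(url));
            }
            // invokeAll blocks until every job has finished.
            for (Future<R> future : pool.invokeAll(jobs)) {
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    System.err.println("Crawl failed: " + e.getCause());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in task: a real crawler would fetch the page instead.
        List<Integer> lengths = crawlAll(Arrays.asList("a", "bb", "ccc"), String::length);
        System.out.println(lengths); // prints [1, 2, 3]
    }
}
```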
mvn dependency:tree result
+- com.google.code.gson:gson:jar:2.8.1:compile
+- junit:junit:jar:4.12:test
- org.hamcrest:hamcrest-library:jar:1.3:test
- org.hamcrest:hamcrest-core:jar:1.3:test
The Gson library is used for parsing JSON results from the Google search API.
JUnit is used for unit test cases.
com.gurupv.simple.command.CommandLineExecutor contains the main method to start the application.
The entire project can be built, compiled and executed using Maven.
The Exec Maven Plugin is also configured, so mvn exec:exec can be used to run the application from Maven if required.
The entire code base can also be imported into Eclipse as a Maven project.
Test cases are included for each individual component:
- Tests for Google Search component
- Tests for JS library pattern search
- Test for Crawler Task
A yml file is included for Travis CI integration.