Pinned Repositories
correlation-approximation
Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets
Datawake
Browser add-on and web server to support collection and analysis of web browsing data.
distributed-graph-analytics
Distributed Graph Analytics (DGA) is a compendium of graph analytics written for Bulk-Synchronous-Parallel (BSP) processing frameworks such as Giraph and GraphX. The analytics included are High Betweenness Set Extraction, Weakly Connected Components, PageRank, Leaf Compression, and Louvain Modularity.
distributed-louvain-modularity
Community Detection and Compression Analytic for Big Graph Data
graphene
mitie-trainer
Model Training tool for MITIE
newman
Quickly analyze and explore email with advanced analytics and visualization.
pst-extraction
PST extraction and analytic pipeline
spark-distributed-louvain-modularity
Spark / GraphX implementation of the distributed Louvain modularity algorithm
zephyr
Zephyr is a platform-agnostic big data ETL API, with bindings for Hadoop MapReduce, Storm, and other big data frameworks.
Repositories of Sotera Defense (now Jacobs)
Sotera/spark-distributed-louvain-modularity
Spark / GraphX implementation of the distributed Louvain modularity algorithm
Sotera/distributed-graph-analytics
Distributed Graph Analytics (DGA) is a compendium of graph analytics written for Bulk-Synchronous-Parallel (BSP) processing frameworks such as Giraph and GraphX. The analytics included are High Betweenness Set Extraction, Weakly Connected Components, PageRank, Leaf Compression, and Louvain Modularity.
Sotera/newman
Quickly analyze and explore email with advanced analytics and visualization.
Sotera/pst-extraction
PST extraction and analytic pipeline
Sotera/watchman
Watchman: An open-source social-media event-detection system
Sotera/aggregate-micro-paths
Infer movement patterns from large amounts of geo-temporal data in a cloud environment.
Sotera/track-communities
A series of analytics for creating networks from geo-temporal track data based on time/space co-occurrence. Includes UI for visualization of communities and tracks.
Sotera/DatawakeDepot
Loopback web application for administration of Datawake networks
Sotera/webpageclassifier
Categorizes a website, given its URL, into one of blog|wiki|news|forum|classified|shopping|undecided.
Sotera/firmament
Node.js script and Docker files to create a MySQL/MongoDB-backed AngularJS/Bootstrap web application
Sotera/go_watchman
github.com/watchman apps for which Go is especially well suited
Sotera/interactive-graph-viewer
An R Shiny app for interactively viewing the results of the Louvain method for community detection.
Sotera/joy-m51
A package for capturing and analyzing network flow data and intraflow data, for network research, forensics, and security monitoring.
Sotera/merlin-stack
Sotera/newman-vm
Newman VM
Sotera/ANBK-Convert
C# application to convert Analyst's Notebook files
Sotera/newman-research
Tools to be evaluated prior to integration into Newman
Sotera/opendata
Repository of documentation about the open datasets published by the UK Web Archive.
Sotera/Rmmtsne
A native R implementation of multiple maps t-distributed stochastic neighbor embedding (mmtsne).
Sotera/sotera.github.io
Sotera/threat_detection
Automatic threat detection in images.
Sotera/MerlinETL
Sotera/BertTokenizers
Open source project for BERT Tokenizers in C#.
Sotera/datawake-ts
Angular 2/TypeScript implementation of Sotera/DatawakeDepot
Sotera/ea_dox
Scripts for reading and flattening dox files
Sotera/embed-map
A map widget that uses Leaflet and allows bi-directional communication via HTML5 postMessage() through an iframe
Sotera/loopback-webpack-typescript-amino
A basic Web Application using TypeScript (client & server), WebPack & LoopBack
Sotera/ngrini
Dashboard for the ngrini cycle
Sotera/sitehound
Site Hound (previously THH) is a Domain Discovery Tool
Sotera/spark
Mirror of Apache Spark