Pinned Repositories
Ad-Library-API-Script-Repository
GitHub repository of commonly used Python scripts that allow anyone to pull data via the Ad Library API
Ad_Library_API
Python code package to scrape the Facebook Ad Library data
aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
aeneas-vagrant
aeneas-vagrant automates the creation of a Vagrant box to run aeneas
anomalize
Tidy anomaly detection
asr-data
Data and code for a small project on meta-information from the American Sociological Review
Auto-GPT
An experimental open-source attempt to make GPT-4 fully autonomous.
BERTopic
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
predicting-poverty-replication
A Python 3 and PyTorch replication of Jean et al. (2016). Original paper's GitHub repository: https://github.com/nealjean/predicting-poverty
joshzyj's Repositories
joshzyj/example-twitter-analysis
Example Twitter analysis for UA course SOC 596a-002, Fall 2016
joshzyj/MSongsDB
Code for the Million Song Dataset. The dataset contains metadata and audio analysis for a million tracks and is a collaboration between The Echo Nest and LabROSA. See website for details.
joshzyj/dirbot
Scrapy project to scrape public web directories (educational)
joshzyj/ebola-abm
joshzyj/GloVe
GloVe model for distributed word representation
joshzyj/weighted-modularity-LPAwbPLUS
Algorithms for finding weighted modularity in bipartite networks
joshzyj/interpy-zh
Chinese edition of Intermediate Python (《Python进阶》)
joshzyj/nlp-lang
A base package that wraps the tools commonly used in most NLP projects
joshzyj/twitter_nlp
Twitter NLP Tools
joshzyj/cs228-material
Teaching materials for the probabilistic graphical models class at Stanford
joshzyj/ProQuest-to-PlainText
joshzyj/DPLP
An RST parser with a trained model
joshzyj/stat-learning
Notes and exercise attempts for "An Introduction to Statistical Learning"
joshzyj/DataScienceSpCourseNotes
Compiled Notes for all 9 courses in the Coursera Data Science Specialization
joshzyj/rvest
Simple web scraping for R
joshzyj/DECClient
Tools for USAID's Development Experience Clearinghouse
joshzyj/httr
httr: a friendly http package for R
joshzyj/Tech_Notes
Technical Notes
joshzyj/lxnxs
Some utilities for LexisNexis-related scraping
joshzyj/xgboost
Large-scale and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, on a single node, Hadoop YARN, and more.
joshzyj/dtm_gensim
Dynamic Topic Modelling Tutorial Files
joshzyj/vaderSentiment
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
joshzyj/WeixinCrawler
A crawler for the content of Weixin public accounts, plus an RSS generator.
joshzyj/infx547
INFX547 - Social Media Data Mining and Analysis
joshzyj/D-Lab_TextAnalysisWorkingGroup
Repository for D-Lab Working Group Files, Scripts, Wiki, Issues, etc.
joshzyj/social_media_profile_analysis
Social Network Analysis (Facebook, Twitter and Google+)
joshzyj/renren
Renren information scraping and data mining. Social network analysis.
joshzyj/movements
ABM of social movement formation, including various social mechanisms introduced by the Internet
joshzyj/occupywallst.org
Open Source Social Movement Website
joshzyj/weiquncrawler
A crawler for the Sina Weiqun website (WAP) that collects a given Weiqun's posts, replies, users, and their follow relations. Written in Python 2.7.1; stores data in SQLite3. The relation-crawling part is adapted from the GitHub project sina_reptile.