iamjoona's Stars
deezer/spleeter
Deezer source separation library including pretrained models.
google-research/bert
TensorFlow code and pre-trained models for BERT
jamesaphoenix/Click_Through_Rate_Optimization_Google_Search_Console
This is a mini-project where I created a simple machine learning model to predict the click-through rate of a given URL (web page) using Python and scikit-learn.
codelucas/newspaper
newspaper3k is a library for news, full-text, and article metadata extraction in Python 3.
findopendata/findopendata
A search engine for Open Data
commoncrawl/cc-pyspark
Process Common Crawl data with Python and Spark
jroakes/screaming-frog-shingling
Uses Screaming Frog Internal HTML with text extraction along with a shingling algorithm to compare content duplication across the pages of a crawled site.
searchVIU/Labs
searchVIU Labs
benjaminestes/bq-stat
Get STAT ranking data into BigQuery (BQ) for use in Data Studio.
ecoron/SerpScrap
SEO Python scraper to extract data from major search engine result pages. Extract data like URL, title, snippet, rich snippet, and result type from search results for given keywords. Detect ads or take automated screenshots. You can also fetch the text content of URLs found in search results or supplied on your own. It's useful for SEO and business-related research tasks.
pandas-dev/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
buriy/python-readability
Fast Python port of arc90's readability tool, updated to match the latest readability.js!
max-mapper/art-of-node
A short introduction to Node.js
ranksense/url-inspector-automator
URL Inspection Tool Automator
MLTSEO/MLTS
Machine Learning Toolkit for SEO
NimaSoroush/differencify
Differencify is a library for visual regression testing
kalaspuffar/puppeteer-example
A small example of how to use Puppeteer to drive Chrome
sohamkamani/javascript-design-patterns-for-humans
An ultra-simplified explanation of design patterns implemented in JavaScript
iihnordic/screamingfrog-docker
Docker image for ScreamingFrog version 16
browserless/browserless
Deploy headless browsers in Docker. Run on our cloud or bring your own. Free for non-commercial uses.
N0taN3rd/Squidwarc
Squidwarc is a high-fidelity, user-scriptable archival crawler that uses Chrome or Chromium, with or without a head
GoogleChromeLabs/perftools-runner
Google Performance Tools runner using Puppeteer
emadehsan/thal
Getting started with Puppeteer and Chrome Headless for Web Scraping
anishkny/webgif
Easily generate animated GIFs from websites
apify/crawlee
Crawlee: a web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
greglobinski/gatsby-starter-hero-blog
A ready to use, easy to customize, fully equipped GatsbyJS starter with a 'Hero' section on the home page.
paulirish/pwmetrics
Progressive web metrics at your fingertipz
jeremiak/jekyll-offline
Jekyll plugin that uses service workers to make site content available offline
phantombuster/nickjs
Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)