Pinned Repositories
CrawlnChat
A modular web crawling and chat system that ingests website content through XML sitemaps, converts it to vector embeddings, and provides AI-powered chat interfaces through multiple frontend options.
ForecastGA
A Python tool to forecast Google Analytics data using several popular time series models.
ghost-material
Materialize Theme For Ghost.js
glove-to-word2vec
Converting GloVe vectors into word2vec format for easy usage with Gensim
gsc-logger
Google Search Console Logger for Google App Engine
iCodeSEO
Repo for Content for iCodeSEO.dev
querycat
A sample repo using the Apriori and FP-Growth algorithms to produce categories for queries, and BERT for PoP change visualization.
screaming-frog-shingling
Uses Screaming Frog Internal HTML with text extraction along with a shingling algorithm to compare content duplication across the pages of a crawled site.
SEODP
The SEO Data Platform automates SEO analysis, aggregating data from Google Analytics 4, Search Console, PageSpeed Insights, and rendered content. Powered by Google Gemini AI, it emails actionable insights weekly or monthly, streamlining SEO workflows.
tech-seo-crawler
Build a small, three-domain internet using GitHub Pages and Wikipedia, and construct a crawler to crawl, render, and index it.
jroakes's Repositories
jroakes/tech-seo-crawler
Build a small, three-domain internet using GitHub Pages and Wikipedia, and construct a crawler to crawl, render, and index it.
jroakes/ForecastGA
A Python tool to forecast Google Analytics data using several popular time series models.
jroakes/querycat
A sample repo using the Apriori and FP-Growth algorithms to produce categories for queries, and BERT for PoP change visualization.
jroakes/SEODP
The SEO Data Platform automates SEO analysis, aggregating data from Google Analytics 4, Search Console, PageSpeed Insights, and rendered content. Powered by Google Gemini AI, it emails actionable insights weekly or monthly, streamlining SEO workflows.
jroakes/Google-Data
jroakes/RepoCoder
RepoCoder is a Python package that allows you to send your code for review using Large Language Models (LLMs) like Anthropic's Claude or Google's Gemini. It provides an easy way to get code reviews, suggestions for improvements, and more.
jroakes/Taxonomy
jroakes/daily-seo
Uses a Gemini model to auto-curate content from 20+ individual user and website feeds. Feeds are set up using the rss.app tool because it supports pulling Twitter posts for provided users. The script runs via a GitHub Action every four hours.
jroakes/geneticML
Automatically refine Python code to meet specified objectives.
jroakes/google-analytics
A command-line interface and Python API wrapper for Google Analytics.
jroakes/WayDiffer
WayDiffer is a Streamlit application that compares website versions archived in the Wayback Machine.
jroakes/python-semrush
Python wrapper around the SEMrush API.
jroakes/DailySEOFeed
A Bluesky Feed Generator that curates SEO content based on community interaction patterns. The algorithm ranks discussions based on likes, reposts, and replies, with a 48-hour content lifespan and recency boost for fresh content.
jroakes/Npath
Exploring path sequences in GA4 BigQuery data
jroakes/BLINK
Entity Linker solution
jroakes/custom-metrics
Custom metrics to use with WebPageTest agents
jroakes/GA4Map
jroakes/google-searchconsole
A wrapper for the Google Search Console API.
jroakes/SchemaAnalyzer
jroakes/SF-Issues-to-Pages
jroakes/slack-parlai
Python Machine Learning Package for Providing Corpus, Sentiment Analysis & Entity Extraction
jroakes/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch, TensorFlow, and JAX.
jroakes/falcon7b-inf
jroakes/gatest
jroakes/hacker-news-reader
A clean and filterable Hacker News clone
jroakes/haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
jroakes/legacy.httparchive.org
<<THIS REPOSITORY IS DEPRECATED>> The HTTP Archive provides information about website performance such as the number of HTTP requests, use of gzip, and amount of JavaScript. This information is recorded over time, revealing trends in how the Internet is performing. Built using open source software, the code and data are available to everyone, allowing researchers large and small to work from a common base.
jroakes/slack-for-llama
jroakes/triviafarm-starter
jroakes/web-vitals-module
Web Vitals: Essential module for a healthy Nuxt.js