Fluquid
We empower you to make data-driven decisions using Machine Learning, Analytics and Python Coding
Cork, Ireland
Pinned Repositories
browser_fingerprint
builtwith
fork of https://bitbucket.org/richardpenman/builtwith
cryptocurrency
altcoin market and project analyses
extract-social-media
Extract social media links and account names from websites.
find_job_titles
find any kind of occupation or job title in a text or file
html-to-etree
convenience method for parsing HTML into an lxml ElementTree using sane character decoding
ilen-tech
kafka-docker
Dockerfile for Apache Kafka
sde
Structured Data Extractor. An application to extract structured data from web pages. It uses the Data Extraction Based on Partial Tree Alignment (DEPTA) method. (UPDATE: I implemented a newer algorithm: https://github.com/seagatesoft/webdext)
yandex-search
Search library for yandex.ru search engine.
Fluquid's Repositories
fluquid/find_job_titles
find any kind of occupation or job title in a text or file
fluquid/extract-social-media
Extract social media links and account names from websites.
fluquid/yandex-search
Search library for yandex.ru search engine.
fluquid/cryptocurrency
altcoin market and project analyses
fluquid/browser_fingerprint
fluquid/builtwith
fork of https://bitbucket.org/richardpenman/builtwith
fluquid/capcoin
Gets data from coincap.io into the CLI
fluquid/cookiecutter-data-science
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
fluquid/cookiecutter-pypackage
Cookiecutter template for a Python package.
fluquid/dragnet
Just the facts -- web page content extraction
fluquid/email-audit
Audit which email addresses spam bots can collect from your sites.
fluquid/html-to-etree
convenience method for parsing HTML into an lxml ElementTree using sane character decoding
fluquid/ilen-tech
fluquid/kafka-docker
Dockerfile for Apache Kafka
fluquid/sde
Structured Data Extractor. An application to extract structured data from web pages. It uses the Data Extraction Based on Partial Tree Alignment (DEPTA) method. (UPDATE: I implemented a newer algorithm: https://github.com/seagatesoft/webdext)
fluquid/cookiecutter-pypackage-minimal
A minimal template for Python packages
fluquid/cookiecutter-scrapycloud
A bare minimum Scrapy project template ready for Scrapinghub's Scrapy Cloud service.
fluquid/fluquid-lib
utility library
fluquid/githubarchive.org
GitHub Archive is a project to record the public GitHub timeline, archive it, and make it easily accessible for further analysis.
fluquid/html-text
Extract text from HTML