jancurn

Founder and CEO of Apify - the web scraping and automation platform. PhD in AI. Y Combinator Fellow.

@apify
Prague, Czech Republic

Pinned Repositories

act-pdf-to-html
Converts PDF to HTML using the pdf2htmlEX tool
Language: JavaScript
actor-amazon-crawler
Amazon crawler. This configuration extracts items for the keywords you specify in the input and automatically extracts all pages for each keyword. You can specify multiple keywords in the input for a single run.
Language: JavaScript
actor-analyze-domains
An Apify actor that crawls web pages from a list of provided domains and analyzes them. For example, it checks whether pages have an HTTPS version and saves their HTML content, screenshot, HTTP response headers, SSL certificate information, text body, outgoing links, emails, phone numbers, social handles, and more.
Language: JavaScript
actor-find-broken-links
Source code of an Apify actor that finds and reports broken links on a website. Unlike other SEO analysis tools, it also reports broken URL #fragments.
Language: JavaScript
actor-metadata-extractor
An Apify actor that crawls a list of web pages and extracts various metadata from them.
Language: JavaScript
actor-residential-proxy-probe
Probes Apify residential proxies and maintains a pool of proxies from specific ZIP codes or DMAs
Language: JavaScript
awesome-puppeteer
A curated list of awesome puppeteer resources.

jancurn's Repositories

jancurn/actor-find-broken-links
Source code of an Apify actor that finds and reports broken links on a website. Unlike other SEO analysis tools, it also reports broken URL #fragments.
Language: JavaScript
jancurn/actor-analyze-domains
An Apify actor that crawls web pages from a list of provided domains and analyzes them. For example, it checks whether pages have an HTTPS version and saves their HTML content, screenshot, HTTP response headers, SSL certificate information, text body, outgoing links, emails, phone numbers, social handles, and more.
Language: JavaScript
jancurn/act-pdf-to-html
Converts PDF to HTML using the pdf2htmlEX tool
Language: JavaScript
jancurn/actor-residential-proxy-probe
Probes Apify residential proxies and maintains a pool of proxies from specific ZIP codes or DMAs
Language: JavaScript
jancurn/awesome-puppeteer
A curated list of awesome puppeteer resources.
jancurn/actor-amazon-crawler
Amazon crawler. This configuration extracts items for the keywords you specify in the input and automatically extracts all pages for each keyword. You can specify multiple keywords in the input for a single run.
Language: JavaScript
jancurn/actor-metadata-extractor
An Apify actor that crawls a list of web pages and extracts various metadata from them.
Language: JavaScript
jancurn/act-probe-page-resources
Apify act to load web pages and analyze HTTP resources they request
Language: JavaScript
jancurn/actor-selenium-custom-firefox
Apify actor with a custom build of Firefox, instrumented using Selenium.
Language: JavaScript
jancurn/awesome-web-scraping
List of libraries, tools and APIs for web scraping and data processing.
Language: Makefile
jancurn/bson-ext
The C++ bson parser for the node.js mongodb driver.
Language: JavaScript
jancurn/iron-router
A client and server side router designed specifically for Meteor.
Language: JavaScript
jancurn/llama_index
LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data.
Language: Python
jancurn/nekolikOtazek
Language: HTML
jancurn/puppeteer
Headless Chrome Node API
Language: JavaScript
jancurn/stayinghomeclub
A list of companies that switched to WFH and events that changed because of COVID-19
Language: Ruby
jancurn/www
The mitmproxy website, https://mitmproxy.org/.
Language: CSS
jancurn/yclist
List and description of ycombinator companies
Language: Ruby
