Pinned Repositories
doris
Apache Doris is an easy-to-use, high-performance, unified analytics database.
awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
Daemonize-Manage
A base class for daemon management, providing daemon creation and termination, logging, and child-process management.
Eggtart
Eggtart is a distributed framework for processing web page information, covering page crawling, analysis, and business processing of the results.
gitignore
A collection of useful .gitignore templates
HashTree
Builds a hash directory tree from URLs for local storage and retrieval of web page information in a crawler.
HtmlExtract-Java
Extracts information from HTML: all text, Chinese text, keywords, Title, ICP, links and the ratio of internal to external links, forms, alerts, meta tags, redirects, sensitive words, and more.
HtmlExtract-Python
Extracts information from HTML: all text, Chinese text, keywords, Title, ICP, links and the ratio of internal to external links, forms, alerts, meta tags, redirects, sensitive words, and more.
MiniORM-MySQL
A lightweight ORM built on MySQLdb that provides formatted operations on MySQL.
UrlSplit
Splits a URL into scheme, service, host domain, top-level domain, port, path, params, and query.
xinyiZzz's Repositories
xinyiZzz/MiniORM-MySQL
A lightweight ORM built on MySQLdb that provides formatted operations on MySQL.
xinyiZzz/HashTree
Builds a hash directory tree from URLs for local storage and retrieval of web page information in a crawler.
xinyiZzz/HtmlExtract-Python
Extracts information from HTML: all text, Chinese text, keywords, Title, ICP, links and the ratio of internal to external links, forms, alerts, meta tags, redirects, sensitive words, and more.
xinyiZzz/UrlSplit
Splits a URL into scheme, service, host domain, top-level domain, port, path, params, and query.
xinyiZzz/Eggtart
Eggtart is a distributed framework for processing web page information, covering page crawling, analysis, and business processing of the results.
xinyiZzz/gitignore
A collection of useful .gitignore templates
xinyiZzz/HtmlExtract-Java
Extracts information from HTML: all text, Chinese text, keywords, Title, ICP, links and the ratio of internal to external links, forms, alerts, meta tags, redirects, sensitive words, and more.
xinyiZzz/awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
xinyiZzz/Daemonize-Manage
A base class for daemon management, providing daemon creation and termination, logging, and child-process management.
xinyiZzz/MiniORM-beanstalk
A lightweight ORM built on beanstalkc for operating beanstalk.
xinyiZzz/incubator-doris
Apache Doris (Incubating)
xinyiZzz/JsonExtractor
An extractor for JSON-like data that uses a stack and regular expressions to extract the values of keys at a specified nesting level.
xinyiZzz/Threadpool
A reusable thread pool that supports passing functions, arguments, and callback functions, and terminating all threads immediately. Threads are recycled to save time and resources.
xinyiZzz/bloaty
Bloaty McBloatface: a size profiler for binaries
xinyiZzz/ByConity
ByConity is an open source cloud-native data warehouse
xinyiZzz/ChatGPT-Next-Web-1
One-click deployment of a well-designed ChatGPT web UI on Vercel. Get your own ChatGPT web service in one click.
xinyiZzz/code_test
xinyiZzz/doris-vectorized
xinyiZzz/doris-website
Apache Doris Website
xinyiZzz/free-programming-books-zh_CN
:books: Free Chinese-language computer programming books; contributions welcome.
xinyiZzz/gophish
Open-Source Phishing Toolkit
xinyiZzz/httrack-py
Automatically exported from code.google.com/p/httrack-py
xinyiZzz/IQL
An ad hoc query service based on the Spark SQL engine.
xinyiZzz/lda2vec
xinyiZzz/libunwind
Mirror of official LLVM libunwind git repository located at http://llvm.org/git/libunwind. Updated every five minutes. http://llvm.org
xinyiZzz/Machine-Learning
xinyiZzz/show-me-the-code
Python exercise book: one small program a day.
xinyiZzz/spark
Apache Spark - A unified analytics engine for large-scale data processing
xinyiZzz/spark-doc-zh
Chinese edition of the official Apache Spark documentation.
xinyiZzz/xinyiZzz