Pinned Repositories
albert
A fast and flexible keyboard launcher
albert-chinese-ner
Chinese NER using the pre-trained language model ALBERT
CCF-BDCI-2021-POI-3rd-Prize
Third-prize solution for the CCF BDCI 2021 Amap (Gaode) POI name generation task (full code and documentation)
ccks-2020-finance-transfer-ee-baseline
Baseline for the CCKS 2020 few-shot cross-class transfer event extraction task in the financial domain
CCKS-2021-Financial-Event-Extraction_Rank-6th
Sixth-place solution and code for document-level event extraction and event causality extraction in the financial domain
CCKS-2021-Huawei-Event-Extraction
Huawei baseline solution and code for process-type event extraction in the telecommunications domain
Chinese-Word-Vectors
100+ Chinese Word Vectors: over a hundred sets of pre-trained Chinese word vectors
ChineseNMT
ChineseNMT: Translate English to Chinese with a PyTorch implementation of the Transformer
DeepRL
Deep Reinforcement Learning Lab, a platform designed to make DRL technology fun for everyone
Text2SQL-or-NL2SQL-End2End-baseline
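This solution placed in the top 3 in the following competitions: 1. 2021 Baidu PaddlePaddle & State Grid AI Innovation Competition, semantic parsing track (third prize); 2. Qianyan semantic parsing (champion); 3. CCKS 2022 financial NL2SQL (runner-up); 4. WAIC 2022 Text2SQL (third prize)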
shenzaimin's Repositories
shenzaimin/moz-sql-parser
DEPRECATED - Let's make a SQL parser so we can provide a familiar interface to non-sql datastores!
shenzaimin/CLUECorpus2020
Large-scale Pre-training Corpus for Chinese (100 GB)
shenzaimin/chinese-poetry
The most comprehensive database of classical Chinese poetry: nearly 14,000 poets of the Tang and Song dynasties, close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci by 1,564 ci poets of the Song period.
shenzaimin/spark-nlp
State of the Art Natural Language Processing
shenzaimin/Statistical-Learning-Method_Code
Hand-written implementations of all the algorithms in Li Hang's book "Statistical Learning Methods"
shenzaimin/TabularSemanticParsing
Translating natural language questions to a structured query language
shenzaimin/wikiextractor
A tool for extracting plain text from Wikipedia dumps
shenzaimin/OpenASR
A PyTorch-based end-to-end speech recognition system.
shenzaimin/DeepRL
Deep Reinforcement Learning Lab, a platform designed to make DRL technology fun for everyone
shenzaimin/End-to-end-ASR-Pytorch
This is an open-source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with PyTorch, the well-known deep learning toolkit.
shenzaimin/MK-SQuIT
Synthesizing Questions using Iterative Template-Filling
shenzaimin/GNNPapers
Must-read papers on graph neural networks (GNN)
shenzaimin/text-classification-cn
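Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pre-trained models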
shenzaimin/scikit-opt
Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Optimization Algorithm, Immune Algorithm, Artificial Fish Swarm Algorithm, Differential Evolution and TSP (Traveling Salesman)
shenzaimin/MacBERT
Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP)
shenzaimin/spider
scripts and baselines for Spider: Yale complex and cross-domain semantic parsing and text-to-SQL challenge
shenzaimin/ccf_2020_qa_match
CCF 2020 QA match competition
shenzaimin/competition_baselines
Open-source baselines for major competitions
shenzaimin/nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
shenzaimin/vokenization
PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision"
shenzaimin/chisp
scripts and baselines for CSpider: Chinese semantic parsing and text-to-SQL challenge
shenzaimin/Electra_CRF_NER
We start a company-name recognition task with small-scale, low-quality training data, then use several techniques to improve model training speed and prediction performance with minimal manual effort. The methods involve lightweight pre-trained models such as Albert-small or Electra-small with a financial corpus, knowledge distillation, and multi-stage learning. As a result, the recall of the company-name recognition task improves from 0.73 to 0.92, and the model is 4 times as fast as the BERT-Bilstm-CRF model.
shenzaimin/gector
Official implementation of the paper “GECToR – Grammatical Error Correction: Tag, Not Rewrite” // Published at the BEA15 Workshop (co-located with ACL 2020) https://www.aclweb.org/anthology/2020.bea-1.16.pdf
shenzaimin/THUOCL
THUOCL (THU Open Chinese Lexicon): a Chinese lexicon
shenzaimin/pegasus
shenzaimin/masr
Mandarin Automatic Speech Recognition
shenzaimin/CCKS2019_EventEntityExtraction_Rank5
SEBERTNets: an event entity extraction method for the financial domain
shenzaimin/ccks-2020-finance-transfer-ee-baseline
Baseline for the CCKS 2020 few-shot cross-class transfer event extraction task in the financial domain
shenzaimin/ccks2020-baseline
CCKS 2020: evaluation of ontology-based automated construction of financial knowledge graphs
shenzaimin/AutoIE