longxudou
LLM Researcher @sail-sg. Maintainer ⚓️Sailor | 🔱Sailor2 | 🚢 SailCraft | 🧭 SailCompass
Research Scientist @ Sea AI Lab
Harbin
Pinned Repositories
Assembly-Language-HIT-2015-Winter
HIT-SCIR-CoNLL2019
"HIT-SCIR at MRP 2019: A Unified Pipeline for Meaning Representation Parsing via Efficient Training and Effective Encoding"-1st system in CoNLL2019 shared task
JAMR
Improve the installation of JAMR with a new Scala version
multispider
MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing
Paper_Reading
This repo stores the papers I have read. Most of them are on Natural Language Processing or Deep Learning topics.
sailcompass
🧭 SailCompass: Towards Reproducible and Robust Evaluation for Southeast Asian Languages
sailcraft
🚢 Data Toolkit for Sailor Language Models
sailor-llm
[EMNLP-2024] ⚓️ Sailor: Open Language Models for South-East Asia
sailor2
🔱 Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
scaling-with-vocab
[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
longxudou's Repositories
longxudou/HIT-SCIR-CoNLL2019
"HIT-SCIR at MRP 2019: A Unified Pipeline for Meaning Representation Parsing via Efficient Training and Effective Encoding"-1st system in CoNLL2019 shared task
longxudou/multispider
MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing
longxudou/Paper_Reading
This repo stores the papers I have read. Most of them are on Natural Language Processing or Deep Learning topics.
longxudou/JAMR
Improve the installation of JAMR with a new Scala version
longxudou/HIT-SCIR-CoNLL2020
"HIT-SCIR at MRP 2020: Transition-based Parser and Iterative Inference Parser"-3rd system in CoNLL2020 shared task
longxudou/Misc_Learning
longxudou/rat-sql
A relation-aware semantic parsing model from English to SQL
longxudou/allennlp-reading-comprehension
longxudou/BLINK
Entity Linker solution
longxudou/ContextualSP
Open-source code for multiple papers from the Microsoft Research Asia DKI group
longxudou/deduplicate-text-datasets
longxudou/DPR
Dense Passage Retriever is a set of tools and models for open-domain Q&A tasks.
longxudou/duorat
longxudou/edit-distance
Python library for computing edit distance between arbitrary Python sequences.
longxudou/example-app-editable-dataframe
This is a demo of a dataframe with editable cells, powered by `streamlit-aggrid` from Pablo Fonseca. You can edit the cells by clicking on them and then export your selection to a csv file! 🎈
longxudou/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
longxudou/gazp
Source code for Grounded Adaptation for Zero-shot Executable Semantic Parsing
longxudou/GENRE
Autoregressive Entity Retrieval
longxudou/IRNet-1
An algorithm for cross-domain NL2SQL
longxudou/longxudou.github.io
longxudou/Megatron-LLM
distributed trainer for LLMs
longxudou/nepali-translator
Neural Machine Translation on the Nepali-English language pair
longxudou/picard
PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
longxudou/Research
novel deep learning research works with PaddlePaddle
longxudou/scaling-with-vocab
📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
longxudou/spider-schema-gnn
Author implementation of the paper "Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing"
longxudou/sqlova
longxudou/st-chat
Streamlit Component, for a Chatbot UI
longxudou/TaBERT
This repository contains source code for the TaBERT model, a pre-trained language model for learning joint representations of natural language utterances and (semi-)structured tables for semantic parsing. TaBERT is pre-trained on a massive corpus of 26M Web tables and their associated natural language context, and can be used as a drop-in replacement for a semantic parser's original encoder to compute representations for utterances and table schemas (columns).
longxudou/tensor2struct-public
Semantic parsers based on encoder-decoder framework