sonlam1102
UIT - VNUHCM
University of Information Technology (UIT), Vietnam National University, Ho Chi Minh City
sonlam1102's Stars
acl-org/acl-anthology
Data and software for building the ACL Anthology.
RUCAIBox/HaluEval
This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.
binhvq/news-corpus
Vietnamese corpus
google-deepmind/AQuA
An algebraic word problem dataset, with multiple-choice questions annotated with rationales.
jindongwang/Pytorch-CapsuleNet
An easy-to-follow Pytorch implementation of Hinton's Capsule Network
jerbarnes/semeval22_structured_sentiment
SemEval-2022 Shared Task 10: Structured Sentiment Analysis
jerbarnes/sentiment_graphs
Graph parsing approach to structured sentiment analysis.
VT-NLP/Mocheg
Dataset and Code for Multimodal Fact Checking and Explanation Generation (Mocheg)
phusroyal/ViHOS
Repository for the paper "ViHOS: Vietnamese Hate and Offensive Spans Detection" (EACL2023)
CSHaitao/THUIR-COLIEE2023
Code to reproduce THUIR's submissions for COLIEE 2023 Task 1 and Task 2
bino282/ViNLP
ds4v/absa-vlsp-2018
End-to-end Multi-task Solutions for Aspect Category Sentiment Analysis (ACSA) on Vietnamese Datasets
kh4nh12/llm_learning_resource
Large Language Models (LLMs) Learning Resources
seoneun/T5-Question-Generation
SQuAD Question Generation module based on T5-large
heraclex12/vietpunc
Vietnamese Punctuation Prediction using Pretrained Language Models
mrzjy/sunburst
A simple Python implementation of the n-gram sunburst (nested pie chart) visualization shown in the CoQA paper
kh4nh12/ViVQA
LuongPhan/UIT-ViSFD
cyoon47/CS1QA
Repository for CS1QA: A Dataset for assisting Code-based Question Answering in an Introductory Programming Course, published at NAACL 2022
kietnv/vireader
Machine Reading Comprehension has attracted significant interest in research on natural language understanding, and large-scale datasets and neural network-based methods have been developed for this task. However, most developments of resources and methods in machine reading comprehension have been investigated using two resource-rich languages, English and Chinese. This article proposes a system called ViReader for open-domain machine reading comprehension in Vietnamese by using Wikipedia as the textual knowledge source, where the answer to any particular question is a textual span derived directly from texts on Vietnamese Wikipedia. Our system combines a sentence retriever component, based on techniques of information retrieval to extract the relevant sentences, with a transfer learning-based answer extractor trained to predict answers based on Wikipedia texts. Experiments on multiple datasets for machine reading comprehension in Vietnamese and other languages demonstrate that (1) our ViReader system is highly competitive with prevalent machine learning-based systems, and (2) multi-task learning by using a combination consisting of the sentence retriever and answer extractor is an end-to-end reading comprehension system. The sentence retriever component of our proposed system retrieves the sentences that are most likely to provide the answer response to the given question. The transfer learning-based answer extractor then reads the document from which the sentences have been retrieved, predicts the answer, and returns it to the user. The ViReader system achieves new state-of-the-art performances, with values of 70.83% EM (exact match) and 89.54% F1, outperforming the BERT-based system by 11.55% and 9.54%, respectively. It also obtains state-of-the-art performance on UIT-ViNewsQA (another Vietnamese dataset consisting of online health-domain news) and BiPaR (a bilingual dataset on English and Chinese novel texts). 
Compared with the BERT-based system, our system achieves significant improvements (in terms of F1) with 7.65% for English and 6.13% for Chinese on the BiPaR dataset. Furthermore, we build a ViReader application programming interface that programmers can employ in Artificial Intelligence applications.
barshana-banerjee/ParaQA
A Dataset with Multiple Paraphrase Responses for Single-Turn Question Answering
ngxtnhi/ViLexNorm
A Lexical Normalization Corpus for Vietnamese Social Media Text
vuraemon/UITws-v1
[PACLING 2019] Vietnamese Word Segmentation with SVM: Ambiguity Reduction and Suffix Capture
kh4nh12/UIT-ViON-Dataset
An Open-domain, Large-scale and High-quality Dataset of Online News in Vietnamese
AnhHoang0529/Small-LexNormViHSD
A Dataset for Vietnamese Lexical Normalization
JkUndead/UIT-ViNames-Dataset
UIT-ViNames is a dataset for Predicting Genders for Vietnamese Names
DoPhamPhucTinh/ViRe4MRC
ViRe4MRC is the first benchmark for review-based machine reading comprehension in Vietnamese. ViRe4MRC contains 6,603 data points, human-generated from 2,174 reviews on two domains: restaurant and smartphone.
arghadeep25/Database
Geodatabase using PostgreSQL
Khoacannotcode/VinAI_2020
VinBigData Chest X-ray Abnormalities Detection
mlip-cmu/mlip-cmu.github.io
Homepage