yuxiaw

yuxiaw's Stars

gradio-app/gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
Language: Python 35.1k 178 5.2k 2.6k
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python 15k 123 1.2k 1.4k
brightmart/nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
9.6k 287 451.5k
esbatmop/MNBVC
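MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" internet slang. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.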
3.6k 66 56252
Libr-AI/OpenFactVerification
Loki: Open-source solution designed to automate the process of verifying factuality
Language: Python 1k 5 745
GAIR-NLP/factool
FacTool: Factuality Detection in Generative AI
Language: Python 845 10 2963
chatopera/efaqa-corpus-zh
❤️ Emotional First Aid Dataset: a psychological counseling Q&A and chatbot corpus
Language: Python 653 17 082
facebookresearch/EmpatheticDialogues
Dialogue model that produces empathetic responses when trained on the EmpatheticDialogues dataset.
Language: Python 463 12 4264
shmsw25/FActScore
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
Language: Python 308 4 3648
Sahandfer/EMPaper
This is a repository for sharing papers in the field of empathetic conversational AI. The related source code for each paper is linked if available.
248 13 226
Libr-AI/do-not-answer
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs
Language: Jupyter Notebook 200 5 125
behavioral-data/Empathy-Mental-Health
Repository containing code and dataset access instructions for the EMNLP 2020 paper on empathy in text-based mental health support
Language: Python 158 13 1535
Libr-AI/OpenRedTeaming
Papers about red teaming LLMs and Multimodal models.
91 9 05
yuxiaw/Factcheck-GPT
Fact-Checking the Output of Generative Large Language Models in both Annotation and Evaluation.
Language: Python 82 4 611
mbzuai-nlp/SemEval2024-task8
SemEval2024-task8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection
Language: Python 71 9 228
marslanm/Multimodality-Representation-Learning
This repository provides a comprehensive collection of research papers on multimodal representation learning, all of which are cited and discussed in the recently accepted survey (https://dl.acm.org/doi/abs/10.1145/3617833).
69 8 07
behavioral-data/PARTNER
Repository containing code for the WWW 2021 paper on empathic rewriting
Language: Python 64 9 1014
hkust-nlp/felm
GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)
Language: Python 56 3 51
Nanami18/Snowballed_Hallucination
44 2 04
anthonywchen/RARR
RARR: Researching and Revising What Language Models Say, Using Language Models
Language: Python 43 2 38
yuxiaw/OpenFactCheck
Language: Python 37 3 12
ryuryukke/OUTFOX
[AAAI 2024] The official repository for our paper, "OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples"
Language: Python 36 2 03
lanluyu/zhihu
zhihu is a crawler for Zhihu topic content; it can scrape all topic-related Q&A content on Zhihu.
Language: Python 30 0 114
oaimli/PeerSum
The dataset and code for PeerSum at EMNLP'23.
Language: Python 14 1 21
mitmedialab/empathic-stories
Language: HTML 13 14 03
yuxiaw/USTS
This work explores collective human opinions in Semantic Textual Similarity and releases USTS, a new uncertainty-aware STS dataset.
1 1 00