Literature, References, Resources, Papers, Links, Links to Libraries etc.
Note, 4.1.2023: During this research effort I have browsed, reviewed, visited and revisited, and studied a huge number of articles and concepts, many of them reached by association while browsing, as a source of ideas. The best approach would be to store them in some special representation: a DB, a semantic network, etc.
So far I am starting with one out of many hundreds, or maybe a thousand: a general curiosity, growing from that seed. This is a research & development project in its own right: an automatic analysis and learning assistant, a reading assistant and accelerator, a cognitive accelerator, etc. It is an unpublished "in-house" project and experimental application called [Research] Assistant, or ACS for short (Assistant C#), which serves as a playground and inspiration for ideas and developments in these directions of "Cognitive Acceleration". In a broader sense, though, any computer and any software is such a tool.
Various Statistical Similarity methods: https://en.wikipedia.org/wiki/Semantic_similarity
A blog on Question Answering etc.: https://queryunderstanding.com/
Speech Recognition datasets etc.
https://ai.meta.com/blog/voxpopuli-the-largest-open-multilingual-speech-corpus-for-ai-translation-and-more/
https://arxiv.org/abs/2006.13979
https://ai.meta.com/blog/xls-r-self-supervised-speech-processing-for-128-languages/
Language Identification library: tested, use the small model
https://fasttext.cc/docs/en/language-identification.html
https://huggingface.co/facebook/fasttext-language-identification
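A minimal usage sketch for the note above, assuming "the small model" means the compressed `lid.176.ftz` file from the fastText page and that it has already been downloaded locally. The helper names `strip_label` and `identify_language` are made up for illustration; only `fasttext.load_model` and `model.predict` belong to the library.

```python
from typing import List, Tuple

LABEL_PREFIX = "__label__"  # fastText prepends this to every predicted label


def strip_label(label: str) -> str:
    """Turn '__label__bg' into 'bg'; leave other strings unchanged."""
    return label[len(LABEL_PREFIX):] if label.startswith(LABEL_PREFIX) else label


def identify_language(model, text: str, k: int = 1) -> List[Tuple[str, float]]:
    """Return the k most likely (language, probability) pairs for `text`.

    `model` is any object with a fastText-style predict(text, k=...) method.
    """
    # fastText's predict() rejects input containing newlines, so flatten them.
    labels, probs = model.predict(text.replace("\n", " "), k=k)
    return [(strip_label(label), float(p)) for label, p in zip(labels, probs)]


if __name__ == "__main__":
    import fasttext  # pip install fasttext

    # Assumed local path to the small compressed model.
    model = fasttext.load_model("lid.176.ftz")
    print(identify_language(model, "Това е изречение на български.", k=2))
```

The model on the Hugging Face page uses a different label scheme (language plus script), so the stripped labels will look different there.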
Common Crawl tools
https://github.com/facebookresearch/cc_net
Huge Dataset
https://github.com/togethercomputer/RedPajama-Data
...
https://arxiv.org/abs/2007.10310
Bulgarian POS-tagger and NER-tagger: Applied
https://github.com/AMontgomerie/bulgarian-nlp
https://github.com/AMontgomerie/bulgarian-nlp/blob/master/examples/pos_example.ipynb
https://github.com/AMontgomerie/bulgarian-nlp/blob/master/examples/text_annotator_example.ipynb
About the Named-entity tags:
https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)
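To make the IOB scheme concrete, here is a small sketch that groups per-token tags into entity spans. The tag names (`PER`, `LOC`) and the function `iob_to_spans` are assumptions for illustration; the exact tag set emitted by the bulgarian-nlp tagger may differ.

```python
from typing import List, Tuple


def iob_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group IOB-tagged tokens into (entity_type, text) spans.

    B-X starts an entity of type X, I-X continues it, O is outside any entity.
    A stray I-X with no matching open entity is treated leniently as a start.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = None

    def flush() -> None:
        # Close the entity being built, if there is one.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        prefix, _, ent_type = tag.partition("-")
        if prefix == "B" or (prefix == "I" and ent_type != current_type):
            flush()
            current_tokens, current_type = [token], ent_type
        elif prefix == "I":
            current_tokens.append(token)
        else:  # "O" or anything unrecognized ends the current entity
            flush()
            current_tokens, current_type = [], None
    flush()
    return spans


tokens = ["Ivan", "Vazov", "was", "born", "in", "Sopot"]
tags = ["B-PER", "I-PER", "O", "O", "O", "B-LOC"]
print(iob_to_spans(tokens, tags))  # [('PER', 'Ivan Vazov'), ('LOC', 'Sopot')]
```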
PHATGOOSE Repository
PHATGOOSE, which stands for Post-Hoc Adaptive Gating Over an Ocean of Specialized Experts, enables zero-shot generalization from specialized experts (e.g., PEFT modules) trained on diverse datasets by adaptively routing among them. It requires an additional, inexpensive training step: a gate is trained in front of each frozen PEFT module for its corresponding task.
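As a rough illustration of what "routing among experts" can look like, the sketch below scores one activation vector against one gate vector per expert and keeps the top-k experts with softmax weights. This is a generic top-k dot-product router written for this note, not the PHATGOOSE repository's code; how the gates are actually trained and normalized is not reproduced here, and the expert names and the function `route_top_k` are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def route_top_k(activation: List[float],
                gate_vectors: Dict[str, List[float]],
                k: int = 2) -> List[Tuple[str, float]]:
    """Pick the k experts whose gate vectors best match the activation.

    Returns (expert_name, weight) pairs, best first, with weights summing to 1.
    """
    # Score each expert by the dot product of its gate vector and the activation.
    scores = {
        name: sum(a * g for a, g in zip(activation, gate))
        for name, gate in gate_vectors.items()
    }
    top = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]

    # Softmax over the selected scores only (shifted by the max for stability).
    max_score = top[0][1]
    exps = [(name, math.exp(score - max_score)) for name, score in top]
    total = sum(value for _, value in exps)
    return [(name, value / total) for name, value in exps]


# Hypothetical gate vectors, one per specialized expert.
gates = {"qa": [1.0, 0.0], "summarization": [0.0, 1.0], "nli": [-1.0, 0.0]}
print(route_top_k([2.0, 1.0], gates, k=2))
```

The outputs of the selected experts would then be combined using these weights; that step is omitted to keep the sketch short.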
Pyvene
Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions
https://github.com/stanfordnlp/pyvene
https://arxiv.org/abs/2403.07809
Depth map monocular ... depth estimation ... synthetic data, real data ...
Depth Anything V2
Lihe Yang¹, Bingyi Kang²†, Zilong Huang², Zhen Zhao, Xiaogang Xu, Jiashi Feng², Hengshuang Zhao¹‡
¹HKU, ²TikTok; † project lead, ‡ corresponding author
https://depth-anything-v2.github.io/