frldj's Stars
zylon-ai/private-gpt
Interact with your documents using the power of GPT, 100% privately, no data leaks
spotify/annoy
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
cemoody/lda2vec
castorini/pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
codebasics/deep-learning-keras-tf-tutorial
Learn deep learning from scratch with TensorFlow 2.0, Keras, and Python through this comprehensive tutorial series for beginners.
weaviate/semantic-search-through-wikipedia-with-weaviate
Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine
kstathou/vector_engine
Build a semantic search engine with Transformers and Faiss
acrosson/nlp
Natural Language Processing
Sarthakjain1206/Intelligent_Document_Finder
Document Search Engine Tool
AravindR7/Topic-Modeling-BERT-LDA
Topic modeling with BERT, LDA and Clustering. Latent Dirichlet Allocation (LDA) probabilistic topic assignment and pre-trained sentence embeddings from BERT/RoBERTa.
YatinChaudhary/TopicBERT
Implementation of EMNLP2020 accepted paper: "TopicBERT: Topic-aware BERT for Efficient Document Classification"
ayesha92ahmad/NLP-image-to-text
code to extract text from images
dgunning/cord19
a repo for the cord19 challenge
ryderling/DGMS
Code for "Deep Graph Matching and Searching for Semantic Code Retrieval"
Ismailhachimi/French-Word-Embeddings
French word embeddings from series subtitles
AustinKrause/nyt-article-summarizer
New York Times Article Summarization Tool
IkshitaMishra/TopicModelling-LSA-LDA
Retrieving 'Topics' (concepts) from a corpus using (1) Latent Dirichlet Allocation (Gensim) for modelling, with Perplexity and Coherence scores as evaluation metrics, and (2) Latent Semantic Analysis using Term Frequency-Inverse Document Frequency and Truncated Singular Value Decomposition.
beaupletga/Search_Engine_for_Wikipedia
Implementing from scratch a search engine for the French Wikipedia
boxabhi/django_elastic_demo
iam-mhaseeb/Python-Implementation-of-LSA
A Jupyter notebook on the implementation of Latent Semantic Analysis (a topic modelling algorithm) in Python.
misbahulard/search-engine-tfidf
Search engine implementation with the TF-IDF algorithm using Python + Flask + MySQL
yym6472/bert_semantic_matching
BERT-based Chinese semantic matching, built on allennlp.
SivaAndMe/Coarse-grained-Sentiment-Analysis-on-Swachh-Bharat-using-Tweets
Arowwa/CamembertForFun
Small project of sentiment classification using CamemBERT trained on Allociné reviews and with a webapp interface
HARIHARAN548/Checkbox-Table-cell-detection-using-OpenCV-Python
Extracts relevant information from unstructured data sources such as OMR sheets, scanned invoices, and bills into structured data, using Computer Vision and Natural Language Processing. The primary steps it depends on are Optical Character Recognition and Document Layout Analysis. Optical Character Recognition (OCR) detects the text in the image, and we also try to recover additional metadata from the documents, such as headers, paragraphs, lines, words, tables, and key-value pairs.
hiteshmishra708/django-elasticsearch
msesmart/InformationRetrieval
Real Yelp review data, cosine similarity ranking of query reviews in vector space, TF-IDF model. Unigram and bigram language models with linear interpolation smoothing, absolute discounting smoothing, and Dirichlet smoothing. Perplexity analysis. Evaluations of six language models, including Boolean, TF-IDF, Okapi BM25, Pivoted Length Normalization, Jelinek-Mercer smoothing, and Dirichlet Prior Smoothing. The evaluation methods include Mean Average Precision, P@K, Reciprocal Rank, and Normalized Discounted Cumulative Gain (NDCG).
ChosenOne2241/BM25
A complete implementation of Okapi BM25 with five evaluation methods (precision, recall, MAP, P at N and NDCG at N), using only standard Python libraries.
jennifernguyen281/Forecast-Daily-Interstate-94-Westbound-Traffic-Volume-for-MN-DoT-ATR-Station-301
Time series forecasting project using SAS
kassimi98/Information_retrieval_system
Information retrieval system: Python, text mining, bag of words, web mining, word2vec, Jupyter, IPYNB