similarity

There are 405 repositories under the similarity topic.

  • vektonn

    Language: C# · Stars: 127
  • text2vec

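    text2vec, text to vector. A text vector representation tool that converts text into vector matrices and implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT; ready to use out of the box.

    Language: Python · Stars: 4.6k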
  • similarity

    similarity: a text similarity calculation toolkit written in Java. It can be used for tasks such as text similarity calculation and sentiment analysis, and works out of the box.

    Language: Java · Stars: 1.5k
  • dssim

    Image similarity comparison simulating human perception (multiscale SSIM in Rust)

    Language: Rust · Stars: 1.1k
  • python-string-similarity

    A library implementing different string similarity and distance measures using Python.

    Language: Python · Stars: 996
  • recordlinkage

    A powerful and modular toolkit for record linkage and duplicate detection in Python

    Language: Python · Stars: 975
  • similarities

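    Similarities: a toolkit for similarity calculation and semantic search. Supports text-to-text, text-to-image, and image-to-image search over hundreds of millions of records; developed in Python 3 and ready to use out of the box.

    Language: Python · Stars: 809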
  • Final_word_Similarity

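    A word similarity calculation method that combines the extended Tongyici Cilin synonym thesaurus with HowNet, giving broader vocabulary coverage and more accurate results.

    Language: Python · Stars: 724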
  • Macropodus

    Macropodus, a natural language processing toolkit based on an Albert+BiLSTM+CRF deep learning architecture. Features: Chinese word segmentation (CWS), part-of-speech tagging (POS), named entity recognition (NER), new word discovery (Find), keyword extraction (Keyword), text summarization (Summarize), text similarity (Sim), scientific calculator (Calculate), conversion between Chinese numerals and Arabic or Roman numerals (Chi2num), conversion between Traditional and Simplified Chinese, and pinyin conversion.

    Language: Python · Stars: 660
  • pHash

    pHash - the open source perceptual hash library

    Language: C++ · Stars: 567
  • BertSimilarity

    Computing the similarity of two sentences with Google's BERT algorithm. Semantic similarity calculation; text similarity calculation.

    Language: Python · Stars: 496
  • cogcomp-nlp

    CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.

    Language: Java · Stars: 475
  • Duplicate-Image-Finder

    difPy - Python package for finding duplicate and similar images

    Language: Python · Stars: 471
  • DISTS

    IQA: Deep Image Structure and Texture Similarity Metric

    Language: Python · Stars: 390
  • pg_similarity

    A set of functions and operators for executing similarity queries

    Language: C · Stars: 368
  • WordSimilarity

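    A word similarity calculation method based on the extended Tongyici Cilin synonym thesaurus from HIT (Harbin Institute of Technology).

    Language: Python · Stars: 357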
  • fast_vector_similarity

    The Fast Vector Similarity Library is designed to provide efficient computation of various similarity measures between vectors.

    Language: Rust · Stars: 353
  • Customer-Chatbot

    Chinese intelligent customer-service chatbot demo with two parts, chit-chat and professional Q&A (FAQ), and support for custom components.

    Language: Python · Stars: 309
  • textdistance.rs

    🦀📏 Rust library to compare strings (or any sequences). 25+ algorithms, pure Rust, common interface, Unicode support.

    Language: Rust · Stars: 281
  • geocoding

    🌐 Geocoding technology providing address standardization and similarity calculation.

    Language: Kotlin · Stars: 257
  • sensegram

    Making sense embedding out of word embeddings using graph-based word sense induction

    Language: Python · Stars: 212
  • html-similarity

    Compare HTML similarity using structural and style metrics

    Language: Python · Stars: 210
  • tensorflow-ml-nlp

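    Getting started with natural language processing using TensorFlow and machine learning (from logistic regression to Transformer chatbots)

    Language: Jupyter Notebook · Stars: 200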
  • ChatGLM-RLHF

    Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs (modify ChatGLM output with only RLHF)

    Language: Python · Stars: 190
  • synt

    Find similar functions and classes in your JavaScript/TypeScript code

    Language: TypeScript · Stars: 182
  • text-similarity

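    Text similarity (matching) calculation, providing baselines, training, inference, metric analysis... The code comes in both TensorFlow and PyTorch versions.

    Language: Python · Stars: 174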
  • xiangsi

    Chinese text similarity calculator

    Language: Python · Stars: 129
  • ml-classify-text-js

    Machine learning based text classification in JavaScript using n-grams and cosine similarity

    Language: JavaScript · Stars: 126
  • unisim

    UniSim is a package for efficient similarity computation, fuzzy matching, and clustering of data.

    Language: Python · Stars: 120
  • text-similarity-php

    Calculate text similarity using cosine similarity and word segmentation (PHP version)

    Language: PHP · Stars: 109
  • rltk

    Record Linkage ToolKit (Find and link entities)

    Language: Python · Stars: 107
  • ruimtehol

    R package to Embed All the Things! using StarSpace

    Language: C++ · Stars: 102
  • nxontology

    NetworkX-based Python library for representing ontologies

    Language: Python · Stars: 84
  • goodreads-toolbox

    9 tools for Goodreads.com, for finding people based on the books they’ve read, finding books popular among the people you follow, following new book reviews, etc

    Language: Perl · Stars: 83
  • levenshtein.c

    Levenshtein algorithm in C

    Language: C · Stars: 82
  • aurora

    Malware similarity platform with modularity in mind.

    Language: Python · Stars: 76