Issues
Error installing on Python 3.12.2
#1188 opened by joelryan2k - 10
BlockingError: No records have been blocked together.
#1179 opened by bwenyenye - 1
Add a new record to existing maps
#1180 opened by rderidder-lda - 2
Several independent groups
#1181 opened by rderidder-lda - 1
PyLBFGS package not compatible with python 3.11
#1169 opened by jack-odonoghue - 0
Process crashing while running.
#1168 opened by manoharsuggula - 1
Use of predict_proba
#1162 opened by pecade - 0
Feature importance on classifier
#1161 opened by pecade - 3
consider amortized costs for branch and bound heuristics
#1176 opened by fgregg - 26
Training not providing enough matches
#1077 opened by tigerang22 - 0
No Predicates Found after Providing Too Many Labels
#1156 opened by EvanOman - 1
Can't import 'dedupe_dataframe' because of numpy
#1150 opened by jordy-moddit - 0
Doc update to add reference for cluster scoring
#1148 opened by jaime-varela - 1
About Inverse Document Frequency implementation
#1126 opened by lmores - 3
Is incremental clustering supported?
#1113 opened by lmores - 5
Installation breaks because Levenshtein_search version 1.4.5 is no more listed on PyPi
#1129 opened by fsal - 4
Levenshtein_search GPL 3-licensed Revisited
#1128 opened by sarmohamed - 0
Improve typing of Data
#1136 opened by fgregg - 0
New Index Predicate types using Embeddings
#1143 opened by fgregg - 0
extend index predicates to whole model
#1144 opened by fgregg - 1
About CanopyIndex implementation
#1125 opened by lmores - 0
raise BlockingError( dedupe.core.BlockingError: No records have been blocked together. Is the data you are trying to match like the data you trained on? If so, try adding more training data.
#1119 opened by sowmyahnstreamforce - 1
memory leak
#1117 opened by Pobby321 - 3
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process
#1118 opened by paulmakeraiimi - 0
memory leak
#1116 opened by Pobby321 - 7
transition to plugins for dedupe variables.
#1085 opened by fgregg - 1
Syntax Error in dedupe/api.py
#1110 opened by sushantpatil99 - 2
ValueError in `numpy.concatenate` during active labeling in Record Linkage and Gazetteer examples
#1108 opened by manusturla - 6
Inference time of RecordLink is too slow
#1094 opened by QQSkill - 2
Consider HDBSCAN as clustering algorithm
#1092 opened by NickCrews - 3
Cut a release that includes #1087
#1095 opened by NickCrews - 3
Blocking as a feature for scoring
#1103 opened by fgregg - 1
Can we overhaul internals of Variables
#1104 opened by NickCrews - 5
Clustering scores containing 0 fails filtering
#1072 opened by NickCrews - 3
Add random_state everywhere for reproducibility
#1089 opened by NickCrews - 1
Error when reproducing Gazetteer Example
#1090 opened by hlra - 2
ConvergenceWarning during training
#1091 opened by NickCrews - 2
Point out gephi as a debugger
#1096 opened by NickCrews - 1
Enforce match when 2 fields are equal
#1075 opened by the-whopper - 1
NotADirectoryError: [WinError 267] The directory name is invalid: 'C:\\Users\\username\\AppData\\Local\\Temp\\tmpfb6idzyr\\blocks.db'
#1074 opened by mbkupfer - 1
Documenting the guarantee that fingerprinter won't emit duplicate tokens for the same field.
#1078 opened by fgregg