xtannier
Professor at Sorbonne Université (formerly known as Univ. Pierre et Marie Curie — UPMC). Teaches at Polytech Sorbonne. Researcher at LIMICS.
Sorbonne Université, Inserm, Limics, Polytech Sorbonne
Paris, France
Pinned Repositories
DCTFinder
Extracts the title and creation time from a web page.
EZAnnot
Tool for fast concept and rule-based extraction for dummies.
hyperopt-sklearn
Hyper-parameter optimization for sklearn
kea
A tokenizer for French
MeSH-C_classification
NCRFpp
NCRF++, an Open-source Neural Sequence Labeling Toolkit. It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components. (code for COLING/ACL 2018 paper)
nlstruct
Natural language structuring library
PyRATA
"Python Rule-based feAture sTructure Analysis" or "Python Rule-bAsed Text Analysis"
simpletransformers
Transformers made simple, with training, evaluation, and prediction each possible in one line. Currently supports Sequence Classification (binary, multiclass, multilabel, sentence pair), Token Classification (NER), Question Answering, Language Modeling, Regression, Conversational AI, and Multi-Modal tasks. Built on top of the Hugging Face Transformers library.
WebAnnotator
WebAnnotator is a tool for annotating Web pages. It is implemented as a Firefox extension (https://addons.mozilla.org/en-US/firefox/addon/webannotator/) and supports annotation of both offline and online pages. The HTML rendering is fully preserved, and all annotations consist of new HTML spans with specific styles. WebAnnotator provides an easy, general-purpose framework and is made available under the CeCILL free license (close to the GNU GPL; see the license text), so that use and further contributions are simple. All parts of an HTML document can be annotated: text, images, videos, tables, menus, etc. Annotations are created by selecting a part of the document and clicking on the relevant type and subtypes; the annotated elements are then highlighted in a specific color. Users define annotation schemas by creating a simple DTD representing the types and subtypes that must be highlighted. Finally, annotations can be saved (as HTML with the highlighted parts of documents) or exported (in a machine-readable format).
xtannier's Repositories
xtannier/WebAnnotator
WebAnnotator is a tool for annotating Web pages. It is implemented as a Firefox extension (https://addons.mozilla.org/en-US/firefox/addon/webannotator/) and supports annotation of both offline and online pages. The HTML rendering is fully preserved, and all annotations consist of new HTML spans with specific styles. WebAnnotator provides an easy, general-purpose framework and is made available under the CeCILL free license (close to the GNU GPL; see the license text), so that use and further contributions are simple. All parts of an HTML document can be annotated: text, images, videos, tables, menus, etc. Annotations are created by selecting a part of the document and clicking on the relevant type and subtypes; the annotated elements are then highlighted in a specific color. Users define annotation schemas by creating a simple DTD representing the types and subtypes that must be highlighted. Finally, annotations can be saved (as HTML with the highlighted parts of documents) or exported (in a machine-readable format).
xtannier/DCTFinder
Extracts the title and creation time from a web page.
xtannier/EZAnnot
Tool for fast concept and rule-based extraction for dummies.
xtannier/hyperopt-sklearn
Hyper-parameter optimization for sklearn
xtannier/kea
A tokenizer for French
xtannier/MeSH-C_classification
xtannier/NCRFpp
NCRF++, an Open-source Neural Sequence Labeling Toolkit. It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components. (code for COLING/ACL 2018 paper)
xtannier/nlstruct
Natural language structuring library
xtannier/PyRATA
"Python Rule-based feAture sTructure Analysis" or "Python Rule-bAsed Text Analysis"
xtannier/simpletransformers
Transformers made simple, with training, evaluation, and prediction each possible in one line. Currently supports Sequence Classification (binary, multiclass, multilabel, sentence pair), Token Classification (NER), Question Answering, Language Modeling, Regression, Conversational AI, and Multi-Modal tasks. Built on top of the Hugging Face Transformers library.
xtannier/term-extractor
Term extraction
xtannier/yaset
Yet Another SEquence Tagger