LexMapr

A Lexicon and Rule-Based Tool for Translating Short Biomedical Specimen Descriptions into Semantic Web Ontology Terms

Build status

The main script file for processing is `bin/lexmapr`.

Dependencies

Usage

usage: lexmapr [-h] [-o [OUTPUT]] [--format FORMAT] input_file [log_file]

positional arguments:
  input_file            Input csv file
  log_file              Log file

optional arguments:
  -h, --help            show this help message and exit
  -o [OUTPUT], --output [OUTPUT]
                        Output file
  --format FORMAT       Output format
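
The usage message above can also be driven from Python. The following is a minimal sketch, not part of LexMapr itself: the helper name `build_lexmapr_command` and the output path `small_simple_output.txt` are hypothetical, and it assumes the `lexmapr` executable is on your `PATH`. It only assembles the argument list in the order the usage message shows. No `--format` value is given, because the accepted values are not listed here.

```python
import shlex
from typing import List, Optional


def build_lexmapr_command(
    input_file: str,
    output: Optional[str] = None,
    fmt: Optional[str] = None,
    log_file: Optional[str] = None,
    executable: str = "lexmapr",
) -> List[str]:
    """Assemble an argument list that follows the usage message (hypothetical helper)."""
    command = [executable]
    if output is not None:
        # Pass an explicit value with -o. The usage message shows OUTPUT as
        # optional, so a bare -o directly before input_file could be read as
        # "-o <input_file>" by the argument parser.
        command += ["-o", output]
    if fmt is not None:
        # The accepted FORMAT values are not documented here, so the caller
        # must supply one that the installed version supports.
        command += ["--format", fmt]
    command.append(input_file)
    if log_file is not None:
        command.append(log_file)
    return command


command = build_lexmapr_command(
    "lexmapr/tests/input/small_simple.csv",
    output="small_simple_output.txt",  # hypothetical output path
)
print(shlex.join(command))
```

To actually run the tool, the resulting list can be handed to `subprocess.run(command, check=True)`. Building a list rather than a single string avoids shell quoting problems when file names contain spaces.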

Example input files (in `lexmapr/tests/input`)

| Filename | Description |
| --- | --- |
| `small_simple.csv` | A small simple test dataset |
| `enteroForFreq.csv` | Dataset from EnteroBase |
| `genomeTrackerMaster.csv` | Dataset from GenomeTrakr |
| `bccdcsample.csv` | Dataset from BCCDC |
| `zheminSamples.csv` | Zhemin's samples from EnteroBase |
| `GRDI-UniqueSamples.csv` | Dataset from GRDI |

Resource files (in `lexmapr/resources`)

| Filename | Description |
| --- | --- |
| `CombinedResourceTerms.csv` | All ontology terms with their IDs, extracted and combined into a single file |
| `SynLex.csv` | Synonym lexicon |
| `AbbLex.csv` | Abbreviation/acronym lexicon |
| `NefLex.csv` | Non-English food names lexicon |
| `ScorLex.csv` | Spelling correction lexicon |
| `SemLex.csv` | Semantic tagging lexicon |
| `inflection-exceptions.csv` | Exception list for avoiding false positives during inflection treatment |
| `candidateProcesses.csv` | Additional processes that are candidates for inclusion |
| `wikipediaCollocations.csv` | Additional compound terms (collocations) detected in datasets that are candidates for inclusion |
| `mining-stopwords.csv` | Stop word list refined for the domain under consideration |
