philgooch/abbreviation-extraction

[tests] Some results, for reference ...

victoriastuart opened this issue · 0 comments

This is fabulous; thank you! :-)

I ran some tests. As expected, abbreviations in [ ] or { } are ignored (request: could these be supported?). As mentioned in Issue #12, it would also be nice if reverse definitions such as "n.s. (not significant)" were extracted, alongside the standard "not significant (n.s.)" form.
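
In the meantime, a possible workaround for the bracket cases is to rewrite [ ] and { } as ( ) before running the script. This is only a rough sketch: `normalize_brackets` is a hypothetical helper of mine, not part of this repository, and it assumes the input does not use square or curly brackets for anything that should stay distinct (citations, math, etc.).

```python
# Hypothetical pre-processing helper (not part of abbreviation-extraction).
# Maps square and curly brackets to parentheses so that "[BRCA2]" and
# "{BRCA2}" look like "(BRCA2)" to a parenthesis-based extractor.
_BRACKET_TABLE = str.maketrans({"[": "(", "]": ")", "{": "(", "}": ")"})


def normalize_brackets(text: str) -> str:
    """Return text with [ ] and { } replaced by ( )."""
    return text.translate(_BRACKET_TABLE)


if __name__ == "__main__":
    print(normalize_brackets("Breast cancer susceptibility gene 2 [BRCA2] is also a tumor suppressor protein."))
```

The normalized text could then be written to a temporary file and passed to `schwartz_hearst.py` in the same way as the test file below.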


## SINGLE-SENTENCE TESTS:

$ cat victoria-test-single_sentence.txt; python abbreviations/schwartz_hearst.py victoria-test-single_sentence.txt 

  Breast cancer susceptibility gene 1 (BRCA1) is a tumor suppressor protein.
  {'BRCA1': 'Breast cancer susceptibility gene 1'}

## Ditto (same command, other sentences):

  Breast cancer susceptibility gene 2 [BRCA2] is also a tumor suppressor protein.
  {}

  Victoria is from NS (Nova Scotia).
  {}

  Victoria is from N.S. (Nova Scotia).
  {}

  Victoria is from Nova Scotia (N.S.).
  {'N.S.': 'Nova Scotia'}

  Victoria is from Nova Scotia (NS).
  {'NS': 'Nova Scotia'}

  Breast cancer susceptibility gene 2 (BRCA2) is also a tumor suppressor protein.
  {'BRCA2': 'Breast cancer susceptibility gene 2'}

  Breast cancer susceptibility gene 2 {BRCA2} is also a tumor suppressor protein.
  {}

  Breast cancer susceptibility gene 2 -- BRCA2 -- is also a tumor suppressor protein.
  {}
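
For the reverse cases above (`NS (Nova Scotia)` and `N.S. (Nova Scotia)`, which both return `{}`), a second pass along these lines might be enough. This is a minimal sketch under my own assumptions, not the library's method: `reverse_definitions` and its regex are hypothetical, and the check is a simple "letters of the short form equal the word initials of the phrase in parentheses" rule, which is much stricter than the Schwartz-Hearst matching.

```python
import re

# Hypothetical pattern: a short token (2-10 letters/periods) directly
# followed by a parenthesised phrase, e.g. "N.S. (Nova Scotia)".
_REVERSE = re.compile(r"(\b[A-Za-z][A-Za-z.]{1,9})\s*\(([^()]+)\)")


def reverse_definitions(sentence: str) -> dict:
    """Find 'SHORT (long form)' pairs where SHORT spells the word initials."""
    found = {}
    for match in _REVERSE.finditer(sentence):
        short = match.group(1)
        long_form = match.group(2).strip()
        # Compare letters of the short form with the first letter of each word.
        letters = [c.lower() for c in short if c.isalnum()]
        initials = [word[0].lower() for word in long_form.split()]
        # Require a multi-word phrase so "Scotia (NS)" is not picked up.
        if len(initials) > 1 and letters == initials:
            found[short] = long_form
    return found


if __name__ == "__main__":
    print(reverse_definitions("Victoria is from N.S. (Nova Scotia)."))
    # {'N.S.': 'Nova Scotia'}
```

The results of such a pass could be merged with the dictionary the script already returns; standard-order sentences such as `Nova Scotia (NS)` are left to the existing extractor.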

## More complex tests:

  Victoria is from Nova Scotia (NoSc).
  {'NoSc': 'Nova Scotia'}

  Victoria is from Nova Scotia (ns).
  {'ns': 'Nova Scotia'}

  Victoria is from Nova Scotia (nosc).
  {'nosc': 'Nova Scotia'}

  Association of bioMediCal scientists of CanaDa (ABC).
  {'ABC': 'Association of bioMediCal scientists of CanaDa'}

I am impressed~ :-D
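
For reference, the lowercase and mixed-case results above (`nosc`, `ABC`) are consistent with the matching step described in the Schwartz-Hearst paper: characters are compared case-insensitively from right to left, and only the first character of the short form has to fall at the start of a word. Below is my own Python sketch of that step, following the published pseudocode. The implementation in this repository may differ, and `find_best_long_form` is simply the name I chose here.

```python
def find_best_long_form(short_form, long_form):
    """Right-to-left match of short_form against a candidate long_form.

    Returns the shortest trailing part of long_form that contains the
    characters of short_form in order, or None if there is no match.
    """
    l_index = len(long_form) - 1
    for s_index in range(len(short_form) - 1, -1, -1):
        curr = short_form[s_index].lower()
        if not curr.isalnum():
            continue  # skip punctuation such as the periods in "N.S."
        # Move left until the character matches; the first character of the
        # short form must additionally sit at the beginning of a word.
        while (l_index >= 0 and long_form[l_index].lower() != curr) or (
            s_index == 0 and l_index > 0 and long_form[l_index - 1].isalnum()
        ):
            l_index -= 1
        if l_index < 0:
            return None
        l_index -= 1
    # Cut the candidate at the start of the word where matching ended.
    start = long_form.rfind(" ", 0, l_index + 1) + 1
    return long_form[start:]


if __name__ == "__main__":
    print(find_best_long_form("nosc", "Victoria is from Nova Scotia"))
    # Nova Scotia
```

Because every comparison is lowercased, `NoSc`, `ns` and `nosc` all resolve to the same long form, which matches what the tests show.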