A combinatory categorial grammar (CCG) library for the web.
NOTE: Work in progress can be found on the development branch.
- Node.js ^= 12.18.2
- NPM ^= 6.14.7
Run npm install and we are all set.
Include this library in your HTML file.
<script type="text/javascript" src="/path/to/ccgjs"></script>
Replace /path/to/ccgjs with the CCGjs library URL.
Then, use it:
<script type="text/javascript">
  const { CCG } = ccgjs;
  // do something
</script>
See examples/index.html for reference.
Read and parse a machine-readable CCG derivation into a JavaScript object.
Usage:
const str = '(<T S 0 2> (<L S/NP PSP PSP Hi S/NP>) (<L NP NNP NNP Wisnu NP>))';
const reader = new CCG.Reader(str);
if (reader.read()) {
  console.log(reader.result);
}
The returned object looks like this:
{
  node: {
    type: 'T',
    ccgCat: 'S',
    head: 0,
    dtrs: 2,
  },
  left: {
    node: {
      type: 'L',
      ccgCat: 'S/NP',
      modPOSTag: 'PSP',
      origPOSTag: 'PSP',
      word: 'Hi',
      predArgCat: 'S/NP',
    },
  },
  right: {
    node: {
      type: 'L',
      ccgCat: 'NP',
      modPOSTag: 'NNP',
      origPOSTag: 'NNP',
      word: 'Wisnu',
      predArgCat: 'NP',
    },
  },
}
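Because lexical leaves have type 'L' and internal nodes have type 'T', the sentence can be recovered by walking this object depth-first. The snippet below is a minimal sketch and not part of CCGjs: collectWords is a hypothetical helper, and it assumes only the object shape shown above, with left and right absent on leaves.

```javascript
// Hypothetical helper (not part of CCGjs): collect the words of a parsed
// derivation in left-to-right order. Assumes the object shape shown above.
function collectWords(result) {
  if (!result) return [];
  if (result.node.type === 'L') return [result.node.word];
  return [...collectWords(result.left), ...collectWords(result.right)];
}

// Sample input: a trimmed copy of the object shown above.
const readerSample = {
  node: { type: 'T', ccgCat: 'S', head: 0, dtrs: 2 },
  left: { node: { type: 'L', ccgCat: 'S/NP', word: 'Hi' } },
  right: { node: { type: 'L', ccgCat: 'NP', word: 'Wisnu' } },
};

console.log(collectWords(readerSample).join(' ')); // prints "Hi Wisnu"
```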
We use PEG.js to build the parser.
The parsing expression grammar can be found in the src/ccg.pegjs file,
and the generated parser in the src/generated.pegjs.ts file.
Run npm run pegjs to generate the .ts file from the .pegjs file.
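To make the input format concrete, here is a hand-written recursive-descent sketch of a reader for it. It is not the PEG.js grammar from src/ccg.pegjs and may differ from it in edge cases. parseDerivation is a hypothetical name, and the handling of single-daughter nodes (no right child) is an assumption.

```javascript
// Hand-written sketch of a reader for the bracketed format. This is NOT the
// PEG.js grammar from src/ccg.pegjs; it only mirrors the documented object shape.
function parseDerivation(input) {
  let pos = 0;

  const skipSpaces = () => {
    while (pos < input.length && /\s/.test(input[pos])) pos += 1;
  };

  const expect = (ch) => {
    skipSpaces();
    if (input[pos] !== ch) {
      throw new SyntaxError(`Expected "${ch}" at position ${pos}`);
    }
    pos += 1;
  };

  const parseNode = () => {
    expect('(');
    expect('<');
    const end = input.indexOf('>', pos);
    if (end === -1) throw new SyntaxError('Unterminated node label');
    const fields = input.slice(pos, end).trim().split(/\s+/);
    pos = end + 1;

    let current;
    if (fields[0] === 'T' && fields.length === 4) {
      const [type, ccgCat, head, dtrs] = fields;
      current = { node: { type, ccgCat, head: Number(head), dtrs: Number(dtrs) } };
      current.left = parseNode();
      // Assumption: a node with a single daughter has no right child.
      if (current.node.dtrs === 2) current.right = parseNode();
    } else if (fields[0] === 'L' && fields.length === 6) {
      const [type, ccgCat, modPOSTag, origPOSTag, word, predArgCat] = fields;
      current = { node: { type, ccgCat, modPOSTag, origPOSTag, word, predArgCat } };
    } else {
      throw new SyntaxError(`Unrecognized node label: <${fields.join(' ')}>`);
    }

    expect(')');
    return current;
  };

  const root = parseNode();
  skipSpaces();
  if (pos !== input.length) throw new SyntaxError('Unexpected trailing input');
  return root;
}

const parsedExample = parseDerivation(
  '(<T S 0 2> (<L S/NP PSP PSP Hi S/NP>) (<L NP NNP NNP Wisnu NP>))'
);
console.log(parsedExample.left.node.word, parsedExample.right.node.word); // prints "Hi Wisnu"
```

A generated PEG.js parser is usually the better choice in practice, since the grammar stays declarative and error reporting comes for free; the sketch only illustrates the structure of the format.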
Construct a JavaScript tree object from the machine-readable CCG
derivation parsed via CCG.Reader. It also builds useful metadata for later
use.
Usage:
const str = '(<T S 0 2> (<L S/NP PSP PSP Hi S/NP>) (<L NP NNP NNP Wisnu NP>))';
const tree = new CCG.Tree(str);
console.log(tree);
The returned object looks like this:
Tree {
  metadata: {
    isParsed: true,
    sentence: 'Hi Wisnu',
    words: [ 'Hi', 'Wisnu' ],
    ccgCats: [ 'S/NP', 'NP' ],
    height: 2,
    nodes: [ [Object], [Object], [Object] ]
  },
  mappedIndexedWords: { '0': { value: [Object] }, '1': { value: [Object] } },
  root: {
    value: { type: 'T', ccgCat: 'S', head: 0, dtrs: 2 },
    left: { value: [Object] },
    right: { value: [Object] }
  }
}
For more information about the omitted [Object] values,
see CCG.TreeTypes.Metadata, CCG.TreeTypes.IndexedWordMapper, and
CCG.TreeTypes.Node.
We can also turn the tree back into a machine-readable CCG derivation by calling
tree.toString(). The returned string will be:
(<T S 0 2> (<L S/NP PSP PSP Hi S/NP>) (<L NP NNP NNP Wisnu NP>))
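The round trip can be pictured with a short recursive serializer. This is a sketch under stated assumptions and not the library's toString implementation: serializeNode is a hypothetical helper, and it assumes nodes shaped like { value, left, right } as in the root shown earlier, with leaf values carrying the same fields as the CCG.Reader output.

```javascript
// Hypothetical serializer (not the library's toString implementation).
// Assumes nodes shaped like { value, left, right }, where leaf values carry
// the same fields as the Reader output.
function serializeNode(treeNode) {
  const v = treeNode.value;
  if (v.type === 'L') {
    return `(<L ${v.ccgCat} ${v.modPOSTag} ${v.origPOSTag} ${v.word} ${v.predArgCat}>)`;
  }
  // Skip a missing child so single-daughter nodes serialize cleanly.
  const children = [treeNode.left, treeNode.right]
    .filter(Boolean)
    .map((child) => serializeNode(child));
  return `(<T ${v.ccgCat} ${v.head} ${v.dtrs}> ${children.join(' ')})`;
}

// Sample tree mirroring the root of the example above.
const sampleRoot = {
  value: { type: 'T', ccgCat: 'S', head: 0, dtrs: 2 },
  left: {
    value: {
      type: 'L', ccgCat: 'S/NP', modPOSTag: 'PSP', origPOSTag: 'PSP',
      word: 'Hi', predArgCat: 'S/NP',
    },
  },
  right: {
    value: {
      type: 'L', ccgCat: 'NP', modPOSTag: 'NNP', origPOSTag: 'NNP',
      word: 'Wisnu', predArgCat: 'NP',
    },
  },
};

console.log(serializeNode(sampleRoot));
// prints "(<T S 0 2> (<L S/NP PSP PSP Hi S/NP>) (<L NP NNP NNP Wisnu NP>))"
```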
It is possible to get the structured CCG derivation based on
CCG.TreeTypes.Node simply by calling tree.buildDerivations().
The returned Array<Array<CCG.TreeTypes.Derivation>> will be:
[
  [
    { from: 0, to: 0, ccgCat: 'S/NP' },
    { from: 1, to: 1, ccgCat: 'NP' }
  ],
  [ { from: 0, to: 1, ccgCat: 'S', opr: '>' } ]
]
How to read it?
In CCG.TreeTypes.Metadata, we can find the words key. In this example,
it is ['Hi', 'Wisnu'], meaning that the word Hi is at index 0 and the
word Wisnu is at index 1. The from and to fields of each derivation refer to
these indices, so we may read it as:
  Hi     Wisnu
------ ---------
 S/NP     NP
--------------->
       S
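The same reading can be done programmatically. The sketch below is not part of CCGjs: describeDerivations is a hypothetical helper, and it assumes that from and to are inclusive word indices, as the example above suggests.

```javascript
// Hypothetical helper (not part of CCGjs): describe each derivation step as
// "<covered words> => <category>", with the combinator in parentheses if any.
// Assumes `from` and `to` are inclusive indices into the words array.
function describeDerivations(words, derivations) {
  return derivations.flat().map(({ from, to, ccgCat, opr }) => {
    const span = words.slice(from, to + 1).join(' ');
    return opr ? `${span} => ${ccgCat} (${opr})` : `${span} => ${ccgCat}`;
  });
}

const sampleWords = ['Hi', 'Wisnu'];
const sampleDerivations = [
  [
    { from: 0, to: 0, ccgCat: 'S/NP' },
    { from: 1, to: 1, ccgCat: 'NP' },
  ],
  [{ from: 0, to: 1, ccgCat: 'S', opr: '>' }],
];

describeDerivations(sampleWords, sampleDerivations).forEach((line) => console.log(line));
// prints:
// Hi => S/NP
// Wisnu => NP
// Hi Wisnu => S (>)
```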
Render and manipulate CCG.Tree as a DOM (Document Object Model) directly in
the browser. Currently, there is only one method available.
Usage:
const str = [
  '(<T Sf 1 2>',
  '(<T NP 0 2>',
  '(<L NP NNP NNP raam NP>)',
  '(<L NP\\NP PSP PSP ne NP\\NP>))',
  '(<T Sf\\NP 1 2>',
  '(<T NP 0 2>',
  '(<L NP NNP NNP mohan NP>)',
  '(<L NP\\NP PSP PSP ko NP\\NP>))',
  '(<T (Sf\\NP)\\NP 1 2>',
  '(<T NP 1 2>',
  '(<L NP/NP JJ JJ niilii NP/NP>)',
  '(<L NP NN NN kitaab NP>))',
  '(<L ((Sf\\NP)\\NP)\\NP VM VM dii ((Sf\\NP)\\NP)\\NP>))))',
].join(' ');
const dom = new CCG.DOM(str);
const table = dom.createTable();
// apply it directly
document.body.appendChild(table);
// or take the HTML string
console.log(table.outerHTML);
What does it look like? Take a look at this JS Bin!
There is still a lot to do. The goal of this project is to enable interactive manipulation of CCG derivations directly in the web browser. By manipulation we mean, for example, the ability to interactively create, edit, and delete a CCG node in its tree.
Please refrain from contributing until this project is officially
released. We will add CONTRIBUTING.md once we are ready.
Both issues and pull requests will be ignored.
This JavaScript library is part of my undergraduate thesis. Hence, I would like to thank my supervisors (@aromadhony and @saidalfaraby) for their advice and guidance.
Hockenmaier, J., & Steedman, M. (2007). CCGbank: A corpus of CCG derivations and dependency structures extracted from the Penn Treebank. Computational Linguistics, 33(3), 355–396.
Ambati, B. R., Deoskar, T., & Steedman, M. (2018). Hindi CCGbank: A CCG treebank from the Hindi dependency treebank. Language Resources and Evaluation, 52, 67–100.
Licensed under the MIT License.