/brat-standoff-to-json

Converts brat standoff format to JSONL format

Primary language: Go. License: MIT.

Welcome to the brat standoff to JSON converter

The brat standoff to JSON converter is a CLI tool that converts brat standoff annotations to JSON (JSONL) format.
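
To make the conversion concrete, here is a minimal Go sketch that parses one brat text-bound annotation line (ID, a tab, type with start and end offsets, a tab, then the covered text) and serializes it as a single JSON line. The `Entity` type, its field names, and `parseTextBound` are hypothetical and chosen for illustration; they are not the converter's actual code or the Acharya schema, and discontinuous spans are not handled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// Entity is a hypothetical output record. The field names are illustrative
// and are NOT the converter's actual (Acharya) schema.
type Entity struct {
	ID    string `json:"id"`
	Type  string `json:"type"`
	Start int    `json:"start"`
	End   int    `json:"end"`
	Text  string `json:"text"`
}

// parseTextBound parses one brat text-bound annotation line of the form
// "T1<TAB>Organization 0 4<TAB>Sony". Discontinuous spans such as
// "0 5;16 23" are rejected in this sketch.
func parseTextBound(line string) (Entity, error) {
	parts := strings.SplitN(line, "\t", 3)
	if len(parts) != 3 || !strings.HasPrefix(parts[0], "T") {
		return Entity{}, fmt.Errorf("not a text-bound annotation: %q", line)
	}
	fields := strings.Fields(parts[1])
	if len(fields) != 3 {
		return Entity{}, fmt.Errorf("unsupported span in %q", line)
	}
	start, err := strconv.Atoi(fields[1])
	if err != nil {
		return Entity{}, fmt.Errorf("bad start offset: %w", err)
	}
	end, err := strconv.Atoi(fields[2])
	if err != nil {
		return Entity{}, fmt.Errorf("bad end offset: %w", err)
	}
	return Entity{ID: parts[0], Type: fields[0], Start: start, End: end, Text: parts[2]}, nil
}

func main() {
	e, err := parseTextBound("T1\tOrganization 0 4\tSony")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out, err := json.Marshal(e) // one JSON object per line yields JSONL
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Writing one such object per line is what distinguishes JSONL from a single JSON array.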

Using brat Standoff Converter

 git clone https://github.com/astutic/brat-standoff-to-json.git

OR

Download a release from the repository's releases page

and rename the file to brat-standoff-to-json

Examples

Convert the brat files in the specified directory and print the result as JSONL (in Acharya format)

go run main.go -p "./path/to/the/collection"

OR

brat-standoff-to-json -p "./path/to/the/collection"
example
go run main.go -p "./testData/news"

OR

brat-standoff-to-json -p "./testData/news"

Save to an output file

go run main.go -p "./path/to/the/collection" --output "path/output-file-name"

OR

brat-standoff-to-json -p "./path/to/the/collection" --output "path/output-file-name"
example

The command below will generate an output file named acharyaFormat.jsonl in the current directory

go run main.go -p "./testData/news" --output "./acharyaFormat.jsonl"

OR

brat-standoff-to-json -p "./testData/news" --output "./acharyaFormat.jsonl"

Converting specific files

! NOTE: the .ann files and the .txt files must be listed in the same order
go run main.go --ann "file1.ann,file2.ann" --txt "file1.txt,file2.txt" --conf "file.conf"

example
go run main.go --ann "path/to/first.ann,path/to/second.ann" --txt "path/to/first.txt,path/to/second.txt" --conf "path/to/annotation.conf"

OR

brat-standoff-to-json --ann "path/to/first.ann,path/to/second.ann" --txt "path/to/first.txt,path/to/second.txt" --conf "path/to/annotation.conf"
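
The ordering requirement follows from matching the two comma-separated lists by position: the first .ann file goes with the first .txt file, and so on. A minimal Go sketch of that pairing, assuming positional matching and using the hypothetical helpers `splitList` and `pairByPosition`, could look like this:

```go
package main

import (
	"fmt"
	"strings"
)

// splitList splits a comma-separated list, trims spaces, and drops
// empty entries.
func splitList(s string) []string {
	var out []string
	for _, item := range strings.Split(s, ",") {
		if item = strings.TrimSpace(item); item != "" {
			out = append(out, item)
		}
	}
	return out
}

// pairByPosition matches the i-th annotation file with the i-th text
// file. It returns an error when the two lists differ in length.
func pairByPosition(annList, txtList string) ([][2]string, error) {
	anns, txts := splitList(annList), splitList(txtList)
	if len(anns) != len(txts) {
		return nil, fmt.Errorf("got %d .ann files but %d .txt files", len(anns), len(txts))
	}
	pairs := make([][2]string, 0, len(anns))
	for i := range anns {
		pairs = append(pairs, [2]string{anns[i], txts[i]})
	}
	return pairs, nil
}

func main() {
	pairs, err := pairByPosition("path/to/first.ann,path/to/second.ann",
		"path/to/first.txt,path/to/second.txt")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range pairs {
		fmt.Println(p[0], "<->", p[1])
	}
}
```

With this scheme, swapping two entries in only one of the lists would silently attach annotations to the wrong text, which is why the order matters.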

Commands

| Command | Shorthand | Type | Description | Default value |
| --- | --- | --- | --- | --- |
| folderPath | p | string | Path to the folder containing the brat standoff collection | |
| ann | a | string | Comma-separated locations of the annotation files (.ann), in the correct order | |
| txt | t | string | Comma-separated locations of the text files (.txt), in the correct order | |
| conf | c | string | Location of the annotation configuration file (annotation.conf) | |
| output | o | string | Name of the output file to be generated | |
| force | f | bool | Set to true to overwrite an existing output file | false |
| version | v | bool | Prints the version number | false |
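
For readers curious how options like these can be declared, here is a sketch using Go's standard `flag` package, registering each option under both its long name and its shorthand. This is an assumption for illustration only: the tool may use a different flag library, and the `options` struct and `parseArgs` function are hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options mirrors the table above (hypothetical struct).
type options struct {
	FolderPath, Ann, Txt, Conf, Output string
	Force, Version                     bool
}

// parseArgs registers every option under its long and short name and
// parses args. The standard flag package accepts both -name and --name.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("brat-standoff-to-json", flag.ContinueOnError)
	str := func(p *string, long, short, usage string) {
		fs.StringVar(p, long, "", usage)
		fs.StringVar(p, short, "", usage+" (shorthand)")
	}
	boolean := func(p *bool, long, short, usage string) {
		fs.BoolVar(p, long, false, usage)
		fs.BoolVar(p, short, false, usage+" (shorthand)")
	}
	str(&o.FolderPath, "folderPath", "p", "path to the brat standoff collection")
	str(&o.Ann, "ann", "a", "comma-separated .ann files")
	str(&o.Txt, "txt", "t", "comma-separated .txt files")
	str(&o.Conf, "conf", "c", "location of annotation.conf")
	str(&o.Output, "output", "o", "name of the output file")
	boolean(&o.Force, "force", "f", "overwrite an existing output file")
	boolean(&o.Version, "version", "v", "print the version number")
	err := fs.Parse(args)
	return o, err
}

func main() {
	opts, err := parseArgs(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", opts)
}
```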

Original data displayed in brat

[Image: original data displayed in brat]

Data from brat converted and uploaded to Acharya

[Image: brat data displayed in Acharya]

Note

[Windows PowerShell] If you use the Brat → JSONL converter and the brat standoff data contains non-English characters, it is advised to set the following in PowerShell first:

$OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = New-Object System.Text.UTF8Encoding

Features that are currently unsupported: