georgid/AlignmentDuration

Concatenate TextGrid data.


Automatically concatenate the TextGrid annotation files of the segmented files
into a single TextGrid annotation per recording:

  1. Install TextGridTools version 1.4.1 (either from GitHub or with pip install --upgrade tgt). If you use pip, check that version 1.4.1 is really the one installed; if you get an older version, try again.
  2. QUESTION:
    Each big audio file that the concatenated TextGrid corresponds to starts with n seconds of silence. I therefore need to insert n seconds of silence at the beginning and then concatenate the TextGrids so that the first one starts at timestamp = n.

To do this I tried the following code:

import os
import tgt
from tgt.util import shift_boundaries

# pathInput and pathOut are set elsewhere (not shown here)
shiftTime = 51.354230

tiers_ = []
os.chdir(pathInput)
tgtURI = '/Users/joro/Documents/Phd/UPF/ISTANBUL/goekhan/02_Kimseye_2_zemin.TextGrid'

tg = tgt.read_textgrid(tgtURI)

# shift the 'words' tier: left = shiftTime, right = 0
tier = tg.get_tier_by_name('words')
tierShifted = shift_boundaries(tier, shiftTime, 0)

tg.add_tiers(tierShifted)

tgOutURI = pathOut + 'Kimseye.TextGrig'
tgt.write_to_file(tg, tgOutURI)

However, I get this error:

in ()
21
22 tgOutURI = pathOut + 'Kimseye.TextGrig'
---> 23 tgt.write_to_file(tg, tgOutURI)

/usr/local/lib/python2.7/site-packages/tgt/io.pyc in write_to_file(textgrid, filename, format, encoding, **kwargs)
390 with codecs.open(filename, 'w', encoding) as f:
391 if format in _EXPORT_FORMATS:
--> 392 f.write(_EXPORT_FORMATS[format](textgrid, **kwargs))
393 else:
394 raise Exception('Unknown output format: {0}'.format(format))

/usr/local/lib/python2.7/site-packages/tgt/io.pyc in export_to_short_textgrid(textgrid)
241 textgrid_corrected = correct_start_end_times_and_fill_gaps(textgrid)
242 for tier in textgrid_corrected:
--> 243 result += ['"' + tier.tier_type() + '"',
244 '"' + escape_text(tier.name) + '"',
245 tier.start_time, tier.end_time, len(tier)]

AttributeError: 'Interval' object has no attribute 'tier_type'

RESPONSE:

The problem you encountered is easy to solve. To add the shifted tier, you did the following:

tg.add_tiers(tierShifted)

This method, however, expects a list of tiers, not a single tier. Use one of the following instead:

tg.add_tier(tierShifted)

or

tg.add_tiers([tierShifted])
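
The AttributeError is consistent with add_tiers looping over whatever it is given: handed a single tier, such a loop would visit the tier's intervals and store each of them as if it were a tier. The sketch below reproduces only that pattern. ToyTier and ToyTextGrid are hypothetical stand-ins written for this illustration, not tgt's actual classes, and the intervals are made-up tuples.

```python
# Toy stand-ins (hypothetical, NOT tgt's implementation) showing the pitfall.

class ToyTier(object):
    def __init__(self, name, intervals):
        self.name = name
        self.intervals = list(intervals)

    def __iter__(self):
        # Iterating over a tier yields its intervals.
        return iter(self.intervals)


class ToyTextGrid(object):
    def __init__(self):
        self.tiers = []

    def add_tier(self, tier):
        # Stores exactly the object it is given.
        self.tiers.append(tier)

    def add_tiers(self, tiers):
        # Expects an iterable of tiers and adds each element in turn.
        for tier in tiers:
            self.add_tier(tier)


words = ToyTier('words', [(0.0, 1.0, 'a'), (1.0, 2.0, 'b')])

wrong = ToyTextGrid()
wrong.add_tiers(words)      # single tier: its intervals end up in .tiers

right = ToyTextGrid()
right.add_tiers([words])    # one-element list: the tier itself is added

print(wrong.tiers)
print(right.tiers == [words])
```

In the toy version the mistake goes unnoticed until something later treats the stored objects as tiers, which matches the error surfacing only in write_to_file.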

Concatenate the TextGrids one after the other in chronological order. First make sure they all have the same number of tiers:

python ~/Downloads/TextGridTools-master/scripts/tgt-concatenate-textgrids.py -i 727cff89-392f-4d15-926d-63b2697d7f3f.TextGrid 727cff89-392f-4d15-926d-63b2697d7f3f_71.389446_82.591367.TextGrid -o 727cff89-392f-4d15-926d-63b2697d7f3f.TextGrid
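
The chronological order can be read off the segment filenames if, as the names in this thread suggest, they follow the pattern <recording-id>_<start>_<end>.TextGrid. That pattern is an assumption based on the examples here, and segment_start is a hypothetical helper, not part of tgt. A minimal sketch:

```python
import re

# Assumed naming pattern: <recording-id>_<start>_<end>.TextGrid
SEGMENT_RE = re.compile(r'_(\d+(?:\.\d+)?)_(\d+(?:\.\d+)?)\.TextGrid$')


def segment_start(filename):
    """Return the start time (in seconds) encoded in a segment filename."""
    match = SEGMENT_RE.search(filename)
    if match is None:
        raise ValueError('no start/end times in filename: ' + filename)
    return float(match.group(1))


segments = [
    '727cff89-392f-4d15-926d-63b2697d7f3f_71.389446_82.591367.TextGrid',
    '727cff89-392f-4d15-926d-63b2697d7f3f_20.867514_31.168639.TextGrid',
]

# Sort by the numeric start time; a plain alphabetical sort would
# misplace names such as '_100.0_...' before '_20.8_...'.
ordered = sorted(segments, key=segment_start)

for name in ordered:
    print(name)
```

The sorted names can then be passed to the concatenation script in that order.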

Script to shift the first tier:

python ~/Downloads/TextGridTools-master/scripts/tgt-shift-boundaries.py 20.867514 727cff89-392f-4d15-926d-63b2697d7f3f/727cff89-392f-4d15-926d-63b2697d7f3f_20.867514_31.168639.TextGrid

import os
import tgt
from tgt.util import shift_boundaries

shiftTime = 20.867514

tiers_ = []
tgtURI = '/home/georgid/Documents/makam_acapella/727cff89-392f-4d15-926d-63b2697d7f3f/727cff89-392f-4d15-926d-63b2697d7f3f.TextGrid'

tg = tgt.read_textgrid(tgtURI)

tier = tg.get_tier_by_name("phonemes")

##### this part does not work
# tierShifted = shift_boundaries(tier, shiftTime, 0)
# a = tier.intervals
# for anno in a[:25]:
#     anno.end_time += 11
#     anno.start_time += 11
#     tier.add_interval(anno)

####### makes sense only if the tier consists of one section to repeat
tier2 = tgt.util.concatenate_tiers(tier, tier, shiftTime)
print(tier2)

tg.add_tier(tier2)
# tg.delete_tier(tier)

# note: pathOut is the same file as tgtURI, so writing would overwrite the input
pathOut = '/home/georgid/Documents/makam_acapella/727cff89-392f-4d15-926d-63b2697d7f3f/727cff89-392f-4d15-926d-63b2697d7f3f.TextGrid'
# tgt.write_to_file(tg, pathOut)
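
The offset arithmetic from the question (n seconds of leading silence, then every boundary moved by n) can be sketched without tgt. The sketch below works on plain (start, end, text) tuples rather than tgt objects; shift_intervals is a hypothetical helper and the sample labels are made up.

```python
def shift_intervals(intervals, offset):
    """Return the intervals shifted by `offset` seconds, preceded by an
    empty interval covering the leading silence from 0 to `offset`."""
    shifted = [(start + offset, end + offset, text)
               for (start, end, text) in intervals]
    if offset > 0:
        # The empty label stands for silence.
        shifted.insert(0, (0.0, offset, ''))
    return shifted


# Made-up annotations for one segment, timed relative to the segment start.
segment = [(0.0, 1.5, 'kim'), (1.5, 4.0, 'seye')]

result = shift_intervals(segment, 20.867514)

for start, end, text in result:
    print('%.6f %.6f %r' % (start, end, text))
```

After the shift the first annotated interval starts at timestamp = n, which is the condition the question asks for.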

Added the whole TextGrids to a new repository: https://github.com/georgid/makam_acapella