lquirosd/P2PaLA

File does not contain regions.

lordzuko opened this issue · 2 comments

@lquirosd I am trying to train a P2PaLA model on both baselines and regions.

python P2PaLA.py --exp_name test_run --gpu -1 --seed 1 --use_global_log ./tensorboard_output --log_comment "_foo" --no-pin_memory --out_mode LR --use_gan --tr_data ./work/data/train2 --do_val --val_data ./work/data/val2

The following is one of the XML files:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15 http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15/pagecontent.xsd">
    <Metadata>
        <Creator>TRP</Creator>
        <Created>2014-09-25T12:35:47.179+02:00</Created>
        <LastChange>2018-06-11T15:01:18.835+02:00</LastChange>
    </Metadata>
    <Page imageFilename="Seite0357.JPG" imageWidth="2338" imageHeight="3511">
        <ReadingOrder>
            <OrderedGroup id="ro_1528722078914" caption="Regions reading order">
                <RegionRefIndexed index="0" regionRef="region_1f24d622-a557-48b5-aa3f-e1ae5841bb7a"/>
                <RegionRefIndexed index="1" regionRef="region_01df7a4c-4993-4ec1-8bb8-f75fa854498d"/>
            </OrderedGroup>
        </ReadingOrder>
        <TextRegion  id="region_1f24d622-a557-48b5-aa3f-e1ae5841bb7a" custom="readingOrder {index:0;} structure {type:page-number;}">
            <Coords points="1664,208 1925,208 1925,325 1664,325"/>
            <TextLine id="line_33d15b73-b080-4fde-918a-d8896ac0699e" custom="readingOrder {index:0;} structure {type:page-number;}">
                <Coords points="1900,227 1683,235 1683,305 1900,297"/>
                <Baseline points="1683,298 1900,290"/>
                <TextEquiv>
                    <Unicode>178</Unicode>
                </TextEquiv>
            </TextLine>
            <TextEquiv>
                <Unicode>178</Unicode>
            </TextEquiv>
        </TextRegion>
        <TextRegion  id="region_01df7a4c-4993-4ec1-8bb8-f75fa854498d" custom="readingOrder {index:1;} structure {type:paragraph;}">
            <Coords points="603,469 1929,469 1974,1777 1946,2148 2069,2361 1929,2704 603,2704"/>
            <TextLine id="line_b9287b47-860e-4ca6-8847-74b2c67b777b" custom="readingOrder {index:0;} structure {type:paragraph;}">
                <Coords points="1869,512 1444,497 696,481 696,604 1444,620 1869,635"/>
                <Baseline points="696,579 1444,595 1869,610"/>
                <TextEquiv>
                    <Unicode>die stetigs Ire Pixen, Wöhrn</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_6e8fcbcc-0f44-4c32-ab81-458cf088fcd1" custom="readingOrder {index:1;} structure {type:paragraph;}">
                <Coords points="1815,641 1340,637 707,617 707,721 1340,741 1815,745"/>
                <Baseline points="707,687 1340,707 1815,711"/>
                <TextEquiv>
                    <Unicode>vnd waffen haimblich vnd</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_2cabe657-3c84-449a-911d-c280708a261e" custom="readingOrder {index:2;} structure {type:paragraph;}">
                <Coords points="1915,753 1240,719 696,715 696,835 1240,839 1915,873"/>
                <Baseline points="696,811 1240,815 1915,849"/>
                <TextEquiv>
                    <Unicode>offenlich herūmb tragen. welhes</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_7cd460dc-f18c-4b77-9f1b-ac6ccda89a49" custom="readingOrder {index:3;} structure {type:paragraph;}">
                <Coords points="1749,879 1136,840 704,848 704,941 1136,933 1749,972"/>
                <Baseline points="704,915 1136,907 1749,946"/>
                <TextEquiv>
                    <Unicode>sonderlich Marckhts Zeiten</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_918df727-5423-4df6-8a01-50223f8efc99" custom="readingOrder {index:4;} structure {type:paragraph;} abbrev {offset:17; length:7;expansion:khomenden;}">
                <Coords points="1838,978 1232,955 704,947 704,1045 1232,1053 1838,1076"/>
                <Baseline points="704,1019 1232,1027 1838,1050"/>
                <TextEquiv>
                    <Unicode>Alda Jedem Alher khomend</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_3ef11782-85a8-4081-935f-c6c02041dd34" custom="readingOrder {index:5;} structure {type:paragraph;}">
                <Coords points="1846,1082 1244,1059 680,1051 680,1149 1244,1157 1846,1180"/>
                <Baseline points="680,1123 1244,1131 1846,1154"/>
                <TextEquiv>
                    <Unicode>Kaūff: vnd handlsman. vnd</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_c12e52b7-369e-4977-876e-94cc446250d4" custom="readingOrder {index:6;} structure {type:paragraph;}">
                <Coords points="1884,1186 1205,1171 707,1155 707,1255 1205,1271 1884,1286"/>
                <Baseline points="707,1227 1205,1243 1884,1258"/>
                <TextEquiv>
                    <Unicode>Anndern frembden Personen</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_f30b1253-80ae-4f9d-9eb9-26e7d5c89594" custom="readingOrder {index:7;} structure {type:paragraph;}">
                <Coords points="1842,1293 1259,1258 700,1262 700,1374 1259,1370 1842,1405"/>
                <Baseline points="700,1343 1259,1339 1842,1374"/>
                <TextEquiv>
                    <Unicode>wer die seind. vermig der</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_1333a76b-c4c7-41f0-bdb4-12e757f041b7" custom="readingOrder {index:8;} structure {type:paragraph;}">
                <Coords points="1900,1413 1614,1379 1078,1359 719,1359 719,1471 1078,1471 1614,1491 1900,1525"/>
                <Baseline points="719,1447 1078,1447 1614,1467 1900,1501"/>
                <TextEquiv>
                    <Unicode>offenlichen Marckhtsberūeff¬</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_b2fef3bc-ffdb-4453-a8c3-c6058529f898" custom="readingOrder {index:9;} structure {type:paragraph;}">
                <Coords points="1880,1531 1271,1488 711,1484 711,1573 1271,1577 1880,1620"/>
                <Baseline points="711,1551 1271,1555 1880,1598"/>
                <TextEquiv>
                    <Unicode>ūng. vnd sonder barer vor</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_97beca06-11dd-4bb8-8246-e9f9da88189e" custom="readingOrder {index:10;} structure {type:paragraph;} abbrev {offset:18; length:4;expansion:Fürstlichen;}">
                <Coords points="1768,1625 1236,1605 700,1594 700,1684 1236,1695 1768,1715"/>
                <Baseline points="700,1656 1236,1667 1768,1687"/>
                <TextEquiv>
                    <Unicode>disem Aūsganngner Frl:</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_3d9dc680-7ce9-4fdb-ada2-ec2a1dab6bf9" custom="readingOrder {index:11;} structure {type:paragraph;}">
                <Coords points="1957,1722 1282,1688 711,1672 711,1780 1282,1796 1957,1830"/>
                <Baseline points="711,1753 1282,1769 1957,1803"/>
                <TextEquiv>
                    <Unicode>Mandaten. dergleichen Wöhrn</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_f4e9aa6b-4368-4c0c-9c12-e9b4e284dec6" custom="readingOrder {index:12;} structure {type:paragraph;}">
                <Coords points="1861,1836 1260,1798 718,1784 718,1885 1260,1899 1861,1937"/>
                <Baseline points="718,1861 1260,1875 1861,1913"/>
                <TextEquiv>
                    <Unicode>bei sich Zūtragen bei hechster</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_92eeee8c-7d84-4df6-841a-840605bf6bc2" custom="readingOrder {index:13;} structure {type:paragraph;}">
                <Coords points="1757,1942 1222,1920 711,1904 711,2002 1222,2018 1757,2040"/>
                <Baseline points="711,1971 1222,1987 1757,2009"/>
                <TextEquiv>
                    <Unicode>Straff verboten wirdet,</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_2ca8f6f9-066a-4dda-9269-ddfae4f7f755" custom="readingOrder {index:14;} structure {type:paragraph;}">
                <Coords points="1856,2047 1265,2011 722,1996 722,2104 1265,2119 1856,2155"/>
                <Baseline points="722,2084 1265,2099 1856,2135"/>
                <TextEquiv>
                    <Unicode>beforderist Alhie. Als ainem</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_e38dad75-3c18-455a-9c10-65851d705adf" custom="readingOrder {index:15;} structure {type:paragraph;}">
                <Coords points="1842,2160 1376,2149 714,2133 714,2221 1376,2237 1842,2248"/>
                <Baseline points="714,2190 1376,2206 1842,2217"/>
                <TextEquiv>
                    <Unicode>offnen vnūersperten Orth.</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_9d49bb7e-8712-414f-b529-62fdb799942a" custom="readingOrder {index:16;} structure {type:paragraph;} abbrev {offset:26; length:6;expansion:welschen;}">
                <Coords points="1954,2255 1740,2245 1360,2225 710,2209 710,2319 1360,2335 1740,2355 1954,2365"/>
                <Baseline points="710,2295 1360,2311 1740,2331 1954,2341"/>
                <TextEquiv>
                    <Unicode>Zwischen den Teitsch. vnd welsch</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_844ef0d9-ba9a-455b-aded-db437704708f" custom="readingOrder {index:17;} structure {type:paragraph;}">
                <Coords points="1878,2371 1583,2355 1192,2344 698,2326 698,2422 1192,2440 1583,2451 1878,2467"/>
                <Baseline points="698,2395 1192,2413 1583,2424 1878,2440"/>
                <TextEquiv>
                    <Unicode>Hanndlsleüten vnd Personen.</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_a294c1a1-2227-46da-8953-25e5a5fac926" custom="readingOrder {index:18;} structure {type:paragraph;}">
                <Coords points="1858,2474 1323,2445 718,2427 718,2527 1323,2545 1858,2574"/>
                <Baseline points="718,2504 1323,2522 1858,2551"/>
                <TextEquiv>
                    <Unicode>sonnderlich bei Nechtlichen</Unicode>
                </TextEquiv>
            </TextLine>
            <TextLine id="line_72ebf6ae-669b-4798-8c31-64ecca4cbd6a" custom="readingOrder {index:19;} structure {type:paragraph;}">
                <Coords points="1838,2580 1174,2560 718,2538 718,2618 1174,2640 1838,2660"/>
                <Baseline points="718,2603 1174,2625 1838,2645"/>
                <TextEquiv>
                    <Unicode>Zeiten, Zū besorgen. Ain böse </Unicode>
                </TextEquiv>
            </TextLine>
            <TextEquiv>
                <Unicode>die stetigs Ire Pixen, Wöhrn
vnd waffen haimblich vnd
offenlich herūmb tragen. welhes
sonderlich Marckhts Zeiten
Alda Jedem Alher khomend
Kaūff: vnd handlsman. vnd
Anndern frembden Personen
wer die seind. vermig der
offenlichen Marckhtsberūeff¬
ūng. vnd sonder barer vor
disem Aūsganngner Frl:
Mandaten. dergleichen Wöhrn
bei sich Zūtragen bei hechster
Straff verboten wirdet,
beforderist Alhie. Als ainem
offnen vnūersperten Orth.
Zwischen den Teitsch. vnd welsch
Hanndlsleüten vnd Personen.
sonnderlich bei Nechtlichen
Zeiten, Zū besorgen. Ain böse </Unicode>
            </TextEquiv>
        </TextRegion>
    </Page>
</PcGts>

I am getting the following in the logs:

File Seite0395 do not contains regions
Element type "page-number"undefined on color dic, set to default=175
Element type "page-number"undefined on color dic, set to default=175
Element type "paragraph"undefined on color dic, set to default=175
Element type "paragraph"undefined on color dic, set to default=175
File Seite0357 do not contains regions
Element type "page-number"undefined on color dic, set to default=175
Element type "page-number"undefined on color dic, set to default=175
Element type "paragraph"undefined on color dic, set to default=175
Element type "paragraph"undefined on color dic, set to default=175
Element type "marginalia"undefined on color dic, set to default=175
Element type "marginalia"undefined on color dic, set to default=175
File Seite0388 do not contains regions

From the XML file I can see that the regions are defined in the file. What am I doing wrong?

Hi,
In this case, the file contains none of the regions defined by the --regions argument. Please define them, e.g.:
P2PaLA.py [some options] --regions page-number paragraph marginalia [other options]
See the help for more details.
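If it is unclear which names to pass, they can be collected from the ground-truth files themselves. The following is only a rough sketch, not part of P2PaLA: it assumes the names expected by --regions are the `structure {type:...;}` values in the `custom` attribute of each TextRegion, as the log messages above suggest. The helper `region_types` and the trimmed `SAMPLE` document are hypothetical and only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Matches the "structure {type:NAME;}" entry inside a `custom` attribute.
STRUCTURE_TYPE = re.compile(r"structure\s*\{[^}]*?type:\s*([^;}\s]+)")


def region_types(page_xml):
    """Return the sorted, unique structure types found on TextRegion elements."""
    root = ET.fromstring(page_xml)
    found = set()
    for elem in root.iter():
        # Drop the XML namespace so the check does not depend on the PAGE schema version.
        if elem.tag.rsplit("}", 1)[-1] != "TextRegion":
            continue
        match = STRUCTURE_TYPE.search(elem.get("custom", ""))
        if match:
            found.add(match.group(1))
    return sorted(found)


# Hypothetical, heavily trimmed version of the PAGE file from this issue.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="Seite0357.JPG" imageWidth="2338" imageHeight="3511">
    <TextRegion id="r1" custom="readingOrder {index:0;} structure {type:page-number;}"/>
    <TextRegion id="r2" custom="readingOrder {index:1;} structure {type:paragraph;}"/>
  </Page>
</PcGts>"""

# Prints the argument fragment to append to the training command.
print("--regions", " ".join(region_types(SAMPLE)))
```

For real files, the same function could be applied to the text of each XML file in the training and validation folders, merging the results before building the --regions list.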

This worked! Closing the issue.