facebook/duckling

Misrecognitions on two hourly intervals in an input [FR]


Offending input: 08h30-12h00 et 14h00-16h30
Command:

curl -XPOST http://0.0.0.0:8000/parse --data 'locale=fr_CH&text=" 08h30-12h00 et 14h00-16h30"'
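
For anyone reproducing this without curl, here is a minimal Python sketch of the same request. It assumes a Duckling server is listening on the URL from the command above; the names `TEXT`, `build_payload` and `parse` are made up for this example. Note that the double quotes and the leading space inside `text=` are part of the text that gets parsed, not shell quoting. curl sends them raw while `urlencode` percent-encodes them, but both should decode to the same string on the server.

```python
import json
import urllib.parse
import urllib.request

# The text exactly as the curl command sends it: the double quotes and the
# leading space are part of the payload.
TEXT = '" 08h30-12h00 et 14h00-16h30"'


def build_payload(text, locale="fr_CH"):
    """Form-encode the two fields used in the curl command."""
    return urllib.parse.urlencode({"locale": locale, "text": text}).encode("utf-8")


def parse(text, url="http://0.0.0.0:8000/parse"):
    """POST to a running Duckling server and return the decoded JSON list."""
    request = urllib.request.Request(url, data=build_payload(text), method="POST")
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `parse(TEXT)` needs the server to be running; `build_payload(TEXT)` can be inspected on its own.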

Problems with the current output:

  • the span 08h30-12h00 et 14h00-16h (the trailing 30 is cut off) is parsed as a single time, 00:14, not even 14:00
  • only one time range (14h00-16h30) out of two is recognized
  • as with many problematic inputs in French, the last group of digits is almost always recognized as seconds, even when the context makes it obvious that it is not (here the trailing 30 becomes a 30-second duration)
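
To make the expected reading concrete: assuming the intended result is two intervals, 08:30 to 12:00 and 14:00 to 16:30, the following Python sketch extracts them with a plain regular expression. This is only an illustration of the expected ranges, not how Duckling's rules work; `RANGE` and `hour_ranges` are hypothetical names.

```python
import re

# Hypothetical helper, not part of Duckling: matches "HHhMM-HHhMM".
RANGE = re.compile(r"(\d{1,2})h(\d{2})-(\d{1,2})h(\d{2})")


def hour_ranges(text):
    """Return (start, end) pairs as HH:MM strings for every range in text."""
    return [
        (f"{int(h1):02d}:{m1}", f"{int(h2):02d}:{m2}")
        for h1, m1, h2, m2 in RANGE.findall(text)
    ]


print(hour_ranges('" 08h30-12h00 et 14h00-16h30"'))
# [('08:30', '12:00'), ('14:00', '16:30')]
```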

Current output:

[
    {
        "body": "08h30-12h00 et 14h00-16h",
        "dim": "time",
        "end": 26,
        "latent": false,
        "start": 2,
        "value": {
            "grain": "minute",
            "type": "value",
            "value": "2021-09-07T00:14:00.000-07:00",
            "values": [
                {
                    "grain": "minute",
                    "type": "value",
                    "value": "2021-09-07T00:14:00.000-07:00"
                },
                {
                    "grain": "minute",
                    "type": "value",
                    "value": "2021-09-08T00:14:00.000-07:00"
                },
                {
                    "grain": "minute",
                    "type": "value",
                    "value": "2021-09-09T00:14:00.000-07:00"
                }
            ]
        }
    },
    {
        "body": "14h00-16h30",
        "dim": "time",
        "end": 28,
        "latent": false,
        "start": 17,
        "value": {
            "from": {
                "grain": "minute",
                "value": "2021-09-06T14:00:00.000-07:00"
            },
            "to": {
                "grain": "minute",
                "value": "2021-09-06T16:31:00.000-07:00"
            },
            "type": "interval",
            "values": [
                {
                    "from": {
                        "grain": "minute",
                        "value": "2021-09-06T14:00:00.000-07:00"
                    },
                    "to": {
                        "grain": "minute",
                        "value": "2021-09-06T16:31:00.000-07:00"
                    },
                    "type": "interval"
                },
                {
                    "from": {
                        "grain": "minute",
                        "value": "2021-09-07T14:00:00.000-07:00"
                    },
                    "to": {
                        "grain": "minute",
                        "value": "2021-09-07T16:31:00.000-07:00"
                    },
                    "type": "interval"
                },
                {
                    "from": {
                        "grain": "minute",
                        "value": "2021-09-08T14:00:00.000-07:00"
                    },
                    "to": {
                        "grain": "minute",
                        "value": "2021-09-08T16:31:00.000-07:00"
                    },
                    "type": "interval"
                }
            ]
        }
    },
    {
        "body": "30\"",
        "dim": "duration",
        "end": 29,
        "latent": false,
        "start": 26,
        "value": {
            "normalized": {
                "unit": "second",
                "value": 30
            },
            "second": 30,
            "type": "value",
            "unit": "second",
            "value": 30
        }
    }
]
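
The `start`/`end` offsets in the output can be checked against the text that was sent. The Python sketch below copies the three spans from the output and slices the payload with them (`TEXT`, `ENTITIES` and `spans_match` are names invented for this check). All three match, and the third span, `30"`, ends on the closing double quote of the payload. That suggests, though it does not prove, that the quote character is what gets read as a seconds mark.

```python
# The text as sent by the curl command, quotes and leading space included.
TEXT = '" 08h30-12h00 et 14h00-16h30"'

# (start, end, body) triples copied from the three entities in the output.
ENTITIES = [
    (2, 26, "08h30-12h00 et 14h00-16h"),
    (17, 28, "14h00-16h30"),
    (26, 29, '30"'),
]


def spans_match(text, entities):
    """True if slicing text with each (start, end) yields the reported body."""
    return all(text[start:end] == body for start, end, body in entities)


print(spans_match(TEXT, ENTITIES))  # True
```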