Text processing for chatbot: AI lab implementation
Team members: Pricop Ovidiu, Butnaru Adrian, Ciuc Tiberiu, Ciubotariu Alexandru, Dorneanu Cristian, Lucian Alexandru, Rusu Alexandru
Live API endpoint: http://ec2-52-213-135-23.eu-west-1.compute.amazonaws.com/api/v1
Live API Documentation: Wiki page
- The chatbot receives the user's raw input as a string
- The raw input is proofread (corrected), processed, and translated into an XML tree that is annotated with metadata
- The constructed XML tree is semantically interpreted and a response is composed
- The response is processed and sent back to the user as a string
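As a rough illustration of the "translated into an XML tree" step, the sketch below splits a raw string on whitespace and wraps each token in an element. The element names `sentence` and `word` and the `position` attribute are assumptions made for this example, not the project's actual format, and the proofreading step is left out.

```python
import xml.etree.ElementTree as ET


def build_tree(raw_input: str) -> ET.Element:
    """Turn a raw user string into a flat XML tree, one <word> element per token.

    Hypothetical layout: the tag and attribute names are illustrative only.
    """
    root = ET.Element("sentence")
    for position, token in enumerate(raw_input.split()):
        # Each token becomes a child element that later stages can annotate.
        word = ET.SubElement(root, "word", {"position": str(position)})
        word.text = token
    return root


tree = build_tree("Where is Paris")
print(ET.tostring(tree, encoding="unicode"))
```

Later stages could attach metadata to each `word` element as extra attributes or child elements.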
- Each word in the sentence is checked against a dictionary
- If a word is not found in the dictionary -> look it up in the knowledge base as a proper noun
- [Optional] Verify that the sentence contains at least one verb (otherwise it would not be a sentence)
- If proper noun -> annotate the word with its definition
- If special construct (calendar date, math expression, etc.) -> annotate it with its type -> requires predefined types
- If plain-text word -> annotate the word with its synonyms
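The rules above can be sketched as a single classification function. Everything named here is hypothetical: `DICTIONARY` and `KNOWLEDGE_BASE` are tiny in-memory stand-ins for the real dictionary and knowledge base, the two regular expressions are only examples of predefined special-construct types, and checking special constructs before the dictionary is an assumption about ordering.

```python
import re

# Hypothetical stand-ins for the real dictionary (word -> synonyms)
# and knowledge base (proper noun -> definition).
DICTIONARY = {"big": ["large", "huge"], "city": ["town", "metropolis"]}
KNOWLEDGE_BASE = {"Paris": "Capital city of France"}

# Example predefined types for special constructs (assumed formats).
DATE_PATTERN = re.compile(r"^\d{1,2}[./-]\d{1,2}[./-]\d{4}$")
MATH_PATTERN = re.compile(r"^\d+(?:[+\-*/]\d+)+$")


def annotate(word: str) -> dict:
    """Classify one word and attach the annotation its category calls for."""
    # Dates are tested first because "12-05-2017" would also match the math pattern.
    if DATE_PATTERN.match(word):
        return {"word": word, "kind": "special", "type": "date"}
    if MATH_PATTERN.match(word):
        return {"word": word, "kind": "special", "type": "math_expression"}
    # Plain-text word: found in the dictionary, annotated with synonyms.
    if word.lower() in DICTIONARY:
        return {"word": word, "kind": "plain", "synonyms": DICTIONARY[word.lower()]}
    # Not in the dictionary: fall back to the knowledge base as a proper noun.
    if word in KNOWLEDGE_BASE:
        return {"word": word, "kind": "proper_noun", "definition": KNOWLEDGE_BASE[word]}
    return {"word": word, "kind": "unknown"}


for token in "Paris is a big city".split():
    print(annotate(token))
```

Words that match none of the rules are tagged `unknown` here; how the real module treats them is not specified in this document.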
The following request-response schema is used for text annotation. During the text-processing phase, it lets us annotate proper nouns in the user input with up-to-date data crawled on demand from the World Wide Web.
- Request Schema
{
"title": "Get Proper Noun Definition Request",
"type": "object",
"required": [
"word"
],
"properties": {
"word": {
"type": "string"
}
}
}
- Response Schema
{
"title": "Get Proper Noun Definition Response",
"type": "object",
"required": [
"shortDefinition", "definitionSource", "error"
],
"properties": {
"shortDefinition": {
"description": "A short definition of the supplied proper noun",
"type": "string"
},
"definitionSource": {
"description": "The source of the definition; may be a link or a book with its author(s), etc.",
"type": "string"
},
"error": {
"type": "boolean"
},
"errorMessage": {
"type": "string"
},
"errorId": {
"type": "integer"
}
}
}
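A minimal sketch of how a client might build a request and check a response against these two schemas, using only the Python standard library. The helper names `build_request` and `validate_response` are hypothetical, and the check assumes the third required field is meant to be `error`, matching the property declared in the response schema.

```python
import json

# Field names and types taken from the response schema.
RESPONSE_REQUIRED = ("shortDefinition", "definitionSource", "error")
RESPONSE_TYPES = {
    "shortDefinition": str,
    "definitionSource": str,
    "error": bool,
    "errorMessage": str,
    "errorId": int,
}


def build_request(word: str) -> str:
    """Serialize a request body containing the single required 'word' field."""
    return json.dumps({"word": word})


def validate_response(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    for field in RESPONSE_REQUIRED:
        if field not in payload:
            problems.append(f"missing required field: {field}")
    for field, expected in RESPONSE_TYPES.items():
        if field not in payload:
            continue
        value = payload[field]
        # bool is a subclass of int in Python, so reject it for integer fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            problems.append(f"wrong type for field: {field}")
    return problems


print(build_request("Paris"))
print(validate_response({"shortDefinition": "Capital city of France",
                         "definitionSource": "example source",
                         "error": False}))
```

A dedicated JSON Schema validator would cover more of the specification; this hand-written check only looks at required fields and basic types.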
The AI module invokes the text-processing module with the user's input string and receives the annotated input in return.
[TODO] The AI module returns a string as the bot's response to the user's input. This string needs to be parsed, corrected (and annotated). [this section needs more information]