Rasa NLU response parsing error
Closed this issue · 8 comments
Hi
When trying to integrate RASA NLU, an exception is thrown when the NLU training set contains entities that are resolved.
(From the console):
jaicf.activator.rasa.api.RasaApi - Cannot parse RasaParseMessageRequest(text=....., messageId=30eed22e-e5e8-4960-a979-6485d5295f45)
kotlinx.serialization.MissingFieldException: Field 'confidence' is required, but it was missing
From RASA output (using rasa train shell), we see a structure like this:
{
  "entity": "frequency",
  "start": 9,
  "end": 18,
  "confidence_entity": 0.9578643441200256,
  "value": "7",
  "extractor": "DIETClassifier",
  "processors": [
    "EntitySynonymMapper"
  ]
}
But the mapping in RasaParseMessageResponse.kt contains:
@Serializable
data class Entity(
    val start: Int,
    val end: Int,
    val confidence: Float,
    val value: String,
    val entity: String
)
Maybe the confidence field should be renamed 'confidence_entity' to map the output correctly?
Hi @jpearll, thanks for reporting!
As I see from the RASA API spec and source code, this field is called 'confidence' and we shouldn't rename it. It seems like 'confidence_entity' is a custom field provided by DIETClassifier.
The real problem is that the 'confidence' field is marked as optional in the API spec but is required in our model classes. I think we can simply make it optional (nullable).
@morfeusys fyi
Hi all,
It seems that this is still an issue with both CRFEntityExtractor and DIETClassifier. After looking further into it, the only fix I could find is adding @SerialName("confidence_entity") above the confidence property (basically the same as what @jpearll suggested). But this fix would only work with extractors that use the "confidence_entity" attribute in their Entity JSON object.
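A rough sketch of the @SerialName approach, assuming kotlinx.serialization 1.x. The class name DietEntity and the helper parseDietEntity are hypothetical stand-ins, not the actual JAICF model, and ignoreUnknownKeys is assumed so that "extractor" and "processors" are skipped:

```kotlin
import kotlinx.serialization.SerialName
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

// Hypothetical stand-in for the Entity class in RasaParseMessageResponse.kt.
@Serializable
data class DietEntity(
    val start: Int,
    val end: Int,
    // Maps the JSON key "confidence_entity" onto the Kotlin property.
    @SerialName("confidence_entity")
    val confidence: Float,
    val value: String,
    val entity: String
)

// Assumption: keys not declared above ("extractor", "processors") are ignored.
val dietJson = Json { ignoreUnknownKeys = true }

fun parseDietEntity(raw: String): DietEntity =
    dietJson.decodeFromString(DietEntity.serializer(), raw)
```

As noted above, this only helps for extractors that emit "confidence_entity"; a payload carrying plain "confidence" would still fail with a MissingFieldException.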
I did find another alternative annotation in Kotlinx called @JsonNames() which seems to allow multiple names, so that we can add the two possible keys ("confidence" or "confidence_entity"). Unfortunately, this annotation isn't out yet, but once it is I think it'd be a suitable solution.
If we were to make it optional, we would need to add a default value, as a Float? without a default still shows as a required value: https://stackoverflow.com/questions/64796913/kotlinx-serialization-missingfieldexception. But annoyingly, as with the suggested fix above, this would only work with the couple of extractors that output their entity object with the "confidence" attribute.
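To illustrate the point about defaults, here is a minimal sketch, assuming kotlinx.serialization 1.x. OptionalConfidenceEntity and parseOptionalEntity are hypothetical names used only for this example:

```kotlin
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

// Hypothetical model: a nullable property only becomes optional once it has a default.
@Serializable
data class OptionalConfidenceEntity(
    val start: Int,
    val end: Int,
    val confidence: Float? = null, // without "= null" a missing key still throws
    val value: String,
    val entity: String
)

// Assumption: extra keys such as "confidence_entity" are ignored rather than rejected.
val optionalJson = Json { ignoreUnknownKeys = true }

fun parseOptionalEntity(raw: String): OptionalConfidenceEntity =
    optionalJson.decodeFromString(OptionalConfidenceEntity.serializer(), raw)
```

With this shape, a DIETClassifier-style payload parses without an exception, but its confidence comes back as null because the value lives under a different key.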
I could only really find a list of the proper outputs of the various extractors here as the API docs don't represent all of them: RasaHQ/rasa#6795
One possible (but very janky) fix would be to have two properties, "confidence" and "confidence_entity", which both have default values, and then have a "getConfidence" func like so: fun getConfidence(): Float = maxOf(confidence as Float, confidence_entity as Float). This way it would cover the two types of output we've encountered so far and would cover us until we find a better solution, such as the @JsonNames annotation above.
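A variation on the two-property idea, sketched with nullable defaults instead of maxOf so that a missing key stays distinguishable from a zero score. This assumes kotlinx.serialization 1.x; DualConfidenceEntity, effectiveConfidence and parseDualEntity are hypothetical names, not part of JAICF:

```kotlin
import kotlinx.serialization.SerialName
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

// Hypothetical model that accepts either key emitted by the extractors seen so far.
@Serializable
data class DualConfidenceEntity(
    val start: Int,
    val end: Int,
    val value: String,
    val entity: String,
    val confidence: Float? = null,
    @SerialName("confidence_entity")
    val confidenceEntity: Float? = null
) {
    // Prefer "confidence", fall back to "confidence_entity"; null if neither was present.
    val effectiveConfidence: Float?
        get() = confidence ?: confidenceEntity
}

// Assumption: undeclared keys such as "extractor" are ignored.
val dualJson = Json { ignoreUnknownKeys = true }

fun parseDualEntity(raw: String): DualConfidenceEntity =
    dualJson.decodeFromString(DualConfidenceEntity.serializer(), raw)
```

The computed property has no backing field, so it is not part of the serialized form; it only hides the two-key lookup from callers.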
Imo the output of the different extractors from Rasa should've been kept to the same schema but I guess there must have been other factors not making it possible.
Hi @CiaronHowell!
Many thanks for the research!
Based on what you've described, I guess we can quick-fix it by adding default null values to all nullable fields and adding a JsonObject field in order to store all unknown properties like that.
Will this be a suitable solution?
@CiaronHowell and also it'd be great if you can provide us a couple of examples of JSON responses from classifiers.
Unfortunately, we don't have enough time now to test the code with a real installation of Rasa.
Sorry for the delayed response but thank you for the quick responses from yourself and @Denire, much appreciated!
Hi @CiaronHowell! Many thanks for the research! Based on what you've described, I guess we can quick-fix it by adding default null values to all nullable fields and adding a JsonObject field in order to store all unknown properties like that. Will this be a suitable solution?
Think this is definitely a possible temp solution. My only concern is that people can use multiple different classifiers or extractors (from my knowledge) so they might want to cover both bases. I guess the user could either extend on the data class to create a method that checks both or create a separate func somewhere that does the same thing but using the RasaActivatorContext as the input parameter.
@CiaronHowell and also it'd be great if you can provide us a couple of examples of JSON responses from classifiers. Unfortunately, we don't have enough time now to test the code with a real installation of Rasa
Completely understandable, not a problem! Unfortunately, I only know of the output of the two classifiers I've mentioned + the list of outputs on the Rasa github issue above. Annoyingly, even in the Rasa docs, the output is shown as "confidence" rather than "confidence_entity" but from the looks of it these are the only two possibilities.
We've added default nulls to all nullable properties. Now the Rasa model classes should fully conform to the official API spec.
To deal with custom classifiers' fields we've added an additional property, rawResponse, to RasaActivatorContext in the JAICF 1.2.3 release.
So now you should be able to access all of the custom fields from the response through the RasaActivatorContext#rawResponse property, and Kotlin extension properties can help to build a fluent API, hiding all underlying JSON manipulations.
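As a sketch of what such an extension could look like: the helper below is written against a kotlinx.serialization JsonObject, on the assumption that rawResponse exposes the raw parse result in that form with a top-level "entities" array (check the actual type in your JAICF version). The name entityConfidences is hypothetical.

```kotlin
import kotlinx.serialization.json.JsonArray
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.contentOrNull
import kotlinx.serialization.json.floatOrNull
import kotlinx.serialization.json.jsonPrimitive

// Hypothetical helper: entity name -> whichever confidence key the extractor emitted.
// Assumes the raw response holds an "entities" array of objects, as in the Rasa output above.
fun JsonObject.entityConfidences(): Map<String, Float?> {
    val entities = this["entities"] as? JsonArray ?: return emptyMap()
    return entities
        .mapNotNull { it as? JsonObject } // skip anything that is not an entity object
        .associate { e ->
            val name = e["entity"]?.jsonPrimitive?.contentOrNull ?: "unknown"
            // Look for the DIETClassifier-style key first, then the key from the API spec.
            val score = (e["confidence_entity"] ?: e["confidence"])?.jsonPrimitive?.floatOrNull
            name to score
        }
}
```

If two entities share a name, the later one wins in this sketch; a real helper would probably key on position or return a list instead.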
You can find sample test here
So with such a fix I think using different classifiers won't be a big problem, will it?
That's fair enough, think that solves it until the @JsonNames annotation is available with the next release of Kotlinx. Looks like that annotation is what would be a nice fix for this issue like:
@JsonNames("confidence_entity")
val confidence: Float? = null
Or something along the lines of that.
Thank you for getting a solution sorted though!
We'll definitely look into it when it's released and when we are ready to move on to the next major Kotlin release.
For now, we will release version 1.2.4 with the fix in a few days.
Thanks!