googleapis/python-language

The wiki URL returned in the response object by analyze_entities references a non-English Wikipedia page

FrederickXZhang opened this issue · 1 comments

Hello,

The Wikipedia URL in the response object returned by analyze_entities points to a non-English Wikipedia page. I believe this method worked correctly before, i.e., it always referenced the English Wikipedia page, but it has now started returning non-English Wikipedia URLs. Please see the details below.

Environment details

  • OS type and version: Ubuntu 20.04.4 LTS
  • Python version: 3.9.7
  • pip version: 22.0.3
  • google-cloud-language version: 2.4.1

Steps to reproduce

import os

from google.cloud import language_v1

# Placeholder for the path to the service account key file.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "AUTHENTICATION PRIVATE KEY"
client = language_v1.LanguageServiceClient()

text_content = "USA"
type_ = language_v1.Document.Type.PLAIN_TEXT
language = "en"  # English is requested explicitly
document = {"content": text_content, "type_": type_, "language": language}
encoding_type = language_v1.EncodingType.UTF32
response = client.analyze_entities(request={"document": document, "encoding_type": encoding_type})

Code example

The response object correctly recognizes the entity (USA), but it references a non-English Wikipedia URL even though I specified "en" for the language option.

entities {
  name: "USA"
  type_: LOCATION
  metadata {
    key: "mid"
    value: "/m/09c7w0"
  }
  metadata {
    key: "wikipedia_url"
    value: "https://de.wikipedia.org/wiki/Vereinigte_Staaten"
  }
  salience: 1.0
  mentions {
    text {
      content: "USA"
    }
    type_: PROPER
  }
}
language: "en"
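
For anyone who wants to check their own responses for this, here is a minimal sketch of how the mismatch could be flagged. It works on plain dicts so it runs without credentials. The helper names `wikipedia_language` and `mismatched_urls` are hypothetical and not part of google-cloud-language, and feeding it `(e.name, dict(e.metadata))` for each entity in `response.entities` is an assumption about how the metadata map can be converted.

```python
from urllib.parse import urlparse


def wikipedia_language(url):
    """Return the language subdomain of a Wikipedia URL, or None if it is not one."""
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    # Hosts are expected to look like "en.wikipedia.org" or "de.wikipedia.org".
    if len(parts) >= 3 and parts[-2:] == ["wikipedia", "org"]:
        return parts[0]
    return None


def mismatched_urls(entities, expected="en"):
    """Return (name, language) pairs whose wikipedia_url is not in `expected`.

    `entities` is an iterable of (name, metadata_dict) pairs (hypothetical shape).
    """
    mismatches = []
    for name, metadata in entities:
        url = metadata.get("wikipedia_url")
        if url is None:
            continue  # entity has no Wikipedia link at all
        lang = wikipedia_language(url)
        if lang != expected:
            mismatches.append((name, lang))
    return mismatches


# The entity from the output above, written as a plain dict.
sample = [
    ("USA", {
        "mid": "/m/09c7w0",
        "wikipedia_url": "https://de.wikipedia.org/wiki/Vereinigte_Staaten",
    }),
]
print(mismatched_urls(sample))
```

This only detects the problem on the client side; it does not change which URL the service returns.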

The error is also reproducible in your online demo. Please take a look at the URL I circled in the following screenshot.
image

Is there a way to revert the current release to an older version or revert your database?
Thanks!

Hello :) Hope you are doing well. I ran your code, but it shows me the correct Wikipedia address.
Screen Shot 2022-07-12 at 6 31 22 PM