XeniaRieger/Modern-Search-Engines

Questions for Tutorial 2


  • Can we use a library for Doc2Query?
  • SSL certificate error message:
    Error: HTTPSConnectionPool(host='www.unimuseum.uni-tuebingen.de', port=443): Max retries exceeded with url: /en/museum-at-hohentuebingen-castle (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))
  • Should only the visible content of the web page be tokenized, or also the header, hidden HTML comments, etc.?

Doc2Query: generating queries from documents should be allowed. The tutor will ask the professor and post in the forum by Tuesday evening if it is not allowed.

Certificate error: caused by the local settings of one specific person. Does it only occur for https URLs?
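Certificate verification happens during the TLS handshake, so plain http URLs cannot trigger this error. One way the crawler could cope with it is to keep verification on and skip pages whose certificate chain cannot be validated on the local machine. The sketch below uses only the standard library rather than the HTTP client from the error message above. The names `is_certificate_error` and `fetch_page` are hypothetical helpers, not part of the project.

```python
import ssl
import urllib.error
import urllib.request


def is_certificate_error(exc):
    """Return True if exc is, or wraps, a certificate verification failure."""
    if isinstance(exc, ssl.SSLCertVerificationError):
        return True
    # urllib wraps low-level errors in URLError and keeps the cause in .reason.
    reason = getattr(exc, "reason", None)
    return isinstance(reason, ssl.SSLCertVerificationError)


def fetch_page(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch url with verification enabled.

    Returns the decoded body, or None if the page was skipped because its
    certificate could not be verified. Other errors are re-raised.
    The opener parameter is an assumption added so the function can be
    exercised without network access.
    """
    context = ssl.create_default_context()  # uses the system CA store
    try:
        with opener(url, timeout=timeout, context=context) as response:
            return response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, ssl.SSLError) as exc:
        if is_certificate_error(exc):
            print(f"skipping {url}: certificate verification failed")
            return None
        raise
```

Skipping is a deliberate choice here: disabling verification would also make the error disappear, but it would hide a misconfigured local CA store instead of surfacing it.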

Header tokenizing: we decide on our own. It could help the system or make it worse.
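To compare both options, the choice can be kept behind a flag. The following is a minimal sketch using the standard library's `html.parser`. It assumes "header" means the HTML `<head>` element (for example the page title). The names `VisibleTextExtractor` and `visible_tokens` and the simple `\w+` tokenizer are hypothetical. Comments are dropped because `HTMLParser` routes them to `handle_comment`, which does nothing by default.

```python
import re
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text that is not inside skipped elements."""

    def __init__(self, include_head=False):
        super().__init__()
        # Elements whose text content is never rendered to the reader.
        self.skipped = {"script", "style", "noscript", "template"}
        if not include_head:
            self.skipped.add("head")
        self._depth = 0  # nesting depth inside skipped elements
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.skipped:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self.skipped and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_tokens(html, include_head=False):
    """Return lowercased word tokens from the visible part of html."""
    parser = VisibleTextExtractor(include_head=include_head)
    parser.feed(html)
    parser.close()  # flush any buffered text
    return re.findall(r"\w+", " ".join(parser.chunks).lower())
```

Building the index once with `include_head=False` and once with `include_head=True` makes it possible to check on our own queries whether the extra tokens help or hurt.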