wikimedia/html-metadata

html-metadata returns all fields as undefined for specific url

Closed this issue · 5 comments

Hi,

I have been using html-metadata, and thank you for such wonderful software. I noticed that when it is run against the URL https://www.cnet.com/special-reports/vr101/ it returns most fields as undefined.

Could you please look into this issue?

My code is

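var scrape = require('html-metadata');
// URL to scrape, taken from the first command-line argument
var url = process.argv[2];

scrape(url).then(function(metadata){
    console.log("************************");
    console.log(metadata);
}).catch(function(err){
    // Surface network or parsing failures instead of failing silently
    console.error(err);
});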

and the output I get for this program is

parse() is deprecated, use toJson()


{ openGraph:
   { site_name: undefined,
     title: undefined,
     description: undefined,
     url: undefined,
     image:
      { url: undefined,
        type: 'image/jpeg',
        width: '630',
        height: '315' },
     app_id: undefined,
     type: 'article' },
  twitter:
   { card: 'summary_large_image',
     creator: undefined,
     site: undefined } }

mvolz commented

Thanks for reporting!

The underlying issue with that URL is that they appear to be using a templating language to generate their HTML, but for some reason the content attribute isn't being added to the meta tags:

<!-- OpenGraph sharing tags -->
    <meta property="og:site_name"     ng-attr-content="{{share.siteName}}" />
    <meta property="og:title"         ng-attr-content="{{share.title}}" />
    <meta property="og:description"   ng-attr-content="{{share.description}}" />

But obviously we shouldn't be returning undefined! :)
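To illustrate the diagnosis above, here is a rough sketch that lists OpenGraph meta tags carrying a property but no literal content attribute. The function name findMetaWithoutContent is hypothetical and not part of html-metadata, the regex-based scan is a simplification rather than how the library parses pages, and the og:type line in the sample is an assumed example of a correctly populated tag.

```javascript
// Hypothetical helper, not part of html-metadata: returns the og:* property
// names of <meta> tags that have no literal content attribute.
function findMetaWithoutContent(html) {
  const missing = [];
  // Naive scan for <meta ...> tags; a real parser would be more robust.
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const property = /\bproperty\s*=\s*"(og:[^"]*)"/i.exec(tag);
    // The lookbehind keeps "ng-attr-content" from counting as "content".
    const hasContent = /(?<![\w-])content\s*=/i.test(tag);
    if (property && !hasContent) {
      missing.push(property[1]);
    }
  }
  return missing;
}

// First two tags are copied from the page source quoted above;
// the third is an assumed example of a tag that does have content.
const sample = `
  <meta property="og:site_name"   ng-attr-content="{{share.siteName}}" />
  <meta property="og:title"       ng-attr-content="{{share.title}}" />
  <meta property="og:type"        content="article" />
`;

console.log(findMetaWithoutContent(sample));
```

Tags like the first two are the ones a scraper reading only the content attribute will see as empty.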

mvolz commented

Hi,

I will fix the library so that it no longer returns undefined; instead you will get cleaner, though not very rich, metadata. It looks like they have a programming error on their end :/ I'll let you know when the update is published.
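Until a fixed release is available, a caller could clean the result on their side. This is a minimal sketch only: stripUndefined is a hypothetical helper, not the library's actual fix, and the reported object is copied from the output posted in this issue.

```javascript
// Hypothetical post-processing helper, not part of html-metadata.
// Recursively drops keys whose value is undefined, plus any nested
// objects that end up empty. Arrays and primitives are returned unchanged.
function stripUndefined(value) {
  if (value === null || typeof value !== 'object' || Array.isArray(value)) {
    return value;
  }
  const cleaned = {};
  for (const [key, child] of Object.entries(value)) {
    const result = stripUndefined(child);
    const isEmptyObject =
      result !== null &&
      typeof result === 'object' &&
      !Array.isArray(result) &&
      Object.keys(result).length === 0;
    if (result !== undefined && !isEmptyObject) {
      cleaned[key] = result;
    }
  }
  return cleaned;
}

// Metadata object as reported in this issue.
const reported = {
  openGraph: {
    site_name: undefined,
    title: undefined,
    description: undefined,
    url: undefined,
    image: { url: undefined, type: 'image/jpeg', width: '630', height: '315' },
    app_id: undefined,
    type: 'article'
  },
  twitter: { card: 'summary_large_image', creator: undefined, site: undefined }
};

console.log(stripUndefined(reported));
```

With the reported object, only image.type, image.width, image.height, the OpenGraph type, and the Twitter card survive, which matches the "cleaner, but not very rich" result described above.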

mvolz commented

I think this has actually been resolved for a while, so closing :).