snarfed/granary

Some weird results

singpolyma opened this issue · 6 comments

https://granary.io/url?input=html&output=mf2-json&url=http%3A%2F%2Fbeta.singpolyma.net%2F

I see the first item is my representative hcard -- but it has an "author" property? Also not getting the full name that is present.

Authors on the hentries have the full content of the vcard as the name instead of pulling from the fn property, and are also missing the full name that is present.

Entries with rel=tag markup get a data element appended to their content.html property?

For comparison, https://pin13.net/mf2/?url=http%3A%2F%2Fbeta.singpolyma.net%2F gives me what I expected. https://mf2.kylewm.com/?url=http%3A%2F%2Fbeta.singpolyma.net%2F&parser=html5lib has most of the same issues (I assume you also use mf2py, so that's probably it) but it at least handles the name in the representative hcard properly.

thanks for trying out bridgy fed and granary (more), and for filing all these issues! much appreciated. and apologies for the problems, obviously.

you're right, granary does use mf2py, so it's often the root cause of mf2 parsing problems. it was dormant for a long time, but @kartikprabhu and @kevinmarks have recently picked up activity and done a lot of work on it in its new location, https://github.com/microformats/mf2py (and in https://github.com/kartikprabhu/mf2py/tree/experimental). i think they're getting close to putting out a release. looking forward to it!

in general, for just parsing mf2 HTML to JSON, i'd recommend a normal mf2 parser. granary uses AS1 as its common backplane format, so everything goes through that first, which means it's sometimes lossy in ways that just mf2 HTML => JSON isn't. (not to mention requires an extra serial HTTP request if you're using https://granary.io/.)

i'll still definitely look into the representative h-card name though!

and thanks for filing the corresponding issue in mf2py!

re the representative h-card name, https://pin13.net/mf2/?url=http://beta.singpolyma.net/ currently returns:

"properties": {
  "given-name": ["Stephen"],
  "additional-name": ["Paul"],
  "family-name": ["Weber"],
  "name": ["singpolyma"],
  "..."

given-name, additional-name, and family-name don't make it through granary's mf2 => AS conversion. name does. http://microformats.org/wiki/h-card#Properties says:

p-name - The full/formatted name of the person or organisation

...and granary takes that at face value, so it only keeps name all the way through mf2 => AS => mf2.
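a rough sketch of why that happens, as a toy and not granary's actual code: if the intermediate format only has a slot for some properties, anything unmapped is dropped on the way in and can't come back on the way out. the mapping table and the function names `mf2_to_as` / `as_to_mf2` below are hypothetical, made up for illustration.

```python
# Hypothetical sketch, NOT granary's real code: a toy mf2 -> AS-like -> mf2
# round trip that only carries properties the intermediate format has a slot for.

# Assumed mapping from mf2 h-card property names to intermediate field names.
MF2_TO_AS = {"name": "displayName", "url": "url"}
AS_TO_MF2 = {as_field: mf2_prop for mf2_prop, as_field in MF2_TO_AS.items()}


def mf2_to_as(props):
    """Keep only mapped properties, taking the first value of each."""
    return {
        MF2_TO_AS[prop]: values[0]
        for prop, values in props.items()
        if prop in MF2_TO_AS and values
    }


def as_to_mf2(obj):
    """Convert back to mf2-style properties, where every value is a list."""
    return {
        AS_TO_MF2[field]: [value]
        for field, value in obj.items()
        if field in AS_TO_MF2
    }


# The h-card properties quoted above from pin13.net.
hcard = {
    "given-name": ["Stephen"],
    "additional-name": ["Paul"],
    "family-name": ["Weber"],
    "name": ["singpolyma"],
}

round_tripped = as_to_mf2(mf2_to_as(hcard))
print(round_tripped)  # {'name': ['singpolyma']}
```

only `name` survives the round trip here because it is the only one of the four properties with an entry in the (assumed) mapping.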

i'll update granary's mf2py as soon as @kartikprabhu / @kevinmarks cut a new release, which should hopefully fix the other issues.

Oh, ok. So granary removes fields it can't convert to the intermediate format, that makes sense. And AS doesn't support the full-name properties.

i've tentatively upgraded mf2py to https://github.com/snarfed/mf2py/tree/all in the mf2py branch here, and deployed it, which i think fixes the rest of these problems.

the new output in https://granary.io/url?input=html&output=mf2-json&url=http%3A%2F%2Fbeta.singpolyma.net%2F and other examples here looks ok. definitely not as good as just a simple mf2 parser, but since html -> mf2 JSON is an uncommon use case for granary, I'm ok with that.

tentatively closing. feel free to reopen!