DISCUSSION: What should output look like?

Question

Opened this issue 9 years ago · 11 comments

DO YOU USE DOCTOP? If so, please comment below on how you feel its output should look in version 2!

Right now, Doctop output resembles:

{
  "copy": {
    "h1-1": [
      "This is a paragraph of text",
      "this is another paragraph",
      "h2-1": [
        "this should be a child of h2-1, which should be a child of h1-1",
        "h3-1": [
          "This should be a child of h3-1, which should be a child of h2-1"
        ]
      ]
    ],
    "h1-2": [
      "This should be a child of h1-2, which itself should be in the top level of the object.",
      "h3-2": [
        "This should be a child of h3-2, which should be a child of h1-2"
      ]
    ],
    "h1-3": [
      "This should be a child of h1-3",
      "Another child of h1-3"
    ]
  }
}

Meanwhile, Archie gets rendered and chucked onto the end while still remaining confusingly in the copy object. The header text itself gets lost and mainly acts as a comment for people editing the Doc. Another downside is that the numbering of the headers is a bit unintuitive — it doesn't reset for sub-headers. The upside of doing it this way, however, is folks can't come along and break your app by fiddling with the header text.

With a total rewrite on the way, I think there's a lot of room to improve on this. We could somehow add the header text into the output like in fancyOutput (which, do we even want to keep?), or something similar. Or we could make everything be totally array-like, which would be nice because it would mean we could just iterate through each array when templating.

If you have any thoughts on this, please let me know by commenting below! I'm about 80% of the way through a rewrite of Doctop in TypeScript that will be usable in NodeJS and not require jQuery, and am now needing to rewrite the walker part (which is probably the part most in need of a rewrite anyway).

Any and all feedback very welcome!

Answer 1 · 2016-04-05T01:58:13.000Z

Hey Andrew!
I am trying to prototype something using Doctop, and yes, having the text of the header in the output would be very useful!

Let me know if there's anything I can do to help!

Answer 2 · 2016-04-05T16:57:28.000Z

Hi @pietrop!

Thanks for the feedback! In an ideal world, any thoughts as to how Doctop would output? Am currently thinking something like:

[
    [
        {text: 'Header text', level: 1, tag: 'h1'},
        [
            {text: 'Header text', level: 2, tag: 'h2'},
            [
                {text: 'Paragraph text', level: 3, tag: 'p'}
            ]
        ]
    ]
]

Basically, nothing is explicitly named and all text nodes are given an object.

The advantage is that everything is in a similar structure and using arrays means everything is kept in order a little nicer than how I was doing things with objects. You also are able to just reference by array index, which gets around the "writers changing heading text and thus breaking the app". Alas, it opens it up to another issue, "writers adding new headers mid-document and breaking the app".
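To make that trade-off concrete, here is a rough sketch. It assumes a simplified version of the shape above, where each section is an array of `[header, ...content]`; the sample data and the `creditsText` helper are made up for illustration and are not part of Doctop.

```javascript
// Hypothetical sample document; each section is simplified to [header, ...content].
const before = [
  [{ text: 'Intro', level: 1, tag: 'h1' }, { text: 'Welcome!', level: 2, tag: 'p' }],
  [{ text: 'Credits', level: 1, tag: 'h1' }, { text: 'Made by the team.', level: 2, tag: 'p' }],
];

// A template that picks the credits paragraph purely by position.
const creditsText = (doc) => doc[1][1].text;

// Renaming a heading changes nothing the template relies on.
const renamed = [before[0], [{ ...before[1][0], text: 'Thanks' }, before[1][1]]];

// A new section added mid-document shifts every later index.
const inserted = [
  before[0],
  [{ text: 'Methods', level: 1, tag: 'h1' }, { text: 'How we did it.', level: 2, tag: 'p' }],
  before[1],
];

console.log(creditsText(before));   // Made by the team.
console.log(creditsText(renamed));  // Made by the team.
console.log(creditsText(inserted)); // How we did it.
```

So positional lookups survive edits to heading text, but not edits to the document's structure.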

The other option that might be cool is using ES6 iterators or lists or something like that, not entirely sure; very much open to any and all ideas! 😄

Answer 3 · 2016-04-05T20:58:42.000Z

Great, good stuff!

So let's consider the text in Markdown, for clarity on the formatting:

# Heading 1
some paragraph text of heading 1

## Heading 2
some paragraph text of heading 2

### Heading 3 
some paragraph text of heading 2

What I was wondering is: do they need to be nested? What is the advantage? Perhaps a flatter structure would be easier to troubleshoot and work with?

[
    { text: "Heading 1", tag: "h1" },
    { text: "some paragraph text of heading 1", tag: "p" },
    { text: "Heading 2", tag: "h2" },
    { text: "some paragraph text of heading 2", tag: "p" },
    { text: "Heading 3", tag: "h3" },
    { text: "some paragraph text of heading 2", tag: "p" }
]

Mostly just thinking out loud, but let me know what you think

Answer 4 · 2016-04-05T21:30:06.000Z

@pietrop Interesting idea. Main point of the nesting is to introduce some order to the document, but perhaps it adds more complexity than is necessary. If nothing else, a "flat" mode could be a cool option.

Answer 5 · 2016-04-05T21:31:45.000Z

Or possibly even a "maxDepth" option, that could be set to 0?
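As a rough sketch of how such an option might behave (not how Doctop works today): the `nest` and `headingLevel` helpers below are hypothetical, and they assume the flat `{ text, tag }` nodes suggested above, with each section represented as an array of `[header, ...content]`.

```javascript
// Heading level from a tag name: 'h2' -> 2; anything else (e.g. 'p') -> null.
function headingLevel(tag) {
  const match = /^h([1-6])$/.exec(tag);
  return match ? Number(match[1]) : null;
}

// Hypothetical helper: group a flat node list into nested section arrays.
// Headings deeper than maxDepth are treated as plain content, so
// maxDepth = 0 returns a flat copy of the input.
function nest(nodes, maxDepth = 6) {
  const root = [];
  // Stack of currently open sections; the root acts as a level-0 section.
  const stack = [{ level: 0, items: root }];
  for (const node of nodes) {
    const level = headingLevel(node.tag);
    if (level !== null && level <= maxDepth) {
      // Close every open section at the same or a deeper level.
      while (stack[stack.length - 1].level >= level) stack.pop();
      const section = [node];
      stack[stack.length - 1].items.push(section);
      stack.push({ level, items: section });
    } else {
      stack[stack.length - 1].items.push(node);
    }
  }
  return root;
}

const flatNodes = [
  { text: 'Heading 1', tag: 'h1' },
  { text: 'some paragraph text of heading 1', tag: 'p' },
  { text: 'Heading 2', tag: 'h2' },
  { text: 'some paragraph text of heading 2', tag: 'p' },
];

console.log(JSON.stringify(nest(flatNodes, 0))); // flat: four nodes at the top level
console.log(JSON.stringify(nest(flatNodes)));    // nested: [[h1, p, [h2, p]]]
```

With something like this, "flat" mode would just be the `maxDepth: 0` case rather than a separate code path.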

Answer 6 · 2016-04-06T01:46:22.000Z

Hi @Aendrew! Awesome work. I'm using Doctop for https://gurivr.com

I think what @pietrop is proposing fits better with the structure of a Google Doc, since headings and paragraphs there are not nested. IMO nesting nodes makes sense if you have nested structures and want to build something like an AST, but not for this.

Answer 7 · 2016-04-06T11:12:19.000Z

@impronunciable Cool, I kind of see your point. In that case, if it's just a flat hierarchy, how should nodes be handled? As basic objects like I do above, as straight HTMLElements, or something else? Also, what utility do you get out of Doctop? Mainly as a way of sanitising GDocs' HTML output?

That's a really interesting project by the way; super cool! 😄

Answer 8 · 2016-04-06T12:46:53.000Z

I think the way pietro is proposing makes sense. It's easy to map to html or just "walk the nodes".

nodes.map(({tag, text}) => `<${tag}>${text}</${tag}>`).join('')

I'm using it combined with ArchiML (thank you for the compatibility layer) to add annotations to gdocs and then generate VR scenes.

For the moment I'm not interested in the text structure, since I'm just looking at the archie property. I can see scenarios in the future where I'd like to identify sections and make decisions based on them, but I haven't thought about it much yet (I started the tool a week ago).
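For what it's worth, identifying sections should still be doable from a flat list. A minimal sketch, assuming the flat `{ text, tag }` nodes proposed above; `toSections` is a made-up helper, not a Doctop function.

```javascript
// Hypothetical helper: split a flat node list into sections, one per heading.
// Content before the first heading ends up in a section with heading: null.
function toSections(nodes) {
  const sections = [];
  let current = { heading: null, body: [] };
  for (const node of nodes) {
    if (/^h[1-6]$/.test(node.tag)) {
      // A heading closes the current section (if it holds anything) and opens a new one.
      if (current.heading || current.body.length) sections.push(current);
      current = { heading: node, body: [] };
    } else {
      current.body.push(node);
    }
  }
  if (current.heading || current.body.length) sections.push(current);
  return sections;
}

const sampleNodes = [
  { text: 'Scene one', tag: 'h1' },
  { text: 'A quiet room.', tag: 'p' },
  { text: 'Scene two', tag: 'h1' },
  { text: 'A busy street.', tag: 'p' },
  { text: 'Cars pass by.', tag: 'p' },
];

for (const { heading, body } of toSections(sampleNodes)) {
  console.log(`${heading.text}: ${body.length} paragraph(s)`);
}
```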

Answer 9 · 2017-01-17T15:57:58.000Z

@Aendrew following up on this, I was wondering whether, in the current version, it is possible to get the comments associated with a Google Doc in the JSON?

Answer 10 · 2017-01-19T17:25:43.000Z

@pietrop That's a good question and something I've been wondering myself. Will take a look and get back to you!

You have good timing, I might do some work on Doctop in the next few weeks if there's demand given a few folks over here at the FT are considering it.

Edit: Alas, there currently seems to be no way to output comments as part of a published document. I suppose I could do some stuff with the API, but that would start to lose some of the library's simplicity.

Answer 11 · 2017-01-19T17:31:34.000Z

Awesome, let me know if you need any help, would like to learn more about how this works under the hood :)