Optionally parse org data-trees as nodes instead of pages

Question


ispringle opened this issue 2 years ago · 2 comments

It'd be great if there was a way to parse an org file's data-tree and convert it into nodes of data as opposed to the more markdown-style file parsing where the file is a single entity.

For example given this org file:

* blog
** This is a post
    Here is some post content
** This is another post
    This is some different post content

Currently the above file is parsed as a single entity, so you end up with an <h1>blog</h1> followed by the h2 headings under it. If we parsed it in a more org-ish way and treated headings as nodes on a data-tree, we'd end up with a data structure such as:

nodes: [
  blog: {
    content: "...",
    nodes: [
      "This is a post": {...},
      "This is another post": {...},
    ],
    ...
  },
  ...
]

Answer 1 · 2022-08-06T12:47:27.000Z

Hey.

This structure is less "org-ish" because it doesn't follow org structure. Examples of cases that would be hard to handle:

- inlinetasks (headings in-between content)
- headings that don't nest nicely: *** headings under * ones
- repeated heading titles
- the order of headings is almost lost
- can your blog posts have any heading? Should all headings be nodes or should we apply an arbitrary rule? (e.g., to only lift headings with ids as org-roam does)

I'd say that this is a rather specific use case (making all headings into "nodes") and I wouldn't implement it. The good news is that it is easy to do yourself: you could traverse org-data and section nodes and lift their section children as nodes (if they satisfy your lifting condition).
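A minimal sketch of that traversal in TypeScript. The node shapes below (`OrgData`, `Section`, `Headline`, `Paragraph`) are simplified stand-ins assumed for illustration, not the parser's actual types, and `liftNodes`, `liftSection`, and `LiftCondition` are hypothetical names. It returns arrays rather than title-keyed objects so that heading order and repeated titles survive.

```typescript
// Hypothetical, simplified tree shape: a real parser's AST carries more fields.
interface Headline { type: "headline"; level: number; title: string; }
interface Paragraph { type: "paragraph"; text: string; }
interface Section { type: "section"; children: Array<Headline | Paragraph | Section>; }
interface OrgData { type: "org-data"; children: Array<Paragraph | Section>; }

// Shape of a lifted node, following the structure proposed in the question.
interface LiftedNode { title: string; content: string; nodes: LiftedNode[]; }

// The "lifting condition": decides whether a heading becomes a node.
type LiftCondition = (headline: Headline) => boolean;

function liftSection(section: Section, shouldLift: LiftCondition): LiftedNode[] {
  let headline: Headline | undefined;
  const paragraphs: string[] = [];
  let nested: LiftedNode[] = [];
  for (const child of section.children) {
    if (child.type === "headline") {
      headline = child;
    } else if (child.type === "paragraph") {
      paragraphs.push(child.text);
    } else {
      // Nested section: recurse and keep whatever it lifts, in document order.
      nested = nested.concat(liftSection(child, shouldLift));
    }
  }
  // A section that is not lifted still passes its lifted descendants upward.
  if (!headline || !shouldLift(headline)) {
    return nested;
  }
  return [{ title: headline.title, content: paragraphs.join("\n"), nodes: nested }];
}

function liftNodes(root: OrgData, shouldLift: LiftCondition = () => true): LiftedNode[] {
  let result: LiftedNode[] = [];
  for (const child of root.children) {
    if (child.type === "section") {
      result = result.concat(liftSection(child, shouldLift));
    }
  }
  return result;
}

// The org file from the question, written out by hand in the assumed shape.
const example: OrgData = {
  type: "org-data",
  children: [
    {
      type: "section",
      children: [
        { type: "headline", level: 1, title: "blog" },
        {
          type: "section",
          children: [
            { type: "headline", level: 2, title: "This is a post" },
            { type: "paragraph", text: "Here is some post content" },
          ],
        },
        {
          type: "section",
          children: [
            { type: "headline", level: 2, title: "This is another post" },
            { type: "paragraph", text: "This is some different post content" },
          ],
        },
      ],
    },
  ],
};

const allHeadings = liftNodes(example);
const postsOnly = liftNodes(example, (h) => h.level === 2);
```

With the default condition every heading becomes a node, so `allHeadings` holds a single `blog` node with two children. With the level-2 condition, `postsOnly` holds just the two posts. In this sketch the text of a section that is not lifted is dropped rather than merged into its parent. Whether that is the right behavior depends on the use case.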

(Lifting all headings with IDs as nodes is more common (org-roam) and I would love to see that as a library.)

Answer 2 · 2022-08-23T23:06:02.000Z

Yes, there would need to be some property value that signals the heading is now a leaf and not another node in the tree. ox-hugo does this by saying that any heading whose property drawer contains an :EXPORT_FILE_NAME: property is a leaf, and all the content in it is treated as content to be transformed into HTML. All my blog posts already contain an ID, so perhaps that would be a good avenue to pursue.
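A rough sketch of such a condition in TypeScript, assuming a heading's property drawer has already been read into a plain key/value map. `HeadingProperties`, `hasProperty`, `isExportLeaf`, and `hasId` are hypothetical names, not part of any library. Keys are compared case-insensitively, since org treats property names that way.

```typescript
// Hypothetical: a heading's property drawer as a plain string map.
type HeadingProperties = Record<string, string>;

// Builds a predicate that is true when the named property is present and non-empty.
function hasProperty(name: string): (props: HeadingProperties) => boolean {
  const wanted = name.toUpperCase();
  return (props) =>
    Object.keys(props).some(
      (key) => key.toUpperCase() === wanted && props[key].trim() !== ""
    );
}

// ox-hugo-style rule: a heading with EXPORT_FILE_NAME is a leaf.
const isExportLeaf = hasProperty("EXPORT_FILE_NAME");

// org-roam-style rule: a heading with an ID is a node.
const hasId = hasProperty("ID");
```

Either predicate could serve as the rule for which headings to lift; which one fits depends on whether the files already carry IDs or export file names.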