Orgmunge was born out of the desire to modify Org documents programmatically from within Python. The wonderful orgparse can read an Org document into a tree object but doesn’t offer an interface to modify the tree and write it back to a file.
The original use case was trying to sync Outlook calendar items with
Org: whenever someone rescheduled a meeting, my Python script was
unable to reschedule the Org heading it had originally
created. Instead of forking orgparse, I decided to write an actual
grammar for an Org document and use PLY to generate a parser for it.
Now, Org syntax is too sophisticated for me to claim that this first
attempt can parse everything. In fact, some folks way smarter than I
am (and with more formal training) have hinted that Org
syntax can’t be properly parsed with a context-free grammar. For such
reasons (and for my own lack of experience with writing grammars), I
have restricted the scope of this module to the features I care about:
for each heading, the headline components (the COMMENT keyword, the
todo state, priority, cookies, and tags) are all parsed, as well as
any scheduling timestamps and all the drawers. The heading contents
are treated as a blob of text, and the only things the parser extracts
from them are the timestamps. No attempt is made to parse things like
tables or source code blocks any further. orgmunge can also
parse out the document’s metadata and export options, but the major
assumption it makes is that the document starts out with some optional
metadata and export options, followed by some optional initial body
text (not falling under any heading), and then a tree of headings. Any
export options or metadata that come later within the document are
treated as text (some heading’s content).
- The only dependency of `orgmunge` is `PLY`, so you need `PLY` installed.
- Clone this repo.
- Add the directory where you cloned this repo to your `PYTHONPATH`.
- The parser recognizes todo keywords by looking for a file named `todos.json` in one of 3 places:
  - The current directory
  - The user’s home directory
  - The package directory
- You can look at the file `todos.json` shipped with this package for the expected syntax and customize it with your preferred todo keywords (or make a copy in your home directory). `orgmunge` doesn’t currently support reading todo states defined with `#+TODO` in the file it’s reading, and there are no plans to do so in the near future (see this issue).
- Here’s an example:

      {
        "todo_states": {
          "my_todo": "TODO",
          "my_next": "NEXT",
          "my_wait": "WAIT"
        },
        "done_states": {
          "my_cncl": "CNCL",
          "my_done": "DONE"
        }
      }

  The dict keys can be any names meaningful to you and will be exposed as keys to the `_todo_keywords`, `_todo_states`, and `_done_states` attributes of the `Headline` class. The dict values are the actual todo keywords that will be in the Org file you’re reading. The file in the repo has a different example with Font Awesome characters as the todo keywords.
- The `Org` class in `__init__.py` is the main entry point to `orgmunge`. It can be used to read an Org tree either from a string or from a file:

      from orgmunge import Org

      org_1 = Org('* TODO Something important\n', from_file=False)  # \n needed to signify end of document
      org_2 = Org('/path/to/my/file.org')
      org_3 = Org('/path/to/my/file.org', debug=True)  # Print PLY debugging info

- The `Org` object has 3 main attributes you should care about:
  - `Org.metadata` stores the metadata and export options found at the beginning of the file. This is a dict mapping the option/keyword name to a list of its values (to allow for cumulative keywords such as `#+OPTIONS`). Example:

        org_1 = Org('#+title: Test\n', from_file=False)  # from_file=False because the input is a string
        assert org_1.metadata['title'] == ['Test']

  - `Org.initial_body` stores any text between the metadata and the first heading.
  - `Org.root` stores the root of the Org tree. This is a heading with the headline `ROOT` whose only useful attribute is `children`, which is a list of all the headings in the given document.
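To make the shape of `Org.metadata` concrete, here is a minimal standalone sketch that collects leading `#+KEY: value` lines into a dict of lists. It is only an illustration using a regular expression, not the PLY grammar `orgmunge` actually uses, and `collect_metadata` is a hypothetical name.

```python
import re
from collections import defaultdict

# Matches lines such as "#+title: Test" and captures the keyword and its value.
KEYWORD_LINE = re.compile(r"^#\+(\w+):\s*(.*)$")


def collect_metadata(text: str) -> dict:
    """Map each leading keyword to the list of values it was given.

    Hypothetical helper: scanning stops at the first non-blank line that is
    not a keyword line, mirroring the assumption that metadata comes first.
    """
    metadata = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines between keywords are skipped
        match = KEYWORD_LINE.match(line.strip())
        if match is None:
            break  # body text or a heading: the metadata block is over
        keyword, value = match.groups()
        metadata[keyword].append(value.strip())
    return dict(metadata)


sample = "#+title: Test\n#+OPTIONS: toc:nil\n#+OPTIONS: num:nil\n* Heading\n#+title: Late\n"
parsed = collect_metadata(sample)
```

Repeated keywords accumulate in their list, and the `#+title` that appears after the first heading is ignored, in line with the document layout described earlier.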
- The Org tree is a list of headings with parent, child and sibling relationships.
- A heading object consists of:
  - A headline
  - Contents:
    - Scheduling, if any
    - A list of Drawers, if any
    - Body text, if any
- Important attributes:
  - `properties`. This is a dict mapping property names to their values. The properties are parsed from the `PROPERTIES` drawer if it exists. This attribute can also be set by the user (the value supplied must be a dict).
  - `headline` returns the heading’s headline. This attribute can also be set by the user (the value must be a `Headline` instance).
  - `scheduling` is a `Scheduling` object containing information about the `SCHEDULED`/`DEADLINE`/`CLOSED` timestamps of the heading, if any. Can also be set by the user (the value must be a `Scheduling` instance).
  - `drawers` is a list of `Drawer` objects containing the drawers associated with this heading. When you update the heading’s `properties` attribute, the `PROPERTIES` drawer is updated the next time you access it.
  - `children` returns a list of `Heading` objects that are the direct children of this heading.
  - `parent` returns the parent heading of the current one. If the current heading is a top-level heading, the root heading will be returned.
  - `sibling` returns the sibling heading of the current one that comes before it in the tree, if any. This is the sibling that is formally tracked because it’s the one that would adopt the current heading whenever the current heading is demoted. If you want a list of all siblings of the current heading, you can do this:

        siblings = [c for c in current_heading.parent.children if c is not current_heading]

  - `level` is the heading’s level, with 1 being the top level and each sub-level after that being incremented by 1 (the heading’s level is the number of “stars” before its headline).
- Important methods:
  - `clocking`. This returns a list of `Clocking` objects, parsed from the heading’s `LOGBOOK` drawer, if any. You can also pass the optional boolean parameter `include_children`, which, when `True`, includes clocking information of this heading’s children as well.
  - `add_child` accepts a `Heading` object to add as a child to the current heading. The optional boolean parameter `new` should be set to `True` when this is a new heading that was created and needs to be assigned a parent. It should be set to `False` (default) when the addition of a child is due to a promotion/demotion operation.
  - `remove_child` accepts a heading object and deletes it from the current heading’s children if it’s a child of the current heading.
  - `promote` promotes the current heading one level. If the heading has children, they would be orphaned, so this raises a `ValueError`. Technically, Org allows you to have, say, level 3 headings under a level 1 heading, but `orgmunge` does not allow this to make parsing the tree easier.
  - `promote_tree` promotes the current heading and all its descendants. Use this if the heading you want to promote has children.
  - `demote` demotes the current heading one level. If the current heading has no sibling to adopt it, the demotion attempt fails and raises a `ValueError`.
  - `demote_tree` is the equivalent of `promote_tree` for demotion.
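Since every heading exposes a `children` list, visiting the whole tree only needs a short recursive generator. The sketch below relies on nothing but that attribute. The `make` helper and the `title` attribute belong to a stand-in object used for the demonstration and are not part of `orgmunge`.

```python
from types import SimpleNamespace
from typing import Iterator


def walk(heading) -> Iterator:
    """Yield every descendant of `heading` depth-first, in document order."""
    for child in heading.children:
        yield child
        yield from walk(child)


def make(title, *children):
    """Build a stand-in heading (hypothetical, for demonstration only)."""
    return SimpleNamespace(title=title, children=list(children))


root = make("ROOT", make("A", make("A1"), make("A2")), make("B"))
titles = [h.title for h in walk(root)]
# titles == ["A", "A1", "A2", "B"]
```

With a real document, the same generator should work when started from `Org.root`, assuming `children` behaves as the plain list described above.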
- Important attributes of a `Headline` object:
  - `done` is a boolean attribute that determines whether the headline is in one of the done states. You can’t set this attribute directly.
  - `level` is the headline’s level (the number of “stars” before the title).
  - `comment` is a boolean attribute that determines whether a headline is commented out (by having the keyword `COMMENT` inserted before the title).
  - `todo` returns/sets the headline’s todo state. You can set it yourself but it has to be one of the values of `self._todo_states` or `self._done_states`.
  - `cookie` returns/sets the headline’s cookie. See Cookie Objects.
  - `priority` returns/sets the headline’s priority.
- Important methods:
  - `promote` decreases the level by the number given by the parameter `n` (default 1).
  - `demote` acts like `promote` but increases the level by `n` instead.
  - `toggle_comment` toggles the state of whether or not a headline is commented out using the `COMMENT` keyword.
  - `comment_out` ensures the headline is commented out using `COMMENT`.
  - `uncomment` ensures the headline is not commented out using the `COMMENT` keyword.
  - `raise_priority` increases the headline’s priority by 1.
  - `lower_priority` decreases the headline’s priority by 1.
- A `Scheduling` object has 6 attributes for the 3 possible scheduling keywords (3 are aliases of the other 3):
  - `CLOSED`, `closed`
  - `SCHEDULED`, `scheduled`
  - `DEADLINE`, `deadline`
- Each attribute, when queried, will return either `None` or a `TimeStamp` object representing the timestamp associated with this particular scheduling keyword. You can set the attributes directly but they have to be set to a `TimeStamp` object.
- A `Drawer` object has only 2 attributes: `name` and `contents`. The `contents` attribute is simply a list of lines making up the drawer contents. When you modify a heading’s `properties` attribute, its `PROPERTIES` drawer gets updated accordingly.
- The `Clocking` objects have 3 attributes: `start_time`, `end_time` and `duration`. Only the first 2 can be set. When setting either, you should pass a string following the Org time format; namely, `%Y-%m-%d %a %H:%M` (see the strftime(3) man page for an explanation of the format codes).
- If `end_time` is `None`, the duration is calculated from the `start_time` up to the current moment.
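The time format and the open-ended duration rule can be tried out with the standard library alone. This is a sketch of the behaviour described above, not the `Clocking` implementation. `clock_duration` and its `now` parameter are hypothetical names added so the example is reproducible.

```python
from datetime import datetime, timedelta
from typing import Optional

ORG_TIME_FORMAT = "%Y-%m-%d %a %H:%M"  # e.g. "2024-01-15 Mon 09:00"


def clock_duration(start: str, end: Optional[str] = None,
                   now: Optional[datetime] = None) -> timedelta:
    """Return the time elapsed between two Org-formatted time strings.

    When `end` is None the clock is treated as still running, so the
    duration is measured up to `now` (defaulting to the current moment).
    """
    start_dt = datetime.strptime(start, ORG_TIME_FORMAT)
    if end is not None:
        end_dt = datetime.strptime(end, ORG_TIME_FORMAT)
    else:
        end_dt = now if now is not None else datetime.now()
    return end_dt - start_dt


closed = clock_duration("2024-01-15 Mon 09:00", "2024-01-15 Mon 10:30")
running = clock_duration("2024-01-15 Mon 09:00", now=datetime(2024, 1, 15, 9, 45))
```

Note that `%a` is locale-dependent, so the day abbreviations are only guaranteed to be English under the default C locale.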
- A priority object has a single attribute, `priority`, which can be set directly by the user and can be one of only 3 strings: ‘A’, ‘B’ or ‘C’. Set it to `None` to remove the priority from the `Heading`.
- The methods `_raise` and `_lower` will raise or lower the priority.
- If the priority is `None`, raising it sets it to ‘A’ and lowering it sets it to ‘C’.
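The rules above can be written down as two small pure functions. This is only a sketch: `raised` and `lowered` are hypothetical names, and what happens when raising ‘A’ or lowering ‘C’ is not stated above, so leaving the value unchanged at the ends is an assumption made here.

```python
from typing import Optional

PRIORITIES = ["A", "B", "C"]  # ordered from highest to lowest


def raised(priority: Optional[str]) -> str:
    """Return the priority one step higher; None becomes 'A'."""
    if priority is None:
        return "A"
    index = PRIORITIES.index(priority)
    # Assumption: 'A' stays 'A' (the text does not say what happens here).
    return PRIORITIES[max(index - 1, 0)]


def lowered(priority: Optional[str]) -> str:
    """Return the priority one step lower; None becomes 'C'."""
    if priority is None:
        return "C"
    index = PRIORITIES.index(priority)
    # Assumption: 'C' stays 'C' (the text does not say what happens here).
    return PRIORITIES[min(index + 1, len(PRIORITIES) - 1)]
```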
- Important attributes of a `TimeStamp` object:
  - `start_time` and `end_time` can be queried and set by the user. You can set them by supplying a string, a `datetime` object or `None`.
  - `repeater` returns a timestamp repeater string such as ‘+1w’. Can also be set by the user.
  - `deadline_warn` acts similarly to `repeater` and represents the number of days before a deadline to warn the user of an upcoming deadline.
  - `active` is a boolean property and decides whether the time stamp will be printed with `[]` or `<>` delimiters. Can be set directly by the user.
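As an illustration of how `active` and `repeater` affect the printed form, the sketch below renders a `datetime` as an Org timestamp string. `format_timestamp` is a hypothetical helper and does not reflect how `TimeStamp` is implemented. In Org, active timestamps use `<>` and inactive ones use `[]`.

```python
from datetime import datetime


def format_timestamp(moment: datetime, active: bool = True, repeater: str = "") -> str:
    """Render `moment` as an Org timestamp string.

    Active timestamps are wrapped in <>, inactive ones in [].
    The day abbreviation comes from %a, which is locale-dependent.
    """
    body = moment.strftime("%Y-%m-%d %a %H:%M")
    if repeater:
        body = f"{body} {repeater}"  # e.g. "+1w" for a weekly repeat
    opening, closing = ("<", ">") if active else ("[", "]")
    return f"{opening}{body}{closing}"


meeting = format_timestamp(datetime(2024, 1, 15, 9, 0), repeater="+1w")
note = format_timestamp(datetime(2024, 1, 15, 9, 0), active=False)
```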
- `Cookie` objects represent progress on the current `Heading`.
- They can be of type ‘percent’ (e.g. [50%]) or of type ‘progress’ (e.g. [2/4]).
- Important attributes:
  - `cookie_type`: can only be one of ‘percent’ or ‘progress’. Can be set directly by the user.
  - `m` and `n` represent the progress as the ratio `m/n`. If the cookie type is ‘percent’, `n` is 100. When changing `cookie_type`, `m` and `n` are converted accordingly.
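The two cookie forms and their `m`/`n` representation can be illustrated with a small parser. This is a standalone sketch with a hypothetical `parse_cookie` function, not the package’s grammar, and it does not handle empty cookies such as `[/]` or `[%]`.

```python
import re
from typing import Tuple

PERCENT = re.compile(r"^\[(\d+)%\]$")       # e.g. [50%]
PROGRESS = re.compile(r"^\[(\d+)/(\d+)\]$")  # e.g. [2/4]


def parse_cookie(text: str) -> Tuple[str, int, int]:
    """Return (cookie_type, m, n) for a cookie string."""
    percent = PERCENT.match(text)
    if percent:
        # For a percent cookie the denominator is always 100.
        return ("percent", int(percent.group(1)), 100)
    progress = PROGRESS.match(text)
    if progress:
        return ("progress", int(progress.group(1)), int(progress.group(2)))
    raise ValueError(f"not a recognised cookie: {text!r}")
```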
- The ability to modify the tree was the main reason I wrote this package. Most of the attributes of the tree objects can be modified directly by the user.
- Use the `promote*` and `demote*` methods of the `Heading` objects to change `Heading` levels.
- To rearrange headings, note that a `Heading`’s `children` attribute is a list whose ordering is important: in other words, the tree will be written back to a file in the order each `Heading`’s children are in. So the user can rearrange the headings of the same level by assigning the `children` attribute of their parent to a different order of child headings. It’s up to the user to update the child headings’ `sibling` attributes appropriately.
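One way to keep `sibling` consistent after a reorder is to relink every child to the one that now precedes it. The sketch below uses stand-in objects rather than real `Heading` instances. `reorder_children` is a hypothetical helper, and setting the first child’s `sibling` to `None` is an assumption based on the “if any” wording above.

```python
from types import SimpleNamespace


def reorder_children(parent, new_order) -> None:
    """Assign `new_order` to parent.children and relink each child's sibling."""
    # The new order must contain exactly the same child objects.
    if sorted(map(id, new_order)) != sorted(map(id, parent.children)):
        raise ValueError("new_order must be a permutation of parent.children")
    parent.children = list(new_order)
    previous = None  # assumption: the first child has no preceding sibling
    for child in parent.children:
        child.sibling = previous
        previous = child


# Stand-in headings for demonstration only (not orgmunge objects).
a = SimpleNamespace(name="a", sibling=None)
b = SimpleNamespace(name="b", sibling=a)
c = SimpleNamespace(name="c", sibling=b)
parent = SimpleNamespace(children=[a, b, c])

reorder_children(parent, [c, a, b])
order = [child.name for child in parent.children]
```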
- You can use the `Org` object’s `write` method to write out the tree to a file whose name you supply to the method:

      from orgmunge import Org

      agenda = Org('/path/to/agenda.org')
      # Do something with agenda...
      agenda.write('/path/to/modified_agenda.org')