DocStruct - A Document Structure Parser

A tool to create Document Structure [1] (DS) trees from XHTML websites. It was created as a term project for CSI 5386 (Fall 2009) at the University of Ottawa.

More detailed information on the project can be found in the paper located at http://cloud.github.com/downloads/cfournie/docstruct/paper.pdf

Directories

\module\ - Contains the Python parser tool
\spec\ - Contains example DS trees and the DS XML Schema

References

[1] R. Power, D. Scott, and N. Bouayad-Agha, "Document structure," Comput. Linguist., vol. 29, no. 2, pp. 211-260, 2003. Accessible at http://www.mitpressjournals.org/doi/abs/10.1162/089120103322145315