Order of paragraphs and tables
Prigin opened this issue · 4 comments
Problem
I need to get all paragraphs and tables in the order they appear in the docx file. Is there any way I can do this?
Solution
Maybe a single shared index across paragraph objects and table objects would be enough.
You mean that in the latest version of this gem, Document#paragraphs
returns paragraphs in the wrong order, right?
Could you give us a docx file that reproduces this behavior, if you have one?
The file would help us investigate what is happening.
Thanks
Not exactly. :) Sorry for not being clear. Let's say I have a docx that I want to convert to txt:
I need to know the position of each element (paragraphs and tables). How can I get the elements in the same order they have in the DOCX? Or maybe there is already a method that returns each element's order number in the document. I can't find it :(
I was able to do this as follows. I'm using private vars/methods, but if they open up more APIs in the future, we won't have to.
require 'docx'

doc = Docx::Document.open(file)
# Walk the direct children of <w:body>, which are stored in document order.
doc.instance_variable_get("@doc").xpath('//w:document//w:body').children.each do |c|
  if c.name == 'p' # paragraph
    p = doc.send(:parse_paragraph_from, c)
  elsif c.name == 'tbl' # table
    t = doc.send(:parse_table_from, c)
  else # other types?
  end
end
If you just want the text, you don't need to parse them as paragraphs/tables. You can just read "c.content".
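For anyone who wants to try the ordering idea without a real file, here is a rough standalone sketch. It makes several assumptions: SAMPLE_XML is a made-up stand-in for word/document.xml, the helper names (text_of, table_text, ordered_blocks) are hypothetical, and it uses REXML instead of the gem's Nokogiri document so it runs without the gem installed. The point is only that the direct children of the body element come back in document order, so paragraphs and tables stay interleaved the way they are in the file.

```ruby
require 'rexml/document'

# Hypothetical sample standing in for word/document.xml:
# a paragraph, then a table, then another paragraph.
SAMPLE_XML = <<~XML
  <w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
    <w:body>
      <w:p><w:r><w:t>Intro</w:t></w:r></w:p>
      <w:tbl>
        <w:tr>
          <w:tc><w:p><w:r><w:t>A1</w:t></w:r></w:p></w:tc>
          <w:tc><w:p><w:r><w:t>B1</w:t></w:r></w:p></w:tc>
        </w:tr>
      </w:tbl>
      <w:p><w:r><w:t>Outro</w:t></w:r></w:p>
    </w:body>
  </w:document>
XML

# Concatenate the text of every <w:t> element at or below the given element.
def text_of(element)
  return element.text.to_s if element.name == 't'
  element.elements.map { |child| text_of(child) }.join
end

# Render a table as one line per row, with cells separated by tabs.
def table_text(tbl)
  rows = tbl.elements.select { |e| e.name == 'tr' }
  rows.map do |row|
    cells = row.elements.select { |e| e.name == 'tc' }
    cells.map { |cell| text_of(cell) }.join("\t")
  end.join("\n")
end

# Return [kind, text] pairs for paragraphs and tables in document order.
def ordered_blocks(xml)
  body = REXML::Document.new(xml).root.elements.find { |e| e.name == 'body' }
  blocks = body.elements.map do |child|
    case child.name
    when 'p'   then [:paragraph, text_of(child)]
    when 'tbl' then [:table, table_text(child)]
    end
  end
  blocks.compact # drop other element types such as section properties
end

ordered_blocks(SAMPLE_XML).each_with_index do |(kind, text), index|
  puts "#{index}: #{kind}: #{text.inspect}"
end
```

The index printed for each block is the shared position across paragraphs and tables that the original question asks for. With the gem itself, the same loop shape should apply to the body children from the snippet above.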