Text is lost when Word is converted to PDF!
wshupengpeng opened this issue · 1 comment
wshupengpeng commented
When I converted a Word document to PDF, I found that some of the text was missing.
@Test
public void wordConvertToPdf() throws IOException {
    String filePath = String.format("%s%s", BASE_DIR, "b0181116-8d64-464a-b4db-306acd067af3.docx");
    String outputPdfPath = String.format("%s%s", BASE_DIR, "1.pdf");
    // Missing fonts prevent the PDF from being rendered correctly
    wordToPdf(filePath, outputPdfPath);
}
public static void wordToPdf(String docPath, String pdfPath) {
    try (InputStream doc = new FileInputStream(docPath);
         XWPFDocument document = new XWPFDocument(doc);
         OutputStream out = new FileOutputStream(pdfPath)) {
        setFontType(document);
        PdfOptions options = PdfOptions.create();
        options.fontProvider(CustomizeFontProvider.getInstance());
        PdfConverter.getInstance().convert(document, out, options);
    } catch (Exception e) {
        log.error("wordToPdf failed ", e);
    }
}
private static void setFontType(XWPFDocument xwpfDocument) {
    // Convert the font of the text in the document
    List<XWPFParagraph> paragraphs = xwpfDocument.getParagraphs();
    if (paragraphs != null && paragraphs.size() > 0) {
        for (XWPFParagraph paragraph : paragraphs) {
            List<XWPFRun> runs = paragraph.getRuns();
            if (runs != null && runs.size() > 0) {
                for (XWPFRun run : runs) {
                    if (StringUtils.isEmpty(run.getColor())) {
                        run.setColor("000000");
                    }
                }
            }
        }
    }
    // Convert the fonts inside tables. I don't like the Russian-doll nesting either,
    // but the font really cannot be set without it
    List<XWPFTable> tables = xwpfDocument.getTables();
    for (XWPFTable table : tables) {
        List<XWPFTableRow> rows = table.getRows();
        for (XWPFTableRow row : rows) {
            List<XWPFTableCell> tableCells = row.getTableCells();
            for (XWPFTableCell tableCell : tableCells) {
                List<XWPFParagraph> paragraphs1 = tableCell.getParagraphs();
                for (XWPFParagraph xwpfParagraph : paragraphs1) {
                    List<XWPFRun> runs = xwpfParagraph.getRuns();
                    for (XWPFRun run : runs) {
                        if (StringUtils.isEmpty(run.getColor())) {
                            run.setColor("000000");
                        }
                    }
                }
            }
        }
    }
}
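The comment in the test suspects missing fonts. As a first diagnostic, it may help to check which font families the JVM can actually see on the machine running the conversion. The sketch below uses only the JDK. The class name `FontCheck`, the helper `missingFonts`, and the font names in `main` are hypothetical placeholders, not taken from the attached document. It also assumes that the fonts visible to the JVM are a reasonable proxy for what the font provider can resolve, which may not hold for a custom provider.

```java
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class FontCheck {

    /**
     * Returns the required font family names that do not appear in the
     * available ones. Names are compared case-insensitively.
     */
    static List<String> missingFonts(Iterable<String> available, List<String> required) {
        Set<String> known = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String name : available) {
            known.add(name);
        }
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!known.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical list: replace with the font families used in the .docx
        List<String> required = Arrays.asList("SimSun", "Microsoft YaHei", "Arial");

        // Font families the JVM can see on this machine
        String[] installed = GraphicsEnvironment
                .getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();

        System.out.println("Missing font families: "
                + missingFonts(Arrays.asList(installed), required));
    }
}
```

If a family used by the document shows up as missing, that would be consistent with the suspicion in the test comment, although it does not prove that the converter drops text for that reason.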
Word template:
word.docx
Converted PDF result:
project__20f87ec11bad446a910610cf730eccc8.pdf
angelozerr commented
Any contributions are welcome!