
How to Implement Lossless Word-to-PDF Conversion in Java

Published: 2022-06-09 09:11:21 | Source: Yisu Cloud | Views: 1073 | Author: zzz | Category: Development

This article explains how to implement lossless Word-to-PDF conversion in Java, walking through the process with a working example.

Approach to Word-to-PDF conversion

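The implementation relies mainly on two third-party JAR packages: pdfbox and aspose-words. pdfbox is fully open source and free. The free version of aspose-words adds a watermark to its output and limits how much it will generate. Implementing Word-to-PDF conversion with pdfbox alone is very complex, and little of the original styling survives. So I first used aspose-words to generate a watermarked PDF, then used pdfbox to remove the watermark that aspose-words added, which left a PDF without a watermark.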

Project remote repository

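aspose-words can only be downloaded after a separate repository address has been configured. If you are not sure how to configure it, you can download the JAR directly from the official website and add it to your project.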

<repositories>
  <repository>
   <id>AsposeJavaAPI</id>
   <name>Aspose Java API</name>
   <url>https://repository.aspose.com/repo/</url>
  </repository>
</repositories>

Maven pom.xml dependencies

<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox -->
		<dependency>
			<groupId>org.apache.pdfbox</groupId>
			<artifactId>pdfbox</artifactId>
			<version>3.0.0-RC1</version>
		</dependency>
		<dependency>
			<groupId>com.github.jai-imageio</groupId>
			<artifactId>jai-imageio-jpeg2000</artifactId>
			<version>1.3.0</version>
		</dependency>
		<dependency>
			<groupId>com.aspose</groupId>
			<artifactId>aspose-words</artifactId>
			<version>21.9</version>
			<type>pom</type>
		</dependency>

Core implementation

import com.aspose.words.Document;
import com.aspose.words.SaveFormat;
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.contentstream.operator.Operator;
import org.apache.pdfbox.cos.COSArray;
import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.cos.COSString;
import org.apache.pdfbox.pdfparser.PDFStreamParser;
import org.apache.pdfbox.pdfwriter.ContentStreamWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageTree;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.common.PDStream;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class PDFHelper3 {

    public static void main(String[] args) throws IOException {

        doc2pdf("C:\\Users\\liuya\\Desktop\\word\\帆软报表帮助文档.docx");

    }


    // Replace text content in the PDF
    public static void replaceText(PDPage page, String searchString, String replacement) throws IOException {
        PDFStreamParser parser = new PDFStreamParser(page);
        List<?> tokens = parser.parse();
        for (int j = 0; j < tokens.size(); j++) {
            Object next = tokens.get(j);
            if (next instanceof Operator) {
                Operator op = (Operator) next;
                String pstring = "";
                int prej = 0;
                if (op.getName().equals("Tj")) {
                    COSString previous = (COSString) tokens.get(j - 1);
                    String string = previous.getString();
                    string = string.replaceFirst(searchString, replacement);
                    previous.setValue(string.getBytes());
                } else if (op.getName().equals("TJ")) {
                    COSArray previous = (COSArray) tokens.get(j - 1);
                    for (int k = 0; k < previous.size(); k++) {
                        Object arrElement = previous.getObject(k);
                        if (arrElement instanceof COSString) {
                            COSString cosString = (COSString) arrElement;
                            String string = cosString.getString();

                            if (j == prej) {
                                pstring += string;
                            } else {
                                prej = j;
                                pstring = string;
                            }
                        }
                    }
                    if (searchString.equals(pstring.trim())) {
                        COSString cosString2 = (COSString) previous.getObject(0);
                        cosString2.setValue(replacement.getBytes());
                        int total = previous.size() - 1;
                        for (int k = total; k > 0; k--) {
                            previous.remove(k);
                        }
                    }
                }
            }
        }
        List<PDStream> contents = new ArrayList<>();
        Iterator<PDStream> streams = page.getContentStreams();
        while (streams.hasNext()) {
            PDStream updatedStream = streams.next();
            OutputStream out = updatedStream.createOutputStream(COSName.FLATE_DECODE);
            ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
            tokenWriter.writeTokens(tokens);
            contents.add(updatedStream);
            out.close();
        }
        page.setContents(contents);
    }

    // Remove the image watermark
    public static void removeImage(PDPage page, String cosName) {
        PDResources resources = page.getResources();
        COSDictionary dict1 = resources.getCOSObject();
        resources.getXObjectNames().forEach(e -> {
            if (resources.isImageXObject(e)) {
                COSDictionary dict2 = dict1.getCOSDictionary(COSName.XOBJECT);
                if (e.getName().equals(cosName)) {
                    dict2.removeItem(e);
                }
            }
            page.setResources(new PDResources(dict1));
        });
    }


    // Remove the text watermark
    public static boolean removeWatermark(File file) {
        try {
            // Load the document from the given file
            PDDocument document = Loader.loadPDF(file);
            PDPageTree pages = document.getPages();
            Iterator<PDPage> iter = pages.iterator();
            while (iter.hasNext()) {
                PDPage page = iter.next();
                // Remove the text watermarks
                replaceText(page, "Evaluation Only. Created with Aspose.Words. Copyright 2003-2021 Aspose", "");
                replaceText(page, "Pty Ltd.", "");
                replaceText(page, "Created with an evaluation copy of Aspose.Words. To discover the full", "");
                replaceText(page, "versions of our APIs please visit: https://products.aspose.com/words/", "");
                replaceText(page, "This document was truncated here because it was created in the Evaluation", "");
                // Remove the image watermark
                removeImage(page, "X1");
            }
            document.removePage(document.getNumberOfPages() - 1);
            file.delete();
            document.save(file);
            document.close();
            return true;
        } catch (IOException ex) {
            ex.printStackTrace();
            return false;
        }

    }


    // Convert a doc file to PDF (currently supports at most 21 pages)
    public static void doc2pdf(String wordPath) {
        long old = System.currentTimeMillis();
        try {
            // Create a new PDF file next to the source document
            String pdfPath = wordPath.substring(0, wordPath.lastIndexOf(".")) + ".pdf";
            File file = new File(pdfPath);
            FileOutputStream os = new FileOutputStream(file);
            // wordPath is the Word document to be converted
            Document doc = new Document(wordPath);
            // Supports conversion among DOC, DOCX, OOXML, RTF, HTML, OpenDocument, PDF, EPUB, XPS, and SWF
            doc.save(os, SaveFormat.PDF);
            os.close();
            // Remove the watermark
            removeWatermark(new File(pdfPath));
            // Elapsed conversion time
            long now = System.currentTimeMillis();
            System.out.println("Word to PDF took: " + ((now - old) / 1000.0) + " seconds");
        } catch (Exception e) {
            System.out.println("Word to PDF conversion failed...");
            e.printStackTrace();
        }
    }


}
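
The output path in doc2pdf is derived with wordPath.lastIndexOf("."), which throws when the path contains no dot and picks the wrong dot when only a directory name contains one. The following is a minimal sketch of a more defensive derivation using only the standard library. The class PdfPathSketch and the helper toPdfPath are hypothetical names introduced for illustration and are not part of the original code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class PdfPathSketch {

    // Hypothetical helper: swaps the extension of the file name for ".pdf",
    // looking for the dot in the file name only, not in the directories.
    static String toPdfPath(String wordPath) {
        Path source = Paths.get(wordPath);
        String fileName = source.getFileName().toString();
        int dot = fileName.lastIndexOf('.');
        // dot <= 0 covers "no extension" and names that start with a dot
        String baseName = (dot > 0) ? fileName.substring(0, dot) : fileName;
        // resolveSibling keeps the parent directory of the source path
        return source.resolveSibling(baseName + ".pdf").toString();
    }

    public static void main(String[] args) {
        System.out.println(toPdfPath("help.docx"));
        System.out.println(toPdfPath(Paths.get("reports.v2", "help").toString()));
    }
}
```

With this sketch, a source file without an extension still gets a ".pdf" name in the same directory. Root paths, for which getFileName() returns null, are not handled here.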

Results

For a doc file containing text and images, 21 pages in total, the Word-to-PDF conversion took 4.398 seconds.
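
The elapsed time comes from two System.currentTimeMillis() readings in doc2pdf. As a minimal sketch, the same measurement can be taken with System.nanoTime(), which is a monotonic clock and is not affected by wall-clock adjustments. The class ElapsedTimeSketch and the helper secondsTaken are hypothetical names, and the Runnable stands in for the conversion call.

```java
final class ElapsedTimeSketch {

    // Hypothetical helper: runs the task once and returns the elapsed seconds.
    static double secondsTaken(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000_000.0;
    }

    public static void main(String[] args) {
        // Placeholder workload; in the article's setting this would wrap doc2pdf(...)
        double seconds = secondsTaken(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        });
        System.out.println("Elapsed: " + seconds + " seconds");
    }
}
```

A single run like this includes JVM warm-up and class loading, so the figure is best read as a rough indication rather than a benchmark.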

Original Word styling: [screenshots]

Converted PDF result: [screenshots]

Comparing the two, the styling of the converted PDF is essentially unchanged from the original Word document.
