How to Compress a PDF File in Java

Java is a general-purpose, concurrent, class-based, object-oriented programming language designed to have as few implementation dependencies as possible. Read on to learn how to process PDF documents and compress PDF files in Java.

A PDF file can be compressed programmatically from Java, and there are several ways to do it. The examples in this article use three libraries: Aspose.PDF for Java, Spire.PDF for Java and PDFTron SDK. Download the JAR of the library you choose and include it in your classpath before running the corresponding example.

A PDF file is a digital format that preserves a document’s original text, graphical elements and layout. However, the documents you create for customers or colleagues are often larger than they need to be. You can compress PDF files in Java to reduce their size without changing their layout.
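
Whichever library you use, it helps to check how much space a compression step actually saved. The following is a minimal sketch using only the JDK; the class name SizeReport and its method are hypothetical helpers introduced here for illustration and are not part of any PDF library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class SizeReport {

    /** Returns the size reduction as a percentage of the original size. */
    static double reductionPercent(long originalBytes, long compressedBytes) {
        if (originalBytes <= 0) {
            throw new IllegalArgumentException("original size must be positive");
        }
        return 100.0 * (originalBytes - compressedBytes) / originalBytes;
    }

    public static void main(String[] args) throws IOException {
        // Expects two file paths: the original PDF and the compressed PDF
        if (args.length != 2) {
            System.err.println("usage: SizeReport <original.pdf> <compressed.pdf>");
            return;
        }
        long before = Files.size(Paths.get(args[0]));
        long after = Files.size(Paths.get(args[1]));
        System.out.printf("%d -> %d bytes (%.1f%% smaller)%n",
                before, after, reductionPercent(before, after));
    }
}
```

A negative percentage means the "compressed" file is larger than the original, which can happen when a file is already well optimized.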

Compressing a file matters when you want to share or store data efficiently. Many tools are available for this purpose, but not all of them do a good job. This article shows how to compress PDF files using the Java programming language.

Compress or Optimize PDF Files with the Same Quality using Java

Organizations use PDF files for secure and organized exchange of information. However, PDF files sometimes become very large because of embedded content such as images, videos and drawings. You can optimize or compress such PDF files, in many cases without a noticeable loss of quality. The following sections explore PDF size compression and optimization scenarios that you can incorporate in your Java applications.

PDF Size Optimization and Compression API – Installation

You can use the Aspose.PDF for Java API to optimize or compress large PDF files while keeping the same quality. You can download the JAR files from the Aspose downloads page or add the library to your project through its Maven configuration.

Optimize PDF Documents for the Web using Java

PDF documents can be optimized for use in web pages. This optimization helps display the first page of a PDF document as quickly as possible. You can produce the optimized PDF file by following the steps below:

  1. Open source PDF file
  2. Call optimize method for PDF Optimization
  3. Save the output PDF file

The code snippet below is an example of how to optimize PDF documents for the web in your Java environment:

import com.aspose.pdf.Document;

// Open document
Document pdfDocument = new Document("Original.pdf");
// Optimize for web
pdfDocument.optimize();
// Save output document
pdfDocument.save("Optimized_output.pdf");

Compress or Optimize the Size of PDF containing Images using Java

This section covers PDF files that contain many images and are therefore very large. Consider, for instance, a PDF file with drawings of different airplane models in which every part, minor or major, is documented with images or pictures. Many professional documents likewise contain images as their main content. In such scenarios, we can compress the PDF files with the following approaches:

Shrinking, Compressing and Resizing All Images using Java

You can minimize the size of a PDF file containing many images by shrinking, compressing and resizing the images. The improvement can be noticeable because most of the file size is taken up by the pictures that we now intend to shrink. Follow the steps below to shrink, compress and resize the images in a PDF file:

  1. Load input PDF file
  2. Initialize OptimizationOptions object
  3. Set Image Quality and Resolution
  4. Call optimizeResources method
  5. Save the output PDF document

The code snippet below shows how to shrink or compress images in order to reduce and minimize the PDF file size using Java:

import com.aspose.pdf.Document;
import com.aspose.pdf.optimization.OptimizationOptions;

// Load input document (dataDir is the folder that holds the input file)
Document doc = new Document(dataDir + "Test.pdf");
// Initialize OptimizationOptions object
OptimizationOptions opt = new OptimizationOptions();
// Enable image compression
opt.getImageCompressionOptions().setCompressImages(true);
// Set the quality and maximum resolution of images in the PDF file
opt.getImageCompressionOptions().setImageQuality(10);
opt.getImageCompressionOptions().setMaxResolution(150);
// Resize images that exceed the maximum resolution
opt.getImageCompressionOptions().setResizeImages(true);
doc.optimizeResources(opt);
// Save the updated file
doc.save(dataDir + "compressingPDFWithImages_out.pdf");
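
To get a feel for what an image quality setting trades away, here is a rough sketch that uses only the JDK's javax.imageio package to encode the same image as JPEG at two quality levels. It is not how Aspose.PDF works internally; the class JpegQualityDemo, its methods and the synthetic test image are assumptions made for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.MemoryCacheImageOutputStream;

class JpegQualityDemo {

    /** Encodes an RGB image as JPEG with a quality between 0.0 (smallest) and 1.0 (best). */
    static byte[] encodeJpeg(BufferedImage image, float quality) {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionQuality(quality);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (MemoryCacheImageOutputStream out = new MemoryCacheImageOutputStream(bytes)) {
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), param);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    /** Builds a synthetic test image: two gradients plus deterministic noise. */
    static BufferedImage sampleImage(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Random random = new Random(42); // fixed seed keeps the demo repeatable
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int r = (x * 255) / Math.max(1, width - 1);
                int g = (y * 255) / Math.max(1, height - 1);
                int b = random.nextInt(256);
                image.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return image;
    }

    public static void main(String[] args) {
        BufferedImage image = sampleImage(256, 256);
        System.out.println("quality 0.9: " + encodeJpeg(image, 0.9f).length + " bytes");
        System.out.println("quality 0.1: " + encodeJpeg(image, 0.1f).length + " bytes");
    }
}
```

The lower quality value produces a smaller byte array at the cost of visible artifacts, which is why a very low quality setting is best reserved for documents where image fidelity is not important.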

Removing Embedded Fonts, Unused Streams and Linking Duplicate Streams using Java

When you need to reduce PDF file size, every byte matters. Embedded fonts offer more than one way to save space. For example, you can either unembed all the fonts or keep only the subset of font characters that are actually used in the PDF file. Subsetting is a partial unembedding that still helps minimize the file size. Moreover, you can remove unused streams or link duplicate streams to save further space. Together these optimizations can reduce the file size considerably. Follow the steps below to optimize and reduce PDF file size:

  1. Load input PDF document
  2. Initialize OptimizationOptions class object
  3. Either unembed all fonts or the subset of fonts
  4. Link duplicate streams
  5. Remove unused streams
  6. Call optimizeResources method
  7. Save the output PDF document

The following code shows how to apply these optimizations to reduce the size of a PDF document:

import com.aspose.pdf.Document;
import com.aspose.pdf.optimization.OptimizationOptions;

// Load input document (dataDir is the folder that holds the input file)
Document doc = new Document(dataDir + "Test.pdf");
OptimizationOptions opt = new OptimizationOptions();
// Either unembed all fonts in the PDF
opt.setUnembedFonts(true);
// OR keep embedded fonts only for the characters that are used
// opt.setSubsetFonts(true);
// Link duplicate streams (method name spelled as in the original example)
opt.setLinkDuplcateStreams(true);
// Remove unused streams
opt.setRemoveUnusedStreams(true);
// Remove unused objects
opt.setRemoveUnusedObjects(true);
doc.optimizeResources(opt);
// Save the updated file
doc.save(dataDir + "compressingPDF.pdf");
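
The idea behind linking duplicate streams can be sketched with plain JDK classes: hash the bytes of every stream and let later copies point to the first one with the same content. This is only a conceptual illustration, not the library's actual algorithm; the class DuplicateStreamDemo and its method are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DuplicateStreamDemo {

    /**
     * For each stream, returns the index of the first stream with the same content.
     * A stream that maps to its own index is kept; the others can reference the kept copy.
     * Content is compared through its SHA-256 hash.
     */
    static int[] linkDuplicates(List<byte[]> streams) {
        MessageDigest sha256;
        try {
            sha256 = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
        Map<String, Integer> firstIndexByHash = new HashMap<>();
        int[] target = new int[streams.size()];
        for (int i = 0; i < streams.size(); i++) {
            String hash = Base64.getEncoder().encodeToString(sha256.digest(streams.get(i)));
            Integer first = firstIndexByHash.putIfAbsent(hash, i);
            target[i] = (first == null) ? i : first;
        }
        return target;
    }

    public static void main(String[] args) {
        byte[] logo = "logo-image-bytes".getBytes(StandardCharsets.UTF_8);
        byte[] chart = "chart-image-bytes".getBytes(StandardCharsets.UTF_8);
        // The same logo appears twice, for example once per page
        List<byte[]> streams = Arrays.asList(logo, chart, logo.clone());
        System.out.println(Arrays.toString(linkDuplicates(streams))); // prints [0, 1, 0]
    }
}
```

In the printed result, the third stream maps to index 0, so only one copy of the logo would need to be stored.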

So far we have mainly discussed optimization approaches for PDF files with images. Now let us look at some more ways to optimize a PDF.

Compress or Reduce PDF Document Size using Java

PDF files often contain annotations, editable form fields and color artifacts that collectively take up space. Let us explore the following procedures to compress PDF file size.

Removing or Flattening Annotations to Reduce Size with Java

PDF files can contain a lot of annotations, such as watermarks, comments and shapes. You can remove annotations that are no longer required, or flatten them if no further changes are needed. Follow the steps below to remove or flatten annotations and optimize the PDF file size:

  1. Open source PDF document
  2. Iterate through each page
  3. Flatten or delete annotations
  4. Save the output PDF document

The code snippet below is an example of how to remove or flatten annotations in PDF documents using Java:

import com.aspose.pdf.Annotation;
import com.aspose.pdf.Document;
import com.aspose.pdf.Page;

// Open document (dataDir is the folder that holds the input file)
Document pdfDocument = new Document(dataDir + "OptimizeDocument.pdf");
// Iterate through each page and annotation
for (Page page : pdfDocument.getPages()) {
    for (Annotation annotation : page.getAnnotations()) {
        // Either flatten the annotation
        annotation.flatten();
        // OR delete the annotation
        // page.getAnnotations().delete(annotation);
    }
}
// Save optimized PDF document
pdfDocument.save(dataDir + "OptimizeDocument_out.pdf");

Removing Form Fields to Minimize PDF File Size with Java

Fillable PDF forms are common when data has to be collected on a large scale. Once the data has been submitted, the fillable form fields can be flattened, which removes them as interactive fields and helps minimize the PDF file size. Follow the steps below to flatten form fields:

  1. Load input PDF document
  2. Check for form fields in PDF document
  3. Iterate through each field and flatten it
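  4. Save the updated compressed PDF file

The code snippet below shows how to flatten form fields in a PDF document using Java:

import com.aspose.pdf.Document;
import com.aspose.pdf.Field;

// Load source PDF form (dataDir is the folder that holds the input file)
Document doc = new Document(dataDir + "input.pdf");
// Flatten form fields, if the document has any
if (doc.getForm().getFields().length > 0) {
    for (Field item : doc.getForm().getFields()) {
        item.flatten();
    }
}
// Save the updated document
doc.save(dataDir + "FlattenForms_out.pdf");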

Convert RGB Color Space to Grayscale for PDF Compression and Optimization using Java

Most PDF files contain textual content that can be represented just as well in the grayscale color space. Moreover, when the priority is to save every byte, even the images can be converted to grayscale, because the focus is on archiving the data rather than on its appearance. Follow the steps below to compress and optimize PDF file size by converting the RGB color space to grayscale:

  1. Access source PDF document
  2. Initialize RgbToDeviceGrayConversionStrategy instance
  3. Convert color space of each page to grayscale
  4. Save output optimized PDF file

The following code snippet shows how to compress and optimize PDF size by changing the color space in a Java environment:

import com.aspose.pdf.Document;
import com.aspose.pdf.Page;
import com.aspose.pdf.RgbToDeviceGrayConversionStrategy;

// Load input PDF document
Document document = new Document("input.pdf");
// Initialize RgbToDeviceGrayConversionStrategy instance
RgbToDeviceGrayConversionStrategy strategy = new RgbToDeviceGrayConversionStrategy();
// Page numbering starts at 1
for (int idxPage = 1; idxPage <= document.getPages().size(); idxPage++) {
    Page page = document.getPages().get_Item(idxPage);
    // Convert color space of each page to grayscale
    strategy.convert(page);
}
// Save output PDF document
document.save("output.pdf");
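
Why grayscale saves space can be shown with a few lines of plain Java: an RGB pixel needs three color components, a gray pixel only one. The sketch below is an illustration rather than the conversion the library performs; the class GrayscaleDemo is hypothetical and the luma weights (ITU-R BT.601) are an assumption.

```java
class GrayscaleDemo {

    /** Converts packed 0xRRGGBB pixels to one gray byte per pixel. */
    static byte[] toGray(int[] rgbPixels) {
        byte[] gray = new byte[rgbPixels.length];
        for (int i = 0; i < rgbPixels.length; i++) {
            int r = (rgbPixels[i] >> 16) & 0xFF;
            int g = (rgbPixels[i] >> 8) & 0xFF;
            int b = rgbPixels[i] & 0xFF;
            // Assumed weights (ITU-R BT.601 luma); a PDF library may use different ones
            gray[i] = (byte) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
        }
        return gray;
    }

    public static void main(String[] args) {
        int[] pixels = {0xFF0000, 0x00FF00, 0x0000FF, 0xFFFFFF};
        byte[] gray = toGray(pixels);
        // Uncompressed, RGB needs 3 bytes per pixel and gray needs 1
        System.out.println("RGB bytes:  " + pixels.length * 3);
        System.out.println("Gray bytes: " + gray.length);
    }
}
```

Before any further compression, the gray data is one third the size of the RGB data, at the cost of losing all color information.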

Compress PDF document in Java

This section demonstrates how to use Spire.PDF for Java to compress a PDF document in two ways: by compressing the PDF content and by compressing the images in the document.

Compressing content

import com.spire.pdf.*;

public class CompressPDF {

    public static void main(String[] args) {

        String inputFile = "Sample.pdf";
        String outputFile = "output/CompressPDFcontent.pdf";

        // Load the PDF document
        PdfDocument document = new PdfDocument();
        document.loadFromFile(inputFile);

        // Disable incremental update
        document.getFileInfo().setIncrementalUpdate(false);

        // Use the strongest compression level for the content
        document.setCompressionLevel(PdfCompressionLevel.Best);

        // Save the compressed document
        document.saveToFile(outputFile, FileFormat.PDF);
        document.close();
    }
}
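
PDF content streams are commonly stored with the Flate filter, the same deflate algorithm the JDK exposes in java.util.zip. The following sketch shows the effect of the best compression level on repetitive page content. It uses only the JDK and makes no claim about what Spire.PDF does internally; the class FlateDemo and the sample content are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class FlateDemo {

    /** Compresses data with deflate at the given level (0 to 9). */
    static byte[] deflate(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Restores the original bytes from deflate-compressed data. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input, stop instead of looping forever
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Repetitive text-drawing operators, similar in spirit to a page content stream
        StringBuilder content = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            content.append("BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n");
        }
        byte[] raw = content.toString().getBytes(StandardCharsets.US_ASCII);
        byte[] packed = deflate(raw, Deflater.BEST_COMPRESSION);
        System.out.println(raw.length + " -> " + packed.length + " bytes");
    }
}
```

Deflate is lossless, so inflate returns exactly the original bytes. This is why content compression, unlike image quality reduction, does not change how the document looks.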

[Screenshot: result after compressing the PDF content]

Compressing image

import com.spire.pdf.*;
import com.spire.pdf.exporting.PdfImageInfo;
import com.spire.pdf.graphics.PdfBitmap;

public class CompressPDF {

    public static void main(String[] args) {

        String inputFile = "Sample.pdf";
        String outputFile = "output/CompressPDFImage.pdf";

        // Load the PDF document
        PdfDocument document = new PdfDocument();
        document.loadFromFile(inputFile);

        // Disable incremental update
        document.getFileInfo().setIncrementalUpdate(false);

        // Replace every image on every page with a lower-quality copy
        for (int i = 0; i < document.getPages().getCount(); i++) {

            PdfPageBase page = document.getPages().get(i);
            PdfImageInfo[] images = page.getImagesInfo();
            if (images != null && images.length > 0) {
                for (int j = 0; j < images.length; j++) {
                    PdfImageInfo image = images[j];
                    PdfBitmap bp = new PdfBitmap(image.getImage());
                    bp.setQuality(20);
                    page.replaceImage(j, bp);
                }
            }
        }

        // Save the compressed document
        document.saveToFile(outputFile, FileFormat.PDF);
        document.close();
    }
}

[Screenshot: result after compressing the PDF images]

Compress & optimize PDF files in Java

The following sample Java code uses PDFTron SDK to reduce PDF file size by removing redundant information and compressing data streams and images.

To run this sample, get started with a free trial of PDFTron SDK.

//---------------------------------------------------------------------------------------
// Copyright (c) 2001-2021 by PDFTron Systems Inc. All Rights Reserved.
// Consult legal.txt regarding legal and license information.
//---------------------------------------------------------------------------------------

import com.pdftron.pdf.*;
import com.pdftron.sdf.SDFDoc;

public class OptimizerTest {
    //---------------------------------------------------------------------------------------
    // The following sample illustrates how to reduce PDF file size using 'pdftron.PDF.Optimizer'.
    // The sample also shows how to simplify and optimize PDF documents for viewing on mobile devices
    // and on the Web using 'pdftron.PDF.Flattener'.
    //
    // @note Both 'Optimizer' and 'Flattener' are separately licensable add-on options to the core PDFNet license.
    //
    // ----
    //
    // 'pdftron.PDF.Optimizer' can be used to optimize PDF documents by reducing the file size, removing
    // redundant information, and compressing data streams using the latest in image compression technology.
    //
    // PDF Optimizer can compress and shrink PDF file size with the following operations:
    // - Remove duplicated fonts, images, ICC profiles, and any other data stream.
    // - Optionally convert high-quality or print-ready PDF files to small, efficient and web-ready PDF.
    // - Optionally down-sample large images to a given resolution.
    // - Optionally compress or recompress PDF images using JBIG2 and JPEG2000 compression formats.
    // - Compress uncompressed streams and remove unused PDF objects.
    //
    // 'pdftron.PDF.Flattener' can be used to speed-up PDF rendering on mobile devices and on the Web by
    // simplifying page content (e.g. flattening complex graphics into images) while maintaining vector text
    // whenever possible.
    //
    // Flattener can also be used to simplify process of writing custom converters from PDF to other formats.
    // In this case, Flattener can be used as first step in the conversion pipeline to reduce any PDF to a
    // very simple representation (e.g. vector text on top of a background image).
    //---------------------------------------------------------------------------------------
    public static void main(String[] args) {
        String input_path = "../../TestFiles/";
        String output_path = "../../TestFiles/Output/";
        String input_filename = "newsletter.pdf";
        String input_filename2 = "newsletter_opt1.pdf";
        String input_filename3 = "newsletter_opt2.pdf";
        String input_filename4 = "newsletter_opt3.pdf";
        String input_filename5 = "newsletter_SaveViewerOptimized.pdf";

        PDFNet.initialize(PDFTronLicense.Key());

        //--------------------------------------------------------------------------------
        // Example 1) Optimize a PDF.
        try {
            PDFDoc doc = new PDFDoc(input_path + input_filename);
            doc.initSecurityHandler();
            Optimizer.optimize(doc);
            doc.save(output_path + input_filename2, SDFDoc.SaveMode.LINEARIZED, null);
            // output PDF doc
            doc.close();
        } catch (Exception e) {
            e.printStackTrace();
            return;
        }

        //--------------------------------------------------------------------------------
        // Example 2) Reduce image quality and use jpeg compression for
        // non monochrome images.
        try {
            PDFDoc doc = new PDFDoc(input_path + input_filename);
            doc.initSecurityHandler();

            Optimizer.ImageSettings image_settings = new Optimizer.ImageSettings();

            // low quality jpeg compression
            image_settings.setCompressionMode(Optimizer.ImageSettings.e_jpeg);
            image_settings.setQuality(1);

            // Set the output dpi to be standard screen resolution
            image_settings.setImageDPI(144, 96);

            // this option will recompress images not compressed with
            // jpeg compression and use the result if the new image
            // is smaller.
            image_settings.forceRecompression(true);


            // this option is not commonly used since it can
            // potentially lead to larger files. It should be enabled
            // only if the output compression specified should be applied
            // to every image of a given type regardless of the output image size
            //image_settings.forceChanges(true);

            Optimizer.OptimizerSettings opt_settings = new Optimizer.OptimizerSettings();
            opt_settings.setColorImageSettings(image_settings);
            opt_settings.setGrayscaleImageSettings(image_settings);


            Optimizer.optimize(doc, opt_settings);

            doc.save(output_path + input_filename3, SDFDoc.SaveMode.LINEARIZED, null);
            // output PDF doc
            doc.close();
        } catch (Exception e) {
            e.printStackTrace();
            return;
        }

        //--------------------------------------------------------------------------------
        // Example 3) Use monochrome image settings and default settings
        // for color and grayscale images.
        try {
            PDFDoc doc = new PDFDoc(input_path + input_filename);
            doc.initSecurityHandler();

            Optimizer.MonoImageSettings mono_image_settings = new Optimizer.MonoImageSettings();
            mono_image_settings.setCompressionMode(Optimizer.MonoImageSettings.e_jbig2);
            mono_image_settings.forceRecompression(true);
            Optimizer.OptimizerSettings opt_settings = new Optimizer.OptimizerSettings();
            opt_settings.setMonoImageSettings(mono_image_settings);

            Optimizer.optimize(doc, opt_settings);

            doc.save(output_path + input_filename4, SDFDoc.SaveMode.LINEARIZED, null);
            // output PDF doc
            doc.close();
        } catch (Exception e) {
            e.printStackTrace();
            return;
        }

        // ----------------------------------------------------------------------
        // Example 4) Use Flattener to simplify content in this document
        // using default settings
        try {
            PDFDoc doc = new PDFDoc(input_path + "TigerText.pdf");
            doc.initSecurityHandler();

            Flattener fl = new Flattener();

            // The following lines can increase the resolution of background
            // images.
            //fl.setDPI(300);
            //fl.setMaximumImagePixels(5000000);

            // This line can be used to output Flate compressed background
            // images rather than DCTDecode compressed images which is the default
            //fl.setPreferJPG(false);

            // In order to adjust thresholds for when text is Flattened
            // the following function can be used.
            //fl.setThreshold(Flattener.e_keep_most);

            // We use e_fast option here since it is usually preferable
            // to avoid Flattening simple pages in terms of size and
            // rendering speed. If the desire is to simplify the
            // document for processing such that it contains only text and
            // a background image e_simple should be used instead.
            fl.Process(doc, Flattener.e_fast);

            doc.save(output_path + "TigerText_flatten.pdf", SDFDoc.SaveMode.LINEARIZED, null);
            // output PDF doc
            doc.close();
        } catch (Exception e) {
            e.printStackTrace();
            return;
        }
        
        // ----------------------------------------------------------------------
        // Example 5) Optimize a PDF for viewing using SaveViewerOptimized.
        try {
            PDFDoc doc = new PDFDoc(input_path + input_filename);
            doc.initSecurityHandler();

            ViewerOptimizedOptions opts = new ViewerOptimizedOptions();

            // set the maximum dimension (width or height) that thumbnails will have.
            opts.setThumbnailSize(1500);

            // set thumbnail rendering threshold. A number from 0 (include all thumbnails) to 100 (include only the first thumbnail) 
            // representing the complexity at which SaveViewerOptimized would include the thumbnail. 
            // By default it only produces thumbnails on the first and complex pages. 
            // The following line will produce thumbnails on every page.
            // opts.setThumbnailRenderingThreshold(0); 

            doc.saveViewerOptimized(output_path + input_filename5 , opts);
            // output PDF doc
            doc.close();
        } catch (Exception e) {
            e.printStackTrace();
            return;
        }
        PDFNet.terminate();
    }
}
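
Resolution settings such as setImageDPI(144, 96) in the sample above, or setMaxResolution(150) in the earlier image example, come down to simple arithmetic. The sketch below assumes the first value is the maximum resolution an image may have before it is resampled and the second is the resolution it is resampled to. That reading, the class DownsampleMath and its methods are assumptions made for illustration. The one fixed fact is that PDF measures page size in points, with 72 points per inch by default.

```java
class DownsampleMath {

    /** Default PDF user space: 72 points per inch. */
    static final double POINTS_PER_INCH = 72.0;

    /** Resolution of an image that is `pixels` wide and drawn `displayedPoints` wide on the page. */
    static double effectiveDpi(int pixels, double displayedPoints) {
        return pixels * POINTS_PER_INCH / displayedPoints;
    }

    /**
     * Returns the pixel width after downsampling: unchanged if the image is at or below
     * maxDpi, otherwise the width that yields targetDpi at the same displayed size.
     */
    static int targetPixels(int pixels, double displayedPoints, double maxDpi, double targetDpi) {
        if (effectiveDpi(pixels, displayedPoints) <= maxDpi) {
            return pixels;
        }
        return (int) Math.round(displayedPoints / POINTS_PER_INCH * targetDpi);
    }

    public static void main(String[] args) {
        // A 3000-pixel-wide scan drawn 360 points (5 inches) wide
        System.out.println(effectiveDpi(3000, 360.0));            // prints 600.0
        System.out.println(targetPixels(3000, 360.0, 144, 96));   // prints 480
    }
}
```

Going from 3000 to 480 pixels in width, with the height scaled by the same factor, leaves roughly 2.6% of the original pixel count, which is where most of the savings in image-heavy PDFs come from.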

Conclusion

Large PDF files can be reduced in size from Java code in several ways: optimizing for the web, compressing and resizing images, unembedding or subsetting fonts, linking duplicate streams and removing unused ones, flattening annotations and form fields, and converting color content to grayscale. The examples above show how to do this with Aspose.PDF for Java, Spire.PDF for Java and PDFTron SDK. To try them out, download any sample PDF and run the example that matches your documents.

You can also combine multiple PDF files, or all the PDF files in a particular folder on your computer, into a single PDF file to keep related documents together.
