PDF and Digital Signatures

By Dr. Bernd Wild, intarsys consulting GmbH

PDF is by far the document format most often used in conjunction with digital signatures. In 1999, Adobe introduced version 1.3, which allowed for embedding digital signatures directly within PDF documents. At that time, it was common practice when signing a file to store the signature itself in a signature container in a separate signature file with the same name as the file to be signed but with a special extension such as "pkcs7".

Although modern document formats such as Office Open XML (OOXML) or the OpenDocument Format (ODF) now also allow embedding of digital signatures, this feature is hardly ever used with these formats in everyday practice. If a document is to be signed, it is almost always signed as a PDF. Why is this so?

In addition to the format's early support of embedded signatures already discussed, PDF has always stood for "electronic paper", i.e. it is essentially a static document format for the visualization of textual and graphical content, even though more and more dynamic properties have been added in the course of its development. Interactive forms, support for JavaScript actions and annotations have increasingly turned the originally static format into a flexible document format that can be used as an information carrier in complex workflows.

It is especially in document workflows that the digital signature plays a central role. Release, acceptance and approval processes up to the signing of contracts, which until recently were still done on paper, are increasingly being transferred to the digital world using digitally signed PDF documents.

Digital Signatures and Standards

Digital signatures, or more correctly PKI-based electronic signatures, were originally specified mainly in RFCs of the IETF and in the PKCS documents published by RSA Laboratories, all independently of the document format or the data to be signed.

Integration into the PDF structure was accomplished by simply attaching the external signature container to the PDF structure via a special signature dictionary. This meant that no special PDF signature format had to be defined, since the signature properties such as certificate chains, signature attributes, algorithms used, and validation information were part of a container that was opaque to PDF.

In contrast to other document formats, however, PDF supported the visualization of digital signatures using graphic and textual elements from the very beginning. A signed PDF could therefore closely reproduce the appearance of a hand-signed paper document.

How the integration is to be carried out, which PDF objects are involved and what exactly the scope of the signed area in the PDF is has been part of the PDF specification since version 1.3. Currently, this is covered in clause 12.8 of ISO 32000-1 and ISO 32000-2, among others.
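The scope of the signed area can be illustrated with a small sketch. In the signature dictionary, the /ByteRange entry lists the byte ranges that are hashed, which normally means the whole file except the value of the /Contents entry that holds the signature container. The following Python sketch reads such an entry and computes a digest over the covered bytes. It is an illustration under assumptions, not a validator: the helper names are hypothetical, the regular expression expects a plainly written array of exactly four integers, SHA-256 is assumed as the digest algorithm, and the toy file built here is not a complete PDF.

```python
import hashlib
import re

# Assumption: a literal /ByteRange array with exactly four integers.
BYTE_RANGE_RE = re.compile(
    rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]"
)


def signed_byte_ranges(pdf: bytes) -> list[tuple[int, int]]:
    """Return the (offset, length) pairs of the first /ByteRange found."""
    match = BYTE_RANGE_RE.search(pdf)
    if match is None:
        raise ValueError("no /ByteRange entry found")
    a, b, c, d = (int(group) for group in match.groups())
    return [(a, b), (c, d)]


def signed_digest(pdf: bytes) -> str:
    """Hash exactly the bytes covered by the byte ranges (SHA-256 assumed)."""
    digest = hashlib.sha256()
    for offset, length in signed_byte_ranges(pdf):
        digest.update(pdf[offset:offset + length])
    return digest.hexdigest()


def build_toy_pdf() -> bytes:
    """Build a tiny, NOT fully valid PDF-like byte string for illustration."""
    placeholder = b"<" + b"0" * 16 + b">"  # stands in for the /Contents value
    tail = b" >>\n%%EOF\n"

    def head(a: int, b: int, c: int, d: int) -> bytes:
        # Zero-padded numbers keep the head length independent of the values.
        return (
            f"%PDF-1.7\n<< /Type /Sig /ByteRange "
            f"[{a:010d} {b:010d} {c:010d} {d:010d}] /Contents "
        ).encode("ascii")

    head_len = len(head(0, 0, 0, 0))
    gap_end = head_len + len(placeholder)
    return head(0, head_len, gap_end, len(tail)) + placeholder + tail


toy = build_toy_pdf()
print(signed_byte_ranges(toy))
print(signed_digest(toy))
```

Because the /Contents value lies in the gap between the two ranges, writing the signature container into it afterwards does not change the digest, whereas changing any other byte of the file does.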

The corresponding ETSI and EN standards EN 319 122 and EN 319 132 are now used as standards for the electronic signature itself, with their different variants CAdES (CMS Advanced Electronic Signatures) for general data and XAdES (XML Advanced Electronic Signatures) for XML data.

The bridge to the integration of these signature standards in PDF is provided by the EN 319 142 PAdES (PDF Advanced Electronic Signatures) standard. In addition to the four basic profiles in EN 319 142-1, EN 319 142-2 specifies three extended profiles, some of which only relate to XML content in PDF.

ISO 32000-2 also explicitly refers to this framework for signature structures, including extension by ETSI TS 119 142-3, which deals with document time-stamp digital signatures, also known as PAdES-DTS.

Increasing Complexity

These numerous profiles are a result of diverse requirements for signatures in terms of evidential value, long-term verifiability and/or renewability. Thus, it is now possible to create a signature container that contains not only the actual signature but also all the certificates involved, together with their verification information. Such a signature can be reliably validated even without online access to the revocation information of the respective certificate issuers.

To anchor the signature structure in the PDF, additional objects such as the DSS dictionary or the VRI dictionary allow the various structures to be linked to one another more effectively. At the same time, the variety of profiles and their integration into PDF also pose a major challenge for software manufacturers developing PDF-based signature applications. Compared to the beginnings in PDF 1.3, achieving standard conformity and the interchangeability of appropriately signed PDFs has become anything but trivial in the age of ISO 32000-2 and the ETSI standards.

The Crux with the Workflows

Adobe recognized early on the potential of PDF and digital signatures for realizing fully digital business processes. However, the dogma that any change to the PDF document after a digital signature has been applied would break the signature conflicted with the desire to enable multiple or serial signatures.

Likewise, allowing certain form fields to be changed after a digital signature has been applied would not have been feasible without relaxing the strict requirement of immutability.

Fortunately, PDF has a powerful change mechanism with revisions that allow incremental changes to the document by attaching a new revision to the end of the PDF document.
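The revision mechanism can be sketched in a few lines of Python. Each incremental update leaves the existing bytes untouched and ends with its own %%EOF marker, so the offsets of these markers give a first approximation of where each revision ends. The function names below are hypothetical, and the scan is a heuristic under assumptions: it would be misled by a %%EOF that happens to occur inside stream data, and in linearized files not every %%EOF marks a revision. The toy update also omits the cross-reference section and trailer that a real update carries.

```python
EOF_MARKER = b"%%EOF"


def revision_ends(pdf: bytes) -> list[int]:
    """Return the offset just past each %%EOF marker and its end-of-line."""
    ends = []
    pos = 0
    while True:
        idx = pdf.find(EOF_MARKER, pos)
        if idx == -1:
            break
        end = idx + len(EOF_MARKER)
        # Include one trailing end-of-line sequence, if present.
        if pdf[end:end + 2] == b"\r\n":
            end += 2
        elif pdf[end:end + 1] in (b"\r", b"\n"):
            end += 1
        ends.append(end)
        pos = end
    return ends


def append_revision(pdf: bytes, update_body: bytes) -> bytes:
    """Append a toy incremental update; existing bytes are not modified."""
    return pdf + update_body + EOF_MARKER + b"\n"


base = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n%%EOF\n"
revised = append_revision(
    base, b"1 0 obj\n<< /Type /Catalog /Lang (en) >>\nendobj\n"
)
print(revision_ends(revised))
```

Since the first revision is still present byte for byte at the start of the revised file, a signature computed over it remains mathematically intact even though the document as displayed may have changed.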

The difficulty now lies in validating such PDF documents: the changes made after signing must be interpreted correctly before the final "OK" can be given for a validly signed PDF. Together with the variety of possible signature profiles, the validation of permitted modifications to signed PDF documents is the biggest challenge for developers of PDF application software.
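A first step in such a validation can be sketched as follows: for every signature, determine where its signed byte range ends and how many bytes were appended afterwards. This Python sketch uses hypothetical names and assumes a plainly written /ByteRange array of four integers. It only locates the unsigned tail; deciding whether the changes in that tail are permitted is the hard part described above and is not attempted here.

```python
import re

# Assumption: literal /ByteRange arrays with exactly four integers each.
BYTE_RANGE_RE = re.compile(
    rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]"
)


def signature_coverage(pdf: bytes) -> list[dict]:
    """Report, per /ByteRange entry, how much of the file it covers."""
    report = []
    for match in BYTE_RANGE_RE.finditer(pdf):
        start, _, second_offset, second_length = map(int, match.groups())
        signed_end = second_offset + second_length
        report.append({
            "signed_end": signed_end,
            "covers_whole_file": start == 0 and signed_end == len(pdf),
            "appended_bytes": len(pdf) - signed_end,
        })
    return report


def build_toy_signed_pdf() -> bytes:
    """Build a tiny, NOT fully valid PDF-like byte string for illustration."""
    placeholder = b"<" + b"0" * 16 + b">"  # stands in for the /Contents value
    tail = b" >>\n%%EOF\n"

    def head(a: int, b: int, c: int, d: int) -> bytes:
        # Zero-padded numbers keep the head length independent of the values.
        return (
            f"%PDF-1.7\n<< /Type /Sig /ByteRange "
            f"[{a:010d} {b:010d} {c:010d} {d:010d}] /Contents "
        ).encode("ascii")

    head_len = len(head(0, 0, 0, 0))
    gap_end = head_len + len(placeholder)
    return head(0, head_len, gap_end, len(tail)) + placeholder + tail


signed = build_toy_signed_pdf()
later = signed + b"2 0 obj\n<< /V (filled in) >>\nendobj\n%%EOF\n"
print(signature_coverage(signed))
print(signature_coverage(later))
```

For the unmodified toy file the signature covers everything. After the append, the report shows a non-zero number of appended bytes, which is exactly the region a validator would have to analyze before accepting the document.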

Introducing the PDF Forms TWG

The introduction of interactive forms to PDF with version 1.2, and subsequent adoption by the US Internal Revenue Service for its online library in 1995, is considered to be one of the key moments in the early history of PDF. It brought the previously niche technology into the mainstream consciousness of end users and did so for something other than “electronic paper”.

Adobe made additions to the native PDF forms technology to bring it to functional parity with HTML (3.x at the time), along with improvements to the integrated JavaScript language and to accessibility, but little else has changed for quite a long time. In the meantime, HTML forms have advanced in ways that PDF can’t replicate, making it difficult or impossible to build workflows that leverage both.

Enterprises have been pursuing digital transformation for a while now, but the COVID-19 pandemic and the move to “work from home” have brought unprecedented growth to this business segment, where the use of electronic forms is key. Although companies such as DocuSign, Adobe, Dropbox and others have created their own extensions to PDF to enable rich workflows, it is imperative that these capabilities make their way into the core PDF standard.

To accomplish this goal, the PDF Association is starting a new PDF Forms Technical Working Group (TWG). This community is dedicated to advancing the current PDF Forms technologies through the introduction of new declarative models with integrated semantics. These capabilities will not only bring PDF into alignment with modern HTML forms, but also re-establish PDF’s leadership in the forms and workflow world. The community will also work closely with the PDF Digital Signatures, PDF Reuse, and PDF/UA TWGs to ensure that those groups' input is heard.

PDF Association members can join the PDF Forms TWG today via the Member Area!

Dr. Bernd Wild is a physicist by training. Together with some partners, he founded intarsys consulting GmbH in Karlsruhe in 1996. Dr. Wild now concentrates on consulting and providing assistance for complex system integration projects. Document technology has increasingly become a focal point during the past few years. This includes not only the creation of documents from source data, but also the entire document life cycle through to archiving. Technologies like electronic signatures, intelligent forms and document standards are at the core of his activities. In addition, intarsys offers products and software components that support these technologies and can be used to design customer-specific solutions easily and reliably.

Originally published at https://www.pdfa.org/introducing-the-pdf-forms-twg/