This white paper was written by Gerald Holmann, Qoppa Software founder and president.
Qoppa Software is located in Atlanta, GA and is a leading provider of PDF solutions and PDF software.
Financial documents are the essential media by which information is exchanged between parties involved in different types of transactions, including loan approvals, insurance and others. The information in these documents is relied upon to make decisions that in some cases involve large amounts of capital and risk.
As such, it is imperative that the information held in these documents is accurate. While verification of the information would be ideal, this is not always practical because of time constraints, cost and access. As a result, the information on the documents is frequently taken at face value without verification.
Historically, financial documents have been exchanged as hard copies, preferably originals such as bank statements. This medium affords a degree of verification because the documents may come from well-known, established institutions that use letterhead and pre-printed forms. Additionally, even though forgery is still possible, modifying printed content on paper is hard to do without leaving traces.
This has changed dramatically in recent years. Most financial documents are now exchanged in electronic format, with entire transactions processed without ever using hard copies.
The format of choice for electronic documents is the PDF format, almost to the exclusion of any other format. Unfortunately, the great majority of PDF documents produced by financial institutions are unprotected.
Unprotected PDF documents are relatively easy to modify, and many PDF editors on the market can do this in simple, user-friendly ways. Any and all content in a PDF can be modified, replaced or removed, and this can be done without leaving any trace or audit trail.
This means that anyone who wishes to modify financial data submitted as part of a transaction can do so easily, inexpensively and without leaving a trace on the document itself. The receiver of the documents has no way to tell whether they have been modified. The only recourse is to verify the information through an audit with the institution that issued it.
Security in PDF Documents
The PDF format provides for two distinct methods to secure PDF documents, each intended to achieve different goals.
Passwords and Permissions
A PDF document can use passwords to restrict access to a document. There are two different passwords that can be used: the user password and the master password.
The user password is used to restrict who can open the document. When this password is set in the document, the data inside the document is encrypted using a moderately secure algorithm. When the document is opened by a user or by any PDF processing application, this password is required in order to decrypt the document and view or access its contents.
The master password is used to restrict what actions can be performed on the document. This is done by defining a set of permissions that disallow functions such as modifying contents, printing, extracting data, and others. When the master password is used, the document contents are also encrypted, but in such a way that the document can still be opened and viewed without the password.
When both passwords are used, the PDF viewer or processor will need to provide the user password to open the document, and the master password to modify the document’s permissions or to perform any of the restricted operations.
Enforcement of the permissions set using the master password by a PDF application is “voluntary”: Once a document is opened, there are no technical barriers to modifying the contents of the document even if the permissions state that content should not be modified.
Additionally, once a document is opened, a PDF application can remove the encryption and permissions at will. Reputable PDF applications do enforce permissions as intended, but there are many tools and applications available that will not enforce them and specifically advertise the fact (i.e. this is intentional).
The conclusion is that password-based security is not very useful in the context of protecting a document’s contents. In order for a document to be useful in a transaction, it has to be possible to view the document. Once this is the case, encryption and permissions can be easily cleared and the contents of the document can then be tampered with.
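The voluntary nature of permission enforcement can be illustrated with a minimal sketch. The bit values below follow the permission flags of the PDF standard security handler (bit 3 for printing, bit 4 for modifying contents, bit 5 for extracting content); the Document class and both edit functions are hypothetical stand-ins for PDF applications, and encryption is left out entirely.

```python
from dataclasses import dataclass, replace

# Permission flags (bit positions are 1-based, counted from the low-order bit).
PRINT = 1 << 2    # bit 3: print the document
MODIFY = 1 << 3   # bit 4: modify document contents
EXTRACT = 1 << 4  # bit 5: copy or extract text and graphics


@dataclass(frozen=True)
class Document:
    """Hypothetical, simplified model of an opened PDF document."""
    permissions: int
    content: str


def compliant_edit(doc: Document, new_content: str) -> Document:
    """A reputable application checks the flags before changing anything."""
    if not doc.permissions & MODIFY:
        raise PermissionError("document does not allow modification")
    return replace(doc, content=new_content)


def non_compliant_edit(doc: Document, new_content: str) -> Document:
    """A non-enforcing tool ignores the flags and clears the restrictions."""
    return replace(doc, content=new_content,
                   permissions=PRINT | MODIFY | EXTRACT)


# A statement that only allows printing.
statement = Document(permissions=PRINT, content="Balance: $1,000")
```

Calling compliant_edit(statement, ...) raises PermissionError, while non_compliant_edit(statement, ...) returns a changed document. Nothing in the document itself prevents the second call from succeeding.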
Digital Signatures in PDFs
The intent of digital signatures in PDF documents is to verify the integrity of the document, that it has not been modified from the time that it was signed, and to positively identify the signer. Digital signatures do not prevent unauthorized access to a document or restrict its permissions.
There are two key elements to a digital signature:
The signature hash – When a digital signature is applied to a document, the contents of the document are run through a complex mathematical algorithm to produce a signature. The signature is essentially unique to the document contents: any change to the contents would produce a different signature. When opening a document, a PDF consumer application recalculates the signature hash and compares it to the hash stored in the document. Any difference means that the contents of the PDF have been modified.
The certificate used in the signature – A digital certificate identifies the entity that signed the document. Additionally, the signature holds the certificate of the organization(s) that issued the certificate used in signing, to be used in verification. By looking at the certificates included in the signature, a PDF consumer application can verify that the certificate belongs to the entity with a reasonable degree of certainty (more about this later).
One important note is that all the certificates included in the signature are included when calculating the signature hash. This means that any changes to any of these certificates will be detected and would invalidate the signature.
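The hash comparison can be sketched in a few lines. This is an illustration only: SHA-256 is an assumed choice of algorithm, the function names are hypothetical, and a real digital signature additionally protects the stored hash with the signer's private key, a step that is omitted here.

```python
import hashlib


def content_digest(document_bytes: bytes) -> str:
    """Return the SHA-256 digest of the document contents as a hex string."""
    return hashlib.sha256(document_bytes).hexdigest()


def is_unmodified(document_bytes: bytes, stored_digest: str) -> bool:
    """Recompute the digest and compare it to the one stored at signing time."""
    return content_digest(document_bytes) == stored_digest


# Digest recorded when the document was signed.
original = b"Account balance: 1,000.00"
stored_digest = content_digest(original)

# The same document after a single character has been changed.
tampered = b"Account balance: 9,000.00"
```

Here is_unmodified(original, stored_digest) is true, while is_unmodified(tampered, stored_digest) is false even though only one byte differs.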
Certificate Authorities and the Certificate Chain
A digital certificate is intended to uniquely identify an entity, called the subject. Certificates have a unique ID and include information about the subject such as its name, address and a few other items. Additionally, a certificate also contains information about the entity that issued the certificate, the issuer.
Certificates can be created by anyone, and simple software tools to do so are available on all operating systems. When a certificate is created by the same organization that it is being created for, i.e. when the issuer is the same as the subject, it is called a self-signed certificate.
Self-signed certificates do not provide much value, except in special cases. For instance, self-signed certificates might be used in internal processes where software can be configured to specifically recognize the certificate as trusted. However, a self-signed certificate offers little assurance on documents sent to third parties, because the recipient has no independent way to confirm who created it.
More commonly, certificates are issued by Certificate Authorities (CA), organizations that are recognized to be trustworthy because they implement verification mechanisms when issuing certificates.
CAs can delegate the issuance of certificates to other entities, known as secondary or intermediary CAs. This is done by issuing a certificate to the secondary CA with settings that permit it to issue certificates of its own. The secondary CA can then issue certificates to third parties, naming itself (the secondary CA) as the issuer. This delegation can be repeated across additional levels. The list of certificates that leads from the subject certificate to the root certificate, through any intermediate certificates, is called the Certificate Chain.
When applying signatures to documents, it is normal to include all the certificates from the subject certificate that is applying the signature to the root certificate of a CA that is generally considered trustworthy.
When verifying certificates in a signature, a PDF consumer application will work its way from the subject up the certificate chain until it finds a certificate that is trusted. When it does, the digital signature is considered trusted and validated. If none of the certificates in the chain are trusted, then the software will flag the signature as not fully verified.
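The walk up the certificate chain can be sketched as follows. The Certificate class and find_trusted function are hypothetical and compare names only; an actual PDF consumer application would also verify the cryptographic signature on each certificate, along with validity dates and revocation status.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Certificate:
    """Hypothetical, simplified certificate: names only, no keys."""
    subject: str
    issuer: str


def find_trusted(chain: list[Certificate],
                 trusted_subjects: set[str]) -> Optional[Certificate]:
    """Walk from the signer's certificate toward the root and return the
    first trusted certificate, or None if the walk ends without one."""
    for position, cert in enumerate(chain):
        if cert.subject in trusted_subjects:
            return cert
        # Not trusted: the next certificate must be this one's issuer.
        is_last = position == len(chain) - 1
        if is_last or cert.issuer != chain[position + 1].subject:
            return None
    return None


# Chain ordered from the signing certificate up to a self-signed root.
chain = [
    Certificate("Example Bank", "Example Intermediate CA"),
    Certificate("Example Intermediate CA", "Example Root CA"),
    Certificate("Example Root CA", "Example Root CA"),
]
```

With "Example Root CA" in the trusted set, find_trusted(chain, ...) returns the root certificate and the signature would be considered validated. With an empty trusted set it returns None, which corresponds to a signature flagged as not fully verified.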
Issues with Certificate Authorities Framework
Even though Certificate Authorities are necessary for reliable verification of digital signatures, the current status of the industry does not lend itself well to a wide implementation of a digital signature framework.
Operating systems and software usually come preconfigured to trust certificates issued by a few well-known CAs, such as VeriSign, Thawte and others. Software that works with PDF documents will normally look to the operating system for trusted CAs when verifying digital signatures. Even though the list of trusted CAs can be modified, both in the operating system and in PDF software, in most cases only the default CAs are trusted.
Unfortunately, this means that the few CAs that are trusted by all operating systems have a lock on the market, which has made certificates too expensive and their pricing non-scalable: pricing models for certificates factor in usage in terms of the number of signatures applied, as well as the size of the enterprise that purchases them. Wide adoption of the framework that we propose implies very large volumes of documents signed by some of the biggest companies in the world. Current pricing from CAs for this scenario is completely out of reach by any measure.
We propose that all documents that contain financial information delivered in electronic form should use the PDF format and that they should always include a digital signature.
Digital signatures should be applied to these documents at the time of creation, using a digital certificate that belongs to the entity creating the document and is dedicated to this purpose alone.
Having a digital signature on every document ensures that the document has not been modified from the time of creation, and so ensures that the information contained in the document has not been tampered with.
Upon receipt of a document, verification is straightforward. All signatures should be verified by comparing the current signature hash to the stored signature hash, to detect any changes to the document, and by checking the certificates in the certificate chain until a certificate is found that comes from a trusted CA. This verification confirms the identity of the signer of the document as well as the integrity of the document.
Verification should be performed both in unattended processing of documents, and by human actors when the documents are being reviewed by a person.
There is wide availability of server systems that provide functions to receive and verify digital signatures in incoming documents, and then implement routing rules to handle the documents accordingly. Documents that have valid signatures are routed to the next step in the document workflow, while those that do not pass verification can be routed differently and a human actor can be notified.
Additionally, there are integration products available as well that can be used to add this capability to existing document processing or management systems.
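The routing rules described above might look like the following sketch. The Route values and the route_document function are hypothetical and do not reflect the behavior of any particular server product; the three inputs are assumed to come from an earlier verification step.

```python
from enum import Enum


class Route(Enum):
    NEXT_STEP = "next_step"          # continue in the document workflow
    MANUAL_REVIEW = "manual_review"  # notify a human actor


def route_document(has_signature: bool,
                   hash_matches: bool,
                   chain_trusted: bool) -> Route:
    """Send fully verified documents onward; everything else goes to a person."""
    if has_signature and hash_matches and chain_trusted:
        return Route.NEXT_STEP
    return Route.MANUAL_REVIEW
```

Under these rules an unsigned document, a modified document and a document signed with an untrusted certificate all end up in manual review, so only documents that pass every check proceed unattended.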
When people are reviewing documents directly, any commercial PDF viewer application can verify digital signatures and alert the end user if there are any problems.
A side effect of having this framework prevalent is that, if all documents are expected to have digital signatures, any document without a signature would immediately stand out. For these documents, there should be human-driven processes to verify their validity before they are accepted.
To resolve the cost issues with the existing CA framework, we propose that a single organization be created and charged with issuing certificates for the purpose of validating financial information documents. This organization could be a government agency, perhaps one that is already charged with regulating financial entities, such as the FDIC, or it could be an industry-sponsored group, similar to ICANN.
Financial entities would apply for digital certificates used for signing financial documents from this agency. The agency would then verify that the financial institution is real and legitimate and issue certificates with itself as the Certificate Authority.
This entity would also be tasked with participating in the verification process for certificates. This can be done statically, by having operating system manufacturers include the organization as a trusted CA, and also dynamically, by providing servers that can be queried to check that a certificate is valid and that it is in good standing.
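The dynamic check can be pictured as a lookup against the issuing organization's records. The sketch below is purely illustrative: the registry dictionary, the serial numbers and the function names are all hypothetical, and a production service would more likely rely on an established mechanism such as certificate revocation lists or an online status protocol.

```python
from enum import Enum


class CertStatus(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


def check_status(serial_number: str,
                 registry: dict[str, CertStatus]) -> CertStatus:
    """Look up a certificate's standing; unlisted certificates are UNKNOWN."""
    return registry.get(serial_number, CertStatus.UNKNOWN)


def is_in_good_standing(serial_number: str,
                        registry: dict[str, CertStatus]) -> bool:
    """Only an explicit GOOD answer counts; UNKNOWN is treated as a failure."""
    return check_status(serial_number, registry) is CertStatus.GOOD


# Hypothetical records held by the issuing organization.
registry = {
    "1001": CertStatus.GOOD,
    "1002": CertStatus.REVOKED,
}
```

Treating an unknown certificate as a failure is a deliberately conservative design choice in this sketch: a certificate the organization has never issued should not pass verification.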
Today’s reliance on insecure documents, electronic or otherwise, creates an opportunity for easy exploitation. In the context of financial transactions, this creates a significant risk that has to be accounted for in every transaction.
The framework that we propose is straightforward to implement, both in the creation and in the verification of documents, and would reduce costs as well as the potential for fraudulent documents to get past verification undetected.
Qoppa Software Related Products:
jPDFSecure – Java Library to secure and digitally sign PDFs
jPDFProcess – Java Library to secure, sign and manipulate PDFs
PDF Automation Server – Server Application to sign and secure PDFs