PDF/A and Passwords: Why Archival PDFs Refuse Encryption
Courts, tax authorities, universities, and healthcare regulators have all standardized on PDF/A for long-term document preservation. The ISO 19005 specification forbids encryption at every conformance level, yet many real-world documents claim to be PDF/A while still being password protected. This article explains the rules, how to tell when a file is really compliant, and what to do when you need to archive an encrypted PDF.
If a PDF says PDF/A and has a password, one of two things is wrong
Either the file's PDF/A claim is false, or the file's password was added after the archival declaration and silently invalidated it. Both situations are common. Both require fixing before the document can be accepted by a system that expects real PDF/A.
What PDF/A is and why it exists
PDF/A is the archival profile of the broader PDF specification. It was first published as ISO 19005-1 in 2005 by the International Organization for Standardization, in cooperation with the Association for Information and Image Management. The motivation was simple. Ordinary PDFs can reference external fonts, rely on proprietary plug-ins, embed executable JavaScript, and use encryption that requires specific reader support. All of those features are great for a living document, but every one of them becomes a risk for a document that has to remain readable thirty, fifty, or a hundred years from now.
PDF/A sets boundaries. Every font used in the document must be embedded (a subset containing the glyphs actually used is enough) so that no future system needs to find the right typeface file. Every color space must be declared. At conformance levels A and U, every glyph must also map to a Unicode character so that text extraction remains possible regardless of which reader comes along. References to external content are banned because the internet will eventually move or delete the targets. JavaScript is banned because it depends on a runtime that may not exist in fifty years. Multimedia objects are banned because codecs become obsolete. And, most relevant to this article, encryption is banned because encryption ties readability to a secret that can be lost.
The preservation community's logic on encryption is worth internalizing. A PDF/A file is meant to outlive the person who created it. Archival storage typically has its own layer of access control at the repository level, usually physical security for paper analogues and authenticated storage systems for digital ones. The document itself is expected to be readable without further gatekeeping. If the document is encrypted and the only person who knew the password dies or retires, the archive is silently destroyed. The file still exists, the metadata still declares PDF/A, but no human will ever read it again. That is exactly what an archival format is designed to prevent.
The three flavors: PDF/A-1, PDF/A-2, PDF/A-3
The standard is published in parts, three of which are in wide use. PDF/A-1, published in 2005 as ISO 19005-1, is based on PDF 1.4 and excludes every feature introduced in later PDF versions. It has two conformance levels: Level A, which adds tagged structure for accessibility, and Level B, which guarantees only faithful visual reproduction. PDF/A-1a is the strictest of the profiles in common use and is required by many government archives.
PDF/A-2, published in 2011 as ISO 19005-2, updates the base to PDF 1.7 and permits JPEG 2000 compression, OpenType fonts, transparency with explicit blending rules, and PDF collections. It adds Level U, which requires Unicode text mapping but not full accessibility structure. Most modern long-term archives now prefer PDF/A-2b or PDF/A-2u because they capture more real-world document content without losing information.
PDF/A-3, published in 2012 as ISO 19005-3, is identical to PDF/A-2 except that it allows files of any format to be embedded as attachments. Its best-known application is hybrid e-invoicing, where formats such as the European ZUGFeRD pair a human-readable PDF with a machine-readable XML. PDF/A-3 is controversial in the preservation community because embedded arbitrary files may themselves rot or become unreadable, but it has become the standard for e-invoicing across much of Europe.
All three versions share one thing: none of them permit encryption, JavaScript, external references, or multimedia. A compliant PDF/A file, regardless of version, can be opened by any conformant reader without a password and without an internet connection.
Detecting an encrypted file claiming to be PDF/A
The easiest first check is whether the reader asks for a password when opening the file. If it does, the file is not valid PDF/A regardless of what its metadata says. The absence of a prompt proves nothing, though: a file protected only by a permissions (owner) password opens without one and is still encrypted. A subtler check is the XMP metadata. PDF/A files contain an XMP packet with the namespace http://www.aiim.org/pdfa/ns/id/ declaring the part and conformance level. A quick way to inspect this is to open the PDF in a hex editor or run strings file.pdf | grep pdfaid and look for the pdfaid:part and pdfaid:conformance entries. On an encrypted file the packet may itself be encrypted, in which case the search finds nothing.
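The same inspection can be scripted. The following is a minimal sketch, assuming the XMP packet is stored as plain, unencrypted bytes; the function name declared_pdfa is hypothetical and used only for illustration. It reads the claim and nothing more, so a result of "2b" says what the file declares, not whether it conforms.

```python
import re

# Matches both common XMP serialisations of the declaration:
#   attribute form:  pdfaid:part="2"
#   element form:    <pdfaid:part>2</pdfaid:part>
_PDFAID = re.compile(
    rb'pdfaid:(part|conformance)\s*(?:=\s*["\']|>)\s*([0-9A-Za-z]+)'
)


def declared_pdfa(data: bytes):
    """Return the claimed flavour such as '2b', or None if no claim is found."""
    found = {}
    for key, value in _PDFAID.findall(data):
        # Keep the first occurrence of each key.
        found.setdefault(key.decode("ascii"), value.decode("ascii"))
    if "part" not in found:
        return None
    return found["part"] + found.get("conformance", "").lower()


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        print(declared_pdfa(handle.read()))
```

A raw byte scan is used here instead of a PDF library so that the check has no dependencies; the trade-off is that it cannot see a packet that has been compressed or encrypted.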
The declaration is unfortunately easy to fake. Nothing stops a tool from writing pdfaid:part=1 and pdfaid:conformance=A into a PDF that violates every rule in the standard. Document management systems regularly ingest these files because a naive check of just the declaration passes. That is why serious archives do not trust the declaration alone. They run validators.
VeraPDF is the reference validator. It is open-source software developed by a consortium led by the Open Preservation Foundation and the PDF Association, and it is tested against a published corpus of conforming and non-conforming files. VeraPDF implements the rules of the ISO 19005 family and reports every failure. You can run it from the command line with verapdf --flavour 2b file.pdf, where 2b is the part and conformance level you want to check against. The output is a structured report listing each violated rule with a reference back to the specification clause.
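For batch ingestion, the command above can be wrapped in a small script. This is a sketch only: it assumes a verapdf executable is on the PATH, the helper names are hypothetical, and the flavour list is limited to the part and level combinations described in this article. It returns the raw report text without interpreting it, because the report layout depends on the veraPDF version and output format in use.

```python
import subprocess

# Part/level combinations described in this article:
# levels a and b for part 1; a, b and u for parts 2 and 3.
FLAVOURS = {"1a", "1b", "2a", "2b", "2u", "3a", "3b", "3u"}


def verapdf_command(pdf_path, flavour, executable="verapdf"):
    """Build the argument list for the validator invocation shown in the text."""
    if flavour not in FLAVOURS:
        raise ValueError(f"unknown PDF/A flavour: {flavour!r}")
    return [executable, "--flavour", flavour, pdf_path]


def run_verapdf(pdf_path, flavour):
    """Run the validator and return its report as text (not parsed here)."""
    result = subprocess.run(
        verapdf_command(pdf_path, flavour),
        capture_output=True,
        text=True,
        check=False,  # a failed validation should not raise in this sketch
    )
    return result.stdout


if __name__ == "__main__":
    import sys

    print(run_verapdf(sys.argv[1], sys.argv[2]))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.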
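When VeraPDF finds an encryption dictionary, it reports a failure against clause 6.1.3 of ISO 19005-1, the file trailer requirements, which state that the Encrypt keyword shall not be used in the trailer dictionary. PDF/A-2 and PDF/A-3 carry the same prohibition under the same clause number. VeraPDF identifies each rule as a clause number plus a test number, so the entry appears in the report as 6.1.3 followed by a numeric suffix. If your validator flags the Encrypt keyword under 6.1.3, the file is encrypted and the PDF/A claim is false.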
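Before a full validation run, the presence of an Encrypt entry can be spotted with a quick pre-check. The sketch below is a heuristic, not a parser: it assumes the trailer or cross-reference stream dictionary is readable as plain bytes, and the name looks_encrypted is hypothetical. A stray match inside unrelated binary data is possible, so treat a positive result as a reason to run the validator rather than as a verdict.

```python
import re

# "/Encrypt" followed by an indirect reference ("11 0 R") or an inline
# dictionary ("<<"). Longer names such as /EncryptMetadata do not match.
_ENCRYPT_ENTRY = re.compile(rb"/Encrypt(?:\s+\d+\s+\d+\s+R|\s*<<)")


def looks_encrypted(data: bytes) -> bool:
    """Heuristic: True if the raw bytes contain an Encrypt dictionary entry."""
    return _ENCRYPT_ENTRY.search(data) is not None


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        print("encrypted" if looks_encrypted(handle.read()) else "no Encrypt entry found")
```

Unlike the password-prompt test, this also catches files that carry only a permissions password, since those still contain the Encrypt entry.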
Where non-compliant PDF/A comes from
A file claiming to be PDF/A but failing validation almost always has one of three origins. The first is aggressive conversion tools that output the PDF/A XMP marker regardless of whether the resulting file actually conforms. Some print-to-PDF drivers and scan-to-PDF tools do this. They output a PDF/A-1b marker because it looks professional and it does not hurt them if the file fails somebody's validator later.
The second is post-processing. A user starts with a genuine PDF/A file, opens it in Acrobat, and adds an open password before sending. Acrobat writes the encryption dictionary, and the file now fails clause 6.1.3, but the XMP packet still says PDF/A-1b because nothing updated it. The resulting file looks archival and behaves encrypted. To any system that expects real PDF/A, this is a broken file.
The third is deliberate misuse. Some enterprise workflows produce PDF/A-flagged documents because a downstream regulator requires the format, but they also require password protection for transmission. Rather than managing a clean pipeline, they produce hybrid files and hope nobody looks too carefully. Modern regulators do look, and submissions that fail validation are typically rejected and sent back for rework.
Converting an encrypted PDF into genuine PDF/A
When you receive a password-protected document that needs to be archived, the correct workflow is: unlock, then convert. Both steps must happen in that order. A converter has to read the decrypted content before it can rewrite it into PDF/A structure, so the encryption has to come off first. If you hand an encrypted file to a tool like LibreOffice or Acrobat's PDF/A save-as feature, the output will either fail or silently keep the encryption.
If you know the password, open the file in Acrobat Pro, choose File, Properties, Security, and change the method to No Security. Enter the password when prompted. Save the file. Now open it again and choose File, Save As Other, Archivable PDF (PDF/A). Acrobat runs the conversion. Depending on the content you may receive warnings about unsupported features that the tool auto-remedies: transparency is flattened, JavaScript is removed, external links are stripped.
If you do not have Acrobat, the free open-source path uses qpdf and LibreOffice in sequence. Run qpdf --decrypt --password=YOURPASSWORD input.pdf decrypted.pdf, then open the decrypted file in LibreOffice, choose File, Export as PDF, and in the PDF Options dialog tick the PDF/A archive option, selecting the version that matches your target (the versions offered vary by LibreOffice release). LibreOffice opens PDFs as editable drawings, so complex layouts can shift on import. Its export is reasonably conformant for simple documents; for complex documents, run the output through VeraPDF to confirm.
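The unlock step is easy to script, and it is worth confirming that the output really is unencrypted before handing it to a converter. The sketch below assumes qpdf is installed and on the PATH; the function names are hypothetical. It relies on qpdf's documented exit codes (0 for success, 3 for success with warnings) and on the --is-encrypted option, which exits with 0 when a file is encrypted and 2 when it is not. The run parameter exists only so the logic can be exercised without qpdf present.

```python
import subprocess


def decrypt_command(src, dst, password):
    """The qpdf invocation from the text, as an argument list."""
    return ["qpdf", "--decrypt", f"--password={password}", src, dst]


def still_encrypted(path, run=subprocess.run):
    """True if qpdf reports the file as encrypted (exit code 0)."""
    return run(["qpdf", "--is-encrypted", path]).returncode == 0


def unlock(src, dst, password, run=subprocess.run):
    """Decrypt src into dst and verify the result before any conversion."""
    result = run(decrypt_command(src, dst, password))
    if result.returncode not in (0, 3):  # 3: output written, with warnings
        raise RuntimeError(f"qpdf failed with exit code {result.returncode}")
    if still_encrypted(dst, run=run):
        raise RuntimeError("output is still encrypted; do not convert it")
    return dst


if __name__ == "__main__":
    import sys

    print(unlock(sys.argv[1], sys.argv[2], sys.argv[3]))
```

The PDF/A conversion itself is left to the LibreOffice or Acrobat steps described above; this sketch covers only the part that must happen first.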
If you do not know the password, the honest answer is that you have to recover it before you can archive the document. That is the situation our forgot PDF password workflow was built for. Document management systems do sometimes need to ingest old records whose password was lost when an employee left. GPU-based recovery handles this regularly, and the resulting unlocked file can then be converted to PDF/A.
How to handle access control without breaking PDF/A
The fundamental tension is that organizations archive documents specifically because they contain valuable information, and valuable information often must be access-controlled. If the file itself cannot be encrypted, where does the access control live?
The answer is at the repository layer. Institutional archives use authenticated document management systems that enforce who can read each file. The file on disk is an unencrypted PDF/A, but the disk itself lives behind access control and audit logging. When a user requests the document, the system authenticates the request, checks permissions, and only then streams the bytes, with the storage volume decrypted transparently underneath. If the user is authorized, they see the file as an ordinary PDF/A. If not, they cannot even see the filename.
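The pattern can be illustrated with a toy in-memory repository. This is a sketch under simplifying assumptions, not a real document management system: the Repository class and its methods are hypothetical, authentication is taken as already done, and storage encryption is left to the layer underneath. What it shows is the ordering: the permission check and audit entry come before any bytes are returned, and unauthorized users get the same answer whether or not the document exists.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    files: dict = field(default_factory=dict)      # document id -> PDF/A bytes
    readers: dict = field(default_factory=dict)    # document id -> set of user ids
    audit_log: list = field(default_factory=list)  # (user, document id, outcome)

    def store(self, doc_id, pdf_bytes, allowed_users):
        # The stored bytes are the unencrypted PDF/A, unchanged.
        self.files[doc_id] = pdf_bytes
        self.readers[doc_id] = set(allowed_users)

    def list_documents(self, user):
        # Users only ever see the names of documents they may read.
        return sorted(d for d, users in self.readers.items() if user in users)

    def fetch(self, user, doc_id):
        allowed = user in self.readers.get(doc_id, set())
        self.audit_log.append((user, doc_id, "granted" if allowed else "denied"))
        if not allowed:
            # Same error for "missing" and "forbidden", so names do not leak.
            raise PermissionError("document not available")
        return self.files[doc_id]


if __name__ == "__main__":
    repo = Repository()
    repo.store("deed-001", b"%PDF-1.7 example bytes", {"alice"})
    print(repo.list_documents("alice"), repo.list_documents("bob"))
```

Because the file is never modified on its way in or out, whatever validated at ingest still validates on retrieval.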
For external delivery, the common pattern is to produce a one-time encrypted wrapper around the archival copy. The repository exports the PDF/A to a temporary location, wraps it in an encrypted container such as a password-protected 7-Zip archive, and emails only the container. The recipient decrypts the container with a separately delivered password and pulls out the pristine PDF/A. The archival copy inside is untouched and still valid.
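One way to make "untouched" checkable is to record a checksum of the PDF/A before wrapping and compare it after extraction. The sketch below assumes the 7z command-line tool is the wrapper; the helper names are hypothetical. It uses the 7z switches a (add to archive), -p (set password) and -mhe=on (also encrypt the file names in a .7z container), and builds the command without running it.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Checksum recorded before wrapping and compared after extraction."""
    return hashlib.sha256(data).hexdigest()


def wrap_command(pdf_path, container_path, password):
    """Argument list for creating a password-protected .7z container."""
    return ["7z", "a", f"-p{password}", "-mhe=on", container_path, pdf_path]


def unchanged(original: bytes, extracted: bytes) -> bool:
    """True if the extracted file is byte-identical to the archival copy."""
    return sha256_hex(original) == sha256_hex(extracted)


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        print(sha256_hex(handle.read()))
    print(" ".join(wrap_command(sys.argv[1], sys.argv[1] + ".7z", "<password>")))
```

Sending the checksum alongside the separately delivered password lets the recipient confirm that the extracted file is the same one that passed validation in the repository.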
For systems that require cryptographic proof of origin, PDF/A-2 and PDF/A-3 permit digital signatures. A signed PDF/A is still unencrypted, but the signature proves the document's integrity and signer identity. This is the path regulated industries typically take. For the mechanics of signatures, see our companion article on removing a digital signature from a PDF.
Real-world compliance scenarios
Court e-filing
Some United States federal courts using CM/ECF and a number of state e-filing systems require or recommend PDF/A for filed documents. They reject files with encryption. The accepted workflow is to archive first and apply sealing through the court's own mechanism rather than adding a password to the PDF.
European e-invoicing
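ZUGFeRD and Factur-X both mandate PDF/A-3 as the container for their embedded XML invoice data. XRechnung on its own is a pure XML format, so the PDF/A requirement applies to it only when it is delivered through ZUGFeRD's XRechnung profile. Invoices sent with passwords are rejected by the receiving party's invoicing pipeline. Vendors sometimes try to send protected invoices and discover the problem when their payments stall.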
University dissertations
ProQuest, institutional ETD repositories, and many university libraries require dissertation PDFs to be free of security settings, and a number of them specifically require PDF/A. Students who submit encrypted PDFs are typically asked to re-submit without security.
Medical records retention
HIPAA-regulated entities in the United States commonly store long-term records as PDF/A inside encrypted repositories. The archival files themselves are unencrypted; the encrypted perimeter is the storage system.
The preservation principle
If a secret is needed to read a document, the document is not really preserved. Archival formats push access control out of the file and into the storage layer, where it can be managed by organizations rather than forgotten by individuals.
Read next
For the broader technical picture of how PDF handles encryption versus structure, see how PDF encryption works. If you need to unlock a protected file before archiving, read PDF encryption types to choose the right recovery route.
Common questions
Can I add a password to a PDF/A later?
You can add a password, but doing so makes the file non-conformant. If any downstream system runs VeraPDF or an equivalent validator, the file will be rejected. Handle access control at the storage layer instead.
Are digital signatures allowed in PDF/A?
Yes. Signatures prove integrity without hiding content, so they match the preservation goals of the standard. Both PDF/A-2 and PDF/A-3 permit signatures.
What about redaction?
Redaction is allowed provided it is done destructively, meaning the underlying text is actually removed rather than covered. A visual cover that leaves the redacted bytes in place does not redact anything, because the text remains extractable, and leftover redaction markup can also fail validation in some profiles.