Nakamoto Research

The Bitcoin whitepaper
Version: v0.3.2
Updated: 2025-03-30
Author: obxium
License: BY-NC-ND

Satoshi Nakamoto announced the release of the Bitcoin whitepaper, titled “Bitcoin: A Peer-to-Peer Electronic Cash System,” through different communication channels on October 31, 2008.

Characteristics

Artistic reenactment: what it could have been like that day as Nakamoto exported the whitepaper PDF.

metzdowd cryptography mailing list announcement

On Friday, October 31, 2008, at 14:10:00 EDT, Nakamoto used the satoshi@vistomail.com email address to announce the paper on a popular mailing list for cypherpunks, in a post entitled “Bitcoin P2P e-cash paper.”

Potential academic connection

Nakamoto introduced Bitcoin as an academic paper, and the Bitcoin paper cites other academic papers. It stands to reason that one or more of the people involved had, or still have, academic ties and have likely published similar papers before.

Academic computer scientists have a particular penchant for formatting their papers with TeX or LaTeX.

It’s entirely possible that Nakamoto didn’t author the whitepaper in OpenOffice Writer, but instead wrote the paper in TeX (or LaTeX), imported that file into OpenOffice, and exported the PDF from OpenOffice Writer as a measure of misdirection.

PDF file

Document structure

Satoshi Nakamoto released the original Bitcoin whitepaper as a PDF document. I’ve used open source tools and techniques to analyze the facts and structure in this document.

This version of the whitepaper shows 1 /OpenAction. An open action is any action, such as a request to an external website, that triggers when you open the file.

Examine the document for stream objects to find that the /OpenAction is object 66:

This object is an explicit destination, included in all PDFs exported from OpenOffice, that presents the first page of the document with the desired view settings.

There’s nothing to worry about here, although the reference to the en-GB language for the document is certainly interesting. Whether it’s a genuine reflection of the operating environment at time of document authoring or a clever ruse is unknown.
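The keyword lookup described above can be approximated with a few lines of Python. This is a minimal sketch, assuming the objects of interest are stored uncompressed so that a plain byte search can see them; the sample bytes are a hypothetical miniature catalog, not the contents of the real file, and the function names are illustrative.

```python
import re

# Matches "N G obj ... endobj" and captures the object number and its body.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)


def find_objects_with_key(pdf_bytes: bytes, key: bytes) -> list[int]:
    """Return the numbers of indirect objects whose body mentions `key`."""
    return [
        int(m.group(1))
        for m in OBJ_RE.finditer(pdf_bytes)
        if key in m.group(3)
    ]


def lang_entries(pdf_bytes: bytes) -> list[str]:
    """Return every /Lang literal string value found in the raw bytes."""
    values = re.findall(rb"/Lang\s*\(([^)]*)\)", pdf_bytes)
    return [v.decode("latin-1") for v in values]


# Hypothetical sample data for illustration only.
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<</Type/Page>>\nendobj\n"
    b"2 0 obj\n<</Type/Catalog/OpenAction[1 0 R /XYZ null null 0]"
    b"/Lang(en-GB)>>\nendobj\n"
)

print(find_objects_with_key(SAMPLE, b"/OpenAction"))
print(lang_entries(SAMPLE))
```

A byte-level search like this can produce false matches inside binary streams, so treat the result as a pointer for closer inspection rather than a parse of the file.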

Document metadata

When researching at the metadata level, some critical document properties emerge, namely details about the document creation. These details include the software and version along with creation date.

This page presents some of those details in plain text followed by their actual representation in the document, which sometimes includes hexadecimal encoding.

PDF comments

If you browse the PDF data, the first things you’ll notice are two PDF comments. The first is just the PDF version:

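This selection of German umlaut characters most likely carries no hidden meaning. The PDF specification recommends that a file containing binary data follow its header line with a comment of at least four characters whose codes are 128 or greater, so that file transfer tools treat the file as binary rather than text.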

The other PDF comment appears to be just the end-of-file marker and an end-of-line character, '%%EOF\n', which is effectively an empty comment value.

Creation date

Creator

It’s not clear what ‘Writer’ means here, but the initial hunch is that it refers to OpenOffice Writer, a component of the OpenOffice suite. This suggests that tool is part of the authoring workflow. Of course, it could also just be a ruse to disguise the true authoring workflow.

More research into what this software writes to the field by default will help push the topic further.

Producer

Document checksum

The document checksum represented in the PDF is: 6F72EA7514DFAD23FABCC7A550021AF7.

ID field

The /ID field is interesting, because it’s an MD5 digest of concatenated key document metadata items covered earlier with some known and some unknown values:

The advanced PDF forensics page details this further, including the source code responsible for generating this value.

exiftool abbreviated output

These are the relevant details from the output of exiftool bitcoin.pdf. They’re much the same results that pdfid.py outputs, but exiftool has cleaner output:

Python code example

You can also use code, like this Python example to fetch the PDF from its URL and process it directly. The output will also be like that produced by exiftool or pdfid.py.
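A minimal sketch of such a script follows, using only the standard library. It assumes the document information dictionary is stored uncompressed and that its strings contain no escaped parentheses. The URL, the field list, and the function names are assumptions made for illustration, and the sample bytes are constructed here rather than copied from the whitepaper.

```python
import json
import re
import urllib.request

# ASSUMPTION: a commonly used location for the paper; substitute your own copy.
PDF_URL = "https://bitcoin.org/bitcoin.pdf"
FIELDS = ("Creator", "Producer", "CreationDate")


def decode_pdf_string(raw: bytes) -> str:
    """Decode a PDF string token written as <hex digits> or (literal text)."""
    if raw.startswith(b"<"):
        data = bytes.fromhex(raw[1:-1].decode("ascii"))
    else:
        data = raw[1:-1]
    if data.startswith(b"\xfe\xff"):  # UTF-16BE byte order mark
        return data[2:].decode("utf-16-be")
    return data.decode("latin-1")


def extract_info(pdf_bytes: bytes) -> dict[str, str]:
    """Pull selected metadata fields out of the raw PDF bytes."""
    info = {}
    for field in FIELDS:
        pattern = rb"/" + field.encode() + rb"\s*(<[0-9A-Fa-f\s]*>|\([^)]*\))"
        match = re.search(pattern, pdf_bytes)
        if match:
            info[field] = decode_pdf_string(match.group(1))
    return info


def fetch(url: str = PDF_URL) -> bytes:
    """Download the PDF; not called here so the sketch runs offline."""
    with urllib.request.urlopen(url) as response:
        return response.read()


# Offline demonstration: build a hex-encoded UTF-16BE string for "Writer".
hex_writer = "\ufeffWriter".encode("utf-16-be").hex().upper().encode()
sample = b"<</Creator<" + hex_writer + b">/Producer(OpenOffice.org 2.4)>>"
print(json.dumps(extract_info(sample), indent=4))
```

To process the real file, pass the result of `fetch()` to `extract_info` in place of the sample bytes.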

{
    "Creator": "Writer",
    "Producer": "OpenOffice.org 2.4",
    "CreationDate": "D:20090324113315-06'00'"
}
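The CreationDate value uses the PDF date syntax, which packs a local timestamp and a UTC offset into one string. The sketch below converts such a string into a timezone-aware Python datetime. The function name and the regular expression are illustrative, and treating a missing offset as UTC is an assumption of this sketch, since the format itself leaves that case unspecified.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSS followed by an optional offset such as -06'00'
PDF_DATE_RE = re.compile(
    r"D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"([+\-Z])?(?:(\d{2})'?(\d{2})?'?)?"
)


def parse_pdf_date(value: str) -> datetime:
    """Convert a full PDF date string into a timezone-aware datetime."""
    match = PDF_DATE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported PDF date: {value!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    sign, off_hours, off_minutes = match.group(7), match.group(8), match.group(9)
    offset = timedelta(hours=int(off_hours or 0), minutes=int(off_minutes or 0))
    if sign == "-":
        offset = -offset
    # ASSUMPTION: a missing offset or "Z" is treated as UTC.
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone(offset))


created = parse_pdf_date("D:20090324113315-06'00'")
print(created.isoformat())
print(created.astimezone(timezone.utc).isoformat())
```

The offset only reflects what the exporting machine's clock reported at the time, so like the other metadata it may or may not be a genuine reflection of the operating environment.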

Advanced PDF notes

From the perspective of a forensics examiner, the PDF document is a true capture-the-flag challenge. As detailed in “How you will not uncover Satoshi,” the file contains a unique ID that’s essentially an MD5 digest of some known and unknown metadata components of the document. One of the unknowns represented by that MD5 digest is the original document path, which could contain an OS username and thus a significant clue to the true identity behind the Nakamoto persona.

Regardless of the correct and complete document authoring workflow, there is no doubt about some facts around the creation of this PDF file:

ID string as identity oracle

The original PDF contains an ID string that OpenOffice.org generates from an MD5 digest of document metadata items. Some of these items are known, and some are unknown.

One can’t reverse the digest to reveal its inputs, but one could compute candidate digests from the known items combined with guesses for the unknowns, and compare each candidate against the ID in the file. A match would confirm the guessed values, including any OS username that forms part of the document’s original filesystem path.
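The guess-and-compare idea can be sketched in Python with hashlib. This is not a working attack on the real file. The byte layout of the time value (two little-endian unsigned 32-bit integers), the UTF-16 little-endian encoding of the URL, and the exact bytes of the info values are all assumptions of this sketch, and every path, username, and timestamp below is hypothetical.

```python
import hashlib
import struct
from typing import Iterable, Optional


def candidate_id(seconds: int, nanosec: int, url: str, info_values: bytes) -> str:
    """Compute one candidate ID from guessed inputs."""
    md5 = hashlib.md5()
    # ASSUMPTION: system time is hashed as two little-endian uint32 values.
    md5.update(struct.pack("<II", seconds, nanosec))
    # ASSUMPTION: the URL is hashed as UTF-16 little-endian code units.
    md5.update(url.encode("utf-16-le"))
    # ASSUMPTION: info_values already holds the exact bytes the exporter hashed.
    md5.update(info_values)
    return md5.hexdigest().upper()


def search(
    target_id: str,
    seconds: Iterable[int],
    nanosecs: Iterable[int],
    urls: Iterable[str],
    info_values: bytes,
) -> Optional[tuple[int, int, str]]:
    """Return the first (seconds, nanosec, url) guess that reproduces target_id."""
    nanosecs, urls = list(nanosecs), list(urls)
    for s in seconds:
        for ns in nanosecs:
            for url in urls:
                if candidate_id(s, ns, url, info_values) == target_id.upper():
                    return s, ns, url
    return None


# Hypothetical demonstration: plant an ID, then recover the username guess.
info = b"hypothetical-info-values"
planted = candidate_id(1000, 500, "file:///C:/Users/alice/paper.pdf", info)
guesses = [f"file:///C:/Users/{name}/paper.pdf" for name in ("bob", "alice")]
print(search(planted, range(990, 1010), [0, 500], guesses, info))
```

Each unknown multiplies the search space, and the sub-second part of the time value alone makes an exhaustive search expensive, so a practical attempt would depend on narrowing the other inputs first.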

Direct document authoring and release workflow

It’s plausible that the person who authored the paper did so entirely on a Windows XP system using the OpenOffice.org Writer software:

Alternate document authoring and release workflow

If the person who authored the paper was an academic, then a plausible alternate authoring workflow involves TeX/LaTeX, given their popularity in academia:

Document ID value source code

The metadata items that feed the digest divide into known and unknown values:

Known values	Unknown values
Title	Author
Creator	Subject
Producer	Keywords
	Document creation date

The following is an abbreviated snippet of code from OpenOffice version 2.4 that shows precisely how it generates the document ID value.

// Concatenate each document info field that is present, in a fixed order.
OStringBuffer aID( 1024 );
if( m_aDocInfo.Title.Len() )
    appendUnicodeTextString( m_aDocInfo.Title, aID );
if( m_aDocInfo.Author.Len() )
    appendUnicodeTextString( m_aDocInfo.Author, aID );
if( m_aDocInfo.Subject.Len() )
    appendUnicodeTextString( m_aDocInfo.Subject, aID );
if( m_aDocInfo.Keywords.Len() )
    appendUnicodeTextString( m_aDocInfo.Keywords, aID );
if( m_aDocInfo.Creator.Len() )
    appendUnicodeTextString( m_aDocInfo.Creator, aID );
if( m_aDocInfo.Producer.Len() )
    appendUnicodeTextString( m_aDocInfo.Producer, aID );
...
// Append the creation date string, then take the buffer as the info values.
aID.append( m_aCreationDateString.getStr(), m_aCreationDateString.getLength() );
aInfoValuesOut = aID.makeStringAndClear();
// Feed the digest in order: current system time, document URL, info values.
osl_getSystemTime( &aGMT );
rtlDigestError nError = rtl_digest_updateMD5( m_aDigest, &aGMT, sizeof( aGMT ) );
if( nError == rtl_Digest_E_None )
    nError = rtl_digest_updateMD5( m_aDigest, m_aContext.URL.getStr(), m_aContext.URL.getLength()*sizeof(sal_Unicode) ); // unicode value
if( nError == rtl_Digest_E_None )
    nError = rtl_digest_updateMD5( m_aDigest, aInfoValuesOut.getStr(), aInfoValuesOut.getLength() );