Why optimize?
First, why would anyone want to search engine optimize their PDF files? Well, if you have an eBook, brochure, product description or technical document in PDF format, you may wish to optimize it to pick up some extra search engine traffic.
Can search engines read PDF files?
Yes, most of the major search engines can now read the basic contents of PDF files, though whether these pages can rank as well as HTML files is still questionable.
How is it supposed to work?
This is how the workflow is supposed to work. Create your file in MS Word, or in a drawing or page layout program whose output can later be distilled into a PDF (with some applications you will have to create an EPS file first and then distill it; with others, you can distill right out of the application). If you are using a program such as MS Word, be mindful to apply heading styles (the equivalent of H1, H2, H3 tags) where necessary and optimize the body text as you would an HTML file.
When you are finished, distill the file. Bring this file into the full version of Adobe Acrobat 6 for editing. Plug in the appropriate content, post the PDF on your website and let the search engine robots index the file.
How do I plug in the appropriate content?
In Adobe Acrobat 6 there are two places to input content into a PDF file. The first place is under File / Document Properties and the second place is under Advanced / Document Metadata. Under File / Document Properties there are several menus, but the most relevant for our purposes is the Description menu. Under the Description menu, there are fields for Title, Author, Subject and Keywords.
Now to confuse matters more, let’s go over to the Advanced / Document Metadata menu. There are a couple of choices here, but let’s once again look at the Description menu. Under this Description menu, there are fields for Title, Author, Description, Description Writer, Keywords, Copyright State, Copyright Notice and Copyright Info URL.
How does the PDF store the data?
With duplicate fields, it is important to find out how the data is stored so that we may make some educated guesses as to how the search engines read this data. I performed a few small experiments and here is what I found. The Title and Author fields in the two menus seem to be linked to their counterparts, because when you change one and check the other, you will see that it has changed too. The Subject field of the Document Properties menu seems to be linked to the Description field of the Document Metadata menu for the same reason. The Keywords fields, however, are not linked. Separate sets of keywords can be added to both fields. When the file is saved, both sets of keywords are stored in the PDF file.
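For readers who want to repeat this kind of experiment outside Acrobat, here is a minimal Python sketch. It assumes that the Document Metadata fields are embedded in the saved file as an uncompressed XML packet (Adobe calls this format XMP) wrapped in an x:xmpmeta element. The function name and the sample bytes are made up for illustration and are not output from Acrobat.

```python
import re

# An XMP packet body runs from the opening <x:xmpmeta tag to its closing tag.
# Assumption: the packet is stored uncompressed, so it is visible in the raw bytes.
XMP_PATTERN = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def extract_xmp_packets(pdf_bytes):
    """Return every XMP packet found in the raw PDF bytes, as text."""
    return [
        match.group(0).decode("utf-8", errors="replace")
        for match in XMP_PATTERN.finditer(pdf_bytes)
    ]


# Made-up fragment standing in for a real file; to test a real PDF, read it
# with open(path, "rb").read() and pass the bytes in instead.
SAMPLE = (
    b"%PDF-1.4\n"
    b"<x:xmpmeta xmlns:x='adobe:ns:meta/'>"
    b"<pdf:Keywords>films, popcorn</pdf:Keywords>"
    b"</x:xmpmeta>\n"
    b"%%EOF"
)

for packet in extract_xmp_packets(SAMPLE):
    print(packet)
```

Printing the packet after each change in Acrobat makes it easier to see which fields moved together and which did not.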
Which set of keywords is correct then?
Adobe stores its metadata in XML format. When I open the PDF file in Notepad, it appears that the Keywords field under Document Properties is the one that the search engines will use (this hasn’t been proven yet, though). The keywords input into this field appear in the PDF as we have come to expect, separated by commas, like this: Keywords(movies, cinemas, matinees, theatres, popcorn).
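The Notepad check can also be scripted. The following Python sketch pulls the comma-separated keywords out of the raw bytes. It assumes the entry is stored uncompressed as a name with a leading slash followed by a parenthesized string, with no nested or escaped parentheses. The function name and the sample bytes are hypothetical.

```python
import re

# Matches "/Keywords" followed by a parenthesized string.
# Assumption: the string contains no nested or escaped parentheses.
KEYWORDS_PATTERN = re.compile(rb"/Keywords\s*\(([^()]*)\)")


def info_keywords(pdf_bytes):
    """Return the Document Properties keywords as a list of strings."""
    match = KEYWORDS_PATTERN.search(pdf_bytes)
    if match is None:
        return []
    raw = match.group(1).decode("latin-1")
    # Split on commas and drop surrounding whitespace and empty entries.
    return [kw.strip() for kw in raw.split(",") if kw.strip()]


# Hypothetical fragment modelled on the example in the text.
sample = b"<< /Title (Movie Guide) /Keywords(movies, cinemas, matinees, theatres, popcorn) >>"
print(info_keywords(sample))
# prints ['movies', 'cinemas', 'matinees', 'theatres', 'popcorn']
```

A plain regular expression is enough for a quick look, but it will miss keywords stored in compressed or encrypted files, so an empty result does not prove the field is empty.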