SEO Your PDFs - Does This Work?
First, why would anyone want to search engine
optimize their PDF files? Well, if you had an eBook, brochure,
product description or technical document in PDF format, you may
wish to optimize these to pick up some extra search engine traffic.
Can the search engines read PDF files?
Yes, most of the major search engines can now read the basic
contents of PDF files, though whether these files can rank as well
as HTML pages is still an open question.
How is it supposed to work?
Here is the intended workflow. Create your file in MS
Word, or in a drawing or page layout program whose output can later be
distilled into a PDF (with some applications you will have to create
an EPS file first and then distill it; with other
applications, you can distill right out of the app). If you are
using a program such as MS Word, be mindful to apply the heading
styles (the equivalent of H1, H2, H3 tags) where necessary and
optimize the body text as you would an HTML file.
When you are finished, distill the file. Bring this file into the
full version of Adobe Acrobat 6 for editing. Plug in the
appropriate content, post the PDF on your website and let the search
engine robots index the file.
How do I plug in the appropriate content?
In Adobe Acrobat 6 there are two places to input content into a PDF
file. The first place is under File / Document Properties and the
second place is under Advanced / Document Metadata. Under File /
Document Properties there are several menus but the most relevant
for our purposes is the Description menu. Under the Description
menu, there are fields for Title, Author, Subject and Keywords.
Now to confuse matters more, let’s go over to the Advanced /
Document Metadata menu. There are a couple of choices here, but
let’s once again look at the Description menu. Under this
Description menu, there are fields for Title, Author,
Description, Description Writer, Keywords, Copyright State,
Copyright Notice and Copyright Info URL.
How does the PDF store the data?
With duplicate fields, it is important to find out how the data is
stored so that we may make some educated guesses as to how the
search engines read this data. I performed a few small experiments
and here is what I found. The Title and Author
fields in the two menus seem to be linked, because when you change one
and check the other, you will see that it has changed too. Likewise, the
Subject field of the Document Properties menu seems to be linked to
the Description field of the Document Metadata menu, for the same
reason. The Keywords fields, however, are not linked. A separate set
of keywords can be added to each field. When the file is saved,
both sets of keywords are stored in the PDF file.
Which set of keywords is correct then?
Adobe stores its metadata in XML format. When the PDF file is opened in
Notepad, it appears that the Keywords field under Document
Properties is the one that the search engines will use (though this
hasn’t been proven yet). The keywords entered in this
field appear in the PDF as we have come to expect, separated by
commas, like this: Keywords(movies, cinemas, matinees, theatres,
popcorn).
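To check this for your own files without squinting at Notepad, the Document Properties fields can be pulled out of the raw file with a few lines of Python. This is only a minimal sketch: it assumes the fields are stored as plain, unencrypted literal strings in the form shown above, and the sample bytes and function names are made up for illustration.

```python
import re

# Matches a literal string value such as /Keywords(a, b, c).
# Handles backslash escapes, but not nested parentheses, hex strings,
# or encrypted files (assumed not to occur in this sketch).
_INFO_FIELD = re.compile(
    rb"/(Title|Author|Subject|Keywords)\s*\(((?:\\.|[^\\()])*)\)"
)

def read_info_fields(pdf_bytes):
    """Return the Document Properties fields found in raw PDF bytes."""
    fields = {}
    for name, raw in _INFO_FIELD.findall(pdf_bytes):
        # Undo simple backslash escapes such as \( and \).
        value = re.sub(rb"\\(.)", rb"\1", raw)
        fields[name.decode("ascii")] = value.decode("latin-1")
    return fields

def split_keywords(value):
    """Split a comma-separated Keywords value into a clean list."""
    return [word.strip() for word in value.split(",") if word.strip()]

# Hypothetical sample shaped like the fragment seen in Notepad.
sample = (
    b"<< /Title(Movie Guide) /Author(KK) "
    b"/Keywords(movies, cinemas, matinees, theatres, popcorn) >>"
)
info = read_info_fields(sample)
print(split_keywords(info["Keywords"]))
```

For a real file, pass in the result of `open("yourfile.pdf", "rb").read()` (the file name is a placeholder). If nothing comes back, the fields may be stored in a form this sketch does not handle.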
The keywords that were input into the Document Metadata menu appear
as a sort of list like this: trees woodchips
Of course, this doesn’t mean anything really – it is how the search
engines read this that counts.
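The Document Metadata keywords can be read in a similar way. The sketch below assumes the XML block is stored uncompressed and that the keyword list is a set of rdf:li items under a dc:subject element, which is the usual convention in Adobe’s XMP format; the article’s own test file may have been laid out differently, and the sample bytes are invented for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Namespace prefixes in the form ElementTree expects.
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
DC = "{http://purl.org/dc/elements/1.1/}"

def read_xmp_keywords(pdf_bytes):
    """Return keyword list items from the first XML metadata block found."""
    match = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", pdf_bytes, re.DOTALL)
    if match is None:
        return []  # no uncompressed metadata block in these bytes
    root = ET.fromstring(match.group(0))
    keywords = []
    # Assumption: keywords are rdf:li items nested under dc:subject.
    for subject in root.iter(DC + "subject"):
        for item in subject.iter(RDF + "li"):
            if item.text and item.text.strip():
                keywords.append(item.text.strip())
    return keywords

# Hypothetical sample: a tiny metadata block embedded in PDF-like bytes.
sample = b"""%PDF-1.4
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:subject><rdf:Bag>
    <rdf:li>trees</rdf:li>
    <rdf:li>woodchips</rdf:li>
   </rdf:Bag></dc:subject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
"""
print(read_xmp_keywords(sample))
```

Running both readers on the same file makes it easy to see which set of keywords ended up where. It says nothing, of course, about which set a search engine actually uses.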
How does it really work?
I’ve run some preliminary tests (and by this I mean very
preliminary) and more testing will need to be completed to verify
these results, but here is what I have come up with so far. When a
PDF file was first opened in Acrobat 6, the Title and Author fields
under both Document Properties and Document Metadata were already
filled in with the file name and author’s initials (information
received from MS Word).
Without any extra data entered in the Document Properties or
Document Metadata menus, Google used the Title field information for
the title in the results, and the description in the results was
taken from the body copy. For older PDFs, Yahoo! uses the largest
text on the page as the title text. For more recently
indexed PDF documents, however, Yahoo! uses the Title field
information as the title text in the search results. At this
writing, the description text in the search engine results comes
from the body text of the PDF and not from the Document Properties or
Document Metadata text.
Thinking I might just get lucky (and hoping for quick results), I
ran a few optimized and non-optimized PDFs through some of the more
popular search engine spider simulators on the web, but these
spiders did not handle the binary code very well. None of them
returned title or meta tag information and the most popular keywords
were snippets of binary code.
So, at this point, does it really pay to optimize a PDF?
The simple answer is yes. The Title field and body copy can still be
optimized, and the major search engines will index them
accordingly. As for the Keywords and Description meta tags,
Google ignores these in PDFs just as it does in HTML documents, and
Yahoo!, which does use the description tag, is only halfway to
where it needs to be.
But Google and Yahoo! aren’t the only two search engines and
directories around, and with algorithms changing all the time,
perhaps someday soon either the search engines will be able to fully
read a PDF file or Adobe will offer a patch that will make PDFs more
search-engine-friendly. It’s only a matter of time, my friend. Will
you be ready?
Copyright © 2004 SEO Resource
http://www.seoresource.net
Kevin Kantola heads up SEO Resource, a California search engine
optimization company devoted to achieving high rankings.
Author Name: Kevin Kantola
Author Email: info@seoresource.net
Author Website:
http://www.seoresource.net