Monday, November 26, 2007

How do I convert a PDF to a Word document?

In my translation business I often get PDF files which the customer wants translated and returned identically-formatted. The only way I can do this (partly because of the software we use to help us in translation) is to somehow convert the file to Word format and do the editing/translating there, and then convert back to PDF if necessary.

You probably have a similar situation, where you have a PDF file that you need to convert into a Word file so you can then continue editing the document in Word for whatever reason, and you want formatting preserved.

I have decided to test VeryPDF's PDF to Word software, one of the solutions that are out there for converting PDF files into Word documents, and see how well a product like this really works.

So I took a PDF file and ran it through PDF to Word to see if it really does the job. I opted for a really exciting document, the British Government's SA-100 tax form, which is full of nice formatting which ought to trip up the program. Here is a snapshot of the file (this is from page 2):

The easy way
Of course, you could just open the file in Adobe Reader and copy/paste the text into Word. So let's try that.

You basically get an unformatted text file, like so:

As you can see, it's next to useless - there is no formatting preserved and you would have a real job trying to reconstruct the original document from this.

Now, if Adobe Reader had an option like Save As -> Word Document... That would be nice, but they want you to buy the full Acrobat software for that, costing hundreds of dollars! And having used it in the past, I was not overly impressed with its export to Word function, though this may have improved.

The VeryPDF way
I downloaded trial versions of several PDF converter programs, and they all did the job fairly well, but I eventually decided to demonstrate VeryPDF's, mainly because it offers a fairly generous trial: 99 tries, with only a 5-page limitation in trial mode. Some of the other programs seemed to do a good job, but obfuscated the results with asterisks and stuff. I understand they need to do this to ensure sales of the full version, but it was very annoying and did not allow me to properly trial the program.

VeryPDF's PDF to Word software is very easy to use so I won't go into details - all you have to do is select the source PDF and name the target Word file and in a few minutes the job is finished. And here is the result (yes, this is a screenshot of the resulting Word file!):

I am impressed - I honestly wasn't expecting it to be that good! I expected some bits of the PDF to be converted to graphical elements in Word but they were not - every bit of the text is editable, as far as I can see. And the actual formatting is perfect. The only issue is the main font, where it didn't use a sans-serif font. I guess this is because the font was not a standard Windows one, and this little glitch may be fixable through the options, though I couldn't find anything like that.

Also, of course, the results would be a lot different if the document contained scanned text. I can't show you the results of this because I tried it on a confidential document, but you should know that this program does not appear to OCR text that is in the form of a bitmap.
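If you want to know in advance whether a PDF is likely to hit this problem, you can check whether its pages contain any selectable text at all. Here is a rough sketch in Python of how I would go about it. Note that this is my own illustration and has nothing to do with VeryPDF's software: the function names and the 20-character threshold are made up for the example, and the text extraction assumes the third-party pypdf package is installed.

```python
from typing import Iterable, List


def pages_without_text(page_texts: Iterable[str], min_chars: int = 20) -> List[int]:
    """Return 1-based numbers of pages whose extracted text is (nearly) empty.

    min_chars is an arbitrary threshold chosen for this sketch.
    """
    suspect = []
    for number, text in enumerate(page_texts, start=1):
        # Count only non-whitespace characters; a scanned page usually yields none.
        visible = sum(1 for ch in (text or "") if not ch.isspace())
        if visible < min_chars:
            suspect.append(number)
    return suspect


def extract_page_texts(pdf_path: str) -> List[str]:
    """Extract text page by page, assuming the third-party pypdf package."""
    from pypdf import PdfReader  # imported lazily so the check above works without it

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]
```

Calling pages_without_text(extract_page_texts("some-file.pdf")) with your own file gives you the pages that probably contain only a bitmap. Any page in that list would need to go through a separate OCR program before a converter like this one could do anything useful with it.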

All in all though, these are small niggles and I was most impressed with this program. I will have to seriously consider buying the full version as it could give me a huge competitive advantage to be able to supply the customer with a translated AND fully-formatted Word/PDF document. At only $35, the product would pay for itself very quickly.

