redgoose.ca

Perl and pre-filled PDFs

A client at work needed some PDF functionality to cut the time and effort wasted copying data from a webapp by hand onto a printed PDF form. Since we already had the necessary data in place, I figured it would be pretty easy to do with Perl, as I had done the same thing in the past using the Adobe FDF Toolkit and C#.

Thankfully, there are several Perl PDF libraries on CPAN that can help do the job. I checked out four of them with varying results.

PDF::Reuse
Looked like a popular option, but I failed miserably with this module. The interface made me cry and in the end I wasn't able to "reuse" a single existing PDF with it and just got cryptic errors instead.

CAM::PDF
This module worked well at filling text fields, but not much else. There's a hacky, undocumented interface for filling checkboxes and absolutely no support for radio buttons. I also ran into some annoying text positioning issues where the rendered text in each field would display a few pixels too low and hide surrounding elements of the document. Manually modifying the text field (e.g. entering a space) in the output PDF would reposition the text at the right height and reveal the obscured elements. I didn't find anyone having similar issues, so I'll give this module the benefit of the doubt and chalk this up to my setup or the PDFs I'm using (hopefully).
use CAM::PDF;
my $file = "/test/form.pdf";
my $pdf = CAM::PDF->new($file);

# get a list of all fields 
my @f = $pdf->getFormFieldList();

my %form = (
    name   => "John Doe",
    street => "20 Camden Street",
    gender => "Male", # oops, radios don't work
    agree  => "Yes",  # oops, checkboxes need a hack
);
$pdf->fillFormFields(%form);

# hacky way to fill checkbox
$pdf->getFormField("agree")->{value}->{value}->{AS}->{value} = 'Yes';

print $pdf->toPDF();

PDF::FDF::Simple
This module is a very simple alternative to using the full-blown Adobe FDF Toolkit. An FDF (Forms Data Format) file is just a standard file format containing your fields and their corresponding values, along with a link associating it with a PDF form. This module will build FDF files for you, but since the format is so simple, you can be a ninja and build them yourself as follows.
my $file = "http://www.mysite.com/form.pdf";
print qq|
    \%FDF-1.2

    1 0 obj
    <<
    /FDF << /Fields 2 0 R/F ($file)>>
    >>
    endobj
    2 0 obj
    [<< /T (name) /V (John Doe) >>
    << /T (street) /V (20 Camden Street) >>
    << /T (city) /V (Toronto) >>
    << /T (gender) /V (Male) >>
    << /T (agree) /V (Yes) >>
    ]
    endobj
    trailer
    <<
    /Root 1 0 R

    >>
    \%\%EOF
|;
This method supports text fields, checkboxes and radio buttons, and results in a perfect-looking PDF (unlike CAM::PDF).
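
One thing to watch when hand-building FDF: backslashes and parentheses are special inside PDF literal strings, so a value like "John Doe (Jr.)" should be escaped before it goes into a /V entry. Here is a minimal sketch of a builder that does this. The helper names fdf_escape and build_fdf are made up for illustration and are not part of PDF::FDF::Simple or any other module.

```perl
use strict;
use warnings;

# Escape the characters that are special inside a PDF literal string: \ ( )
sub fdf_escape {
    my ($value) = @_;
    $value =~ s/([\\()])/\\$1/g;
    return $value;
}

# Build an FDF document from the PDF location and a hash of field => value.
# (Hypothetical helper; field names must match those in your PDF form.)
sub build_fdf {
    my ($pdf_file, %fields) = @_;
    my $entries = join "\n",
        map { sprintf '<< /T (%s) /V (%s) >>', fdf_escape($_), fdf_escape($fields{$_}) }
        sort keys %fields;
    return join "\n",
        '%FDF-1.2',
        '1 0 obj',
        '<< /FDF << /Fields 2 0 R /F (' . fdf_escape($pdf_file) . ') >> >>',
        'endobj',
        '2 0 obj',
        "[\n$entries\n]",
        'endobj',
        'trailer',
        '<< /Root 1 0 R >>',
        '%%EOF',
        '';
}

print build_fdf(
    'http://www.mysite.com/form.pdf',
    name   => 'John Doe (Jr.)',
    street => '20 Camden Street',
);
```

Escaping every parenthesis is slightly more than strictly required (balanced pairs are allowed unescaped), but it is always safe and keeps the helper simple.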

Stopping here leaves your users with a confusing workflow: they first download an FDF file, which prompts them to download the PDF file, and only then are they presented with the merged PDF. Ideally, you'll want to do this merging server side with Pdftk so your users just have to download one final, merged PDF.

You also don't want to do client-side merging because FDF files aren't supported by all PDF readers (e.g. Preview on OS X), so you'd have to deal with compatibility issues.
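
For the server-side merge, Pdftk's fill_form operation takes the PDF form and the FDF file and writes a merged PDF. Adding flatten bakes the values into the page so the fields are no longer editable. The following is a rough sketch that assumes pdftk is installed and on the PATH. The file names and the helper names pdftk_fill_command and merge_fdf are placeholders.

```perl
use strict;
use warnings;

# Build the pdftk argument list for merging an FDF file into a PDF form.
# Pass flatten => 1 to make the filled-in fields non-editable.
sub pdftk_fill_command {
    my ($form_pdf, $fdf_file, $output_pdf, %opt) = @_;
    my @cmd = ('pdftk', $form_pdf, 'fill_form', $fdf_file, 'output', $output_pdf);
    push @cmd, 'flatten' if $opt{flatten};
    return @cmd;
}

# Run the merge. The list form of system() bypasses the shell, so file
# names containing spaces or quotes are passed through untouched.
sub merge_fdf {
    my @cmd = pdftk_fill_command(@_);
    system(@cmd) == 0
        or die "pdftk failed with exit status " . ($? >> 8) . "\n";
    return 1;
}

# Show the command that would be run (placeholder file names).
print join(' ', pdftk_fill_command('form.pdf', 'data.fdf', 'merged.pdf', flatten => 1)), "\n";
```

Calling merge_fdf('form.pdf', 'data.fdf', 'merged.pdf', flatten => 1) would then produce the single file your users download.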

PDF::API2
PDF::API2 is a feature-packed module that lets you do some awesome stuff. Unfortunately, filling form fields isn't part of it. However, what you can do is render text anywhere within an existing, "dumb" PDF. This saves you the trouble of setting up a proper PDF form. An obvious drawback, of course, is that the text isn't going to be modifiable by the user.
use PDF::API2;
my $file = "/test/form.pdf";
my $pdf = PDF::API2->open($file);

my $pdf_page = $pdf->openpage(1);
my $pdf_text = $pdf_page->text();

# set font
my $pdf_font = $pdf->corefont('Helvetica');
$pdf_text->font($pdf_font, 12);
	
# draw some text
$pdf_text->translate(200,400);
my $text = "John Doe";
$pdf_text->text($text);

# look ma! i done a checkbox
$pdf_text->translate(100,200);
$text = "X"; # reuse $text; a second "my" would redeclare it
$pdf_text->text($text);

print $pdf->stringify();
$pdf->end();
Rendering requires you to get acquainted with PDF-style coordinates, where 0,0 is the bottom left corner rather than the top left. An easy way to find coordinates for placing text is to open the PDF in Photoshop with no cropping and a 72dpi setting (at 72dpi, one pixel equals one PDF point). Then show the Rulers and drag the ruler origin to the bottom left of the PDF. Show the Info panel and you'll get x,y coordinates.
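
If your tool only reports positions measured from the top left, flipping the y value is a one-liner. This is a small sketch, assuming a US Letter page (792 points tall). The helper name to_pdf_coords is made up for this example.

```perl
use strict;
use warnings;

# Convert a point measured from the top-left corner (as most image
# editors report it) into PDF coordinates measured from the bottom-left.
# All values are in points (1/72 inch).
sub to_pdf_coords {
    my ($x, $y_from_top, $page_height) = @_;
    return ($x, $page_height - $y_from_top);
}

my $page_height = 792;    # assumption: US Letter, 8.5 x 11 in = 612 x 792 pt
my ($x, $y) = to_pdf_coords(200, 392, $page_height);
print "translate($x,$y)\n";    # prints "translate(200,400)"
```

The resulting pair is what you would pass to translate() before drawing the text.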

In my case, all I needed to do was pre-fill some text, check some checkboxes and pre-select radio buttons. The PDF I was given wasn't initially set up as a PDF form, so depending on which module/method I chose, I would have needed to spend some time setting up interactive form fields within the PDF using Adobe Acrobat. If you don't have access to fully fledged PDF creation software, PDF::API2 text rendering would be your only option.

In the end I went with PDF::API2, as the client just wanted something simple and didn't require the text to be modifiable after the fact. Going forward, I'd use either PDF::API2 or PDF::FDF::Simple depending on the requirements.

Comments

Oh boy, you saved my day! Very terse and exhaustive survey on Perl tools for PDF Forms.
The checkbox trick with CAM::PDF was very useful; you saved me from doing some analysis with Data::Dumper or something like it to sort that out. The only part you missed is flattening filled files.
My best compliments.

Casual Googler - Tuesday March 27, 2012

First, thanks for the tips. I am struggling a bit with using CAM::PDF and was hoping you had encountered this. When I return the list of field names, the names contain hexadecimal nulls and some other non-ASCII characters. Is there some way to correct this? I am not able to update any form fields.

Josh - Tuesday June 12, 2012

Assuming you’re doing things right, my guess is that the PDF you’re accessing isn’t in the proper state for pre-filling.

Have you tried creating your own PDF with a few dummy form fields and accessing that file?

Tariq - Wednesday June 13, 2012

I’m going with creating the FDF file. Do you know of a simple reference?

I’ve got a simple (one page, 3 fields) form working. I’d like the option to create multiple pages of the same form (different data) in a single document.

John - Monday May 20, 2013

PDF::Reuse is my favorite module for reusing PDFs. Your PDF forms need to be PDF version 1.5 or lower. PDF::Reuse also creates a helpful error.log.

Matthias - Monday August 12, 2013

Hello

My name is Tariq. I am a twenty something website developer based in Toronto, Canada, working at kanetix doing what I love to do. Yeppers, I like turtles and get on (the TTC) daily.
