Developers Club geek daily blog

3 years, 2 months ago
When working on a web project, you sometimes need to generate PDF files that contain large tables, such as price lists with thousands of items. Several libraries can generate a PDF file from a PHP script:

• FPDF
• MPDF (based on the FPDF library; generates a PDF file from HTML code)
• DOMPDF
• TCPDF

and many others. For our purposes MPDF was the most powerful and suitable of these, and it handled Cyrillic text correctly out of the box, but it had one critical shortcoming: large tables, and large files in general, were generated extremely slowly. Moreover, generation often failed entirely and the script stopped with a 504 error.

Further searching turned up the wkhtmltopdf program. Project site: http://wkhtmltopdf.org.

Unlike the PHP libraries, it is a standalone program that runs on the server. It is distributed as packages and as executables for Linux, Windows, and other operating systems. The program accepts HTML (as a web address, a file path, or a string of HTML code) and generates a PDF file from it on the server.

An initial test showed that on a local XAMPP server under Windows, a huge HTML table of 300-500 pages is converted to a PDF file in 1-2 seconds!

Installing wkhtmltopdf on CentOS 6
The program needs WebKit and Qt in order to run.

So, let's install the required environment and the program on the server. Our server runs CentOS 6. Log in to the server as root and run the following commands.

Download the RPM package of wkhtmltopdf using the link from the developer's site and install it on the production server:

wget http://download.gna.org/wkhtmltopdf/0.12/0.12.2.1/wkhtmltox-0.12.2.1_linux-centos6-i386.rpm
yum --nogpgcheck localinstall wkhtmltox-0.12.2.1_linux-centos6-i386.rpm

All package dependencies should be checked and resolved automatically. If for some reason the environment was not installed, use these commands:

yum install urw-fonts libXext openssl-devel libXrender
yum install xorg-x11-fonts-cyrillic.noarch xorg-x11-fonts-misc.noarch xorg-x11-fonts-truetype.noarch xorg-x11-fonts-100dpi.noarch xorg-x11-fonts-75dpi.noarch fonts-ISO8859-2.noarch fonts-ISO8859-2-100dpi.noarch fonts-ISO8859-2-75dpi.noarch freefont.noarch

Until recently the program was not distributed as an RPM package, so you had to copy the binary file and install all the required packages manually.
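
After installation, it can be useful to confirm that the binary is actually reachable before wiring it into a web project. The following is a minimal sketch; check_wkhtmltopdf is a hypothetical helper name, and it assumes the package installed the program under the name wkhtmltopdf on the PATH.

```shell
# Report whether wkhtmltopdf is available and print its version.
# check_wkhtmltopdf is a hypothetical helper name, not part of the program.
check_wkhtmltopdf() {
    if ! command -v wkhtmltopdf >/dev/null 2>&1; then
        echo "wkhtmltopdf not found on PATH" >&2
        return 1
    fi
    # --version prints the program name and version number
    wkhtmltopdf --version
}
```

Calling check_wkhtmltopdf right after the yum step shows either the installed version or an error message on stderr.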

Using wkhtmltopdf on CentOS 6
The general form of the command is:

wkhtmltopdf <path and name of the source file.html> <path and name of the output file.pdf>
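
The source does not have to be a file on disk. As a rough sketch, an HTML string can be piped to the program by passing - as the input argument, which makes it read from standard input. The helper name html_string_to_pdf and the example paths are assumptions made for illustration.

```shell
# Convert an HTML string to PDF by piping it to wkhtmltopdf.
# "-" as the input argument means "read the HTML from standard input".
# html_string_to_pdf is a hypothetical helper name.
html_string_to_pdf() {
    html=$1
    out=$2
    printf '%s' "$html" | wkhtmltopdf - "$out"
}

# A web address goes in the same position as a file path, for example:
#   wkhtmltopdf http://wkhtmltopdf.org /tmp/site.pdf
```

For example, html_string_to_pdf '<h1>Price list</h1>' /tmp/pricelist.pdf would render a one-line document.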

In addition, the program can automatically insert a header and a footer into the document from separate HTML files. The syntax is:

wkhtmltopdf --header-html <path and name of the header.html> --footer-html <path and name of the footer.html> <path and name of the source file.html> <path and name of the output file.pdf>

The command-line options also let you set the margins of the resulting PDF file. The program places the header and the footer in the top and bottom margins:

wkhtmltopdf --margin-top 35mm --margin-bottom 27mm --margin-left 10mm --margin-right 10mm --header-html <path and name of the header.html> --footer-html <path and name of the footer.html> <path and name of the source file.html> <path and name of the output file.pdf>
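
Because this command line gets long, one option is to wrap it in a small shell function. The sketch below uses the margin values from the command above; the function name render_with_margins and its argument order are hypothetical, and the readability check is an added precaution rather than something the program requires.

```shell
# Render a PDF with fixed margins plus a header and a footer.
# render_with_margins and its argument order are hypothetical.
render_with_margins() {
    header=$1 footer=$2 src=$3 out=$4
    # Fail early if an input file is missing, so wkhtmltopdf is never
    # started with a bad path.
    for f in "$header" "$footer" "$src"; do
        [ -r "$f" ] || { echo "cannot read $f" >&2; return 1; }
    done
    wkhtmltopdf \
        --margin-top 35mm --margin-bottom 27mm \
        --margin-left 10mm --margin-right 10mm \
        --header-html "$header" --footer-html "$footer" \
        "$src" "$out"
}
```

Quoting every path keeps file names with spaces intact when they reach the program.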

In this example:
• top margin: 35 mm
• bottom margin: 27 mm
• left and right margins: 10 mm each

Here is a sample of the footer code as well. In our case the page number is generated automatically and substituted into the footer, so the pages of the document are numbered automatically:

<html><head><script>
function subst() {
  var vars = {};
  var x = document.location.search.substring(1).split('&');
  for (var i in x) { var z = x[i].split('=', 2); vars[z[0]] = unescape(z[1]); }
  var x = ['frompage','topage','page','webpage','section','subsection','subsubsection'];
  for (var i in x) {
    var y = document.getElementsByClassName(x[i]);
    for (var j = 0; j < y.length; ++j) y[j].textContent = vars[x[i]];
  }
}
</script></head>
<body style="border:0; margin: 0;" onLoad="subst()">
<div align="right" style="font-family:'Times New Roman', Times, serif; font-size: 14px;">
/<span class="page"></span>/
</div>
</body></html>
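
The footer works because wkhtmltopdf passes values such as page and topage to the footer file in the URL query string, and the script copies them into elements with the matching class. As an illustration only, the sketch below writes a variant of the footer that prints "Page N of M". The helper name write_footer, the wording, and the styling are assumptions, not the author's template.

```shell
# Write a footer template that prints "Page N of M".
# write_footer is a hypothetical helper name; $1 is the file to create.
write_footer() {
    cat > "$1" <<'EOF'
<!DOCTYPE html>
<html><head><script>
function subst() {
  var vars = {};
  var pairs = document.location.search.substring(1).split('&');
  for (var i = 0; i < pairs.length; i++) {
    var kv = pairs[i].split('=', 2);
    vars[kv[0]] = decodeURIComponent(kv[1] || '');
  }
  var names = ['page', 'topage'];
  for (var n = 0; n < names.length; n++) {
    var nodes = document.getElementsByClassName(names[n]);
    for (var j = 0; j < nodes.length; j++) nodes[j].textContent = vars[names[n]];
  }
}
</script></head>
<body style="border:0; margin:0;" onload="subst()">
<div style="text-align:right; font-size:14px;">
  Page <span class="page"></span> of <span class="topage"></span>
</div>
</body></html>
EOF
}
```

The generated file can then be passed to the program with --footer-html.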

Other useful command-line options include:

--encoding: the encoding of the source HTML file, for example:
--encoding windows-1251

--page-size: the page format, for example:
--page-size A4

--orientation: the page orientation, for example:
--orientation Landscape
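
These options can be combined in a single call. A minimal sketch follows, assuming a hypothetical wrapper named render_with_options whose defaults (utf-8, A4, Portrait) are illustrative choices and not taken from the article. It rejects orientation values other than Portrait and Landscape before starting the program.

```shell
# Run a conversion with explicit encoding, page size and orientation.
# render_with_options and its default values are hypothetical.
render_with_options() {
    src=$1 out=$2
    encoding=${3:-utf-8}
    page_size=${4:-A4}
    orientation=${5:-Portrait}
    # Only the two orientation values are accepted here
    case $orientation in
        Portrait|Landscape) ;;
        *) echo "orientation must be Portrait or Landscape" >&2; return 2 ;;
    esac
    wkhtmltopdf --encoding "$encoding" --page-size "$page_size" \
        --orientation "$orientation" "$src" "$out"
}
```

For example, render_with_options pricelist.html pricelist.pdf windows-1251 A4 Landscape reproduces the three options shown above.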

In our web project, the PHP page that creates the PDF file uses the following code:
// One base name for the temporary HTML and PDF files
$tmp = time();
$html = ABSPATH.'/tmp/'.$tmp.'.html';
$pdf = ABSPATH.'/tmp/'.$tmp.'.pdf';

// Write the price list HTML to the temporary file
$f = fopen($html, 'w');
fputs($f, $llg);
fclose($f);

// Change to the temporary directory (exec('cd ...') would only affect a child shell)
chdir(ABSPATH.'/tmp');

// Build the command, escaping every path passed to the shell
$command = 'wkhtmltopdf-i386 --margin-top 35mm --margin-bottom 27mm --margin-left 10mm --margin-right 10mm'
    .' --footer-html '.escapeshellarg(ABSPATH.'/tpl-sm/pl_pdf/pdf_footer.html')
    .' --header-html '.escapeshellarg(ABSPATH.'/tpl-sm/pl_pdf/pdf_header.html')
    .' '.escapeshellarg($html).' '.escapeshellarg($pdf);
exec($command);

// Send the PDF to the browser only if it was actually created
if (file_exists($pdf)) {
    header('Content-type: application/pdf');
    header('Content-Disposition: attachment; filename="pricelist.pdf"');
    readfile($pdf);
    unlink($pdf);
}
unlink($html);

In this code:
• the $llg variable contains the HTML code of the price list;
• the ABSPATH constant is the absolute path to the web project folder on the server.
The code does the following:
• Writes the price list HTML code to a temporary file;
• Changes to the temporary directory;
• Runs wkhtmltopdf with the required options;
• If the PDF file was created successfully, returns it to the user's browser, offering it for download under the name pricelist.pdf;
• Deletes the temporary HTML and PDF files from the temporary directory.
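
The same sequence of steps can also be sketched as a shell function, for instance for generating the file from a scheduled job. This is only an illustration under stated assumptions: make_pricelist_pdf is a hypothetical name, the header, footer, and margin options are left out for brevity, and mktemp -d is used instead of a timestamp so that two runs in the same second do not share file names.

```shell
# Generate a PDF from an HTML string in a private temporary directory
# and print the PDF to standard output.
# make_pricelist_pdf is a hypothetical helper name.
make_pricelist_pdf() {
    html=$1
    # Each call gets its own directory, so file names cannot collide.
    workdir=$(mktemp -d) || return 1
    printf '%s' "$html" > "$workdir/pricelist.html"
    status=0
    # Treat a missing or empty output file as a failure.
    if wkhtmltopdf "$workdir/pricelist.html" "$workdir/pricelist.pdf" >/dev/null \
        && [ -s "$workdir/pricelist.pdf" ]; then
        cat "$workdir/pricelist.pdf"
    else
        echo "PDF generation failed" >&2
        status=1
    fi
    # Remove the temporary files whether or not generation succeeded.
    rm -rf "$workdir"
    return "$status"
}
```

A call such as make_pricelist_pdf "$(cat pricelist.html)" > pricelist.pdf writes the result to a file and leaves nothing behind in the temporary directory.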

This article is a translation of the original post at habrahabr.ru/post/266571/
