2 years, 11 months ago
Almost every developer who builds information systems has to produce various reports and printable forms. This is also true of most applications built on our platform. For example, the system I am working on now has 264 of them. To avoid writing report-generation logic from scratch every time, we developed a dedicated library (why the existing ones did not suit us is explained below). It is called YARG: Yet Another Report Generator.
YARG allows you to:
  • generate a report in the format of its template or convert the result to PDF;
  • create report templates in common, widely used formats: DOC, ODT, XLS, DOCX, XLSX, HTML;
  • create complex XLS and XLSX templates with nested data areas, charts, formulas, and so on;
  • use images and HTML markup in reports;
  • store the report structure in XML format;
  • run a standalone application for report generation, which makes the library usable outside the Java ecosystem (for example, to generate reports from PHP);
  • integrate with IoC frameworks (Spring, Guice).

This library is used in the CUBA platform as the basis of its reporting engine. We have been developing it since 2010, but only recently decided to open it up, and have published its code on GitHub under the Apache 2.0 license.
This article is meant to draw the community's attention to it.


The library is built on a simple idea: separating data retrieval from data presentation in the finished report (data layer and presentation layer). Data retrieval is described by various scripts, while data presentation is configured directly in the document templates. As a result, no special tools are needed to create a template; having Open Office or Microsoft Office at hand is enough.

A report consists of so-called bands. A band is both a data set and an area in the template where that data is displayed, so it connects the data layer and the presentation layer.
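To make the idea of a band more concrete, here is a minimal sketch in plain Java. It does not use YARG's classes: BandSketch, its render method, the ${alias} placeholder syntax used here, and the sample rows are all assumptions made up for illustration. It only shows the principle that one named unit carries the rows and decides how often the template area is repeated.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: these are not YARG classes.
class BandSketch {
    final String name;                     // would also name the area in the template
    final List<Map<String, Object>> rows;  // the data set: one map per row

    BandSketch(String name, List<Map<String, Object>> rows) {
        this.name = name;
        this.rows = rows;
    }

    // Repeats the template area once per row, replacing each ${alias}
    // with the value stored under that alias in the row.
    List<String> render(String templateArea) {
        List<String> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            String line = templateArea;
            for (Map.Entry<String, Object> e : row.entrySet()) {
                line = line.replace("${" + e.getKey() + "}", String.valueOf(e.getValue()));
            }
            out.add(line);
        }
        return out;
    }

    // Helper that builds one made-up staff row.
    static Map<String, Object> row(String name, String surname, String position) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("name", name);
        r.put("surname", surname);
        r.put("position", position);
        return r;
    }

    public static void main(String[] args) {
        BandSketch staff = new BandSketch("Staff", List.of(
                row("Ann", "Lee", "Manager"),
                row("Bob", "Ray", "Developer")));
        staff.render("${name} ${surname} | ${position}").forEach(System.out::println);
    }
}
```

In the real library the template area is a region of an office document rather than a string, but the division of responsibilities is the same.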

Let's start with a Hello World style example.

A very simple example


Suppose we have a company and need to output a list of all its employees along with each employee's position.
We create a report band named Staff and specify that its data is loaded by the SQL query
select name, surname, position from staff

Java code
// Describe the template: where it lives and which output format to produce.
ReportBuilder reportBuilder = new ReportBuilder();
ReportTemplateBuilder reportTemplateBuilder = new ReportTemplateBuilder()
        .documentPath("/home/haulmont/templates/staff.xls")
        .documentName("staff.xls")
        .outputType(ReportOutputType.xls)
        .readFileFromPath();
reportBuilder.template(reportTemplateBuilder.build());

// Describe the Staff band and the SQL query that loads its data.
BandBuilder bandBuilder = new BandBuilder();
ReportBand staff = bandBuilder.name("Staff")
        .query("Staff", "select name, surname, position from staff", "sql")
        .build();
reportBuilder.band(staff);
Report report = reportBuilder.build();

// Configure the engine; "datasource" is assumed to be an already configured data source.
Reporting reporting = new Reporting();
reporting.setFormatterFactory(new DefaultFormatterFactory());
reporting.setLoaderFactory(
        new DefaultLoaderFactory().setSqlDataLoader(new SqlDataLoader(datasource)));

// Run the report and write the result to a file.
ReportOutputDocument reportOutputDocument = reporting.runReport(
        new RunParams(report), new FileOutputStream("/home/haulmont/reports/staff.xls"));

Next, we create an XLS template in which we mark a named region called Staff and place the aliases in its cells.
[Image: XLS template with the Staff named region]
More complex examples are covered below.

A little history


A few years ago we needed to produce reports in bulk for one of our projects. The reports had to be created in XLS and DOC formats, and the results had to be convertible from DOC and XLS to PDF. We needed a library that:
  1. allowed ordinary users to create reports (or at least report templates);
  2. supported loading data from different sources;
  3. supported different template formats (XLS, DOC, HTML);
  4. supported converting reports to PDF;
  5. was extensible (made it quick to add new ways of loading data and new template formats);
  6. could be easily embedded in different IoC containers.

At first we tried JasperReports. However, first, it cannot create DOC reports (there is a paid library for that); second, its capabilities for generating XLS reports are severely limited (you cannot use charts, formulas, or cell formats); and third, creating templates requires a certain skill and special tools, and describing data loading requires writing Java code. There were also many libraries focused on one specific format, but we did not find a universal one.
So we decided to create a mechanism that would let us describe reports uniformly, regardless of the template type and the way data is loaded.

First steps


There were already many libraries for working with XLS at that time (POI-HSSF, JXLS, etc.), and we decided to use Apache POI as the most popular one. For DOC files there was no such variety. The options were very few: use the UNO Runtime, an API for integrating with an Open Office server, or work with DOC files through COM objects. The POI-HWPF project was in its infancy back then (and has not come far since). We chose integration with Open Office because we had seen many positive reports from people who had integrated with it successfully from entirely different languages (Python, Ruby, C#).
While everything was more or less simple with POI-HSSF (apart from the complete inability to work with charts), UNO Runtime brought us a whole set of problems.
  1. There is no clear API for working with tables. For example, to copy a table row you have to use the system clipboard (select the row, copy it, and paste it in the right place).
  2. An Open Office process is spawned for each report generation (and destroyed after printing). Initially we used the bootstrapconnector library to spawn processes, but we soon found that in many cases it leaves the process alive (in a hung state) or does not try to terminate it at all, which brought the system down after a while. We had to rewrite the logic for starting and killing Open Office processes, borrowing from the work of the people who wrote jodconverter.
  3. UNO Runtime (and the Open Office server) has thread-safety problems, so under load the server can hang or stop suddenly because of an internal error. As a result, we had to build a mechanism for restarting reports (if a report failed to print, try printing it again). This, naturally, affects the speed of this type of report.
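The restart mechanism from the last point can be sketched roughly as follows. This is a generic retry loop under stated assumptions, not the code used in YARG: RetrySketch, withRetries, and the simulated failure are hypothetical names and behavior chosen for illustration.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration only: a generic retry helper, not YARG code.
class RetrySketch {
    // Runs the task up to maxAttempts times and returns the first successful result.
    // If every attempt fails, the exception from the last attempt is rethrown.
    static <T> T withRetries(int maxAttempts, Callable<T> task) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated report generation that fails on the first call only.
        String result = withRetries(3, () -> {
            if (++calls[0] < 2) {
                throw new IllegalStateException("office process hung");
            }
            return "report.doc";
        });
        System.out.println(result + " generated, attempts: " + calls[0]);
    }
}
```

Every retry repeats the whole generation, which is why this approach costs speed, as noted above.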


Docx4j


For a long time we used only XLS and DOC templates, but then we decided to support XLSX and DOCX as well. The choice fell on the DOCX4J library, which had become popular by then.
An important advantage of this library for us was that it provides low-level access to the document structure (in effect, you operate on XML). On the one hand, this made the code and logic somewhat more complicated; on the other, it opened up almost unlimited possibilities for manipulating the document, since any operation on it was now possible.
An even more significant advantage was that we no longer needed to start Open Office to generate DOCX reports.

A more complex example


Suppose we own bookstores. Let's use our library to build a report that shows, in XLS, a list of shops and the list of books sold in each shop.
Suppose also that we (the shop owners) do not know the Java programming language at all, but luckily our system administrator is familiar with SQL, and we even have a database containing information on all sales.
First of all, let's create the report template in XLS format. We mark the report bands right away using named regions.
[Image: XLS template with the report bands marked as named regions]
Then we describe data loading using SQL.

select shop.id as "id", shop.name as "name", shop.address as "address" 
from store shop

select book.author as "author", book.name as "name", book.price as "price",  count(*) as "count" 
from book book where book.store_id = ${Shop.id} 
group by book.author, book.name, book.price

Now we have to describe the report in XML.
<?xml version="1.0" encoding="UTF-8"?>
<report name="report">
    <templates>
        <template code="DEFAULT" documentName="bookstore.xls" documentPath="./test/sample/bookstore/bookstore.xls" outputType="xls" outputNamePattern="bookstore.xls"/>
    </templates>
    <rootBand name="Root" orientation="H">
        <bands>
            <band name="Header" orientation="H"/>
            <band name="Shop" orientation="H">
                <bands>
                    <band name="Book" orientation="H">
                        <queries>
                            <query name="Book" type="sql">
                                <script>
                                    select book.author as "author", book.name as "name", book.price as "price",  count(*) as "count" from book  where book.store_id = ${Shop.id} group by book.author, book.name, book.price
                                </script>
                            </query>
                        </queries>
                    </band>
                </bands>
                <queries>
                    <query name="Shop" type="sql">
                        <script>
                            select shop.id as "id", shop.name as "name", shop.address as "address" from store shop
                        </script>
                    </query>
                </queries>
            </band>
        </bands>
        <queries/>
    </rootBand>
</report>

Running the report from the command line produces the following document.
[Image: generated XLS report listing the shops and the books sold in each]
The report shows that one band can refer to another. The Book band refers to the Shop band, so for each shop we select the list of books sold in it. The Book band is nested inside Shop.

One more example
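One plausible way to picture how a child band's query picks up a value from its parent row is shown below. This is a sketch under stated assumptions, not a description of YARG's internals: NestedBandSketch and bind are hypothetical names, and turning ${Shop.id} into a positional '?' parameter is simply one safe way to do the substitution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: this is not how YARG is implemented internally.
class NestedBandSketch {
    // Matches references of the form ${Band.field}.
    private static final Pattern PARAM = Pattern.compile("\\$\\{(\\w+)\\.(\\w+)\\}");

    // Replaces every ${Band.field} in the query with a '?' placeholder and
    // appends the matching values from the parent row to paramsOut, in order.
    static String bind(String query, String parentBand, Map<String, Object> parentRow,
                       List<Object> paramsOut) {
        Matcher m = PARAM.matcher(query);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            if (!m.group(1).equals(parentBand) || !parentRow.containsKey(m.group(2))) {
                throw new IllegalArgumentException("Unknown parameter: " + m.group());
            }
            paramsOut.add(parentRow.get(m.group(2)));
            m.appendReplacement(sb, "?");
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // One made-up row of the parent Shop band.
        Map<String, Object> shopRow = Map.of("id", 7);
        List<Object> params = new ArrayList<>();
        String sql = bind(
                "select book.name from book where book.store_id = ${Shop.id}",
                "Shop", shopRow, params);
        System.out.println(sql + "  params=" + params);
    }
}
```

The child query is bound once per parent row, which is what makes the Book band repeat inside each Shop.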


Now suppose our shop has landed a large order and we need to issue an invoice to the customer. Let's create a report that uses a DOCX document as the template and converts the result to PDF. For a change, we will describe data loading with a Groovy script.
<?xml version="1.0" encoding="UTF-8"?>
<report name="report">
    <templates>
        <template code="DEFAULT" documentName="invoice.docx" documentPath="./test/sample/invoice/invoice.docx" outputType="pdf" outputNamePattern="invoice.pdf"/>
    </templates>
    <formats>
        <format name="Main.date" format="dd/MM/yyyy"/>
        <format name="Main.signature" format="${html}"/>
    </formats>
    <rootBand name="Root" orientation="H">
        <bands>
            <band name="Main" orientation="H">
                <queries>
                    <query name="Main" type="groovy">
                        <script>
                            return [
                              [
                               'invoiceNumber':99987,
                               'client' : 'Google Inc.',
                               'date' : new Date(),
                               'addLine1': '1600 Amphitheatre Pkwy',
                               'addLine2': 'Mountain View, USA',
                               'addLine3':'CA 94043',
                               'signature':<![CDATA['<html><body><b><font color="red">Mr. Yarg</font></b></body></html>']]>
                            ]]
                        </script>
                    </query>
                </queries>
            </band>
            <band name="Items" orientation="H">
                <queries>
                    <query name="Main" type="groovy">
                        <script>
                            return [
                                ['name':'Java Concurrency in practice', 'price' : 15000],
                                ['name':'Clear code', 'price' : 13000],
                                ['name':'Scala in action', 'price' : 12000]
                            ]
                        </script>
                    </query>
                </queries>
            </band>
        </bands>
        <queries/>
    </rootBand>
</report>

Note that the Groovy script returns a list of associative arrays as its result (more precisely, List<Map<String, Object>>). Each element of the list thus represents a row of named data (the key is the parameter name, the value is the parameter value).
Now let's create the invoice template. In the table at the top we place the client's name and address, as well as the invoice date.
Next we create a table with the list of goods being invoiced. To bind this second table to the list of goods, we insert a special marker (##band=Items) into its first cell.
[Image: DOCX invoice template]
Running the report gives the following.
[Image: generated PDF invoice]

Integration and extensibility
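For readers more at home in Java than in Groovy, the same row structure can be written out explicitly. This is only a sketch of the data shape; ItemsDataSketch and its methods are hypothetical names, while the values are the ones from the Items band script above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: shows the List<Map<String, Object>> shape in Java.
class ItemsDataSketch {
    // Builds one row: each key is a parameter name, each value is that parameter's value.
    static Map<String, Object> item(String name, int price) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", name);
        row.put("price", price);
        return row;
    }

    // The same rows as the Groovy script of the Items band.
    static List<Map<String, Object>> loadItems() {
        return List.of(
                item("Java Concurrency in practice", 15000),
                item("Clear code", 13000),
                item("Scala in action", 12000));
    }

    public static void main(String[] args) {
        for (Map<String, Object> row : loadItems()) {
            System.out.println(row.get("name") + ": " + row.get("price"));
        }
    }
}
```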


The library was designed from the start to be extended and integrated into different applications. The use of YARG in the CUBA platform is one example of such integration. We use Spring as the IoC framework. Let's see how YARG can be wired into Spring.

<bean id="reporting_lib_Scripting" class="com.haulmont.reports.libintegration.ReportingScriptingImpl"/>
<bean id="reporting_lib_GroovyDataLoader" class="com.haulmont.yarg.loaders.impl.GroovyDataLoader">
<constructor-arg ref="reporting_lib_Scripting"/>
</bean>
<bean id="reporting_lib_SqlDataLoader" class="com.haulmont.yarg.loaders.impl.SqlDataLoader">
<constructor-arg ref="dataSource"/>
</bean>
<bean id="reporting_lib_JpqlDataLoader" class="com.haulmont.reports.libintegration.JpqlDataDataLoader"/>
<bean id="reporting_lib_OfficeIntegration"
      class="com.haulmont.reports.libintegration.CubaOfficeIntegration">
<constructor-arg value="${cuba.reporting.openoffice.path?:/}"/>
<constructor-arg>
    <list>
        <value>8100</value>
        <value>8101</value>
        <value>8102</value>
        <value>8103</value>
    </list>
</constructor-arg>
<property name="displayDeviceAvailable">
    <value>${cuba.reporting.displayDeviceAvailable?:false}</value>
</property>
<property name="timeoutInSeconds">
    <value>${cuba.reporting.openoffice.docFormatterTimeout?:20}</value>
</property>
</bean>
<bean id="reporting_lib_FormatterFactory"
      class="com.haulmont.yarg.formatters.factory.DefaultFormatterFactory">
<property name="officeIntegration" ref="reporting_lib_OfficeIntegration"/>
</bean>
<bean id="reporting_lib_LoaderFactory" class="com.haulmont.yarg.loaders.factory.DefaultLoaderFactory">
<property name="dataLoaders">
    <map>
        <entry key="sql" value-ref="reporting_lib_SqlDataLoader"/>
        <entry key="groovy" value-ref="reporting_lib_GroovyDataLoader"/>
        <entry key="jpql" value-ref="reporting_lib_JpqlDataLoader"/>
    </map>
</property>
</bean>
<bean id="reporting_lib_Reporting" class="com.haulmont.yarg.reporting.Reporting">
<property name="formatterFactory" ref="reporting_lib_FormatterFactory"/>
<property name="loaderFactory" ref="reporting_lib_LoaderFactory"/>
</bean>

The main bean in this configuration is reporting_lib_Reporting. It provides access to the library's core functionality: creating reports. For it to work properly, you need to define a formatter factory (formatters work with the different document types: DOCX, XLSX, DOC, etc.) and a loader factory (loaders load the data). Also, if you are going to use DOC reports, you need to define the reporting_lib_OfficeIntegration bean, which is responsible for integration with Open Office (used to process DOC and ODT reports).
Note that adding, for example, a new loader does not require overriding any library classes; it is enough to add it to the dataLoaders property of the reporting_lib_LoaderFactory bean. This is essentially what we did when we added the JPQL data loader (<entry key="jpql" value-ref="reporting_lib_JpqlDataLoader"/>).
For more serious changes you can subclass the library classes or create your own from scratch by implementing the provided interfaces. Practically all of the library's functionality is wired through interfaces and is easy to extend.
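The extension idea behind the dataLoaders map can be sketched in plain Java. All names here (LoaderFactorySketch, DataLoaderSketch, the "constant" loader type) are hypothetical and do not reproduce YARG's actual interfaces; the sketch only shows why a map keyed by query type makes adding a loader a matter of registration rather than subclassing.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: mirrors the idea of the dataLoaders map,
// not the actual YARG interfaces.
class LoaderFactorySketch {
    private final Map<String, DataLoaderSketch> loaders = new HashMap<>();

    // Registering a loader under a new key is all it takes to support a new query type.
    LoaderFactorySketch register(String type, DataLoaderSketch loader) {
        loaders.put(type, loader);
        return this;
    }

    // Looks up the loader for a query type, failing loudly if none is registered.
    DataLoaderSketch forType(String type) {
        DataLoaderSketch loader = loaders.get(type);
        if (loader == null) {
            throw new IllegalArgumentException("No loader registered for type: " + type);
        }
        return loader;
    }

    public static void main(String[] args) {
        LoaderFactorySketch factory = new LoaderFactorySketch()
                // A stub loader that ignores the script and returns one fixed row.
                .register("constant", (script, params) -> List.of(Map.of("answer", 42)));
        System.out.println(factory.forType("constant").load("ignored", Map.of()));
    }
}

// A loader turns a script plus parameters into rows of named values.
interface DataLoaderSketch {
    List<Map<String, Object>> load(String script, Map<String, Object> params);
}
```

The Spring configuration above plays the role of the register calls: each entry in the map adds one query type.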

Standalone mode


One more feature of the YARG library is that it can be used as a standalone application for report generation. With a JRE installed on the computer, you can generate reports from the command line. For example, suppose you have a server application in PHP and want to generate XLS reports. All you need to do is create an XLS template and an XML description of the report, after which you can generate the report with a simple console command.
Example command:
yarg -rp ~/report.xml -op ~/result.xls "-Pparam1=20/04/2014"
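A calling application only has to assemble this command line and start the process. The sketch below does so in Java to stay consistent with the rest of the article; the same steps apply in PHP or any other language. YargCommandSketch and build are hypothetical names, only the flags shown in the command above (-rp, -op, -P) are used, and it is assumed that a launcher named yarg is available on the PATH.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: assembles the console command shown above.
class YargCommandSketch {
    // Builds the argument list: report description, output file, then one
    // -P<name>=<value> argument per report parameter.
    static List<String> build(String reportXml, String outputFile, Map<String, String> params) {
        List<String> cmd = new ArrayList<>();
        cmd.add("yarg");
        cmd.add("-rp");
        cmd.add(reportXml);
        cmd.add("-op");
        cmd.add(outputFile);
        for (Map.Entry<String, String> p : params.entrySet()) {
            cmd.add("-P" + p.getKey() + "=" + p.getValue());
        }
        return cmd;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("param1", "20/04/2014");
        List<String> cmd = build("report.xml", "result.xls", params);
        System.out.println(String.join(" ", cmd));
        // To actually run it (assumes a "yarg" launcher is on the PATH):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing the arguments as a list avoids shell quoting issues; note that a process started this way does not expand ~, so the sketch uses plain paths.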


Conclusion


To conclude, here are some screenshots of the UI that the CUBA platform provides for creating reports on the YARG engine.

Fragments of the report editor
[Images: two screenshots of the report editor]

Report creation wizard
[Images: four screenshots of the report creation wizard]

And an example of a report with charts:

Report template with a diagram and a chart:
[Image: report template]

The finished report with the diagram and the chart:
[Image: generated report]

This article is a translation of the original post at habrahabr.ru/post/224125/
