Chapter 2. Burst Excel Reports

2.1. Overview
2.2. Burst by Distinct Sheets
2.3. Burst by Distinct Column Values
2.3.1. How It Works
2.3.2. Steps to Follow When Bursting an Excel Report by Distinct Column Values
2.3.3. Sample - Customers-Distinct-Column-Values.xls

DocumentBurster can burst and split reports which are generated for the following versions of Microsoft Excel spreadsheet application

Although it does complex report processing, DocumentBurster doesn't require Microsoft Excel application to be installed on the machine where DocumentBurster is deployed. Even more, being a cross platform software, DocumentBurster can process Excel reports on a Windows or on any UNIX/Linux machine. A common installation scenario is to have DocumentBurster deployed on a separate dedicated server machine which is responsible for doing asynchronous batch report bursting and scheduled report distribution.

In order to achieve more business scenarios, DocumentBurster supports two different ways of bursting Excel reports

Depending on each specific business situation, one or the other of the two distinct bursting methods will be more appropriate. Following paragraphs will give details about each of the supported bursting methods.

When bursting Excel reports using this method, DocumentBurster will generate a separate output file for each Excel sheet which is found in the input report.

The best way to describe this bursting method is by looking at samples/Payslips-Distinct-Sheets.xls sample report. This report is provided with DocumentBurster and contains monthly income data for three fictional employees. The data for each employee is found on a separate sheet and the name of the sheet is the employee's email address.

The input report has the following three distinct sheets cgrew@northridgehealth.org, kbutford@northridgehealth.org and awaldback@northridgehealth.org which, after bursting, will generate the following three distinct files

Using email addresses as sheet names for the input report can help when distributing the generated output files by email (if the respective email addresses would actually be valid and existing email addresses). For more details about distributing the generated output reports by email please check Chapter 3, Distribute Reports .

This bursting method is straightforward and there is not much more to discuss about it. Following paragraph will present a more complex report bursting method which can generate a separate output file for each distinct value found in a specified column from a specified sheet.

Note

By default, DocumentBurster will burst Excel reports by distinct sheets and the names of the sheets will be used as burst tokens.

Note

Beside payslips, DocumentBurster can be configured to break up and distribute any other Excel report types such as invoices, statements, purchase orders, commissions reports or dunning letters.

When bursting Excel reports using this method, DocumentBurster will generate a separate output file for each distinct value found in a specified column from a specified sheet.

Please check the existing sample report samples/Customers-Distinct-Column-Values.xls which can be found in samples folder. Customers-Distinct-Column-Values.xls report contains a list of customers from various countries which, after bursting, DocumentBurster will generate a separate file with each file containing the customers for the relevant country.

Each configured burst token will generate a different output file.

The last sheet of the input report samples/Customers-Distinct-Column-Values.xls is called burst and contains the metadata required for bursting the report.

Important - Replace All Excel Formulas With the Corresponding Calculated Values

In order to burst by distinct column values all the formulas (if the report contains formulas) should be pre-calculated before the report is sent for processing to DocumentBurster™ .

Microsoft Excel documentation - Replace a formula with its result

Following screenshot shows how the (meta)information burst sheet should look.

(*)burstMethod : (distinct-sheets|distinct-column-values) - Required information which specifies the bursting method to be used. Following two valid values are possible

  • distinct-column-values – should be used when splitting the report by the distinct values found in a column
  • distinct-sheets – should be used when splitting the input report by each distinct sheet

Default value is distinct-column-values.

(*)burstMethod is an information which is mandatory to be provided.

burstSheetIndex - 0 (zero) based index of the sheet which needs to be split. For example, if it is required to split the first sheet then the index should be 0, if second sheet should be split then the index should be 1 and so on.

-1 is an allowed conventional value which means to split the first sheet of the Excel report.

Default value is -1.

burstColumnIndex - 0 (zero) based index of the column to be used when splitting the report. If it is required to split using the first column then burstColumnIndex should be defined as 0, if second column is required for splitting then burstColumnIndex should be defined as 1 and so on.

Usually the splitting column will be the last column from the splitting sheet. -1 is an allowed conventional value which means to split using the last column from the splitting sheet.

Default value is -1.

(*)burstTokens - required information which specifies the list of distinct burst tokens for which DocumentBurster should generate a separate output file. For example, in the above screenshot the list of burst tokens contains the set of distinct countries for which DocumentBurster should generate an output file.

(*)burstTokens is an information which is mandatory to be provided.

userVariables - information to be provided in case there is a need to specify custom user variables for each distinct burst token. For more details on how to define user variables when bursting Excel reports please read Chapter 5, Variables .

configFile - information to be provided in case there is a need to process distinct Excel report types by using a different set of program configurations. By default, if no custom configFile is provided, DocumentBurster is processing all Excel reports using the default configuration file ./config/burst/settings.xml. This should be fine in most situations, however there are cases when different reports will require different sets of configurations. For more details about how to process Excel reports using a custom configuration file please read Section 4.2, ?Process Excel Reports Using a Custom Configuration File ? .

Next screenshot shows how the report samples/Customers-Distinct-Column-Values.xls was configured in order to be processed by DocumentBurster™ . For reasons of better visualization, the last column which is only used in this sample report for the purpose of report bursting, it is highlighted with the yellow color. It is not mandatory that the bursting column should have a specific color. Usually the burst column will be a hidden Excel column which will not be visible to the normal report users but only defined for the purpose of bursting the report.

(1) - DocumentBurster will keep the "empty rows" in the output reports. This is useful for keeping in the output files various headers and/or footers which a report might have.

(2) - Last column contains the values from which DocumentBurster will decide which Excel row goes to which output file. It can have any color and will usually be a hidden Excel column.

The previous steps will be exemplified when bursting the same samples/Customers-Distinct-Column-Values.xls.

  1. Identify the appropriate burst token.

    The report to be burst contains a list of customers and each customer has a country of origin. The requirement is to break the report by country and each output file should contain the customers from the relevant country.

    The burst token is identified as being the Country column.

  2. Configure and prepare the input Excel report to include the (meta)information required for bursting the document.

    In order to burst by distinct column values it is mandatory that the last sheet of the input report will be called burst and this sheet should contain the proper (meta)information required for bursting the report.

    The last column from the splitting sheet is yellow (for better visualization) and is added only for the purpose of bursting the report. The bursting column doesn't need to have a specific color and will usually be an invisible Excel column (so that the normal report user will not see it).

    (1) - Extra column which is added for the purpose of splitting the report. It can be defined as a hidden Excel column so that it will be invisible to the normal report users.

  3. Calculate and replace Excel formulas with the corresponding values

    Customers-Distinct-Column-Values.xls sample report doesn't contain any Excel formula, so there is nothing to do for this report in regards with that.

  4. Burst the report and check the output files.

    Burst samples/Customers-Distinct-Column-Values.xls report and check the list of output files, one Excel file for each country with each output file containing the list of customers from the specific country.