Data Batching
In scenarios like generating individual documents for each customer, data batching is a more efficient option than sending separate document generation requests. This is accomplished by providing SQL that returns multiple filenames along with corresponding data sections.
There are two methods to implement data batching:
Manual Data Batching
In this approach, SQL is structured to generate multiple filenames and corresponding data sections. AOP then creates separate documents for each dataset. The generated output files will be returned in a ZIP file.
Example
The following example demonstrates SQL that generates three separate documents, one for each dataset.
For data batching with template types such as DOCX, XLSX, and PPTX, where the output type is same as template type, the generated documents will be returned in a ZIP file containing the respective generated documents for each batch.
However, if the output type is PDF, it is possible to merge all the generated document by performing the merge operation using the following Init PL/SQL.
- SQL
- JSON
aop_api_pkg.g_output_merge := 'true';
"output_merge": true;
Data Source
The SQL query generates distinct objects with a filename and data for each selected customer with IDs 1, 2, and 3. The data contains customer information, which AOP uses for data replacement via tags. A visual representation of the batching process can be seen in the following image.
- SQL
- PL/SQL returning SQL
- PL/SQL returning JSON
- JSON
select
'file_' || to_char(d.customer_id) as "filename",
cursor(
select
c.cust_first_name as "cust_first_name",
c.cust_last_name as "cust_last_name",
c.cust_city as "cust_city",
cursor(select o.order_total as "order_total",
'Order ' || rownum as "order_name",
cursor(select p.product_name as "product_name",
i.quantity as "quantity",
i.unit_price as "unit_price", APEX_WEB_SERVICE.BLOB2CLOBBASE64(p.product_image) as "image"
from aop_sample_order_items i, aop_sample_product_info p
where o.order_id = i.order_id
and i.product_id = p.product_id
) "product"
from aop_sample_orders o
where c.customer_id = o.customer_id
) "orders"
from aop_sample_customers c
where c.customer_id = d.customer_id
) as "data"
from aop_sample_customers d
where d.customer_id in (1,2,3)
declare
l_return clob;
begin
l_return := q'[
select
'file_' || to_char(d.customer_id) as "filename",
cursor(
select
c.cust_first_name as "cust_first_name",
c.cust_last_name as "cust_last_name",
c.cust_city as "cust_city",
cursor(select o.order_total as "order_total",
'Order ' || rownum as "order_name",
cursor(select p.product_name as "product_name",
i.quantity as "quantity",
i.unit_price as "unit_price", APEX_WEB_SERVICE.BLOB2CLOBBASE64(p.product_image) as "image"
from aop_sample_order_items i, aop_sample_product_info p
where o.order_id = i.order_id
and i.product_id = p.product_id
) "product"
from aop_sample_orders o
where c.customer_id = o.customer_id
) "orders"
from aop_sample_customers c
where c.customer_id = d.customer_id
) as "data"
from aop_sample_customers d
where d.customer_id in (1,2,3)
]';
return l_return;
end;
declare
l_cursor sys_refcursor;
l_return clob; -- can also be varchar2, make sure it corresponds to the structure in the JSON
begin
apex_json.initialize_clob_output(dbms_lob.call, true, 2);
open l_cursor for
select
'file_' || to_char(d.customer_id) as "filename",
cursor(
select
c.cust_first_name as "cust_first_name",
c.cust_last_name as "cust_last_name",
c.cust_city as "cust_city",
cursor(select o.order_total as "order_total",
'Order ' || rownum as "order_name",
cursor(select p.product_name as "product_name",
i.quantity as "quantity",
i.unit_price as "unit_price", APEX_WEB_SERVICE.BLOB2CLOBBASE64(p.product_image) as "image"
from aop_sample_order_items i, aop_sample_product_info p
where o.order_id = i.order_id
and i.product_id = p.product_id
) "product"
from aop_sample_orders o
where c.customer_id = o.customer_id
) "orders"
from aop_sample_customers c
where c.customer_id = d.customer_id
) as "data"
from aop_sample_customers d
where d.customer_id in (1,2,3)
apex_json.write(l_cursor);
l_return := apex_json.get_clob_output;
return l_return;
end;
[
{
"filename": "file_1",
"data": [
{
"cust_first_name": "John",
"cust_last_name": "Dulles",
"cust_city": "Sterling",
"orders": [
{
"order_total": 2380,
"order_name": "Order 1",
"product": [
{
"product_name": "Business Shirt",
"quantity": 3,
"unit_price": 50,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Trousers",
"quantity": 3,
"unit_price": 80,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Jacket",
"quantity": 3,
"unit_price": 150,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Blouse",
"quantity": 3,
"unit_price": 60,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Skirt",
"quantity": 3,
"unit_price": 80,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Ladies Shoes",
"quantity": 2,
"unit_price": 120,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Belt",
"quantity": 2,
"unit_price": 30,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Bag",
"quantity": 4,
"unit_price": 125,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Mens Shoes",
"quantity": 2,
"unit_price": 110,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Wallet",
"quantity": 2,
"unit_price": 50,
"image": " --- base64encoded string of Image --- "
}
]
}
]
}
]
},
{
"filename": "file_2",
"data": [
{
"cust_first_name": "William",
"cust_last_name": "Hartsfield",
"cust_city": "Atlanta",
"orders": [
{
"order_total": 1640,
"order_name": "Order 1",
"product": [
{
"product_name": "Blouse",
"quantity": 4,
"unit_price": 60,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Skirt",
"quantity": 4,
"unit_price": 80,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Ladies Shoes",
"quantity": 4,
"unit_price": 120,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Bag",
"quantity": 4,
"unit_price": 125,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Wallet",
"quantity": 2,
"unit_price": 50,
"image": " --- base64encoded string of Image --- "
}
]
},
{
"order_total": 730,
"order_name": "Order 2",
"product": [
{
"product_name": "Blouse",
"quantity": 4,
"unit_price": 60,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Skirt",
"quantity": 3,
"unit_price": 80,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Bag",
"quantity": 2,
"unit_price": 125,
"image": " --- base64encoded string of Image --- "
}
]
}
]
}
]
},
{
"filename": "file_3",
"data": [
{
"cust_first_name": "Edward",
"cust_last_name": "Logan",
"cust_city": "East Boston",
"orders": [
{
"order_total": 1515,
"order_name": "Order 1",
"product": [
{
"product_name": "Jacket",
"quantity": 3,
"unit_price": 150,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Ladies Shoes",
"quantity": 3,
"unit_price": 120,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Bag",
"quantity": 3,
"unit_price": 125,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Mens Shoes",
"quantity": 3,
"unit_price": 110,
"image": " --- base64encoded string of Image --- "
}
]
},
{
"order_total": 905,
"order_name": "Order 2",
"product": [
{
"product_name": "Business Shirt",
"quantity": 2,
"unit_price": 50,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Trousers",
"quantity": 2,
"unit_price": 80,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Blouse",
"quantity": 2,
"unit_price": 60,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Skirt",
"quantity": 2,
"unit_price": 80,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Belt",
"quantity": 3,
"unit_price": 30,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Bag",
"quantity": 1,
"unit_price": 125,
"image": " --- base64encoded string of Image --- "
},
{
"product_name": "Wallet",
"quantity": 3,
"unit_price": 50,
"image": " --- base64encoded string of Image --- "
}
]
}
]
}
]
}
]
Template
The template is a standard invoice format that can be used for each customer's details.
Output
After processing the provided data source and template with an output type of docx, AOP returns a ZIP file containing three generated output documents.
When used the output type pdf and Init PL/SQL aop_api_pkg.g_output_merge := 'true';
, the resultant following merged PDF output will be generated.
AOP Internal Data Batching
Available From: v24.3When the provided SQL generates a very large dataset and AOP is unable to generate a single document for DOCX, XLSX, PPTX, or PDF, AOP Internal Data Batching can be useful.
Some applications and use cases for AOP Internal Data Batching are listed below:
- Generating a single file per customer.
- For PDF output, large datasets with output type PDF may cause a timeout error. In such cases, batching the data and merging is the best solution.
- AOP Internal Batching provides the following internal tags, which can be used as conditional tags to add separate sections based on requirements (as shown in the example):
batch_number
– returns the batch number starting from 1 tobatch_size
.batch_condition
– returns the conditions on which the batch is generated.last_batch
– returns true or false.
There are two ways to perform batching, which are listed as follows:
To enable batching, the following Init PL/SQL should be provided.
- SQL
- JSON
aop_api_pkg.g_output_batch_selector varchar2(500 char) := null; -- The hierarchy of data selector to point the data key that needs to be batch/split, e.g: orders:products
aop_api_pkg.g_output_batch_size number := null; -- Number of batches to generate; refers to number of files to split into
aop_api_pkg.g_output_batch_condition varchar2(500char) := null; -- Batch condition to generate batch E.g-1: category , E.g-2: unit_price > 100? "Expensive" : unit_price < 50 ? "Cheap" : "Medium"
"output": {
"batch_selector" : "", // e.g: orders:product
"batch_size" : 2, // number of batches or no of files to split into
"batch_condition" : "" // conitions such as e.g-1: category , e.g-2: unit_price > 100? "Expensive" : unit_price < 50 ? "Cheap" : "Medium"
}
Enabling the batch selector activates data batching. The batch selector is a hierarchy of keys or columns that AOP uses to split the data into batches. For instance, if the batch selector is customer:order:product
, the final selector product
(the product cursor in SQL) must always return one or more data arrays, while the preceding selectors customer
and order
must each return an array containing a single data entry. The hierarchy must also be maintained, meaning the customer
cursor should include order
, and the order
cursor should include product
.
If no batch size or condition is specified, two batches will be created by default. When a batch size is provided, AOP will evenly distribute the data on each batch based on the data length of the final selector.
Batch by Size
Batch by size refers to batching of the data based on the number of files to generate. Following Init PL/SQL is used to enable data batch by size.
- SQL
- JSON
aop_api_pkg.g_output_batch_selector varchar2(500 char) := null; -- The hierarchy of data selector to point the data key that needs to be batch/split, e.g: orders:products
aop_api_pkg.g_output_batch_size number := null; -- Number of batches to generate; refers to number of files to split into
"output": {
"batch_selector" : "", // e.g: orders:product
"batch_size" : 2, // number of batches or no of files to split into
}
Example 1
Following example demonstrates the data batching by size for the template type DOCX and XLSX and the output type is same as template type resulting all outputs in the ZIP.
Data Source
The data source for batching by size is given as:
- SQL
- PL/SQL returning SQL
- PL/SQL returning JSON
- JSON
select
'file1' as "filename",
cursor(
select
c.cust_first_name as "cust_first_name",
c.cust_last_name as "cust_last_name",
c.cust_city as "cust_city",
cursor(select o.order_total as "order_total",
'Order ' || rownum as "order_name",
cursor(select p.product_name as "product_name",
p.category as "category",
i.quantity as "quantity",
i.unit_price as "unit_price",
APEX_WEB_SERVICE.BLOB2CLOBBASE64(p.product_image) as "image",
40 as "image_max_width"
from aop_sample_order_items i, aop_sample_product_info p
where o.order_id = i.order_id
and i.product_id = p.product_id
) "product"
from aop_sample_orders o
where c.customer_id = o.customer_id
) "orders"
from aop_sample_customers c
where customer_id = 1
) as "data"
from dual
declare
l_return clob;
begin
l_return := q'[
select
'file1' as "filename",
cursor(
select
c.cust_first_name as "cust_first_name",
c.cust_last_name as "cust_last_name",
c.cust_city as "cust_city",
cursor(select o.order_total as "order_total",
'Order ' || rownum as "order_name",
cursor(select p.product_name as "product_name",
p.category as "category",
i.quantity as "quantity",
i.unit_price as "unit_price",
APEX_WEB_SERVICE.BLOB2CLOBBASE64(p.product_image) as "image",
40 as "image_max_width"
from aop_sample_order_items i, aop_sample_product_info p
where o.order_id = i.order_id
and i.product_id = p.product_id
) "product"
from aop_sample_orders o
where c.customer_id = o.customer_id
) "orders"
from aop_sample_customers c
where customer_id = 1
) as "data"
from dual
]';
return l_return;
end;
declare
l_cursor sys_refcursor;
l_return clob; -- can also be varchar2, make sure it corresponds to the structure in the JSON
begin
apex_json.initialize_clob_output(dbms_lob.call, true, 2);
open l_cursor for
select
'file1' as "filename",
cursor(
select
c.cust_first_name as "cust_first_name",
c.cust_last_name as "cust_last_name",
c.cust_city as "cust_city",
cursor(select o.order_total as "order_total",
'Order ' || rownum as "order_name",
cursor(select p.product_name as "product_name",
p.category as "category",
i.quantity as "quantity",
i.unit_price as "unit_price",
APEX_WEB_SERVICE.BLOB2CLOBBASE64(p.product_image) as "image",
40 as "image_max_width"
from aop_sample_order_items i, aop_sample_product_info p
where o.order_id = i.order_id
and i.product_id = p.product_id
) "product"
from aop_sample_orders o
where c.customer_id = o.customer_id
) "orders"
from aop_sample_customers c
where customer_id = 1
) as "data"
from dual
apex_json.write(l_cursor);
l_return := apex_json.get_clob_output;
return l_return;
end;
[
{
"cust_first_name": "John",
"cust_last_name": "Dulles",
"cust_city": "Sterling",
"orders": [
{
"order_total": 2380,
"order_name": "Order 1",
"product": [
{
"product_name": "Business Shirt",
"quantity": 3,
"unit_price": 50,
"category": "Mens",
"image": " --- base64encoded string of Image ---",
"image_max_width": 40
},
{
"product_name": "Trousers",
"quantity": 3,
"unit_price": 80,
"category": "Mens",
"image": " --- base64encoded string of Image ---",
"image_max_width": 40
},
{
"product_name": "Jacket",
"quantity": 3,
"unit_price": 150,
"category": "Mens",
"image": " --- base64encoded string of Image ---",
"image_max_width": 40
},
{
"product_name": "Blouse",
"quantity": 3,
"unit_price": 60,
"category": "Womens",
"image": " --- base64encoded string of Image ---",
"image_max_width": 40
},
{
"product_name": "Skirt",
"quantity": 3,
"unit_price": 80,
"category": "Womens",
"image": " --- base64encoded string of Image ---",
"image_max_width": 40
},
{
"product_name": "Ladies Shoes",
"quantity": 2,
"unit_price": 120,
"category": "Womens",
"image": " --- base64encoded string of Image ---",
"image_max_width": 40
},
{
"product_name": "Belt",
"quantity": 2,
"unit_price": 30,
"category": "Accessories",
"image": " --- base64encoded string of Image ---",
"image_max_width": 40
},
{
"product_name": "Bag",
"quantity": 4,
"unit_price": 125,
"category": "Accessories",
"image": " --- base64encoded string of Image ---",
"image_max_width": 40
},
{
"product_name": "Mens Shoes",
"quantity": 2,
"unit_price": 110,
"category": "Mens",
"image": " --- base64encoded string of Image ---",
"image_max_width": 40
},
{
"product_name": "Wallet",
"quantity": 2,
"unit_price": 50,
"category": "Accessories",
"image": " --- base64encoded string of Image ---",
"image_max_width": 40
}
]
}
]
}
]
Following Init PL/SQL is used to provide the batch selector and size:
- SQL
- JSON
aop_api_pkg.g_output_batch_selector := 'orders:products';
aop_api_pkg.g_output_batch_size := 3;
"output": {
"batch_selector" : "orders:products",
"batch_size" : 3,
}
Template
Here the following DOCX and XLSX template are used.
Output
Upon processing the above template and data source, AOP will return the ZIP file including the three generated DOCX and XLSX files with evenly distrubuted products.
Example 2
Following example demonstrates the data batching by size for the template type DOCX and output type PDF.
Data Source
Data Source SQL is same as previous example's data Source.
Following Init PL/SQL is used to provide the batch selector, batch size and merge:
- SQL
- JSON
aop_api_pkg.g_output_batch_selector := 'orders:products';
aop_api_pkg.g_output_batch_size := 3;
aop_api_pkg.g_output_merge := 'true';
"output": {
"output_type":"pdf",
"batch_selector" : "orders:products",
"batch_size" : 3,
"output_merge": true
}
Template
The DOCX template includes the tags like {#batch_number==1} ... {/}
which encloses the section that needs to be printed in 1st batch file only, and {#last_batch} ... {/}
encloses the texts that needs to be enclosed in the last batch only. The values for the batch_number
and last_batch
is created by AOP when creating the data batches.
Output
Upon processing the given data source and template, following output is obtained.
Batch by Condition
Batch by condition refers to batching of the data based on the condition. Condition can be provided as an angular expressions which results in true or false, ternary operator result values, or the distinct values etc.
Some of the examples of conditional expressions are:
"batch_condition":"unit_price > 80"
// this creates two batches, first batch which satisfies the condition (batch with `batch_condition: true`) and other batch which doesnot satisfies the condition (batch with `batch_condition: false`)
"batch_condition":"category"
// this will batch the data on the basis of uniqueness/distinct values of category. For each unique value of category, separate batches will be generated.
"batch_condition":"unit_price > 100? \"Expensive\" : unit_price < 50 ? \"cheap\" : \"Medium\""
// Three batches will be generated with batch_condition:"Expensive", batch_condition:"cheap" and batch_condition:"Medium"
"batch_condition":"product_name.includes(\"Shoes\")"
// two batches, which gives true and false for the statement.
The conditional expression/statement supports all the conditional operator and functions documented at Simple Angular like expressions
The keys which are used in the condition expression/statement should belong to the cursor of the final selector.
Following Init PL/SQL is used to enable data batch by condition:
- SQL
- JSON
aop_api_pkg.g_output_batch_selector varchar2(500 char) := null; -- The hierarchy of data selector to point the data key that needs to be batch/split, e.g: orders:products
aop_api_pkg.g_output_batch_condition varchar2(500char) := null; -- Batch condition to generate batch E.g-1: category , E.g-2:
"output": {
"batch_selector" : "", // e.g: orders:product
"batch_condition" : "" // conditions such as e.g-1: category , e.g-2: unit_price > 100? "Expensive" : unit_price < 50 ? "Cheap" : "Medium"
}
Example 1
Following example demonstrates the data batching by condition that uses ternary operator for the template type DOCX and the output type PDF with the merge operation.
Data Source
The data source for batching by condition is similar to the previous data source.
Following Init PL/SQL is used to provide the batch selector and condition:
- SQL
- JSON
aop_api_pkg.g_output_batch_selector := 'orders:products';
aop_api_pkg.g_output_batch_condition := 'unit_price > 110? "Expensive" : unit_price < 80 ? "Cheap" : "Medium"';
aop_api_pkg.g_merge := 'true';
"output": {
"output_type":"pdf",
"batch_selector" : "orders:products",
"batch_condition": "unit_price > 110? \"Expensive\" : unit_price < 80 ? \"Cheap\" : \"Medium\"",
"output_merge": true
}
Template
For the template, following DOCX template format is used.
template.docx
Output
Upon processing the above template and data source, AOP will return the PDF file including the merged document contents based on the given condition.
Example 2
Following example demonstrates the data batching by condition based on the distinct values of the key and uses template type DOCX and output type PDF with the merge operation.
Data Source
Data Source SQL is same as of previous example's data source.
Following Init PL/SQL is used to provide the batch selector, batch condition and merge:
- SQL
- JSON
aop_api_pkg.g_output_batch_selector := 'orders:product';
aop_api_pkg.g_output_batch_condition := 'category';
aop_api_pkg.g_output_merge := 'true';
"output": {
"batch_selector" : "orders:product",
"batch_condition" : "category",
"output_merge": true
}
Template
For the template, following template format the which includes the tags like {#batch_number==1} ... {/}
which enclosed the section that needs to be printed in 1st file only, and {#last_batch} ... {/}
encloses the texts that needs to be enclosed in the last batch only. The values for batch_condition
, batch_number
and last batch
will be created by AOP when creating the batches.
Output
Upon processing the given data source and template, following PDF output is obtained.