Below are the recommendations for running Reporting in Ephesoft:
- Run the Ephesoft Reporting Service on a UI Server. Running it on a server that executes batches may delay Report ETL jobs, because batch processing keeps CPU utilization high.
- Dashboard and Standard report scripts and cron jobs have been merged into a single entity, so Dashboard and Throughput Reports are updated at the same time in the UI. The cron for this common ETL script (Dashboard) should run frequently, about every 15-20 minutes (the default is every 15 minutes).
- The clean-up algorithm deletes data for FINISHED batches for which all reports (as per license) have run successfully. The XMLs in SharedFolders/report-data and the Activiti table entries for the same batch instances are cleared as well.
- Configure all cron jobs so that collisions are minimal. For example, the cleanup cron could be set to the 45th or 50th minute of every "n" hours, which reduces the chance of another cron firing at the same instant.
- ETL scripts are memory intensive because they sort and aggregate data frequently, so JVM heap memory should be sized according to the batch load.
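The collision advice above can be checked before a schedule is deployed. The following is a minimal sketch, not part of Ephesoft: it assumes six-field cron expressions in the order seconds, minutes, hours, day-of-month, month, day-of-week, looks only at the minute and hour fields, and uses hypothetical schedule values chosen for illustration.

```python
def expand_field(field, low, high):
    """Expand one cron field into the sorted list of values it matches.

    Supports only the '*', 'a', 'a,b', 'a-b', 'a/step' and '*/step' forms.
    """
    values = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            first, last = base.split("-")
            start, end = int(first), int(last)
        else:
            start = int(base)
            # 'a/step' runs from a to the top of the range; plain 'a' is one value.
            end = high if "/" in part else start
        values.update(range(start, end + 1, step))
    return sorted(values)


def fire_times(expression):
    """Return the (hour, minute) pairs at which a schedule fires within one day.

    Seconds are ignored; the day and month fields are assumed to match every day.
    """
    _seconds, minutes, hours = expression.split()[:3]
    return {(h, m)
            for h in expand_field(hours, 0, 23)
            for m in expand_field(minutes, 0, 59)}


def collisions(expr_a, expr_b):
    """Return the sorted (hour, minute) pairs at which both schedules fire."""
    return sorted(fire_times(expr_a) & fire_times(expr_b))


# Hypothetical schedules, for illustration only.
dashboard = "0 0/15 * * * *"       # every 15 minutes
cleanup_at_45 = "0 45 0/4 * * *"   # minute 45 of every 4th hour
cleanup_at_50 = "0 50 0/4 * * *"   # minute 50 of every 4th hour

print(collisions(dashboard, cleanup_at_45))
print(collisions(dashboard, cleanup_at_50))
```

With these assumed values, a cleanup at minute 45 coincides with a dashboard run that starts at minute 0 and repeats every 15 minutes, while a cleanup at minute 50 does not. Which minute is safe therefore depends on the other schedules actually configured.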
Specific recommendations are given for:
- Low Load Customers (below 1000 pages/day)
- Medium Load Customers (1000-10000 pages/day)
- High Load Customers (above 10000 pages/day)
Low Load Customers:
- The Dashboard cron job can be set to a more frequent interval (every 5 minutes).
- JVM heap memory does not need to be tuned or increased beyond the usual settings for Reporting to function properly.
- Cron timings should not overlap: make sure no two crons fire at the same time. This can be achieved by specifying the exact minute at which each cron is triggered. E.g. 0 23 0/5 * * * denotes a cron that fires at the 23rd minute of every 5th hour, starting from hour 0 (00:23, 05:23, 10:23, 15:23 and 20:23).
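To make the example expression concrete, the sketch below lists the times of day at which it fires. It is an illustration only, under the assumption that the fields are read in the order seconds, minutes, hours, day-of-month, month, day-of-week and that the hours field uses the start/step form; `hours_for` is a hypothetical helper, not an Ephesoft function.

```python
def hours_for(field):
    """Expand an hours field of the form 'start/step' into the matching hours."""
    start, step = field.split("/")
    return list(range(int(start), 24, int(step)))


expression = "0 23 0/5 * * *"
_seconds, minute, hours = expression.split()[:3]

# Format each firing time as HH:MM.
times = [f"{h:02d}:{int(minute):02d}" for h in hours_for(hours)]
print(times)  # ['00:23', '05:23', '10:23', '15:23', '20:23']
```

Because the hours field restarts at 0 each day, the gap between the 20:23 run and the next day's 00:23 run is four hours rather than five.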
Medium and High Load Customers:
- Multi Server Setup is recommended with Reporting Service running on a UI Server
- The Dashboard cron (which updates data for both Dashboard and Throughput Reports) should be set to a slightly longer interval, about every 15 minutes, to give the scripts more time to complete.
- JVM heap memory should be increased if an extremely high number of batches is processed regularly.
- Cron timings should not overlap: make sure no two crons fire at the same time. This can be achieved by specifying the exact minute at which each cron is triggered. E.g. 0 23 0/5 * * * denotes a cron that fires at the 23rd minute of every 5th hour, starting from hour 0 (00:23, 05:23, 10:23, 15:23 and 20:23).
Additional Info:
Scale of Testing:
- We tested the reporting system on 2 different environments: a Single Server setup and a 5-server Multi-Server cluster (8-core license).
- For all testing, we ran a mix of 600-page batches along with batches of assorted smaller sizes.
- Continuous testing ran for ~5 days on this setup.
- The memory footprint was ~3-3.5 GB during ETL runs and ~1-1.5 GB at idle.