Channel: Ephesoft Docs

KB0010159 – Mail is sending multiple batch instances within multi server environment.


KB Article #: KB0010159

Topic/Category: Mail

Applies to: All versions.

Issue: When Ephesoft is installed on multiple servers, mail import creates two batch instances for the same e-mail, so e-mails are duplicated.

Root Cause: The mail service is running on more than one Ephesoft server.

Solution: Disable the mail service on the other Ephesoft server so that only one server runs the mail service. To do so, follow the steps below:

1. Locate the file applicationContext.xml in the folder <Ephesoft install directory>\Ephesoft\Application\. This file is used to run Ephesoft services.

2. Comment out the Mail Module service. Look for the lines below.

<!-- Un-comment it to start Mail Module on this machine. -->
<!-- -->
<import resource="classpath:/META-INF/applicationContext-dcma-mail-import.xml" />

3. Move the closing comment marker to the line below the Mail Module import, as shown below.

<!-- Un-comment it to start Mail Module on this machine. -->
<!--
<import resource="classpath:/META-INF/applicationContext-dcma-mail-import.xml" />
-->

4. Restart the Ephesoft service.
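
Before restarting, it can help to confirm that the edit really left the import commented out. The following is a minimal sketch, not part of Ephesoft: the function name and the `<beans>` wrapper in the sample strings are assumptions made for illustration. It relies on the fact that Python's standard XML parser discards comments, so a commented-out import is simply not seen.

```python
import xml.etree.ElementTree as ET

# File name taken from the import line shown in the steps above.
MAIL_IMPORT = "applicationContext-dcma-mail-import.xml"


def mail_module_enabled(xml_text: str) -> bool:
    """Return True if an active (uncommented) mail import element exists."""
    root = ET.fromstring(xml_text)  # comments are dropped by the default parser
    for element in root.iter():
        # Tags may be namespace-qualified, e.g. "{namespace}import".
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "import" and element.get("resource", "").endswith(MAIL_IMPORT):
            return True
    return False


# Hypothetical, heavily shortened stand-ins for applicationContext.xml.
ENABLED = """<beans>
<!-- Un-comment it to start Mail Module on this machine. -->
<!-- -->
<import resource="classpath:/META-INF/applicationContext-dcma-mail-import.xml" />
</beans>"""

DISABLED = """<beans>
<!-- Un-comment it to start Mail Module on this machine. -->
<!--
<import resource="classpath:/META-INF/applicationContext-dcma-mail-import.xml" />
-->
</beans>"""

print(mail_module_enabled(ENABLED), mail_module_enabled(DISABLED))
```

In a multi-server setup, the check would be expected to return True on exactly one server.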



KB0010178 Large Document (PDF, TIF): Not All Recostar BD HOCR Files Created


 

KB Article # 10178

Topic/Category: Recostar, tif

Issue: Large Document (PDF, TIF): Not All Recostar BD HOCR Files Created

Analysis:

Recostar may be unable to process the files for the following reasons. Check your files and try the suggestions below:

RecoStar has a pixel size limitation: maximum document height 24000 pixels; maximum document width 32000 pixels.

A large sample set of files may require changing the timeout periods in Ephesoft.
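
As an illustration of the pixel limits stated above, a pre-check might look like the sketch below. The function name is hypothetical, and treating the limits as inclusive is an assumption; the article only gives the two maximum values.

```python
# Limits quoted in the article above.
MAX_HEIGHT_PX = 24000
MAX_WIDTH_PX = 32000


def within_recostar_limits(width_px: int, height_px: int) -> bool:
    """Return True if a page's pixel dimensions fit within the stated limits.

    Assumption: the maximum values themselves are still accepted.
    """
    return 0 < width_px <= MAX_WIDTH_PX and 0 < height_px <= MAX_HEIGHT_PX


# A letter-size page at 300 dpi is 2550 x 3300 pixels.
print(within_recostar_limits(2550, 3300))
```

Pages that fail the check would need to be downsampled or split before OCR.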

 

Solution:

Increase the timeout period for Recostar-wait

Because the number of learn files is large, the Recostar Learn process times out after a specific interval, which can be configured in application.properties.
To learn such a large set of files, increase the wait time in the application.properties file.
Follow these steps to increase the wait time:
1.)    Stop the Ephesoft server.
2.)    Open the application.properties file under <Ephesoft-Installation-Directory>/WEB-INF/classes/META-INF/application.properties
3.)    Provide a new value for the property recostar.command.wait.time=<new-time-in-minutes>
4.)    Start the Ephesoft server.
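
The property change can also be scripted. The sketch below works on the text of a properties file and is only an illustration: the helper name is hypothetical, the value 30 is an arbitrary example rather than a recommended setting, and it handles only simple `key=value` lines.

```python
def set_property(text: str, key: str, value) -> str:
    """Return properties-file text with `key` set to `value`.

    Existing assignments of the key are replaced; if the key is absent,
    a new line is appended. Comment lines (starting with '#') are kept.
    """
    lines = text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[index] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


sample = "# sample content\nrecostar.command.wait.time=5\n"
print(set_property(sample, "recostar.command.wait.time", 30))
```

Back up application.properties before writing any generated text to it.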

 

 


Updates and Downloads – Ephesoft v3.1.2.14 for v3.1.2.x – Linux

concurrent.scanner.operators in dcma-twain-settings


KB Article #10224

Topic/Category:WebScanner

Applies to: 4031 +

Summary:

This new property governs how many parallel threads are spawned for web scanning requests. It prevents server breakdowns and errors when a large number of scanner requests reach the server.
Ideally, the value of this property equals the number of concurrent web scanner users the customer wants to support on a UI server. If the number of users is very large, set the value lower than the number of users, so that only a certain number of web scanning requests are served at any one time and excess requests are queued for later processing.
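
The queueing behaviour described above can be pictured with a bounded worker pool. This is a generic Python sketch of the idea, not Ephesoft's implementation: the function name is hypothetical and the short sleep stands in for the real scanning work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def serve_scan_requests(request_ids, concurrent_operators):
    """Process requests with at most `concurrent_operators` running at once.

    Returns the results in request order and the peak number of requests
    that were being served simultaneously.
    """
    active = 0
    peak = 0
    lock = threading.Lock()

    def handle(request_id):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # placeholder for the actual scan handling
        with lock:
            active -= 1
        return request_id

    # Requests beyond the pool size wait in the executor's internal queue.
    with ThreadPoolExecutor(max_workers=concurrent_operators) as pool:
        results = list(pool.map(handle, request_ids))
    return results, peak


results, peak = serve_scan_requests(range(10), concurrent_operators=3)
print(len(results), peak <= 3)
```

Every request is eventually served, but never more than the configured number at a time.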

 


4.0 Admin Interface Overview


A Masterclass showing you an overview of the Admin Interface in the 4.0.x versions.

4.0 Operator Interface Overview


A Masterclass showing you an overview of the Operator Interface in the 4.0.x versions.

KB0010266 – Error out at Folder Import process.


KB Article #: KB0010266

Topic/Category: GhostScript

Applies to: All versions.

Issue: During the Folder Import process, the IMPORT_MULTIPAGE_FILES module errors out because the GhostScript directory is missing.

Root Cause: The GhostScript directory is empty.

Solution:

Steps to resolve:

  1. Stop Ephesoft service.
  2. Go to <Ephesoft installed directory>\Dependencies and remove the gs directory.
    For 32-bit server: Rename gs32bit folder to gs folder.
    For 64-bit server: Rename gs64bit folder to gs folder.
  3. Start Ephesoft service.

 


KB0010268 – Ephesoft Service won’t start due to missing JDK.


KB Article #: KB0010268

Applies to: All versions.

Issue: Ephesoft Service will not start.

Root Cause: Ephesoft requires the JDK in order to start, and the JDK folder cannot be found under the expected name.

Solution:

Steps to resolve:

  1. Stop Ephesoft service.
  2. Go to <Ephesoft installed directory>\Dependencies.
    For 32-bit server: Rename jdk1.7.0_71_32bit folder to jdk1.7.0_71 folder.
    For 64-bit server: Rename jdk1.7.0_71_64bit folder to jdk1.7.0_71 folder.
  3. Start Ephesoft service.

 



KB0010301 508 Compliance: Voluntary Product Accessibility Template – VPAT

KB0010274 How Do You Make An Index Field Optional During Validation


KB Article # 10274

Topic/Category: Validation

Ephesoft Version: 3x, 4x

Issue: How Do You Make An Index Field Optional During Validation?

Analysis:

Say you want to create a regular expression that validates against specific regex rules AND also accepts blank values. How would you do so?

Solution:
Combine your regular expression with the regular expression for blank(empty/null) space.
This REGEX will validate BLANK:

 ^\s*$

 

Say you want to accept yes, no or empty space as a validation rule. The following would apply:

 yes|no|^\s*$
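
The combined pattern can be tried out quickly. The sketch below uses Python's `re` module purely as a test bench; the article does not say how Ephesoft applies the pattern, so whether the validator matches the whole value or only part of it is an assumption to verify. If only part is matched, the unanchored alternatives would also accept values such as "yesterday", and a fully anchored variant (shown as `ANCHORED`, an addition not present in the article) avoids that.

```python
import re

COMBINED = r"yes|no|^\s*$"        # pattern from the article
ANCHORED = r"^(?:yes|no|\s*)$"    # assumed stricter variant, anchored as a whole


def accepts_whole_value(pattern: str, value: str) -> bool:
    """True if the entire value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


def accepts_anywhere(pattern: str, value: str) -> bool:
    """True if any part of the value matches the pattern."""
    return re.search(pattern, value) is not None


for candidate in ["yes", "no", "", "   ", "maybe", "yesterday"]:
    print(repr(candidate),
          accepts_whole_value(COMBINED, candidate),
          accepts_anywhere(COMBINED, candidate),
          accepts_anywhere(ANCHORED, candidate))
```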

 


Java applet interaction with Ephesoft


 


KB Article #10249

Topic / Category: Webscanner

Applies to: Ephesoft 4.0+

Details:

The Java applet interfaces with the Ephesoft server by sending requests to servlets.

The applet performs the following communication:

  1. Conveys the different forms of scanning status.
  2. Sends client logs to the server.
  3. Uploads scanned images to the server.

 

The following servlet configurations are defined inside web.xml:

  1. Servlet for handling status requests.

<servlet>
    <servlet-name>ScannerStatusConveyerServlet</servlet-name>
    <servlet-class>com.ephesoft.dcma.twain.ScannerStatusConveyerServlet</servlet-class>
    <load-on-startup>1</load-on-startup>
</servlet>

  2. Servlet for logging client logs on the server.

<servlet>
    <servlet-name>loggingServlet</servlet-name>
    <servlet-class>com.ephesoft.dcma.twain.LoggingServlet</servlet-class>
    <load-on-startup>1</load-on-startup>
</servlet>

  3. Servlet for uploading images.

<servlet>
    <servlet-name>uploadServlet</servlet-name>
    <servlet-class>com.ephesoft.dcma.twain.UploadServlet</servlet-class>
    <load-on-startup>1</load-on-startup>
</servlet>

 

 


KB0010063 Manually Cleaning(Drop) the Activiti Tables


KB Article # 10063

Topic/Category: Workflow, MSSQL, mysql, mariadb

Ephesoft Version: 4x

Issue: “Variable @black is undefined” displayed in web browser

Analysis:

Manually removing Activiti tables will impact accuracy of reporting data, thus we would not recommend cleaning up Activiti tables manually. However in case of Activiti corruption and getting a system back online this may be the only alternative.

Solution:

Description
A significant amount of workflow-related data accumulates over time. This build-up may degrade overall application performance in the long run. Manually cleaning up the Activiti-related tables avoids this situation.
Activiti Table Cleanup for MSSQL
Take a backup of the database before proceeding. Stop the Ephesoft server if it is running.
A) Prerequisites: All batches must be in one of the following states before the clean-up is started.
1. DELETED
2. ERROR
3. FINISHED

B) Drop the following tables from <ephesoft_DB> by executing the queries in the query editor. When executing the queries, replace <ephesoft_DB> with the name of the Ephesoft database.

MSSQL

drop table <ephesoft_DB>.dbo.ACT_EVT_LOG;
drop table <ephesoft_DB>.dbo.ACT_GE_PROPERTY;
drop table <ephesoft_DB>.dbo.ACT_HI_ACTINST;
drop table <ephesoft_DB>.dbo.ACT_HI_ATTACHMENT;
drop table <ephesoft_DB>.dbo.ACT_HI_COMMENT;
drop table <ephesoft_DB>.dbo.ACT_HI_DETAIL;
drop table <ephesoft_DB>.dbo.ACT_HI_IDENTITYLINK;
drop table <ephesoft_DB>.dbo.ACT_HI_PROCINST;
drop table <ephesoft_DB>.dbo.ACT_HI_TASKINST;
drop table <ephesoft_DB>.dbo.ACT_HI_VARINST;
drop table <ephesoft_DB>.dbo.ACT_ID_INFO;
drop table <ephesoft_DB>.dbo.ACT_ID_MEMBERSHIP;
drop table <ephesoft_DB>.dbo.ACT_ID_USER;
drop table <ephesoft_DB>.dbo.ACT_RE_MODEL;
drop table <ephesoft_DB>.dbo.ACT_RU_EVENT_SUBSCR;
drop table <ephesoft_DB>.dbo.ACT_RU_IDENTITYLINK;
drop table <ephesoft_DB>.dbo.ACT_RU_JOB;
drop table <ephesoft_DB>.dbo.ACT_RU_TASK;
drop table <ephesoft_DB>.dbo.ACT_RU_VARIABLE;
drop table <ephesoft_DB>.dbo.ACT_GE_BYTEARRAY;
drop table <ephesoft_DB>.dbo.ACT_ID_GROUP;
drop table <ephesoft_DB>.dbo.ACT_RE_DEPLOYMENT;
drop table <ephesoft_DB>.dbo.ACT_RU_EXECUTION;
drop table <ephesoft_DB>.dbo.ACT_RE_PROCDEF;

MySQL or MariaDB

drop table <ephesoft_DB>.act_evt_log;
drop table <ephesoft_DB>.act_ge_property;
drop table <ephesoft_DB>.act_hi_actinst;
drop table <ephesoft_DB>.act_hi_attachment;
drop table <ephesoft_DB>.act_hi_comment;
drop table <ephesoft_DB>.act_hi_detail;
drop table <ephesoft_DB>.act_hi_identitylink;
drop table <ephesoft_DB>.act_hi_procinst;
drop table <ephesoft_DB>.act_hi_taskinst;
drop table <ephesoft_DB>.act_hi_varinst;
drop table <ephesoft_DB>.act_id_info;
drop table <ephesoft_DB>.act_id_membership;
drop table <ephesoft_DB>.act_id_user;
drop table <ephesoft_DB>.act_re_model;
drop table <ephesoft_DB>.act_ru_event_subscr;
drop table <ephesoft_DB>.act_ru_identitylink;
drop table <ephesoft_DB>.act_ru_job;
drop table <ephesoft_DB>.act_ru_task;
drop table <ephesoft_DB>.act_ru_variable;
drop table <ephesoft_DB>.act_ge_bytearray;
drop table <ephesoft_DB>.act_id_group;
drop table <ephesoft_DB>.act_re_deployment;
drop table <ephesoft_DB>.act_ru_execution;
drop table <ephesoft_DB>.act_re_procdef;

After the operation, ensure that all the Activiti-related tables (ACT_*) have been removed from the database.
C) In the <reporting_DB>, execute the given query after replacing <reporting_DB> with the name of the database used for reporting.

MSSQL

 update <reporting_DB>.dbo.last_execution set last_execution_at = GETDATE() where job in ('BATCH_INSTANCE_LAST_UPDATE' ,'DASHBOARD' ,'STANDARD', 'ADVANCED');

MySQL or MariaDB

 update <reporting_DB>.last_execution set last_execution_at = NOW() where job in ('BATCH_INSTANCE_LAST_UPDATE' ,'DASHBOARD' ,'STANDARD', 'ADVANCED');

 

D) Set the workflow.deploy parameter to true in the file {Ephesoft Installation Directory}\WEB-INF\classes\META-INF\dcma-workflows\dcma-workflows.properties.
E) Restart the Ephesoft server.

NOTE: No batch may be in a state other than DELETED, ERROR or FINISHED. Cleaning up data otherwise may lead to database corruption.
Executing the above procedure causes previous reporting data to be lost. Ephesoft reporting is directly linked to the Activiti tables, and deleting them impacts the reporting data.

 


v4.0.4.0 Release Notes


New Feature : Retain sorting information in cookies for Batch Instance Management and Batch List Screen.
New Feature : Validation tab to get opened by default if there are no batches in Review, Review tab to get opened by default if there are no batches in Validation or if there are batches in both Review and Validation.
New Feature : Review Validate tab title should display either Review or Validate.
New Feature : Web Service API to submit a batch instance and get back the Batch Instance Id.
New Feature : Display Page Id’s in the Review, Validation and Web Scanner screen thumbnail view.
New Feature : Collapse the Web Scanner leftmost panel on the basis of configuration in the property file. The panel should be collapsed by default.
New Feature : Files in Shared Folders should not be allowed to be downloaded directly.
New Feature : Turn the Deploy button red in case user is required to deploy the workflow changes.
New Feature : On the Validation Table view, highlight all the extracted columns of a row in case the checkbox is checked.
New Feature : Move the Setting.sts file for Nuance at the Batch Class level.
New Feature : Job to delete the finished and deleted batches from the database if reporting has been executed on them.
New Feature : Webscanner: Add a “forward” and “back page” shortcut to go through the Preview of Scanned Pages.
New Feature : Use Adv. Key Value screen vs. simple kv extraction in KV page Process plugin
New Feature : Incorporate GraphicsMagick as a separate option (along with ImageMagick) for file conversion.
New Feature : Ability to disable KV Extraction Rules.
New Feature : Ability to delete a document or multiple documents in Validation.
New Feature : Sticky field feature at Index Fields level.
New Feature : Ability to create new batch class that should copy BC1 by default.
New Feature : Support variables similar to COPY_BATCH_XML in CMIS export.
New Feature : In case of a server not active for x amount of duration, the entry should be DELETED from the Server_Registry table in database.
New Feature : Given the same aspect ratio of images but different resolutions, the ephesoft system should have built in mathematics to automatically deal with differing resolutions of the same image and be able to successfully handle KV extraction.
New Feature : Batch Instance ID to be passed through on the workflow error email template.
New Feature : Shortcut to close a batch on RV screen and confirm all navigation changes from the batch with the help of confirmation popup.
New Feature : Switch to generate plugin level batch xmls
New Feature : Always on feature for ETL, Logi Info and Batch class management PPM charts
New Feature : Retain filters in batch instance management/batch list/etc
New Feature : Ability to delete New status batches
New Feature : Internationalization of embedded logiinfo report contents
New Feature : Performance enhancements in ETL
New Feature : Feature to show limited number of alternate values for fields to prevent large size xmls causing slowness on Validation screen
New Feature : Addition of workflow start time in batch instance table
New Feature : Allow Fuzzy DB and DB Export to be configured for a database schema other than “dbo” for MSSQL.
New Feature : No two users with same username should be allowed to login into Web Scanner.
New Feature : As an administrator, I want to view the Ephesoft script logging when executing scripts so that I am aware of the operations taking place while script execution.
Improvement : As an administrator, I do not want to see the F1 key in the Shortcut key column in the Function Key Mapping screen so that the F1 function key does not conflict with the browser shortcuts.
Improvement : As an administrator/operator, I want to view the column width of the Batch Name column set to optimum number of characters on Batch Instance Management and Batch List screen so that I can view most of the batch names fully without re sizing the column.
Improvement : As an administrator, I want to install Ephesoft without any restriction on the database Password so that there are no conflicts with the external DB server policies.
Improvement : As an administrator, when I install the Ephesoft application, open-office.properties should set the references to localhost instead of the computer name so that the file can be copied over to another server without changes.
Improvement : As an administrator, I should be able to configure the text to be displayed for document in Document type drop down present in the middle panel in Review/Validation screen so that it is consistent in left panel and drop down.
Improvement : Tomcat changes to support more concurrent users in Web Scanner.
Improvement : Change default values of user connectivity props file and add comments for easier AD configuration for new installation.
Improvement : Deadlock issues due to heartbeat configurations in multi server environments.
Improvement : Reporting Schema Enhancements.
Improvement : Bring back MySQL Support in Ephesoft Installer.
Improvement : Ordering of the priority should be kept in the Batch List screen graph.
Improvement : Batch Instance Error cause should get displayed in the BIM bottom panel at all times.
Improvement : Ability to delete *.bd files after use from Learning to save hard drive space.
Improvement : Field value option list enhancement.
Improvement : Code enhancement for faster import of PDFs and their breakup into tiffs.
Improvement : Add Additional Settings column in the Index Field Listing screen.
Improvement : Application will show KV extraction plugin selected by default in Test Extraction UI.
Improvement : Usability enhancement on the button positioning on KV Extraction Screen and all other screens in the application.
Improvement : Feature to display a combo-box listing all the categories used for index fields in the document type.
Improvement : Restrict on UI special characters in DocType Name.
Improvement : Ask user whether to override or make a copy while importing document type/index field with same name.
Improvement : Pagination dropdown on Pagination Bar
Improvement : Fuzzy DB popup on RV screen should not show any extra space in case of lesser number of columns
Improvement : Adding X-offset and Y-offset on KV Extraction Rule definition and KVPP rule definition screen.
Improvement : Feature to eliminate the unused alternatives for computed table combinations
Fix : Blank screen and Document Mode behavior in IE9 browser; “page fit” does not work
Fix : Active Directory configuration in server.xml is lost.
Fix : As an administrator, I should be able to successfully upgrade through installer and preserve the configured port settings and database permissions so that I can have a hassle free upgrade and don’t have to reconfigure the port/database settings.
Fix : With administrative installation, file permissions are not automatically set correctly for active directory so that ephesoft doesn’t start properly.
Fix : Ephesoft should not pick up emails from mail server if the email configuration has been disabled.
Fix : Support for Windows Server 2012 R2.
Fix : Retain the priority of the existing batch class while overriding the batch class.

 


E-mail Configuration


Overview & Purpose

The E-mail Configuration screen is used to configure user mail accounts, and the E-mail Import plugin imports documents present in a defined form from the user’s mail account. The user can configure any mail account as well as the types of documents the plug-in supports. This configuration is done per batch class. Multiple email accounts can be set up for each batch class.

Configuration

Mail configuration

Following are the configurable mail account properties:

Configurable property | Type of value | Value options | Description
Username | String | A valid email account username. | The user account name configured with Ephesoft, which the Email Import service will watch.
Password | String | Corresponding password for the configured username | Password for the configured user account. The password field always displays 8 asterisks (*).
Incoming Server | String | A valid mail server name | The name of the mail server to which the configured user account belongs.
Server type | Dropdown | IMAP, POP3 | The type of the mail server to which the configured user account belongs. Server Type can be IMAP/POP3.
Folder Name | String | A valid and existing mail folder name | The name of the mail folder that Ephesoft Email Import checks. Folder Name can be Inbox.
Is SSL | Check Box | Checked, Unchecked | Defines whether the connection to the mail server uses SSL or non-SSL settings.
Enable | Check Box | Checked, Unchecked | Defines whether the user’s email account is enabled or disabled for importing documents. Enabled and disabled accounts are marked with a green or red indicator respectively.
Port number | Integer | A valid port number | The port number on which the configured mail server type works. The port number must be in the range 1 to 65535.

If any field of the Email configuration is not valid, the field is shown in red.

Configurable Properties file

  • <Ephesoft installation directory>\Application\WEB-INF\classes\META-INF\dcma-mail-import\mail-import.properties:
Configurable property | Type of value | Value options | Description
dcma.importMail.cronExpression | String | A valid cron expression | The CRON expression defining the look-up time for the plug-in, i.e. at what time the plug-in looks for any updates in the configured mail account.
dcma.supported.attachment.extension | String | List of valid file extensions | Defines the documents supported by the plug-in. Multiple entries are separated by a “;”.
  • <Ephesoft installation directory>\Application\WEB-INF\classes\META-INF\dcma-mail-import\open-office.properties:
Configurable property | Type of value | Value options | Description
openoffice.serverUrl | List of values | ON, OFF | Server used for connecting to the remote open office server instance. Used in case of connecting to external/remote service.
openoffice.serverPort | Integer | A valid and available port number. | Port number used for connecting to the open office server instance. Default port is 8100.
openoffice.autoStart | Boolean | True, False | Whether the open office server should be started / connected when XE starts. Default value is false.
openoffice.homePath | String | N-A | Path to open office installation. If no path is provided, a default value will be calculated based on the operating environment.
openoffice.maxTasksPerProcess | Integer | Any valid integer value. | Maximum number of simultaneous conversion tasks to be handled by a single open office process. Default value for optimized performance is 50.
openoffice.taskExecutionTimeout | Integer | | Timeout for conversion tasks (in milliseconds). Default value for optimized performance is 30 seconds.
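
To illustrate the “;”-separated format of dcma.supported.attachment.extension described above, here is a small sketch. The helper names and the sample value "pdf;tif; TIFF;" are hypothetical, and the case-insensitive comparison is an assumption rather than documented plug-in behaviour.

```python
def parse_extensions(value: str) -> list:
    """Split a ';'-separated extension list, ignoring blanks and leading dots."""
    return [item.strip().lower().lstrip(".") for item in value.split(";") if item.strip()]


def is_supported(filename: str, extensions: list) -> bool:
    """True if the file name ends in one of the configured extensions."""
    _, dot, extension = filename.rpartition(".")
    return bool(dot) and extension.lower() in extensions


extensions = parse_extensions("pdf;tif; TIFF;")
print(extensions, is_supported("scan.TIF", extensions))
```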

Email Enable

This feature enables or disables an email account for importing the documents present in a defined form from the user’s mail account. A batch class can have multiple user mail accounts, each of which can be enabled or disabled.

Test Email

This feature verifies whether a configured user mail account is valid. The user can check the connection for a mail account after configuring it.

Characteristics

  • The functionality/service allows the user to set up any number of mail accounts for gathering data.
  • The user is allowed to configure the account via UI.
  • The functionality/service can support multiple document formats.
  • The functionality/service makes use of the open-office to convert the received data files into application usable formats.
  • The functionality/service is capable of downloading and saving the attachments of a mail.

Steps of execution/working

  • When the plug-in properties have been set up properly, Ephesoft moves ahead with mail downloading by accessing the mail account.
  • Email import service reads the user’s mail configuration from the database, and tries to access the user’s mail account using the configured settings.
  • If the service is able to connect to the user account, it reads all the mails contained in the configured folder.
  • After the service has read the mails, it starts processing multiple mails at a time.
  • Each mail that is read goes through a three-step procedure: downloading, converting, and creating a batch for the mail.
  • If any error occurs while processing a mail, the service sends a notification mail to the mail accounts configured for notification.

Troubleshooting

The following are a few common error messages received when the plugin malfunctions:

S no. | Error message | Possible root cause
1 | Unable to convert Email into PDF file. | Open office service is either not running or has not been configured correctly.
2 | Error in Server Type Configuration, only IMAP/POP3 is allowed. | Plug-in only supports IMAP or POP3 server type. Check the user’s account configuration.
3 | Not able to establish connection. | Connection could not be established for the current user’s account configuration.
4 | Could not find port number. Trying with default value of 995. | Port number specified in the user’s configuration is invalid, hence plug-in tries to connect on the default POP3 port.
5 | Could not find port number. Trying with default value of 993. | Port number specified in the user’s configuration is invalid, hence plug-in tries to connect on the default IMAP port.
6 | Error while reading mail contents | Either email body or other attachments could not be read and converted.
7 | Not able to process the mail reading. | Some error in reading the contents of mail. Open-office could not convert the source file into the desired format.
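
The port fallback implied by messages 4 and 5, together with the server type check behind message 2, can be summarised in a few lines. This is a sketch of the described behaviour only, not the plug-in's code: the function name is hypothetical, and treating anything outside 1 to 65535 as "invalid" is an assumption based on the port range given in the mail configuration section.

```python
# Default ports quoted in error messages 4 and 5.
DEFAULT_PORTS = {"POP3": 995, "IMAP": 993}


def resolve_port(server_type: str, port) -> int:
    """Return the port to connect on, falling back to the default when invalid."""
    kind = server_type.strip().upper()
    if kind not in DEFAULT_PORTS:
        raise ValueError("Error in Server Type Configuration, only IMAP/POP3 is allowed.")
    if isinstance(port, int) and 1 <= port <= 65535:
        return port
    return DEFAULT_PORTS[kind]


print(resolve_port("IMAP", 143), resolve_port("POP3", None))
```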

Email Import with Folder other than Inbox

Earlier in Email Import Configuration folder name was not editable. It was fixed to Inbox folder only. Now, Folder name is editable for IMAP Server but is fixed to Inbox for POP3 Server.


Function Keys


Overview

This functionality gives the application user (mainly review operators) the flexibility to customize shortcuts for specific operations on the RV screen. The user can run a script method, as needed, simply by pressing a key.

Parameters involved:

  1. Method name: defines the name of the method in the script which should be executed upon usage.
  2. Key: the shortcut key associated with the method.
  3. Description: contains the user’s description for the method.

Below snapshot displays the parameters involved in function key mapping.

Steps for adding a function key:

  1. Click on the Add button.
  2. Enter the corresponding values in method name.
  3. Click on the Apply button.
  4. A message will be displayed that “Batch Class updated successfully”.

Characteristics

  • The functionality allows one to customize the RV screen to use shortcut keys performing user defined functions.
  • The user can associate a function key to a particular method specified in the script ‘ScriptFieldValueChange.java‘ present at the location ‘{Ephesoft-Home}\SharedFolders\{Batch class ID}\scripts’
  • User can run the script either by clicking the button displayed on UI or the function key button available on the Review Validate UI.
  • User can only associate one method to a particular key but same method can be assigned to multiple keys.
  • User can choose all the values from F2 to F11 except F5. Support for F1 and F5 has not been provided so that they do not conflict with the browser shortcuts.
  • The functionality allows user to add description of the script associated to a key.
  • Function keys are document type specific and will only be displayed on RV screen if selected document type has function keys defined for a batch class.
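
The key rules in the list above (F2 to F11 without F5, one method per key, the same method allowed on several keys) can be expressed as a small check. This is an illustrative sketch; the function and the method names used in the example are hypothetical and not part of the Ephesoft UI code.

```python
# F2..F11 except F5, as stated in the characteristics above.
ALLOWED_KEYS = {f"F{n}" for n in range(2, 12)} - {"F5"}


def add_mapping(mappings: dict, key: str, method_name: str) -> None:
    """Associate a function key with a script method, enforcing the stated rules."""
    if key not in ALLOWED_KEYS:
        raise ValueError(f"{key} cannot be used as a function key")
    if key in mappings:
        raise ValueError(f"{key} is already mapped to {mappings[key]}")
    mappings[key] = method_name


mappings = {}
add_mapping(mappings, "F2", "clearFields")   # hypothetical method name
add_mapping(mappings, "F3", "clearFields")   # same method on a second key is allowed
print(sorted(mappings))
```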

Working

      • When a batch reaches the Review/Validation stage, the user can either press the function key to run a particular method or click the function key button displayed in the 2nd panel.

      • A dialog box saying ‘Executing function keyscript’ appears. By the time it closes, the user’s script has been executed.


GraphicMagick


Overview

GraphicsMagick has been added as an option for image conversion process in the following Batch Class plugins along with ImageMagick:

  • Import multipage file plugin
  • Create Display Image plugin
  • Create Thumbnails plugin
  • Create OCR Input plugin
  • Create MultiPage Files plugin

Batch Class Plugin Changes

  • Import multipage file plugin

 

The following new properties have been added:

Configurable property | Type of value | Value options | Description
Image Conversion Process | List of values | IMAGE_MAGICK, GRAPHIC_MAGICK | Multi-page TIFF image conversion process. Default value is IMAGE_MAGICK.
GM Convert Output Image Parameters | String | N-A | Output parameters for the GraphicsMagick image conversion process used to convert a multipage tiff into multiple single page tiffs.
GM Convert Input Image Parameters | String | N-A | Input parameters for the GraphicsMagick image conversion process used to convert a multipage tiff into multiple single page tiffs.

 

  • Create Display Image plugin, Create Thumbnails plugin & Create OCR Input plugin

A property named “image_conversion_using_graphicsMagick” has been added in “imagemagick.properties” file located in Ephesoft_Installation_Directory\Application\WEB-INF\classes\META-INF\dcma-imagemagick folder governing whether GraphicsMagick will be used as Image conversion process or not. The property can take value either ‘ON’ or ‘OFF’.

By default, the value of property will be set as “ON” stating that GraphicsMagick will be used as image conversion process. In case the value is set as “OFF”, ImageMagick will be used as image conversion process.

  • Create MultiPage Files plugin

In the “Create MultiPage Files plugin”, GraphicsMagick has been added as an option in the “Multipage File Export Process” dropdown.

Performance

Below are the comparative performance numbers obtained while running batches using ImageMagick & using GraphicMagick:

 

Refer to http://www.graphicsmagick.org/ for more details.

Supported Parameters

The following parameters are supported by GraphicsMagick:

Parameters | Support | Alternative
-compress lzw | Yes |
-compress Group4 | Yes |
-colorspace gray | Yes |
-colorspace rgb | Yes |
-alpha off | No | +matte
-thumbnail | Yes |
-resize | Yes |
-limit area 100mb | No | -limit memory 100mb
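
When existing ImageMagick parameter strings are reused for GraphicsMagick, the two unsupported entries in the table above need to be swapped for their alternatives. The sketch below shows one way to do that on a tokenised parameter list; the function name is hypothetical and only the two substitutions listed in the table are handled.

```python
# Unsupported parameter prefixes and their alternatives, from the table above.
ALTERNATIVES = {
    ("-alpha", "off"): ["+matte"],
    ("-limit", "area"): ["-limit", "memory"],
}


def to_graphicsmagick(args: list) -> list:
    """Rewrite ImageMagick-style arguments using the listed alternatives."""
    converted = []
    index = 0
    while index < len(args):
        pair = tuple(args[index:index + 2])
        if pair in ALTERNATIVES:
            converted.extend(ALTERNATIVES[pair])
            index += 2
        else:
            converted.append(args[index])
            index += 1
    return converted


print(to_graphicsmagick("-alpha off -limit area 100mb -compress lzw".split()))
```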

 

 

Refer to http://www.graphicsmagick.org/GraphicsMagick.html for the list of supported parameters.

Import multipage files plugin


Overview

Import Multipage Files plugin is required when running a batch on multipage images. This plugin breaks multipage pdfs and tiffs into multiple single page tiffs. Multipage pdfs are converted to single page tiffs using Ghostscript/Recostar, whereas multipage tiffs are converted to single page tiffs using ImageMagick/GraphicsMagick.

Configuration

UI Configuration

IMPORT_MULTIPAGE_FILES properties can be edited at following admin UI:

Configurable property | Type of value | Value options | Description
IM Convert Input Image Parameters | String | N-A | Input parameters for the ImageMagick image conversion process used to convert a multipage tiff into multiple single page tiffs.
Multi Page Import | List of values | YES, NO | Switch for multipage files import plugin. If set to NO, multipage files (pdf and tiff) will not be converted to multiple single page tiffs.
IM Convert Output Image Parameters | String | N-A | Output parameters for the ImageMagick image conversion process used to convert a multipage tiff into multiple single page tiffs.
Ghostscript Image Parameters | String | N-A | Parameters for the Ghostscript command used to convert a multipage pdf into multiple single page tiffs.
PDF To TIFF Conversion Process | List of values | Ghostscript, Recostar | PDF to TIFF conversion process.
Recostar Compression Ratio | List of values | 10, 20, 30, 40, 50, 60, 70 | Compression ratio while converting PDF to TIFF through Recostar.
Image Conversion Process | List of values | IMAGE_MAGICK, GRAPHIC_MAGICK | Multi-page TIFF image conversion process. Default value is IMAGE_MAGICK.
GM Convert Output Image Parameters | String | N-A | Output parameters for the GraphicsMagick image conversion process used to convert a multipage tiff into multiple single page tiffs.
GM Convert Input Image Parameters | String | N-A | Input parameters for the GraphicsMagick image conversion process used to convert a multipage tiff into multiple single page tiffs.

Property File Configuration

Property File location: <Ephesoft-Installation-Path>\ Application\WEB-INF\classes\META-INF\dcma-import-folder\dcma-import-folder.properties\*

Configurable property | Type of value | Value options | Description
import.folder_ignore_char_list | String | N-A | Semicolon-separated list of characters that are to be replaced in the file names encountered by the plugin.
import.ignore_replace_char | String | N-A | Character that replaces the characters listed in “import.folder_ignore_char_list” in the file names encountered by the plugin.
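The effect of these two properties on a file name can be sketched as follows. This is a minimal illustration, assuming the list is split on semicolons and each entry is replaced literally; the class name, method names and the sample values "%;&" and "_" are hypothetical and are not taken from the Ephesoft code or its default configuration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of how import.folder_ignore_char_list and
// import.ignore_replace_char could be applied to a file name.
final class FileNameSanitizer {

    // Splits a semicolon-separated value such as "%;&" into its entries.
    static List<String> parseIgnoreList(String ignoreCharList) {
        return Arrays.stream(ignoreCharList.split(";"))
                .filter(entry -> !entry.isEmpty())
                .collect(Collectors.toList());
    }

    // Replaces every listed character in the file name with the replacement character.
    static String sanitize(String fileName, String ignoreCharList, String replaceChar) {
        String result = fileName;
        for (String ignored : parseIgnoreList(ignoreCharList)) {
            result = result.replace(ignored, replaceChar);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample values, for illustration only.
        System.out.println(sanitize("scan%01&final.tif", "%;&", "_"));
    }
}
```

With the assumed values, "scan%01&final.tif" becomes "scan_01_final.tif".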

Optimization parameters and results

“-sDEVICE” parameter

  • -sDEVICE=tiff12nc
    • Produces 12-bit RGB output
  •  -sDEVICE=tiff24nc
    • Produces 24-bit RGB output
  •  -sDEVICE=tiff48nc
    • Produces 48-bit RGB output
  •  -sDEVICE=tiff32nc
    • Produces 32-bit CMYK output
  • -sDEVICE=tiff64nc
    • Produces 64-bit CMYK output
  • -sDEVICE=tiffscaled24 -sCompression=lzw
    • Produces a 24 bit RGB image and allows the use of a special compression tag along with it which allows us to compress the size of the image.
  • -sDEVICE=tifflzw
    • Produces black-and-white output and can be combined with various compression options.

Results

Following are the results for images produced by splitting a PDF with the given specifications under different Ghostscript parameters:

  • PDF Size: 514 KB
  • Number of pages in PDF: 26

Note: The PDF contained a mixture of colored and B/W images.

-sDEVICE | Type of output | Size per image produced (in KB) | Total images size (in MB)
tiff12nc | Same type of images | 12,241 | 325
tiff24nc | Same type of images | 25,446 | 626
tiff48nc | Same type of images | 51,148 | 1258
tiffscaled24 -sCompression=lzw | Same type of images | 250-400 | 6.75
tifflzw | All images converted to B/W | 50-90 | 1.4
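To try one of the -sDEVICE settings above outside Ephesoft, a Ghostscript command line can be assembled as in the sketch below. It is not how the plugin itself invokes Ghostscript: the class name, the executable name "gs" (typically gswin64c on Windows), the file names and the output pattern are all assumptions. The -dNOPAUSE, -dBATCH and -sOutputFile switches are standard Ghostscript options.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that assembles a Ghostscript command line for splitting
// a PDF into single-page TIFFs with a given set of device parameters.
final class GhostscriptCommand {

    static List<String> build(String gsExecutable, String deviceParams,
                              String inputPdf, String outputPattern) {
        List<String> command = new ArrayList<>();
        command.add(gsExecutable);
        command.add("-dNOPAUSE"); // do not prompt between pages
        command.add("-dBATCH");   // exit after the last page
        // Device parameters as they would be typed, e.g. "-sDEVICE=tiffscaled24 -sCompression=lzw".
        for (String param : deviceParams.trim().split("\\s+")) {
            command.add(param);
        }
        command.add("-sOutputFile=" + outputPattern); // %04d is replaced by the page number
        command.add(inputPdf);
        return command;
    }

    public static void main(String[] args) {
        List<String> command = build("gs", "-sDEVICE=tiffscaled24 -sCompression=lzw",
                "input.pdf", "page-%04d.tif");
        System.out.println(String.join(" ", command));
    }
}
```

The resulting list can be passed to new ProcessBuilder(command) to run the conversion on a machine where Ghostscript is installed, which makes it easy to compare output sizes for different devices on your own documents.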

Troubleshooting

Following are a few common error messages received when the plugin malfunctions:

S no. | Error message | Possible root cause
1 | Invalid property file configuration | The following properties located in the “<Ephesoft-Installation-Path>\ Application\WEB-INF\classes\META-INF\dcma-import-folder\ dcma-import-folder.properties” file are empty: import.folder_ignore_char_list, import.ignore_replace_char
2 | Converted Tiff files count not equal to the TIFF pages count. | The number of pages in the PDF/multipage TIFF is not equal to the number of converted TIFF files.

KV Extraction Normalization


Overview

Given images with the same aspect ratio but different resolutions, the Ephesoft system has built-in mathematics to automatically deal with the differing resolutions of the same image and successfully handle KV extraction.

Usability

This feature takes into account the case where a KV extraction rule is configured on a sample of one resolution, whereas the input image at batch execution has a different resolution. The application internally adjusts the overlays for the batch execution image by normalizing them.

Instead of storing key and value coordinates as distances from the origin of the image, the application stores them in normalized form. This way, the stored coordinates are independent of image size.

For example, suppose we have the following:

Key Coordinates: (x, y)

Value Coordinates: (x’, y’)

Image Coordinates (width and height of the configured sample image): (X, Y)

The normalized values to be stored would be:

Normalized Key Coordinates: (x/X, y/Y)

Normalized Value Coordinates: (x’/X, y’/Y)

Other values related to the KV rule are calculated and stored similarly.

During extraction, the normalized values are de-normalized with respect to the input image given for extraction.

Assuming the Image Coordinates of the input image are: (P, Q)

De-Normalized Key Coordinates: ((x/X)*P, (y/Y)*Q)

De-Normalized Value Coordinates: ((x’/X)*P, (y’/Y)*Q)

These de-normalized coordinates are thus scaled to the dimensions of the new image. Key Value extraction is performed based on these de-normalized coordinates.
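The formulas above can be sketched in a few lines of code. This is a minimal illustration of the arithmetic only, not Ephesoft's implementation; the class name, method names and the sample numbers are hypothetical.

```java
// Hypothetical sketch of the normalization arithmetic described above.
final class KvNormalization {

    // Stores a coordinate relative to the image size: (x/X, y/Y).
    static double[] normalize(double x, double y, double imageWidth, double imageHeight) {
        return new double[] { x / imageWidth, y / imageHeight };
    }

    // Scales a normalized coordinate to the input image size: ((x/X)*P, (y/Y)*Q).
    static double[] denormalize(double[] normalized, double targetWidth, double targetHeight) {
        return new double[] { normalized[0] * targetWidth, normalized[1] * targetHeight };
    }

    public static void main(String[] args) {
        // Key at (300, 150) on a 1200 x 1600 sample image (assumed values).
        double[] normalizedKey = normalize(300, 150, 1200, 1600);
        // Same aspect ratio at twice the resolution: 2400 x 3200.
        double[] key = denormalize(normalizedKey, 2400, 3200);
        System.out.println(key[0] + ", " + key[1]);
    }
}
```

In this example the key configured at (300, 150) on the 1200 x 1600 sample lands at (600, 300) on the 2400 x 3200 input image, which is the same relative position on the page.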

Key Value Extraction


Overview

This feature allows a user to specify a ‘Key-Value pair’ that is used to extract document level index field values based on the location of the ‘value’ relative to a specified key.

Usability

The user is provided with ‘Add’ and ‘Edit’ buttons to define and modify the KV patterns. These buttons are available on the KV Extraction Rule node present under each index field:

Figure: Key Value Extraction Grid under Batch Class Management

An option is provided to enable or disable a particular KV extraction rule. This setting allows users to select which extraction rules are used for batch processing. Enabled and disabled extraction rules are marked with a green and a red indicator respectively.

As soon as the user clicks either of the buttons specified above, the following UI is displayed.

Copy KV Extraction Rule

You can copy only one KV Extraction Rule at a time.

Key Value Extraction View

Each key value field consists of the following attributes in KV extraction:

  • Key Pattern (regex or other pre-defined field)
  • Value Pattern (regex)
  • Fuzzy Percent (None, 10%, 20% or 30%)
  • Fetch Value (First, Last or All)
  • Page Value (First, Last or All)
  • Zone Value (All, Top, Left, Right, Middle or Bottom)
  • Weight (0 to 1)

Key: Regular expression pattern for the key.

Value: Regular expression pattern for the value.

Fuzzy Percent: The fuzzy percent option returns extracted results that match a pattern approximately. Allowed values are None, 10%, 20% and 30%.

Fetch Value: The user can specify one of the following fetch values while defining an advanced key value pair:

  • First: extracts only the first data from the value zone that matches the specified value pattern.
  • Last: extracts only the last data from the value zone that matches the specified value pattern.
  • All: extracts all data from the value zone that matches the specified value pattern.

Page Value: The user can specify one of the following page values while defining an advanced key value pair:

  • ALL: KV extraction is performed on all pages of the document.
  • FIRST: KV extraction is performed on the first page of the document.
  • LAST: KV extraction is performed on the last page of the document.

Weight: A value from 0 to 1 that is multiplied with the confidence score to calculate the new confidence score.
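The Fetch Value and Weight attributes described above can be illustrated with a small sketch. It only mirrors the described behaviour using java.util.regex; the class, enum and method names are hypothetical, and the actual Ephesoft extraction logic (zones, fuzzy matching, confidence calculation) is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the Fetch Value (First, Last, All) and Weight attributes.
final class FetchValueSketch {

    enum FetchValue { FIRST, LAST, ALL }

    // Returns the matches of the value pattern in the value zone text,
    // reduced to the first or last match when requested.
    static List<String> fetch(String valueZoneText, String valuePattern, FetchValue mode) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(valuePattern).matcher(valueZoneText);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        if (matches.isEmpty() || mode == FetchValue.ALL) {
            return matches;
        }
        int index = (mode == FetchValue.FIRST) ? 0 : matches.size() - 1;
        return Collections.singletonList(matches.get(index));
    }

    // Weight (0 to 1) is multiplied with the confidence score.
    static double weightedConfidence(double confidence, double weight) {
        return confidence * weight;
    }

    public static void main(String[] args) {
        String zone = "101 202 303"; // assumed OCR text of the value zone
        System.out.println(fetch(zone, "\\d+", FetchValue.FIRST));
        System.out.println(fetch(zone, "\\d+", FetchValue.LAST));
        System.out.println(fetch(zone, "\\d+", FetchValue.ALL));
        System.out.println(weightedConfidence(80.0, 0.5));
    }
}
```

For the assumed zone text, First yields 101, Last yields 303 and All yields all three numbers; a confidence of 80 with a weight of 0.5 becomes 40.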

Anchor to be used for multiple fields:

This feature uses the result of previously extracted document level fields for the extraction of other document level fields.

The user can use a previously defined field as a key while defining a key value field for some other document level field.

  • There is a “Use Existing Field For Key” checkbox present on the KV extraction UI.
  • On checking this, a list is populated with the names of the document level fields that can be used as a key. The user can select any of those fields as the key.
    • Note: Only those document level fields are shown in the drop down whose field order number is less than the field order number of the field for which the key value pair is being defined.
  • If the “Use Existing Field For Key” checkbox is selected, the value of the field selected as key should be captured as the key.

Example: Suppose there are two document level fields State and City, and image contains following data:

State: CALIFORNIA

City: LA

While defining the advanced key value field for City,

  • Use existing field for key should be checked.
  • State should be selected from the drop down for key pattern.
  • CALIFORNIA should be captured as key.
  • LA should be captured as a value.
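The field order rule from the note above (only fields with a lower field order number are offered as keys) can be sketched as a simple filter. The class and method names and the third field "Zip" are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of which fields would appear in the
// "Use Existing Field For Key" drop down for a given field.
final class ExistingFieldKeys {

    // Returns the fields whose field order number is less than that of currentField.
    // Assumes currentField is present in the map.
    static List<String> eligibleKeys(Map<String, Integer> fieldOrders, String currentField) {
        int currentOrder = fieldOrders.get(currentField);
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : fieldOrders.entrySet()) {
            if (entry.getValue() < currentOrder) {
                eligible.add(entry.getKey());
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        Map<String, Integer> fieldOrders = new LinkedHashMap<>();
        fieldOrders.put("State", 1);
        fieldOrders.put("City", 2);
        fieldOrders.put("Zip", 3); // assumed extra field
        System.out.println(eligibleKeys(fieldOrders, "City"));
    }
}
```

With these assumed field orders, only State is offered as a key for City, which matches the example above.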

Uploading File:

Support for uploading multiple files is provided on the KV extraction screen from version 4.0 onwards. The file formats supported for uploading are:

  • PDF
  • TIFF/TIF

Figure: Key Value Extraction File Upload View

Dropdowns and arrow keys are available on the menu bar for traversing between files and pages respectively.

Editing Overlays in KV Extraction:

Editing key and value overlays on the KV Extraction screen has been made much smoother and easier in version 4.0.

Default overlays for key and value appear on the uploaded image. Overlays are resizable and draggable inside the image, which makes working on them much easier.

View OCR Data:

The feature to view the OCR’ed contents of an image is introduced in version 4.0. A ‘View OCR Data’ toggle button is available on the menu bar. This button generates the OCR content of the currently loaded image and displays it on the UI.

Figure: Key Value Extraction View OCR Data View

Test KV:

The feature to view the results extracted from a Key-Value pair is enhanced to show the matched values on the image. Values extracted from the image based on the key and value patterns are drawn on the image, and the respective details of the extraction are shown in the bottom panel grid. The user can scroll a match into view by clicking on the respective row.

Figure: Key Value Extraction Test KV View

Clear:

This feature clears the extracted results from the image and redraws the Key-Value overlays.

Toolbar:

Navigation icons to traverse to the next, previous, first and last images are available in the Image toolbar.

The Apply button is renamed to ‘Apply KV’ on the Advanced KV screen.

Ephesoft ocrClassifyExtract Mobile Web Service API


Overview

The new API performs extraction on an input PDF document or a ZIP file (containing single-page or multipage tiff/tif or pdf files). Extraction plugins are fetched from the batch class corresponding to the input batch class identifier. Extraction is performed based on the extraction plugin configurations and rules configured for that batch class.

If the document type is given as an input parameter, document classification is not performed and extraction is performed for the specified document type; otherwise, both classification and extraction are performed on the input to generate the results.

Input Parameters

The input parameters to the Web Service API are:

1. PDF File (single or multipage) / ZIP File (the zip file may contain single-page or multipage tif/tiff or pdf files)

2. batchClassIdentifier: String parameter for the batch class identifier

3. docType (optional parameter): If the user enters a docType, no document classification is performed; otherwise, classification of the document is performed.

Output Parameters

The web service returns the batch XML as its output.

Web Service URL

http://<HOSTNAME>:8080/dcma/rest/ocrClassifyExtract

Example-

localhost:8080/dcma/rest/batchClass/ocrClassifyExtract

Checklist:

  1. Extraction is done only if the Extraction module is configured for the particular batch class.
  2. Extraction is performed only for the plugins that have the extraction switch ON in the batch class configuration.

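Sample client code using Apache Commons HttpClient:

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.methods.multipart.FilePart;
import org.apache.commons.httpclient.methods.multipart.MultipartRequestEntity;
import org.apache.commons.httpclient.methods.multipart.Part;
import org.apache.commons.httpclient.methods.multipart.StringPart;

public class OcrClassifyExtractClient {

    public static void main(String[] args) {
        ocrClassifyExtract();
    }

    private static void ocrClassifyExtract() {
        HttpClient client = new HttpClient();
        String url = "http://localhost:8080/dcma/rest/ocrClassifyExtract";
        PostMethod mPost = new PostMethod(url);
        // Input file to be processed
        File file1 = new File("C:\\sample\\US-Invoice.tiff");
        Part[] parts = new Part[2];
        try {
            parts[0] = new FilePart(file1.getName(), file1);
            // Adding parameter for batchClassIdentifier
            parts[1] = new StringPart("batchClassIdentifier", "BC1");
            // Send both parts as a multipart/form-data request
            MultipartRequestEntity entity = new MultipartRequestEntity(parts, mPost.getParams());
            mPost.setRequestEntity(entity);
            int statusCode = client.executeMethod(mPost);
            if (statusCode == 200) {
                System.out.println("Web service executed successfully..");
                // The response body contains the batch XML
                String responseBody = mPost.getResponseBodyAsString();
                System.out.println(statusCode + " *** " + responseBody);
            } else if (statusCode == 403) {
                System.out.println("Invalid username/password..");
            } else {
                System.out.println(mPost.getResponseBodyAsString());
            }
        } catch (FileNotFoundException e) {
            System.err.println("File not found for processing..");
        } catch (HttpException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Always release the connection back to the client
            mPost.releaseConnection();
        }
    }
}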
