Overview
The API performs extraction on an input document, supplied either as a PDF file or as a ZIP file containing single-page or multipage TIFF/TIF or PDF files. Extraction plugins are fetched from the batch class that matches the supplied batch class identifier, and extraction is performed according to the plugin configurations and rules defined for that batch class.
If a document type is supplied as an input parameter, document classification is skipped and extraction is performed for the specified document type. Otherwise, both classification and extraction are performed on the input to generate the results.
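The rule above can be sketched as a small decision helper. This only illustrates the documented behaviour and is not the service's implementation. The class and method names (ProcessingPlan, stepsFor) are hypothetical, and treating a blank docType as "not supplied" is an assumption.

```java
import java.util.List;

/** Hypothetical helper that mirrors the documented docType rule. */
final class ProcessingPlan {
    private ProcessingPlan() {}

    /**
     * Returns the steps the service is documented to run:
     * extraction only when a docType is supplied, otherwise
     * classification followed by extraction.
     */
    static List<String> stepsFor(String docType) {
        // Assumption: a null or blank docType counts as "not supplied".
        boolean hasDocType = docType != null && !docType.trim().isEmpty();
        return hasDocType
                ? List.of("EXTRACTION")
                : List.of("CLASSIFICATION", "EXTRACTION");
    }
}
```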
Input Parameters
The Web Service API accepts the following input parameters:
1. Input file: a PDF file (single page or multipage) or a ZIP file (which may contain single-page or multipage TIF/TIFF or PDF files).
2. batchClassIdentifier: String parameter containing the batch class identifier.
3. docType (optional): if a docType is supplied, no document classification is performed; otherwise the document is classified.
Output Parameters
The web service returns the Batch XML as its output.
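A client will usually want to pull the extracted values out of the returned Batch XML. The following is a minimal sketch using only the JDK's DOM parser. The element names it looks for (DocumentLevelField, Name, Value) are assumptions made for illustration and should be checked against an actual response. BatchXmlReader and readFields are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical reader for the Batch XML returned by the service. */
final class BatchXmlReader {
    private BatchXmlReader() {}

    /**
     * Returns field name -> value for every DocumentLevelField element.
     * ASSUMPTION: the tag names used here are illustrative, not confirmed.
     */
    static Map<String, String> readFields(String batchXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations as a precaution against XXE.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(batchXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("DocumentLevelField");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element field = (Element) nodes.item(i);
                fields.put(childText(field, "Name"), childText(field, "Value"));
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse Batch XML", e);
        }
    }

    /** Text of the first descendant with the given tag, or "" if absent. */
    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? "" : list.item(0).getTextContent().trim();
    }
}
```

Passing the response body string from the client to readFields would then give a map of extracted fields, provided the assumed tag names match the real output.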
Web Service URL
http://<HOSTNAME>:8080/dcma/rest/ocrClassifyExtract
Example:
http://localhost:8080/dcma/rest/ocrClassifyExtract
Checklist:
- Extraction is performed only if the Extraction module is configured for the batch class.
- Extraction is performed only for plugins whose extraction switch is ON in the batch class configuration.
Sample client code using Apache Commons HttpClient:
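// Imports required by the sample (Commons HttpClient 3.x)
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.methods.multipart.FilePart;
import org.apache.commons.httpclient.methods.multipart.MultipartRequestEntity;
import org.apache.commons.httpclient.methods.multipart.Part;
import org.apache.commons.httpclient.methods.multipart.StringPart;

private static void ocrClassifyExtract() {
    HttpClient client = new HttpClient();
    String url = "http://localhost:8080/dcma/rest/ocrClassifyExtract";
    PostMethod mPost = new PostMethod(url);
    // Input file to be processed
    File file1 = new File("C:\\sample\\US-Invoice.tiff");
    Part[] parts = new Part[2];
    try {
        // File part, named after the file itself
        parts[0] = new FilePart(file1.getName(), file1);
        // Adding parameter for batchClassIdentifier
        parts[1] = new StringPart("batchClassIdentifier", "BC1");
        MultipartRequestEntity entity = new MultipartRequestEntity(parts, mPost.getParams());
        mPost.setRequestEntity(entity);
        int statusCode = client.executeMethod(mPost);
        if (statusCode == 200) {
            System.out.println("Web service executed successfully..");
            // The response body holds the Batch XML
            String responseBody = mPost.getResponseBodyAsString();
            System.out.println(statusCode + " *** " + responseBody);
        } else if (statusCode == 403) {
            System.out.println("Invalid username/password..");
        } else {
            System.out.println(mPost.getResponseBodyAsString());
        }
    } catch (FileNotFoundException e) {
        System.err.println("File not found for processing..");
    } catch (HttpException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // Always release the connection back to the client
        mPost.releaseConnection();
    }
}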
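The sample above sends only the file and batchClassIdentifier. The sketch below shows how the optional docType might be added to the same multipart request, using only the JDK so it carries no Commons HttpClient dependency. It assumes that the form field name is "docType", as listed under Input Parameters, and that the file part is named after the file as in the sample. MultipartBody, textFields and build are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical builder for the multipart/form-data request body. */
final class MultipartBody {
    private MultipartBody() {}

    /** Text fields of the request; docType is added only when supplied. */
    static Map<String, String> textFields(String batchClassIdentifier, String docType) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("batchClassIdentifier", batchClassIdentifier);
        // ASSUMPTION: the optional parameter is sent as a form field named "docType".
        if (docType != null && !docType.isEmpty()) {
            fields.put("docType", docType);
        }
        return fields;
    }

    /** Encodes one file part followed by the text fields. */
    static byte[] build(String boundary, String fileName, byte[] fileBytes,
                        Map<String, String> fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // File part, named after the file as in the Commons HttpClient sample.
        write(out, "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fileName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n");
        out.write(fileBytes, 0, fileBytes.length);
        write(out, "\r\n");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            write(out, "--" + boundary + "\r\n"
                    + "Content-Disposition: form-data; name=\"" + field.getKey() + "\"\r\n\r\n"
                    + field.getValue() + "\r\n");
        }
        // Closing boundary marks the end of the body.
        write(out, "--" + boundary + "--\r\n");
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }
}
```

The resulting bytes could be posted with java.net.http.HttpClient via HttpRequest.BodyPublishers.ofByteArray, with the Content-Type header set to "multipart/form-data; boundary=" followed by the same boundary string. With Commons HttpClient, the equivalent change would be a third StringPart in the parts array.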