This API extracts the document-level fields that match the Key Value pattern supplied in the input XML. It takes an HOCR file as input. If the Key Value pattern is not found in the HOCR file, the API creates empty document-level fields.
Request Method POST
Input Parameters
Input Parameter | Values | Descriptions |
AdvancedKV | Either "true" or "false" | Specifies whether Key Value extraction is performed using advanced Key Value extraction. |
LocationType | One of: TOP, RIGHT, LEFT, BOTTOM, TOP_RIGHT, TOP_LEFT, BOTTOM_LEFT, BOTTOM_RIGHT | Location, relative to the key pattern, at which the value pattern is fetched. |
NoOfWords | Integer | Used only when AdvancedKV is false. Number of words at the RIGHT location to add to the result of the value pattern found in the HOCR. |
KeyPattern | Must not be empty. Must be a valid regular expression. | Used to verify that the key pattern is present in the given HOCR. |
ValuePattern | Must not be empty. Must be a valid regular expression. | Used to verify that the value pattern is present in the given HOCR for that particular key pattern. |
KVFetchValue | One of: ALL, FIRST, LAST | Specifies whether to fetch all, the first, or the last value pattern found. |
Multiplier | Float between 0 and 1 | Multiplied with the confidence to update the confidence of the fields extracted using advanced KV. |
Length | Integer | To get the length value, use the Ephesoft Admin screen as shown in the screenshot above. |
Width | Integer | To get the width value, use the Ephesoft Admin screen as shown in the screenshot above. |
Xoffset | Integer | To get the xoffset value, use the Ephesoft Admin screen as shown in the screenshot above. |
Yoffset | Integer | To get the yoffset value, use the Ephesoft Admin screen as shown in the screenshot above. |
Weight | Float between 0 and 1 | Sets the weight of an extraction rule for a particular document-level field. |
KeyFuzziness | Float between 0 and 1 | Defines the acceptable fuzziness in the key generated from the HOCR. |
hocrFileName | String | Name of the HOCR file (in XML format) passed for processing. |
Along with these parameters, the string parameter hocrFileName must also be supplied, containing the name of the uploaded HOCR file.
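The value constraints listed in the table can be checked on the client before the request is sent. The following is a minimal sketch of such a check, not the validation the server performs. The class `ExtractKVParamCheck` and its methods are hypothetical names introduced for illustration, and the sketch assumes the 0 to 1 range is inclusive (the sample XML uses a Multiplier of 1).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical client-side helper; not part of the web service itself.
class ExtractKVParamCheck {

    static final List<String> LOCATION_TYPES = Arrays.asList(
            "TOP", "RIGHT", "LEFT", "BOTTOM",
            "TOP_RIGHT", "TOP_LEFT", "BOTTOM_LEFT", "BOTTOM_RIGHT");

    static final List<String> FETCH_VALUES = Arrays.asList("ALL", "FIRST", "LAST");

    // A pattern is accepted if it is non-empty and compiles as a Java regex.
    static boolean isValidRegex(String pattern) {
        if (pattern == null || pattern.isEmpty()) {
            return false;
        }
        try {
            Pattern.compile(pattern);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Assumption: the documented range "between 0 and 1" includes both ends.
    static boolean inUnitRange(double value) {
        return value >= 0.0 && value <= 1.0;
    }

    // Returns one message per violated constraint; an empty list means all checks passed.
    static List<String> check(String locationType, String keyPattern, String valuePattern,
                              String kvFetchValue, double multiplier, double weight,
                              double keyFuzziness) {
        List<String> errors = new ArrayList<>();
        if (!LOCATION_TYPES.contains(locationType)) {
            errors.add("LocationType must be one of " + LOCATION_TYPES);
        }
        if (!isValidRegex(keyPattern)) {
            errors.add("KeyPattern must be a non-empty, valid regex");
        }
        if (!isValidRegex(valuePattern)) {
            errors.add("ValuePattern must be a non-empty, valid regex");
        }
        if (!FETCH_VALUES.contains(kvFetchValue)) {
            errors.add("KVFetchValue must be one of " + FETCH_VALUES);
        }
        if (!inUnitRange(multiplier)) {
            errors.add("Multiplier must be between 0 and 1");
        }
        if (!inUnitRange(weight)) {
            errors.add("Weight must be between 0 and 1");
        }
        if (!inUnitRange(keyFuzziness)) {
            errors.add("KeyFuzziness must be between 0 and 1");
        }
        return errors;
    }

    public static void main(String[] args) {
        // Values taken from the sample XML in this section.
        System.out.println(check("BOTTOM_LEFT", "Invoice", "[a-zA-Z]{10,15}",
                "ALL", 1.0, 0.1, 0.2));
    }
}
```

Because Java's regex dialect may differ in detail from the one the server uses, a pattern that passes this check is not guaranteed to be accepted by the service.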
Web Service URL: http://{serverName}:{port}/dcma/rest/extractKV
Checklist:
- To use Advanced KV, the user should have admin access in order to fetch accurate values of Length, Width, Xoffset and Yoffset. Before using AdvancedKV, test the image with the Ephesoft Admin screen and note the values of Length, Width, Xoffset, Yoffset and LocationType for the particular Key Value pattern.
- If AdvancedKV is true, NoOfWords is not used and all other parameters are used.
- If AdvancedKV is false, NoOfWords, KeyPattern, ValuePattern and LocationType are used.
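The last two checklist items can be summarized as a small lookup. The sketch below is one reading of the checklist, assuming that the parameters it lists for each mode are the only ones that take effect. The class `ExtractKVModes` and the method `paramsInEffect` are hypothetical names, not part of the API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that mirrors the checklist; not part of the web service.
class ExtractKVModes {

    // Element names from the <Params> block, excluding the AdvancedKV switch itself.
    static final List<String> ALL_PARAMS = Arrays.asList(
            "LocationType", "NoOfWords", "KeyPattern", "ValuePattern",
            "KVFetchValue", "Multiplier", "Length", "Width",
            "Xoffset", "Yoffset", "Weight", "KeyFuzziness");

    // Returns the parameters that take effect for the given AdvancedKV setting.
    static List<String> paramsInEffect(boolean advancedKV) {
        if (advancedKV) {
            // Advanced mode: everything except NoOfWords.
            List<String> used = new ArrayList<>(ALL_PARAMS);
            used.remove("NoOfWords");
            return used;
        }
        // Simple mode: only the four parameters named in the checklist.
        return Arrays.asList("NoOfWords", "KeyPattern", "ValuePattern", "LocationType");
    }

    public static void main(String[] args) {
        System.out.println("AdvancedKV=true:  " + paramsInEffect(true));
        System.out.println("AdvancedKV=false: " + paramsInEffect(false));
    }
}
```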
Format for XML:
<ExtractKVParams>
    <Params>
        <AdvancedKV>true</AdvancedKV>
        <LocationType>BOTTOM_LEFT</LocationType>
        <NoOfWords>0</NoOfWords>
        <KeyPattern>Invoice</KeyPattern>
        <ValuePattern> [a-zA-Z]{10,15}</ValuePattern>
        <KVFetchValue>ALL</KVFetchValue>
        <Multiplier>1</Multiplier>
        <Length>384</Length>
        <Width>251</Width>
        <Xoffset>284</Xoffset>
        <Yoffset>105</Yoffset>
        <Weight>0.1</Weight>
        <KeyFuzziness>0.2</KeyFuzziness>
    </Params>
</ExtractKVParams>
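When the parameter file is generated rather than written by hand, the regular expressions in KeyPattern and ValuePattern may contain characters such as `<` or `&` that must be escaped in XML. The following is a minimal sketch of building the document shown above from a map of values. The class `ExtractKVXmlBuilder` and its methods are hypothetical names, and the element order simply follows the insertion order of the map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for generating the parameter XML; not part of the web service.
class ExtractKVXmlBuilder {

    // Escapes the characters that are not allowed verbatim in XML element text.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Wraps each map entry in an element inside <ExtractKVParams><Params>.
    static String build(Map<String, String> params) {
        StringBuilder xml = new StringBuilder("<ExtractKVParams>\n  <Params>\n");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            xml.append("    <").append(entry.getKey()).append('>')
               .append(escape(entry.getValue()))
               .append("</").append(entry.getKey()).append(">\n");
        }
        return xml.append("  </Params>\n</ExtractKVParams>\n").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the elements in the order they were added.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("AdvancedKV", "false");
        params.put("LocationType", "RIGHT");
        params.put("NoOfWords", "0");
        params.put("KeyPattern", "Invoice");
        params.put("ValuePattern", "[a-zA-Z]{10,15}");
        System.out.println(build(params));
    }
}
```

The resulting string can be written to a file and uploaded as the XML part of the request.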
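Sample client code using Apache Commons HttpClient:
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.methods.multipart.FilePart;
import org.apache.commons.httpclient.methods.multipart.MultipartRequestEntity;
import org.apache.commons.httpclient.methods.multipart.Part;
import org.apache.commons.httpclient.methods.multipart.StringPart;

// The class name is illustrative; the original sample shows only the method.
public class ExtractKVClient {

    private static void extractKV() {
        HttpClient client = new HttpClient();
        String url = "http://localhost:8080/dcma/rest/extractKV";
        PostMethod mPost = new PostMethod(url);
        // XML file containing the input parameters.
        File f1 = new File("C:\\sample\\ExtractKVParam.xml");
        // HOCR file to process.
        File f2 = new File("C:\\sample\\US-Invoice_HOCR.xml");
        Part[] parts = new Part[3];
        try {
            parts[0] = new FilePart(f1.getName(), f1);
            parts[1] = new FilePart(f2.getName(), f2);
            // Name of the uploaded HOCR file, passed as a string parameter.
            parts[2] = new StringPart("hocrFileName", f2.getName());
            MultipartRequestEntity entity = new MultipartRequestEntity(parts, mPost.getParams());
            mPost.setRequestEntity(entity);
            int statusCode = client.executeMethod(mPost);
            if (statusCode == 200) {
                System.out.println("Web service executed successfully.");
                // The extraction result is returned in the response body.
                String responseBody = mPost.getResponseBodyAsString();
                System.out.println(statusCode + " *** " + responseBody);
            } else if (statusCode == 403) {
                System.out.println("Invalid username/password.");
            } else {
                System.out.println(mPost.getResponseBodyAsString());
            }
        } catch (FileNotFoundException e) {
            System.err.println("File not found for processing.");
        } catch (HttpException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Always release the connection, whether or not the call succeeded.
            mPost.releaseConnection();
        }
    }
}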