Within the last two years, many companies had to ask their customers to sign SEPA Direct Debit Mandates. The established procedure is to send out a form pre-filled with customer data (the SEPA Mandate). The customer signs the mandate and sends it back to the company.
One of our customers – an insurance company – uses Kofax Capture and Kofax Transformation Modules (KTM) for mailroom automation. In this context, the SEPA Mandates had to be recognized by KTM and the appropriate business process had to be triggered.
Initially, two processes for SEPA Mandates were established:
- The customer has signed the mandate: the flag ‘SEPA Mandate was granted’ is set. No further action is needed.
- The customer did not sign the mandate: further administrative processing must be started.
Within KTM, the signature is detected with an advanced zone locator that evaluates the blackness values of its zones.
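To illustrate what a blackness value means, here is a minimal sketch in plain Python. It is not KTM code and does not reflect how the advanced zone locator is implemented internally. It assumes a binarized image given as rows of 0 (white) and 1 (black), and the names `zone_blackness`, `looks_signed`, and the threshold of 0.02 are hypothetical.

```python
from typing import List


def zone_blackness(image: List[List[int]], left: int, top: int,
                   width: int, height: int) -> float:
    """Return the fraction of black pixels (value 1) inside a rectangular zone."""
    rows = image[top:top + height]
    pixels = [px for row in rows for px in row[left:left + width]]
    if not pixels:
        return 0.0  # empty zone: treat as completely white
    return sum(pixels) / len(pixels)


def looks_signed(blackness: float, threshold: float = 0.02) -> bool:
    """Assumed rule: a signature zone counts as signed above a blackness threshold."""
    return blackness >= threshold


# Tiny 4x4 example image with three black pixels
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
signature_zone = zone_blackness(image, left=1, top=1, width=2, height=2)
```

In a real project the threshold would have to be tuned on sample documents, because an empty signature zone is rarely perfectly white.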
Over time, this concept turned out to be insufficient. On the one hand, customers changed or supplemented the pre-filled customer data with handwritten comments, because the data was wrong or incomplete. On the other hand, some customers received blank SEPA Mandates, which they filled in by hand.
Thus the insurance company needed a third process for SEPA Mandates:
- The customer has signed the mandate, but handwritten notes were added within the ‘customer data’ region of the mandate: a process to change or add customer data must be started.
The challenge for KTM was to recognize whether there are handwritten notes in defined regions of the SEPA Mandate.
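The three processes can be summarised as a small decision function. This is a plain Python sketch, not KTM code; the function name and the process labels are hypothetical. It also assumes that a missing signature takes precedence over handwritten notes, which the text does not state explicitly.

```python
def choose_process(signed: bool, handwritten_notes: bool) -> str:
    """Map the two recognition results to one of the three business processes."""
    if not signed:
        # Assumption: an unsigned mandate always goes to administrative processing
        return "administrative_processing"
    if handwritten_notes:
        # Signed, but the customer data region contains handwriting
        return "change_customer_data"
    # Signed and untouched: set the flag 'SEPA Mandate was granted'
    return "mandate_granted"
```

For example, `choose_process(True, True)` selects the process for changing or adding customer data.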
This is an example of a filled mandate, which was signed by the customer:
And now an example with handwritten changes by the customer:
To identify the handwritten notes you can use the OCR engine ‘Mixed Print’. This engine is designed to read both typescript and handwriting on a document. However, we are not interested in the content of the handwritten notes – we just want to know if there are handwritten notes at all. The ‘Mixed Print’ engine won’t give good results for the content of the written notes, as in these cases typescript and handwriting will often overlap.
But the ‘Mixed Print’ engine does tell us whether there was any handwriting at all. Candidates for handwritten notes are marked with so-called ‘boxes’. You can view these ‘boxes’ by using the XDOC browser, which comes with the KTM installation. First, you run the ‘Mixed Print’ engine on the mandate document (you can do that in the KTM Project Builder). Then you start the XDOC browser to open the xdc file of the mandate document:
‘Representation 0’ (the ‘Mixed Print’ engine) has three ‘boxes’. Each box stands for a region with candidates for handwriting. These ‘boxes’ can be retrieved by KTM scripting. Once you restrict the search for ‘boxes’ to a region of the mandate, you have everything you need to judge whether somebody scribbled on your form.
To define the ‘search region’ you could use the words ‘one-off payment’ (upper right corner, defines the upper bound of the search region) and ‘By signing this mandate form’ (text underneath the customer data, defines the lower bound of the search region). To find these words you could use format locators or search directly within the OCR result of the document. The following scripting example looks directly into the OCR result. The function Is_handwritten returns True if at least one ‘box’ is found within the search region.
The example script needs a reference to ‘Kofax memphis Forms 4.0’. So please add this reference in your KTM script:
The underlying KTM project uses OCR recognition with RecoStar or FineReader by default. To check if somebody scribbled on the mandate you may use the following function:
Function Is_handwritten(pXDoc As CASCADELib.CscXDocument) As Boolean
   'Checks if something handwritten is in a region of the page

   Dim i As Integer
   Dim BoxAnzahl As Integer 'number of boxes found in the search region
   Dim StartTOP As Long     'upper bound of the search region
   Dim EndeTOP As Long      'lower bound of the search region

   BoxAnzahl=0
   StartTOP=0
   EndeTOP=0
   Is_handwritten=False

   'Search 'one-off payment' and add 80 to TOP. Only look south.
   For i=0 To pXDoc.TextLines.Count-1
      If InStr(LCase(pXDoc.TextLines(i).Text),"one-off payment")>0 Then
         StartTOP=pXDoc.TextLines(i).Top
         StartTOP=StartTOP+80 '~ line height
         Exit For
      End If
   Next

   'Search 'By signing this mandate form'. Only look north of this.
   'The search string must be lower case, because the line text is lower-cased before the comparison.
   For i=0 To pXDoc.TextLines.Count-1
      If InStr(LCase(pXDoc.TextLines(i).Text),"by signing this mandate form")>0 Then
         EndeTOP=pXDoc.TextLines(i).Top
         Exit For
      End If
   Next

   'Re-OCR with engine 'Mixed Print'
   FullPageRecognition_1(pXDoc, "", "Mixed Print")

   'only count boxes south of StartTOP
   'only count boxes north of EndeTOP
   'Box.width>200 to avoid 'dirt'
   'Box.left>275 to leave out the left border (holes, barcodes)

   For i=0 To pXDoc.Boxes.Count-1
      If pXDoc.Boxes.ItemByIndex(i).Top>StartTOP And pXDoc.Boxes.ItemByIndex(i).Width>200 And pXDoc.Boxes.ItemByIndex(i).Left>275 And pXDoc.Boxes.ItemByIndex(i).Top<EndeTOP Then
         BoxAnzahl=BoxAnzahl+1
      End If
   Next

   'OCR back to RecoStar or FineReader, for standard processing
   FullPageRecognition_1(pXDoc, "", "RecoStar")

   If BoxAnzahl>0 Then 'at least one box: there was some handwriting!
      Is_handwritten=True
   Else
      Is_handwritten=False
   End If
End Function
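The filter rule used in Is_handwritten can also be tried out in isolation. The following is a plain Python sketch of the same logic, not KTM code: `Box`, `find_line_top`, and `count_handwriting_boxes` are hypothetical names, text lines are modelled as simple (text, top) pairs, and the coordinates are made-up sample values in the same units as the script.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Box:
    """Hypothetical stand-in for one handwriting candidate box."""
    left: int
    top: int
    width: int


def find_line_top(lines: List[Tuple[str, int]], phrase: str) -> Optional[int]:
    """Return the top of the first line containing phrase, ignoring case."""
    needle = phrase.lower()
    for text, top in lines:
        if needle in text.lower():
            return top
    return None


def count_handwriting_boxes(boxes: Iterable[Box], start_top: int, end_top: int,
                            min_width: int = 200, min_left: int = 275) -> int:
    """Count boxes strictly between the bounds that are wide enough and
    not located in the left border area."""
    return sum(
        1 for b in boxes
        if start_top < b.top < end_top and b.width > min_width and b.left > min_left
    )


lines = [("Recurrent payment / One-off payment", 500),
         ("By signing this mandate form, you authorise ...", 1200)]
start_top = find_line_top(lines, "one-off payment") + 80  # skip the anchor line itself
end_top = find_line_top(lines, "By signing this mandate form")

boxes = [
    Box(left=300, top=700, width=400),   # inside the search region
    Box(left=100, top=700, width=400),   # left border (holes, barcodes)
    Box(left=300, top=700, width=50),    # too narrow, probably dirt
    Box(left=300, top=1300, width=400),  # below the search region
]
is_handwritten = count_handwriting_boxes(boxes, start_top, end_top) > 0
```

Both sides of the text comparison are lower-cased here, so the capitalisation of the search phrase does not matter.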
And finally the called procedure FullPageRecognition_1, which performs the re-OCR:
Public Sub FullPageRecognition_1(ByVal pXDoc As CscXDocument, ByVal ImageCleanProfile As String, ByVal OCRProfile As String)
   'remove existing OCR results and perform OCR on page one with profile OCRProfile
   Dim i As Integer
   Dim oPRP As IMpsPageRecogProfile
   Dim oPR As New MpsPageRecognizing

   'OCR only on page 1
   pXDoc.CDoc.Pages(0).SuppressOCR=False

   '# Remove any representations, before proceeding to perform full page recognition
   For i = pXDoc.Representations.Count -1 To 0 Step -1
      pXDoc.Representations.Remove (i)
   Next

   Set oPRP = Project.RecogProfiles.ItemByName(OCRProfile) '# Use the page recognition profile OCRProfile
   oPR.Recognize(pXDoc, oPRP, 0) '# Perform recognition on the first page

   '# At design time the text lines need to be analysed. At runtime this will be done automatically
   If Project.ScriptExecutionMode = CscScriptExecutionMode.CscScriptModeServerDesign Then pXDoc.Representations(0).AnalyzeLines
End Sub
Blog author
Jürgen Voss