Extracting Data From Line Charts in Scanned Medical Documents

Date
2019
Authors
Silva de Azevedo, Kathleen
Supervisor
Pears, Russel
Asif Naeem, Muhammad
Crofts, Catherine
Item type
Thesis
Degree name
Master of Computer and Information Sciences
Publisher
Auckland University of Technology
Abstract

Hand-drawn charts in printed forms summarize data in a format that humans can process and understand quickly. They differ from computer-generated charts in two main ways. First, hand-drawn charts are less predictable, owing to the inherent unpredictability of human behaviour. Second, they carry higher levels of noise, because they must be scanned before processing and the scanning interferes with the signal. Much past research has explored the recognition of machine-generated charts, but without a focus on hand-drawn charts in a noisy medium. This research therefore develops methods for recognizing line charts in scanned medical documents. The approach uses geometrical and positional relationships between the elements of the chart to determine the values of its markers, with no human intervention. The experiments used two distinct datasets: one with 200 machine-generated charts and another with 478 scanned medical form sheets. Experimental evaluation showed a high level of accuracy for the method devised to process the machine-generated dataset, and the method applied to the medical form sheets extracted the markers with a low level of error. As future work, the extraction rate may be improved by making the procedure that detects the region containing the data lines more precise.
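
The idea of reading marker values from positional relationships can be illustrated with a minimal sketch. It is not the thesis's method: it assumes the marker centres and two reference ticks per axis have already been located in pixel coordinates, and that both axes are linear. The function names and the calibration numbers are hypothetical.

```python
def pixel_to_value(pixel, ref_a, ref_b):
    """Map a pixel coordinate to a data value by linear interpolation.

    ref_a and ref_b are (pixel, value) pairs for two known axis ticks.
    Assumes a linear axis (an assumption, not taken from the thesis).
    """
    (p0, v0), (p1, v1) = ref_a, ref_b
    if p0 == p1:
        raise ValueError("reference ticks must sit at different pixel positions")
    return v0 + (pixel - p0) * (v1 - v0) / (p1 - p0)


def marker_values(markers, x_ticks, y_ticks):
    """Convert marker centres from (x_pixel, y_pixel) to (x_value, y_value)."""
    return [
        (pixel_to_value(x, *x_ticks), pixel_to_value(y, *y_ticks))
        for x, y in markers
    ]


# Hypothetical calibration: x value 0 at pixel column 100, 120 at column 700;
# y value 0 at pixel row 500, 10 at row 100 (image rows grow downward).
x_ticks = ((100, 0.0), (700, 120.0))
y_ticks = ((500, 0.0), (100, 10.0))

# Hypothetical marker centres in pixel coordinates.
markers = [(100, 500), (400, 300), (700, 100)]
values = marker_values(markers, x_ticks, y_ticks)
print(values)
```

Because image rows increase downward, the y calibration pairs a larger pixel row with a smaller value; the interpolation handles the inversion without special casing.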

Keywords
Chart recognition, Image processing, OpenCV, Data extraction