Heuristic and rule-based knowledge acquisition: classification of numeral strings in text

aut.researcher: MacDonell, Stephen Gerard
dc.contributor.author: Min, KH
dc.contributor.author: MacDonell, S
dc.contributor.author: Moon, Y
dc.contributor.editor: Hoffmann, A
dc.contributor.editor: Kang, BH
dc.contributor.editor: Richards, D
dc.contributor.editor: Tsumoto, S
dc.date.accessioned: 2011-08-06T03:55:00Z
dc.date.available: 2011-08-06T03:55:00Z
dc.date.copyright: 2006
dc.date.issued: 2006
dc.description.abstract: This paper describes the rule-based classification of numerals and strings that include numerals, composed of a number and semantic unit(s) that indicate a SPEED, NUMBER, or other measure, at three levels: morphological, syntactic, and semantic. The approach employs three interpretation processes: word trigram construction with tokeniser, rule-based processing of number strings, and n-gram based classification. We extracted numeral strings from 378 online newspaper articles, finding that, on average, they comprised about 2.2% of the words in the articles. To manually extract n-gram rules to disambiguate the number strings’ meanings, our approach was trained on 886 numeral strings and tested on the remaining 3251 strings. We implemented two heuristic disambiguation methods based on each category’s frequency statistics collected from the sample data, and precision ratios of both methods were 86.8% and 86.3% respectively. This paper focuses on the acquisition and performance of different types of rules applied to numeral strings classification.
dc.identifier.citation: Proceedings of the 2006 Pacific Rim Knowledge Acquisition Workshop (PKAW), Guilin, China, Lecture Notes in Computer Science, 2006, Volume 4303/2006, pages 40 - 50
dc.identifier.doi: 10.1007/11961239_4
dc.identifier.issn: 0302-9743
dc.identifier.uri: https://hdl.handle.net/10292/1593
dc.publisher: Springer-Verlag Berlin Heidelberg
dc.rights: © Springer-Verlag Berlin Heidelberg 2006. The author may post his/her version provided acknowledgement is given to the original source of publication and a link is inserted to the published article on Springer’s website. The final publication is available at www.springerlink.com
dc.rights.accessrights: OpenAccess
dc.title: Heuristic and rule-based knowledge acquisition: classification of numeral strings in text
dc.type: Conference Contribution
pubs.organisational-data: /AUT
pubs.organisational-data: /AUT/Design & Creative Technologies
pubs.organisational-data: /AUT/PBRF Researchers
pubs.organisational-data: /AUT/PBRF Researchers/Design & Creative Technologies PBRF Researchers
pubs.organisational-data: /AUT/PBRF Researchers/Design & Creative Technologies PBRF Researchers/DCT C & M Computing
Files
Original bundle
Name: Min, MacDonell and Moon (2006) PKAW SERL.pdf
Size: 117.29 KB
Format: Adobe Portable Document Format