SciDok

Report accessible at
URN: urn:nbn:de:bsz:291-scidok-38979
URL: http://scidok.sulb.uni-saarland.de/volltexte/2011/3897/


Text skimming as a part in paper document understanding

Bleisinger, Rainer ; Gores, Klaus-Peter

Source: (1994) Kaiserslautern ; Saarbrücken : DFKI, 1994
PDF format:
Dokument 1.pdf (227 KB)

SWD subject headings: Artificial intelligence
Institute: DFKI Deutsches Forschungszentrum für Künstliche Intelligenz
DDC subject group: Computer science
Document type: Report
Series: Technical memo / Deutsches Forschungszentrum für Künstliche Intelligenz [ISSN 0946-0071]
Volume number: 94-01
Language: English
Year of creation: 1994
Publication date: 08.07.2011
Abstract in English: In our document understanding project ALV we analyse incoming paper mail in the domain of single-sided German business letters. These letters are scanned and, after several analysis steps, the text is recognized. The result may contain gaps, word alternatives, and even illegal words. The subject of this paper is the subsequent phase, which concerns the extraction of important information predefined in our "message type model". An expectation-driven partial text skimming analysis is proposed, focussing on the kernel module, the so-called "predictor". In contrast to traditional text skimming, the following aspects are important in our approach. First, the input data are fragmentary texts. Second, rather than having only one text analysis module ("substantiator"), our predictor controls a set of different and partially alternative substantiators. With respect to the three working phases usually proposed for a predictor (start, discrimination, and instantiation), the following differences are notable. The starting problem of text skimming is solved by applying specialized substantiators that classify a business letter into message types. In order to select appropriate expectations within the message type hypotheses, a twofold discrimination is performed. A coarse discrimination reduces the number of message type alternatives, and a fine discrimination chooses one expectation within one or a few previously selected message types. Substantiators are then activated according to the selected expectation. Several rules are applied both for the verification of the substantiator results and for error recovery if the results are insufficient.
License: Standard publication agreement
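
The control flow described in the abstract (coarse discrimination, fine discrimination, then activation of alternative substantiators with fallback) can be illustrated with a minimal Python sketch. It is not the ALV implementation: every class, function, keyword set, and scoring rule below is a hypothetical stand-in, gaps in the recognized text are assumed to be represented as None, and "verification" is reduced to checking for a non-empty result.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional

# All names here are hypothetical illustrations; the abstract defines none of them.
Tokens = list[Optional[str]]            # None marks a gap left by text recognition
Substantiator = Callable[[Tokens], Optional[str]]


@dataclass
class Expectation:
    slot: str                           # piece of information to extract
    cue_words: set[str]                 # words that suggest this expectation applies
    substantiators: list[Substantiator] # alternative analysis modules, tried in order


@dataclass
class MessageType:
    name: str
    keywords: set[str]
    expectations: list[Expectation] = field(default_factory=list)


def words_present(tokens: Tokens) -> set[str]:
    return {t.lower() for t in tokens if t}


def coarse_discrimination(tokens, message_types, keep=2):
    """Reduce the message type alternatives by keyword overlap (assumed scoring)."""
    present = words_present(tokens)
    scored = [(len(mt.keywords & present), mt) for mt in message_types]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [mt for _, mt in scored[:keep]]


def fine_discrimination(tokens, candidates, filled):
    """Choose one unfilled expectation within the remaining message types."""
    present = words_present(tokens)
    best, best_score = None, 0
    for mt in candidates:
        for exp in mt.expectations:
            score = len(exp.cue_words & present)
            if exp.slot not in filled and score > best_score:
                best, best_score = exp, score
    return best


def instantiate(tokens, expectation):
    """Activate substantiators; recover by trying the next one if a result is empty."""
    for substantiator in expectation.substantiators:
        result = substantiator(tokens)
        if result:                      # stand-in for the verification rules
            return result
    return None


def skim(tokens, message_types):
    candidates = coarse_discrimination(tokens, message_types)
    filled = {}
    while (exp := fine_discrimination(tokens, candidates, filled)) is not None:
        filled[exp.slot] = instantiate(tokens, exp)
    return filled


# Two toy substantiators for a number slot.
def number_after_cue(tokens):
    for cue, nxt in zip(tokens, tokens[1:]):
        if cue and cue.lower() == "nr." and nxt and re.fullmatch(r"\d+", nxt):
            return nxt
    return None


def any_long_number(tokens):
    return next((t for t in tokens if t and re.fullmatch(r"\d{4,}", t)), None)


ORDER = MessageType("order", {"bestellung"}, [
    Expectation("order_number", {"bestellung", "nr."},
                [number_after_cue, any_long_number]),
])
INVOICE = MessageType("invoice", {"rechnung"}, [
    Expectation("invoice_number", {"rechnung", "nr."}, [number_after_cue]),
])
```

Calling skim(["Bestellung", "Nr.", None, "48213"], [ORDER, INVOICE]) shows the fallback: the gap after "Nr." defeats the first substantiator, so the second, less specific one supplies the value for the order_number slot.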
