Utilizing Clues in Syntactic Relationship for Automatic Target Word Sense Disambiguation
Ebony Domingo | Rachel Edita Roxas
A source word can have multiple translations in the target language, both because the source word has several meanings and because each meaning may have several target-word equivalents, depending on the context of the source word. This paper presents an automated approach to target-word selection based on the "word-to-sense" and "sense-to-word" relationships between source words and their translations, using syntactic relationships (subject-verb, verb-object, adjective-noun). Translation selection proceeds in two steps: sense disambiguation of the source word, based on knowledge from a bilingual dictionary and word similarity measures from WordNet, followed by selection of a target word using statistics from a target-language corpus. In tests on English-to-Tagalog translation, the system achieves an overall accuracy of 64% in selecting word translations, with a standardized precision of at least 80% in generating expected translations, on 200 sentences containing ambiguous words (an average of 4 senses each) in three categories: nouns, verbs, and adjectives. The system is tested using 145,746 word pairs in syntactic relationships, extracted from target-language corpora of 317,113 words. The sense profile, with 2,681 entries for source words, is built from an existing bilingual dictionary that includes clues for disambiguation and target translations. The results show that using syntactic information improves the performance of the method in resolving target-word ambiguity. Possible further improvements include the integration of other content words and syntactic categories, the addition of reliable clues for sense disambiguation, and the integration of smoothing techniques.
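The two-step selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sense profile entries, the similarity table (a stand-in for a WordNet-based measure), the co-occurrence counts, and all function names are hypothetical toy values chosen only to show the flow from a source word and its syntactic partner to a target word.

```python
from collections import Counter

# Hypothetical sense profile: each source word maps to senses, and each sense
# carries disambiguation clues and candidate target translations (toy entries).
SENSE_PROFILE = {
    "bank": [
        {"clues": {"money", "deposit"}, "translations": ["bangko"]},
        {"clues": {"river", "shore"}, "translations": ["pampang", "baybay"]},
    ],
}

# Toy similarity scores standing in for a WordNet-based similarity measure.
SIMILARITY = {
    frozenset(("muddy", "river")): 0.6,
    frozenset(("muddy", "shore")): 0.5,
    frozenset(("muddy", "money")): 0.1,
    frozenset(("muddy", "deposit")): 0.2,
}

# Made-up counts of (relation, target word, target context word) triples,
# as they might be extracted from a target-language corpus.
PAIR_COUNTS = Counter({
    ("adjective-noun", "pampang", "maputik"): 5,
    ("adjective-noun", "baybay", "maputik"): 1,
})


def similarity(a, b):
    """Return the toy similarity of two words (1.0 if identical, 0.0 if unknown)."""
    if a == b:
        return 1.0
    return SIMILARITY.get(frozenset((a, b)), 0.0)


def disambiguate(source_word, context_word):
    """Step 1: pick the sense whose best clue is most similar to the context word."""
    def score(sense):
        return max(similarity(context_word, clue) for clue in sense["clues"])
    return max(SENSE_PROFILE[source_word], key=score)


def select_target(sense, relation, target_context):
    """Step 2: pick the translation seen most often with the target context word."""
    return max(
        sense["translations"],
        key=lambda t: PAIR_COUNTS[(relation, t, target_context)],
    )


def translate(source_word, context_word, relation, target_context):
    """Run sense disambiguation, then target-word selection."""
    sense = disambiguate(source_word, context_word)
    return select_target(sense, relation, target_context)


if __name__ == "__main__":
    # "muddy bank": the adjective "muddy" is the syntactic clue for "bank".
    print(translate("bank", "muddy", "adjective-noun", "maputik"))
```

In this sketch, unseen word pairs simply count as zero, which is the kind of gap that the smoothing techniques mentioned as future work would address.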