Unsupervised acquisition of a Markov model for word correction using Wikipedia

Dorado, Rubén

doi:10.21158/23823399.v2.n2.2014.1246

ONTARE. REVISTA DE INVESTIGACIÓN DE LA FACULTAD DE INGENIERÍA

This paper presents work in progress on the automatic acquisition of corpora for spelling correction. Wikipedia contains a large amount of information, including relationships between concepts and named annotations. It also contains linguistic information, such as the misspellings introduced by its many contributors. We propose an efficient method that analyzes the link structure of Web-based dictionaries to construct a list of misspelled words and their corrections. The method is still under development and is currently being applied to Wikipedia as a corpus.
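The abstract does not spell out how link structure yields misspelling and correction pairs, so the following is only a minimal sketch under stated assumptions, not the author's method. It assumes that links are available as (source title, target title) pairs and that a pair whose two titles lie within a small edit distance is a plausible misspelling and its correction. The function names, the threshold `max_distance`, and the sample data are all hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def misspelling_pairs(links, max_distance=2):
    """Keep (source, target) pairs that look like a misspelling and its fix.

    Assumption: a link whose two titles differ by only a few character edits
    is a misspelling candidate; links between unrelated titles are dropped.
    """
    pairs = []
    for source, target in links:
        s, t = source.lower(), target.lower()
        if s != t and edit_distance(s, t) <= max_distance:
            pairs.append((source, target))
    return pairs


# Hypothetical link pairs, invented for illustration only.
links = [
    ("recieve", "receive"),
    ("goverment", "government"),
    ("Big Apple", "New York City"),
    ("receive", "receive"),
]
candidates = misspelling_pairs(links)
print(candidates)
```

A list of pairs like this could then serve as training data for a correction model. The edit-distance threshold is a design choice: a low value favors precision, at the cost of missing misspellings that differ from the correct form by several characters.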

Journal

Revista Ontare

ISSN

2382-3399

EISSN

2745-2220

Authors

Dorado, Rubén

Publisher

UNIVERSIDAD EAN

DOI