Õppijasõbralik korpuslause: automaatse valiku võimalusi

Kristina Koppel; Jelena Kallas

doi:10.5128/LV26.07

Abstract

User-friendly corpus sentence: Parameters for automatic selection

The paper presents how corpus sentences can be used in learners’ lexicography and in data-driven language learning.

There are two approaches to the automatic selection of corpus sentences suitable for language learners: machine learning methods and rule-based methods. The paper focuses on the rule-based methods and describes them through the example of a tool called GDEX (Good Dictionary Example) (Kilgarriff et al. 2008). GDEX helps to automatically select sentences suitable for language learners. It takes into account a set of parameters, including sentence and word length, a threshold for low-frequency words, keyword position, and the presence or absence of certain words. The paper introduces the parameters of the Estonian GDEX configuration and discusses which parameters need to be studied further.
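As a rough illustration of how a rule-based selector of this kind might combine such parameters, the following minimal Python sketch scores a tokenized sentence. It is not the GDEX implementation and does not reproduce the Estonian GDEX configuration: the names `GdexConfig` and `score_sentence`, all threshold values, the flat penalty scheme, and the rule about keyword position are assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass
class GdexConfig:
    # All values below are illustrative assumptions,
    # not the settings of the Estonian GDEX configuration.
    min_tokens: int = 4
    max_tokens: int = 20
    max_word_length: int = 20
    min_word_frequency: int = 5
    blacklist: frozenset = frozenset()
    penalty: float = 0.25


def score_sentence(tokens, keyword, frequencies, config):
    """Return a score in [0, 1]; higher means more suitable for learners."""
    lowered = [t.lower() for t in tokens]

    # Hard rules: the keyword must occur and no blacklisted word may occur.
    if keyword not in lowered:
        return 0.0
    if any(t in config.blacklist for t in lowered):
        return 0.0

    score = 1.0
    # Sentence length outside the preferred range.
    if not (config.min_tokens <= len(tokens) <= config.max_tokens):
        score -= config.penalty
    # Overly long words.
    if any(len(t) > config.max_word_length for t in tokens):
        score -= config.penalty
    # Words below the frequency threshold (unknown words count as frequency 0).
    if any(frequencies.get(t, 0) < config.min_word_frequency for t in lowered):
        score -= config.penalty
    # Keyword position: penalizing a sentence-initial keyword is an
    # arbitrary choice made for this sketch.
    if lowered.index(keyword) == 0:
        score -= config.penalty

    return max(score, 0.0)
```

Candidate sentences for a headword could then be ranked by this score, with the highest-scoring ones offered to the learner or lexicographer. A real configuration would weight the individual parameters differently rather than applying one flat penalty.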

The paper also introduces the new corpus EstonianNC GDEX, aimed at language learners. The corpus contains only sentences that meet the requirements of the Estonian GDEX configuration. The sentences contain no low-frequency words, the vocabulary is controlled (no slang, vulgarisms or profanities occur), and all sentences are full sentences that contain a verb. At the moment, the new corpus is accessible only in the corpus query system Sketch Engine (Kilgarriff et al. 2004). In the future, it will be possible to integrate it into dictionary portals aimed at language learners.
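The requirements listed above can be read as a pass/fail filter over a source corpus. The sketch below shows one hypothetical way to express such a filter in Python; it is not the procedure used to build EstonianNC GDEX. The part-of-speech tag "V" for verbs, the test for a full sentence (initial capital letter plus final punctuation mark), the frequency threshold, and the function names are all assumptions made for illustration.

```python
TERMINAL_PUNCTUATION = {".", "!", "?"}


def meets_requirements(tagged, frequencies, blacklist, min_frequency=5):
    """Check one sentence given as a list of (token, pos_tag) pairs."""
    if len(tagged) < 2:
        return False

    first_token, last_token = tagged[0][0], tagged[-1][0]
    # Assumed test for a "full sentence": capitalized start, terminal punctuation.
    is_full_sentence = first_token[:1].isupper() and last_token in TERMINAL_PUNCTUATION
    # "V" is a hypothetical verb tag; a real tagset may differ.
    has_verb = any(pos == "V" for _, pos in tagged)

    words = [token.lower() for token, _ in tagged if token.isalpha()]
    # Unknown words count as frequency 0 and therefore fail the threshold.
    only_frequent = all(frequencies.get(w, 0) >= min_frequency for w in words)
    controlled = not any(w in blacklist for w in words)

    return is_full_sentence and has_verb and only_frequent and controlled


def build_filtered_corpus(sentences, frequencies, blacklist, min_frequency=5):
    """Keep only the sentences that satisfy every requirement."""
    return [
        s for s in sentences
        if meets_requirements(s, frequencies, blacklist, min_frequency)
    ]
```

Unlike a ranking score, a filter of this kind discards every sentence that violates any single requirement, which matches the description of a corpus that contains only conforming sentences.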

Keywords

corpus linguistics; corpus lexicography; learners’ lexicography; language learning; Estonian

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

ISSN 1736-9290 (print)
ISSN 2228-3854 (online)
DOI https://doi.org/10.5128/LV.1736-9290

Lähivõrdlusi. Lähivertailuja