Buerki, Andreas
Abstract
In this workshop, participants are taken through the process of extracting items of formulaic language from a set of corpus documents, assessing the quality of the extraction and using the resulting list to annotate text documents for formulaicity. This is done using the N-Gram Processor (Buerki 2013) and SubString (Buerki 2011) software packages and Wikipedia text files.
Item Type: | Conference or Workshop Item (Other) |
---|---|
Date Type: | Completion |
Status: | Unpublished |
Schools: | English, Communication and Philosophy |
Subjects: | P Language and Literature > P Philology. Linguistics |
Last Modified: | 21 Oct 2022 06:54 |
URI: | https://orca.cardiff.ac.uk/id/eprint/98729 |