Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 


Buerki, Andreas 2011. SubString. GitHub.

Full text not available from this repository.


The SubString package is an open-source set of Unix shell scripts for substring reduction and frequency consolidation of word n-grams of different lengths. In the process, the frequency of each substring is reduced by the frequencies of its superstrings, producing a consolidated list of n-grams of different lengths without inflating the overall word count. The functions performed by SubString will primarily be of interest to linguists working on formulaic language, multi-word sequences and similar phraseological phenomena.
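The basic idea of reducing a substring's frequency by that of its superstrings can be illustrated with a small Python sketch. This is not the package's code (SubString itself is written as Unix shell scripts), and it is a deliberate simplification: the function name `consolidate`, the dictionary input format and the example counts are all hypothetical. It subtracts only the frequencies of immediate superstrings (one word longer), clamps results at zero, and makes no attempt to correct for a substring shared by several overlapping superstrings, a case where this naive subtraction would over-reduce.

```python
def consolidate(freqs):
    """Reduce each n-gram's frequency by that of its immediate superstrings.

    freqs maps n-grams (tuples of words) to raw corpus frequencies.
    Simplified illustration only; not the SubString package's algorithm.
    """
    result = dict(freqs)
    for ngram, count in freqs.items():
        if len(ngram) < 2:
            continue  # a single word has no shorter n-gram inside it
        # The two contiguous substrings that are one word shorter.
        for sub in (ngram[:-1], ngram[1:]):
            if sub in result:
                # Subtract the superstring's raw count, never going below 0.
                result[sub] = max(0, result[sub] - count)
    return result


# Hypothetical raw counts: every "by the way" also counts as one
# "by the" and one "the way" in an unconsolidated list.
example = {
    ("by", "the", "way"): 4,
    ("by", "the"): 10,
    ("the", "way"): 6,
}

consolidated = consolidate(example)
for ngram, count in consolidated.items():
    print(" ".join(ngram), count)
# by the way 4
# by the 6
# the way 2
```

In this example the four occurrences of "by the way" are no longer counted again under "by the" and "the way", so the consolidated list can combine n-grams of different lengths without counting the same words several times.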

Item Type: Other
Date Type: Publication
Status: Published
Schools: English, Communication and Philosophy
Subjects: P Language and Literature > P Philology. Linguistics
Publisher: GitHub
Last Modified: 28 Oct 2022 10:23
