Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

N-Gram Processor

Buerki, Andreas. 2013. N-Gram Processor. GitHub.

Full text not available from this repository.


The N-Gram Processor (NGP) is a set of scripts and a Perl module for creating and processing n-gram lists from text files. Its feature set is simple:

- creation of word n-gram lists out of input text, with n-gram frequencies
- listing of document counts (in how many documents an n-gram occurs)
- combination of large numbers of lists (of one n) into a single list
- Unicode support
- support for processing of reasonably large corpora (depending on hardware)
- support for processing of annotated corpora

Please refer to the manual for a more detailed description. The NGP is a branch of the Ngram Statistics Package (NSP, v1.09) by Ted Pedersen and collaborators, including code of the v1.10 re-write by Bjoern Wilmsmann.
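The difference between n-gram frequency and document count, and the idea of combining several lists of the same n, can be illustrated with a minimal Python sketch. This is not the NGP's own code (the NGP is written in Perl) and does not reflect its command-line interface or file formats. The function names `word_ngrams`, `count_ngrams` and `merge_lists`, the whitespace tokenisation and the lowercasing are all assumptions made for illustration.

```python
from collections import Counter


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams of a token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(documents, n):
    """Count n-gram frequencies and document counts over a list of texts.

    Tokenisation here is a naive lowercase whitespace split (an assumption;
    a real tool would offer configurable tokenisation).
    """
    freq = Counter()       # total occurrences of each n-gram
    doc_count = Counter()  # number of documents containing each n-gram
    for text in documents:
        grams = word_ngrams(text.lower().split(), n)
        freq.update(grams)
        doc_count.update(set(grams))  # each document counts at most once
    return freq, doc_count


def merge_lists(lists):
    """Combine several (freq, doc_count) pairs of the same n into one.

    Assumes the lists come from disjoint sets of documents, so that
    document counts can simply be added.
    """
    total_freq, total_docs = Counter(), Counter()
    for freq, doc_count in lists:
        total_freq.update(freq)
        total_docs.update(doc_count)
    return total_freq, total_docs


# Hypothetical toy corpus, split into two batches.
batch_a = ["the cat sat and the cat slept", "the cat ate"]
batch_b = ["a dog saw the cat"]

list_a = count_ngrams(batch_a, 2)
list_b = count_ngrams(batch_b, 2)
merged_freq, merged_docs = merge_lists([list_a, list_b])

for gram, count in merged_freq.most_common(3):
    print(gram, count, merged_docs[gram])
```

In this toy corpus the bigram "the cat" occurs four times but in only three documents, which is the distinction the document-count feature captures.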

Item Type: Other
Date Type: Publication
Status: Published
Schools: English, Communication and Philosophy
Subjects: P Language and Literature > P Philology. Linguistics
Q Science > QA Mathematics > QA76 Computer software
Publisher: GitHub
Last Modified: 28 Oct 2022 10:23
