Seminar on language documentation - Prof. Bernard Caron
Bernard Caron, Emeritus Research Director at LLACAN (UMR 8135, CNRS-INALCO-EPHE), posted at IFRA from 2017 to 2019, presented a seminar on language documentation on Thursday, November 18. It was a preview of the final website of the ANR project NaijaSynCor (A syntactic treebank, a parser and a wiktionary for Naija), which he directed from February 2017 to September 2021.
The purpose of this project was to describe Naija (also known as Common Nigerian Pidgin) from a lexical, grammatical, prosodic, and sociolinguistic perspective. Over the nearly five years of the project, a team of about 30 Nigerian and French researchers developed the natural language processing tools needed to produce a treebank of nearly 500,000 words, a dictionary, an audio player and text viewer, a grammatical tool (GREW), and a tool for lexical statistics (i-Trameur).
In the first part of the seminar, Bernard Caron presented an overview of the project's methodology, its team of researchers, and its results. In the second part, he showed how the GREW grammatical tool can be used to extract a quantitative grammar from the Naija treebank. He then argued, on the basis of quantitative evidence about frequency and distribution, that the negative particle NO in Naija, unlike its English counterpart, should be annotated as an auxiliary rather than an adverb.
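To give a rough idea of what a frequency-and-distribution argument of this kind looks like in practice, here is a minimal Python sketch. It does not use GREW or its query language, and it does not reproduce the analysis presented in the seminar. The three sentences, their part-of-speech tags, and the function name `following_pos_profile` are all invented for illustration and are not taken from the NaijaSynCor treebank. The sketch simply counts which categories immediately follow the form "no".

```python
from collections import Counter

# Toy data, invented for illustration only: each sentence is a list of
# (form, part-of-speech) pairs. These are NOT NaijaSynCor annotations;
# "no" is tagged "X" so the sketch stays neutral about its category.
TOY_SENTENCES = [
    [("I", "PRON"), ("no", "X"), ("go", "AUX"), ("come", "VERB")],
    [("e", "PRON"), ("no", "X"), ("dey", "VERB"), ("house", "NOUN")],
    [("dem", "PRON"), ("no", "X"), ("sabi", "VERB")],
]


def following_pos_profile(sentences, form="no"):
    """Count the part of speech of the token right after each `form`."""
    profile = Counter()
    for sentence in sentences:
        for index, (token, _pos) in enumerate(sentence):
            if token.lower() != form:
                continue
            if index + 1 < len(sentence):
                profile[sentence[index + 1][1]] += 1
            else:
                # The particle closes the sentence: record that explicitly.
                profile["<END>"] += 1
    return profile


profile = following_pos_profile(TOY_SENTENCES)
total = sum(profile.values())
for pos, count in profile.most_common():
    print(f"{pos}: {count} ({count / total:.0%})")
```

On a real treebank, a profile like this would be compared with the profiles of words already annotated as auxiliaries and as adverbs. The toy counts above carry no evidential weight of their own.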