English Profile - Compiling the EGP

The research was carried out primarily by Anne O'Keeffe and Geraldine Mark, both authors of English Grammar Today. Their work was reviewed and moderated by an expert panel that included Michael McCarthy and Ron Carter, world-leading experts in using corpus linguistics for language analysis. You can read their work on the English Grammar Profile in the International Journal of Corpus Linguistics.

The primary source of data was the Cambridge Learner Corpus, a 55-million-word corpus comprising over 250,000 scripts from Cambridge English exams at all levels, drawn from over 130 countries around the world. This makes it a valuable resource for seeing what sort of language learners from around the world actually use.

The grammar of English is complex but finite. Course books and reference books such as Cambridge University Press’s Cambridge Grammar of English and English Grammar Today show that certain core features are considered essential for learners, including tenses, articles, the major word classes, modal and auxiliary verbs, word order, reported speech and so on. Based on this canon of grammar and on evidence from the larger, multi-billion-word Cambridge English Corpus, the researchers were able to draw up a long list of grammar features to search for in the learner data.