Dictionary data for language researchers

Cambridge University Press is happy to consider requests to use its dictionary data in natural language processing or other electronic applications. The two most common types of licence are:

  • A standard commercial licence allows a single company to use the data for any research and in any number of products at no extra cost, provided that none of the dictionary data can be reverse-engineered from those products, and that the products neither are nor include paper or electronic dictionaries.

  • A standard university department research licence allows research use by any number of researchers in a university department for a three-year period, but does not allow any product development.

There are a number of dictionary databases available, and we can consider customising data sets on demand. All data will be supplied as SGML-encoded files on PC-recordable CDs. Prices start from £500, depending on the data set, licence term, number of commercial products, and size of the institution or company. For more details email cald@cambridge.org. Our products include:

  • Cambridge Advanced Learner's Dictionary

    The text of the Cambridge Advanced Learner's Dictionary (2nd edition, published 2005), containing 72,500 sense entries with definitions, grammatical coding, register and variety coding, and example sentences with highlighted collocates, plus the CD-ROM SMART thesaurus encoding (each sense classified into 2,500 hierarchically structured categories).

  • English Pronouncing Dictionary

    The text of the definitive Cambridge English Pronouncing Dictionary (17th edition published 2006), containing 125,000 words (including inflections) and 220,000 phonetic transcriptions. This database now also includes fully expanded variant phonetic forms, and parts of speech for each headword.

  • Cambridge sound recordings

    140,000 sound recordings by native speakers, carefully checked by language experts, for 70,000 words in British English and American English (from the Cambridge English Pronouncing Dictionary).

  • Cambridge word lists

    Definitive word lists for use in spell checking and word games, generated to your specification from a range of Cambridge dictionaries.

  • English bilingual dictionaries

    The Cambridge Klett bilingual dictionaries (115,000 headwords and example sentences, and 170,000 translations) for French (including French phonetic transcriptions) and Spanish (including regional variants from Spain and Latin America), and the Word Routes/Selector series of parallel bilingual mini-thesauri in French, Spanish, Portuguese, Italian, Greek and Catalan (all with English).

We can offer co-branded versions of our online dictionaries (see, for example, http://dictionary.cambridge.org/learnenglish/results.asp?searchword=word) or provide dictionary data for hosting on your own website. Both of these services are subject to a significant annual licence fee. For more information, contact us by email at cdrom@cambridge.org.