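*Now that large open corpora such as COCA and Google are available, including corpora that can be searched by specific topic or genre, the era of competing over data, that is, over building corpora, appears to be over. We are now in the era of application: textbook development, teaching and learning, speech recognition, automated scoring, academic research, and so on. The key to future corpus research will be creative application rather than a purely technical approach.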
http://corpus.byu.edu/coca/ (genre-based corpus search)
COCA is composed of more than 450 million words in 189,431 texts, including 20 million words each year from 1990-2012. The most recent addition of texts (Apr 2011 - Jun 2012) was completed in June 2012.
http://www.wordandphrase.info/analyzeText.asp (frequency level, collocational combinations, POS search, etc.)
http://corpus2.byu.edu/glowbe/
See the new Corpus of Global Web-Based English (GloWbE; 20 countries, 1.9 billion words) and the Google Books corpus (50+ billion words from the 1980s-2000s).
http://googlebooks.byu.edu/x.asp (50+ billion words from 1980s-2000s)
http://www.academicvocabulary.info/ (COCA)
This site contains academic vocabulary lists of English that are based on 120 million words of academic texts in the Corpus of Contemporary American English (COCA). As our August 2013 article in Applied Linguistics points out, there are important differences between these lists and the Academic Word List created by Coxhead (2000).
There are three ways to access the lists (all of which are completely free):
1. Download the lists (as Excel files) for offline use (these three links are short one-page samples):
- The Academic Vocabulary List (AVL) itself (top 3,000 lemmas, which occur in all academic domains)
- The AVL words grouped into word families (similar to Coxhead's AWL, but with much more information)
- Top 20,000 words (lemmas) in COCA-Academic, including AVL words, domain-specific words, and "genre-neutral" words
2. Use the online interface to browse through the list and see detailed information about each word -- its definition, the frequency in each of the nine academic domains (e.g. Medicine, Science, or Business), the collocates (nearby words, which give great insight into word meaning and usage), re-sortable concordance lines (which show the patterns in which the word occurs), and synonyms -- all from just COCA Academic.
3. Use the online interface to input entire academic texts, and see frequency profiles of all words in the text. You can click on any word to see all of the information shown in #2 above. You can also click on any phrase to find related phrases in COCA-Academic.
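To make the ideas in #2 and #3 concrete, here is a minimal offline sketch of a frequency profile, concordance lines, and collocates for a pasted text. It is not the site's implementation. The word set `SAMPLE_ACADEMIC_WORDS` and the functions `tokenize`, `frequency_profile`, `concordance`, and `collocates` are hypothetical names made up for illustration, and the five-word set only stands in for a downloaded list such as the AVL. It also counts raw word forms, whereas the lists above are lemma-based.

```python
import re
from collections import Counter

# Hypothetical stand-in for a downloaded academic word list (assumption:
# the real AVL has about 3,000 lemmas and is not reproduced here).
SAMPLE_ACADEMIC_WORDS = {"analysis", "data", "method", "research", "significant"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_profile(text, word_list):
    """Return (word, count, in_word_list) tuples, most frequent first."""
    counts = Counter(tokenize(text))
    return [(word, n, word in word_list) for word, n in counts.most_common()]


def concordance(text, keyword, width=3):
    """Return keyword-in-context lines with up to `width` tokens per side."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


def collocates(text, keyword, window=2):
    """Count the tokens found within `window` positions of each keyword hit."""
    tokens = tokenize(text)
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            counts.update(tokens[max(0, i - window):i])
            counts.update(tokens[i + 1:i + 1 + window])
    return counts
```

Calling `frequency_profile(text, SAMPLE_ACADEMIC_WORDS)` on a short passage gives a rough picture of how much of it is covered by the list, while `concordance` and `collocates` show the patterns around a chosen word. A real profile would lemmatize the text first and compare it against per-domain frequencies, which this sketch leaves out.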