Asia Pacific Journal of Corpus Research Vol. 4, No. 2, pp. 75-87
Abbreviation: APJCR
e-ISSN: 2733-8096
Publication date: 31 December 2023
Received: 1 October 2023 / Received in Revised Form: 17 November 2023 / Accepted: 10 December 2023
DOI: https://doi.org/10.22925/apjcr.2023.4.2.75

Vocabulary analyzer based on CEFR-J wordlist for self-reflection (VACSR) version 2

Yukiko Ohashi (Yamazaki University of Animal Health Technology), JAPAN; Noriaki Katagiri (Hokkaido University of Education), JAPAN; Takao Oshikiri (Bunkyo Gakuin University), JAPAN
Copyright 2023 APJCR

This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

This paper presents a revised version of the vocabulary analyzer for self-reflection (VACSR), called VACSR v.2.0. The initial version of the VACSR automatically analyzes the occurrences and the level of vocabulary items in the transcribed texts, indicating the frequency, the unused vocabulary items, and those not belonging to either scale. However, it overlooked words with multiple parts of speech due to their identical headword representations. It also needed to provide more explanatory result tables from different corpora. VACSR v.2.0 overcomes the limitations of its predecessor. First, unlike VACSR v.1, VACSR v.2.0 distinguishes words that are different parts of speech by syntactic parsing using Stanza, an open-source Python library. It enables the categorization of the same lexical items with multiple parts of speech. Second, VACSR v.2.0 overcomes the limited clarity of VACSR v.1 by providing precise result output tables. The updated software compares the occurrence of vocabulary items included in classroom corpora for each level of the Common European Framework of Reference–Japan (CEFR-J) wordlist. A pilot study utilizing VACSR v.2.0 showed that, after converting two English classes taught by a preservice English teacher into corpora, the headwords used mostly corresponded to CEFR-J level A1. In practice, VACSR v.2.0 will promote users’ reflection on their vocabulary usage and can be applied to teacher training.
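The part-of-speech-aware lookup described in the abstract can be illustrated with a minimal sketch. It is not the VACSR v.2.0 implementation: the miniature wordlist, its level assignments, and the function names `tag_with_stanza` and `profile` are assumptions made here for illustration. The sketch keys each wordlist entry by headword and part of speech, so that the same lexical item used as different parts of speech is counted separately, and it reports occurrences per level, listed items that were never used, and items not found in the list.

```python
from collections import Counter

# Hypothetical miniature wordlist keyed by (headword, part of speech).
# The entries and levels are illustrative assumptions, not values
# taken from the actual CEFR-J wordlist.
TOY_WORDLIST = {
    ("book", "NOUN"): "A1",
    ("book", "VERB"): "B1",
    ("open", "VERB"): "A1",
    ("open", "ADJ"): "A2",
    ("please", "INTJ"): "A1",
}


def tag_with_stanza(text):
    """Return (lemma, UPOS) pairs from Stanza; needs the English models installed."""
    import stanza  # imported lazily so the rest of the sketch runs without it

    nlp = stanza.Pipeline(lang="en", processors="tokenize,pos,lemma")
    doc = nlp(text)
    return [
        ((word.lemma or word.text).lower(), word.upos)
        for sentence in doc.sentences
        for word in sentence.words
    ]


def profile(tagged, wordlist):
    """Count occurrences per level and collect unused and unlisted items."""
    level_counts = Counter()
    used = set()
    unlisted = Counter()
    for lemma, pos in tagged:
        key = (lemma, pos)
        if key in wordlist:
            level_counts[wordlist[key]] += 1
            used.add(key)
        else:
            unlisted[key] += 1
    unused = sorted(set(wordlist) - used)
    return level_counts, unused, unlisted


# Hand-tagged stand-in for tagger output on
# "Please open the book. I will book a room."
sample = [
    ("please", "INTJ"), ("open", "VERB"), ("the", "DET"), ("book", "NOUN"),
    (".", "PUNCT"), ("I", "PRON"), ("will", "AUX"), ("book", "VERB"),
    ("a", "DET"), ("room", "NOUN"), (".", "PUNCT"),
]
levels, unused, unlisted = profile(sample, TOY_WORDLIST)
print(dict(levels), unused, dict(unlisted))
```

In this toy input, "book" is counted once as a noun and once as a verb under different levels, which is the distinction a headword-only lookup would miss. Replacing `sample` with `tag_with_stanza(transcript)` would apply the same profile to a transcribed text, assuming Stanza and its English models are available.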

Keywords

Vocabulary, Classroom Corpus, CEFR-J, Teacher Training

References

Anthony, L. (2022). AntConc (Version 4.2.0) [Computer Software]. Tokyo, Japan: Waseda University. Available from https://www.laurenceanthony.net/software

Coxhead, A. (1998). The development and evaluation of an academic word list (Master's thesis, Victoria University of Wellington, New Zealand).

Coxhead, A. (2000). A new academic word list. TESOL Quarterly, 34(2), 213-238.

Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., & Suchomel, V. (2014). The sketch engine: Ten years on. Lexicography, 1, 7-36.

Negishi, M., Takada, T., & Tono, Y. (2013). A progress report on the development of the CEFR-J. In E. D. Galaczi & C. J. Weir (Eds.), Exploring language frameworks. Proceedings of the ALTE Krakow Conference, July 2001, 135-163.

Ohashi, Y., & Katagiri, N. (2020a). The Ratios of CEFR-J vocabulary usage compared with GSL and AWL in elementary EFL classrooms and suggestions of vocabulary items to be taught. Asia Pacific Journal of Corpus Research, 1(1), 35-65.

Ohashi, Y., Katagiri, N., & Oshikiri, T. (2022b). Developing classroom corpus tagger for teachers’ reflective practice: A spoken language tagger to compile classroom corpora. English Corpus Studies, 29, 41-62.

Ohashi, Y., Katagiri, N., & Oshikiri, T. (2022). Vocabulary analyzer based on CEFR-J wordlist for self-reflection (VACSR): From classroom corpus compilation to self-reflection. International Journal of Language Learning and Applied Linguistics World, 31(1), 1-15.

Qi, P., Zhang, Y., Zhang, Y., Bolton, J., & Manning, C. (2020). Stanza: A Python Natural Language Processing Toolkit for Many Human Languages. https://nlp.stanford.edu/pubs/qi2020stanza.pdf

Schmitt, N. (2010). Researching Vocabulary: A Research Manual. Basingstoke: Palgrave Macmillan.

West, M. (1953). A General Service List of English Words. Longman, London.

Penn Treebank P.O.S. Tags. (n.d.). www.ling.upenn.edu. https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

The Authors

Yukiko Ohashi is an Associate Professor at Yamazaki University of Animal Health Technology. Her principal research lies in the field of corpus linguistics.

Noriaki Katagiri is a Professor at Hokkaido University of Education. His research interests include spoken corpora, classroom discourse analyses, and English language acquisition.

Takao Oshikiri is an Associate Professor at Bunkyo Gakuin University. In 2003, he earned his MSc in International Business from London South Bank University. He has authored books in the field of digital marketing.

The Authors’ Addresses

First and Corresponding Author
Yukiko Ohashi
Associate Professor
Yamazaki University of Animal Health Technology
4-7-2 Minami-Osawa, Hachioji, Tokyo 192-0364, JAPAN
E-mail: y_watanabe@yamazaki.ac.jp

Co-author
Noriaki Katagiri
Professor
Hokkaido University of Education, Asahikawa
9 Chome, Hokumoncho, Asahikawa, Hokkaido 070-8621, JAPAN
E-mail: katagiri.noriaki@a.hokkyodai.ac.jp

Co-author
Takao Oshikiri
Associate Professor
Bunkyo Gakuin University
1-19-1 Mukogaoka, Bunkyo, Tokyo 113-8668, JAPAN
E-mail: toshikiri@bgu.ac.jp
