English is the primary language for papers in all subfields of computer science (CS). Computer science students who use English as a second language (ESL) must usually learn CS and English simultaneously. Are there subfields of CS whose literature is particularly difficult for learners to read as English text? This research question has been underexplored. In the present study, we built a highly accurate automatic readability assessor and used it to evaluate 38 CS subcategories based on approximately 460,000 CS abstracts submitted to arXiv. We found that about 75% of these abstracts are readable for intermediate English learners. In addition, discrete mathematics appears to be the easiest subfield for ESL readers, while digital libraries is the most difficult. Our results imply that the degree of language support that ESL learners need to read abstracts varies widely by CS subfield.
Ida Ngambeki (Purdue University); Matt Bishop (The University of California, Davis); Jun Dai (California State University, Sacramento); Phil Nico (California Polytechnic State University)