Cross-Linguistic Overlap English-Spanish Vocabulary: A Data-Driven Approach to Lexical Similarity

Maria Isabel Maldonado Garcia

Abstract

This study presents a computational methodology for identifying shared vocabulary between English and Spanish to support second language acquisition. Similarity indexes (S.I.) were calculated to determine the percentage of orthographic and semantic overlap between the two languages. Both Spanish and English use the Roman script, which allowed for string-based comparison. English has received many loanwords from Latin, from which Spanish also derives. This study analyses the most frequent vocabulary in both languages to assess the level of similarity and extract similar lexical items. A lexicostatistical computational method was used to compare the 3000 highest-frequency terms of the basic vocabulary of English and Spanish. The results show a lexical similarity level of 53.13%, corresponding to 1594 shared lexical terms, and identify this shared vocabulary as a pedagogical tool.
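The abstract does not state how the similarity index was computed, so the following is only a minimal sketch of one way a string-based index of this kind could be calculated. It assumes that semantic overlap is handled by pairing each English word with its Spanish translation, and that orthographic overlap is judged with the similarity ratio from Python's `difflib`. The function names, the 0.7 threshold, and the four toy word pairs are hypothetical and are not taken from the study. The final line only checks the arithmetic of the reported figures (1594 of 3000 terms).

```python
from difflib import SequenceMatcher


def orthographic_similarity(a: str, b: str) -> float:
    """Return a 0..1 spelling similarity (difflib ratio, an assumed stand-in metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity_index(pairs, threshold=0.7):
    """Return (percentage of pairs at or above the threshold, list of those pairs).

    `pairs` holds (English, Spanish) translation pairs, so meaning is matched
    by construction and only spelling is compared. The threshold is an
    assumption chosen for illustration.
    """
    shared = [(en, es) for en, es in pairs
              if orthographic_similarity(en, es) >= threshold]
    return 100.0 * len(shared) / len(pairs), shared


# Toy, hand-picked translation pairs (illustrative only, not the study's data).
pairs = [("animal", "animal"), ("family", "familia"),
         ("dog", "perro"), ("house", "casa")]
index, shared = similarity_index(pairs)

# Arithmetic check of the figures reported in the abstract: 1594 of 3000 terms.
reported_index = round(100 * 1594 / 3000, 2)
```

With these toy pairs, "animal/animal" and "family/familia" pass the threshold while "dog/perro" and "house/casa" do not. The real study compared 3000 high-frequency terms, and its matching criteria may differ from this sketch.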
