Arabic Curriculum Analysis

Developing a platform that analyzes the content of curricula can help identify their shortcomings and whether they are tailored to specific desired outcomes. In this paper, we present a system to analyze Arabic curricula and provide insights into their content. It allows users to explore word presence and the surface forms used, and to contrast statistics across the countries from which the curricula were selected. It also provides a facility to grade a text with reference to a given grade level and gives users feedback about the complexity or difficulty of the words used in the text.


Introduction
Effective language curricula are critical to teaching the communication skills required in professional and academic settings. Building suitable curricula or updating existing ones is typically a laborious and time-consuming process, often requiring many specialists working together. Given this complexity, it is imperative to have tools that ascertain whether the curricula achieve the desired learning objectives, as measured, for example, by vocabulary level. Developing a platform that analyzes curricula can help identify shortcomings and whether the curricula are tailored to desired outcomes. Natural Language Processing (NLP) can provide automated methods to perform such analysis and provide feedback to curricula developers.
A wealth of research devoted to building, curating, and assessing educational materials has been published for English and other Latin-script languages (Tyler, 1950; Oliva, 2005; Braun et al., 2006; Soto, 2015). Though some recent NLP work on Arabic has addressed language learning, readability, and textbook assessment (Zaghouani et al., 2014; Zalmout et al., 2016; Al Khalil et al., 2018), this work is limited and the available resources and tools are scarce. This paper aims to contribute to curricula assessment and to fill some of the gaps in the literature. We focus on analyzing Arabic curricula taught in Gulf countries at the elementary school level. We built a tool that analyzes curricula by providing: statistics about word usage and morphological forms in different grades; words belonging to specific categories, such as food or animals; comparison with other curricula; and complexity levels of words in a text according to selected grades. The tool provides insights into the strengths and weaknesses of curricula and highlights what they cover in terms of vocabulary and morphological constructs. Further, we added some features of potential help to instructors and learners: word pronunciation (using text-to-speech), English translation (using machine translation), diacritized forms of words (using automatic diacritization), and linguistic information such as word segmentation and part-of-speech tags. As far as we know, this is the first system that: i) allows for browsing and comparing word usage in Arabic curricula from different countries, and ii) shows whether students in a particular grade would likely understand a piece of text.

Related Work
While research on readability is not new, advances in technology have allowed researchers to explore the topic further and to propose formulations that approximate the difficulty of texts for readers. Benjamin et al. (2012) surveyed the developments in the field of readability from the perspective of education, linguistics, cognitive science, and psychology, and provided recommendations for the use of such evaluation techniques. Collins-Thompson (2014) explored the challenges for automatic assessment of text readability and highlighted the opportunities for using automatic modeling to predict the reading difficulty of texts. Al-Khalifa and Al-Ajlan (2010) proposed a tool for readability analysis and applied it to curricula in Saudi Arabia. Zalmout et al. (2016) described a process to analyze the textbooks of two different teaching methods for English as a Second Language (ESL) using a readability scoring technique. Al Khalil et al. (2018) presented an Arabic reading corpus collected from first to twelfth grade textbooks from the United Arab Emirates and from works of fiction, in order to address the inadequate resources that affected educational applications. García Salido et al. (2018) proposed a lexical tool for academic writing in Spanish and described the data extraction from a corpus of academic texts. The tool provides insight into how to use vocabulary typical of the academic genre when composing a text.
This work is licensed under a Creative Commons Attribution 4.0 International License. License details: http://creativecommons.org/licenses/by/4.0/.
Arabic is a complex language with rich morphology. Stems are typically derived from a set of roots using predefined stem templates. Affixes can be attached to stems to generate words (surface forms). For example, the word ("wsyktbwnhA" - "and they will write it") 1 has two prefixes ("and" and "will") and two suffixes ("they" and "it"). Further, Arabic is typically written without diacritics (short vowels), which are essential for understanding meaning and for verbalizing words properly. This increases the complexity of analyzing Arabic texts.
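To make the affixation example concrete, the following minimal sketch composes the surface form from its parts. The exact segment boundaries (w+s+yktb+wn+hA) and the stem gloss are our assumptions, inferred from the glosses given in the text; they are not output from any particular segmenter.

```python
# Illustrative only: segment boundaries are an assumption inferred from the
# glosses in the text, written in Buckwalter transliteration.
PREFIXES = [("w", "and"), ("s", "will")]
STEM = ("yktb", "write")
SUFFIXES = [("wn", "they"), ("hA", "it")]


def surface_form(prefixes, stem, suffixes):
    """Concatenate prefixes, stem, and suffixes into one surface form."""
    return (
        "".join(seg for seg, _ in prefixes)
        + stem[0]
        + "".join(seg for seg, _ in suffixes)
    )


def segmented(prefixes, stem, suffixes):
    """Return the same word with '+' marking the assumed segment boundaries."""
    parts = [seg for seg, _ in prefixes] + [stem[0]] + [seg for seg, _ in suffixes]
    return "+".join(parts)


if __name__ == "__main__":
    print(surface_form(PREFIXES, STEM, SUFFIXES))
    print(segmented(PREFIXES, STEM, SUFFIXES))
```

Going in the other direction, from a surface form back to its segments, is the harder problem and is what a morphological segmenter has to solve.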

Data Collection
We acquired the text versions of the Arabic subject primary school curricular textbooks from six Gulf countries covering grades 1 through 6 from either 2014 or 2015 2. These countries 3 are: Bahrain (BH), Kuwait (KW), Oman (OM), Qatar (QA), Saudi Arabia (SA), and United Arab Emirates (AE). Statistics are shown in Table 1. Table 2 has example sentences from different grades and shows that the text from grade 1 is direct and simple and is comprised mainly of short declarative sentences. On the other hand, text from grade 6 is more complex, at the vocabulary and sentence structure levels, with longer sentences.

The working man finds that his useful work contributes to the making of human civilization, while ...
Table 2: Sample sentences from QA curriculum (from Grade 1 and Grade 6)

1 Buckwalter transliteration and translation are provided.
2 We thank The World Organization for Renaissance of Arabic Language (WORAL) for data collection and preparation.
3 We use ISO 3166-1 alpha-2 for country codes.

System Description
System Architecture: An overview of the system functionalities is illustrated in Figure 1, and the system can be publicly accessed at the following URL: curriculum.qcri.org. After acquiring the textbook collection, we used the publicly available Farasa Arabic NLP toolkit to process the text. This processing includes morphological segmentation (Abdelali et al., 2016), diacritization (Darwish et al., 2017), and lemmatization (Mubarak, 2018). These steps are crucial to the analysis given the complexities of Arabic. Next, language experts classified lemmas into 50 categories (e.g., Function Words, Human, Animal, Food, History, Politics, Travel, and Religious Acts). The system provides the following functions: Term Usage, Category, Statistics, Differences, and Text Grading. It also uses Text to Speech (TTS), Machine Translation (MT), and Farasa tools to pronounce, translate, and provide morphological analysis of lexical items, respectively.
Design: To implement our tool, we used Django, a Python web framework for the rapid development of high-performance, database-driven web applications. The framework supports the model-view-controller (MVC) design pattern, which separates the data model and business rules from the user interface. Accordingly, the system modules are separated to ensure reuse and to support multiple users and sessions.

System Functionalities
The system provides five main functions: Term Usage, Category, Statistics, Differences, and Text Grading.
Term Usage: This provides the distribution of input words in all or a subset of grades and countries, and displays all relevant word forms as word clouds, as shown in Figures 2, 3 and 4. For ambiguous words, users can select either a diacritized form or an undiacritized form. If the input word is in English, the system provides the most frequent translation and displays its information. For any displayed word, users can get a translation, listen to the pronunciation, and obtain morphological information while hovering (Figure 3).
Category: Users can browse words belonging to a specific category per country and/or grade, as shown in Figure 5. Such functionality gives a glimpse into the overall coverage of a given topic/category.
Statistics: This option shows the distribution of all the lemmas of each country and grade. Results can be shown per grade or accumulated, meaning that for each grade, results of all previous grades are also included. From Figure 6, we see that students learn roughly 1.5k lemmas in grade 1 and have encountered about 8k lemmas by the end of grade 6. Also, the curricula of QA and OM have lower vocabulary richness compared to the rest of the Gulf countries. Such comparisons are of direct interest to experts in curriculum development.
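The per-grade and accumulated views described above can be sketched as follows. This is a minimal illustration and not the system's actual implementation: the `CORPUS` layout, the function names, and the toy lemmas are all hypothetical placeholders.

```python
from collections import Counter

# Hypothetical toy data: (country, grade) -> lemmas occurring in that textbook.
# The lemmas are placeholders, not drawn from the actual curricula.
CORPUS = {
    ("QA", 1): ["ktb", "qlm", "ktb"],
    ("QA", 2): ["ktb", "mdrsp"],
    ("QA", 3): ["Elm", "qlm", "bHr"],
}


def per_grade_vocab(corpus, country):
    """Return {grade: set of distinct lemmas} for one country."""
    vocab = {}
    for (c, grade), lemmas in corpus.items():
        if c == country:
            vocab.setdefault(grade, set()).update(lemmas)
    return vocab


def accumulated_sizes(corpus, country):
    """Return {grade: count of distinct lemmas seen in this or earlier grades}."""
    vocab = per_grade_vocab(corpus, country)
    seen = set()
    sizes = {}
    for grade in sorted(vocab):
        seen |= vocab[grade]
        sizes[grade] = len(seen)
    return sizes


def term_usage(corpus, lemma):
    """Return a Counter mapping (country, grade) to occurrences of a lemma."""
    return Counter(
        {key: lemmas.count(lemma) for key, lemmas in corpus.items() if lemma in lemmas}
    )


if __name__ == "__main__":
    print(accumulated_sizes(CORPUS, "QA"))
    print(term_usage(CORPUS, "ktb"))
```

Counting lemmas rather than surface forms keeps inflected variants of the same word from inflating the vocabulary size, which matters for a morphologically rich language such as Arabic.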
Differences: To compare the curriculum of a specific country with those of other countries, users can browse words that are unique to this country in a selected grade, as shown in Figure 7.
Text Grading: This feature allows users to spot "difficult" words in a given input text. The difficulty of a text can be measured in different ways; the system flags only those words whose lemmas do not appear in the textbooks of the selected grade or any preceding grade. As shown in Figure 8, for the input text, the system highlights difficult words for grade 1 (e.g., ("xbrA', jr>p, flsfp" - experts, audacity, philosophy)) because they are not seen in the textbooks of the selected grade. When higher grades are selected, the number of difficult words decreases. In the second example in Figure 8, with grade 4 selected, these words are no longer considered difficult; only the words ("tHwylyp, vrA'" - transformative, richness) remain highlighted.
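Both the country comparison and the difficulty rule described above reduce to set operations over lemmas. The sketch below illustrates this under stated assumptions: the `VOCAB` contents, the `LEMMA_OF` lookup, and the grade assignments are hypothetical and only mimic the behavior described for Figure 8. In the real system, lemmas would come from a lemmatizer rather than a hand-written dictionary.

```python
# Hypothetical toy data: (country, grade) -> set of lemmas in that textbook.
# The grade assignments below are invented for illustration.
VOCAB = {
    ("QA", 1): {"ktb", "qlm"},
    ("QA", 4): {"xbyr", "flsfp"},
    ("KW", 1): {"ktb", "bHr"},
}

# Hypothetical surface-form -> lemma lookup (stands in for a lemmatizer).
LEMMA_OF = {"xbrA'": "xbyr"}


def unique_to_country(vocab, country, grade):
    """Lemmas used by `country` in `grade` that no other country uses in that grade."""
    others = set()
    for (c, g), lemmas in vocab.items():
        if g == grade and c != country:
            others |= lemmas
    return vocab.get((country, grade), set()) - others


def difficult_words(tokens, lemma_of, vocab, country, grade):
    """Tokens whose lemma is absent from grades up to and including `grade`."""
    known = set()
    for (c, g), lemmas in vocab.items():
        if c == country and g <= grade:
            known |= lemmas
    # Fall back to the token itself when no lemma is listed for it.
    return [t for t in tokens if lemma_of.get(t, t) not in known]


if __name__ == "__main__":
    text = ["xbrA'", "flsfp", "tHwylyp", "ktb"]
    print(unique_to_country(VOCAB, "QA", 1))
    print(difficult_words(text, LEMMA_OF, VOCAB, "QA", 1))
    print(difficult_words(text, LEMMA_OF, VOCAB, "QA", 4))
```

Because the known vocabulary only grows as the selected grade increases, the list of flagged words can only shrink or stay the same, which matches the behavior reported for higher grades.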

Conclusion and Future Work
We presented a tool for curricula analysis. For demonstration, we used a collection of elementary school textbooks (grades 1 to 6) from six Gulf countries. The system provides valuable insights into word usage, vocabulary coverage, and the richness of the curriculum for each grade level. It also provides text grading with reference to a grade level, word pronunciation and translation, and morphological analysis. In the future, we aim to extend the collection to other countries and to improve text grading by considering syntactic and semantic features and by giving grade-level scores for input texts. We are also considering text simplification.