From European Portuguese to Portuguese Sign Language

Several efforts have been done towards the development of platforms that allow the translation of speciﬁc sign languages to the correspondent spoken language (and vice-versa). In this (demo) paper, we describe a freely available system that translates, in real time, European Portuguese (text) into Portuguese Sign Language (LGP), by using an avatar. We focus in how some linguistic elements are dealt in LGP, and in the Natural Language Processing (NLP) step implemented in our system. The system’s interface will be used to demonstrate it. Although only a small set of linguistic phenomena are implemented, it can be seen how the system copes with it.


Introduction
Sign languages are now considered full natural human languages. Contrary to auditory-vocal languages, sign languages are visual-gestural languages that merge manual communication and body language [1]. The meaning is expressed with a combination of different hand shapes, orientation and movement of the hands (manual features). Non-manual features, such as body movements (upper torso) and facial expressions are also used, as well as fingerspelling -the process of spelling out words by using signs that correspond to the letters of the word in the local vocal language.
Sign Languages have their own vocabulary and grammatical rules, which do not match the correspondent spoken language as the writing system does [1]. For instance, American Sign Language and British Sign Language are different and not mutually understandable, although learnt by people living in English speaking countries. Thus, it is very difficult to take advantage of existing resources when moving to a new sign language.
For some languages, several studies emerged in the last years, with their focus ranging from linguistic and humanistic to automatic translation. However, only very recently, LGP (officially recognised in 1997 1 ) has been a target of these studies. Currently, there are several dictionaries [2,3] both in image and video format, but only one grammar [4] in a very incomplete state. There is no official number for deaf persons in Portugal, but the 2011 census [5] mentions 27,659 deaf persons, making, however, no distinction in the level of deafness, and on the respective level of Portuguese and LGP literacy. Aiming to contribute to this community, we developed a system, which, given as input a sentence in (European) Portuguese, performs the correspondent signs in LGP, by using an avatar. At the basis of this system, there is a flexible architecture that takes advantage of NLP tools, as these can give an important contribution to the translation process. For instance, if a proper noun is identified, if no sign is associated with it, fingerspelling is the solution. Moreover, as we will see, in some cases, a word can be signed by signs associated with its root and suffixes. Thus, a stemmer or a Part-of-Speech (POS)-tagger can play a fundamental role in these situations. A detailed description of the system can be found in [6] and [7]. The system can be downloaded from http://web.ist.utl.pt/~ist163556/pt2lgp. This paper is organised as follows: in Section 2 we present related work, in Section 3 we describe some basic linguistic phenomena in LGP, in Section 4 we describe our prototype, and, in Section 5, we explain what can be tested in our demo. Finally, in Section 6 we conclude and point to future work.

Related Work
Many efforts were done towards the development of translators from different sign languages to their spoken counterparts and vice-versa. A number of projects in the area are focusing in the entire system pipeline (from spoken to sign languages and viceversa), as the work of [8], for Portuguese, and [9] for Mandarin; others only target part of it (for instance [10], which deals with Italian Sign Language). Current trends in Automatic Machine Translation cannot be followed as there are no parallel corpora (except in some specific contexts) to train the translators. Thus, most of the existing systems are based on handcrafted glosses, relating signs with words, which is also our approach.
Recently LGP has been the focus of several computational studies. The work described in [11] focus on avatars, and on how to produce avatars signs, based on human signs; the work in [12] targets the teaching of LGP; the Virtual Sign Translator [8] contributes with a translator between European Portuguese and LGP, and it was also applied to be used in a game that teaches LGP [13]. However, to the best of our knowledge, none of these works explored how current NLP tasks can be applied to help the translation process.

Linguistic Concepts
In this section we make a brief overview of some linguistic phenomena in LGP. At the basis of our study are the static images of hand configurations presented in an LGP dictionary [2], LGP videos from different sources, such as the Spread the Sign initiative 2 , and, the (only) LGP grammar [4], from 1994. LGP interpreters were also consulted, as we could not found information regarding some linguistic phenomena in the previous mentioned materials.

Nouns
Concepts in LGP usually do not have an associated gender, and, thus, do not need inflection. For animated beings, and when relevant, gender can be specified with a prefix, expressing 'male' or 'female' (as for 'coelha' ('female rabbit'), which becomes 'female' + 'rabbit'). In case of omission, the male gender is assumed. However, there are classes of nouns that are female by default as is the case of 'enfermeira' ('nurse'), and need to be obligatorily prefixed with 'male'. Another (more common) exception is to have two separate words to denote the male and female case, as in 'leão' ('lion') and 'leoa' ('lioness').
Considering plural cases for LGP, this can be done in several different ways. The first is incorporation, allowing to explicitly specify a small quantity after the noun. Examples are 'pessoas+4' ('persons+4'), or 'pessoas+muitas' ('per-sons+many'). The second is repetition, meaning to perform a sign multiple times as seen for 'árvores' (trees). The last is reduplication, meaning to make the sign with both hands as in 'pessoas' (persons). However, there are many non identified processes for LGP and the cases of the usage of each type of plural are not clear.
With regard to proper nouns, fingerspelling is often used. If the person does not have a known gestural name, fingerspelling the letters of her/his name is the solution.

Adjectives
The sign for the adjective follows the sign for the noun. Figure  1 illustrates the signs for 'coelho pequeno' ('little rabbit'). Notice that, if the signs for 'coelho' and 'pequeno' are available (and although this cannot be seen as a rule) words as 'coelhinho' (also 'little rabbit'), can also be translated, as long as we are able to properly identify suffixes.

Numbers
Numbers can be used as a quantitative qualifier, isolated number (cardinal), ordinal number, and composed number (e.g. 147). Signs associated with each number also vary their forms if we are expressing a quantity, a repetition or a duration, and if we are using them as an adjective or complement to a noun or verb. Reducing the test case to ordinal numbers, the main difficulty is to express numbers in the order of the tens and up. For instance, '147' is signed as '1', followed by '4' and '7' with a slight offset in space as the number grows. Numbers from '11' to '19' can be signed with a blinking movement of the units' number. Some numbers, in addition to this system, have a totally different sign (as e.g. '11', which has its own sign).

Verbs
When the use of the verb is plain, with no past or future participles, the infinitive form is used in LGP. Most verbs are inflected according to the associated subjects and are affected by the action, the time and the way the action is realised. For instance, for the regular use of the verb 'to eat', the hand goes twice to the mouth, closing from a relaxed form, with palm up. However, this verb in LGP is highly contextualised with what is being eaten. Thus, the verb should be signed recurring to different hand configurations and expressiveness, describing how the thing is being eaten (not all the deaf associations agree on this).
The Portuguese grammar [4] refers a temporal line in the gesturing space with which verbs should concord with in past, present and future tenses. The verb inflection is made along this imaginary line with eye, eyebrow and upper body movement. A common practice is to add a time adverb to the sentence, such as passado 'past', futuro 'future' or amanhã 'tomorrow'. The adverbial expression is also performed along the timeline with a possible emphasis on the distance in time. For example, the word agora 'now' is always signed in front of the signer and close to the torso, but it can be signed more and more close to express immediateness or the reverse to express laxness. This is a feature often found in other sign languages.
In what concerns verb agreement, to the best of our knowledge, there is no gender or number agreement in LGP. This information must be express by direct referencing to the subject, for example, by mentioning a personal pronoun before the verb. For instance, in the sentence eu pergunto-te 'I ask you', the verb is directed from 'I' to 'you, while in the sentence tu perguntasme 'You ask me', the verb changes directionality. Additionally, the pronoun 'you' is signed in the direction of the second person's face in the case of the verb 'ask', but in the direction of the chest with the verb 'to give'.
Modality is realised throughout the imaginary temporal line, indicating duration and repetition through movement. An example is the verb andar 'to walk', which is signed with different movement modulation for andar 'walk', ir andando 'walking', andar apressadamente 'walk hurriedly', andar pesadamente 'stumping' and so on.

Syntax
Syntax in sign languages is made by spatial agreement of signs. To the best of our knowledge, there are no studies at the sentence level for the LGP, but studies for American Sign Language (ASL) [14], indicate the existence of several complex phenomena, such as loci and surrogates for the agreement of verbs with virtual entities. However, it is known that in a syntactic point of view, LGP is Object-Subject-Verb (OSV), while spoken Portuguese is predominantly Subject-Verb-Object (SVO).

The prototype
The Natural Language ToolKit (NLTK) 3 was used in all the NLP tasks. Blender 4 was our choice regarding the 3D package for animation. Both are widely used, community driven, free and open source. Moreover, NLTK offers taggers, parsers, and other tools in several languages, including Portuguese. In the following we describe each one of the main tasks of our system. Figure 2 presents the general architecture of our prototype. Several well established tasks in the NLP field where integrated in our system, namely: • Error correcting and normalization: A step that enforces lowercase and the use of latin characters. Common spelling mistakes can be corrected in this step.
• Stemmer: As a form of morphologic parsing, we apply a stemmer that identifies suffixes and prefixes to use as an adjective or classifier to the gloss. This allows, for example, 'coelhinha' ('little female rabbit'), to be understood, by its suffixes ('inho' +'a') , to be a small ('inho) and a female ('a') derivation of the root 'coelh(o)'.
• POS-Tagger: We make use of NLTK's n-gram taggers, starting with a bigram tagger, with a backoff technique for an unigram tagger and the default classification of 'noun' (the most common class for Portuguese). We used the treebank 'floresta sintá(c)tica' corpus [15] for training the taggers.

Lookup
The animation lookup, given a gloss, is done via a JSON file mimicking a database. The database consists of a set of glosses and a set of actions. The action ids are mapped to blender actions, that are in turn referenced by the glosses. One gloss may link to more than one action, that are assumed to be played sequentially.

Animation
We implemented base hand configurations. These differ from sign language to sign language. For LGP there are 57 base configurations, composed of 26 hand configurations for letters, 10 for numbers, 13 for named configurations and 8 extra ones matching greek letters (examples in Figure 3). Doing a set of base hand configurations to start, proved to be a good choice as it allowed to test the hand rig and basic methodology. All the 57 basic hand configuration were manually posed and keyed from references gathered from [2,4,3], and also from the Spread the Sign project videos 5 . The ten (0 to 9) implemented hand configurations are shown in Figure 4. The animation is synthesised by directly accessing and modifying the action and f-curve data. We always start and end a sentence with the rest pose. For concatenating the actions, we blend from one to the other in a given amount of frames by using Blender's Non-Linear Action tools that allow action layering. Channels that are not used in the next gesture, are blended with the rest pose instead. We adjust the number of frames for blending according to the hints received. For fingerspelling mode, we expand the duration of the hand configuration (which is originally just one frame). Further details about this process can be found in [6] and [7].

The demo
Users can interact with our system via an interface, which consists of an input text box, a button to translate, and a 3D view with the signing avatar. The 3D view can be rotated and zoomed, allowing to see the avatar from different perspectives.
The breakdown down in Figure 5 shows the interface. Additionally, we provide an interface for exporting video of the signing that supports choosing the resolution, aspect ratio and file format. This panel is indicated in the image in orange. Displayed in green, is a panel indicating the authors and describing the project. All panels but the one used for the main interaction start folded. It should be clear that it is still possible to use extra functionalities of Blender, thus making advanced usage of the system.
In what concerns current possibilities of the system, common spelling mistakes in the words used for the test cases can be corrected. Moreover, several words deriving from the stem 'coelho' were implemented, such as 'coelha' (female rabbit) and 'coelhinho' (little rabbit). Besides isolated words, some full sentences, such as 'O João come a sopa', can be tested. The verb sign had to be extended, as for eating soup, it is done as if handling a spoon (for instance, for eating apples, the verb is signed as if holding the fruit).
To conclude, we should say that two deaf associations were reached for a preliminary evaluation. Feedback on clarity and readability was very positive.

Conclusions and Future Work
In this paper we described the system we would like to demonstrate, focusing on its NLP component. It was designed to be free and open-source. All the basic hand signs for LGP were implemented, as well as the whole basic infrastructure (already accommodating different language phenomena).
This work led to a collaborative project between academia and industry that aims at creating a prototype that translates European Portuguese (text and speech) into LGP, in real time. As future work, besides moving to the translation between LGP and European Portuguese, we will extend the database and the dictionaries. Also, we will work in an interface that will allows us to easily add data to the system. Current NLP tasks and techniques will also be further explored. A more formal evaluation also needs to be done.