Hanieh Poostchi


2019

A multi-constraint structured hinge loss for named-entity recognition
Hanieh Poostchi | Massimo Piccardi
Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association

2018

Cluster Labeling by Word Embeddings and WordNet's Hypernymy
Hanieh Poostchi | Massimo Piccardi
Proceedings of the Australasian Language Technology Association Workshop 2018

Cluster labeling is the assignment of representative labels to clusters obtained from the organization of a document collection. Once assigned, the labels can play an important role in applications such as navigation, search and document classification. However, finding appropriately descriptive labels is still a challenging task. In this paper, we propose various approaches for assigning labels to word clusters by leveraging word embeddings and the synonymity and hypernymy relations in the WordNet lexical ontology. Experiments carried out using the WebAP document dataset have shown that one of the approaches stands out in the comparison and is capable of selecting labels that are reasonably aligned with those chosen by a pool of four human annotators.

BiLSTM-CRF for Persian Named-Entity Recognition ArmanPersoNERCorpus: the First Entity-Annotated Persian Dataset
Hanieh Poostchi | Ehsan Zare Borzeshi | Massimo Piccardi
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

PersoNER: Persian Named-Entity Recognition
Hanieh Poostchi | Ehsan Zare Borzeshi | Mohammad Abdous | Massimo Piccardi
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequent problematic training of an effective NER pipeline. To bridge this gap, in this paper we target the Persian language, which is spoken by a population of over a hundred million people worldwide. We first present and provide ArmanPersoNERCorpus, the first manually-annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach is capable of achieving interesting MUC7 and CoNLL scores while outperforming two alternatives based on a CRF and a recurrent neural network.