Recent Submissions

  • A sentiment analysis dataset for code-mixed Malayalam-English 

    Chakravarthi, Bharathi Raja; Jose, Navya; Suryawanshi, Shardul; Sherly, Elizabeth; McCrae, John P. (European Language Resources Association (ELRA), 2020-05-11)
    There is an increasing demand for sentiment analysis of text from social media which are mostly code-mixed. Systems trained on monolingual data fail for code-mixed data due to the complexity of mixing at different levels ...
  • Corpus creation for sentiment analysis in code-mixed Tamil-English text 

    Chakravarthi, Bharathi Raja; Muralidaran, Vigneshwaran; Priyadharshini, Ruba; McCrae, John P. (European Language Resources Association (ELRA), 2020-05-11)
    Understanding the sentiment of a comment from a video or an image is an essential task in many applications. Sentiment analysis of a text can be useful for various decision-making processes. One such application is to ...
  • Is trust between AI institutions and the public “morally rotten?” 

    Carter, Sarah (Machine Ethics Research Group, School of Computer Science, University College Dublin, 2020)
    Developing artificial intelligence (AI) technology has become a business of power. AI innovation is increasingly centralized in a few large companies – mainly, Google, Facebook, and Apple. Specialized data scientists - ...
  • A comparative study of different state-of-the-art hate speech detection methods in Hindi-English code-mixed data 

    Rani, Priya; Suryawanshi, Shardul; Goswami, Koustava; Chakravarthi, Bharathi Raja; Fransen, Theodorus; McCrae, John P. (European Language Resources Association (ELRA), 2020-05-11)
    Hate speech detection in social media communication has become one of the primary concerns to avoid conflicts and curb undesired activities. In an environment where multilingual speakers switch among multiple languages, ...
  • NUIG at TIAD: Combining unsupervised NLP and graph metrics for translation inference 

    McCrae, John P.; Arcan, Mihael (European Language Resources Association (ELRA), 2020-05-11)
    In this paper, we present the NUIG system at the TIAD shared task. This system includes graph-based metrics calculated using novel algorithms, with an unsupervised document embedding tool called ONETA and an unsupervised ...
  • A dataset for troll classification of Tamil memes 

    Chakravarthi, Bharathi Raja; Varma, Pranav; Arcan, Mihael; McCrae, John P.; Buitelaar, Paul; Suryawanshi, Shardul (European Language Resources Association (ELRA), 2020-05-11)
    Social media are interactive platforms that facilitate the creation or sharing of information, ideas or other forms of expression among people. This exchange is not free from offensive, trolling or malicious contents ...
  • Multimodal meme dataset (MultiOFF) for identifying offensive content in image and text 

    Suryawanshi, Shardul; Chakravarthi, Bharathi Raja; Arcan, Mihael; Buitelaar, Paul (European Language Resources Association (ELRA), 2020-05-11)
    A meme is a form of media that spreads an idea or emotion across the internet. As posting memes has become a new form of communication on the web, due to the multimodal nature of memes, postings of hateful memes or related ...
  • Challenges of word sense alignment: Portuguese language resources 

    Salgado, Ana; Ahmadi, Sina; Simões, Alberto; McCrae, John P.; Costa, Rute (National University of Ireland Galway, 2020-05-16)
    This paper reports on an ongoing task of monolingual word sense alignment in which a comparative study between the Portuguese Academy of Sciences Dictionary and the Dicionário Aberto is carried out in the context of the ...
  • A corpus of the Sorani Kurdish folkloric lyrics 

    Ahmadi, Sina; Hassani, Hossein; Abedi, Kamaladdin (National University of Ireland Galway, 2020-05-16)
    Kurdish poetry and prose narratives were historically transmitted orally and less in a written form. Being an essential medium of oral narration and literature, Kurdish lyrics have had a unique attribute in becoming a ...
  • Veritas annotator: Discovering the origin of a rumour 

    Azevedo, Lucas; Moustafa, Mohamed (Association for Computational Linguistics (ACL), 2019-11-03)
    Defined as the intentional or unintentional spread of false information (K et al., 2019) through context and/or content manipulation, fake news has become one of the most serious problems associated with online ...
  • Towards sharing task environments to support reproducible evaluations of interactive recommender systems 

    Barraza-Urbina, Andrea; d'Aquin, Mathieu (NUI Galway, 2019-09-20)
    Beyond sharing datasets or simulations, we believe the Recommender Systems (RS) community should share Task Environments. In this work, we propose a high-level logical architecture that will help to reason about the core ...
  • BEARS: Towards an evaluation framework for bandit-based interactive recommender systems 

    Barraza-Urbina, Andrea; Koutrika, Georgia; d'Aquin, Mathieu; Hayes, Conor (NUI Galway, 2018-10-06)
    Recommender Systems (RS) deployed in fast-paced dynamic scenarios must quickly learn to adapt in response to user evaluative feedback. In these settings, the RS faces an online learning problem where each decision should ...
  • WordNet gloss translation for under-resourced languages using multilingual neural machine translation 

    Chakravarthi, Bharathi Raja; Arcan, Mihael; McCrae, John P. (European Association for Machine Translation, 2019-08-19)
    In this paper, we translate the glosses in the English WordNet based on the expand approach for improving and generating wordnets with the help of multilingual neural machine translation. Neural Machine Translation (NMT) ...
  • Multilingual multimodal machine translation for Dravidian languages utilizing phonetic transcription 

    Chakravarthi, Bharathi Raja; Priyadharshini, Ruba; Stearns, Bernardo; Jayapal, Arun; Sridevy, S.; Arcan, Mihael; Zarrouk, Manel; McCrae, John P. (European Association for Machine Translation, 2019-08-19)
    Multimodal machine translation is the task of translating from a source text into the target language using information from other modalities. Existing multimodal datasets have been restricted to only highly resourced ...
  • Neural machine translation of literary texts from English to Slovene 

    Kuzman, Taja; Vintar, Špela; Arčan, Mihael (Machine Translation Summit 2019, 2019-08-19)
    Neural Machine Translation has shown promising performance in literary texts. Since literary machine translation has not yet been researched for the English-to-Slovene translation direction, this paper aims to fulfill ...
  • CoFiF: A corpus of financial reports in French language 

    Ahmadi, Sina; Daudert, Tobias (NUI Galway, 2019-08-12)
    In an era when machine learning and artificial intelligence have huge momentum, the data demand to train and test models is steadily growing. We introduce CoFiF, the first corpus comprising company reports in the French ...
  • Creating a fine-grained corpus for a less-resourced language: the case of Kurdish 

    Omer Abdulrahman, Roshna; Hassani, Hossein; Ahmadi, Sina (NUI Galway, 2019-07-28)
    Kurdish is a less-resourced language consisting of different dialects written in various scripts. Approximately 30 million people in different countries speak the language. The lack of corpora is one of the main obstacles ...
  • NUIG at the FinSBD Task: Sentence boundary detection for noisy financial PDFs in English and French 

    Daudert, Tobias; Ahmadi, Sina (NUI Galway, 2019-08-12)
    Portable Document Format (PDF) has become the industry-standard document as it is independent of the software, hardware or operating system. Publicly listed companies annually publish a variety of reports and too take ...
  • Passive diagnosis incorporating the PHQ-4 for depression and anxiety 

    Delahunty, Fionn; Johansson, Robert; Arcan, Mihael (NUI Galway, 2019)
    Depression and anxiety are the two most prevalent mental health disorders worldwide, impacting the lives of millions of people each year. In this work, we develop and evaluate a multilabel, multidimensional deep neural ...
  • On lexicographical networks 

    Ahmadi, Sina; Arcan, Mihael; McCrae, John (NUI Galway, 2018-12-06)
    In this study, we analyze various aspects of lexicographical networks. We aim to answer the research question of what the characteristics of lexicographical networks are. In addition to the existing notions of ...
