<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>DSpace Collection:</title>
  <link rel="alternate" href="http://hdl.handle.net/10174/38446" />
  <subtitle />
  <id>http://hdl.handle.net/10174/38446</id>
  <updated>2026-04-06T18:38:57Z</updated>
  <dc:date>2026-04-06T18:38:57Z</dc:date>
  <entry>
    <title>Performance Evaluation of NLP Models for European Portuguese: Multi-GPU/Multi-node Configurations and Optimization Techniques</title>
    <link rel="alternate" href="http://hdl.handle.net/10174/41453" />
    <author>
      <name>Santos, Daniel</name>
    </author>
    <author>
      <name>Miquelina, Nuno</name>
    </author>
    <author>
      <name>Schmidt, Daniela</name>
    </author>
    <author>
      <name>Quaresma, Paulo</name>
    </author>
    <author>
      <name>Nogueira, Vítor Beires</name>
    </author>
    <id>http://hdl.handle.net/10174/41453</id>
    <updated>2026-02-25T10:42:34Z</updated>
    <published>2025-02-17T00:00:00Z</published>
    <summary type="text">Title: Performance Evaluation of NLP Models for European Portuguese: Multi-GPU/Multi-node Configurations and Optimization Techniques
Authors: Santos, Daniel; Miquelina, Nuno; Schmidt, Daniela; Quaresma, Paulo; Nogueira, Vítor Beires
Abstract: Natural Language Processing (NLP) research has predominantly focused on the English language, leading to a wealth of resources and advancements tailored to English. However, there is a growing need to extend these capabilities to other languages, such as European Portuguese, to ensure the inclusivity and accessibility of NLP technologies. In this study, we explore the evaluation of NLP models in the European Portuguese language using a multi-GPU/multi-node machine. We utilized various tools such as PyTorch, Accelerate, Transformers, and DeepSpeed with ZeRO Stage 3 to handle the computational demands of large-scale model training. We detail the key aspects of our methodology for evaluating various models on translated GLUE tasks. Additionally, we introduce AiBERTa, a base model with 110 million parameters, developed and pre-trained on a corpus tailored for European Portuguese. This research highlights the effectiveness of advanced tools and distributed computing in scaling NLP model training, providing a foundation for future enhancements in European Portuguese language processing.</summary>
    <dc:date>2025-02-17T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>A Galician-Portuguese Generative Model</title>
    <link rel="alternate" href="http://hdl.handle.net/10174/41452" />
    <author>
      <name>Gamallo, Pablo</name>
    </author>
    <author>
      <name>Rodríguez, Pablo</name>
    </author>
    <author>
      <name>Sotelo, Susana</name>
    </author>
    <author>
      <name>Miquelina, Nuno</name>
    </author>
    <author>
      <name>Paniagua, Silvia</name>
    </author>
    <author>
      <name>Schmidt, Daniela</name>
    </author>
    <author>
      <name>de-Dios-Flores, Iria</name>
    </author>
    <author>
      <name>Quaresma, Paulo</name>
    </author>
    <author>
      <name>Bardanca, Daniel</name>
    </author>
    <author>
      <name>Pichel, José Ramom</name>
    </author>
    <author>
      <name>Nogueira, Vítor</name>
    </author>
    <author>
      <name>Barro, Senén</name>
    </author>
    <id>http://hdl.handle.net/10174/41452</id>
    <updated>2026-02-25T10:42:22Z</updated>
    <published>2024-11-16T00:00:00Z</published>
    <summary type="text">Title: A Galician-Portuguese Generative Model
Authors: Gamallo, Pablo; Rodríguez, Pablo; Sotelo, Susana; Miquelina, Nuno; Paniagua, Silvia; Schmidt, Daniela; de-Dios-Flores, Iria; Quaresma, Paulo; Bardanca, Daniel; Pichel, José Ramom; Nogueira, Vítor; Barro, Senén
Abstract: Large language models (LLMs) have revolutionized natural language processing, but their predominant focus on English has resulted in biases and performance differences across various languages. This situation is maintained in generative multilingual models, where English continues to be the predominant language. In these models, the presence of European Portuguese is marginal and that of the Galician variety is almost residual. In this work, we describe an open-source Galician-Portuguese generative model, Carvalho_pt-gl, focused precisely on these two language variants, which are very close lexically and syntactically. The model was trained using a GPT architecture with 1.3 billion parameters on more than 6B words, balanced between the two varieties. The strategy of continual pretraining was used to adapt a pre-existing LLM that was trained on a trilingual dataset with related languages, thereby overcoming the data limitations that would be faced if the training was started from scratch. Evaluation results involving task-based datasets from standardized benchmarks indicate a promising performance. These findings highlight the critical importance of supporting linguistic diversity in generative models.</summary>
    <dc:date>2024-11-16T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>Parameter Efficient Fine-Tuning of LLMs: Application to Machine Translation from English to Portuguese</title>
    <link rel="alternate" href="http://hdl.handle.net/10174/41401" />
    <author>
      <name>Santos, Daniel</name>
    </author>
    <author>
      <name>Nogueira, Vitor</name>
    </author>
    <author>
      <name>Quaresma, Paulo</name>
    </author>
    <id>http://hdl.handle.net/10174/41401</id>
    <updated>2026-02-23T11:42:01Z</updated>
    <published>2025-01-01T00:00:00Z</published>
    <summary type="text">Title: Parameter Efficient Fine-Tuning of LLMs: Application to Machine Translation from English to Portuguese
Authors: Santos, Daniel; Nogueira, Vitor; Quaresma, Paulo
Abstract: Fine-tuning Large Language Models (LLMs) for specific tasks, such as machine translation, is a computationally expensive process that often requires substantial hardware resources. Parameter-Efficient Fine-Tuning (PEFT) methods, such as Low-Rank Adaptation (LoRA) and Quantized Low-Rank Adaptation (QLoRA), offer a resource-efficient alternative by significantly reducing the number of trainable parameters and memory requirements. In this work, we compare the performance and memory efficiency of LoRA and QLoRA on English-Portuguese translation tasks, utilizing two cutting-edge LLMs, Meta LLaMA 3.1 8B and Mistral 7B. Our experiments demonstrate that both LoRA and QLoRA achieve substantial memory savings. Moreover, this work underscores the practical advantages of LoRA and QLoRA in resource-constrained environments, providing a foundation for further optimization and experimentation in machine translation using large language models.</summary>
    <dc:date>2025-01-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>GuideBP: Guided Backpropagation in Multi-output Neural Networks by Channeling Gradients Through Weaker Logits</title>
    <link rel="alternate" href="http://hdl.handle.net/10174/41303" />
    <author>
      <name>Ghosh, Swarnendu</name>
    </author>
    <author>
      <name>Mandal, Bodhisatwa</name>
    </author>
    <author>
      <name>Gonçalves, Teresa</name>
    </author>
    <author>
      <name>Quaresma, Paulo</name>
    </author>
    <author>
      <name>Nasipuri, Mita</name>
    </author>
    <author>
      <name>Das, Nibaran</name>
    </author>
    <id>http://hdl.handle.net/10174/41303</id>
    <updated>2026-02-19T11:12:41Z</updated>
    <published>2024-01-01T00:00:00Z</published>
    <summary type="text">Title: GuideBP: Guided Backpropagation in Multi-output Neural Networks by Channeling Gradients Through Weaker Logits
Authors: Ghosh, Swarnendu; Mandal, Bodhisatwa; Gonçalves, Teresa; Quaresma, Paulo; Nasipuri, Mita; Das, Nibaran
Abstract: Convolutional neural networks often generate multiple logits from multiple networks. In most cases, we use simple techniques like addition or column averaging for loss computation, but this allows gradients to be distributed equally among all paths. The proposed approach attempts to guide the gradients of backpropagation along the weakest branches of the neural network. A weakness score is proposed that defines the class-specific performance of individual logits. This is then used to create a new output distribution that would guide gradients along the weakest pathways. The proposed approach has been shown to perform better than traditional column merging techniques and can be used in several application scenarios. Not only can the proposed model be used as an efficient technique for training multiple instances of a model in parallel, but CNNs with multiple output branches have also been shown to perform better with the proposed upgrade. Various experiments establish the flexibility of the learning technique, which is simple yet effective in various multi-objective scenarios, both empirically and statistically.</summary>
    <dc:date>2024-01-01T00:00:00Z</dc:date>
  </entry>
</feed>

