About 66 results for large language models in Academic & Research
arXiv arxiv.org Jan 9, 2025

Enhancing Human-Like Responses in Large Language Models

arXiv preprint
This paper explores the advancements in making large language models (LLMs) more human-like. We focus on techniques that enhance natural language understanding, conversational coherence, and emotional intelligence in AI systems. The study evaluates various approaches, including fine-tuning with diverse datasets, incorp…
DOAJ mdpi.com Dec 1, 2025

Evaluating Model Resilience to Data Poisoning Attacks: A Comparative Study

Ifiok Udoidiok, Fuhao Li, Jielun Zhang — Information
Machine learning (ML) has become a cornerstone of critical applications, but its vulnerability to data poisoning attacks threatens system reliability and trustworthiness. Prior studies have begun to investigate the impact of data poisoning and proposed various defense or evaluation methods; however, most efforts remain…
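The poisoning threat this abstract describes can be made concrete with its simplest form, a label-flipping attack. Below is a minimal sketch, assuming a NumPy-based workflow; the function name and parameters are illustrative, not the paper's setup.

```python
# Minimal label-flipping data-poisoning sketch (illustrative only; the
# paper's actual attacks and defenses are not reproduced here).
import numpy as np

def flip_labels(y, fraction=0.1, num_classes=2, rng=None):
    """Flip a fraction of labels to a different random class."""
    rng = rng or np.random.default_rng(0)
    y = y.copy()
    idx = rng.choice(len(y), size=int(fraction * len(y)), replace=False)
    # Add a nonzero offset modulo the class count so the label always changes.
    y[idx] = (y[idx] + rng.integers(1, num_classes, size=len(idx))) % num_classes
    return y

# Resilience is then gauged by training on clean vs. poisoned labels
# and comparing held-out accuracy.
y_clean = np.array([0, 1, 1, 0, 1, 0, 0, 1])
y_poisoned = flip_labels(y_clean, fraction=0.25)
```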
CORE arxiv.org Apr 29, 2020

BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance

Schick, Timo, Schütze, Hinrich
Pretraining deep language models has led to large performance gains in NLP. Despite this success, Schick and Schütze (2020) recently showed that these models struggle to understand rare words. For static word embeddings, this problem has been addressed by separately learning representations for rare words. In this wo…
NASA ADS doi.org Aug 1, 2021

Program Synthesis with Large Language Models

Austin, Jacob, Odena, Augustus, Nye, Maxwell, Bosma, Maarten et al. — arXiv e-prints
This paper explores the limits of the current generation of large language models for program synthesis in general purpose programming languages. We evaluate a collection of such models (with between 244M and 137B parameters) on two new benchmarks, MBPP and MathQA-Python, in both the few-shot and fine-tuning regimes. O…
OpenAlex doi.org Jul 17, 2023

Large language models in medicine

Arun James Thirunavukarasu, Darren Shu Jeng Ting, Kabilan Elangovan, Laura Gutiérrez et al. — Nature Medicine
DOAJ doi.org Apr 1, 2025

Impact of early life exposure to heat and cold on linguistic development in two-year-old children: findings from the ELFE cohort study

Guillaume Barbalat, Ariane Guilbert, Lucie Adelaïde, Marie-Aline Charles et al. — Environmental Health
Abstract: Background: A number of negative developmental outcomes in response to extreme temperature have been documented. Yet, to our knowledge, environmental research has left the question of the effect of temperature on human neurodevelopment largely unexplored. Here, we aimed to investigate the effect of ambient temp…
CORE arxiv.org Sep 12, 2017

Language Models of Spoken Dutch

Verwimp, Lyan, Pelemans, Joris, Lycke, Marieke, Van hamme, Hugo et al.
In Flanders, all TV shows are subtitled. However, the process of subtitling is a very time-consuming one and can be sped up by providing the output of a speech recognizer run on the audio of the TV show, prior to the subtitling. Naturally, this speech recognition will perform much better if the employed language model …
OpenAIRE hdl.handle.net Aug 1, 2023

Large language models: compilers for the 4th generation of programming languages?

Marcondes, Francisco Supino, Almeida, J. J., Novais, Paulo
This paper explores the possibility of large language models as a fourth generation programming language compiler. This is based on the idea that large language models are able to translate a natural language specification into a program written in a particular programming language. In other words, just as high-level l…
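To make the compiler analogy concrete, the pair below treats a natural-language specification as "source code" and a Python function as the "object code" an LLM might emit; both halves are invented for illustration, not taken from the paper.

```python
# Hypothetical "LLM as 4GL compiler" pair: natural-language source,
# machine-generated Python target. No real LLM call is shown.
SPEC = "Return the three most frequent words in a text, ignoring case."

from collections import Counter

def top_three_words(text: str) -> list[str]:
    words = text.lower().split()
    return [w for w, _ in Counter(words).most_common(3)]
```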
NASA ADS doi.org Jan 1, 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Wei, Jason, Wang, Xuezhi, Schuurmans, Dale, Bosma, Maarten et al. — arXiv e-prints
We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning. In particular, we show how such reasoning abilities emerge naturally in sufficiently large language models via a simple method called chai…
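The method is easy to picture: exemplars in the prompt include their intermediate reasoning, which nudges the model to reason before answering. A minimal sketch with one canonical exemplar (no endpoint is called here; any completion API would do):

```python
# Few-shot chain-of-thought prompt: the worked exemplar contains explicit
# intermediate steps, so the model tends to produce its own before answering.
COT_PROMPT = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis
balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis
balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6
more, how many apples do they have?
A:"""
# A sufficiently large model typically continues with the steps
# (23 - 20 = 3, then 3 + 6 = 9) before stating "The answer is 9."
```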
OpenAlex doi.org Jul 12, 2023

Large language models encode clinical knowledge

Karan Singhal, Shekoofeh Azizi, Tao Tu, S. Sara Mahdavi et al. — Nature
Large language models (LLMs) have demonstrated impressive capabilities, but the bar for clinical applications is high. Attempts to assess the clinical knowledge of models typically rely on automated evaluations based on limited benchmarks. Here, to address these limitations, we present MultiMedQA, a benchmark combining…
arXiv arxiv.org May 18, 2024

Large Language Models Lack Understanding of Character Composition of Words

arXiv preprint
Large language models (LLMs) have demonstrated remarkable performances on a wide range of natural language tasks. Yet, LLMs' successes have been largely restricted to tasks concerning words, sentences, or documents, and it remains questionable how much they understand the minimal units of text, namely characters. In th…
DOAJ mdpi.com Oct 1, 2024

System 2 Thinking in OpenAI’s o1-Preview Model: Near-Perfect Performance on a Mathematics Exam

Joost C. F. de Winter, Dimitra Dodou, Yke Bauke Eisma — Computers
The processes underlying human cognition are often divided into System 1, which involves fast, intuitive thinking, and System 2, which involves slow, deliberate reasoning. Previously, large language models were criticized for lacking the deeper, more analytical capabilities of System 2. In September 2024, OpenAI introd…
CORE arxiv.org May 29, 2020

Using Large Pretrained Language Models for Answering User Queries from Product Specifications

Roy, Kalyani, Shah, Smit, Pai, Nithish, Ramtej, Jaidam et al.
While buying a product from e-commerce websites, customers generally have a plethora of questions. From the perspective of both the e-commerce service provider and the customers, there must be an effective question answering system to provide immediate answers to user queries. While certain questions can…
OpenAIRE doi.org Jan 1, 2024

Development of a Red-Teaming Dataset for Defending Large Language Models against Attacks

Irina Sergeevna Alekseevskaia, Konstantin Vladimirovich Arkhipenko, Denis Yuryevich Turdakov
Modern large language models are huge systems with complex internal mechanisms implementing black-box response generation. Although aligned large language models have built-in defense mechanisms, recent studies demonstrate that they remain vulnerable to attacks. In this study, we aim to e…
NASA ADS doi.org Mar 1, 2022

Training Compute-Optimal Large Language Models

Hoffmann, Jordan, Borgeaud, Sebastian, Mensch, Arthur, Buchatskaya, Elena et al. — arXiv e-prints
We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget. We find that current large language models are significantly undertrained, a consequence of the recent focus on scaling language models whilst keeping the amount of training data constant. …
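The finding is commonly condensed into a rule of thumb: scale parameters and tokens together, at roughly 20 tokens per parameter, with training FLOPs approximated as C ≈ 6ND. A back-of-the-envelope sketch using those rounded constants (not the paper's fitted coefficients):

```python
# Chinchilla-style compute-optimal sizing with the common approximations
# C ~ 6 * N * D (training FLOPs) and D_opt ~ 20 * N_opt. The constants are
# rounded rules of thumb, not the paper's fitted values.
def compute_optimal(flops_budget: float) -> tuple[float, float]:
    n_opt = (flops_budget / (6 * 20)) ** 0.5  # parameters
    d_opt = 20 * n_opt                        # training tokens
    return n_opt, d_opt

n, d = compute_optimal(5.76e23)
print(f"~{n / 1e9:.0f}B params, ~{d / 1e12:.1f}T tokens")  # ~69B, ~1.4T
```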
OpenAlex doi.org Jan 23, 2024

A Survey on Evaluation of Large Language Models

Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu et al. — ACM Transactions on Intelligent Systems and Technology
Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes increasingly critical, not only at the task level, but also at th…
DOAJ doi.org Nov 1, 2022

Collectively encoding protein properties enriches protein language models

Jingmin An, Xiaogang Weng — BMC Bioinformatics
Abstract: Pre-trained natural language processing models on a large natural language corpus can naturally transfer learned knowledge to protein domains by fine-tuning specific in-domain tasks. However, few studies focused on enriching such protein language models by jointly learning protein properties from strongly-corr…
CORE arxiv.org Mar 31, 2016

BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies

Ji, Shihao, Vishwanathan, S. V. N., Satish, Nadathur, Anderson, Michael J. et al.
We propose BlackOut, an approximation algorithm to efficiently train massive recurrent neural network language models (RNNLMs) with million word vocabularies. BlackOut is motivated by using a discriminative loss, and we describe a new sampling strategy which significantly reduces computation while improving stability, …
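BlackOut's own scheme adds sample weighting and a discriminative objective; the sketch below shows only the generic idea it accelerates, scoring the target against k sampled negatives instead of the full vocabulary (uniform sampling, no weighting correction, all names mine):

```python
import numpy as np

def sampled_softmax_nll(hidden, emb, target, k=50, rng=None):
    """Generic sampled-softmax negative log-likelihood: the target competes
    with only k sampled words, not the whole vocabulary. (BlackOut further
    weights the samples and trains with a discriminative loss.)"""
    rng = rng or np.random.default_rng(0)
    negatives = rng.choice(emb.shape[0], size=k, replace=False)
    negatives = negatives[negatives != target]   # keep the target unique
    idx = np.concatenate(([target], negatives))
    logits = emb[idx] @ hidden                   # target score is row 0
    logits -= logits.max()                       # numerical stability
    return -np.log(np.exp(logits)[0] / np.exp(logits).sum())

# Toy usage: vocabulary of 10k words, hidden size 128.
rng = np.random.default_rng(1)
loss = sampled_softmax_nll(rng.standard_normal(128),
                           rng.standard_normal((10_000, 128)), target=42)
```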
NASA ADS doi.org Dec 1, 2023

Retrieval-Augmented Generation for Large Language Models: A Survey

Gao, Yunfan, Xiong, Yun, Gao, Xinyu, Jia, Kangxiang et al. — arXiv e-prints
Large Language Models (LLMs) showcase impressive capabilities but encounter challenges like hallucination, outdated knowledge, and non-transparent, untraceable reasoning processes. Retrieval-Augmented Generation (RAG) has emerged as a promising solution by incorporating knowledge from external databases. This enhances …
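The pipeline this abstract summarizes has two moving parts, retrieval then generation. A minimal sketch with a TF-IDF retriever and a prompt builder (the corpus and all names are illustrative; production RAG uses dense retrievers and a real LLM endpoint):

```python
# Minimal RAG sketch: retrieve top documents, then prepend them to the prompt.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

CORPUS = [
    "LoRA freezes pretrained weights and trains low-rank update matrices.",
    "Chain-of-thought prompting elicits intermediate reasoning steps.",
    "Chinchilla scaling trains smaller models on more tokens.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    vec = TfidfVectorizer().fit(CORPUS + [query])
    sims = cosine_similarity(vec.transform([query]), vec.transform(CORPUS))[0]
    return [CORPUS[i] for i in sims.argsort()[::-1][:k]]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How does LoRA adapt a model?"))  # then sent to an LLM
```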
OpenAlex doi.org Feb 9, 2023

Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models

Tiffany H. Kung, Morgan Cheatham, Arielle Medenilla, Czarina Sillos et al. — PLOS Digital Health
We evaluated the performance of a large language model called ChatGPT on the United States Medical Licensing Exam (USMLE), which consists of three exams: Step 1, Step 2CK, and Step 3. ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, …
arXiv arxiv.org Jul 1, 2024

Self-Cognition in Large Language Models: An Exploratory Study

arXiv preprint
While Large Language Models (LLMs) have achieved remarkable success across various applications, they also raise concerns regarding self-cognition. In this paper, we perform a pioneering study to explore self-cognition in LLMs. Specifically, we first construct a pool of self-cognition instruction prompts to evaluate wh…
DOAJ ieeexplore.ieee.org Apr 22, 2026

Synthetic Data Pretraining for Hyperspectral Image Super-Resolution

Emanuele Aiello, Mirko Agarla, Diego Valsesia, Paolo Napoletano et al. — IEEE Access
Large-scale self-supervised pretraining of deep learning models is known to be critical in several fields, such as language processing, where it has led to significant breakthroughs. Indeed, it is often more impactful than architectural designs. However, the use of self-supervised pretraining lags behind in several do…
CORE arxiv.org Mar 2, 2018

Breaking the Softmax Bottleneck: A High-Rank RNN Language Model

Yang, Zhilin, Dai, Zihang, Salakhutdinov, Ruslan, Cohen, William W.
We formulate language modeling as a matrix factorization problem, and show that the expressiveness of Softmax-based models (including the majority of neural language models) is limited by a Softmax bottleneck. Given that natural language is highly context-dependent, this further implies that in practice Softmax with di…
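The bottleneck argument is compact enough to restate (notation mine, following the abstract): a softmax LM with hidden size d can only express log-probability matrices of rank about d, while a highly context-dependent language may demand far more.

```latex
% Softmax bottleneck, restated. Contexts c_1,...,c_N, vocabulary x_1,...,x_M,
% hidden size d:
\[
  A_{ij} = \log P_\theta(x_j \mid c_i)
         = \mathbf{h}_{c_i}^{\top}\mathbf{w}_{x_j} - \log Z_{c_i},
  \qquad A = HW^{\top} + \mathbf{c}\mathbf{1}^{\top},
\]
\[
  H \in \mathbb{R}^{N \times d},\; W \in \mathbb{R}^{M \times d}
  \;\Rightarrow\; \operatorname{rank}(A) \le d + 1,
\]
% whereas the true matrix A^*_{ij} = log P^*(x_j | c_i) of natural language
% may have much higher rank -- hence the "bottleneck."
```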
OpenAIRE doi.org Jan 1, 2024

Modelling Language

Grindrod, Jumbly
This paper argues that large language models have a valuable scientific role to play in serving as scientific models of a language. Linguistic study should not only be concerned with the cognitive processes behind linguistic competence, but also with language understood as an external, social entity. Once this is recog…
NASA ADS doi.org Feb 1, 2024

Large Language Models: A Survey

Minaee, Shervin, Mikolov, Tomas, Nikzad, Narjes, Chenaghlu, Meysam et al. — arXiv e-prints
Since the release of ChatGPT in November 2022, Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks. LLMs acquire their general-purpose language understanding and generation abilities by training billions of model parameters on massive am…
OpenAlex doi.org Jun 17, 2021

LoRA: Low-Rank Adaptation of Large Language Models

J. Edward Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu et al. — arXiv (Cornell University)
An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example -- deploying indepen…
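The truncated abstract sets up LoRA's core move: freeze the pretrained weight W and train only a low-rank update, so the adapted weight is W + (alpha/r)·BA. A minimal NumPy sketch (the zero/Gaussian initializations follow the paper's convention; everything else is illustrative):

```python
# Minimal LoRA-style linear layer: W is frozen, only the low-rank factors
# A and B train. B starts at zero, so the adapter is initially a no-op.
import numpy as np

class LoRALinear:
    def __init__(self, w_frozen: np.ndarray, r: int = 8, alpha: int = 16):
        d_out, d_in = w_frozen.shape
        self.w = w_frozen                          # frozen pretrained weight
        self.a = np.random.randn(r, d_in) * 0.01   # trainable, small init
        self.b = np.zeros((d_out, r))              # trainable, zero init
        self.scale = alpha / r

    def __call__(self, x: np.ndarray) -> np.ndarray:
        return x @ (self.w + self.scale * (self.b @ self.a)).T

layer = LoRALinear(np.random.randn(64, 32))
y = layer(np.random.randn(4, 32))  # (4, 64); in training, only a and b update
```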
arXiv arxiv.org Sep 5, 2023

Making Large Language Models Better Reasoners with Alignment

arXiv preprint
Reasoning is a cognitive process of using evidence to reach a sound conclusion. The reasoning capability is essential for large language models (LLMs) to serve as the brain of the artificial general intelligence agent. Recent studies reveal that fine-tuning LLMs on data with the chain of thought (COT) reasoning process…
DOAJ ieeexplore.ieee.org Apr 22, 2026

Raman Spectroscopy Pre-Trained Encoder: A Self-Supervised Learning Approach for Data-Efficient Domain-Independent Spectroscopy Analysis

Abhiraam Eranti, Yogesh Tewari, Rafael Palacios, Amar Gupta — IEEE Access
Deep-learning methods have boosted the analytical power of Raman spectroscopy, yet they still require large, task-specific, labeled datasets and often fail to transfer across application domains. The study explores pre-trained encoders as a solution. Pre-trained encoders have significantly impacted Natural Language Pro…
CORE arxiv.org Sep 15, 2017

Multilingual Hierarchical Attention Networks for Document Classification

Pappas, Nikolaos, Popescu-Belis, Andrei
Hierarchical attention networks have recently achieved remarkable performance for document classification in a given language. However, when multilingual document collections are considered, training such models separately for each language entails linear parameter growth and lack of cross-language transfer. Learning a…
NASA ADS doi.org Mar 1, 2023

BloombergGPT: A Large Language Model for Finance

Wu, Shijie, Irsoy, Ozan, Lu, Steven, Dabravolski, Vadim et al. — arXiv e-prints
The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has bee…
OpenAlex doi.org Mar 31, 2023

A Survey of Large Language Models

Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang et al. — ArXiv.org
Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in…
DOAJ nytsqb.aiijournal.com Aug 1, 2023

Review of Deep Learning for Language Modeling

WANG Sili, ZHANG Ling, YANG Heng, LIU Wei — Nongye tushu qingbao xuebao
[Purpose/Significance] Deep learning for language modeling is one of the major methods and advanced technologies for enhancing the language intelligence of machines, and has become an indispensable technical means for the automatic processing and analysis of data resources and the intelligent mining of informa…
CORE arxiv.org Aug 9, 2016

Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge

Locascio, Nicholas, Narasimhan, Karthik, DeLeon, Eduardo, Kushman, Nate et al.
This paper explores the task of translating natural language queries into regular expressions which embody their meaning. In contrast to prior work, the proposed neural model does not utilize domain-specific crafting, learning to translate directly from a parallel corpus. To fully explore the potential of neural models…
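One concrete instance of the task, with the prediction checked against Python's re module (the query/regex pair is my own illustration, not drawn from the paper's parallel corpus):

```python
# Illustrative NL-to-regex pair, verified with the standard re module.
import re

query = "lines that contain the word 'dog' followed later by a number"
predicted_regex = r"\bdog\b.*\d"

assert re.search(predicted_regex, "my dog is 3 years old")
assert not re.search(predicted_regex, "my cat is 3 years old")
```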
OpenAlex doi.org Jun 15, 2022

Emergent Abilities of Large Language Models

Wei, Jason, Yi Tay, Rishi Bommasani, Colin Raffel et al. — arXiv (Cornell University)
Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in …
arXiv arxiv.org Jul 10, 2024

A Critical Review of Causal Reasoning Benchmarks for Large Language Models

arXiv preprint
Numerous benchmarks aim to evaluate the capabilities of Large Language Models (LLMs) for causal inference and reasoning. However, many of them can likely be solved through the retrieval of domain knowledge, questioning whether they achieve their purpose. In this review, we present a comprehensive overview of LLM benchm…
DOAJ frontiersin.org Jan 1, 2026

Accuracy and reliability of Manus, ChatGPT, and Claude in case-based dental diagnosis

Ahmed A. Madfa, Abdullah F. Alshammari, Bassam A. Anazi, Yousef E. Alenezi et al. — Frontiers in Oral Health
Introduction: Artificial intelligence (AI), particularly large language models (LLMs), is transforming healthcare education and clinical decision-making. While models like ChatGPT and Claude have demonstrated utility in medical contexts, their performance in dental diagnostics remains underexplored; additionally, the pot…
CORE arxiv.org May 11, 2020

DIET: Lightweight Language Understanding for Dialogue Systems

Bunk, Tanja, Varshneya, Daksh, Vlasov, Vladimir, Nichol, Alan
Large-scale pre-trained language models have shown impressive results on language understanding benchmarks like GLUE and SuperGLUE, improving considerably over other pre-training methods like distributed representations (GloVe) and purely supervised approaches. We introduce the Dual Intent and Entity Transformer (DIET)…
OpenAIRE doi.org Jan 1, 2023

Large Language Models: Compilers for the 4th Generation of Programming Languages? (Short Paper)

Marcondes, Francisco S., Almeida, José João, Novais, Paulo
This paper explores the possibility of large language models as a fourth generation programming language compiler. This is based on the idea that large language models are able to translate a natural language specification into a program written in a particular programming language. In other words, just as high-level l…
OpenAlex doi.org Jul 7, 2021

Evaluating Large Language Models Trained on Code

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan et al. — arXiv (Cornell University)
We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from docstri…
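HumanEval's functional-correctness results are reported as pass@k, and the paper derives an unbiased estimator for it: generate n samples per problem, count the c that pass the unit tests, and compute 1 - C(n-c,k)/C(n,k). The numerically stable product form:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples drawn
    from n (of which c are correct) passes, i.e. 1 - C(n-c,k)/C(n,k)."""
    if n - c < k:
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

print(round(pass_at_k(n=200, c=10, k=5), 3))  # ~0.23
```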
arXiv arxiv.org Oct 2, 2023

All Languages Matter: On the Multilingual Safety of Large Language Models

arXiv preprint
Safety lies at the core of developing and deploying large language models (LLMs). However, previous safety benchmarks only concern the safety in one language, e.g. the majority language in the pretraining data such as English. In this work, we build the first multilingual safety benchmark for LLMs, XSafety, in response…
DOAJ journals.lww.com Aug 1, 2025

Evaluating Artificial Intelligence’s Role in Developing Research Questions in Head and Neck Reconstruction

Sebastian Holm, MD, Mario Zambrana, MD, Juan E. Berner, MD, PhD, Reza Tabrisi, MD et al. — Plastic and Reconstructive Surgery, Global Open
Summary: Generative artificial intelligence (AI) large language models are an emerging technology, with ChatGPT and Gemini being 2 well-known examples. The current literature discusses clinical applications and limitations of AI, but its role in research has not yet been extensively evaluated. This study aimed to asse…
CORE arxiv.org Dec 15, 2015

Strategies for Training Large Vocabulary Neural Language Models

Chen, Welin, Grangier, David, Auli, Michael
Training neural network language models over large vocabularies is still computationally very costly compared to count-based models such as Kneser-Ney. At the same time, neural language models are gaining popularity for many applications such as speech recognition and machine translation whose success depends on scalab…
OpenAIRE doi.org Jan 1, 2023

Lost in Translation: Large Language Models in Non-English Content Analysis

Nicholas, Gabriel, Bhatia, Aliya
In recent years, large language models (e.g., OpenAI's GPT-4, Meta's LLaMa, Google's PaLM) have become the dominant approach for building AI systems to analyze and generate language online. However, the automated systems that increasingly mediate our interactions online -- such as chatbots, content moderation systems,…
OpenAlex doi.org May 24, 2022

Large Language Models are Zero-Shot Reasoners

Takeshi Kojima, Shixiang Gu, Machel Reid, Yutaka Matsuo et al. — arXiv (Cornell University)
Pretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and generally known as excellent few-shot learners with task-specific exemplars. Notably, chain of thought (CoT) prompting, a recent technique for eliciting complex multi-step reasoning through step-by-step a…
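The paper's method needs no exemplars at all: appending a fixed trigger phrase before the answer is enough. A two-stage sketch following the paper's prompt templates (llm is a placeholder for any completion endpoint, not a real API):

```python
# Zero-shot chain-of-thought: stage 1 elicits a reasoning chain with a
# trigger phrase, stage 2 extracts the final answer from that chain.
def zero_shot_cot(question: str, llm) -> str:
    prompt = f"Q: {question}\nA: Let's think step by step."
    reasoning = llm(prompt)                 # stage 1: generate reasoning
    return llm(f"{prompt} {reasoning}\n"    # stage 2: extract the answer
               "Therefore, the answer is")
```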
arXiv arxiv.org Dec 6, 2025

Classifying German Language Proficiency Levels Using Large Language Models

arXiv preprint
Assessing language proficiency is essential for education, as it enables instruction tailored to learners' needs. This paper investigates the use of Large Language Models (LLMs) for automatically classifying German texts according to the Common European Framework of Reference for Languages (CEFR) into different proficie…
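A minimal zero-shot prompt for this kind of CEFR classification (the wording is an assumed template, not the paper's actual prompt):

```python
# Illustrative CEFR-classification prompt; the template is an assumption.
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

def cefr_prompt(text: str) -> str:
    return (
        "Classify the CEFR proficiency level of the following German text.\n"
        f"Answer with exactly one of: {', '.join(CEFR_LEVELS)}.\n\n"
        f"Text: {text}\nLevel:"
    )
```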
DOAJ frontiersin.org Jan 1, 2026

Large language model bias auditing for periodontal diagnosis using an ambiguity-probe methodology: a pilot study

Teerachate Nantakeeratipat — Frontiers in Digital Health
Background: Large Language Models (LLMs) in healthcare hold immense promise yet carry the risk of perpetuating social biases. While artificial intelligence (AI) fairness is a growing concern, a gap exists in understanding how these models perform under conditions of clinical ambiguity, a common feature in real-world p…
CORE arxiv.org Feb 28, 2012

Using Built-In Domain-Specific Modeling Support to Guide Model-Based Test Generation

Kanstrén, Teemu, Puolitaival, Olli-Pekka — Open Publishing Association
We present a model-based testing approach to support automated test generation with domain-specific concepts. This includes a language expert who is an expert at building test models and domain experts who are experts in the domain of the system under test. First, we provide a framework to support the language expert i…
OpenAIRE doi.org Jun 16, 2024

Mathematical Insights into Large Language Models

Ranjith Gopalan
Purpose: The paper presents an exhaustive examination of the mathematical frameworks that support the creation and operation of large language models. The document commences with an introduction to the core mathematical concepts that are foundational to large language models. It delves into the mathematical algorithms …