Hey there 👋
I'm Cuong!
PhD Student @ The University of Texas at Dallas

I am a PhD student in Computer Science at The University of Texas at Dallas, advised by Prof. Tien N. Nguyen. My research lies at the intersection of Software Engineering and AI, with three main directions:

  1. Applying AI/ML to software engineering tasks at the large-scale repo/project level
  2. Studying how/why Large Language Models understand and reason about program behavior
  3. Building Code World Models that capture dynamic program behavior and support more effective software development workflows

Previously, I was a full-time Research Resident in the FPT AI Residency Program, starting in my third year of undergrad, where I worked under the supervision of Prof. Tien N. Nguyen (Full Professor @ UT Dallas) and Dr. Nghi Bui (Research Scientist @ Google). This selective program provides intensive research training, strong computational support, and world-class mentorship for publishing in top-tier venues.

My first-author papers have been published at top-tier conferences, including ACL'26, ICSE'26, NAACL'25, and FORGE'25 (see my Google Scholar for more details). I was also honored to receive the ACM SIGSOFT Distinguished Paper Award at ICSE 2026.

Contact

You can email me at:

News
ACL 2026 Acceptance

SpecMind was accepted to the ACL 2026 Main Conference, which had an acceptance rate of ~19%.

Two New Preprints on Automated Program Repair

EvolRepair: Semantic evolution for LLM-guided APR
SpecTune: Specification-guided repair with intermediate behavioral signals

ACM SIGSOFT Distinguished Paper Award

SWE-Synth received the ACM SIGSOFT Distinguished Paper Award at ICSE 2026, recognizing it among the top 1.5% of accepted papers.

ICSE 2026 Acceptances

Two papers, TestWeaver and SWE-Synth, were accepted to the ICSE 2026 Research Track (April 12-18, 2026, Rio de Janeiro, Brazil).

NAACL 2025 Acceptance

VisualCoder was accepted to Findings of NAACL 2025 (April 29-May 4, 2025, Albuquerque, New Mexico, USA)

FORGE 2025 Acceptance

CodeFlow was accepted to FORGE 2025 (April 27-28, 2025, Ottawa, Ontario, Canada).

TestWeaver: Execution-aware, Feedback-driven Regression Testing Generation with Large Language Models
ICSE 2026

Cuong Chi Le, Cuong Duc Van, Tung Duy Vu, Minh V. T. Pham, Hoang Nhat Phan, Huy Nhat Phan, Tien N Nguyen

While recent advances in large language models (LLMs) have shown promise in automating test generation for regression testing, they often suffer from limited reasoning about program execution, resulting in stagnating coverage growth, a phenomenon known as the coverage plateau. This paper presents TestWeaver, a novel LLM-based approach that integrates lightweight program analysis to create a focused execution context that helps LLMs generate better tests. TestWeaver combines three components to overcome LLMs' limited reasoning about complex execution: (1) it reduces hallucinations and improves focus by supplying the LLM with the backward slice from the target line instead of the full program context; (2) it identifies and incorporates close test cases, those that share control-flow similarities with the path to the target line, to provide focused execution context within the LLM's context window; and (3) it enhances the LLM's reasoning with in-line execution annotations that encode variable states as comments along the executed path. By equipping LLMs with these targeted and contextualized inputs, TestWeaver improves coverage-guided test generation and mitigates redundant exploration. Empirical results show that TestWeaver accelerates code coverage growth and generates more effective test cases than state-of-the-art approaches.
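
To make the third component concrete, here is a minimal sketch of in-line execution annotation in Python. It is not TestWeaver's implementation: the helper `annotate_execution`, the sample function `clamp`, and the `<sample>` filename are hypothetical names chosen for illustration, and the sketch simply uses the standard `sys.settrace` hook to snapshot local variables just before each executed line runs.

```python
import sys

# Hypothetical program under test, kept as a string so its lines can be annotated.
SOURCE = """\
def clamp(x, lo, hi):
    if x < lo:
        x = lo
    elif x > hi:
        x = hi
    return x
"""


def annotate_execution(source, func_name, *args):
    """Run `func_name` from `source` and return (result, annotated_source).

    Each executed line gets a trailing comment listing the local variables
    as they were just before that line ran. Lines that never ran are left as-is.
    """
    namespace = {}
    exec(compile(source, "<sample>", "exec"), namespace)
    states = {}  # line number -> snapshot of locals (last visit wins)

    def tracer(frame, event, arg):
        # Only follow frames that belong to the sample program.
        if frame.f_code.co_filename != "<sample>":
            return None
        if event == "line":
            states[frame.f_lineno] = dict(frame.f_locals)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = namespace[func_name](*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook

    annotated = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        if lineno in states:
            snapshot = ", ".join(f"{k}={v!r}" for k, v in states[lineno].items())
            text = f"{text}  # {snapshot}"
        annotated.append(text)
    return result, "\n".join(annotated)


result, annotated = annotate_execution(SOURCE, "clamp", 15, 0, 10)
print(annotated)
```

For the call `clamp(15, 0, 10)`, the `x = lo` branch is never taken, so that line stays unannotated, while the lines on the executed path carry the variable states an LLM could read alongside the code.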

Semantic Evolution over Populations for LLM-Guided Automated Program Repair
Preprint

Cuong Chi Le, Minh Le-Anh, Cuong Duc Van, Tien N. Nguyen

Large language models (LLMs) have recently shown strong potential for automated program repair (APR), particularly through iterative refinement that generates and improves candidate patches. However, state-of-the-art LLM-based APR approaches built on iterative refinement cannot fully address several challenges, including maintaining useful diversity among repair hypotheses, identifying semantically related repair families, composing complementary partial fixes, exploiting structured failure information, and escaping structurally flawed search regions. In this paper, we propose EvolRepair, a population-based semantic evolution framework for iterative refinement in APR that formulates LLM-based APR as a semantic evolutionary algorithm. EvolRepair retains the search paradigm of classic genetic algorithms for APR, but replaces their syntax-based operators with semantics-aware components powered by LLMs and structured execution feedback. Candidate repairs are organized into behaviorally coherent groups, enabling the algorithm to preserve diversity, reason over repair families, and synthesize stronger candidates by recombining complementary repair insights across the population. By leveraging structured failure patterns to guide search direction, EvolRepair can both refine promising repair strategies and shift toward alternative abstractions when necessary. Our experiments show that EvolRepair substantially improves repair effectiveness over existing LLM-based APR approaches.
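
As a rough illustration of what "behaviorally coherent groups" could look like, the sketch below buckets candidate patches by their pass/fail vector over a test suite. The abstract does not say how EvolRepair forms its groups, so using the test-outcome vector as the behavioral signature is an assumption, and `behavior_signature`, `group_by_behavior`, and the four toy patches are hypothetical.

```python
from collections import defaultdict


def behavior_signature(candidate, tests):
    """Return a tuple of pass/fail outcomes, one per (args, expected) test."""
    outcomes = []
    for args, expected in tests:
        try:
            outcomes.append(candidate(*args) == expected)
        except Exception:
            outcomes.append(False)  # a crash counts as a failed test
    return tuple(outcomes)


def group_by_behavior(candidates, tests):
    """Map each distinct signature to the names of candidates that share it."""
    groups = defaultdict(list)
    for name, fn in candidates.items():
        groups[behavior_signature(fn, tests)].append(name)
    return dict(groups)


# Hypothetical candidate patches for a buggy absolute-value function.
candidates = {
    "patch_a": lambda x: x if x > 0 else -x,
    "patch_b": lambda x: max(x, -x),
    "patch_c": lambda x: x,    # only right for non-negative inputs
    "patch_d": lambda x: -x,   # only right for non-positive inputs
}
tests = [((3,), 3), ((-3,), 3), ((0,), 0)]

groups = group_by_behavior(candidates, tests)
for signature, names in groups.items():
    print(signature, names)
```

Here `patch_a` and `patch_b` land in the same group because they behave identically on the tests, while `patch_c` and `patch_d` fail on different inputs, which is the kind of complementary partial fix a recombination step could try to merge.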

Enhancing Program Repair with Specification Guidance and Intermediate Behavioral Signals
Preprint

Minh Le-Anh*, Cuong Chi Le*, Tien N. Nguyen

*co-first author

Automated Program Repair (APR) has recently benefited from large language models (LLMs). However, most LLM-based APR approaches still rely primarily on coarse end-to-end signals from test-suite outcomes to guide repair, providing limited insight into where a program's internal logic deviates from its intended behavior. In contrast, human debugging often relies on intermediate reasoning about program states through localized correctness conditions or assertions. Inspired by this observation, we propose SpecTune, a specification-guided debugging framework that incorporates intermediate behavioral reasoning into APR. SpecTune decomposes the repair task into suspicious regions connected by execution checkpoints and derives localized postconditions representing expected program behaviors at those points. By executing the buggy program and evaluating these postconditions, SpecTune produces micro-level debugging signals that indicate mismatches between observed and intended behaviors, enabling more precise fault localization and targeted patch generation. To address the potential unreliability of LLM-generated postconditions, we introduce two complementary signals: a specification validation signal alpha, which estimates the consistency of generated postconditions using partially passing test cases, and a discriminative signal beta, which detects violations of validated postconditions during execution. With these signals, SpecTune safely leverages automatically generated specifications for APR. Experimental results show that SpecTune improves fault localization and APR effectiveness over the baselines.
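
The interplay of the two signals can be pictured with a small Python sketch. The abstract gives no formulas for alpha and beta, so everything below is an assumption made for illustration: alpha is taken to be the share of passing tests on which a postcondition holds, beta is taken to be the list of validated postconditions violated on a failing test, and the buggy function, the postconditions, and the threshold are all hypothetical.

```python
import math


def buggy_mean_of_non_negatives(values):
    """Hypothetical buggy program; returns the state seen at its checkpoints."""
    kept = [v for v in values if v >= 0]
    total = sum(kept)
    mean = total / len(values)  # bug: should divide by len(kept)
    return {"kept": kept, "total": total, "mean": mean}


# Hypothetical postconditions, standing in for LLM-generated ones.
POSTCONDITIONS = {
    "kept_non_negative": lambda s: all(v >= 0 for v in s["kept"]),
    "total_is_sum": lambda s: s["total"] == sum(s["kept"]),
    "mean_matches_total": lambda s: math.isclose(
        s["mean"] * len(s["kept"]), s["total"]
    ),
    "total_above_ten": lambda s: s["total"] > 10,  # deliberately unreliable
}


def alpha(postcondition, passing_inputs):
    """Assumed validation signal: share of passing tests where the check holds."""
    held = sum(postcondition(buggy_mean_of_non_negatives(x)) for x in passing_inputs)
    return held / len(passing_inputs)


def beta(failing_input, passing_inputs, threshold=1.0):
    """Assumed discriminative signal: validated checks violated on a failing test."""
    state = buggy_mean_of_non_negatives(failing_input)
    return [
        name
        for name, check in POSTCONDITIONS.items()
        if alpha(check, passing_inputs) >= threshold and not check(state)
    ]


passing_inputs = [[1, 2, 3], [4]]  # the bug is not exposed on these inputs
failing_input = [2, -1, 4]         # expected mean 3.0, observed 2.0
violations = beta(failing_input, passing_inputs)
print(violations)
```

In this toy setup the unreliable `total_above_ten` check is filtered out because it never holds on the passing tests, and the only validated check violated on the failing input is `mean_matches_total`, which points at the faulty division.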