Kunal Pai

I am a Master's student at UC Davis working at the intersection of AI, Systems, and Software Engineering. My research centers on building and evaluating robust, autonomous agents for complex software engineering tasks, from secure code transpilation to automated bug repair.

News

- Our project, NAAMSE, won 2nd place in the Agent Safety Track at the 2026 AgentX - AgentBeats competition, organized by Berkeley RDI. A huge thank you to my collaborators, Parth Shah and Harshil Patel, for their incredible work on this project!

Education

University of California, Davis

Master of Science

Computer Science

2023 - Ongoing

GPA: 4.0

Relevant Coursework: Machine Learning, Computer Security, Information Visualization, Software Engineering, Theory of Computation, Bias and Fairness in AI, Vision and Language Research, Compilers and Program Analysis

University of California, Davis

Bachelor of Science

Computer Science and Engineering

2019 - 2023

GPA: 3.83

Provost Scholar, Graduated with Honors

Work Experience

DavSec Lab @ University of California, Davis

Graduate Student Researcher

April 2025 - Present

Davis, CA

Built an automated pipeline for C-to-Rust transpilation using LLMs, with 5 prompt variations, targeting secure systems migration.

Identified Halstead vocabulary as the strongest metric for predicting translation difficulty.

Validated lightweight semantic augmentations (e.g., filename context) that improved functional accuracy by 5%.

Benchmarked state-of-the-art LLMs across 746 C/C++ programs, achieving 70.2% functional accuracy with the best prompt design.

DArchR Lab @ University of California, Davis

Graduate Student Researcher

June 2023 - Present

Davis, CA

Leading a project to develop a full-system, cycle-level simulation model for a superconductor-based graph accelerator in the gem5 simulator, and to deliver models for cryogenic semiconductors and superconductors in gem5.

Mentored 5 undergraduate students in the Davis Computer Architecture Lab to prepare them for graduate research.

DECAL Lab @ University of California, Davis

Graduate Student Researcher

September 2022 - December 2024

Davis, CA

Developed a 4,500-sample dataset for pairwise code-documentation alignment from 200 open-source Python projects, enabling future research in software maintenance.

Engineered a pipeline for measuring the calibration and correctness of large language models for code repair, using Defects4J.

Collaborated on validating the efficacy of semantic augmentation of language model prompts for code summarization, using precision- and recall-based metrics such as ROUGE and METEOR.

University of California, Davis

Teaching Assistant

September 2023 - December 2023

Davis, CA

Assisted 180 students in a senior-level Probability & Statistical Modeling class.

DArchR Lab @ University of California, Davis

Undergraduate Researcher

June 2022 - June 2023

Davis, CA

Implemented a model of the HiFive Unmatched RISC-V board in gem5, achieving 85% accuracy against hardware profiling metrics.

Authored a poster on the validation of hardware and simpoints with gem5, presented at the gem5 workshop at ISCA 2023.

Co-authored tutorials on perf compilation for RISC-V and documentation for the Standard Library in gem5.

Projects

HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization

March 2025 - Present

Python, LLMs, Multi-Agent Systems

Collaborators: Parth Shah, Harshil Patel

Designed and deployed a multi-agent architecture enabling dynamic, LLM-driven collaboration across diverse tasks.

Implemented task decomposition with intelligent agent delegation based on resource cost models and task specialization.

Engineered autonomous generation of tools and APIs for task execution.

Developed a robust evaluation framework for agent performance across complex, multi-step tasks.

MARS: Multi-Agent Review System for Academic Papers

January 2025 - March 2025

Python, LLMs, Multi-Agent Systems

Collaborators: Saisha Shetty

Built a multi-agent LLM pipeline that simulates peer review with specialized agents for novelty, grammar, and critical questioning.

Achieved high accuracy on ICLR 2023 reviews, outperforming o3-mini and NotebookLM baselines.

Deployed privacy-preserving, local LLM evaluations using Ollama on consumer-grade hardware.

gem5 Vision

January 2023 - June 2023

NextJS, MongoDB, Python, JSON Schema

Collaborators: Parth Shah, Harshil Patel, Arslan Ali

Boosted resource discovery speed by 20x with optimized search functionality across 1,200+ resources.

Enabled faster retrieval of resources across 20+ categories by introducing categorization and semantic versioning.

Enhanced accessibility for 500+ industry and academic users by integrating local/remote JSON files and MongoDB with gem5.

Publications / Talks

Implications of Full-System Modeling for Superconducting Architectures

conference

Kunal Pai, Mahyar Samani, Anusheel Nand, Jason Lowe-Power

Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC Workshops '25)

As Moore's Law slows, superconducting electronics offer the potential for ultra-low-power, high-speed computation. This paper presents the first full-system superconducting model in gem5, including cryogenic and superconducting cores, caches, and interconnects. Our results show that superconducting cores and caches can yield up to 24× speedup for compute-intensive workloads, but memory-intensive applications remain bottlenecked by room-temperature DRAM. This makes superconducting technology more suitable for domain-specific accelerators than for general-purpose computing, with performance dependent on workload memory access patterns and data widths.

Computer Architecture, Superconducting, Cryogenic Computing, gem5

HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization

preprint

Kunal Pai, Parth Shah, Harshil Patel

arXiv preprint

To support resource-efficient multi-agent reasoning, we introduce HASHIRU, a hierarchical agent system that dynamically instantiates specialized agents under cost and memory constraints. HASHIRU combines hybrid LLM usage, autonomous API/tool creation, and a novel economic model for agent hiring/firing, outperforming larger models like Gemini 2.0 Flash on complex reasoning and safety tasks.

Artificial Intelligence, Large Language Models (LLMs), Multi-Agent Systems

CoDocBench: A Dataset for Code-Documentation Alignment in Software Maintenance

conference

Kunal Pai, Premkumar Devanbu, Toufique Ahmed

International Conference on Mining Software Repositories (MSR) 2025: Data and Tool Showcase Track

Understanding and implementing code changes is a key aspect of software maintenance. To support this, we introduce a new dataset of coupled changes to code and documentation mined from high-quality GitHub projects, where each sample represents a single commit with simultaneous updates to code and docstrings. This dataset enables training and evaluation on realistic, change-related tasks, which remain challenging for current models like Llama 3.1 405B and Mixtral 8×22B.

Software Engineering, GitHub Mining, Large Language Models (LLMs)

Calibration and Correctness of Language Models for Code

conference

Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md Rafiqul Islam Rabin, Amin Alipour, Sushmit Jha, Premkumar Devanbu, Toufique Ahmed

International Conference on Software Engineering (ICSE) 2025

Machine learning models often produce incorrect outputs, making reliable confidence measures essential for determining the trustworthiness of these outputs. This paper introduces a framework to evaluate and improve the calibration of code-generating models, finding that these models are generally poorly calibrated initially but can be improved using methods like Platt scaling, thereby enhancing decision-making in software engineering.

Software Engineering, Machine Learning, Naturalness of Software

Potential and Limitation of High-Frequency Cores and Caches

poster

Kunal Pai, Anusheel Nand, Jason Lowe-Power

ModSim 2024: Workshop on Modeling & Simulation of Systems and Applications

The poster presentation explores the potential and limitations of high-frequency in-order and out-of-order cores and caches in modern processors, highlighting the trade-offs between speedups and bandwidth.

Computer Architecture, Cryogenic Computing, Superconducting

Automatic semantic augmentation of language model prompts (for code summarization)

conference

Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr

International Conference on Software Engineering (ICSE) 2024

Adding explicit semantic facts to Large Language Model prompts improves performance on code summarization tasks, with notable improvements exceeding 2 BLEU and, in some cases, performance surpassing 30 BLEU. This demonstrates the effectiveness of the approach in enhancing code analysis and the extraction of essential information.

Software Engineering, Machine Learning, Naturalness of Software

Validating Hardware and SimPoints with gem5: A RISC-V Board Case Study

poster

Kunal Pai, Zhantong Qiu, Jason Lowe-Power

gem5 Workshop at International Symposium on Computer Architecture (ISCA) 2023

The poster discusses the development of a RISC-V board model (RISCVMatched) in gem5, along with a methodology for fine-tuning gem5 configurations to closely match real-life systems, resulting in more accurate hardware validation and simulation capabilities.

Computer Architecture, gem5

gem5 Vision

poster

Parth Shah, Kunal Pai, Harshil Patel, Arslan Ali

gem5 Workshop at International Symposium on Computer Architecture (ISCA) 2023

The gem5 Vision Project seeks to improve user-friendliness and accessibility by introducing advanced search functionality, comprehensive resource categorization, and expanded database support within the gem5 ecosystem for researchers and developers.

Computer Architecture, gem5

Service

- Program Committee member, 23rd International Conference on Mining Software Repositories: Data and Tool Showcase Track

Skills

Programming Languages

Python, C++, Java, JavaScript

Frameworks

React, Next.js, TensorFlow, PyTorch, Django, Flask, scikit-learn, pandas, NumPy, Matplotlib

Tools And Technologies

Git, Docker, MongoDB, gem5, Unix/Linux, LaTeX

Languages

English, Gujarati, Hindi, Spanish

Awards

Dean's List

UC Davis College of Engineering

Fall 2019

Dean's List

UC Davis College of Engineering

Fall 2020

Dean's List

UC Davis College of Engineering

Winter 2022

Dean's List

UC Davis College of Engineering

Spring 2022

Provost Award

UC Davis

2019-2023