Education
University of California, Davis
Master of Science
Computer Science
2023 - Ongoing
University of California, Davis
Bachelor of Science
Computer Science and Engineering
2019 - 2023
Work Experience
DArchR Lab @ University of California, Davis
Graduate Student Researcher
June 2023 - Present
Davis, CA
- Leading a team to develop a superconductor-based graph accelerator in the gem5 simulator.
- Leading a team to deliver full-system, cycle-level simulation models for cryogenic semiconductors and superconductors in gem5.
- Collaborating with a team to develop an autotuning methodology with 90% correlation between gem5 simulation results and hardware profiling metrics.
- Mentoring 5 undergraduate students in the Davis Computer Architecture Lab to prepare them for graduate research.
University of California, Davis
Student Researcher
September 2022 - Present
Davis, CA
- Developing a dataset for pairwise code-documentation alignment from open-source Python projects, to assist with future tasks in software maintenance.
- Created a pipeline for measuring calibration and correctness of large language models for code repair, using Defects4J.
- Assisted in validating the efficacy of semantic augmentation of language model prompts for code summarization, using precision- and recall-based metrics such as ROUGE and METEOR.
University of California, Davis
Teaching Assistant
September 2023 - December 2023
Davis, CA
- Assisted 180 students in a senior-level Probability & Statistical Modeling class.
DArchR Lab @ University of California, Davis
Undergraduate Researcher
June 2022 - June 2023
Davis, CA
- Implemented a model of the HiFive Unmatched RISC-V board in gem5, achieving 85% accuracy with hardware profiling metrics.
- Authored a poster on the validation of hardware and SimPoints with gem5, presented at the gem5 workshop at ISCA 2023.
- Co-authored tutorials on perf compilation for RISC-V and documentation for the Standard Library in gem5.
humanID
Tech Team Lead
January 2022 - June 2022
Davis, CA
- Delivered 10 completed projects with global teams, including:
  - Documentation of a Discord bot that combats spam and fake users.
  - A Django-based web application for permission management for 100 users.
SiTime Corp.
Technical Product Marketing Intern
July 2021 - September 2021
Santa Clara, CA
- Presented a strategy to improve distributor margin management and increase earned profits by $250,000.
- Conducted a market survey on optical transceivers used in AI networking, to identify customers for MEMS timing chips.
- Created Visio diagrams for the product requirements document (PRD) of a timing chip.
Academic Projects
Automated Frameworks of Semantic Augmentation to Improve Mathematical Word Problem Solving
April 2024 - June 2024
NLP, Prompting, Machine Learning
Collaborators: Nishant Acharya, Zeerak Babar
- Improved PaLM 2 prompting accuracy on math word problems (MWPs) by 10% and fine-tuned TinyLlama accuracy by 60% through a one-shot digit-level semantics framework.
- Introduced a novel demonstration selection model to improve the accuracy of LLMs. The model used BLEU scores and Levenshtein distance to identify the most similar equations for one-shot examples.
The Effects of Toxicity on Disengagement in Open Source Projects
January 2024 - March 2024
Open Source, GitHub Mining, Data Analysis
Collaborators: Saisha Shetty, Vijeth KL, Thrisha Kopula, Ariel Kamen
- Found a strong correlation ($R^2 = 0.76$) between high developer engagement in FAANG projects with larger codebases and lower levels of toxicity, offering actionable insights for community management.
- Quantified toxic behavior by mining corporate and non-profit repositories and applying sentiment analysis, revealing that toxicity disproportionately impacts new developers compared to experienced ones (up to 1.3x more).
What is the behavior of Spectre, a speculative prediction exploit, on the various branch predictors available in the computer architecture simulator gem5?
October 2023 - December 2023
gem5, Spectre, Computer Security
Collaborators: Yuyi Li, Frank Gomez
- Demonstrated up to a 55% reduction in susceptibility to speculative execution attacks by validating design enhancements for Spectre-resistant branch predictors, such as longer training periods and fewer biased branches.
- Investigated the vulnerability of x86-based in-order and out-of-order processors to Spectre V1 attacks, revealing a strong correlation between branch predictor training periods and attack effectiveness.
gem5 Vision
January 2023 - June 2023
Next.js, MongoDB, Python, JSON Schema
Collaborators: Parth Shah, Harshil Patel, Arslan Ali
- Boosted resource discovery speed by 20x with optimized search functionality across 1,200+ resources.
- Enabled faster retrieval of resources across 20+ categories by introducing categorization and semantic versioning.
- Enhanced accessibility for 500+ industry and academic users by integrating local/remote JSON files and MongoDB with gem5.
Publications / Talks
Calibration and Correctness of Language Models for Code
conference
Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md Rafiqul Islam Rabin, Amin Alipour, Sushmit Jha, Premkumar Devanbu, Toufique Ahmed
ICSE 2025
Machine learning models often produce incorrect outputs, making reliable confidence measures essential for determining the trustworthiness of these outputs. This paper introduces a framework to evaluate and improve the calibration of code-generating models, finding that these models are generally poorly calibrated initially but can be improved using methods like Platt scaling, thereby enhancing decision-making in software engineering.
Software Engineering, Machine Learning, Naturalness of Software
Potential and Limitation of High-Frequency Cores and Caches
poster
Kunal Pai, Anusheel Nand, Jason Lowe-Power
ModSim 2024: Workshop on Modeling & Simulation of Systems and Applications
The poster presentation explores the potential and limitations of high-frequency in-order and out-of-order cores and caches in modern processors, highlighting the trade-offs between speedups and bandwidth.
Computer Architecture, Cryogenic Computing, Superconducting
Automatic semantic augmentation of language model prompts (for code summarization)
conference
Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
ICSE 2024
Adding explicit semantic facts as prompts to Large Language Models improves their performance in code summarization tasks, with notable improvements exceeding 2 BLEU and, in some cases, even surpassing 30 BLEU, demonstrating the effectiveness of this approach in enhancing code analysis and extraction of essential information.
Software Engineering, Machine Learning, Naturalness of Software
Validating Hardware and SimPoints with gem5: A RISC-V Board Case Study
poster
Kunal Pai, Zhantong Qiu, Jason Lowe-Power
ISCA 2023: gem5 Workshop
The poster discusses the development of a RISC-V board model (RISCVMatched) in gem5, along with a methodology for fine-tuning gem5 configurations to closely match real-life systems, resulting in more accurate hardware validation and simulation capabilities.
Computer Architecture, gem5
gem5 Vision
poster
Parth Shah, Kunal Pai, Harshil Patel, Arslan Ali
ISCA 2023: gem5 Workshop
The gem5 Vision Project seeks to improve user-friendliness and accessibility by introducing advanced search functionality, comprehensive resource categorization, and expanded database support within the gem5 ecosystem for researchers and developers.
Computer Architecture, gem5
Skills
Programming Languages
Python, C++, Java, JavaScript
Frameworks
React, Next.js, TensorFlow, PyTorch, Django, Flask, scikit-learn, pandas, NumPy, Matplotlib
Tools and Technologies
Git, Docker, MongoDB, gem5, Unix/Linux, LaTeX
Languages
English, Gujarati, Hindi, Spanish