Kunal Pai

I am a Master's student at the University of California, Davis, primarily working on simulation accuracy, emerging computing technologies (superconductors and cryogenic semiconductors), and using large language models for software engineering tasks.

Education

University of California, Davis
Master of Science
Computer Science
2023 - Ongoing
  • GPA: 4.0
  • Relevant Coursework: Machine Learning, Computer Security, Information Visualization, Software Engineering, Theory of Computation

University of California, Davis
Bachelor of Science
Computer Science and Engineering
2019 - 2023
  • GPA: 3.83
  • Provost Scholar, Graduated with Honors

    Work Experience

    DArchR Lab @ University of California, Davis
    Graduate Student Researcher
    June 2023 - Present
    Davis, CA
    • Leading a team to develop a superconductor-based graph accelerator in the gem5 simulator.
    • Leading a team to deliver full-system, cycle-level simulation models for cryogenic semiconductors and superconductors in gem5.
    • Collaborating with a team to develop an autotuning methodology with 90% correlation between gem5 simulation results and hardware profiling metrics.
    • Mentoring 5 undergraduate students in the Davis Computer Architecture Lab to prepare them for graduate research.
    University of California, Davis
    Student Researcher
    September 2022 - Present
    Davis, CA
    • Developing a dataset for pairwise code-documentation alignment from open-source Python projects, to assist with future tasks in software maintenance.
    • Created a pipeline for measuring calibration and correctness of large language models for code repair, using Defects4J.
    • Assisted in validating the efficacy of semantic augmentation of language model prompts for code summarization using precision and recall metrics like ROUGE and METEOR.
    University of California, Davis
    Teaching Assistant
    September 2023 - December 2023
    Davis, CA
    • Assisted 180 students in a senior-level Probability & Statistical Modeling class.
    DArchR Lab @ University of California, Davis
    Undergraduate Researcher
    June 2022 - June 2023
    Davis, CA
    • Implemented a model of the HiFive Unmatched RISC-V board in gem5, achieving 85% accuracy relative to hardware profiling metrics.
    • Authored a poster on the validation of hardware and simpoints with gem5, presented at the gem5 workshop at ISCA 2023.
    • Co-authored tutorials on perf compilation for RISC-V and documentation for the Standard Library in gem5.
    humanID
    Tech Team Lead
    January 2022 - June 2022
    Davis, CA
    • Delivered 10 completed projects with global teams, including:
      • Documentation of a Discord bot that combats spam and fake users.
      • A Django-based web application for permission management for 100 users.
    SiTime Corp.
    Technical Product Marketing Intern
    July 2021 - September 2021
    Santa Clara, CA
    • Presented a strategy to improve distributor margin management that increased profits by $250,000.
    • Conducted a market survey on optical transceivers used in AI networking, to identify customers for MEMS timing chips.
    • Created Visio diagrams for the product requirements document (PRD) of a timing chip.

    Academic Projects

    Automated Frameworks of Semantic Augmentation to Improve Mathematical Word Problem Solving
    April 2024 - June 2024
    NLP, Prompting, Machine Learning
    • Improved PaLM 2 prompting accuracy on math word problems (MWPs) by 10% and fine-tuned TinyLlama accuracy by 60% through a one-shot, digit-level semantics framework.
    • Introduced a novel demonstration selection model to improve LLM accuracy; the model uses BLEU scores and Levenshtein distance to identify the most similar equations for one-shot examples.
    The Effects of Toxicity on Disengagement in Open Source Projects
    January 2024 - March 2024
    Open Source, GitHub Mining, Data Analysis
    • Found a strong correlation ($R^2 = 0.76$) between high developer engagement in FAANG projects with larger codebases and lower levels of toxicity, offering actionable insights for community management.
    • Quantified toxic behavior by mining corporate and non-profit repositories and applying sentiment analysis, revealing that toxicity impacts new developers up to 1.3x more than experienced ones.
    What is the behavior of Spectre, a speculative prediction exploit, on the various branch predictors available in the computer architecture simulator gem5?
    October 2023 - December 2023
    gem5, Spectre, Computer Security
    Collaborators: Yuyi Li, Frank Gomez
    • Demonstrated up to a 55% reduction in susceptibility to speculative execution attacks by validating design enhancements like longer training periods and minimizing biased branches for Spectre-resistant branch predictors.
    • Investigated the vulnerability of x86-based in-order and out-of-order processors to Spectre V1 attacks, revealing a strong correlation between branch predictor training periods and attack effectiveness.
    gem5 Vision
    January 2023 - June 2023
    Next.js, MongoDB, Python, JSON Schema
    • Boosted resource discovery speed by 20x with optimized search functionality across 1,200+ resources.
    • Enabled faster retrieval of resources across 20+ categories by introducing categorization and semantic versioning.
    • Enhanced accessibility for 500+ industry and academic users by integrating local/remote JSON files and MongoDB with gem5.

    Publications / Talks

    Calibration and Correctness of Language Models for Code
    conference
    Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md Rafiqul Islam Rabin, Amin Alipour, Sushmit Jha, Premkumar Devanbu, Toufique Ahmed
    ICSE 2025
    Machine learning models often produce incorrect outputs, making reliable confidence measures essential for determining the trustworthiness of these outputs. This paper introduces a framework to evaluate and improve the calibration of code-generating models, finding that these models are generally poorly calibrated initially but can be improved using methods like Platt scaling, thereby enhancing decision-making in software engineering.
    Potential and Limitation of High-Frequency Cores and Caches
    poster
    Kunal Pai, Anusheel Nand, Jason Lowe-Power
    ModSim 2024: Workshop on Modeling & Simulation of Systems and Applications
    The poster presentation explores the potential and limitations of high-frequency in-order and out-of-order cores and caches in modern processors, highlighting the trade-offs between speedups and bandwidth.
    Automatic semantic augmentation of language model prompts (for code summarization)
    conference
    Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
    ICSE 2024
    Adding explicit semantic facts as prompts to Large Language Models improves their performance in code summarization tasks, with notable improvements exceeding 2 BLEU and, in some cases, even surpassing 30 BLEU, demonstrating the effectiveness of this approach in enhancing code analysis and extraction of essential information.
    Validating Hardware and SimPoints with gem5: A RISC-V Board Case Study
    poster
    Kunal Pai, Zhantong Qiu, Jason Lowe-Power
    ISCA 2023: gem5 Workshop
    The poster discusses the development of a RISC-V board model (RISCVMatched) in gem5, along with a methodology for fine-tuning gem5 configurations to closely match real-life systems, resulting in more accurate hardware validation and simulation capabilities.
    gem5 Vision
    poster
    Parth Shah, Kunal Pai, Harshil Patel, Arslan Ali
    ISCA 2023: gem5 Workshop
    The gem5 Vision Project seeks to improve user-friendliness and accessibility by introducing advanced search functionality, comprehensive resource categorization, and expanded database support within the gem5 ecosystem for researchers and developers.

    Skills

    Programming Languages

    Python, C++, Java, JavaScript

    Frameworks

    React, Next.js, TensorFlow, PyTorch, Django, Flask, scikit-learn, pandas, NumPy, Matplotlib

    Tools and Technologies

    Git, Docker, MongoDB, gem5, Unix/Linux, LaTeX

    Languages

    English, Gujarati, Hindi, Spanish

    Awards

    Dean's List

    UC Davis College of Engineering
    Fall 2019

    Dean's List

    UC Davis College of Engineering
    Fall 2020

    Dean's List

    UC Davis College of Engineering
    Winter 2022

    Dean's List

    UC Davis College of Engineering
    Spring 2022

    Provost Award

    UC Davis
    2019-2023