From idea to product
Agents · DevTools · Startups
Engineer · Product-minded

Utkarsh Bali

I build products and infra around AI agents, developer tools, and startups. Right now I am working on Checkpoint, an agent testing platform for dev teams.

Portrait of Utkarsh Bali

About

Hi, I am Utkarsh Bali.

I am a CS + AI student at Purdue, currently working on Checkpoint and looking to meet people who care about agents, devtools, and thoughtful engineering.

Personal note

I am a builder. As a kid, I sketched designs for quadcopters (and even Iron Man suits with a friend!) and built motor-powered cars that I called "Thrust SSC". In high school, I built the mobile apps and games I wished I had. At Purdue and beyond, I have built AI agents, devtools, healthcare tools, and research prototypes. Each project has taught me something new about building real products that people can use and rely on.

My work has crossed AI agents, LLM pipelines, healthcare tools, mobile apps, and research prototypes. I am most interested in the projects where taste and systems thinking both matter.

1

I love building real products

The best work is not just clever. It is understandable, useful, and stable enough for people to depend on.

2

I love travelling to new places

Travelling is one of my favorite ways to learn about the world and, more importantly, to understand myself.

3

I love all forms of art and expression

I believe that making art and being creative is a core human survival trait, whether through music, dance, movies, or anything else.

Built expertise at

A few places that shaped how I build.

Purdue logo
QualGent logo
Microsoft logo
Recurly logo

Now

Building Checkpoint in San Francisco.

Checkpoint is the CI/CD pipeline for AI agents. We catch harness failures before users do.

Checkpoint · CTO

Pre-production testing for agent teams.

We are building the layer between an AI agent and its first real user: generated test suites, stateful mocked tool calls, sandboxed execution, and clear failure reports. The goal is simple: catch the broken loop before it reaches production.

usecheckpoint.dev

What I am working on

Engineers submit an agent config: prompts, tools, and schemas.

Checkpoint generates adversarial multi-turn tests and runs them in a sandbox.

I am leading backend, sandbox infrastructure, and LLM orchestration as CTO.

Founding team: Ayushman Gupta, Utkarsh Bali, and Aaditya Gaur.
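
To make "stateful mocked tool calls" and "multi-turn tests" concrete, here is a minimal sketch in Python. It is not Checkpoint's actual implementation: `MockToolbox`, `run_case`, `toy_agent`, and the ticket tools are hypothetical names invented for illustration, and the "agent" is a plain function standing in for a real LLM-driven one.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class MockToolbox:
    """Hypothetical in-memory stand-in for an agent's real tools."""
    state: Dict[int, Dict[str, str]] = field(default_factory=dict)
    calls: List[Tuple[str, Dict[str, Any]]] = field(default_factory=list)

    def call(self, name: str, **kwargs: Any) -> Dict[str, Any]:
        # Record every call so a failure report can show what the agent did.
        self.calls.append((name, kwargs))
        handler = getattr(self, f"tool_{name}", None)
        if handler is None:
            raise KeyError(f"unknown tool: {name}")
        return handler(**kwargs)

    def tool_create_ticket(self, title: str) -> Dict[str, Any]:
        ticket_id = len(self.state) + 1
        self.state[ticket_id] = {"title": title, "status": "open"}
        return {"id": ticket_id}

    def tool_close_ticket(self, ticket_id: int) -> Dict[str, Any]:
        # State persists across turns, so closing depends on earlier calls.
        if ticket_id not in self.state:
            return {"error": "not_found"}
        self.state[ticket_id]["status"] = "closed"
        return {"ok": True}


Check = Callable[[Dict[str, Any], MockToolbox], bool]


def run_case(agent, turns: List[Tuple[str, Check]], toolbox: MockToolbox) -> List[str]:
    """Run a multi-turn script and return human-readable failures."""
    failures = []
    for i, (user_msg, check) in enumerate(turns, start=1):
        try:
            reply = agent(user_msg, toolbox)
        except Exception as exc:
            failures.append(f"turn {i}: agent raised {exc!r}")
            break
        if not check(reply, toolbox):
            failures.append(f"turn {i}: check failed after {user_msg!r}")
    return failures


def toy_agent(message: str, toolbox: MockToolbox) -> Dict[str, Any]:
    # Stand-in for a real agent: maps two commands onto tool calls.
    if message.startswith("open "):
        return toolbox.call("create_ticket", title=message[5:])
    if message.startswith("close "):
        return toolbox.call("close_ticket", ticket_id=int(message[6:]))
    return {"error": "unsupported"}


turns = [
    ("open login bug", lambda r, tb: r == {"id": 1}),
    ("close 1", lambda r, tb: tb.state[1]["status"] == "closed"),
    ("close 99", lambda r, tb: r.get("error") == "not_found"),
]
failures = run_case(toy_agent, turns, MockToolbox())
print(failures)  # prints []
```

The point of keeping state inside the mock is that the second turn only passes if the first one really created the ticket, which is the kind of cross-turn dependency a stateless stub would miss.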

Projects

A few things I have built.

These are the projects that best show how I think: build the core loop, make it usable, and learn from where it breaks.

Agent infrastructure · <1% failure rate

Autonomous App Crawler

At QualGent, I worked on a crawler that helped agents understand real Android apps. The hard part was keeping state, recovery, and scale sane while running across messy mobile flows.

Python · TypeScript · Docker · Kubernetes

Agent QA · >99% deterministic execution

LLM Multi-Agent QA System

I built a small multi-agent QA system for Android flows: one agent plans, one executes, one checks state, and one handles recovery when the app does something unexpected.

Python · OpenAI API · ADB · LLMs

Consumer AI · 300+ users in 22+ countries

WalleX: AI Wallpapers

WalleX was my first real consumer app. I owned the mobile UX, model integration, analytics, deployment, and monetization, then watched real users use it across countries.

Flutter · Python · Hugging Face · Firebase

Agent memory · ~35% faster retrieval

Semantic Memory GC

I built a pruning pipeline for vector memory stores so long-running agents could retrieve context faster without carrying every redundant memory forever.
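
One simple way to prune redundant memories is to drop any entry whose embedding is nearly identical to one already kept. The sketch below assumes that approach; the project description does not say which pruning criterion was actually used, and the `prune` function, the 0.95 threshold, and the two-dimensional vectors are made up for illustration.

```python
import math
from typing import List, Sequence, Tuple

Memory = Tuple[str, Sequence[float]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def prune(memories: List[Memory], threshold: float = 0.95) -> List[Memory]:
    """Greedy pass: keep a memory only if it is not a near-duplicate of a kept one."""
    kept: List[Memory] = []
    for text, vec in memories:
        if all(cosine(vec, kept_vec) < threshold for _, kept_vec in kept):
            kept.append((text, vec))
    return kept


# Toy embeddings; a real store would hold model-generated vectors.
memories = [
    ("user likes dark mode", [1.0, 0.0]),
    ("user prefers dark theme", [0.99, 0.05]),
    ("user lives in Chicago", [0.0, 1.0]),
]
kept = prune(memories)
print([text for text, _ in kept])
# prints ['user likes dark mode', 'user lives in Chicago']
```

This greedy pass compares each memory against everything already kept, so it is quadratic in the worst case; a production pipeline would more likely lean on the vector database's own nearest-neighbor search.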

Python · LLMs · Vector databases · Embeddings

Healthcare AI · ~40% less documentation overhead

Clinical AI Assistant

In Purdue research, I worked on clinical AI tools with privacy constraints: local speech-to-text, self-hosted models, and risk explanations that people could inspect.

Flutter · TypeScript · LLaMA · PyTorch

Stack

The tools I reach for when I build.

I like being able to move between various tools when the goal is to ship something real.

AI / ML

LLMs · Agents · Eval systems · RAG · Tool-calling · PyTorch · TensorFlow · LangChain · Hugging Face · SHAP

Backend

Python · TypeScript · Node.js · FastAPI · Express · gRPC · REST · GraphQL · PostgreSQL · Redis

Frontend

React · Next.js · Tailwind CSS · Flutter · Design systems · Responsive UI · Product polish

Infra / DevTools

Docker · Kubernetes · GCP · Azure · AWS · Spark · Databricks · Kafka · CI/CD · Terraform

Experience

Places I've built and learned.

A short version of the work behind the projects: internships, research, and startups.

Recurly

Software Engineer Intern

May 2026 - Aug 2026

Member of Recurly's engineering team, working on internal automation software for subscription management infrastructure.

Software engineering · SaaS · Production systems

QualGent (YC X25)

Software Engineer Intern

Sep 2025 - Dec 2025

Worked directly under the CTO on QualGent AI Assistant and App Crawler infrastructure, Android automation, and internal AI tooling. My main work was taking crawler and agent systems from prototype into production use.

Agents · Kubernetes · GCP · Python

The Data Mine

Microsoft Research Collab

Aug 2024 - May 2025

Built LLM and Spark pipelines to analyze large-scale Minecraft community data. I focused on making the workflow cheaper, easier to query, and reliable enough for repeated analysis.

LLMs · Spark · Databricks · Azure

Purdue University

AI Researcher

Aug 2024 - Present

Built research prototypes for clinical assistants, private speech workflows, and ICU risk modeling. The work mixed applied ML with product constraints like privacy and interpretability.

Research · Healthcare AI · PyTorch · LLaMA

Recognition

A few things outside the project list.

Some academic, research, and teaching context.

Mary-Ann Neel CS Scholar

Awarded to a top Purdue CS student for academic performance and technical impact.

Discovery Park Research Scholar

Recognized twice for interdisciplinary research and engineering work.

Published AI researcher

Presented AI research at Purdue research expos and talks.

KVPY Top 1%

All India Rank 1638 among 150,000+ candidates.

Teaching + community

TA for AI/data courses and helped run technical workshops for 200+ students.

San Francisco

Building toward the teams I want to be around.

I want to work with people who ship quickly, talk to users, and care about the details. Checkpoint is the center of that work for me right now.

Get in touch
San Francisco skyline at golden hour

Purdue now, San Francisco for the next chapter.

Contact

Reach out if you're building in this world.

I'm always happy to talk about agent tooling, devtools, startups, internships, or teams that care about shipping carefully.

baliutkarsh2@gmail.com · Purdue University · CS + AI