
Sohum Kashyap.

> cs + math @ purdue · ai/ml researcher · systems builder

Building agentic systems and ML infrastructure for science — from RNA structure prediction to on-chain prediction markets.

01 / manifesto

A short read of how I think about the work.

I build systems where the math is the product — pipelines that turn rough scientific intuition into something a machine can compute, train, and ship.

My work lives at the intersection of biomedical AI, large-scale ML, and the unglamorous infrastructure that holds it together: containers, schedulers, vector stores, and the occasional smart contract.

I'm drawn to the parts of a problem where a careful model and a careful system are the same thing.

  • 3 Research Labs: Argonne · UPenn · Purdue
  • 3 Publications: SC'25 · AAAI'25 · PSURC
  • 8+ Languages Shipped: Py · C · Rust · TS · Swift
  • 3+ Years Researching: since 2023 · HPC + ML

02 / research & publications

Papers that shipped.

01
2025 · Co-author

An AI Agentic Framework for Understanding Low-Dose Radiation Effects on Human Lung Epithelial Cells

SC'25 — Intl. Conference for High Performance Computing

Claybon, Kashyap, Conery, Rodriguez, Li, Wu, Nandi, Madduri

Co-built an Ollama-driven agentic framework, in Docker, that orchestrates LLM agents to reason over low-dose radiation gene-expression data.

02
2024 · Sole Author

Assessing a Knowledge Graph Framework for Combating Hallucinations in Large Language Models

Argonne National Laboratory

Sohum Kashyap

Designed a knowledge-graph-assisted biomedical LLM pipeline (LangChain, Neo4j, PyTorch) that materially reduces hallucinations in disease-related generation.

03
2026 · Co-author · Presentation with Distinction

Evaluating Tradeoffs Between Robustness, Fairness, and Model Integrity Through Controlled Tool Perturbations

Purdue Spring Undergraduate Research Conference

Patel, Chelliboyina, Kashyap, Ghodsi

Quantified the fairness-robustness frontier of the Landseer ML pipeline under controlled adversarial perturbations on HPC clusters.

03 / experience

Labs, startups, side quests.

Jan 2026 – Present · West Lafayette, IN

Purdue University · Landseer ML Pipeline

Undergraduate Researcher
  • Built a fairness-enhancing tool for the Landseer ML pipeline; +30% in-training fairness.
  • Evaluated adversarial attacks and defenses on SLURM-managed HPC clusters; new PyTorch modules reduced the rate of successful attacks.
  • Abstract earned 'Presentation with Distinction' at Purdue Spring Undergraduate Research Conference.
Python · PyTorch · Pandas · NumPy · Docker · Apptainer · SLURM

Aug 2025 – Present · West Lafayette, IN

Purdue University · Kihara Lab

Undergraduate Researcher
  • Developing a novel foundation model for RNA structure prediction with the Kihara Lab.
  • Prototype improves prediction of RNA secondary structure using SHAPE reactivity data.
  • Runs on Linux HPC clusters via SLURM; reproducible builds with Docker / Biopython.
Python · Biopython · PyTorch · SLURM · Docker · Bash

Oct 2023 – Aug 2025 · Lemont, IL

Argonne National Laboratory

Research Assistant
  • Built an Ollama-based agentic framework (Docker) to study low-dose radiation effects — presented at SC'25.
  • Applied a transformer model with in-silico perturbation to identify transitional genes; +25–40% consistency on sparse data vs. classical statistical baselines.
  • Developed a knowledge-graph-assisted biomedical LLM pipeline (LangChain · Neo4j · PyTorch) to reduce hallucinations.
Transformers · LangChain · ScanPy · PyTorch · R · Python · HPC · SLURM

May 2024 – Aug 2024 · Philadelphia, PA (Remote)

University of Pennsylvania

Researcher
  • Designed a knowledge-graph-based retrieval framework for biomedical LLMs.
  • Improved Faithfulness by up to 60% and Context Recall by up to 36%.
  • Co-authored a joint biomedical LLM paper with UIUC; presented at AAAI 2025.
GraphRAG · LLM · Python · LangChain · Neo4j · HPC

Nov 2022 – Aug 2024 · Bay Area (Remote)

Stealth Startup (NDA)

AI + Backend Developer
  • Designed and deployed an LSTM on Google Cloud Run with Pub/Sub for real-time inference under 500 ms.
  • Built the MVP architecture on production-grade GCP infrastructure (Docker · Cloud Run).
  • Helped secure $35K in pre-seed funding and early investor traction.
Python · TensorFlow · GCP · Cloud Run · Pub/Sub · Docker

education

Purdue University

B.S. Computer Science & Mathematics/Statistics (Double Major)

Linear Programming · Multivariate Calculus · Elementary Linear Algebra · OOP · C

Aug 2025 – May 2029 · West Lafayette, IN

Illinois Mathematics and Science Academy

High School Diploma

Machine Learning · OOP (Java) · Computational Science · Linux/UNIX

Aug 2022 – May 2025 · Aurora, IL

04 / selected work

Built, shipped, on-chain & otherwise.

complete catalog

All projects, grouped by type.

14 repos

05 / stack

The toolbox.

sohum@purdue:~ $ cat languages.txt
Python
C / C++
Java
TypeScript
Rust
Solidity
Swift
C#
sohum@purdue:~ $ cat ml_&_systems.txt
PyTorch
TensorFlow
LangChain · RAG
Transformers / LLMs
Neo4j · GraphRAG
ScanPy · Biopython
Anchor · Foundry
sohum@purdue:~ $ cat cloud_&_infra.txt
GCP · Cloud Run · Pub/Sub
AWS
Docker · Apptainer
SLURM · HPC
Supabase · Postgres
Linux
06 / contact

Let's build something at the edge.

Open to research collaborations, internships, and freelance work in ML / systems / on-chain infra. Cold emails welcome — the weirder the question, the better.

sohum@purdue:~/contact $
github::S-K-23
linkedin::sohumkashyap
location::West Lafayette, IN
   _____    __ __
  / ___/   / //_/
  \__ \   / ,<
 ___/ /  / /| |
/____/  /_/ |_|  
© 2026 sohum kashyap · all rights reserved
built with next.js, canvas, framer-motion · 2026