02 · Experience

Recent work.

2026

1 role
February 2026 - Present · Portland, Maine

Research Assistant

Northeastern University

Building a vision-language model for automated interpretation of veterinary fine needle aspirate cytology, focused on mast cell tumor detection and grading. The end-to-end pipeline fine-tunes MedGemma 1.5 4B with QLoRA on a MedSigLIP encoder and is deployed on Databricks with MLflow and Unity Catalog.

  • Designed an end-to-end VLM fine-tuning pipeline using MedGemma 1.5 4B with QLoRA (4-bit quantization + LoRA adapters) on a MedSigLIP vision encoder for multi-class mast cell tumor classification and cytologic interpretation; built in PyTorch with HuggingFace Transformers, TRL (SFTTrainer), PEFT, and bitsandbytes.
  • Engineered a multi-channel image preprocessing pipeline that merges 4 fluorescence/brightfield channels (bf_green, bf_violet, fl_uv, fl_blue) into pseudo-RGB inputs via per-image 1st/99th-percentile (P1/P99) normalization.
  • Analyzed ~8M single-channel cell images across 19 channels, mapped structured vs unstructured pathology fields, and isolated 2,653 disease-relevant cases spanning 9 grade categories and 66 ground-truth annotated runs.
  • Curated a 5-task VQA dataset (structured reporting, pathological process identification, key finding extraction, cell type classification, cytologic interpretation) by converting hierarchical pathologist dropdown annotations across 4 branches and 8 follow-up question types into natural-language Q&A pairs.
  • Encoded reasoning from 4 cytologic grading systems (Camus, Paes, Kiupel, Patnaik) as chain-of-thought training signals; built interactive visualizations of the diagnostic decision tree to align the team on annotation schema.
  • Deployed Databricks training infrastructure with auto-detection across 4 hardware tiers (T4, A10G, A100, H100), pre-loaded image caching that eliminated S3 I/O during training, custom callbacks tracking token accuracy and validation loss, and MLflow + Unity Catalog for experiment tracking and model registry.
  • Architected the Unity Catalog schema (catalogs, schemas, volumes for images, Q&A pairs, model artifacts); evaluated checkpoints with token-level accuracy, ROUGE-L, F1, precision, recall, and perplexity across train/validation/test splits.
MedGemma 1.5 4B · MedSigLIP · QLoRA · PyTorch · HuggingFace Transformers · TRL · PEFT · bitsandbytes · Databricks · Unity Catalog · MLflow · Spark

2025

1 role
January 2025 - August 2025 · Gandhinagar, India

Artificial Intelligence Engineer

BigCircle (UPSAAS Technologies LLP)

Built an end-to-end Deep Research pipeline and shipped pagination/auth systems for a high-concurrency platform; collaborated with a 5-engineer Agile team.

  • Built an end-to-end Deep Research pipeline in Python orchestrating LLM-based prompt generation, Firecrawl API for web scraping, OpenAI ChatGPT API for content summarization, graph visualization with Matplotlib and Seaborn, and automated Typst PDF report generation; reduced processing time by 75% through API call batching and concurrency tuning.
  • Engineered pagination and authentication systems in JavaScript and Next.js, reducing page load times by 40%; containerized with Docker to support 500+ concurrent sessions.
  • Collaborated with a 5-engineer team in Agile sprints using Git for version control; contributed to code reviews and CI workflow standardization.
Python · Firecrawl API · OpenAI ChatGPT API · Typst · Matplotlib · Seaborn · JavaScript · Next.js · Docker · Git · Agile Methodologies
Reach out

Got a question? Let's talk.