Large Language Models: architecture, pre-training, and alignment safety.

I am a deep learning engineer focused on optimizing model capability, safety frameworks, and scaling efficiency. I research transformer pre-training architectures and develop safety-critical alignment frameworks to prevent prompt injections and tool exploitation.

See my LLM knowledge in this repo ↗

Core Areas

Pre-training & Adaptation

Pre-training medical and domain-specific transformers using SwiGLU, Grouped Query Attention, and Rotary Embeddings. Architecting memory-efficient adaptations using LoRA/QLoRA on cloud clusters.

Alignment & Model Safety

Implementing alignment feedback loops (SFT, DPO, GRPO) for safety and instruction compliance. Building multimodal input/output guardrails for VLM applications: chain-of-thought safety classification served at sub-200ms latency via vLLM, with false-negative-rate diagnostics per policy category.

Experience & Education

Chronology of professional internships and academic coursework.

May 2026 — Present

A10 Networks

San Jose, CA

Deep Learning Engineer Intern — AI Security Team

Built and deployed a multimodal ML guardrail system covering text, image+text, and multi-turn inputs in Python/PyTorch, serving production inference on in-house Hopper GPU infrastructure as part of a 5-person AI Security team.
Trained two separate LLM guardrail models (text-only and multimodal) with SFT on chain-of-thought rationales, then merged them with SLERP for unified generative-AI safety coverage without retraining on combined data. Deployed via vLLM with CI/CD at sub-200ms latency, outperforming leading public guardrail models on benchmarks.
Built a multi-stage data annotation and feature engineering pipeline (Pandas, Hugging Face) with automated quality filtering and inter-annotator agreement metrics, versioned with Git; designed a Weights & Biases evaluation framework to communicate diagnostics to non-technical stakeholders and drive safety policy decisions.

Python · PyTorch · SFT · SLERP · vLLM · Hugging Face · W&B · CI/CD
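
The SLERP merge mentioned in this entry can be illustrated with a minimal sketch. It assumes both checkpoints share one architecture and that each parameter has been flattened to a plain list of floats; the helper names `slerp` and `merge_state_dicts` and the toy weights are hypothetical, not the production code.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped so acos stays defined.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(norm0 * norm1, eps)
    dot = max(-1.0, min(1.0, dot))
    omega = math.acos(dot)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < 1e-6:
        # Nearly (anti)parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

def merge_state_dicts(sd_a, sd_b, t=0.5):
    """Merge two checkpoints parameter by parameter (hypothetical helper)."""
    return {name: slerp(t, sd_a[name], sd_b[name]) for name in sd_a}

# Toy stand-ins for the text-only and multimodal guardrail weights.
text_model = {"layer.weight": [1.0, 0.0]}
multimodal_model = {"layer.weight": [0.0, 1.0]}
merged = merge_state_dicts(text_model, multimodal_model, t=0.5)
```

With `t=0.5` the merged vector sits halfway along the arc between the two models rather than on the straight line between them, which is the property that distinguishes SLERP from plain weight averaging.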

Oct 2025 — Present

Routes Technologies

Remote, TX

AI Engineer Intern

Architected AI/ML pipelines in Python, SQL, and Azure Cloud, deploying models in production across 5 managed online endpoints with CI/CD and Weights & Biases observability.
Built an LLM-powered NL-to-SQL pipeline with few-shot prompting, 470+ synonym mappings for input normalization, and parameterized SQL sanitization; developed an SVD collaborative filtering recommender (scikit-learn, Pandas), evaluated with NDCG@5 and Precision/Recall@K in a nightly ETL job.
Engineered multi-source data ingestion (Scrapy, GPT-4o-mini classification) and an ingredient normalization pipeline using sentence-transformer embeddings (all-MiniLM-L6-v2) with tiered matching (exact match → 5,400+ synonym table → cosine similarity), deployed as chained Azure ML pipelines achieving 99%+ resolution coverage.

Python · SQL · Azure ML · scikit-learn · Sentence-Transformers · W&B · Scrapy
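
The tiered matching described in this entry (exact match, then synonym table, then cosine similarity) might look roughly like the sketch below. Everything in it is an assumption for illustration: the canonical list, the synonym table, the 0.6 threshold, and especially `embed`, which uses character-bigram counts as a dependency-free stand-in for the all-MiniLM-L6-v2 embeddings.

```python
import math
from collections import Counter

# Hypothetical data; the real pipeline uses a 5,400+ entry synonym table.
CANONICAL = ["tomato", "scallion", "chickpea"]
SYNONYMS = {"green onion": "scallion", "garbanzo bean": "chickpea"}

def embed(text):
    # Stand-in embedding: character-bigram counts (not a sentence transformer).
    padded = f" {text} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def resolve(raw, threshold=0.6):
    """Return (canonical_name, tier); fall through the tiers in order."""
    name = raw.strip().lower()
    if name in CANONICAL:                      # Tier 1: exact match
        return name, "exact"
    if name in SYNONYMS:                       # Tier 2: synonym table
        return SYNONYMS[name], "synonym"
    query = embed(name)                        # Tier 3: nearest neighbour
    best = max(CANONICAL, key=lambda c: cosine(query, embed(c)))
    if cosine(query, embed(best)) >= threshold:
        return best, "similarity"
    return None, "unresolved"

print(resolve("Tomato"))
print(resolve("green onion"))
print(resolve("tomatoes"))
```

Ordering the tiers from cheapest to most expensive means the embedding lookup only runs for inputs the two table lookups cannot resolve, and the threshold keeps weak matches from being counted as resolved.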

May 2025 — Aug 2025

Dreamable Inc.

San Francisco, CA

AI/ML Engineering Intern

Trained, tuned, and deployed ML models in a production cloud environment using Python, PyTorch, TensorFlow, and Hugging Face, working with a cross-functional team.
Fine-tuned Qwen-2.5-7B via LoRA on GCP to lower training cost while delivering a Q&A model with comparable accuracy within budget, tracking experiments with Weights & Biases.
Curated NLP datasets (Pandas, NumPy, Hugging Face Datasets) to improve training data quality, and built a generative-AI outreach agent with LangChain and the OpenAI API to automate messaging workflows and increase response rates.

Python · PyTorch · TensorFlow · Hugging Face · LoRA · GCP · LangChain · W&B

Education

San Jose State University

2026 — 2027

B.S. Computer Science · GPA 3.94

NSP research with Professor William
CodePath Advanced DSA training
Dean's List Honoree

San Francisco State University

2023 — 2025

Computer Science (Transferred)

VP of AI Club & Tech Lead at SparkSF
Hosted SFHacks (400+ attendees)
Dean's List Honoree

Projects & Models

Pre-trained models, adapters, and open-source repositories.

Repositories

LLM Firewall for Agentic Tool-Calling

Low-latency inline defense that intercepts prompt injections. A GPT-2 attacker loop reduced the bypass rate from 23.1% to 3.72%. BERT + LoRA adds ~20ms of latency.

GitHub ↗

theHelper — AI Research Assistant

Production RAG with FAISS, LangChain chunking, and cross-encoder reranking. Local observability logged to daily JSONL files. CI-gated QA.

GitHub ↗

Kanting — Video RAG System

Indexes YouTube transcripts via Whisper. Semantic search across a Sentence-Transformers DB returns precise clip timestamps.

GitHub ↗

End-to-End LLM Post-Pretraining

SFT and GRPO policy alignment pipeline on StableLM 1.6B.

GitHub ↗ HF ↗ W&B ↗

GatorGPT

63M-parameter transformer for consumer GPUs, built with Grouped Query Attention and RoPE.

GitHub ↗ HF ↗

How LLMs Are Made

Annotated code building GPT-2, DeepSeek MoE, and Kimi from scratch.

GitHub ↗

Hugging Face Weights

MedAssistGPT 303M & 401M

Medical-domain transformers on PubMed. SwiGLU, GQA, RoPE.

HuggingFace ↗

Qwen2.5-0.5B SFT+DPO · 0.5B

Chat model fine-tuned with SFT and Direct Preference Optimization.

HuggingFace ↗

Llama-3.2-3B OpenHermes · 3B

QLoRA on filtered OpenHermes conversational datasets.

HuggingFace ↗

StableLM 1.6B SFT+GRPO · 1.6B

Aligned via GRPO on PKU safety preferences.

HuggingFace ↗

Skills & Credentials

Technical expertise, hackathons, and certifications.

LLM Engineering

Transformers · SFT · DPO · GRPO · PPO · LoRA/PEFT · TRL · vLLM · Quantization · Vector DBs · Prompt Eng.

ML & NLP

PyTorch · TensorFlow · Scikit-learn · LangChain · FAISS · Whisper · BART · Sentence Transformers · Pandas · NumPy

Backend & Cloud

FastAPI · Flask · Docker · Azure ML · GCP · PostgreSQL · MongoDB · Scrapy · Nginx · CI/CD

Programming

Python · SQL · Java · JavaScript · C++ · Bash · R · HTML/CSS · Git · Linux

Hackathons

Cal Hacks 12.0 — Palace of Fine Arts, SF · Oct 2025

MCP AWS Agentic Challenge — AWS Builder Loft, SF · Jul 2025

SacHacks — Virtual · Mar 2025

HackMerced — UC Merced · Mar 2025

Cal Hacks 11.0 — San Francisco · Oct 2024

Certificates

AI Memory: LLM Memory Systems — LinkedIn
Fine-Tuning for LLMs: Beginner to Advanced — LinkedIn
Model Context Protocol (MCP) — LinkedIn
Introduction to Generative AI — Google Cloud
Introduction to Web Development — UC Davis
Programming in Python — University of Michigan
Special Theory of Relativity — Stanford University
Calculus through Data & Modelling (×4) — Johns Hopkins

Get in Touch

Send a brief message to start a collaboration.

Open to inquiries about custom fine-tuning runs, alignment evaluation, and model safety audits.

Direct

kunjcr2@gmail.com