Large Language Models: architecture, pre-training, and alignment safety.
I am a deep learning engineer focused on optimizing model capability, safety frameworks, and scaling efficiency. I research transformer pre-training architectures and develop safety-critical alignment frameworks to prevent prompt injections and tool exploitation.
Core Areas
Pre-training & Adaptation
Pre-training medical and domain-specific transformers using SwiGLU, Grouped Query Attention, and Rotary Embeddings. Architecting memory-efficient adaptations using LoRA/QLoRA on cloud clusters.
Alignment & Model Safety
Implementing alignment feedback loops (SFT, DPO, GRPO) for safety and instruction compliance. Developing inline BERT-based firewalls that run with low latency (~20ms) to filter prompt injections.
Experience & Education
Chronology of professional internships and academic coursework.
May 2026 — Aug 2026
A10 Networks
San Jose, CA
Deep Learning Intern — LLM Research & Model Safety
Incoming intern focusing on Large Language Model (LLM) safety frameworks, alignment strategies, and interpretability research. Special emphasis on building robust alignment feedback evaluations.
Oct 2025 — Present
Routes Technologies
Remote, TX
AI Engineering Intern
Collaborating with the team to train, benchmark, and deploy AI models. Responsible for dataset pipelines, robust tool integration, and cloud serving infrastructure.
- Engineered a fully functional Web Crawler with Scrapy and BeautifulSoup4, enabling ethical extraction of company-relevant data from open web sources.
- Developed a Flask-based Instagram Graph API integration leveraging Python and Pydantic, featuring OAuth Authentication and intelligent hashtag/recipe detection.
- Deployed AI model endpoints via Azure ML Studio managed endpoints, configuring load balancing and auto-scaling for production-grade reliability.
May 2025 — Aug 2025
Dreamable Inc.
San Francisco, CA
AI Engineering Intern
Contributed to fine-tuning Qwen-2.5-7B using Hugging Face, PyTorch, and LoRA on Lambda Cloud for cost- and memory-efficient training; the model was deployed on GCP Cloud Run.
- Spearheaded a dataset curation pipeline using Pandas, NumPy, and the Hugging Face Datasets library for Q&A task optimization.
- Tuned model hyperparameters to achieve low validation loss, tracking runs and metrics via Weights & Biases.
- Built an AI-powered Outreach Agent using LangChain, Exa.ai, and OpenAI API to automate and scale messaging workflows.
Education
San Jose State University
Jan 2026 — May 2027
B.S., Computer Science · GPA 3.94 / 4.00
- Researching Next Sentence Prediction mechanisms with Professor William.
- Participated in CodePath Advanced Data Structures & Algorithms training.
- Dean's List Honoree.
Coursework: Advanced Data Structures & Algorithms · Computer Architecture · Full Stack Software Engineering
San Francisco State University
Jul 2023 — Dec 2025
Computer Science (Transferred)
- Vice President of AI Club & Tech Lead at SparkSF.
- Hosted SFHacks (400+ attendees).
- Dean's List Honoree.
Projects & Models
Pre-trained models, adapters, and open source repositories.
Repositories
LLM Firewall for Agentic Tool-Calling
Low-latency inline defense intercepting prompt injections before tools execute.
- A fine-tuned GPT-2 attacker loop reduced the bypass rate from 23.1% to 3.72%.
- BERT classifier with LoRA adapter adds only ~20ms latency.
theHelper — AI Research Assistant
Production RAG service backed by local FAISS vector index, LangChain recursive chunking, and cross-encoder reranking.
- Local observability layer tracing query-metrics into daily JSONL.
- Automated QA evaluation gated in a regression-prevention CI pipeline.
Kanting — Video RAG System
Indexes YouTube video transcripts using Whisper. Performs semantic search across a Sentence-Transformers database, returning Claude clip coordinates.
End-to-End LLM Post-Pretraining Pipeline
Production post-pretraining script covering Supervised Fine-Tuning (SFT) and GRPO policy alignment. Configured on StableLM 1.6B datasets.
GatorGPT
63M-parameter transformer model optimized for consumer GPUs. Implements Grouped Query Attention (GQA) and Rotary Position Embeddings (RoPE).
How LLMs Are Made
Deep architectural logs and code constructing GPT-2, DeepSeek MoE, and Kimi from absolute scratch.
Hugging Face Weights
MedAssistGPT
303M & 401M · Medical-domain transformers pretrained on PubMed scientific papers. Implements SwiGLU, Grouped Query Attention, and Rotary Embeddings.
Qwen2.5-0.5B SFT+DPO
0.5B · Fine-tuned chat model using Supervised Fine-Tuning and Direct Preference Optimization adapters on instruction-following datasets.
Llama-3.2-3B OpenHermes
3B · QLoRA adaptation of Llama-3.2-3B on filtered OpenHermes conversational datasets, tracking low validation scores.
StableLM 1.6B SFT+GRPO
1.6B · A StableLM base model aligned via Group Relative Policy Optimization on PKU safety preferences for secure utility.
Skills & Credentials
Technical expertise, hackathons, and certifications.
Programming
Python · SQL · Java · JavaScript · C++ · HTML5/CSS · TailwindCSS · Bash · R · Git · Linux
ML & NLP
PyTorch · TensorFlow · Scikit-learn · Pandas · NumPy · Neural Networks · LangChain · OpenCV · FAISS · Sentence Transformers · Whisper · BART · Matplotlib
LLM Engineering
Transformers · SFT/DPO/GRPO/PPO · Model Training & Inference · vLLM · LoRA/PEFT · TRL · Quantization · Vector Databases · Prompt Engineering
Backend & Cloud
FastAPI · Flask · Streamlit · PostgreSQL · MongoDB · MySQL · Pydantic · Docker · Azure ML Studio · GCP · Scrapy · BeautifulSoup · OAuth · Nginx · CI/CD
Hackathons
Palace of Fine Arts, SF · Project: Kanting
AWS Builder Loft, SF · Project: Nango Automation
Virtual · Project: Web Detective
UC Merced · Project: Web Detective
San Francisco, CA · Project: Workout Web App
Certificates
- AI Memory: Exploring and Building LLM Memory Systems — LinkedIn (Jul 2025)
- Automate Development Tasks with OpenAI's Codex — LinkedIn (Jul 2025)
- Fine-Tuning for LLMs: From Beginner to Advanced — LinkedIn (Jul 2025)
- Model Context Protocol (MCP): Hands-On with Agentic AI — LinkedIn (Jul 2025)
- Introduction to Generative AI — Google Cloud Education (Jan 2024)
- Crash Course on Python — Google Career Certificates (Jan 2024)
- Introduction to Programming Using Python — University of Michigan (Apr 2023)
- Python — University of Michigan
- Introduction to Web Development — UC Davis (Jan 2024)
- Introduction to Complex Analysis — Wesleyan University (Jul 2022)
- Understanding Einstein: The Special Theory of Relativity — Stanford University (Mar 2022)
- Calculus through Data & Modelling: Integration Applications — Johns Hopkins University
- Calculus through Data & Modelling: Series and Integration — Johns Hopkins University
- Calculus through Data & Modelling: Techniques of Integration — Johns Hopkins University
- Calculus through Data & Modelling: Vector Calculus — Johns Hopkins University
Get in Touch
Send a brief message to start a collaboration.
If you have inquiries about custom fine-tuning runs, alignment evaluation, or model safety audits, get in touch directly:
Direct
kunjcr2@gmail.com