
Kunj Shah

AI Agent Intern | LLM Developer | ML Researcher

LLM Research & Development

How LLMs Are Made

An all-in-one GitHub repo documenting my hands-on journey building and experimenting with LLMs, from GPT, DeepSeek, and Kimi architectures to advanced techniques like MoE, MoD, MHLA, and MLA. Includes code, experiments, insights, and resources.
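
For illustration only (a sketch, not code from the repo): a minimal top-k Mixture-of-Experts layer in PyTorch, where a learned router sends each token to its two highest-scoring expert MLPs. The class name and dimensions are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Sparse Mixture-of-Experts layer: a learned router picks the top-k expert MLPs per token."""

    def __init__(self, d_model: int, d_ff: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.reshape(-1, x.size(-1))                 # flatten (batch, seq) into tokens
        weights, idx = self.router(tokens).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)               # mixing weights over the chosen experts
        out = torch.zeros_like(tokens)
        for slot in range(self.k):                         # for each of the k routing slots...
            for e, expert in enumerate(self.experts):      # ...dispatch the tokens routed to expert e
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(tokens[mask])
        return out.reshape_as(x)

layer = TopKMoE(d_model=64, d_ff=256)
print(layer(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```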

Technical Insights & Documentation
Full-Stack LLM Solutions
Kimi, GPT, DeepSeek Architectures
GatorGPT

A lightweight 63M-parameter transformer-based language model with a modern architecture, built for my university. Features Grouped Query Attention, Rotary Position Embeddings (RoPE), and SwiGLU activation. Deployed with vLLM and available as a Docker image (kunjcr2/gatorgpt).
Eval loss dropped from ~246 to ~1.503.

Fast inference with torch.compile and FlashAttention
Memory-efficient: Grouped Query Attention shrinks the KV cache (sketched below)
One-click Docker deployment with vLLM
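
As a rough sketch of the GQA idea (not GatorGPT's actual implementation; head counts and weights here are made up): query heads are grouped to share a smaller set of key/value heads, so the KV cache shrinks by the ratio n_heads / n_kv_heads.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(x, wq, wk, wv, n_heads=8, n_kv_heads=2):
    """Toy GQA: many query heads attend over a smaller set of shared key/value heads."""
    b, t, d = x.shape
    hd = d // n_heads
    q = (x @ wq).view(b, t, n_heads, hd).transpose(1, 2)      # (b, n_heads, t, hd)
    k = (x @ wk).view(b, t, n_kv_heads, hd).transpose(1, 2)   # (b, n_kv_heads, t, hd)
    v = (x @ wv).view(b, t, n_kv_heads, hd).transpose(1, 2)
    group = n_heads // n_kv_heads
    k = k.repeat_interleave(group, dim=1)                     # each KV head serves a group of query heads
    v = v.repeat_interleave(group, dim=1)
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
    return out.transpose(1, 2).reshape(b, t, d)

d_model, n_heads, n_kv_heads = 64, 8, 2
x = torch.randn(1, 16, d_model)
wq = torch.randn(d_model, d_model)
wk = torch.randn(d_model, d_model // n_heads * n_kv_heads)    # smaller K/V projections
wv = torch.randn(d_model, d_model // n_heads * n_kv_heads)
print(grouped_query_attention(x, wq, wk, wv, n_heads, n_kv_heads).shape)  # (1, 16, 64)
```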

Experience

Projects

Max - AI Voice Assistant
90% voice accuracy · 8 tools · LangChain/OpenAI

Developed a voice-activated AI assistant using LangChain, OpenAI, Hugging Face, and SpeechRecognition to automate tasks like web search, YouTube streaming, and emailing, enabling hands-free interaction.

GitHub
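
A minimal tool-calling sketch in the spirit of Max, assuming LangChain's OpenAI integration (`langchain_core`, `langchain_openai`) and an `OPENAI_API_KEY` in the environment; the tool stubs and model name are placeholders, not the assistant's real tools.

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def web_search(query: str) -> str:
    """Search the web and return a short summary (stub for illustration)."""
    return f"Top results for: {query}"

@tool
def play_on_youtube(video: str) -> str:
    """Open a YouTube video by title (stub for illustration)."""
    return f"Now playing: {video}"

# Bind the tools so the chat model can decide which one a voice command needs.
llm = ChatOpenAI(model="gpt-4o-mini").bind_tools([web_search, play_on_youtube])

response = llm.invoke("Play lo-fi beats on YouTube")
print(response.tool_calls)  # e.g. a call to play_on_youtube with {'video': 'lo-fi beats'}
```
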
Llama-3.2-3B Fine-tuned on OpenHermes
~300k QA pairs · LoRA fine-tuning · Train loss 1.27 → 0.21 · vLLM + Docker

A Llama-3.2-3B base model instruction-tuned with LoRA on the OpenHermes dataset. The run turned the base model into an instruct-capable assistant while updating only ~0.75% of its parameters, keeping it lightweight and deployment-friendly; it is packaged as a Docker image (kunjcr2/llama-3.2-3b-openhermes) for reproducible serving with vLLM.
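
A minimal LoRA setup sketch with Hugging Face `peft`; the rank, target modules, and other hyperparameters are assumptions rather than the exact configuration behind this run, and the gated `meta-llama/Llama-3.2-3B` checkpoint requires Hugging Face access.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B")

lora_cfg = LoraConfig(
    r=16,                       # low-rank adapter dimension (placeholder)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # reports trainable vs. total parameter counts and the trainable %
```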

Qwen2.5-0.5B SFT + DPO
85M tokens (SFT) · 1.48 val loss (SFT) · 66% reward accuracy (DPO)

A two-stage pipeline: the model was first trained on 85M tokens with supervised fine-tuning, reaching a validation loss of 1.48, and then optimized with Direct Preference Optimization (DPO) to achieve 66% reward accuracy. This demonstrates how foundational instruction tuning can be reinforced through preference optimization to improve reasoning quality.
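
A sketch of the DPO stage using Hugging Face `trl` (recent versions); the preference dataset and hyperparameters are placeholders, not the setup that produced the numbers above, and in practice the SFT checkpoint, rather than the raw base model, would be loaded.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "Qwen/Qwen2.5-0.5B"                  # the SFT checkpoint would go here in practice
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# DPO trains on (prompt, chosen, rejected) preference pairs
prefs = load_dataset("trl-lib/ultrafeedback_binarized", split="train[:1%]")

args = DPOConfig(
    output_dir="qwen2.5-0.5b-dpo",
    beta=0.1,                                    # strength of the preference constraint
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = DPOTrainer(model=model, args=args, train_dataset=prefs, processing_class=tokenizer)
trainer.train()
```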

More projects on GitHub

Technical Skills

Programming Languages

Python JavaScript Java C++ HTML/CSS SQL

AI Tools & Frameworks

LangChain LangFlow n8n RAG Pipelines OpenAI API Hugging Face Transformers MCP Servers Vector Databases Prompt Engineering

Machine Learning & Deep Learning

PyTorch TensorFlow Scikit-learn Keras OpenCV Pandas NumPy Matplotlib NLP Computer Vision LoRA Neural Networks Weights & Biases Encoder–Decoder Models Reinforcement Learning DPO PPO

MLOps & Deployment

Docker vLLM Serving Hugging Face Hub Model Deployment GPU Optimization Distributed Training Vertex AI Git

Web Development

Node.js React.js Flask TailwindCSS Express.js

Database & Development Tools

MongoDB MySQL Vertex AI Git Docker

LLM Architectures & Systems

Transformers Attention Mechanisms Pretraining Finetuning Tokenizers vLLM Optimization Mixture of Experts Mixture of Recursions Mixture of Depths Rotary Positional Encodings Multi-Token Prediction Flash Attention Sliding Window Attention Reasoning Models HRMs GPU Training Distributed Learning

Core CS & Problem Solving

Data Structures & Algorithms Binary Trees & BSTs Graph DFS/BFS Dynamic Programming SQL Querying

Hackathons

MCP AWS Agentic Challenge

Where: AWS Builder Loft, SF

When: July 25, 2025

Project: Nango Automation

Cal Hacks 11.0

Where: San Francisco, CA

When: October 18, 2024 – October 20, 2024

Project: Workout Web App

SacHacks

Where: Virtual Hackathon

When: March 2, 2025 – March 3, 2025

Project: Web Detective

HackMerced

Where: University of California, Merced

When: March 9, 2025 – March 11, 2025

Project: Web Detective (Updated)

Certificates

    • Programming for Everybody (Getting Started with Python) – University of Michigan
    • Python Data Structures – University of Michigan
    • Crash Course on Python – Google
    • Calculus through Data and Modelling: Series and Integration – Johns Hopkins University
    • Calculus through Data and Modelling: Techniques of Integration – Johns Hopkins University
    • Calculus through Data and Modelling: Integration Applications – Johns Hopkins University
    • Calculus through Data and Modelling: Vector Calculus – Johns Hopkins University
    • Introduction to Web Development – UC Davis
    • Understanding Einstein: The Special Theory of Relativity – Stanford University
    • Introduction to Complex Analysis – Wesleyan University
    • Understanding Basic SQL Syntax – Coursera Project Network
    • C++ Basics: Selection and Iteration – Codio
    • Building a Text-based Bank – Coursera Project Network
    • Create a Supermarket app using Java OOP – Coursera Project Network
    • Python 101: Develop Your First Python Program – Coursera Project Network
    • Letter of Recommendation by Duc Ta – CSC215
    • Letter of Recommendation by Maitra Shah – Internship Certificate

About Me

I am a second-year Computer Science student at San Francisco State University focused on AI/ML and full-stack development. As Tech Director at SparkSF, I work across machine learning, NLP, and MERN-stack development. My notable projects include AI-powered applications such as the 'theHelper' research assistant and the 'Max' voice assistant, and I've competed in hackathons including Cal Hacks 11.0 and SacHacks, building projects like a Workout Web App and Web Detective. With strong foundations in Python, JavaScript, and AI frameworks including Hugging Face Transformers and OpenCV, I combine academic excellence (JEE qualifier) with practical development experience to deliver impactful solutions.

Connect with Me