itsjaya — AI Portfolio
AI-powered personal portfolio with a RAG chatbot (Avocado), MDX blog, engagement analytics, and a fully automated deploy pipeline. Every layer is production-grade — not a toy.
Overview
Most portfolios are static pages someone scrolls past in 30 seconds. This one starts a conversation.
Avocado is an AI assistant backed by a hybrid RAG pipeline. It answers questions about experience, projects, skills, and blog posts in real time — with tokens streaming directly to the browser. Ask it "what's your strongest AI project?" and it retrieves the most relevant knowledge chunks, reranks them, and streams a grounded answer through Gemini.
The portfolio half is a fully static Next.js site: experience, education, projects, and a blog with per-post views and claps tracked in SQLite. The whole system auto-deploys on every git push with zero manual steps.
Why build it this way
A traditional portfolio is a one-way broadcast. You decide what to highlight, the visitor reads what you chose, end of story. The problem is that the actual question someone wants answered — "does this person have experience with distributed caching?" — almost never aligns with the section you happened to emphasise.
An AI-backed portfolio inverts this. The visitor asks in natural language, the system surfaces the most relevant evidence, and Gemini synthesises an answer grounded in real data. The net effect is that a 30-second scroll becomes a conversation that can go as deep as the person wants.
Status: Active
System Architecture
┌──────────────────────────────────────────────────────────────────────────┐
│ CLIENT BROWSER │
│ │
│ ┌─────────────────────────────┐ ┌──────────────────────────────────┐ │
│ │ Avocado Chatbot │ │ Portfolio + Blog + Lab │ │
│ │ / (full-screen) │ │ /portfolio /experience │ │
│ │ /chat (nav-accessible) │ │ /education /projects │ │
│ │ │ │ /blog /blog/[slug] │ │
│ │ ChatInterface │ │ /lab /lab/[slug] │ │
│ │ ChatMessage (md renderer) │ │ │ │
│ │ Model badge · Stats │ │ BlogPostList BlogEngagement │ │
│ │ SSE ReadableStream │ │ BlogIndexStats BlogGuideDrawer │ │
│ └──────────────┬──────────────┘ └──────────────────┬───────────────┘ │
└─────────────────┼────────────────────────────────────────┼───────────────┘
│ HTTPS + SSE │ HTTPS REST
▼ ▼
┌──────────────────────────────────────────────────────────────────────────┐
│ api.jayaremala.com (Nginx :443 → Docker :8000) │
│ AWS Lightsail 2GB · Ubuntu 24.04 LTS │
│ │
│ POST /ai/chat/stream ──► RAG pipeline ──► Gemini SSE │
│ POST /ai/chat ──► RAG pipeline ──► Gemini sync │
│ POST /blog/{slug}/view ──► unique view per IP │
│ POST /blog/{slug}/clap ──► cumulative claps (max 50 / user / post) │
│ GET /blog/{slug}/stats │
│ GET /blog/stats/summary │
│ GET /stats total_responses · unique_visitors │
│ GET /stats/overview 7d / 30d / 1y / all-time for all metrics │
│ GET /health │
│ │
│ ┌────────────────────────────┐ ┌──────────────────────────────────┐ │
│ │ RAG Store │ │ SQLite analytics.db │ │
│ │ │ │ /data/analytics.db │ │
│ │ ChromaDB PersistentClient│ │ (Lightsail SSD 60 GB) │ │
│ │ HNSW cosine similarity │ │ │ │
│ │ all-MiniLM-L6-v2 embed │ │ interactions │ │
│ │ LRU cache (256 entries) │ │ ├── ip_hash TEXT │ │
│ │ │ │ └── created_at TIMESTAMP │ │
│ │ BM25Okapi (rank_bm25) │ │ │ │
│ │ in-memory, rebuilt on │ │ blog_views │ │
│ │ every startup │ │ ├── slug TEXT │ │
│ │ │ │ ├── ip_hash TEXT │ │
│ │ RRF merge k=60 │ │ └── created_at TIMESTAMP │ │
│ └────────────────────────────┘ │ UNIQUE(slug, ip_hash) │ │
│ │ │ │
│ ┌──────────────────────────────┐ │ blog_claps │ │
│ │ /data (Lightsail SSD) │ │ ├── slug TEXT │ │
│ │ ├── analytics.db │ │ ├── ip_hash TEXT │ │
│ │ ├── chroma_db/ │ │ ├── count INTEGER │ │
│ │ └── logs/ │ │ └── updated_at TIMESTAMP │ │
│ └──────────────────────────────┘ └──────────────────────────────────┘ │
│ │ │
│ │ daily cron 02:00 UTC │
│ ▼ │
│ ┌──────────────────────────────┐ │
│ │ S3: itsjaya-backups- │ │
│ │ analytics │ │
│ │ 7-day timestamped backups │ │
│ │ + latest_analytics.db │ │
│ └──────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Knowledge Base backend/data/knowledge/ │ │
│ │ profile.json · experience.json · education.json │ │
│ │ projects.json · skills.json · testimonials.json │ │
│ │ blog.json (auto-generated from MDX on every push) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────┘
│
│ Google AI API (HTTPS)
▼
┌─────────────────────────────────────┐
│ Gemini 2.5 Flash (primary) │
│ Gemini 2.0 Flash (fallback 1) │
│ Gemini 2.0 Flash Lite (fallback 2)│
│ Gemini Flash Latest (fallback 3) │
│ auto-retry on 503 / 429 │
└─────────────────────────────────────┘
RAG Pipeline — Deep Dive
The retrieval pipeline runs before every Gemini call. Four stages in sequence:
User message: "what's your strongest AI project?"
│
▼
┌───────────────────────────────────────────────────────────┐
│ STAGE 1 — Query Expansion │
│ │
│ Query 1 (verbatim): │
│ "what's your strongest AI project?" │
│ │
│ Query 2 (name-anchored): │
│ "what's your strongest AI project? Jaya Sabarish │
│ Reddy Remala" │
│ │
│ Query 3 (topic keyword — detected: project/built): │
│ "projects built SnapLog CodeCollab Multi-Agent │
│ GeneCart" │
│ │
│ Query 4 (conversation context): │
│ injected only if prior message exists │
│ │
│ Result: up to 4 query strings │
└──────────────────────┬────────────────────────────────────┘
│ 4 queries
┌────────────┴─────────────┐
▼ ▼
┌──────────────────────┐ ┌──────────────────────────────┐
│ STAGE 2a — DENSE │ │ STAGE 2b — LEXICAL (BM25) │
│ │ │ │
│ One batched encode │ │ BM25Okapi scoring │
│ call for all 4 │ │ Tokenize: lowercase, │
│ queries — │ │ strip punctuation, │
│ single forward pass │ │ keep len > 1 tokens │
│ ~160ms vs ~400ms │ │ │
│ for serial calls │ │ Catches exact matches: │
│ │ │ "3000 RPS", "SnapLog", │
│ HNSW cosine search │ │ "Qualcomm", "78%", │
│ in ChromaDB │ │ "115 GB/day" │
│ │ │ │
│ top 6 per query │ │ top 15 results │
│ (deduped by id) │ │ in-memory, rebuilt on │
└──────────┬───────────┘ │ every startup from docs │
└───────┬────────┘
▼
┌───────────────────────────────────────────────────────────┐
│ STAGE 3 — Reciprocal Rank Fusion │
│ │
│ score(doc) = sum of 1 / (k + rank_i) k = 60 │
│ Cormack, Clarke, Buettcher 2009 │
│ │
│ Chunks in both dense + BM25 get boosted. │
│ Output: up to 20 candidates ranked by RRF score │
└──────────────────────┬────────────────────────────────────┘
▼
┌───────────────────────────────────────────────────────────┐
│ STAGE 4 — Top-N by RRF Score │
│ │
│ Returns top 5 chunks ordered by RRF score │
│ Injected into Gemini system prompt as context │
└───────────────────────────────────────────────────────────┘
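The fusion step in Stage 3 can be sketched in a few lines. This is a minimal illustration of Reciprocal Rank Fusion with k = 60, not the project's actual code: the function name `rrf_merge` and the sample document ids are hypothetical, and ties are broken by id only to keep the output deterministic.

```python
from collections import defaultdict

def rrf_merge(ranked_lists, k=60, top_n=5):
    """Fuse several ranked lists of doc ids with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # rank is 1-based: the best hit in each list contributes 1 / (k + 1)
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_n]

# Hypothetical ids: one ranking from dense search, one from BM25
dense = ["doc-a", "doc-b", "doc-c"]
bm25 = ["doc-a", "doc-d", "doc-b"]
top = rrf_merge([dense, bm25], top_n=3)
```

A chunk that appears in both rankings accumulates two terms, which is why `doc-a` and `doc-b` outrank chunks that only one retriever found.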
Deploy Pipeline
developer: git push origin main
│
▼
GitHub Actions (.github/workflows/deploy.yml)
│
├── npm install + npm run build
│ └── prebuild: sync-knowledge.mjs
│ ├── parses MDX frontmatter + body
│ ├── writes blog.json + lab.json
│ └── copies JSON → frontend/src/data/knowledge/
│
├── git commit "chore: sync knowledge base [skip ci]"
│
├── upload frontend/out/ → GitHub Pages (jayaremala.com)
│
└── SSH → AWS Lightsail (infra/scripts/deploy.sh)
│
├── git pull origin main
│
├── AUTO-RESTORE: if /data/analytics.db missing
│ └── aws s3 cp latest_analytics.db /data/analytics.db
│
├── docker tag :latest → :previous (rollback safety)
│
├── docker build → itsjaya-backend:latest
│
├── docker run --name itsjaya-backend-new -p 8001:8000
│ (staging container — old container still serves :8000)
│
├── health check loop (5s × 24 attempts = 120s max)
│ ├── PASS → stop old :8000, start new :8000
│ └── FAIL → rm staging, old container untouched
│
└── docker image prune -f
Rollback: bash /home/ubuntu/itsjaya/infra/scripts/rollback.sh
└── starts :previous image on :8000 in under 10s
Backup: cron 0 2 * * * → infra/scripts/backup.sh
└── aws s3 cp /data/analytics.db → s3://itsjaya-backups-analytics/
└── prunes backups older than 7 days
Analytics Architecture
All engagement data lives in a single SQLite file. IPs are SHA-256 hashed before storage.
analytics.db
│
├── interactions ← chat analytics
│ ├── ip_hash TEXT SHA-256 of visitor IP
│ └── created_at TIMESTAMP
│ Indexed: idx_ip ON ip_hash
│
├── blog_views ← unique views per post per IP
│ ├── slug TEXT
│ ├── ip_hash TEXT
│ └── created_at TIMESTAMP
│ UNIQUE(slug, ip_hash) idempotent — INSERT OR IGNORE
│
└── blog_claps ← cumulative claps per post per IP
├── slug TEXT
├── ip_hash TEXT
├── count INTEGER capped at 50 per user per post
└── updated_at TIMESTAMP
ON CONFLICT: count = count + excluded.count
Backup flow:
02:00 UTC → backup.sh
├── aws s3 cp /data/analytics.db → analytics_db/TIMESTAMP_analytics.db
├── aws s3 cp /data/analytics.db → analytics_db/latest_analytics.db
└── prune files with date < 7 days ago
Restore (disaster recovery):
aws s3 cp s3://itsjaya-backups-analytics/analytics_db/latest_analytics.db /data/analytics.db
docker restart itsjaya-backend
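The view and clap semantics described above can be sketched with the standard library `sqlite3` module. This is an illustration under assumptions, not the backend's code: the helper names are hypothetical, the UNIQUE constraint on `blog_claps` is inferred from the ON CONFLICT clause, and applying the 50-clap cap inside the SQL statement is one possible place to enforce it.

```python
import hashlib
import sqlite3

MAX_CLAPS = 50  # cap per visitor per post, as stated in the schema notes

def hash_ip(ip: str) -> str:
    """SHA-256 hex digest of the visitor IP; the raw IP is never stored."""
    return hashlib.sha256(ip.encode("utf-8")).hexdigest()

def init_db(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS blog_views (
            slug TEXT NOT NULL,
            ip_hash TEXT NOT NULL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            UNIQUE(slug, ip_hash)
        );
        CREATE TABLE IF NOT EXISTS blog_claps (
            slug TEXT NOT NULL,
            ip_hash TEXT NOT NULL,
            count INTEGER NOT NULL DEFAULT 0,
            updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            UNIQUE(slug, ip_hash)  -- assumed: required for ON CONFLICT to fire
        );
    """)

def record_view(conn, slug: str, ip: str) -> None:
    # Idempotent: a repeat view from the same IP hits the UNIQUE constraint
    conn.execute(
        "INSERT OR IGNORE INTO blog_views (slug, ip_hash) VALUES (?, ?)",
        (slug, hash_ip(ip)),
    )

def add_claps(conn, slug: str, ip: str, n: int) -> None:
    # Upsert: add to the running total, never exceeding MAX_CLAPS
    conn.execute(
        """
        INSERT INTO blog_claps (slug, ip_hash, count) VALUES (?, ?, MIN(?, ?))
        ON CONFLICT(slug, ip_hash) DO UPDATE SET
            count = MIN(count + excluded.count, ?),
            updated_at = CURRENT_TIMESTAMP
        """,
        (slug, hash_ip(ip), n, MAX_CLAPS, MAX_CLAPS),
    )
```

The upsert syntax needs SQLite 3.24 or newer, which the SQLite bundled with current Python releases satisfies.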
Frontend Architecture
jayaremala.com (GitHub Pages, static export, no basePath)
│
├── / Avocado: full-screen, no nav/footer
├── /chat Same chatbot, portfolio nav accessible
├── /portfolio Hero + domain chips + projects + skills
├── /experience Timeline
├── /education Cards
├── /projects Grid with source link pill buttons
├── /blog Index sorted by publishedAt
├── /blog/[slug] Source Serif 4 font + engagement
├── /lab Living system design index
└── /lab/[slug] This page
All portfolio routes share a layout via the (portfolio) route group. The chatbot lives outside it: no nav, no footer, full screen.
Tech Stack
Key Decisions
Railway's free trial ended and stopped all backend services. The migration decision came down to minimum cost with maximum control. App Runner and ECS Fargate cost $20–40/month and add EFS complexity for a single SQLite file. Lightsail 2GB ($10/month) gives 2 vCPUs, 60GB SSD, and a static IP — all the resources the backend needs and nothing it doesn't.
The 2GB plan was chosen over 1GB (EC2 t2.micro free tier) because the fastembed ONNX model + ChromaDB + FastAPI peaks at 400–600MB under load. The 1GB option leaves no headroom for the 120-second ONNX warmup window at startup. A cold-start OOM kill during a recruiter demo would be worse than the $2/month difference.
What this means for the system: the backend is now a Docker container on a VPS with persistent SSD storage, Nginx reverse proxy, and Let's Encrypt HTTPS — a standard production setup that doesn't depend on any PaaS billing cycle.
The original deploy script stopped the old container, then started the new one. During the gap — up to 120 seconds for the ONNX model to warm up — the API was completely down. For a portfolio chatbot, downtime during a deploy is especially bad: the visitor who opens the page while a deploy is running gets an error on their first message.
The fix is blue-green deployment at the container level. The new image starts on port 8001 while the old container keeps serving port 8000 through Nginx. A health check loop polls localhost:8001/health every 5 seconds for up to 120 seconds. Only if the check passes does the script stop the old container and start the new one on port 8000. If the health check fails, the staging container is removed and the old one continues serving — zero downtime, zero user impact.
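The control flow of that health gate is easier to see in code. The real logic lives in a shell script, so the Python below is only an illustration of the decision: `probe`, `promote`, and `discard_staging` are hypothetical callables standing in for the HTTP check against the staging port and the Docker commands.

```python
import time
from typing import Callable

def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 24,        # 24 attempts x 5 s = 120 s ceiling
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the staging container until it reports healthy or time runs out."""
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return True
        except Exception:
            pass  # e.g. connection refused while the model is still warming up
        if attempt < attempts:
            sleep(interval_s)
    return False

def deploy(probe, promote, discard_staging, **wait_kwargs) -> str:
    """Swap to the new container only after it passes the health check."""
    if wait_until_healthy(probe, **wait_kwargs):
        promote()            # stop old container, start new one on :8000
        return "promoted"
    discard_staging()        # old container was never touched
    return "kept-old"
```

The old container is only stopped inside `promote`, so a failed check leaves the serving path exactly as it was.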
The entire logic lives in infra/scripts/deploy.sh in the repo. GitHub Actions calls it with a single SSH command, keeping the workflow YAML thin and the deploy logic version-controlled and independently testable.
Two options existed for protecting analytics.db from instance loss. Option A: Lightsail additional disk ($2/month) — a separate volume that survives instance replacement automatically. Option B: daily S3 backup (< $0.01/month) — a cron job copies the file to S3 and keeps 7 days of history.
Option B wins for this workload. The analytics.db file is under 1MB. Storing 7 daily copies costs fractions of a cent per month. The restore path is one command. The additional disk would add 20% to the monthly bill for a file that changes by kilobytes per day.
The one scenario where Option A wins is if the instance is destroyed and rebuilt frequently. That doesn't apply here. The S3 backup cron runs at 02:00 UTC daily via infra/scripts/backup.sh. The deploy script auto-restores from S3 if /data/analytics.db is missing — so a fresh instance recovers all analytics data on the first deploy with no manual step.
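The 7-day retention rule can be sketched as a pure function that decides which backup keys to delete. The actual backup.sh is a shell script and its timestamp format is not shown here, so `STAMP_FORMAT` and the helper name `keys_to_prune` are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)
STAMP_FORMAT = "%Y%m%d_%H%M%S"  # assumed format; the real one is not documented here

def keys_to_prune(keys, now):
    """Return backup keys whose embedded timestamp is older than RETENTION."""
    cutoff = now - RETENTION
    stale = []
    for key in keys:
        name = key.rsplit("/", 1)[-1]
        if name == "latest_analytics.db":
            continue  # the restore path depends on this copy, never prune it
        stamp = name.removesuffix("_analytics.db")
        try:
            taken = datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # unrecognised file name: leave it alone
        if taken < cutoff:
            stale.append(key)
    return stale
```

Skipping anything that does not parse keeps the prune step conservative: an unexpected file in the bucket is ignored rather than deleted.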
Moving the backend from Railway (which gave a free HTTPS URL) to a raw Lightsail IP created a mixed content problem. The frontend is served over HTTPS from GitHub Pages. Modern browsers block http://IP:8000 calls from an HTTPS page — Avocado would silently fail on every message.
The fix requires HTTPS on the backend, which requires a domain. DuckDNS was attempted but proved unreliable. The right call was purchasing jayaremala.com on Namecheap (~$10/year), pointing api.jayaremala.com to the Lightsail static IP, and running Certbot for a free Let's Encrypt certificate behind Nginx. Certbot auto-renews via a systemd timer — zero maintenance.
The domain also replaced the GitHub Pages subpath (sabarishreddy99.github.io/jayaremala) with a clean root domain (jayaremala.com). The basePath: "/jayaremala" in next.config.ts was removed, and a CNAME file was added to frontend/public/ so GitHub Pages serves from the apex domain.
The standard portfolio write-up is a polished post-hoc rationalisation. The real decisions — dead-ends, alternatives considered, constraints that forced your hand — disappear.
A living MDX page that gets amended as the system evolves preserves the actual reasoning. The Decision and Update timeline components make it natural to add entries in-place instead of rewriting history. The constraint that forced MDX over a database-backed CMS is meaningful: the whole frontend is a static export. There is no database write path. MDX files committed to the repo are the only durable storage available at build time.
The collection-level hash approach had one critical failure mode: editing a single FAQ document or publishing one blog post triggered a full re-embed of all ~80 documents — a ~30-second ChromaDB wipe-and-rebuild on every deploy with any content change. As the knowledge base grows with more blog posts and lab entries, this gets worse linearly.
The replacement tracks a SHA-256 hash per document in .doc_hashes.json (stored on the Lightsail SSD alongside ChromaDB). On each startup, run_ingest() diffs the current document set against the stored hashes: new or changed documents are upserted, documents removed from the source are deleted from ChromaDB, unchanged documents are skipped entirely. Adding one blog post now embeds one document instead of eighty.
Per-document tracking also enables precise deletion — when a post or FAQ entry is removed, its vector is deleted from ChromaDB immediately rather than lingering as stale dead weight. A forced full re-embed is available via POST /admin/reingest?force=true (bearer token-gated via ADMIN_TOKEN env var) for cases where the hash file is lost or a clean slate is needed.
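The diff that drives the incremental sync can be sketched as follows. This is a minimal illustration, not the body of run_ingest(): the function names, the `docs` mapping from id to document, and the choice to hash a canonical JSON serialisation are all assumptions.

```python
import hashlib
import json

def doc_hash(doc: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order does not matter."""
    payload = json.dumps(doc, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def plan_sync(docs: dict, stored: dict, force: bool = False):
    """Compare current documents with stored hashes.

    Returns (ids to upsert, ids to delete, new hash map to persist).
    """
    current = {doc_id: doc_hash(doc) for doc_id, doc in docs.items()}
    if force:
        upsert = sorted(current)  # forced re-embed: treat everything as changed
    else:
        upsert = sorted(i for i, h in current.items() if stored.get(i) != h)
    delete = sorted(i for i in stored if i not in current)
    return upsert, delete, current
```

An empty or missing hash file is just `stored = {}`, which makes every document look new and reproduces the full-ingest behaviour on a wiped volume.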
Two simpler alternatives exist and both are wrong. Always re-ingest: adds 15–20s to every deploy regardless of whether the knowledge base changed. Skip if ChromaDB non-empty: knowledge updates never propagate after the first deploy.
A SHA-256 hash of the entire data/knowledge/ directory solves both failure modes with a single file read at startup. If the volume is wiped, the missing hash file triggers a fresh ingest, which is correct. This approach was later superseded by per-document hashing (see above), which preserves these properties while adding incremental precision.
Dense retrieval fails on specific identifiers. "Qualcomm" isn't weighted by the embedding model. "115 GB/day" and "3000 RPS" retrieve semantically similar documents, not the exact project. These failures matter for a portfolio because the most important recruiter queries are about specifics.
BM25 handles exact term recall with no model overhead — pure in-memory frequency calculation rebuilt from the document list on every startup (~5ms). The hybrid catches both failure modes.
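To make the exact-match behaviour concrete, here is a small self-contained sketch of the tokenisation rule (lowercase, strip punctuation, keep tokens longer than one character) with a textbook Okapi BM25 scorer. It stands in for the rank_bm25 library and its scores may differ from that library's; replacing punctuation with spaces is an assumption, and the sample strings are made up.

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list:
    # Assumption: punctuation becomes a space, so "GB/day" splits into two tokens
    cleaned = re.sub(r"[^\w\s]", " ", text.lower())
    return [tok for tok in cleaned.split() if len(tok) > 1]

def bm25_scores(query: str, docs: list, k1: float = 1.5, b: float = 0.75) -> list:
    """Score each document against the query (expects a non-empty docs list)."""
    corpus = [tokenize(d) for d in docs]
    n = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n
    # Document frequency: in how many documents each term occurs
    df = Counter(term for d in corpus for term in set(d))
    scores = []
    for doc in corpus:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Made-up sample strings, only to show exact-term recall
docs = [
    "pipeline processing 115 GB/day",
    "service sustaining 3000 RPS",
    "notes on distributed caching",
]
scores = bm25_scores("3000 RPS", docs)
```

Only the document that literally contains "3000" and "RPS" receives a non-zero score, which is the property dense retrieval lacks for identifiers like these.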
The analytics workload is narrow: INSERT a view, INSERT OR UPDATE clap count, SELECT COUNT with a WHERE on timestamp. No joins. No concurrent writers. SQLite on the Lightsail SSD is zero-cost, same-process, and the entire database is one file that can be backed up with a single aws s3 cp command.
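The "SELECT COUNT with a WHERE on timestamp" pattern might look like the sketch below, which computes the four reporting periods over the interactions table. The function name `overview` and the assumption that created_at is stored as UTC text in `YYYY-MM-DD HH:MM:SS` form (the format SQLite's CURRENT_TIMESTAMP produces) are mine, not confirmed by the source.

```python
import sqlite3
from datetime import datetime, timedelta

PERIODS = {
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
    "1y": timedelta(days=365),
    "all": None,  # no cutoff
}

def overview(conn: sqlite3.Connection, now: datetime) -> dict:
    """Count responses and distinct visitors for each reporting period."""
    result = {}
    for label, window in PERIODS.items():
        if window is None:
            row = conn.execute(
                "SELECT COUNT(*), COUNT(DISTINCT ip_hash) FROM interactions"
            ).fetchone()
        else:
            # Text timestamps in this format sort chronologically,
            # so a plain string comparison is enough
            cutoff = (now - window).strftime("%Y-%m-%d %H:%M:%S")
            row = conn.execute(
                "SELECT COUNT(*), COUNT(DISTINCT ip_hash) "
                "FROM interactions WHERE created_at >= ?",
                (cutoff,),
            ).fetchone()
        result[label] = {"total_responses": row[0], "unique_visitors": row[1]}
    return result
```

Each period is a single indexed-free scan with no joins, which is well within what a single-file SQLite database handles comfortably at this scale.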
The portfolio UI has no per-request server-side needs. Everything dynamic — chat responses, blog stats — is client-side JavaScript calling the backend API. A fully static export generates HTML, CSS, and JS at build time and serves from GitHub Pages CDN for free with no cold starts.
With the custom domain migration, the basePath: "/jayaremala" constraint was lifted. The site now lives at the domain root. One constraint remains: always use Link from next/link for internal navigation, because plain anchor tags bypass Next.js client-side routing and trigger a full page reload.
Before this refactor, portfolio data lived in two places: TypeScript files in frontend/src/data/ for the UI, and JSON files in backend/data/knowledge/ for RAG. They drifted. New projects appeared on the website but not in the chatbot's knowledge base.
The sync script makes backend JSON canonical and generates everything else from it. The TypeScript data files are thin typed wrappers over synced JSON copies. The sync runs before every build — there is no path where UI and chatbot knowledge are out of sync.
Gemini 2.5 Flash hits 503 and 429 capacity limits at peak times. A portfolio chatbot that returns an error on the first message is worse than no chatbot. The fallback chain retries through three additional models automatically. The frontend shows which model answered via a green pill badge — transparent without reading as a failure.
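The fallback behaviour can be sketched independently of any SDK. In the illustration below, `call_model` is a hypothetical callable that wraps the real Gemini request and raises `ModelUnavailable` with the HTTP status on failure; the model identifier strings are assumptions based on the model names listed in the architecture diagram.

```python
RETRYABLE = {429, 503}  # capacity errors that should move on to the next model

# Assumed identifiers, in the order primary -> fallback 3
MODEL_CHAIN = [
    "gemini-2.5-flash",
    "gemini-2.0-flash",
    "gemini-2.0-flash-lite",
    "gemini-flash-latest",
]

class ModelUnavailable(Exception):
    """Raised by call_model when the upstream API returns an error status."""
    def __init__(self, status: int):
        super().__init__(f"model unavailable: HTTP {status}")
        self.status = status

def generate_with_fallback(prompt, call_model, models=MODEL_CHAIN):
    """Try each model in order; return (model_used, response_text)."""
    last_error = None
    for model in models:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as err:
            if err.status not in RETRYABLE:
                raise  # a non-capacity error will not be fixed by another model
            last_error = err
    raise RuntimeError("all models in the fallback chain were unavailable") from last_error
```

Returning the model name alongside the text is what lets the frontend render the badge showing which model answered.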
The questions that matter most in the first 30 seconds — "What makes Jaya stand out?", "How do I hire him?" — are not well-served by retrieval from raw experience data. The 12 FAQ documents are handwritten answers to the highest-probability recruiter questions, loaded into ChromaDB as their own document type. When one is retrieved, Gemini gets the answer already formed. These 12 documents have more impact on response quality than any pipeline improvement.
Progress Log
Replaced collection-level hash re-ingest with per-document incremental sync. Adding a blog post or lab entry now embeds only that one document rather than wiping and rebuilding the entire ~80-doc collection. Stale documents removed from source are deleted from ChromaDB automatically. Per-document hashes persisted in .doc_hashes.json on the Lightsail SSD. Token-gated POST /admin/reingest?force=true endpoint added for manual full re-embeds.
Migrated backend from Railway to AWS Lightsail 2GB ($10/month). Set up Nginx + Let's Encrypt for HTTPS on api.jayaremala.com. Purchased jayaremala.com domain — frontend now at root domain instead of GitHub Pages subpath. Removed basePath from next.config.ts, added CNAME file to public/.
Implemented zero-downtime blue-green deployment. New container health-checked on port 8001 before old :8000 is stopped. deploy.sh, rollback.sh, and backup.sh moved into infra/scripts/ and version-controlled in the repo. GitHub Actions deploy-backend job reduced to a single SSH call.
Set up daily S3 backup of analytics.db to itsjaya-backups-analytics bucket with 7-day retention. Deploy script auto-restores from S3 if /data/analytics.db is missing on a fresh instance. Docker log rotation configured: json-file driver, 10MB max, 3 files. CHROMA_DB_PATH made configurable via settings.py so ChromaDB persists to /data/chroma_db on the Lightsail SSD.
Built /lab section. MDX-based living system design docs with custom components: Status, Arch, Decision, Update, Stack, Metric. Added Lab to nav. First entry: itsjaya itself.
Blog guide drawer now shows live stats dashboard — unique visitors, Avocado responses, blog views, claps with 7d / 30d / 1y / all-time breakdown. New GET /stats/overview endpoint returns all periods in one API call.
Blog engagement fully live: views (unique per IP per post), claps (max 50/user, 1.5s debounced batching), per-post stats on index cards, totals in blog header. Backed by SQLite.
BM25 hybrid retrieval added. GET /stats/overview with period filtering (7d / 30d / 1y / all-time) added to both analytics and blog stats modules.
Chat markdown rendering rewritten — handles headings, bullets, numbered lists, bold, italic, inline code, links, dividers.
Gemini model fallback chain implemented. Model indicator badge added to chatbot.
Blog deployed with MDX, Source Serif 4 reading font, publishedAt-based sort. Sync script auto-generates blog.json so Avocado can answer questions about published posts.
Single source of truth refactor complete. Backend JSON is canonical — TypeScript files are typed re-exports. sync-knowledge.mjs runs before every build.
Project started. Basic FastAPI + ChromaDB + Next.js skeleton. First working Avocado response.
What's next
- Reading time estimate on blog cards
- Search within blog posts
- Avocado voice input (Web Speech API — already prototyped)
- A/B test shorter vs longer system prompt to measure response quality
- Monitoring dashboard (uptime + response time over time)