Live Projects
TheMarketCast.ai
2025
3,000+ organic weekly users tracking private fundraising activity
Platform for parsing and analyzing SEC Form D filings daily, enabling investors to track private fundraising activity in real time. Tracks 450,000+ filings in real time and attracts 3,000+ organic weekly users through SEO/GEO optimization. Company data is enriched via Exa.
Equity research platform for financial analysts. Built latency-optimized agentic RAG system using Qwen3-235B that synthesizes earnings calls, SEC filings, and real-time news via Tavily to answer complex financial queries. SEC filings sub-agent achieved 91% accuracy on FinanceBench benchmark with LLM-as-judge evaluation. Built text-to-SQL stock screener using DuckDB for natural language financial queries.
Technical Projects
Web Search Engine
2026
Distributed crawler over 10M+ pages with full-text search
Built fault-tolerant distributed web crawlers in Rust across 5 instances with politeness policies and Redis queue coordination, storing 10M+ tech-focused pages (200GB) in AWS S3 with metadata in PostgreSQL, provisioned via Terraform. Built full-text search over the crawled corpus using Tantivy inverted index with BM25 ranking and a search UI for querying curated tech content.
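To illustrate the BM25 ranking mentioned above, here is a minimal pure-Python sketch of the Okapi BM25 scoring formula. It is not the Tantivy implementation: the function name `bm25_scores`, the toy corpus, and the parameter values `k1=1.2` and `b=0.75` are assumptions chosen for illustration, and the IDF uses the non-negative variant that adds 1 inside the logarithm.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant (1 is added inside the log).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus standing in for crawled pages.
docs = [
    "rust async runtime internals".split(),
    "postgres index tuning guide".split(),
    "rust borrow checker explained for rust beginners".split(),
]
scores = bm25_scores("rust".split(), docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this toy corpus the third document ranks first: its two occurrences of the query term outweigh the length penalty, while the document without the term scores zero.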
Financial Podcast Platform
2025
50+ early beta users, personalized portfolio podcasts on-demand
Real-time AI podcast platform using OpenAI Whisper and GPT-4o with Celery workers for async audio generation. React frontend with WebSocket streaming for live audio updates, backend deployed on AWS. Organically acquired 50+ users in early beta with portfolio-based personalized podcast generation on-demand. Market data pipeline processing 9,000 tickers every 20 minutes using SERP API and Redis caching.
Drop-in observability library for RAG pipelines. Traces retrieval quality, latency, and relevance with minimal instrumentation, designed so engineers can monitor and debug RAG systems without changing their existing pipeline code. Provides observability from the source documents with citations to detect broken chunking, incorrect parsing, and other ingestion-level failures that are otherwise invisible at query time.
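One common way to get this kind of drop-in tracing is a decorator that wraps an existing pipeline function without changing its body. The sketch below is an assumption about how such instrumentation could look, not the library's actual API: `traced`, `TRACES`, and the stand-in `retrieve` function are all hypothetical names.

```python
import functools
import time

# Hypothetical in-memory sink; a real library would likely export traces elsewhere.
TRACES = []


def traced(stage):
    """Record latency and result count for a pipeline stage, leaving its logic untouched."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            TRACES.append({
                "stage": stage,
                "function": fn.__name__,
                "latency_s": time.perf_counter() - start,
                "n_results": len(result) if hasattr(result, "__len__") else None,
            })
            return result
        return wrapper
    return decorator


@traced("retrieval")
def retrieve(query):
    # Stand-in retriever returning (chunk_text, source_document) pairs.
    corpus = [
        ("Revenue grew 12% year over year.", "10-K.pdf"),
        ("Headcount was flat.", "10-Q.pdf"),
    ]
    words = query.lower().split()
    return [c for c in corpus if any(w in c[0].lower() for w in words)]


hits = retrieve("revenue growth")
```

Because each returned chunk carries its source document, a trace consumer could follow a poor answer back to the file it came from, which is where chunking and parsing failures would show up.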
Open Source
PySyft, OpenMined
2019–2022
Core contributor to one of the most widely used privacy-preserving ML libraries
Implemented FALCON protocol operations, the first Python implementation of an honest-majority maliciously secure framework for private deep learning. Planned the SyMPC library roadmap and performed code reviews for the secure multi-party computation library. Contributed to core privacy-preserving ML infrastructure using PyTorch, TensorFlow, and differential privacy.
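As background on secure multi-party computation, the sketch below shows plain additive secret sharing over the ring of integers modulo 2^64. This is a deliberately simpler scheme than FALCON and is not code from SyMPC or PySyft. The function names and the ring size are assumptions for illustration, and the sketch offers no protection against malicious parties.

```python
import secrets

MOD = 2**64  # assumed ring size for this illustration


def share(secret, n_parties=3):
    """Split an integer into n additive shares that sum to the secret modulo MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MOD)
    return shares


def reconstruct(shares):
    """Recombine all shares; any strict subset looks uniformly random."""
    return sum(shares) % MOD


def add_shared(a, b):
    """Each party adds its own two shares locally, with no communication."""
    return [(x + y) % MOD for x, y in zip(a, b)]


a = share(20)
b = share(22)
total = reconstruct(add_shared(a, b))
```

Addition needs no interaction between parties, which is why linear layers are cheap in such protocols. Multiplication and comparisons require extra protocol rounds.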
Framework for privacy-preserving data analysis using Pandas with a pointer-based architecture for flexible EDA on private data without direct access. Implements differential privacy for individual row protection and federated analytics using secure multi-party computation.
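The row-level differential privacy described above is often obtained with the Laplace mechanism. The following is a minimal sketch using only the standard library and plain lists in place of Pandas. `private_count`, `laplace_noise`, and the sample rows are hypothetical, and the framework's actual mechanism may differ.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(rows, predicate, epsilon, rng=None):
    """Return an epsilon-differentially-private count of rows matching predicate."""
    rng = rng or random.Random()
    true_count = sum(1 for r in rows if predicate(r))
    # Adding or removing one row changes a count by at most 1 (sensitivity 1),
    # so the Laplace noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical private rows; the analyst only ever sees the noisy answer.
rows = [{"age": 34}, {"age": 51}, {"age": 47}, {"age": 29}]
noisy = private_count(rows, lambda r: r["age"] > 40, epsilon=0.5)
```

A smaller `epsilon` means more noise and stronger protection for any individual row. The standard-library `random` module is used here only to keep the sketch self-contained and is not a cryptographically secure noise source.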