Live Projects
TheMarketCast.ai
2025
3,000+ organic weekly users tracking private fundraising activity
Built a Python/PostgreSQL platform parsing and analyzing 450,000+ SEC Form D filings with sub-second query latency via Redis caching and query optimization, attracting 3,000+ organic weekly users. Acquired by Smartkarma, a global investment research and analysis platform.
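The caching approach mentioned above is often implemented as a cache-aside lookup: check the cache first, and only run the expensive query on a miss. The following is a minimal sketch of that pattern, not the platform's actual code. An in-memory dictionary stands in for Redis, and the class name `TTLCache`, the `get_or_load` method, and the example key are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache-aside helper with per-entry expiry (in-memory stand-in for Redis)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry timestamp, cached value)
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value, or call `loader` on a miss and cache the result."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the expensive query
        value = loader()  # cache miss or expired entry: run the query
        self._store[key] = (now + self._ttl, value)
        return value


# Usage: the loader would normally run a SQL query; here it returns a constant.
cache = TTLCache(ttl_seconds=60)
filings = cache.get_or_load("filings:2025-Q1", lambda: ["filing-a", "filing-b"])
```

Repeated calls with the same key inside the TTL window return the stored value without invoking the loader again, which is where the latency saving comes from.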
Equity research platform for financial analysts. Designed a multi-agent system with 4 specialized sub-agents (SEC filings, earnings calls, news, stock screener) using Qwen3-235B, with web search via Tavily, text-to-SQL interfaces, and agent observability via Logfire. The SEC filings sub-agent achieved 91% accuracy on FinanceBench via latency-optimized RAG over 100,000+ vectorized filings in Qdrant, evaluated with an LLM-as-judge pipeline measuring relevance, accuracy, and citation quality. Built end-to-end with a React/TypeScript frontend and a FastAPI/PostgreSQL/S3 backend.
Helps X users know which accounts to trust by surfacing fake news and engagement bait patterns. First step: making 200,000+ Community Notes searchable and ranking accounts by how many they've received.
Technical Projects
Web Search Engine
2026
Distributed crawler over 30M+ pages at 350k pages/hour for <$100/month
Built fault-tolerant distributed web crawlers in Rust across 10 instances with Redis queue coordination (lease management, domain locking, dead-letter queues, per-domain rate limiting), crawling 350,000 pages/hour for <$100/month. Built full-text search over 30M+ pages (1 TB+) using Tantivy inverted index with BM25 ranking and ReAct agent-driven query decomposition and reasoning, with a React/TypeScript frontend.
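Lease management and dead-letter queues, as listed above, usually work together: a worker leases a URL for a fixed time, an expired lease returns the URL to the queue, and a URL that fails too many times is parked for inspection. The sketch below illustrates that idea in Python with an in-memory queue. It is not the project's Rust/Redis implementation, and the class `LeaseQueue` and all of its fields are hypothetical names.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Optional


@dataclass
class LeaseQueue:
    """In-memory stand-in for a lease-based crawl queue with a dead-letter list."""

    lease_seconds: float = 30.0
    max_attempts: int = 3
    clock: Callable[[], float] = time.monotonic
    pending: Deque[str] = field(default_factory=deque)
    leased: Dict[str, float] = field(default_factory=dict)  # url -> lease expiry
    attempts: Dict[str, int] = field(default_factory=dict)  # url -> times leased
    dead_letter: List[str] = field(default_factory=list)

    def enqueue(self, url: str) -> None:
        self.pending.append(url)

    def _reclaim_expired(self) -> None:
        """Requeue URLs whose lease expired, or dead-letter them after too many tries."""
        now = self.clock()
        for url, expiry in list(self.leased.items()):
            if expiry <= now:
                del self.leased[url]
                if self.attempts[url] >= self.max_attempts:
                    self.dead_letter.append(url)
                else:
                    self.pending.append(url)

    def lease(self) -> Optional[str]:
        """Hand out the next URL, or None if nothing is currently available."""
        self._reclaim_expired()
        if not self.pending:
            return None
        url = self.pending.popleft()
        self.attempts[url] = self.attempts.get(url, 0) + 1
        self.leased[url] = self.clock() + self.lease_seconds
        return url

    def ack(self, url: str) -> None:
        """Mark a URL as successfully crawled and release its lease."""
        self.leased.pop(url, None)
        self.attempts.pop(url, None)
```

A worker that crashes mid-crawl never calls `ack`, so its lease simply expires and another worker picks the URL up. In a multi-instance deployment the same state would live in a shared store rather than in process memory.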
Financial Podcast Platform
2025
50+ early beta users, personalized portfolio podcasts on demand
Real-time AI podcast platform using OpenAI Whisper and GPT-4o with Celery workers for async audio generation. React frontend with WebSocket streaming for live audio updates, backend deployed on AWS. Organically acquired 50+ users in early beta with portfolio-based personalized podcast generation on-demand. Market data pipeline processing 9,000 tickers every 20 minutes using SERP API and Redis caching.
Drop-in observability library for RAG pipelines. Traces retrieval quality, latency, and relevance with minimal instrumentation, designed so engineers can monitor and debug RAG systems without changing their existing pipeline code. Provides observability from the source documents with citations to detect broken chunking, incorrect parsing, and other ingestion-level failures that are otherwise invisible at query time.
Open Source
PySyft, OpenMined
2019–2022
Core contributor to one of the most widely used privacy-preserving ML libraries
Implemented FALCON protocol operations, the first Python implementation of an honest-majority maliciously secure framework for private deep learning. Planned the SyMPC library roadmap and performed code reviews for the secure multi-party computation library. Contributed to core privacy-preserving ML infrastructure using PyTorch, TensorFlow, and differential privacy.
Framework for privacy-preserving data analysis using Pandas with a pointer-based architecture for flexible EDA on private data without direct access. Implements differential privacy for individual row protection and federated analytics using secure multi-party computation.
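Row-level protection through differential privacy is commonly achieved by adding calibrated noise to aggregate results. The sketch below shows the standard Laplace mechanism applied to a count query, which has sensitivity 1 because adding or removing one row changes the count by at most 1. It is an illustration of the general technique, not the framework's API, and the function names `laplace_noise` and `dp_count` are hypothetical.

```python
import random
from typing import Any, Callable, Iterable, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def dp_count(
    rows: Iterable[Any],
    predicate: Callable[[Any], bool],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a noisy count of rows matching `predicate` under epsilon-DP."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for row in rows if predicate(row))
    # Count queries have sensitivity 1, so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Usage: smaller epsilon means more noise and stronger privacy.
rows = [{"age": age} for age in (25, 40, 61, 70)]
noisy = dp_count(rows, lambda row: row["age"] > 30, epsilon=0.5)
```

The analyst only ever sees the noisy result, so no single row can be confidently inferred from the output.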