Ummidvar — AI Job Application Agent
Discover → score → tailor → apply → track → sponsor — India-first, international by design.
286
Tests Passing
6,600+
Lines of Code
11
Modular Packages
The Challenge
Job searching at scale means applying to dozens of roles with individually tailored materials, a process that takes hours per application when done properly. The hard problems:

- Aggregating and deduplicating jobs across 7+ platforms without keyword noise.
- Scoring each role against a resume with a meaningful composite algorithm rather than shallow keyword overlap.
- Generating cover letters that sound like the user wrote them, not an LLM, while never fabricating experience (anti-hallucination via a Facts Graph).
- Submitting through real ATS systems (Greenhouse, Lever, Ashby, Workday) using browser automation that can pause on CAPTCHAs and hand off to the user instead of breaking.
- Tracking email replies to classify interviews, rejections, and ghosting.
- Layering in visa-sponsorship intelligence for international job seekers across UK, AU, NZ, CA, and EU registries.

The system also had to be India-first in design (Naukri integration, INR salary handling) while supporting multi-profile, multi-preference search.
Architecture & System Design

The multi-board job discovery engine ranks opportunities by skills match, role level, location fit, and compensation. Fact-grounded generation prevents hallucinated experience claims: all content references verified resume data. Browser automation submits applications through major ATS systems, and email integration tracks application replies (offers, interviews, rejections). A visa-sponsorship eligibility checker covers five countries, and the modular architecture supports a pluggable LLM backend.
Eleven modular Python packages orchestrated by a FastAPI REST API:

- `core`: pure domain models (Job, BaseResume, FactsGraph, Application, MatchScore) with no I/O, a clean domain layer.
- `discovery`: wraps python-jobspy for multi-board aggregation with deduplication.
- `scoring`: a 100-point composite of skills overlap (0–40, Jaccard), title/seniority (0–20), location fit (0–15), salary alignment (0–10), and semantic similarity (0–15); every score is explainable.
- `tailoring`: deterministic Facts Graph extraction builds an anti-hallucination contract before any LLM call, ensuring generated text only references verified experience.
- `humanizer`: a style-preserving rewrite pass that strips tell-tale AI patterns.
- `adapters`: Playwright-backed ATS submit flows for Greenhouse, Lever, Ashby, Workday, and LinkedIn, with a HITL blocker queue that snapshots state on CAPTCHA/MFA and resumes from checkpoint.
- `replies`: ingests Gmail/IMAP and classifies responses (offer, interview, rejection, ghost).
- `sponsorship`: bundles 5 live government registries.
- `referrals`: ingests LinkedIn CSV exports, scores connection strength, and drafts intro messages.

The LLM backend is fully pluggable: TemplateLLM (zero-dependency default), Ollama, OpenAI, Anthropic, or any compatible API, swapped via an environment variable.
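The 100-point composite described above can be sketched as a transparent sum of sub-scores. This is a minimal illustration: the weights follow the description, but the class layout and function names are assumptions, not the actual `scoring` module.

```python
from dataclasses import dataclass

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap: |A ∩ B| / |A ∪ B| (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

@dataclass
class MatchScore:
    skills: float    # 0-40: Jaccard overlap of resume vs. job skills
    title: float     # 0-20: title/seniority match
    location: float  # 0-15: location fit
    salary: float    # 0-10: salary alignment
    semantic: float  # 0-15: semantic similarity

    @property
    def total(self) -> float:
        # Every component is visible, so each final score is explainable
        return self.skills + self.title + self.location + self.salary + self.semantic

# Example: the skills sub-score from Jaccard overlap
resume_skills = {"python", "fastapi", "playwright", "sql"}
job_skills = {"python", "fastapi", "docker", "sql"}
skills_score = 40 * jaccard(resume_skills, job_skills)  # 3 shared / 5 total = 24.0
```

Because the total is a plain sum of bounded sub-scores, the UI can show exactly why one job ranked above another.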
Code Walkthrough
3-step walk-through of the production implementation — file paths and intent shown above each block.
Step 1 of 3
Cross-board deduplication
`ummidvar/discovery/dedupe.py`

A single role surfaces on LinkedIn, Indeed, Naukri, and the company's own careers page simultaneously. Without deduplication the scorer wastes compute on duplicates and the user sees the same job five times. Canonical fingerprinting collapses them before anything else runs.
```python
import re
from hashlib import sha1
from typing import Iterable

def fingerprint(job: Job) -> str:
    """Stable fingerprint for cross-board deduplication."""
    title = re.sub(r"[^a-z0-9 ]", "", job.title.lower())
    title = re.sub(r"\b(senior|sr|jr|junior|lead|principal)\b", "", title).strip()
    company = job.company.lower().strip()
    # Location is often missing or inconsistent — fall back to remote flag
    location = (job.location or "remote").lower().split(",")[0].strip()
    return sha1(f"{title}|{company}|{location}".encode()).hexdigest()

def deduplicate(jobs: Iterable[Job]) -> list[Job]:
    seen: dict[str, Job] = {}
    for job in jobs:
        fp = fingerprint(job)
        existing = seen.get(fp)
        # Prefer the posting with more fields filled in
        if existing is None or job.completeness() > existing.completeness():
            seen[fp] = job
    return list(seen.values())
```

Takeaway: normalise title (strip seniority prefixes) + company + location city → one hash per logical role, regardless of which board it came from.
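As a quick standalone illustration of the normalisation step (with the title-cleaning regexes inlined, and a hypothetical company and city):

```python
import re
from hashlib import sha1

def norm_title(title: str) -> str:
    # Same normalisation as fingerprint(): lowercase, drop punctuation, strip seniority tokens
    t = re.sub(r"[^a-z0-9 ]", "", title.lower())
    return re.sub(r"\b(senior|sr|jr|junior|lead|principal)\b", "", t).strip()

# Two boards list the same logical role under different seniority labels:
a = sha1(f"{norm_title('Senior Backend Engineer')}|acme|bengaluru".encode()).hexdigest()
b = sha1(f"{norm_title('Backend Engineer!')}|acme|bengaluru".encode()).hexdigest()
assert a == b  # both collapse to one fingerprint
```

Both titles normalise to `backend engineer`, so the duplicate never reaches the scorer.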
Step 2 of 3
Facts Graph — structural anti-hallucination
`ummidvar/tailoring/facts_graph.py`

Cover letters must never claim experience the user doesn't have. Instead of trusting an LLM prompt, every generated sentence is checked against a pre-extracted graph of verified facts. Ungrounded claims raise an error, not a warning.
```python
from dataclasses import dataclass

@dataclass
class FactsGraph:
    skills: set[str]
    roles: list[RoleFact]           # title, company, dates, bullets
    education: list[EducationFact]
    achievements: list[str]         # quantified results only

    def verify_claim(self, claim: str) -> bool:
        """Returns True only if claim is grounded in an extracted fact."""
        tokens = set(claim.lower().split())
        grounded = (
            tokens & self.skills
            or any(r.matches_tokens(tokens) for r in self.roles)
            or any(e.matches_tokens(tokens) for e in self.education)
        )
        return bool(grounded)

    def assert_grounded(self, text: str) -> None:
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        violations = [s for s in sentences if not self.verify_claim(s)]
        if violations:
            raise HallucinationError(f"Ungrounded claims detected: {violations}")
```

Takeaway: hallucination isn't handled by prompt engineering; it's a hard invariant enforced after generation. Ungrounded text never reaches the user.
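The skills-overlap branch of `verify_claim` can be demonstrated in isolation. The skill set below is hypothetical, and the real graph also checks roles and education before rejecting a claim:

```python
# Facts extracted deterministically from the resume (hypothetical example)
skills = {"python", "fastapi", "playwright"}

def verify_claim(claim: str) -> bool:
    # Grounded iff the claim shares at least one token with an extracted skill
    return bool(set(claim.lower().split()) & skills)

assert verify_claim("Built REST services with FastAPI")        # grounded
assert not verify_claim("Managed a team of 50 engineers")      # would raise HallucinationError
```

The check is deliberately conservative: a false rejection costs a regeneration pass, while a false acceptance would put a fabricated claim in front of a recruiter.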
Step 3 of 3
Playwright ATS adapter with HITL hand-off
`ummidvar/adapters/greenhouse.py`

Real ATS flows hit CAPTCHAs, MFA challenges, and custom screening questions. Rather than fail silently or pretend to solve CAPTCHAs, the adapter snapshots browser state and queues the session for human completion, resuming from checkpoint once the user finishes.
```python
from playwright.async_api import async_playwright

async def submit(self, application: Application) -> SubmitResult:
    async with async_playwright() as pw:
        browser = await pw.chromium.launch(headless=True)
        ctx = await browser.new_context(storage_state=application.session_state)
        page = await ctx.new_page()
        await page.goto(application.apply_url, wait_until="networkidle")

        await self._fill_basic_fields(page, application.resume)
        await self._upload_resume(page, application.resume.pdf_path)

        if await self._captcha_detected(page):
            state = await ctx.storage_state()
            snapshot = await page.screenshot(full_page=True)
            self.hitl_queue.enqueue(
                BlockedSubmission(
                    application_id=application.id,
                    reason="captcha",
                    state=state,
                    snapshot=snapshot,
                )
            )
            return SubmitResult.blocked("captcha")

        await page.click('button[type="submit"]')
        return SubmitResult.ok()
```

Takeaway: no CAPTCHA-solving services, no brittle retries. When the browser hits a wall, the state is frozen and a human finishes the job. Silent failures are eliminated by design.
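The blocker queue the adapter enqueues into can be as simple as a FIFO. A minimal sketch follows; the `BlockedSubmission` fields mirror the adapter code above, while `HITLQueue` and its method names are assumed for illustration:

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class BlockedSubmission:
    application_id: str
    reason: str                                 # e.g. "captcha", "mfa"
    state: dict = field(default_factory=dict)   # Playwright storage_state snapshot
    snapshot: bytes = b""                       # full-page screenshot for the operator UI

class HITLQueue:
    """FIFO of browser sessions waiting for a human to clear a blocker."""

    def __init__(self) -> None:
        self._items: deque[BlockedSubmission] = deque()

    def enqueue(self, item: BlockedSubmission) -> None:
        self._items.append(item)

    def next_for_human(self) -> Optional[BlockedSubmission]:
        # Oldest blocked session first; None when nothing is waiting
        return self._items.popleft() if self._items else None
```

Because the checkpoint carries the full `storage_state`, the resumed session reopens with cookies and form progress intact, so the human only has to clear the CAPTCHA, not redo the application.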
Results
Ummidvar is production-deployable via Docker Compose in a single command. The codebase spans 6,600+ lines across 11 packages with 286 passing tests (unit + integration) covering every major component — Facts Graph extraction, composite scoring, LLM adapters, ATS submission adapters, reply classifier, sponsorship registry loaders, and referral finder. The discovery layer aggregates and deduplicates from LinkedIn, Indeed, Glassdoor, ZipRecruiter, Google Jobs, Naukri, and direct career pages. The Playwright engine handles Greenhouse, Lever, Ashby, Workday, and LinkedIn ATS flows with HITL blocker queue for CAPTCHAs. The sponsorship layer cross-references 5 government registries (UK · AU · NZ · CA · EU) live, refreshed weekly via GitHub Actions. Apache-2.0 licensed for future hosted-tier expansion.
Gallery & Demos
Application Tracker Dashboard
Kanban board showing job applications across stages: Discovered, Ranked, Tailored, Applied, Interview Scheduled, Offer Received, and Rejected.
Interested in this work?
Full architecture walkthrough and code review available during interviews.