
Ummidvar — AI Job Application Agent

Discover → score → tailor → apply → track → sponsor — India-first, international by design.

286

Tests Passing

6,600+

Lines of Code

11

Modular Packages

Tech Stack

Python · FastAPI · Playwright · Claude · OpenAI · Ollama · Docker · PostgreSQL · React

The Challenge

Job searching at scale means applying to dozens of roles with individually tailored materials — a process that takes hours per application when done properly. The hard problems:

  - aggregating and deduplicating jobs across 7+ platforms without keyword noise;
  - scoring each role against a resume using a meaningful composite algorithm rather than shallow keyword overlap;
  - generating cover letters that sound like the user wrote them, not an LLM, while never fabricating experience (anti-hallucination via a Facts Graph);
  - submitting through real ATS systems (Greenhouse, Lever, Ashby, Workday) using browser automation that can pause on CAPTCHAs and hand off to the user instead of breaking;
  - tracking email replies to classify interviews, rejections, and ghosting;
  - layering in visa-sponsorship intelligence for international job seekers across UK, AU, NZ, CA, and EU registries.

The system also had to be India-first in design (Naukri integration, INR salary handling) while supporting multi-profile, multi-preference search.

Architecture & System Design

Ummidvar — AI Job Application Agent system architecture

Multi-board job discovery engine ranks opportunities by skills match, role level, location fit, and compensation. Fact-grounded generation prevents hallucinated experience claims — all content references verified resume data. Browser automation submits applications through major ATS systems. Email integration tracks application replies (offers, interviews, rejections). Visa sponsorship eligibility checker across 5 countries. Modular architecture with pluggable LLM backend.

Full system schematic available upon request

11 modular Python packages orchestrated by a FastAPI REST API:

  - `core`: pure domain models (Job, BaseResume, FactsGraph, Application, MatchScore) with no I/O, a clean domain layer.
  - `discovery`: wraps python-jobspy for multi-board aggregation with deduplication.
  - `scoring`: a 100-pt composite: skills overlap (0–40, Jaccard), title/seniority (0–20), location fit (0–15), salary alignment (0–10), semantic similarity (0–15). Every score is explainable.
  - `tailoring`: deterministic Facts Graph extraction builds an anti-hallucination contract before calling any LLM, ensuring generated text only references verified experience.
  - `humanizer`: a style-preserving rewrite pass that strips AI tell-tale patterns.
  - `adapters`: Playwright-backed ATS submit flows for Greenhouse, Lever, Ashby, Workday, and LinkedIn, with a HITL blocker queue that snapshots state on CAPTCHA/MFA and resumes from checkpoint.
  - `replies`: ingests Gmail/IMAP and classifies responses (offer, interview, rejection, ghost).
  - `sponsorship`: bundles 5 live government registries.
  - `referrals`: ingests LinkedIn CSV exports, scores connection strength, and drafts intro messages.

The LLM backend is fully pluggable: TemplateLLM (zero-dep default), Ollama, OpenAI, Anthropic, or any compatible API, swapped via an env var.
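The skills component of the composite can be sketched as below — a minimal illustration of the Jaccard-scaled band, with hypothetical names (`ScoreBreakdown`, `skills_component`), not the production `scoring` API:

```python
from dataclasses import dataclass


@dataclass
class ScoreBreakdown:
    """One field per band of the 100-pt composite."""
    skills: float    # 0-40, Jaccard overlap
    title: float     # 0-20, title/seniority
    location: float  # 0-15, location fit
    salary: float    # 0-10, salary alignment
    semantic: float  # 0-15, semantic similarity

    @property
    def total(self) -> float:
        return self.skills + self.title + self.location + self.salary + self.semantic


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: |intersection| / |union| (0.0 when both sets are empty)."""
    return len(a & b) / len(a | b) if a | b else 0.0


def skills_component(job_skills: set[str], resume_skills: set[str]) -> float:
    # Scale Jaccard similarity into the 0-40 skills band
    return 40 * jaccard(job_skills, resume_skills)
```

Because each band is computed and stored separately, the final number decomposes into human-readable reasons — which is what makes every score explainable.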

Code Walkthrough

3-step walk-through of the production implementation — file paths and intent shown above each block.

  1. Step 1 of 3

    Cross-board deduplication

    ummidvar/discovery/dedupe.py

    A single role surfaces on LinkedIn, Indeed, Naukri, and the company's own careers page simultaneously. Without deduplication the scorer wastes compute on duplicates and the user sees the same job five times. Canonical fingerprinting collapses them before anything else runs.

    python
    import re
    from hashlib import sha1
    from typing import Iterable
    # Job is the core domain model (see ummidvar/core); completeness() counts filled fields
    
    def fingerprint(job: Job) -> str:
        """Stable fingerprint for cross-board deduplication."""
        title = re.sub(r"[^a-z0-9 ]", "", job.title.lower())
        title = re.sub(r"\b(senior|sr|jr|junior|lead|principal)\b", "", title).strip()
        company = job.company.lower().strip()
        # Location is often missing or inconsistent — fall back to remote flag
        location = (job.location or "remote").lower().split(",")[0].strip()
        return sha1(f"{title}|{company}|{location}".encode()).hexdigest()
    
    def deduplicate(jobs: Iterable[Job]) -> list[Job]:
        seen: dict[str, Job] = {}
        for job in jobs:
            fp = fingerprint(job)
            existing = seen.get(fp)
            # Prefer the posting with more fields filled in
            if existing is None or job.completeness() > existing.completeness():
                seen[fp] = job
        return list(seen.values())
    Takeaway

    Normalise title (strip seniority prefixes) + company + location city → one hash per logical role, regardless of which board it came from.

  2. Step 2 of 3

    Facts Graph — structural anti-hallucination

    ummidvar/tailoring/facts_graph.py

    Cover letters must never claim experience the user doesn't have. Instead of trusting an LLM prompt, every generated sentence is checked against a pre-extracted graph of verified facts. Ungrounded claims raise an error, not a warning.

    python
    from dataclasses import dataclass
    
    class HallucinationError(ValueError):
        """Raised when generated text makes claims not grounded in extracted facts."""
    
    @dataclass
    class FactsGraph:
        skills: set[str]
        roles: list[RoleFact]          # title, company, dates, bullets
        education: list[EducationFact]
        achievements: list[str]        # quantified results only
    
        def verify_claim(self, claim: str) -> bool:
            """Returns True only if the claim is grounded in an extracted fact."""
            tokens = set(claim.lower().split())
            grounded = (
                tokens & self.skills
                or any(r.matches_tokens(tokens) for r in self.roles)
                or any(e.matches_tokens(tokens) for e in self.education)
            )
            return bool(grounded)
    
        def assert_grounded(self, text: str) -> None:
            sentences = [s.strip() for s in text.split(".") if s.strip()]
            violations = [s for s in sentences if not self.verify_claim(s)]
            if violations:
                raise HallucinationError(f"Ungrounded claims detected: {violations}")
    Takeaway

    Hallucination isn't handled by prompt engineering — it's a hard invariant enforced after generation. Ungrounded text never reaches the user.

  3. Step 3 of 3

    Playwright ATS adapter with HITL hand-off

    ummidvar/adapters/greenhouse.py

    Real ATS flows hit CAPTCHAs, MFA challenges, and custom screening questions. Rather than fail silently or pretend to solve CAPTCHAs, the adapter snapshots browser state and queues the session for human completion — resuming from checkpoint once the user finishes.

    python
    async def submit(self, application: Application) -> SubmitResult:
        async with async_playwright() as pw:
            browser = await pw.chromium.launch(headless=True)
            ctx = await browser.new_context(storage_state=application.session_state)
            page = await ctx.new_page()
    
            await page.goto(application.apply_url, wait_until="networkidle")
            await self._fill_basic_fields(page, application.resume)
            await self._upload_resume(page, application.resume.pdf_path)
    
            if await self._captcha_detected(page):
                state = await ctx.storage_state()
                snapshot = await page.screenshot(full_page=True)
                self.hitl_queue.enqueue(
                    BlockedSubmission(
                        application_id=application.id,
                        reason="captcha",
                        state=state,
                        snapshot=snapshot,
                    )
                )
                return SubmitResult.blocked("captcha")
    
            await page.click('button[type="submit"]')
            return SubmitResult.ok()
    Takeaway

    No CAPTCHA-solving services, no brittle retries — when the browser hits a wall, the state is frozen and a human finishes the job. Silent failures are eliminated by design.

Results

Ummidvar is production-deployable via Docker Compose in a single command. The codebase spans 6,600+ lines across 11 packages with 286 passing tests (unit + integration) covering every major component — Facts Graph extraction, composite scoring, LLM adapters, ATS submission adapters, reply classifier, sponsorship registry loaders, and referral finder. The discovery layer aggregates and deduplicates from LinkedIn, Indeed, Glassdoor, ZipRecruiter, Google Jobs, Naukri, and direct career pages. The Playwright engine handles Greenhouse, Lever, Ashby, Workday, and LinkedIn ATS flows with HITL blocker queue for CAPTCHAs. The sponsorship layer cross-references 5 government registries (UK · AU · NZ · CA · EU) live, refreshed weekly via GitHub Actions. Apache-2.0 licensed for future hosted-tier expansion.
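The reply classifier mentioned above can be sketched as a keyword heuristic — a deliberately simplified stand-in for the `replies` package (keyword lists and function name are illustrative; ghosting is detected by absence of a reply, so it never appears as a label here):

```python
def classify_reply(subject: str, body: str) -> str:
    """Rough triage of an inbound application email: offer, interview, or rejection."""
    text = f"{subject} {body}".lower()
    if any(k in text for k in ("pleased to offer", "offer letter", "compensation package")):
        return "offer"
    if any(k in text for k in ("interview", "schedule a call", "your availability")):
        return "interview"
    if any(k in text for k in ("unfortunately", "not moving forward", "other candidates")):
        return "rejection"
    return "unclassified"
```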

Gallery & Demos

Application Tracker Dashboard

Kanban board showing job applications across stages: Discovered, Ranked, Tailored, Applied, Interview Scheduled, Offer Received, and Rejected.

Interested in this work?

Full architecture walkthrough and code review available during interviews.