Senior AI Engineer, Document Intelligence & LLM Infrastructure (US Shift)

India, Rajasthan, Jaipur

Full-time

Posted on: 6 days ago

Job Title: Senior AI Engineer, Document Intelligence & LLM Infrastructure

Nature: Contract
Time Zone: US Shift
Working Hours: 5:30 pm to 2:30 am
Work Location: Remote
Experience: 6 to 8+ years
Contract Duration: 6 months
Number of Positions: 1

Role Overview

We are hiring a senior individual contributor to own and scale large-scale, OCR-driven document intelligence systems powered by self-hosted LLMs.

This is a deeply hands-on engineering role focused on production systems that process long-form documents (200+ pages), extract structured data deterministically, and run on optimized GPU-backed inference infrastructure.

You will work closely with the AI leadership team but will independently own the architecture, performance, and reliability of document processing pipelines.

Core Responsibilities

1. Large-Scale Document Intelligence Pipelines
○ Design and build end-to-end pipelines for processing long-form, OCR-heavy documents
○ Own PDF ingestion, layout-aware parsing, and multi-page document assembly
○ Implement robust chunking, segmentation, and metadata tracking across long documents
○ Handle exception detection, retries, and deterministic failure handling
○ Optimize systems to reliably process 200+ page documents at scale

2. OCR & Structured Extraction Systems
○ Work with OCR engines (Tesseract, PaddleOCR, layout-aware models, vision-language models)
○ Build layout-aware extraction systems using bounding boxes and structural metadata
○ Implement deterministic schema validation and cross-field consistency checks
○ Reduce reliance on manual QA through rule-based validation layers
○ Ensure traceability from each extracted field back to its source span

3. Self-Hosted LLM Inference (Production Ownership)
○ Deploy and operate open-source LLMs using:
   § vLLM
   § Hugging Face TGI
   § GPU-backed serving stacks
○ Tune inference performance:
   § KV cache management
   § Batching
   § Context window control
   § Throughput vs. latency trade-offs
○ Monitor and optimize GPU utilization and cost per request
○ Own production reliability of LLM serving infrastructure

4. Deterministic Validation & Control Systems
○ Design validation layers outside the LLM
○ Implement schema enforcement, rule engines, invariants, and rejection logic
○ Build automated exception routing without default human review
○ Ensure auditability and reproducibility of extraction results
○ Create measurable correctness guarantees for high-stakes use cases

5. Production Engineering & Scale
○ Design systems that handle:
   § Large document volumes
   § Concurrency
   § Failure states
   § Observability and monitoring
○ Build logging, tracing, and metrics around document processing pipelines
○ Collaborate with cross-functional teams to ship production-grade AI systems

Required Experience

○ 6+ years of hands-on Python engineering
○ Proven production experience building OCR-driven document pipelines
○ Experience handling long-form PDFs (100+ pages)
○ Strong experience with:
   § vLLM or Hugging Face TGI
   § GPU-based LLM serving
   § Open-source LLMs (LLaMA, Qwen, Mistral, etc.)
○ Experience building deterministic validation systems (schema + rule enforcement)
○ Strong debugging and systems-level thinking
○ Ability to clearly articulate system trade-offs and business impact

Strongly Preferred

○ Experience with layout-aware models (LayoutLM, DocFormer, vision-language models)
○ Experience optimizing GPU cost and inference performance
○ Experience in regulated domains (healthcare, finance, compliance)
○ Familiarity with document-heavy workflows such as loan processing, underwriting, or claims