AI Engineer, Senior Staff

Pimpri-Chinchwad, Maharashtra, India

Full-time

Posted: 7 days ago

Lattice Overview

There is energy here… energy you can feel crackling at any of our international locations. It’s an energy generated by enthusiasm for our work, for our teams, for our results, and for our customers. Lattice is a worldwide community of engineers, designers, and manufacturing operations specialists, in partnership with world-class sales, marketing, and support teams, developing programmable logic solutions that are changing the industry. Our focus is on R&D, product innovation, and customer service, and to that focus we bring total commitment and a keenly competitive spirit.

Energy feeds on energy. If you flourish in a fast-paced, results-oriented environment, if you want to achieve individual success within a “team first” organization, and if you believe you can contribute and succeed in a demanding yet collegial atmosphere, then Lattice may well be just what you’re looking for.

Key Responsibilities

  • Select, evaluate, and benchmark large foundation models; lead model distillation/quantization for efficiency.
  • Prepare high‑quality datasets; perform data curation, filtering, labeling, and quality analysis.
  • Tune task‑specific cost functions, hyperparameters, and training loops for optimal performance.
  • Build domain‑aware RAG systems with strong retrieval metrics, evaluation pipelines, and citations.
  • Architect and deploy model‑serving pipelines using vLLM / TGI / Triton, including batching, caching, and streaming.
  • Lead experimentation cycles: dataset → training → evaluation → deployment → monitoring.
  • Mentor junior engineers and influence model architecture decisions.
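One of the responsibilities above involves building domain-aware RAG systems, where chunking strategy is a key design lever. As a minimal illustrative sketch (not Lattice's actual pipeline), fixed-size chunking with overlap looks like this; the window and stride sizes are arbitrary assumptions for the example:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap,
    a common baseline chunking strategy for RAG retrieval."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        # Each chunk repeats the last `overlap` characters of the
        # previous one, so retrieval is less likely to split a fact
        # across chunk boundaries.
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Production RAG systems typically chunk on token or sentence boundaries rather than raw characters, and tune `chunk_size`/`overlap` against retrieval-evaluation metrics.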

Required Qualifications

  • 8–12 years total experience in software/ML engineering, with 3–5+ years hands‑on experience training or fine‑tuning LLMs or large models.
  • Strong expertise in LLM fine‑tuning techniques (LoRA, QLoRA, PEFT, SFT, RLHF).
  • Deep understanding of transformers, attention mechanisms, tokenization, embeddings, and model architecture trade‑offs.
  • Strong experience with PyTorch, the Hugging Face ecosystem, DeepSpeed/FSDP, and vector DBs (FAISS, Milvus, Chroma).
  • Experience building and optimizing RAG pipelines, including embedding optimization, chunking strategies, and retrieval evaluation.
  • Hands‑on experience with model serving using vLLM/TGI/Triton, GPU utilization optimization, and inference‑time acceleration.
  • Practical experience with quantization (INT4/FP8), sparse/structured pruning, and distillation.
  • Strong Python engineering fundamentals and familiarity with MLOps tooling (Weights & Biases, MLflow, Ray, Airflow).
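The qualifications above include practical quantization experience (INT4/FP8). As a hedged illustration of the core idea only, here is a symmetric int8 quantize/dequantize round-trip in plain Python; real pipelines use per-channel calibrated scales and optimized kernels (e.g. bitsandbytes, TensorRT), not scalar loops:

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric int8 quantization: map floats into [-127, 127]
    using a single absmax-derived scale factor."""
    absmax = max(abs(w) for w in weights)
    scale = absmax / 127.0 if absmax else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize_int8(codes: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the int8 codes;
    rounding error is bounded by half the scale."""
    return [c * scale for c in codes]
```

The same scale-and-round pattern extends to INT4 (range [-7, 7]), where the coarser grid makes calibration and outlier handling far more important.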