Senior MLOps Engineer Job at DeepRec.ai, San Jose, CA

  • DeepRec.ai
  • San Jose, CA

Job Description

Senior MLOps Engineer

We are hiring an MLOps Engineer for a fast-moving AI startup that is building a world-class AI-powered video platform.

We are looking for a skilled and hands-on MLOps Engineer to join their growing team. You will play a critical role in deploying, scaling, and maintaining their machine learning infrastructure, supporting a range of tools that enable the controlled generation of high-quality animated videos.

Key Responsibilities

  • Design, deploy, and maintain scalable training and data-processing pipelines on distributed compute clusters (e.g., Slurm, Kubernetes, or cloud-native equivalents).
  • Optimize inference systems for latency and cost in a production setting.
  • Collaborate closely with ML researchers and engineers to productionize deep learning models.
  • Implement robust monitoring, logging, and alerting systems for model performance and infrastructure reliability.
  • Automate model testing, validation, and deployment processes across staging and production environments.
  • Ensure efficient usage of compute resources, including GPU clusters, and help identify bottlenecks or cost-saving opportunities.

Requirements

  • Proven experience in MLOps, ML infrastructure, or related roles.
  • Deep expertise in deploying and maintaining ML training pipelines on distributed systems.
  • Strong knowledge of inference optimization techniques, especially in reducing latency and cost at scale.
  • Proficiency with cloud platforms (AWS, GCP, Azure) and with containerization and orchestration tools (Docker, Kubernetes).
  • Experience working with GPU scheduling, distributed training (e.g., PyTorch DDP), and model serving frameworks (e.g., Triton, TorchServe).
  • Familiarity with CI/CD for ML workflows.
  • Strong Python skills and experience with ML/DL frameworks like PyTorch or TensorFlow.

Bonus Points

  • Experience working in the creative media or animation industry.
  • Exposure to video processing, generative AI, or large-scale content production systems.
  • Experience collaborating with research teams or integrating research code into production pipelines.

Please apply for more information.
