Generative AI Operations
EPAM Systems
Software Engineering, Operations, Data Science
Portugal · Remote
Posted on Nov 19, 2025
Responsibilities
- Design scalable AI/ML workloads aligned with organizational goals
- Develop and maintain reproducible Machine Learning pipelines
- Deploy AI models to production using model serving infrastructure
- Implement observability frameworks for monitoring and logging AI services
- Define infrastructure requirements for MLOps pipelines and components
- Collaborate with infrastructure engineers to implement the required infrastructure
- Mentor and coach team members to promote best practices and continuous improvement
- Coordinate with cross-functional teams including data scientists and engineers
- Optimize ML workloads for performance and scalability
- Ensure compliance with security and data privacy standards
- Evaluate new tools and technologies to enhance AI service delivery
- Document system architectures and processes for knowledge sharing
- Troubleshoot and resolve production issues related to AI services
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or related field
- 3+ years of experience in AI, Machine Learning, data engineering, software development, or cloud infrastructure
- Strong proficiency in Python and experience with AI/ML frameworks such as PyTorch, TensorFlow, Hugging Face, or scikit-learn
- Experience with model inference runtimes such as vLLM, MLServer, or TorchServe
- Proficiency in containerization and orchestration tools including Docker and Kubernetes
- Experience defining and implementing infrastructure requirements for ML pipelines
- Strong problem-solving skills and ability to work in agile, cross-functional teams
- Good communication and mentoring skills to support team development
- B2+ level English proficiency
Nice to have
- Experience with cloud platforms such as Azure, AWS, or Google Cloud
- Knowledge of Infrastructure as Code (IaC) practices
- Familiarity with experiment tracking systems and pipeline orchestration tools
We offer
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn