Principal AI Evaluation Engineer
Backbase
Software Engineering, Data Science
Posted on Dec 9, 2025
As a Principal AI Evaluation Engineer, you will lead the evaluation efforts in our AI-powered SDLC team. You will own the evaluation strategy for AI assistants and agentic workflows, ensuring they are reliable, observable, and safeguarded with strong guardrails. Beyond hands-on work, you will mentor engineers, lead triage and reporting, and make evaluation a cornerstone of release decisions.
What you'll do
- Define and lead the evaluation strategy and roadmap for our AI-powered SDLC core product.
- Build and oversee evaluation pipelines and guardrails.
- Build and maintain evaluation datasets (synthetic and real project data) to benchmark AI behavior.
- Analyze evaluation results, identify gaps, and produce clear, actionable reports for engineering and product stakeholders.
- Build a culture of innovation and excellence, encouraging continuous improvement and adoption of best practices in AI evaluation and deployment.
- Collaborate with cross-functional teams to integrate evaluation insights into development.
Who you are
- Strong understanding of software engineering principles and the software development lifecycle (SDLC).
- Hands-on experience with test design, test management, observability, and data analysis.
- Proficiency in Python (or another scripting language) for automating evaluations.
- Familiarity with AI agent evaluation metrics (faithfulness, answer relevancy, contextual accuracy, tool correctness).
- Excellent analytical and problem-solving skills.
- Strong communication and collaboration abilities, able to work with cross-functional teams and stakeholders.
- Demonstrated ability to mentor engineering talent, fostering collaboration and technical excellence.
- (Nice to have) Experience with evaluation frameworks, RAG systems, or agentic workflows.