Data Quality Engineering
EPAM Systems
Data Science, Quality Assurance
Colombia · Amp. Gabriel Hernández, Ciudad de México, CDMX, Mexico · Remote
Posted on Nov 19, 2025
Responsibilities
- Lead the development and execution of data quality strategies, ensuring accuracy and reliability across data products and processes
- Drive data quality initiatives while promoting best practices across teams and projects
- Develop and implement advanced testing frameworks and methodologies to meet enterprise data quality standards
- Manage and prioritize complex data quality tasks, maintaining efficiency under tight deadlines and competing demands
- Design and maintain comprehensive testing strategies for evolving system architectures and data pipelines
- Provide guidance on resource allocation and prioritize testing efforts to align with business and regulatory requirements
- Establish and continuously improve a data quality governance framework to ensure compliance with industry standards
- Build, scale, and optimize automated data quality validation pipelines for production environments
- Collaborate with cross-functional teams to address infrastructure challenges and enhance system performance
- Mentor junior team members and maintain detailed documentation for test strategies, plans, and frameworks
Requirements
- At least 3 years of professional experience in Data Quality Engineering
- Advanced programming skills in Python for data validation and automation
- Expertise in Big Data platforms, including tools from the Hadoop ecosystem such as HDFS, Hive, and Spark, as well as modern streaming platforms like Kafka, Flume, or Kinesis
- Practical experience with NoSQL databases such as Cassandra, MongoDB, or HBase, managing large-scale datasets
- Proficiency in data visualization tools like Tableau, Power BI, or TIBCO Spotfire to support analytics and decision-making
- Extensive experience with cloud platforms such as AWS, Azure, or GCP, with a strong understanding of multi-cloud architectures
- Advanced knowledge of relational databases and SQL (PostgreSQL, MSSQL, MySQL, Oracle) in high-volume, real-time environments
- Proven experience in implementing and scaling ETL processes using tools like Talend, Informatica, or similar platforms
- Familiarity with deploying and integrating Master Data Management (MDM) tools into workflows, as well as performance testing tools like JMeter
- Advanced experience with version control systems such as Git, GitLab, or SVN, and expertise in automation for large-scale systems
- Comprehensive understanding of modern testing methodologies, including test-driven, data-driven, and behavior-driven testing (TDD, DDT, BDT), and their application in data environments
- Experience with CI/CD practices, including pipeline implementation using tools like Jenkins or GitHub Actions
- Strong analytical and problem-solving skills, with the ability to translate complex datasets into actionable insights
- Exceptional English communication skills (B2 level or higher), with experience engaging stakeholders and leading discussions
Nice to have
- Hands-on experience with additional programming languages such as Java or Scala, or advanced Bash scripting, for production data solutions
- Advanced knowledge of XPath and its use in data validation and transformation workflows
- Experience designing custom data generation tools and synthetic data techniques for advanced testing scenarios
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn