Data Software Engineering
EPAM Systems
Software Engineering
Portugal · Remote
Posted on Nov 19, 2025
Responsibilities
- Build and optimize data pipelines with automated testing, lineage tracking, and privacy-by-design principles
- Design, implement, and maintain data warehouses using SQL, Python, and ELT processes
- Set up monitoring, testing, backup, and recovery mechanisms to ensure data reliability and availability
- Enforce governance, security standards, role-based access control (RBAC), and data masking protocols
- Translate business needs into scalable and reusable data solutions
- Standardize ingestion and transformation processes for machine learning workflows, including feature preparation and monitoring of drift and data quality
- Develop and deploy scalable machine learning models and pipelines
- Create reusable components and maintain detailed documentation for data systems
- Promote reliability, maintainability, and scalability in data workflows while optimizing pipelines for AI training and evaluation
- Advocate for best practices and reusable patterns across teams
- Contribute to onboarding processes and build communities of practice within the organization
- Resolve complex challenges, perform root cause analyses, and enhance engineering standards
- Make informed decisions, escalate issues when necessary, and improve scalability and cost efficiency of data systems
- Align and influence cross-functional teams with minimal supervision
- Mentor peers and support the development of team capabilities
Requirements
- At least 3 years of proven experience in Data Engineering
- Expertise in SQL and ELT design, including core SQL, change data capture (CDC) patterns, and optimization techniques
- Proficiency in Snowflake performance tuning, secure data sharing, role-based access, and data masking
- Experience in developing Matillion jobs, shared components, and orchestration processes
- Knowledge of CI/CD workflows using Bitbucket and Jenkins for building, testing, deploying, and managing environments and versions
- Familiarity with data observability practices, including logging, alerts, and automated checks
- Hands-on experience with feature engineering for machine learning, dataset validation, and management of data contracts and roles
- Practical experience with Amazon EMR / Apache Spark for distributed data processing
- Proficiency in using AWS for scalable data ingestion, processing, and preparation for machine learning pipelines
- Ability to apply retrieval-augmented generation (RAG) principles for policy-aware grounding, and to design logical and physical data models
- Skilled in maintaining documentation, APIs, and reusable patterns while supporting onboarding and best practices
- Fluent English communication skills, both written and spoken, at a B2+ level
Nice to have
- Knowledge of real-time data streaming technologies like Apache Kafka or AWS Kinesis
- Exposure to Big Data ecosystems and tools such as Hadoop or Hive
We offer
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn