Staff Data Engineer
Intuit
Company Overview
Intuit is the global financial technology platform that powers prosperity for the people and communities we serve. With approximately 100 million customers worldwide using products such as TurboTax, Credit Karma, QuickBooks, and Mailchimp, we believe that everyone should have the opportunity to prosper. We never stop working to find new, innovative ways to make that possible.
Job Overview
Come join the Intuit Data Platform (IDP) team as a Staff Engineer. The IDP team owns the Intuit Analytics Platform, the foundation of big data at Intuit, which enables real-time data ingestion, cataloging, analytics, and machine learning on all of Intuit's data. With Intuit's customer base growing rapidly year over year, the volume of data we handle is ever increasing, and the data engineering we do in IDP helps Intuit keep pace with this volume and leverage it for machine learning and data-driven product innovation. We are building a best-in-class ingestion (real-time and batch) and cataloging engine to index and catalog both data and its metadata, and we believe in using open source technologies to solve these problems and contributing back to the community. If building a platform that directly serves data scientists and data analysts excites you, come join us.
Responsibilities
- Architect, design and build fault-tolerant & scalable big-data platforms and solutions primarily based on open source technologies
- Build solid & scalable architecture to address normalization, lineage, data governance, ontology and data discoverability use cases
- Design solutions that involve complex, multi-system integration, possibly across BUs or domains
- Work with analysts and data scientists to identify datasets needed for deep customer insights and for building operational propensity models
- Write heavy hands-on code in the Hadoop ecosystem (Java MapReduce, Spark, Scala, HBase, Hive) and build frameworks to support streaming data pipelines
- Work on technologies related to NoSQL, SQL and In-Memory databases
- Conduct code-reviews to ensure code quality, consistency and best practices adherence
- Drive alignment between enterprise architecture and business needs
- Conduct quick proofs of concept (POCs) for feasibility studies and take them to production
- Work with the data catalog team and architects to catalog all data sources at Intuit
- Lead fast moving development teams using agile methodologies
- Lead by example, demonstrating best practices for unit testing, CI/CD, performance testing, capacity planning, documentation, monitoring, alerting, and incident response
- Design dimensional models appropriate for customer business use cases
Qualifications
- 12+ years of relevant experience with at least 5+ years in the big-data domain
- Experience architecting end-to-end (E2E) ecosystems for big-data and analytics platforms
- Expert level experience in building fault-tolerant & scalable big-data platforms and big-data solutions primarily based on Hadoop ecosystem
- Expert level experience with Java and Scala programming
- Expert level experience designing high throughput data services
- Good knowledge of machine learning and AI
- Experience with big-data technologies (Hive, HBase, Spark, Kafka, Storm, MapReduce, HDFS, Splunk, Zookeeper, MemSQL, Cassandra, Redshift, GraphDB) and an understanding of the concepts and technology ecosystem around both real-time and batch processing in Hadoop
- Strong communication skills
- Intermediate proficiency in Python or R programming
- BE/BTech/MS in Computer Science (or equivalent)
- Effective listening and strong collaboration skills; able to lead change by example and through influence