ETL Automation QE – ETL Testing, Python, SQL, Unix/Linux, AWS S3, Hadoop, Parquet
Toronto, ON – Hybrid, 4 days onsite
12-Month Contract
Key Responsibilities
- Design, develop, and maintain automated test frameworks for ETL workflows, ensuring data accuracy, consistency, and compliance with business requirements.
- Write complex SQL queries to validate data transformations, integrity, and quality across databases and data lakes (e.g., Hadoop, Parquet).
- Implement Python scripts to automate testing, monitor pipeline performance, and generate reports on data quality metrics.
- Collaborate with engineering teams to integrate automated tests into CI/CD pipelines using GitHub and GitHub Actions.
- Manage data storage and retrieval processes in AWS S3, ensuring scalability and security.
- Apply QE methodologies (e.g., test planning, risk analysis, defect tracking) to identify and resolve issues in ETL pipelines.
- Develop and execute test cases for API integrations (nice to have) and validate end-to-end data workflows.
- Document test strategies, results, and recommendations for process improvements.
- Work with cross-functional teams to troubleshoot data discrepancies and optimize ETL processes.
Must-Have Qualifications
Technical Skills:
- Expertise in SQL for data validation and complex querying.
- Proficiency in Python for scripting and automation.
- Hands-on experience with Unix/Linux, cloud platforms (AWS), Hadoop, Parquet, and AWS S3.
- Familiarity with GitHub for version control and collaboration.
- Proven experience in automated ETL testing frameworks.
Soft Skills:
- Strong communication skills to articulate technical issues and solutions to diverse stakeholders.
- Analytical mindset with a focus on quality and attention to detail.
Methodologies:
- Understanding of QE principles, including test design, execution, and reporting.
Nice-to-Have Qualifications
- Experience with GitHub Actions for CI/CD pipeline automation.
- Knowledge of automated API testing tools (e.g., Postman, REST-assured).
- Familiarity with AI/ML tools (e.g., Copilot) for code optimization or data quality enhancements.
- Basic understanding of machine learning concepts as they relate to data pipelines.
Pay: $65.00-$70.00 per hour
Work Location: Hybrid remote in Toronto, ON (Toronto District)