PySpark Data Engineer/ETL Developer - Kaygen Inc.
Irvine, CA 92618
About the Job
KAYGEN is an emerging leader in providing top talent for technology-based staffing services. We specialize in providing high-volume contingent staffing, direct-hire staffing, and project-based solutions to companies worldwide, ranging from startups to Fortune 500 companies and Managed Service Providers (MSPs), across a wide variety of industries.
Job Description:
We are seeking a highly skilled and motivated Data Engineer/ETL Developer to join our dynamic team. The ideal candidate will have expertise in utilizing technologies such as Python, PySpark (Main Technology), Airflow, AWS, S3, QTest, GitHub, and SQL to design, develop, and maintain efficient data pipelines and ETL processes.
Responsibilities:
ETL Development:
- Design, develop, and implement scalable and efficient ETL processes using Python and PySpark.
- Collaborate with cross-functional teams to gather and understand data requirements.
- Ensure the quality, reliability, and performance of data pipelines.
Workflow Automation:
- Utilize Airflow to create and manage workflow automation for scheduling, monitoring, and orchestrating data tasks.
- Implement best practices for scheduling and dependencies within data workflows.
Cloud Integration:
- Work with AWS services, particularly S3, to store and retrieve large volumes of data.
- Implement and optimize data storage solutions on the cloud platform.
Testing and Quality Assurance:
- Collaborate with the testing team to ensure data quality through the use of QTest and other testing tools.
- Develop and execute test plans for ETL processes.
Version Control and Collaboration:
- Use GitHub for version control, collaborating with other developers to manage code repositories effectively.
- Participate in code reviews to ensure code quality and adherence to coding standards.
Database Management:
- Write complex SQL queries and optimize database performance.
- Work with various databases to extract, transform, and load data efficiently.
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or related field.
- Proven experience in developing ETL processes and data pipelines.
- Proficiency in Python, PySpark, Airflow, AWS (especially S3), QTest, GitHub, and SQL.
- Strong understanding of data modeling and database design concepts.
- Experience with version control and collaborative development workflows.
- Excellent problem-solving and troubleshooting skills.
- Strong communication skills with the ability to work in a collaborative team environment.
Preferred Skills:
- Familiarity with big data technologies and frameworks.
- Experience with data warehousing and business intelligence tools.
- Knowledge of best practices in data security and compliance.
At KAYGEN, we are always looking for dynamic, talented, and experienced individuals. We invite you to join our team of talented IT professionals, consulting at client locations across the globe. Our culture is team-oriented; we strive to stand by our core values of respect, honesty, and integrity. Our team of experienced staffing experts will work with you to find you the best opportunity. For more information, please visit us at www.kaygen.com.
Benefits:
- Healthcare Insurance
- Vision and Dental Insurance
- 401(k) Retirement Plan
- Free Life Insurance
- Vacation Time Off
- Sick Time Off
- Family Medical Leave (FMLA)
- Mentorship Program
- Referrals
- Family and Wellness benefits
- Continuous Growth and Career Development
Source: Kaygen Inc.