Advanced Data Engineer - Apidel Technologies
Blue Ash, OH 45242
About the Job
Job Description:
As an Advanced Data Engineer, you will have the opportunity to lead the development of innovative data solutions, enabling the effective use of data across the organization. You will be responsible for designing, building, and maintaining robust data pipelines and platforms to meet business objectives, treating data as a strategic asset. Your role will involve collaboration with cross-functional teams, leveraging cutting-edge technologies, and ensuring scalable, efficient, and secure data engineering practices. A strong emphasis will be placed on expertise in GCP, Vertex AI, and advanced feature engineering techniques.
Requirements:
4+ years of professional Data Development experience.
4+ years of experience with SQL and NoSQL technologies.
3+ years of experience building and maintaining data pipelines and workflows.
5+ years of experience developing with Java.
2+ years of experience developing with Python.
3+ years of experience developing Kafka solutions.
2+ years of experience in feature engineering for machine learning pipelines.
Experience with GCP services such as BigQuery, Vertex AI Platform, Cloud Storage, AutoMLOps, and Dataflow.
Experience with CI/CD pipelines and processes.
Experience with automated unit, integration, and performance testing.
Experience with version control software such as Git.
Full understanding of ETL and Data Warehousing concepts.
Strong understanding of Agile principles (Scrum).
Additional Qualifications:
Knowledge of Structured Streaming (Spark, Kafka, EventHub, or similar technologies).
Experience with GitHub SaaS/GitHub Actions.
Familiarity with Databricks concepts.
Experience with PySpark and Spark development.
Key Responsibilities
Provide Technical Leadership: Offer technical leadership to ensure alignment across ongoing projects and facilitate collaboration across teams to solve complex data engineering challenges.
Build and Maintain Data Pipelines: Design, build, and maintain scalable, efficient, and reliable data pipelines to support data ingestion, transformation, and integration across diverse sources and destinations, using tools such as Kafka and Databricks.
Drive Digital Innovation: Leverage innovative technologies and approaches to modernize and extend core data assets, including SQL-based, NoSQL-based, cloud-based, and real-time streaming data platforms.
Implement Feature Engineering: Develop and manage feature engineering pipelines for machine learning workflows, utilizing tools like Vertex AI, BigQuery ML, and custom Python libraries.
Implement Automated Testing: Design and implement automated unit, integration, and performance testing frameworks to ensure data quality, reliability, and compliance with organizational standards.
Optimize Data Workflows: Optimize data workflows for performance, cost efficiency, and scalability across large datasets and complex environments.
Mentor Team Members: Mentor team members in data principles, patterns, processes, and practices to promote best practices and improve team capabilities.
Draft and Review Documentation: Draft and review architectural diagrams, interface specifications, and other design documents to ensure clear communication of data solutions and technical requirements.
Cost/Benefit Analysis: Present opportunities with cost/benefit analysis to leadership, guiding sound architectural decisions for scalable and efficient data solutions.
Note to Vendors
Vendors, we need a fast turnaround here. This will be a single interview, so please make sure your candidates are packaged tightly.
Stand-out skills: strong communication, agile thinking, the ability to understand and present options, situational savvy, and a positive, accomplishment-driven attitude. We don't have a lot of time to teach someone.
Project the person will be supporting: They will be working on the Personalization team, under the Recommendations Core Pod.
Work Location (in office, hybrid, remote): Onsite in Cincinnati preferred, but open to remote.
Interview process and when will it start: ASAP; will be a single panel interview.
Prescreening Details: Automated; 5 questions and a game.
When do you want this person to start: ASAP