Data Engineer with ML Engineering Experience - Highbrow LLC
Atlanta, GA 30301
Job Description
Education and Experience:
· Bachelor's degree in Computer Science, Data Science, Engineering, or a related field. A Master's degree or a relevant certification (e.g., Google Professional Data Engineer) is a plus.
· 5+ years of experience in data engineering, with at least 2 years of experience in machine learning engineering or deploying ML models in production.
· Proven experience in building and maintaining scalable data pipelines, data warehouses, and infrastructure to support ML workflows.
Technical Skills:
· Proficiency in big data frameworks and tools such as Apache Spark, Hadoop, Kafka, and Airflow.
· Advanced skills in data modeling, ETL processes, and data pipeline automation, with a focus on performance and scalability.
· Experience with cloud platforms (AWS, GCP, Azure) and their data services, such as AWS Glue, Google BigQuery, or Azure Data Lake.
· Strong programming skills in Python and SQL, and experience with query optimization.
· Familiarity with ML frameworks (e.g., TensorFlow, PyTorch, Scikit-Learn) and libraries for building and testing machine learning models.
· Knowledge of containerization and orchestration tools (Docker, Kubernetes) for deploying and managing ML models in production.
Machine Learning Engineering Skills:
· Experience in feature engineering, data preprocessing, and building data pipelines to support ML training and inference.
· Knowledge of MLOps best practices for continuous integration, deployment, and monitoring of ML models in production.
· Familiarity with model lifecycle management tools such as MLflow, TFX, or Databricks to streamline ML workflows.
· Strong understanding of data versioning, reproducibility, and monitoring of ML models to ensure model integrity over time.
· Ability to work with structured and unstructured data, with hands-on experience in NLP, computer vision, or time-series data for machine learning applications.
Data Engineering Skills:
· Proficiency in data storage and warehousing solutions (e.g., Snowflake, Redshift, BigQuery) for scalable data architecture.
· Understanding of data governance, quality, and security best practices, including data lineage and compliance with regulations.
· Experience with data lake architecture and data partitioning strategies to support large-scale data analysis.
· Ability to optimize data infrastructure for low-latency access and high throughput, especially for real-time ML applications.
Communication and Collaboration Skills:
· Strong communication skills with the ability to work closely with data scientists, ML engineers, and product teams to align data infrastructure with business requirements.
· Collaborative mindset, with experience working in cross-functional teams to deliver end-to-end data and ML solutions.
· Ability to document data workflows, pipelines, and ML infrastructure, ensuring transparency and ease of knowledge sharing.
· Proven ability to understand and respond to the needs of diverse stakeholders, from technical teams to business leaders.
Additional Qualifications:
· Familiarity with A/B testing, experimentation frameworks, and data-driven evaluation of ML models.
· Knowledge of data privacy and security regulations (e.g., GDPR, CCPA) for responsible data management and ML practices.
· Experience in specific industries, such as telecommunications, is a plus.
· Passion for staying up to date on the latest data engineering and ML tools and techniques, with a proactive approach to continuous learning.