Machine Learning Lead - TechDigital
Irving, TX
About the Job
Mandatory Skills: Machine learning, Spark/Hive/SQL, Python, Scala, PySpark, Kafka, scheduling tools, and DevOps using Jenkins
Key Responsibilities:
· Develop and implement data pipelines and Client Pipelines to support model inference, both real-time and batch
· Analyze large, complex data sets to identify the most performant way to process large-volume data using Spark, Hive, and SQL
· Collaborate with cross-functional teams to gather requirements and design scalable solutions
· Work on deployment of machine learning models
· Monitor the performance of data pipelines and make improvements as necessary
· Stay up to date with the latest advances in big data processing
· Productionize real-time time-series and regression models
Qualifications:
· Bachelor's degree in Computer Science, Mathematics, or a related field
· Strong experience in Spark/Hive/SQL, including hands-on experience building and deploying large-volume data pipelines
· Proficiency in Python, Scala, SQL, PySpark, Kafka, scheduling tools, and DevOps using Jenkins
· Excellent problem-solving and critical thinking skills
· Experience with cloud platforms (AWS, GCP, or Azure) is a plus
Source: TechDigital