Databricks Engineer - Neudesic, an IBM Company
Houston, TX 77246
About the Job
About Neudesic
Passion for technology drives us, but it’s innovation that defines us. From design to development and support to management, Neudesic offers decades of experience, proven frameworks and a disciplined approach to quickly deliver reliable, quality solutions that help our customers go to market faster.
What sets us apart from the rest is an amazing collection of people who live and lead with our core values. We believe that everyone should be Passionate about what they do, Disciplined to the core, Innovative by nature, committed to a Team, and conduct themselves with Integrity. If these attributes mean something to you, we'd like to hear from you.
Role Profile
We are seeking skilled Databricks Engineers to join our Data and AI Practice. The ideal candidates will have extensive experience building scalable data pipelines and performing advanced analytics using Databricks. This role will focus on leveraging Apache Spark and Databricks to process and analyze large datasets, implement ETL workflows, and collaborate with cross-functional teams to deliver optimized data solutions. A strong background in Azure and Azure-related technologies is strongly preferred.
Key Responsibilities:
- Design, build, and maintain scalable ETL/ELT data pipelines using Azure Databricks and Apache Spark
- Integrate data from various data sources such as Azure Data Lake, SQL databases, API services, and cloud storage
- Automate data ingestion, cleansing, transformation, and validation processes
- Leverage Databricks to perform large-scale data processing and distributed computing
- Optimize and tune Apache Spark jobs to handle high-volume data workloads efficiently
- Collaborate with other data team members and analysts to develop advanced analytics and machine learning workflows
- Work closely with business analysts, data architects, and product teams to understand business requirements and translate them into technical solutions
- Provide technical expertise and guidance on Databricks architecture, best practices, and troubleshooting
- Monitor and optimize Databricks clusters for cost and performance efficiency
- Troubleshoot performance bottlenecks, memory, and data storage issues in Spark applications
- Integrate Databricks with Azure Data Lake Storage, Azure Synapse Analytics, and other Azure services for a seamless data ecosystem
- Set up and manage Databricks Delta Lake for managing large-scale, real-time, and batch data pipelines
- Design and implement automated testing and deployment workflows using Azure DevOps, Git, and Databricks APIs
- Build and maintain CI/CD pipelines for Databricks jobs, notebooks, and workflows
Skills and Qualifications:
- 3+ years of hands-on experience with Databricks and Apache Spark for big data processing
- Experience monitoring and troubleshooting performance bottlenecks, tuning Databricks clusters for optimal performance, and applying best practices for Databricks implementations
- Extensive experience in Databricks development and engineering, including pipeline development, orchestration, and the design and optimization of notebooks and workflows in Databricks to support ETL (Extract, Transform, Load) operations
- Strong proficiency in Python, SQL, and Scala for data engineering tasks
- Experience with Azure services, particularly Azure Data Lake Storage, Azure Synapse, and Azure Blob Storage
- Excellent communication skills and ability to collaborate with cross-functional teams
Preferred Qualifications:
- Experience with other big data technologies like Hadoop, Kafka, or HBase
- Familiarity with Azure Data Factory for orchestrating data pipelines
- Experience with machine learning workflows in Databricks
- Strong knowledge of SQL and relational database management systems
- Proficiency in programming languages such as Python, R, SQL, and Scala
- Experience with Azure cloud services (Databricks on Azure preferred)
- Experience with other Azure technologies such as Azure Data Factory, Azure Data Lake, Synapse, Fabric, Copilot, PowerBI, Azure Machine Learning, CosmosDB, Azure SQL Database, Azure Blob Storage, Azure Kubernetes Service, Azure Logic Apps, etc.
- Familiarity with data governance frameworks and master data management principles
- Strong analytical and problem-solving skills, with a focus on performance optimization
- Excellent communication skills, both written and verbal, with the ability to articulate technical concepts to non-technical stakeholders
- Ability to work closely with cross-functional teams including business analysts, data architects, and business leaders to gather requirements and translate them into technical solutions
- Certifications in Databricks and Azure
Business Skills
- Analytic Problem-Solving: Approaching high-level challenges with a clear eye on what is important; employing the right approach/methods to make the maximum use of time
- Effective Communication: Detailing your techniques and discoveries to technical and non-technical audiences in a language they can understand
- Intellectual Curiosity: Exploring new territories and finding creative and unusual ways to solve problems
- Data Analysis Knowledge: Understanding how data is collected, analyzed and utilized
- Ability to travel up to 25%
Accommodations currently remain in effect for Neudesic employees to work remotely, provided that remote work is consistent with the work patterns and requirements of their team’s management and client obligations. Subject to business needs, employees may be required to perform work or attend meetings on-site at a client or Neudesic location.
Phishing Scam Notice
Please be aware of phishing scams involving fraudulent career recruiting and fictitious job postings; visit our Phishing Scams page to learn more.
Neudesic is an Equal Employment Opportunity Employer:
All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by federal, state, or local law.
Neudesic is an IBM subsidiary that has been acquired by IBM and will be integrated into the IBM organization. Neudesic will be the hiring entity. By proceeding with this application, you understand that Neudesic will share your personal information with other IBM companies involved in your recruitment process, wherever these are located. More information on how IBM protects your personal information, including the safeguards in case of cross-border data transfer, is available here: https://www.ibm.com/us-en/privacy?lnk=flg-priv-usen