Databricks Technical Architect - ClifyX, INC
South Plainfield, NJ 07080
About the Job
Job Title: Databricks Technical Architect
Location: USA
Employment Type: Full-Time
Position Overview:
We are looking for an experienced Databricks Technical Architect to lead the design and deployment of scalable big data solutions using the Databricks platform. You will be responsible for architecting high-performance data pipelines, optimizing data workflows, and providing technical leadership in the development of data solutions. This role requires a deep understanding of Databricks, Apache Spark, and cloud ecosystems.
Key Responsibilities:
Architect Big Data Solutions: Design, develop, and implement data solutions leveraging Databricks and Apache Spark to meet the performance, scalability, and governance needs of the organization.
Cloud Integration: Architect and deploy data pipelines on cloud platforms (AWS, Azure, or GCP), ensuring optimal integration with Databricks.
Data Pipelines & ETL: Lead the design and implementation of ETL/ELT processes for large-scale data transformation using Databricks and other big data technologies.
Optimization & Performance Tuning: Monitor, analyze, and improve the performance of Databricks workflows and Apache Spark jobs to ensure efficiency and low-latency data processing.
Collaboration: Work closely with data engineers, data scientists, and business stakeholders to design solutions that align with business goals.
Data Security & Governance: Ensure data architectures meet all security, compliance, and governance standards.
Automation: Automate workflows, job scheduling, and resource provisioning using tools like Databricks Jobs, Airflow, or similar orchestrators.
Mentorship & Leadership: Mentor and guide the technical development of data engineers and analysts, promoting best practices in data architecture and Databricks usage.
Prototyping & POC: Develop proof-of-concept solutions for new initiatives, demonstrating the potential and benefits of Databricks technologies.
Required Skills and Qualifications:
Educational Background: Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.
Experience:
5+ years of experience in data architecture, big data, or cloud-based data solutions.
2+ years of experience specifically with Databricks and Apache Spark.
Technical Skills:
Expert knowledge of Databricks and Apache Spark, including cluster management, job scheduling, and optimization.
Proficiency with cloud data services on platforms like AWS (Glue, Redshift, S3), Azure (Azure Data Factory, Azure Synapse, ADLS), or Google Cloud (BigQuery, Dataflow).
Strong programming skills in Python, Scala, or Java for Spark-based solutions.
Experience with SQL and data modeling for large-scale data warehousing.
Familiarity with orchestration tools such as Apache Airflow or Azure Data Factory.
Deep understanding of data lake, data warehouse, and data lakehouse concepts.
Soft Skills:
Strong problem-solving skills with the ability to troubleshoot and resolve complex technical issues.
Excellent communication and presentation skills to effectively collaborate with technical and non-technical stakeholders.
Leadership qualities with the ability to mentor and guide junior team members.
Preferred Qualifications:
Certifications:
Databricks Certified Associate Developer for Apache Spark or Databricks Certified Professional Data Engineer.
Cloud platform certifications such as AWS Certified Data Analytics – Specialty, Microsoft Certified: Azure Data Engineer Associate, or Google Cloud Certified: Professional Data Engineer.
Experience with Data Governance Tools: Exposure to tools like Azure Purview, AWS Lake Formation, or Collibra for managing data governance and security.
Benefits:
Competitive salary and performance-based bonuses.
Comprehensive health, dental, and vision insurance.
401(k) with company matching.
Flexible working hours with remote options.
Professional development opportunities and certifications.
Source: ClifyX, INC