Data Engineer III - BankUnited
Miami Lakes, FL 33016
About the Job
SUMMARY: The Data Engineer plays a pivotal role in operationalizing the most urgent data and analytics initiatives for the Bank's digital business by building, managing and optimizing data pipelines and then moving them effectively into production for key data and analytics consumers.
ESSENTIAL DUTIES AND RESPONSIBILITIES include the following. Other duties and special projects may be assigned.
* Design, create and maintain data pipelines. This is the primary responsibility of the Data Engineer.
* Support reporting and general analytics needs of the department.
* Drive automation through effective metadata management.
* Assist with renovating the data management infrastructure to drive automation in data integration and management.
* Utilize modern data preparation, integration and AI-enabled metadata management tools and techniques.
* Track data consumption patterns.
* Perform intelligent sampling and caching.
* Monitor schema changes.
* Recommend and automate integration flows.
SENIOR LEVEL RESPONSIBILITIES
* Work with data science teams and with business (data) analysts to refine their data requirements for various data and analytics initiatives.
* Propose appropriate (and innovative) data ingestion, preparation, integration and operationalization techniques.
* Train counterparts such as data scientists, data analysts, LOB users or any data consumers in data pipelining and preparation techniques.
* Ensure that data users and consumers use the data provisioned to them responsibly through data governance and compliance initiatives. Participate in vetting and promoting content created in the business and by data scientists to the curated data catalog for governed reuse.
* Become a data and analytics evangelist by promoting the available data and analytics capabilities and expertise to business unit leaders and educating them in leveraging these capabilities in achieving their business goals.
* Adheres to and complies with applicable federal and state laws, regulations and guidance, including those related to anti-money laundering (e.g., Bank Secrecy Act, USA PATRIOT Act).
* Adheres to Bank policies and procedures and completes required training.
* Identifies and reports suspicious activity.
EDUCATION
* Bachelor's Degree in computer science, statistics, applied mathematics, data management, information systems, information science or a related quantitative field required.
* Master's Degree: an advanced degree in computer science preferred.
* PhD in statistics or applied mathematics preferred.
* Information science (MIS), data management, information systems, information science (post-graduation diploma or related) or a related quantitative field, or equivalent work experience, preferred.
* A combination of IT skills, data governance skills, analytics skills and banking domain knowledge with a technical or computer science degree preferred.
EXPERIENCE
* 6 years of work experience in data management disciplines including data integration, modeling, optimization and data quality, and/or other areas directly relevant to data engineering responsibilities and tasks required.
* 6 years of experience working in cross-functional teams and collaborating with business stakeholders in the banking business domain, in support of a departmental and/or multi-departmental data management and analytics initiative required.
* Strong experience with advanced analytics tools for object-oriented/object function scripting using languages such as R, Python, Java, and Scala required.
* Strong experience with popular database programming languages, including SQL and PL/SQL for relational databases, and certifications in upcoming NoSQL/Hadoop-oriented databases such as MongoDB and Cassandra for nonrelational databases required.
* Strong experience in working with large, heterogeneous datasets in building and optimizing data pipelines, pipeline architectures and integrated datasets using traditional data integration technologies required. These should include ETL/ELT, data replication/CDC, message-oriented data movement, and API design and access, as well as upcoming data ingestion and integration technologies such as stream data integration, CEP and data virtualization.
* Strong experience in working with SQL-on-Hadoop tools and technologies, including Hive, Impala, Presto and others from an open-source perspective, and Hortonworks Data Flow (HDF), Dremio, Informatica and Talend, among others, from a commercial vendor perspective required.
* Strong experience in working with and optimizing existing ETL processes, data integration flows and data preparation flows, and helping to move them into production required.
* Strong experience in working with both open-source and commercial message queuing technologies (Kafka, JMS, Azure Service Bus, Amazon Simple Queue Service), stream data integration technologies such as Apache NiFi, Apache Beam, Apache Kafka Streams, Amazon Kinesis and others, and stream analytics technologies (Apache Kafka, KSQL, Apache Spark) required.
* Basic experience working with popular data discovery, analytics and BI software tools such as Tableau and OBI for semantic-layer-based data discovery required.
* Strong experience in working with data science teams in refining and optimizing data science and machine learning models and algorithms required.
* Basic experience in working with data governance teams and specifically business data stewards and the CISO in moving data pipelines into production with appropriate data quality, governance and security standards and certification required.
KNOWLEDGE, SKILLS AND ABILITIES
* Strong ability to design, build and manage data pipelines for data structures encompassing data transformation, data models, schemas, metadata and workload management. Ability to work with both IT and business in integrating analytics and data science output into business processes and workflows.
* Demonstrated ability to work across multiple deployment environments, including cloud, on-premises and hybrid, across multiple operating systems, and with containerization technologies such as Docker, Kubernetes, AWS Elastic Container Service and others.
* Proficiency in agile methodologies and the capability of applying DevOps and increasingly DataOps principles to data pipelines to improve the communication, integration, reuse and automation of data flows between data managers and consumers across an organization.
* Deep domain knowledge or previous experience working in the banking business would be a plus.
Equal Opportunity Employer Minorities/Women/Protected Veterans/Disabled