Data Engineer at PixelPlex
San Jose, CA 95199
About the Job
PixelPlex is looking for an experienced Data Engineer to join our new project: a Business Intelligence and Data Visualization service designed for NFT enthusiasts. The service collects on-chain, social, and other metrics and enriches them to deliver unique, decision-ready insights.
Main focus
• Data Quality: automating the DQ process using Great Expectations;
• Data Marts: delivering data marts and optimizing analysts' code where necessary;
• Data Loading: collecting and loading data into analytical and hot databases, including obtaining structured and unstructured data from various sources; preparing, cleaning, and pre-processing external data; and building aggregates.
Responsibilities
• Building and improving data processing workflows (optimizing processor groups and data marts);
• Participating in the development of data processing solutions within ML projects and deploying pipelines;
• Developing and optimizing procedures for generating detailed data layers (raw to ODS) and data mart layers in the data lake and the hot database for online reporting;
• Implementing CI/CD processes and monitoring the resulting data pipelines (Grafana, Prometheus);
• Handling ad-hoc queries.
Requirements
• Knowledge of database principles and data warehouse (DWH) design;
• Experience in ETL process development;
• Good understanding of ML pipelines and their deployment to production;
• Experience using Airflow (or other industry-standard pipeline orchestrators, such as Luigi, Dagster, etc.);
• Experience with high-load distributed data storage and processing systems;
• Excellent knowledge of SQL and experience with query optimization;
• Willingness to learn and grow, and motivation to achieve goals.
Nice to have
• Experience in Python or Node.js development;
• Knowledge of DevOps tools;
• Experience with cloud services.