Google Cloud Infra SME at Confidential company
About the Job
Hi,
I hope you are doing well.
I am looking for a candidate with experience as a Site Reliability Engineer.
Job Role: Google Cloud Infra SME (Site Reliability Engineer)
Location: Parsippany, NJ
Duration: 6-12+ months (possible extension)
Must be on our W2
Description:
Skills:
- Proficiency in Google Cloud services (Compute Engine, Kubernetes Engine, Cloud Storage, BigQuery, Pub/Sub, etc.).
- Familiarity with Google BI and AI/ML tools (Looker, BigQuery ML, Vertex AI, etc.).
- Experience with automation tools (Terraform, Ansible, Puppet).
- Familiarity with CI/CD pipelines and tools (Azure Pipelines, Jenkins, GitLab CI, etc.).
- Strong scripting skills (Python, Bash, etc.).
- Knowledge of networking concepts and protocols.
- Experience with monitoring tools (Prometheus, Grafana, etc.).
Preferred Certifications:
- Google Cloud Professional DevOps Engineer
- Google Cloud Professional Cloud Architect
- Red Hat Certified Engineer (RHCE) or similar Linux certification
Job Description: Site Reliability Engineer and Capacity Planning
We are looking for a talented Site Reliability Engineer (SRE) with a strong background in Google Cloud Platform (GCP) and Red Hat OpenShift administration. The ideal candidate will be responsible for ensuring the reliability, performance, and scalability of our on-premises and cloud-based systems, with a focus on reducing Google Cloud costs.
Responsibilities:
System Reliability: Ensure the reliability and uptime of critical services and infrastructure.
Google Cloud Expertise: Design, implement, and manage cloud infrastructure using Google Cloud services.
Automation: Develop and maintain automation scripts and tools to improve system efficiency and reduce manual intervention.
Monitoring and Incident Response: Implement monitoring solutions and respond to incidents to minimize downtime and ensure quick recovery.
Collaboration: Work closely with development and operations teams to improve system reliability and performance.
Capacity Planning: Conduct capacity planning and performance tuning to ensure systems can handle future growth.
Documentation: Create and maintain comprehensive documentation for system configurations, processes, and procedures.
Qualifications:
Education: Bachelor’s degree in Computer Science, Engineering, or a related field.
Experience: 3+ years of experience in site reliability engineering or a similar role.