Senior Infrastructure Operations - FEDRAMP - Content Guru
Reston, VA 20190
About the Job
You will work within a team responsible for all activities associated with designing, deploying and maintaining the global infrastructure supporting our key services and applications. The infrastructure role includes acting as an escalation point for engineering departments for customer-facing faults. Engineers are tasked with creating, approving and executing changes associated with both the initial deployment and life-cycle management of all hardware and software, internal and external, deployed on production platforms. A primary responsibility of the team is the configuration, monitoring and resolution of all platform alarms, as well as providing training for and assessing rotation engineers. Working with other engineering teams and the Project Management team is crucial to the role, ensuring that customer pipelines are correctly factored into the capacity management of the platforms. As the team supports critical services, including emergency services, there is an element of out-of-hours work, including weekday overnight and weekend on-call shifts, to ensure minimum disruption and best-in-class customer service at all hours. DevOps methodology underpins every aspect of the Infrastructure Operations Engineering role.
There will be an emphasis on supporting Continuous Monitoring (Con-Mon) of the storm FedRAMP infrastructure and clients for the U.S. Federal Government. As such, the successful candidate must be a US citizen.
A successful Engineer recognises the need for speed in everything that they do, from identifying and resolving alarms through to delivering key project objectives, thereby ensuring that all SLAs are met and that products and features are available for clients on the date promised. They will maintain a keen focus on the goals of the team, which are primarily to ensure service stability and improvement for all clients. To achieve both of these aims, they will communicate effectively with other departments so that all aspects and perspectives are considered in everything that they undertake.
To ensure the continued success of the team, engineers are expected to pass on their knowledge to other engineers, both during formal sessions and on a BAU basis, providing a friendly, approachable interface in order to maximise the effectiveness of the knowledge transfer and encourage reciprocity. By extension, engineers will seek to assist other engineers and departments where possible and appropriate, and will work as the task at hand requires, remaining flexible to different approaches and ways of thinking.
Capacity Management
- Designing capacity monitoring solutions and proposing solutions for resolving capacity constraints
Faults / Escalations
- Act as an escalation point for faults, attend or lead investigations into high-profile incidents, and attend wash-ups following high-profile incidents as an infrastructure representative
- Propose preventative measures to mitigate the recurrence of issues
- Create, approve and implement solutions to faults
Software Deployment
- Review of application designs and work instructions
- Installation of new services and upgrades of existing services
- Writing, approving and executing changes of all risk levels
Project
- Working with solution consultants during the pre-sale phase
- Leading sprint teams of various sizes to complete projects
- Design, plan, implement and handover projects of all complexities
OOH Work
- Act as a level 1 escalation on a rotational basis and as a level 2 escalation over the Christmas period
- Complete OOH changes for own projects where relevant and attend datacentre sites when required
- Complete overnight shifts on a rotational basis or as required
Product/Supplier/Tool Management
- Act as the assigned technical point of contact for a business tool
Training
- Maintaining, creating and delivering training