Systems Engineer - Site Reliability - Alpharetta, GA (Remote) - Georgia IT Inc.
Alpharetta, GA
About the Job
Systems Engineer - Site Reliability
Location: Alpharetta, GA (remote to start)
Duration: Contract to hire
Rate: DOE
US citizens and green card holders preferred.
Job Responsibilities:
- Define and standardize tool sets and technology used for daily operations support, service delivery, and enablement of application development
- Standardize monitoring disciplines for end-to-end application or service monitoring and proactive alerting for business-critical applications
- Identify platform or application bottlenecks/defects and work with key stakeholders to drive remediation efforts
- Design and architect operational solutions for the management of applications and infrastructure, with specific goals around increasing automation, repeatability, and consistency of operational tasks
- Partner with internal engineering teams in a project delivery waterfall or agile methodology to support various business needs
- Manage work efforts split between operational, app, dev, and delivery engineering work with a strong focus on production availability
- Prioritize work cycles to ensure that operational needs of assigned applications/platforms are addressed as needed. Assist management with monthly operational performance reviews with key stakeholders
- Participate in on-call duties to triage, solve, and drive automated responses to problems in business-critical services
- Create and maintain monitoring technologies and processes that improve visibility into our applications' performance and meet or exceed defined business metrics
- Partner with other internal engineering teams for developing plans around risk and vulnerability remediation
- Automate processes and systems configuration/deployment
- Monitor and report on SLA/SLO for business-critical applications. Work with business partners and product owners to establish key performance indicators.
- Work with Application Development to ensure that assigned applications/platforms have an appropriate level of monitoring and metrics in place
- Work with various engineers, architects, and leadership to develop the long-term Site Reliability Engineering road map which encompasses infrastructure, tools, and application lifecycle management
- Work with Release Management and business development teams to deploy software releases & updates
- Work with business partners and internal engineering teams to properly plan, coordinate, and announce all change releases. This includes ensuring that execution, validation, and rollback strategies are clearly defined, understood, and signed off on prior to implementation.
- Ensure that business applications & platforms are operationally ready for production. This includes the ability to read monitoring dashboards and ensuring all SOPs/knowledge articles are accounted for in the event of issues that prevent start of day.
- Assist with business unit application or infrastructure go-live events
- Review SOP/knowledge articles on a monthly basis for any new feature launch or other significant change that may impact support documentation.
- Assist with training of Command Center and Application 1st level Support on new SOPs, knowledge articles, and any other support-related needs.
- Perform monthly capacity analysis of applications & platforms, including tracking of end of life assets for tech refresh opportunities.
Skills and Experience Required:
- 10+ years of relevant technical experience preferred. Excellent written, oral, and interpersonal communication skills required.
- At least 3 years of experience working in a progressive information security operations or engineering group.
- Prior experience with network security & related applications, tools and solutions
- Ability to work independently in addition to working closely in a team environment.
- Strong ability to multitask and work effectively in a distributed, matrix-oriented environment
- Demonstrated track record of success in helping deploy enterprise business solutions in a compliant manner
- Demonstrated effectiveness working across multiple business units to achieve results
- Demonstrated ability to think strategically about business, product and technical challenges
- Bachelor's degree in computer engineering/related field or equivalent work experience
Source: Georgia IT Inc.