Lead Systems Engineer at Three Point Solutions Inc
Washington, DC
About the Job
Job Title: Lead Systems Engineer
Client: Health Insurance Company
Duration: 12 Months
Location: Washington, DC 20065
General Information
Job Category/Level: Systems/Monitoring/Lead
Reports to: Manager, Systems Monitoring Team, Infrastructure and Production Operations
Background
The FEP Operations Center seeks a Lead Systems Engineer to support Systems Monitoring initiatives for various SOWs in 2024 and beyond.
Responsibilities
- Administer systems and applications monitoring tools (e.g., DataDog).
- Administer DataDog on Linux for Java-based applications running on the Tomcat application server.
- Configure infrastructure monitoring, network monitoring, and centralized logging.
- Administer ELK Stack (Elasticsearch, Logstash, Kibana) or similar tools.
- Strong Linux (Red Hat) background and scripting skills (Python, Shell, Ansible).
- Set up SSL on Linux servers, including CA certificate installation.
- Knowledge of network components (switches, routers, Palo Alto Networks, F5 load balancers, SNMP).
- Experience with BigPanda and CloudBeat (synthetic monitoring) is desired.
Tasks
- Manage DataDog tools on Linux, ensuring effective network, server, and log monitoring.
- Configure centralized logging from various sources.
- Create DataDog dashboards for data visualization.
- Manage DataDog APM for Java applications and set up end-user monitoring.
- Support significant production issues by gathering and analyzing information for troubleshooting.
- Create training documentation and reports for tool usage.
- Work with Architecture and Project teams to ensure systems monitoring requirements are met early in development.
Competencies
- Organizational and communication skills.
- Adaptable, forward-thinking, problem-solving mindset.
- Time management and initiative.
- Technical skills, proactive work style, continuous learning enthusiasm.
Required Skills
- 5-8 years of IT experience across various platforms, including Linux/Unix, Microsoft systems, VMware, SQL Server, and LAN/WAN technologies.
- 3+ years administering monitoring tools such as DataDog or the ELK Stack (Elasticsearch, Logstash, Kibana).
- Scripting in Shell and Python, experience with Selenium, and experience with SSL certificates on Linux.
- Developing and implementing monitoring strategies in large-scale environments.
- Process documentation, tool usage training, and support.
- Familiarity with database management (DB2, SQL).
- Troubleshooting using monitoring tools like DataDog.
- Experience in both waterfall and agile SDLC methodologies.
Education
Bachelor's in Computer Science, Engineering, or related field (or equivalent experience).
Licenses/Certifications
- ITIL Foundation v3 (preferred within 180 days)
- SAFe Certification