HPC Systems Administrator - 1905 - KeyLogic Systems
Albuquerque, NM 87101
About the Job
KeyLogic is seeking a High-Performance Computing (HPC) Systems Administrator to maintain HPC clusters and systems hardware within the resource management team. Are you passionate about identifying and resolving hardware issues to improve lifecycle management for HPC hardware? Do you thrive in a fast-paced and dynamic environment? If so, consider applying for this opportunity. In this role, you will collaborate with system administrators, partner with multi-disciplinary teams across a variety of organizations, and interact with HPC customers and vendors.
Responsibilities:
- Monitor the state of the HPC clusters and storage using software utilities, shell scripts, and command line interfaces.
- Troubleshoot and identify failed hardware, implement parts replacement through the RMA process, and work with HPC SMEs to resolve system failures.
- Work with the HPC team and vendors on the delivery and deployment of clusters and storage systems, including SNL shipping and receiving.
- Rack and cable servers and storage systems using diagrams and spreadsheets provided by HPC SMEs.
- Actively seek opportunities to broaden and deepen your knowledge base and proficiencies.
- Build your professional development through mentorship partners and informal training.
Qualifications:
- BS/BA degree with a minimum of two years’ experience with Linux administration and troubleshooting. Six years of experience may substitute for the BS/BA degree requirement. Certificates can help offset experience requirements.
- US Citizen
- Ability to obtain and maintain a U.S. Department of Energy Q security clearance
REQUIRED SKILLS (listed from most to least important)
- Experience with the Linux command line.
- Proven track record of problem solving using strong critical thinking, analytical and troubleshooting skills.
- Strong technical aptitude and ability to research and solve complex issues independently
- Working knowledge and experience with bash shell scripting and other languages (Python, Perl, or Ruby)
- Experience in another IT field, such as Storage, Networking, Databases, or Cyber Security
- Ability to work seamlessly within a team as an active contributor.
- Strong verbal and written customer service and communication skills
- Knowledge of, and desire to follow, IT operations best practices and procedures, such as issue management and incident response
- Familiarity with information security best practices
DESIRED SKILLS (listed from most to least important)
- Ability to apply critical thinking skills to turn detailed data into useful information, such as documentation, and to develop policies, processes, and procedures
- Ability to work independently and within a team
- Cybersecurity experience or interest