Senior HPC Engineer LMS 2177

Medical Research Council


Open Date: 14/06/2023, 10:30
Close Date: 11/07/2023, 23:55

MRC London Institute of Medical Sciences

The LMS is an Institute funded by the MRC and is a Division of the Faculty of Medicine, Imperial College. Based on the Hammersmith Hospital Campus in West London (W12), the LMS has first-class facilities and provides investigators from clinical and basic science backgrounds with the opportunity to pursue innovative, multidisciplinary research within the established clinical base of Imperial College.

For more information, visit www.lms.mrc.ac.uk.

UK Research and Innovation is a new entity that brings together nine partners to create an independent organisation with a strong voice for research and innovation. More information can be found at www.ukri.org.

MRC – 3

London

£40,016 – £53,270

London Allowances (£3,727 & £1,402) per annum

Permanent

Science

Full Time

Known internationally for its high-quality basic and applied science, the MRC London Institute of Medical Sciences (LMS) houses over 300 research staff and postgraduate students. Sited at Hammersmith Hospital, the LMS forms part of Imperial College’s Institute of Clinical Sciences (ICS), one of five departments of the Faculty of Medicine, and enjoys close links with Imperial College and the Imperial College Healthcare NHS Trust.

As well as making use of wider HPC provisions available from Imperial College, the LMS houses a combined CPU and GPU cluster used extensively by its multidisciplinary research groups and the Institute facilities. The Institute is seeking to grow its Scientific Computing provision to cement its future support of computational biology.

 

Overall Purpose

Positioned within the Bioinformatics Facility, the Senior HPC Engineer will proactively design, manage, and develop the Institute’s Scientific Computing infrastructure, including the High Performance Computing (HPC) provision in line with current and future requirements. The postholder will project manage and conduct work to maintain and upgrade the Institute’s provision, coordinating their work within the broader context of the Institute’s setting within Imperial College and alongside regional and national HPC provisions, where relevant.

Working with the Head of IT, the postholder will plan, install, and maintain the Institute’s HPC systems and associated storage and network infrastructures that interface with wider IT provisions. Working alongside the Head of Bioinformatics, they will plan, provision, and maintain key software capabilities and on-demand HPC services.

In a rapidly evolving field, the postholder must adapt to the changing priorities and demands of the Institute’s dynamic research environment, forecasting future strategy, training, and infrastructure requirements through awareness of users’ needs and future research efforts. The postholder will lead the development and provision of Scientific Computing at the Institute and must demonstrate sound judgement and decision making in their work. This role requires close collaboration with the IT Facility, with strategy developed jointly and in line with agreed budgets.

The postholder will have a prominent user-facing role and is expected to be a first point of contact for HPC users; they should engage proactively in advertising Scientific Computing and in onboarding, training, and assistance. As such, the postholder should be knowledgeable, experienced, and agile. The role requires a well-organised and motivated individual capable of undertaking continued personal development and of demonstrating and instilling a culture of continuous improvement.

 

Working relationships

The postholder reports to the Head of Bioinformatics and will work closely with the Head of IT. The postholder will interact closely with the Head of Operations, Research Group Heads, and Facility Heads to fulfil their role, as well as existing IT staff. The postholder will be supported by the Heads of Bioinformatics and IT to develop further necessary skills and ability with a view to in-band progression.

Supporting Imperial College staff within the LMS, the postholder will be required to interface on their behalf with the Imperial College Research Computing team.

The postholder will identify and initiate potential inter-institute collaborative work and maintain current working relationships.

Key Responsibilities

Strategy Design and Delivery

  • Meet regularly with Heads of Bioinformatics and IT, and other parties to develop and deliver Scientific Computing infrastructure, software provisions, and resilient services in line with strategic and scientific goals

  • Produce technical specifications for invitations to tender and technically assess responses against budget and performance requirements. Purchase new equipment in line with Institute, MRC, and UKRI policies, processes, regulations, and budgets

  • Interface with wider Imperial College facilities and services to ensure aligned strategy

  • Assume responsibility for coordination and provision of HPC user training and onboarding

  • Maintain best practices in project management so that work is aligned with defined goals and projects are managed within agreed benefit, timescale, cost, quality, and risk limits

  • Ensure provision of reasonable and appropriate out-of-hours systems support where warranted by major disruption or outages affecting critical services

  • Monitor, report on, and promote the Institute’s Scientific Computing provisions

  • Design and manage varied communication strategies for effective and timely stakeholder engagement, including user-group meetings, seminars, and chat systems (Slack, Teams, etc.), to ensure full visibility of stakeholders’ ongoing and future requirements

  • Maximise opportunities to network with similar organisations to benchmark services and activities

  • Research developing trends to ensure future provision and cost-effective delivery of the Institute’s scientific goals

Technical Activities

  • Establish policies and systems to ensure that workloads placed on the HPC systems are managed, prioritised, and run to achieve optimal service levels and uptime

  • Resolve performance issues on the cluster and help users to design HPC jobs

  • Install, configure, and manage Linux systems using HPC-specific software, and apply consistent security configuration standards

  • Implement and maintain management and monitoring tools, and report usage levels and service level data

  • Streamline and automate maintenance, deployment, and configuration tasks

  • Build from source, install, configure, and manage Linux applications manually and using deployment software

  • Alongside the Head of IT, assume responsibility for documenting and implementing relevant disaster recovery processes

  • Ensure awareness via monitoring of likely points of failure, proactive maintenance in mitigation, and repair of HPC resources by diagnosing, anticipating, and troubleshooting arising problems

  • Maintain relevant interfacing services and Linux infrastructure, including racking hardware and patching fibre optic and CAT6A cabling

  • Support the Head of IT in implementing ICT security policy

  • Support LMS IT staff as requested in issues and queries relating to standalone Linux systems

  • Other duties commensurate with the grade of the post as directed by the supervisor

Education / Qualifications / Training required (will be assessed from application form):

Essential

  • Degree in computing or an equivalent technical discipline

  • Significant experience of working at a high level within a relevant area

Desirable

  • Industry-standard Linux certifications (e.g. RHCA, RHCSA)

  • Current qualifications in project management (e.g. PRINCE2)

  • Knowledge, experience, or qualification in cloud computing

  • Knowledge or education to a higher level within a scientific field

  

Knowledge and experience (will be assessed from application form and at interview):

Specific Role Competencies

Essential

  • Providing HPC services within a client-facing role

  • Integration of heterogeneous Linux/Windows/Mac environments (e.g., Windows Active Directory)

  • Installing, configuring, and managing HPC clusters and cluster-based storage systems

  • Experience with automation tooling (e.g., Salt, xCAT)

  • Configuration and maintenance of multi-queue job scheduling systems (e.g., SLURM)

  • Use of scientific software compilation and deployment systems (e.g., Spack, EasyBuild, Lmod, conda)

  • Virtualisation and containerisation (e.g., Docker, Singularity)

  • Excellent written communication skills in English

Desirable

  • Deployment of on-demand software with scheduling systems (e.g., Open OnDemand)

  • Experience of C, R, or Python programming

  • Experience with GPU-focused hardware and software

  • Disaster recovery systems (e.g., LTO systems)

People Leadership

Desirable

  • Managing, motivating, and developing staff

  • Project managing work packages within and between groups

  • Resolving complaints and conflicts

Strategic Thinking

Essential

  • Ability to organise and prioritise resources to ensure delivery of short and long-term targets

  • Experience leading strategic debate to identify future opportunities

  • Discipline and regard for data integrity, confidentiality, and security

  • Ability to use user feedback to make improvements to the delivery and provision of services

Financial Management

Desirable

  • Managing and reviewing budgets, including capital expenditure

  • Negotiating for and procuring services and products

  • Clear understanding of the principles and practices of project management

  

Personal skills/behaviours/qualities (will be assessed at the interview):

Essential

  • Ability to self-motivate, multi-task, and prioritise work effectively

  • Ability to develop networks and contacts and to work collaboratively

  • Excellent verbal communication skills in English

Please upload your CV and the names and contact details of two scientific referees, along with a covering letter describing your interests and motivations for applying for this role (providing evidence against the requirements of the job as per the job description and person specification). Please quote reference number LMS 2177.

Applications that do not provide a covering letter will not be considered. 

The MRC is a great place to work and progress your career, be it in scientific research or the support functions. The MRC is a unique working environment where our researchers are rewarded by the world-class innovation and collaboration opportunities that the MRC name brings. The MRC is an excellent place to develop yourself further, and a range of training and development opportunities will be available to you, including professional registration with the Science Council.

Choosing to come to work at the MRC (part of UKRI) means that you will have access to a whole host of benefits, from a defined benefit pension scheme and excellent holiday entitlement to employee shopping and travel discounts and a salary sacrifice cycle-to-work scheme, as well as the chance to put the MRC and UKRI on your CV in the future.

Our success is dependent upon our ability to embrace diversity and draw on the skills, understanding and experience of all our people. We welcome applications from all sections of the community irrespective of gender, race, ethnic or national origin, religion or belief, sexual orientation, disability or age. As “Disability Confident” employers, we guarantee to interview all applicants with disabilities who meet the minimum criteria for the vacancy.

UKRI supports research in areas that include animal health, agriculture and food security, and bioscience for health, which includes research on animals, genetic modification and stem cell research. Whilst you may not have direct involvement in this type of research, you should consider whether this conflicts with your personal values or beliefs.

We will conduct a full and comprehensive pre-employment check as an essential part of the recruitment process on all individuals that are offered a position with UKRI. This will include a security check and an extreme organisations affiliation check. The role holder will be required to have the appropriate level of security screening/vetting required for the role. UKRI reserves the right to run or re-run security clearance as required during the course of employment.
