Quick facts

  • Talence
  • Technology
  • PhD

Apply by: 2023-07-25

PhD Position F/M: Topology-aware load balancing for ocean simulation on heterogeneous platforms

Published 2023-05-26

Context and advantages of the position

Several HPC improvements to the CROCO model are currently under way, aimed at sustainable support for GPUs and for different parallel programming models. The trend in high-performance computing architectures is toward ever greater heterogeneity. It shows up both within a node, through accelerator cards, and between nodes, through differing hardware and communication behavior.

On the application and scheduling side, however, this trend is often ignored: the scheduling of applications, CROCO in particular, still assumes homogeneity across the hardware stack. This creates a mismatch between applications and the underlying HPC system and results in poor performance, particularly in the strong-scaling case.

The AIRSEA team in Grenoble is one of the main developers of the CROCO model, and the Tadaam team in Bordeaux has expertise in load-balancing and topology-aware algorithms. This PhD will therefore be carried out mainly in Bordeaux, in close collaboration with Grenoble: visits and exchanges will be organized regularly between the two locations.

Assignment

The CROCO ocean model has a very complex workload: it is non-homogeneous, it uses adaptive mesh refinement with nested grids, and it already supports hybrid CPU/GPU execution. Optimization attempts that lack application-driven information are therefore prone to fail. The goal of this PhD is to optimize the execution of the CROCO model on supercomputers by developing and investigating new load-balancing algorithms.

Even though CROCO relies on structured meshes, load imbalance arises between computing units because solver runtimes vary. Moreover, because the topology of a heterogeneous machine can be extremely complex, the cost of communication can be very high depending on where the sender and the receiver are located. It is therefore necessary to carefully optimize both the mapping of compute processes and the load balance between them in order to reduce the computation and communication costs of the CROCO model.

Main activities

The PhD candidate will follow this work plan:

  • Understand the CROCO model and the computation/communication graph of the application.
  • Survey the state of the art in load-balancing and topology-aware algorithms.
  • Collaborate on the development of a microbenchmark that mimics the behavior of the CROCO model in terms of imbalance and communication on a fixed adaptive mesh.
  • Develop a performance model of the application/microbenchmark that will be used by the algorithmic engine.
  • Propose a static load-balancing algorithm for the heterogeneous case (CPU).
  • Evaluate this algorithm on real test cases and real supercomputers.
  • Extend the solution to heterogeneous resources (first GPUs, then hybrid) and to load balancing at runtime.

Skills

  • Mandatory:
      • High-performance computing
      • Parallel programming models (MPI, OpenMP)
      • Parallel programming models for heterogeneous computing (GPU/CPU)
      • Performance modeling
      • Strong programming skills
  • Graph:
      • Graph theory
      • Optimization and algorithms
  • Optional:
      • Numerics
      • Use of large-scale supercomputers
      • Ability to work with operational forecasting codes / Fortran 90

Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of partial teleworking and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage