+65 6591 8608

Friday, 30 September 2022

Manycore Software Porting

Manycore systems are a production-proven way to keep increasing performance. EXALIT customers have gained a 200x (two hundred times) speedup over CPU-only systems by using manycore architectures such as NVIDIA Tesla.

EXALIT Manycore Software Porting services include a thorough analysis of your algorithm and performance goals, followed by the necessary algorithm redesign, software rewriting and fine-tuning.

Methodology for application porting: 


Manycore Software Porting TCO analysis: is it worth it?
Case study.

CAPEX-OPEX analysis for a Hybrid System
Capital Expenses (CAPEX) include:

• System acquisition cost
• Software migration cost
• Software acquisition cost
• Teaching cost
• Real estate cost

Operational Expenses (OPEX) include:

• Energy costs (system consumption + cooling)

• Maintenance costs (system maintenance, support)

For a given amount of compute work, the CAPEX-OPEX analysis gives the Total Cost of Ownership and answers questions such as: if I add GPUs, will I save money? How many should I add? Should I then use slower CPUs or less CPU memory?

Application speed-up and CAPEX-OPEX

On the one hand, adding GPUs to the system increases the system cost and the base energy consumption (one GPU = x10 watt idle), and the code has to be migrated.

On the other hand, exploiting GPUs decreases execution time, and so potentially the energy consumed for a given amount of work. It also reduces the number of nodes and simplifies the network architecture.

If we manage to accelerate the application, the speed-up shortens the time-to-market and increases the amount of work performed during the lifetime of the system.
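The trade-off above can be sketched numerically. For a fixed workload, energy is power multiplied by run time, so a GPU node that draws more power still saves energy once the speed-up exceeds the ratio of the two power draws. The wattages, run time and speed-up below are hypothetical values chosen for illustration; they do not come from the case study.

```python
def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy in kWh consumed by a node drawing power_watts for the given hours."""
    return power_watts * hours / 1000.0


def energy_breakeven_speedup(cpu_node_watts: float, gpu_node_watts: float) -> float:
    """Speed-up above which the GPU node uses less energy for the same work."""
    return gpu_node_watts / cpu_node_watts


# Hypothetical figures (assumptions, not measured data).
cpu_node_watts = 400.0   # CPU-only node under load
gpu_node_watts = 700.0   # same node fitted with GPUs, under load
cpu_hours = 100.0        # run time of the workload on the CPU-only node
speedup = 4.0            # assumed application speed-up on the GPU node

cpu_energy = energy_kwh(cpu_node_watts, cpu_hours)
gpu_energy = energy_kwh(gpu_node_watts, cpu_hours / speedup)
breakeven = energy_breakeven_speedup(cpu_node_watts, gpu_node_watts)

print(f"CPU-only: {cpu_energy} kWh, hybrid: {gpu_energy} kWh, "
      f"energy break-even speed-up: {breakeven}")
```

With these assumed numbers the hybrid node needs less than half the energy for the same work. Cooling and idle consumption are left out of this sketch and would raise both totals.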

CAPEX: Hardware Parameters
Heterogeneous systems are widely available, and a hardware configuration can be chosen from among the following: Fast CPU + Fast GPU (expensive node), Slow CPU + Fast GPU, Fast CPU + Slow GPU, Slow CPU + Slow GPU, Fast CPU, Slow CPU. Another parameter to take into account is the performance of the nodes. Node performance affects the number of nodes you will choose: more nodes mean more network, with non-negligible cost and energy consumption, while fewer nodes may limit scalability issues, if any.

Application workload analysis is the only way to decide: optimizing software can significantly increase performance and so reduce the hardware needed. Code migration to GPU is therefore on the critical path.

CAPEX: Code migration cost

Migration cost includes learning cost, software environment cost and porting cost, and is mostly independent of hardware size: it is not an issue for large dedicated systems, but it increases if the machine is meant to serve a large community (more people to train, more applications to port…).
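Because the migration cost is mostly independent of hardware size, its share of CAPEX shrinks as the system grows. A minimal sketch of that effect follows; the node price, the migration cost and the node counts are hypothetical, and the function name is introduced here only for illustration.

```python
def migration_share(nodes: int, node_cost: float, migration_cost: float) -> float:
    """Fraction of CAPEX spent on migration for a system with the given node count."""
    system_cost = nodes * node_cost
    return migration_cost / (system_cost + migration_cost)


# Hypothetical figures (assumptions, not taken from the case study).
node_cost = 10_000.0       # price of one node
migration_cost = 20_000.0  # one-off cost of porting the application

small_system = migration_share(4, node_cost, migration_cost)
large_system = migration_share(400, node_cost, migration_cost)

print(f"4 nodes: {small_system:.1%} of CAPEX, 400 nodes: {large_system:.1%} of CAPEX")
```

Under these assumptions the same porting effort is about a third of CAPEX on the small system and well under one percent on the large one.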

The main benefit of migration is to expose manycore parallelism, and this benefit is not specific to one kind of device. The implementation, however, will be specific to each vendor's platform, so the amortization period for the migration is similar to that of the hardware (3 years).

Moreover, a smart way to deploy applications is to use a solution that is portable across multiple hardware generations (amortized over 10 years), even if some level of tuning is still required.

In practice, controlling the cost of migration has a significant impact on the total cost for small systems. Use methodology and processes to lower your risks and secure your investment.


All comparisons are made on an equivalent workload. As a reminder:

CAPEX = System costs + Migration costs (4 nodes)
OPEX   = Energy costs (power + cooling) + Maintenance costs (10% of System cost)
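The two formulas above can be written as a small calculation. This is a sketch under stated assumptions, not the model used in the case study: the `Scenario` class and all cost figures are hypothetical, and the 10% maintenance rate is applied once over the period considered because the text does not say whether it is per year.

```python
from dataclasses import dataclass

MAINTENANCE_RATE = 0.10  # maintenance = 10% of system cost, as stated above


@dataclass
class Scenario:
    """Costs of one configuration for an equivalent workload (hypothetical model)."""
    system_cost: float     # hardware acquisition cost
    migration_cost: float  # code migration cost (0 when nothing is ported)
    energy_cost: float     # power + cooling over the period considered

    @property
    def capex(self) -> float:
        return self.system_cost + self.migration_cost

    @property
    def opex(self) -> float:
        # Assumption: the maintenance rate is applied once over the period.
        return self.energy_cost + MAINTENANCE_RATE * self.system_cost

    @property
    def tco(self) -> float:
        return self.capex + self.opex


# Hypothetical figures for the same amount of compute work.
cpu_only = Scenario(system_cost=100_000.0, migration_cost=0.0, energy_cost=30_000.0)
hybrid = Scenario(system_cost=80_000.0, migration_cost=10_000.0, energy_cost=15_000.0)

for name, s in (("CPU-only", cpu_only), ("Hybrid", hybrid)):
    print(f"{name}: CAPEX={s.capex:.0f} OPEX={s.opex:.0f} TCO={s.tco:.0f}")
```

In this made-up case the hybrid system wins because it needs less hardware and less energy for the same work, even after paying for the migration. Changing any of the inputs can reverse the result.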

Example 1: GREAT for porting to GPU
The first chosen application is a GPU friendly application with high speed-up.

The code to port is a ray-by-ray heat transfer simulation which can be seen as a quadrature by a Monte-Carlo method. It is based on MPI.

The migration cost is equivalent to one man-month.


In this first case, the extra cost of the GPUs is easily amortized (x4 in value), and the break-even point is reached even on small machines.

Example 2: not worth porting

The second application is a 3D hydrodynamical code used to simulate astrophysical fluid flows.

This MPI code is only moderately efficient on GPGPU.

The migration cost is equivalent to two man-months.

The speed-up obtained (1.5x) does not compensate for the extra cost, so price/performance does not improve.


The answer is YES: it is worth migrating your application to GPGPU if you get speed-ups of 2x and above!

Main factors in CAPEX-OPEX are:
• Application speedup: preliminary data show that for speedup > 2, fast GPUs are worthwhile;
• GPU costs;
• Migration/software costs.
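One way to read the speed-up threshold is as a price/performance comparison: the hybrid system pays off when its speed-up exceeds the ratio of its total cost to that of the CPU-only system. The sketch below assumes, purely for illustration, a cost ratio of 2. The real ratio depends on the GPU and migration costs listed above and is not given in the text. The 1.5 speed-up is the one quoted for Example 2, and the 4.0 speed-up is a hypothetical GPU-friendly case.

```python
def price_performance_gain(cost_ratio: float, speedup: float) -> float:
    """Work per unit cost of the hybrid system relative to CPU-only.

    cost_ratio: hybrid total cost divided by CPU-only total cost.
    A result above 1.0 means the hybrid system gives more work per unit cost.
    """
    return speedup / cost_ratio


cost_ratio = 2.0  # assumption: the hybrid system costs twice as much overall

moderate = price_performance_gain(cost_ratio, speedup=1.5)  # speed-up from Example 2
friendly = price_performance_gain(cost_ratio, speedup=4.0)  # hypothetical high speed-up

print(f"speed-up 1.5: gain {moderate}, speed-up 4.0: gain {friendly}")
```

With a cost ratio of 2, a speed-up of 1.5 leaves the hybrid system behind on price/performance, while a speed-up of 4 puts it clearly ahead.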

Migration costs may be an issue for small systems, whereas they are negligible for large systems, even though these expenses are usually not planned for. In every case, migration work should also take CPUs into account.


Heterogeneity enlarges the configuration space, making the outcome more sensitive to application characteristics and putting more pressure on developers.

Let us keep in mind that these first conclusions will be affected by new technological developments such as energy consumption control at the software level, cloud technology, or CPU and GPU fusion.


Ping us if you are interested in porting your code to a manycore architecture.
