Opened 15 years ago
Closed 15 years ago
#131 closed enhancement (fixed)
Parallelize helicity loop via OpenMP
Reported by: | Christian Speckner | Owned by: | ohl |
---|---|---|---|
Priority: | P2 | Milestone: | v2.3.1 |
Component: | omega | Version: | 2.0.0beta |
Severity: | normal | Keywords: | |
Cc: |
Description
The helicity loop in the amplitude code can be parallelized with OpenMP on multicore / multi-CPU systems. Unfortunately, the code I wrote for 1.9x is a Perl postprocessor for the amplitude and cannot be reused in W2, where this should be implemented properly in the O'Mega Fortran backend. As an example of how it might be done, I've attached a parallelized amplitude for gg -> ggg in the SM, which works and can be used as a drop-in replacement for the "normal" one (it must be compiled by hand, though, because my Perl code splits the amplitudes into chunks). This was originally done with ifort, but I've checked that it also works with gfortran via the "-fopenmp" option. For this process, the speed gain from running two threads on two cores (OMP_NUM_THREADS=2) is only a factor of ~1.6, but for more complicated processes with more nonzero helicity combinations it is pretty close to 2.
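For readers unfamiliar with the idea, here is a minimal sketch of a helicity sum parallelized with an OpenMP reduction. It is not the attached code and not what the O'Mega backend generates: the module name, `toy_amplitude`, and the two summation functions are hypothetical stand-ins, and the only detail taken from the ticket is the count of 2**5 = 32 helicity combinations for five external gluons.

```fortran
module helicity_sum_sketch
  implicit none
  private
  public :: n_hel, toy_amplitude, amp2_serial, amp2_openmp

  integer, parameter :: dp = kind(1.0d0)
  ! Five external gluons with two helicities each: 2**5 combinations.
  integer, parameter :: n_hel = 32

contains

  ! Hypothetical stand-in for a single helicity amplitude.
  ! This is NOT O'Mega output; it only gives the loop something to sum.
  pure function toy_amplitude(h) result(a)
    integer, intent(in) :: h
    complex(dp) :: a
    a = cmplx(cos(0.1_dp * h), sin(0.3_dp * h), kind=dp) / real(h, dp)
  end function toy_amplitude

  ! Reference: plain serial sum of |A(h)|**2 over all helicity combinations.
  function amp2_serial() result(amp2)
    real(dp) :: amp2
    integer :: h
    amp2 = 0.0_dp
    do h = 1, n_hel
      amp2 = amp2 + abs(toy_amplitude(h))**2
    end do
  end function amp2_serial

  ! Same sum with the loop iterations shared among OpenMP threads.
  ! Each thread accumulates a private partial sum; the reduction clause
  ! adds the partial sums together when the loop ends.
  function amp2_openmp() result(amp2)
    real(dp) :: amp2
    real(dp) :: s
    integer :: h
    s = 0.0_dp
    !$omp parallel do reduction(+:s)
    do h = 1, n_hel
      s = s + abs(toy_amplitude(h))**2
    end do
    !$omp end parallel do
    amp2 = s
  end function amp2_openmp

end module helicity_sum_sketch
```

Compiled with gfortran and "-fopenmp", the directives take effect and OMP_NUM_THREADS selects the number of threads; without that option they are treated as comments and the loop runs serially. Because the reduction changes the order of the floating-point additions, the parallel and serial results should be compared with a small tolerance rather than for exact equality.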
Attachments (3)
Change History (5)
Changed 15 years ago by
Changed 15 years ago by
Attachment: | test_decl.f90 added |
---|
Changed 15 years ago by
Attachment: | test_pamp_21_21_21_21_21.f90 added |
---|
comment:1 Changed 15 years ago by
Status: | new → assigned |
---|
comment:2 Changed 15 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Works and scales almost linearly (tested up to 4 cores; speedup of about 3.9 with 4 threads) in standalone O'Mega:
max. threads: 4
dynamic: elapsed 1.4087 seconds, elapsed * #threads: 5.6347 seconds, amp2 = 0.8660E+04
#threads = 1, elapsed 5.2956 seconds, elapsed * #threads: 5.2956 seconds, amp2 = 0.8660E+04
#threads = 2, elapsed 2.6757 seconds, elapsed * #threads: 5.3514 seconds, amp2 = 0.8660E+04
#threads = 3, elapsed 1.8633 seconds, elapsed * #threads: 5.5899 seconds, amp2 = 0.8660E+04
#threads = 4, elapsed 1.3568 seconds, elapsed * #threads: 5.4274 seconds, amp2 = 0.8660E+04
STOP 0
PASS: test_openmp
To reproduce this, pass --with-openmp to the O'Mega configure script and run make check.
As of r1917, I've started to add OpenMP directives.