
Opened 11 years ago

Closed 11 years ago

#609 closed defect (fixed)

numerical noise in 32bit

Reported by: msekulla Owned by: Juergen Reuter
Priority: P0 Milestone: v2.2.0
Component: core Version: 2.2.0beta
Severity: normal Keywords: 32bit
Cc:

Description

The following tests fail due to numerical noise on a 32bit system:

decays_5 eio_hepmc_1 integrations_4 integrations_5 integrations_6 integrations_7 mci_vamp_1 sf_epa_2 sf_lhapdf_2 simulations_6 simulations_8

For details, see the attachments.

Attachments (2)

32Bit-error.tar.gz (32.4 KB) - added by msekulla 11 years ago.
Diff.log (18.1 KB) - added by msekulla 11 years ago.


Change History (19)

Changed 11 years ago by msekulla

Attachment: 32Bit-error.tar.gz added

Changed 11 years ago by msekulla

Attachment: Diff.log added

comment:1 Changed 11 years ago by Juergen Reuter

Owner: changed from kilian to Juergen Reuter
Status: new → assigned

comment:2 Changed 11 years ago by Juergen Reuter

I managed to install and run WHIZARD on a 32bit SL5 machine. Unfortunately, even more tests fail there. I will look into them in detail when I have time.

comment:3 Changed 11 years ago by Juergen Reuter

Priority: P3 → P0

Will try to fix this next. _kz_

comment:4 Changed 11 years ago by Juergen Reuter

mci_vamp_1: this is clearly numerical noise. I will try to introduce a pacify flag there:

$ diff mci_vamp_1.out ../ref-output/mci_vamp_1.ref 
30c30
<    Error                =  7.1445032289E-05
---
>    Error                =  7.1445032290E-05
51,52c51,52
<      1 936  1.0000520687E+00  7.1445032289E-05  3.3397791400E-01
<  MD5 sum (including results) = '2381B91D9D571FE06D38F5B616CD36AC'
---
>      1 936  1.0000520687E+00  7.1445032290E-05  3.3397791400E-01
>  MD5 sum (including results) = '76F60F053C015F408835D0757C56203F'
71c71
<    error    =  7.1445032289E-05
---
>    error    =  7.1445032290E-05
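For illustration, a tolerance-based comparison would treat the two values in the diff above as equal. This is only a minimal Python sketch, not WHIZARD's actual pacify code; the helper name `same_up_to_noise` and the tolerance `1e-9` are assumptions made for the example.

```python
import math


def same_up_to_noise(a: float, b: float, rel_tol: float = 1e-9) -> bool:
    """Return True if a and b agree within a relative tolerance.

    Hypothetical helper: the tolerance is an assumed value, chosen so that
    differences in the last printed digit count as noise.
    """
    return math.isclose(a, b, rel_tol=rel_tol)


# Values taken from the mci_vamp_1 diff above.
out = float("7.1445032289E-05")
ref = float("7.1445032290E-05")

print(same_up_to_noise(out, ref))  # True: they differ only in the last digit
```

A comparison like this does not help with the MD5 sum, which changes as soon as any printed digit changes; for that, the printed values themselves have to be made identical.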

comment:5 Changed 11 years ago by Juergen Reuter

mci_vamp_1 was true numerical noise, fixed in r5374. Then prc_omega_diags_1 was failing on SL5 because of a very old, broken hyperref.sty; hyperref is switched off for the testsuite in r5377. Down from 18 to 16 failing tests on 32bit SL5.

comment:6 Changed 11 years ago by Juergen Reuter

qedtest_4 definitely differs in more than the last digits; it is a genuine discrepancy.

$ diff qedtest_4.log ../../share/tests/ref-output-double/qedtest_4.ref 
44,47c44,47
<    2       1980  1.6055829E+01  5.53E-01    3.44    1.53*  10.57
<    3       1968  1.4476210E+01  4.28E-01    2.96    1.31*  15.41
<    4       1956  1.4943804E+01  4.74E-01    3.17    1.40   11.27
<    5       1944  1.5735154E+01  4.40E-01    2.80    1.23*  14.25
---
>    2       1992  1.6046227E+01  5.51E-01    3.43    1.53*  10.57
>    3       1992  1.4578766E+01  4.29E-01    2.94    1.31*  14.61
>    4       1992  1.5207985E+01  4.60E-01    3.02    1.35   12.00
>    5       1992  1.5764278E+01  4.21E-01    2.67    1.19*  14.87
49c49
<    5       9840  1.5236978E+01  2.31E-01    1.52    1.50   14.25    1.78   5
---
>    5       9960  1.5346069E+01  2.27E-01    1.48    1.47   14.87    1.49   5
51,55c51,55
<    6       1944  1.4898536E+01  4.33E-01    2.91    1.28   12.26
<    7       1944  1.4688845E+01  4.09E-01    2.78    1.23*  11.45
<    8       1944  1.4300816E+01  4.17E-01    2.92    1.29   10.34
<    9       1944  1.5142474E+01  4.23E-01    2.79    1.23*  10.52
<   10       1944  1.5110407E+01  4.43E-01    2.93    1.29    9.95
---
>    6       1992  1.4527227E+01  4.11E-01    2.83    1.26   12.02
>    7       1992  1.4824831E+01  4.17E-01    2.81    1.26*  11.21
>    8       1992  1.5055591E+01  4.21E-01    2.80    1.25*  10.36
>    9       1992  1.5039113E+01  4.04E-01    2.69    1.20*   9.61
>   10       1992  1.5138569E+01  4.55E-01    3.01    1.34    9.22
57c57
<   10       9720  1.4817661E+01  1.90E-01    1.28    1.26    9.95    0.67   5
---
>   10       9960  1.4908478E+01  1.88E-01    1.26    1.26    9.22    0.35   5

comment:7 Changed 11 years ago by Juergen Reuter

Funnily enough, with quadruple precision the 32bit build shows a lot less numerical noise: only 3 (sic!) tests fail (eio_hepmc, sf_lhapdf and circe1_1).

comment:8 Changed 11 years ago by Juergen Reuter

In r5381, more testflag options have been introduced, reducing the I/O precision for the simulation tests from ES19.12 to ES17.10. On 32bit, the simulations tests pass now, and simulations_6/8 have also been unified between double and quadruple precision. Down to 15 failing tests on 32bit.
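To show what the reduced output precision does, here is a rough Python analogue of the two Fortran edit descriptors. It is a sketch only: the helper `es` and the two sample values are made up for the example and do not come from the testsuite.

```python
def es(value: float, width: int, digits: int) -> str:
    """Rough analogue of Fortran's ESw.d: scientific notation,
    field width `width`, `digits` digits after the decimal point."""
    return f"{value:{width}.{digits}E}"


# Two hypothetical results that differ only in the 12th decimal digit.
a = 1.000052068712
b = 1.000052068713

# ES19.12 exposes the difference ...
print(es(a, 19, 12), es(b, 19, 12))
# ... while ES17.10 prints the same string for both.
print(es(a, 17, 10), es(b, 17, 10))
```

Printing fewer digits only hides noise that sits below the last printed digit. Two values that straddle a rounding boundary can still come out differently.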

comment:9 Changed 11 years ago by Juergen Reuter

The helicity test was pure numerical noise (last digit of the final integral). Pacified in r5386. Down to 14 failing tests under 32bit.

comment:10 Changed 11 years ago by Juergen Reuter

True numerical noise in decays_5; pacify_phs needs a slightly higher tolerance threshold for 32bit. Pacified in r5387. Down to 13 failures.

comment:11 Changed 11 years ago by Juergen Reuter

Also true numerical noise in the eio_hepmc_1 test. Fixed by blanking out the last two digits of the numbers (with proper rounding). This also unifies the reference files for double and quadruple precision. Done in r5390. Still 12 failures left. :(
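The idea of dropping the last two digits with rounding can be sketched as a text filter over the test output. This is an assumption-laden Python illustration, not the code from r5390: the function name `drop_last_digits` and the regular expression are hypothetical, and the rounding goes through a double-precision float, so it is only reliable for numbers with up to about 15 significant digits.

```python
import re

# Numbers in scientific notation such as 7.1445032289E-05 (assumed format).
SCI = re.compile(r"[-+]?\d\.\d+E[-+]\d+")


def drop_last_digits(text: str, n_drop: int = 2) -> str:
    """Re-print every matched number with n_drop fewer mantissa digits,
    rounding instead of truncating."""

    def _round(match: re.Match) -> str:
        token = match.group(0)
        mantissa = token.split("E")[0]
        decimals = len(mantissa.split(".")[1])
        keep = max(decimals - n_drop, 0)
        return f"{float(token):.{keep}E}"

    return SCI.sub(_round, text)


out_line = "   Error                =  7.1445032289E-05"
ref_line = "   Error                =  7.1445032290E-05"

print(drop_last_digits(out_line))
print(drop_last_digits(out_line) == drop_last_digits(ref_line))  # True
```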

comment:12 Changed 11 years ago by Juergen Reuter

And (finally) pacified the numerical noise in the integrations and integrations_history tests. It arose because VAMP/phs_wood is not well suited to integrating constant matrix elements. Done in r5402. Down to 10 failing tests...

comment:13 Changed 11 years ago by Juergen Reuter

In r5403, sf_lhapdf_2 was pacified. It was indeed numerical noise in the matrix element / state matrix output. Down to 9 failing tests.

testproc_2 and testproc_10 have the same problem as the integrations tests above. We need an external flag here. Will do that after the gym.

comment:14 Changed 11 years ago by Juergen Reuter

For testproc_2/3/10 and smtest_9/10, these were simple numerical fluctuations. They could be damped by using an error_threshold setup. Done in r5405. Down to 4 failing tests. (!)

comment:15 Changed 11 years ago by Juergen Reuter

In r5408, shower_2 is pacified. It was numerical noise in the last digits of the 4-momenta of the event_transform as well as in the calculation of the mass squares. Now, there are 3 failures remaining:

  1. sf_isr
  2. sf_epa
  3. qedtest_4
Items 1 and 2 are true numerical instabilities in the sampling of non-collinear splittings; item 3 is a true numerical instability for the process e+ e- -> A A A. Any comment is highly appreciated.
Last edited 11 years ago by Juergen Reuter

comment:16 Changed 11 years ago by Juergen Reuter

Intermediate step: catching numerical noise for the susyhit test, in case it is available for 32bit. Not yet counted.

qedtest_4 was numerical noise due to the weight adaptation. Switching it off (i.e. "gw" -> "g") eliminated the noise. What this means remains to be discussed.

Leftover: the two sf_XXX non-collinear splittings.

comment:17 Changed 11 years ago by Juergen Reuter

Resolution: fixed
Status: assigned → closed

Finally done in r5418. For sf_epa_2, I artificially set the electron mass to 5 GeV, which suppresses the numerical noise for non-collinear splitting (which is momentum-violating anyhow). For sf_isr_2, this was not doable, so I piped the output through a scratch file and X-ed out the noise. Closing.
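X-ing out noisy digits can be pictured as masking the trailing mantissa digits of each number in the output text. The following Python sketch is an illustration only and not the change made in r5418: the function name `x_out`, the number of masked digits and the regular expression are assumptions, and the scratch-file step is left out.

```python
import re

# Mantissa and exponent of numbers such as 7.1445032289E-05 (assumed format).
SCI = re.compile(r"(\d\.\d+)(E[-+]\d+)")


def x_out(text: str, n_mask: int = 3) -> str:
    """Replace the last n_mask mantissa digits of each number with 'X'."""

    def _mask(match: re.Match) -> str:
        mantissa, exponent = match.group(1), match.group(2)
        decimals = len(mantissa) - 2  # digits after the "d." prefix
        n = min(n_mask, decimals)
        return mantissa[: len(mantissa) - n] + "X" * n + exponent

    return SCI.sub(_mask, text)


print(x_out("7.1445032289E-05"))  # 7.1445032XXXE-05
print(x_out("7.1445032289E-05") == x_out("7.1445032290E-05"))  # True
```

Masking truncates rather than rounds, so two values whose noise carries into an unmasked digit (for example ...2999 versus ...3000) would still differ.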
