whizard is hosted by Hepforge, IPPP Durham

Opened 15 years ago

Closed 13 years ago

#235 closed defect (fixed)

make check fails if WHIZARD compiled with "nagfor -nan -C=all"

Reported by: sschmidt Owned by: Juergen Reuter
Priority: P5 Milestone: v2.0.6
Component: core Version: 2.0.0rc2
Severity: minor Keywords: nagfor nan
Cc:

Description

When compiling with nagfor and the -nan flag (at least for those files for which nagfor doesn't crash; -nan was omitted for the others), some uses of undefined variables are found (I just played around with which files to compile with -nan):
commands.f90 line 1373
sf_lhapdf.f90 line 269
vamp_bundle.f90 line 2371
omega_couplings.f90 line 425
processes.f90 line 1944 (Note that this is src/whizard-core/processes.f90, NOT the automatically generated file in the directory you start Whizard from. Maybe the automated file should have a different name??? This confused me a bit.)
Tested with W2 r1924 and nagfor 5.2(717).

Change History (19)

comment:1 Changed 15 years ago by Juergen Reuter

Owner: changed from kilian to Juergen Reuter
Status: new → assigned

I updated to nagfor 5.2(721), where they quickly and efficiently fixed Sebastian's bug. The problems seem to arise exclusively from uninitialized real and complex components in type declarations.
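As an illustration of this class of problem, here is a minimal Fortran sketch. The module, type and component names (coupling_demo, coupling_t, mass, g) are hypothetical and not taken from the WHIZARD sources. The sketch assumes the documented behavior of the nagfor -nan option, which presets REAL and COMPLEX variables to a signalling NaN so that any read before assignment traps at run time. Giving the components default initialization is one way to make sure no such read can happen.

```fortran
! Minimal sketch; all names here are hypothetical, not WHIZARD's.
module coupling_demo
  implicit none
  private
  public :: default, coupling_t

  integer, parameter :: default = kind (1.0d0)

  type :: coupling_t
     ! Default initialization: every new object of this type starts
     ! with defined values, so nothing is read before it is set.
     real(default) :: mass = 0
     complex(default) :: g = 0
  end type coupling_t
end module coupling_demo

program main
  use coupling_demo
  implicit none
  type(coupling_t) :: c
  ! Safe even without an explicit assignment to c.
  print *, c%mass, c%g
end program main
```

Without the "= 0" initializers, the print statement would read undefined components, which is the kind of access a -nan build is meant to expose.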

comment:2 Changed 15 years ago by Juergen Reuter

I fixed the one in commands.f90, and the one in processes.f90. (r1930)

comment:3 Changed 15 years ago by Juergen Reuter

In r1933, I fixed the problem in sf_lhapdf (data%xmin/max -> xmin/max), and a couple of problems in the model parameter files. I still have to track down the general problem for structure functions, which now bites for both LHAPDF and EPA. There is now a strange thing in g_gg in omega_couplings, different from the line Seb. reported.

comment:4 Changed 15 years ago by Juergen Reuter

OK, that was spurious, continuing after swimming.

comment:5 Changed 15 years ago by Juergen Reuter

I do not know exactly what this means, but the susyhit test fails with the following remark:

| Writing log to 'whizard.log'
|=============================================================================|
|                               WHIZARD 2.0.0_rc3
|=============================================================================|
| Initializing process library 'processes'
| Reading model file 'SM.mdl'
| Using model: SM
| Reading commands from file 'susyhit.sin'
| Reading model file 'MSSM.mdl'
| Switching to model 'MSSM': reassigning model parameters
| Writing SLHA output file 'suspect2_lha.in'
| command: susyhit
read start: end of file
apparent state: unit 25 named susyhit.in
last format: list io
lately reading direct formatted external IO
| Return code = 134
******************************************************************************
*** ERROR: System command returned with nonzero status code
******************************************************************************
| Reading SLHA input file 'susyhit_slha.out'
******************************************************************************
******************************************************************************
*** FATAL ERROR: SLHA input file 'susyhit_slha.out' not found
******************************************************************************
******************************************************************************
| There were  1 error(s) and no warnings.
WHIZARD run aborted.

comment:6 Changed 15 years ago by Juergen Reuter

This is the traceback from the structure function problem, if someone has any ideas, they are highly welcome:

| 1000 calls, 4 channels, 4 dimensions, 20 bins, stratified = T
|=============================================================================|
| It      Calls  Integral[fb]  Error[fb]   Err[%]    Acc  Eff[%]   Chi2 N[It] |
|=============================================================================|
Runtime Error: *** Arithmetic exception: Floating invalid operation - aborting
../../../src/vamp/vamp_bundle.f90, line 2371: Error occurred in DIVISIONS:PROBABILITY_S
../../../src/vamp/vamp_bundle.f90, line 2083: Called by DIVISIONS:PROBABILITY_V
../../../src/vamp/vamp_bundle.f90, line 4046: Called by VAMP_REST:VAMP_PROBABILITY
processes.f90, line 1987: Called by PROCESSES:PROCESS_COMPUTE_VAMP_PHS_FACTOR
processes.f90, line 3049: Called by PROCESSES:SAMPLE_FUNCTION
../../../src/vamp/vamp_bundle.f90, line 3961: Called by VAMP_REST:VAMP_SAMPLE_GRID0
../../../src/vamp/vamp_bundle.f90, line 5018: Called by VAMP_REST:VAMP_SAMPLE_GRIDS
processes.f90, line 2194: Called by PROCESSES:PROCESS_INTEGRATE
commands.f90, line 3800: Called by COMMANDS:CMD_INTEGRATE_EXECUTE
commands.f90, line 1726: Called by COMMANDS:COMMAND_EXECUTE
commands.f90, line 5743: Called by COMMANDS:COMMAND_LIST_EXECUTE
whizard.f90, line 200: Called by WHIZARD:WHIZARD_PROCESS_STREAM
whizard.f90, line 176: Called by WHIZARD:WHIZARD_PROCESS_FILE
main.f90, line 216: Called by MAIN
Aborted (core dumped)

comment:7 Changed 15 years ago by Juergen Reuter

No progress on that one; it is very difficult. I can't find it.

comment:8 Changed 15 years ago by Juergen Reuter

It seems that the x values of the process_t type are not properly defined or initialized. This triggers an error when trying to access them (made public) in process_set_kinematics.

comment:9 Changed 15 years ago by Juergen Reuter

I believe that it is the assignment for the three different variables process%x, process%x_hi and process%x_strfun. But no proof yet, and no definite hint.

comment:10 Changed 15 years ago by Juergen Reuter

Resolution: fixed
Status: assigned → closed

All NaNs have been caught in r1951. The problem with susyhit is spurious: it appeared only because the test was not run with the proper shell script. Closing.

comment:11 Changed 14 years ago by Juergen Reuter

Milestone: v2.0-rc3

Milestone v2.0-rc3 deleted

comment:12 Changed 13 years ago by sschmidt

Milestone: v2.0.6
Priority: P3 → P4
Resolution: fixed
Severity: normal → minor
Status: closed → reopened

With r3554, two checks within make check fail when compiling with "nagfor -nan" but pass when "-nan" is omitted.

My nagfor version is 5.2(776), the failing checks are evaluators.run and processes.run. My command line was

../trunk/configure --prefix=... --enable-shower FC=nagfor FCFLAGS="-g90 -nan -gline" && m -j && m install && m check

comment:13 Changed 13 years ago by Juergen Reuter

Could you please trace back through the revisions to find out since when this has been broken? If it has been broken basically from the beginning, it would be good to find out where the NaN is actually coming from... (sometimes the reporter might get punished)

comment:14 Changed 13 years ago by sschmidt

These checks have been broken since they were introduced: processes.run was introduced in r3413, evaluators.run a bit earlier. (Juergen increased the number of checks from 23 to 49, finishing with r3414.)

comment:15 Changed 13 years ago by Juergen Reuter

Priority: P4 → P5

Ok, next interesting question: can you find out where the NaN is coming from and find a solution to that? Otherwise I will look into that in 2-3 weeks.

comment:16 Changed 13 years ago by sschmidt

OK, here's what I found:

The error in processes.run stems from the function phs_forest_evaluate_momenta. There the relevant code is

  subroutine phs_forest_evaluate_momenta &
       (forest, channel, active, sqrts, x, factor, volume, ok)
...
    real(default), dimension(:), intent(out) :: factor
...
    call phs_tree_compute_momenta_from_x (forest%grove(g)%tree(t), &
         forest%prt, factor(channel), volume, sqrts, x(:,channel), ok)
...
  end subroutine phs_forest_evaluate_momenta

with no other access to factor. So you have an array declared as intent(out) and only one element, factor(channel), gets assigned a value.
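A minimal sketch of why this pattern matters follows. The names (intent_demo, set_one_out, set_one_inout) are hypothetical stand-ins, not the WHIZARD routines. Under the Fortran standard, an intent(out) dummy argument becomes undefined on entry, so after the call only the element that was assigned has a defined value, even if the caller had filled the whole array beforehand. The intent(inout) variant is shown only as one possible alternative; comment:19 attributes the failure to a wrong intent but does not say which replacement was used in r3571.

```fortran
! Minimal sketch; all names here are hypothetical, not WHIZARD's.
module intent_demo
  implicit none
  private
  public :: default, set_one_out, set_one_inout

  integer, parameter :: default = kind (1.0d0)

contains

  ! Problematic pattern: the whole array becomes undefined on entry,
  ! but only one element is given a value.
  subroutine set_one_out (factor, channel)
    real(default), dimension(:), intent(out) :: factor
    integer, intent(in) :: channel
    factor(channel) = 1
  end subroutine set_one_out

  ! Possible alternative (an assumption, see the text above):
  ! intent(inout) preserves the caller's values in the other elements.
  subroutine set_one_inout (factor, channel)
    real(default), dimension(:), intent(inout) :: factor
    integer, intent(in) :: channel
    factor(channel) = 1
  end subroutine set_one_inout

end module intent_demo

program main
  use intent_demo
  implicit none
  real(default), dimension(3) :: factor

  factor = 0
  call set_one_inout (factor, 2)
  print *, factor      ! all three elements are still defined

  factor = 0
  call set_one_out (factor, 2)
  print *, factor(2)   ! only this element is guaranteed to be defined
end program main
```

Most compilers leave the old values in place after the intent(out) call, which is why the problem can stay hidden until a checking build such as "nagfor -nan" marks the unassigned elements as undefined.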

The NaN in evaluators.run stems from a call to external_link_get_ptr (int%source(i)) in interaction_receive_momenta (int) returning an uninitialized source. But I've not got further than this.

comment:17 Changed 13 years ago by sschmidt

Summary: W2 fails if compiled with "nagfor -nan" → make check fails if WHIZARD compiled with "nagfor -nan -C=all"

Just clarifying: make check (particles.run, polarizations.run, hepmc.run, processes.run) fails when compiled with "nagfor -nan -C=all"; when omitting the "-C=all", only processes.run fails.

comment:18 Changed 13 years ago by Juergen Reuter

The processes test is the one remaining failure. It is strange: compiled without any -C=all or -nan flag, there are completely valid entries in all the files, while NAG complains when compiled with these two options. Could even be a NAG bug...

comment:19 Changed 13 years ago by Juergen Reuter

Resolution: fixed
Status: reopened → closed

Oops, no, it was a real bug: wrong intent in the phs_forest place ;) Corrected in r3571.
