Opened 12 years ago
Closed 12 years ago
#483 closed defect (fixed)
Discuss support for gfortran 4.6
Reported by: | kilian | Owned by: | kilian |
---|---|---|---|
Priority: | P0 | Milestone: | v2.2.0 |
Component: | core | Version: | 2.1.1 |
Severity: | critical | Keywords: | |
Cc: |
Description
With the latest changes, gfortran 4.5 is probably out of the game (no/broken OO support, etc.). We should aim at supporting 4.6. Currently, 4.6.3 is ok (some bugs, but workarounds possible).
Check how severe the problems with 4.6.0 are.
Attachments (6)
Change History (36)
comment:1 Changed 12 years ago by
comment:2 Changed 12 years ago by
To veto 4.5.0, maybe the OO check in configure should test for something that is implemented in 4.6.0 but not yet in 4.5.0. Is there such a feature? Such a test would be more robust and compiler-independent than checking the version number. Of course, we must be sure that 4.5.1-4 do not already have this feature. After that, I guess we should switch off the 4.5.0 check on Jenkins.
comment:3 Changed 12 years ago by
One more piece of information: with gfortran 4.6.2 everything compiles, but one of the tests fails, the one for the process libraries:
FAIL: process_libraries.run (exit: 139)
=======================================
Running script ./process_libraries.run
| ============================================================================
| Running self-test: process_libraries
| ----------------------------------------------------------------------------
/afs/desy.de/group/theorie/software/packages/whizard/build/test/run_whizard.sh: line 24: 24653 Segmentation fault (core dumped) ../src/whizard --logfile $basename.log --library lib$libname --rebuild $* $script.sin
comment:4 Changed 12 years ago by
4.6.0 should be checked by Jenkins, right? And 4.7.0 as well? I'm going to check 4.7.1/2 in the next few days ...
comment:5 Changed 12 years ago by
Component: | configure → core |
---|---|
Owner: | changed from ALL to kilian |
Priority: | P1 → P0 |
Severity: | normal → blocker |
With gfortran 4.7.2 it compiles, but ALL tests fail! For process_libraries, test 6 fails (file attached). For omega_interfaces, see the attached file. For prclib_interfaces, tests 4, 5 and 6 fail.
comment:6 Changed 12 years ago by
Thanks!
If the tests fail but produce output, it's not too bad ... earlier gfortran versions died with a segfault or memory corruption.
Let's see ...
comment:7 Changed 12 years ago by
OK, this is trivial: JR, on your system the filename of a shared lib is
libname.0.so
instead of
libname.so.0
which I assumed. So the reference output is not portable. Otherwise everything ok. Which system was that?
comment:8 Changed 12 years ago by
Hahahaha, no, actually it was Mac OS X, and the name is indeed something like libname.0.so. Actually, I am completely confused, because I thought that on Mac OS X it is _always_ libname.dylib! Somehow WK seems to have overruled the Mac OS X convention, which is for sure not autoconf/make-compliant, and that makes me kinda nervous, actually!
comment:9 Changed 12 years ago by
Moreover, I got a LaTeX error. But maybe it is only spurious.
comment:11 Changed 12 years ago by
The same holds for 4.7.0, so it seems we can tackle the problem for all 4.7.x.
comment:13 Changed 12 years ago by
For 4.6.2 it is almost the same, except that I get a bus error/segfault for the process libraries test. On Linux, the omega and prclib tests actually work, so it is probably indeed only the incompatibility between the hardwired library-name assumptions and Mac OS X. Which is a problem, though!
comment:14 Changed 12 years ago by
Finally, 4.6.1 shows the same error as 4.6.0 reported above. To be discussed!
comment:15 Changed 12 years ago by
This is the backtrace I get for the segfault in the process_libraries test:
(gdb) bt
#0 0x00007ffff6a76d20 in process_libraries::process_def_list_append (list=..., entry=0x60ca50) at process_libraries.f90:727
#1 0x00007ffff6a8a7c6 in process_libraries::process_libraries_2 (u=12) at process_libraries.f90:1327
comment:16 Changed 12 years ago by
OK, doing the full backtrace with a program compiled in debug mode yields:
Running test: process_libraries_1
Program received signal SIGSEGV, Segmentation fault.
0x0000000000409e7e in iso_varying_string::len_ (string=<error reading variable: Cannot access memory at address 0x100000000>) at iso_varying_string.f90:1009
1009 if(ALLOCATED(string%chars)) then
(gdb) bt
#0 0x0000000000409e7e in iso_varying_string::len_ (string=<error reading variable: Cannot access memory at address 0x100000000>) at iso_varying_string.f90:1009
#1 0x00000000005dac9a in process_libraries::process_def_write (object=..., unit=12) at process_libraries.f90:527
#2 0x00000000005d53b8 in process_libraries::process_def_list_write (object=..., unit=12) at process_libraries.f90:690
#3 0x00000000005cd6a0 in process_libraries::process_libraries_1 (u=12) at process_libraries.f90:1248
#4 0x0000000000413435 in unit_tests::test (test_proc=0x5cd513 <process_libraries::process_libraries_1>, name=<error reading variable: Cannot access memory at address 0x614d78>, description=<error reading variable: Cannot access memory at address 0x614d66>, u_log=11, results=..., _name=19, _description=18) at unit_tests.f90:145
#5 0x00000000005cd7bc in process_libraries::process_libraries_test (u=11, results=...) at process_libraries.f90:1222
#6 0x00000000005f30d5 in whizard::whizard_check (check=..., lhapdf_present=.TRUE., results=...) at whizard.f90:122
#7 0x00000000005f7f4d in MAIN__ ()
#8 0x00000000005f8e44 in main ()
(gdb)
comment:17 Changed 12 years ago by
Priority: | P0 → P2 |
---|---|
Severity: | blocker → critical |
Summary of the present situation:
All gfortran versions from 4.6.3 on are ok.
We don't support versions prior to 4.6.0.
4.6.0 to 4.6.2 have problems. Since the current tests work with later versions, these are compiler bugs. We still have to check whether it is feasible to work around those bugs. I rank this down now, but the issue has to be resolved before the next release.
comment:18 Changed 12 years ago by
Priority: | P2 → P0 |
---|
For the distribution that is very important and remains my most urgent task.
comment:19 Changed 12 years ago by
With Janus' response to JR's enquiry, the 4.6.x issues may be solvable. Will check this asap, provided the Siegen network is up and running again.
comment:20 Changed 12 years ago by
Given the new failure with r4050, I'm inclined to trash 4.6.x altogether ... arggh!
Not yet giving up ...
Changed 12 years ago by
comment:21 Changed 12 years ago by
Attachment fptr.f90 isolates the bug in 4.6.3.
The problem occurs when I extend a basic type, such that the extended type contains a procedure pointer. The target is a function.
Calling the function, after correctly assigning the procedure pointer, results in a segfault. The problem doesn't occur with a non-polymorphic type.
JR: Maybe you could find a corresponding bugzilla entry (Janus?)? The bug appears to be fixed in 4.7 and later, so it should be a known problem.
Apparently, a workaround is to change the function into a subroutine. Pointers to subroutines work (at least in a short test). This is not nice, but maybe I should do it this way ... it would be more natural to get the OMega amplitude as a function.
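For illustration, the subroutine-based workaround could look roughly like the following. This is only a sketch and not the attached fptr.f90; all names (base_t, driver_t, amp_proc, square_amp) are made up for the example, and it has not been checked against 4.6.3 itself.

```fortran
! Sketch: an extended type carries a procedure pointer whose target is a
! SUBROUTINE (result via an intent(out) argument) rather than a FUNCTION.
! All names here are hypothetical.
module fptr_sketch
  implicit none
  private
  public :: base_t, driver_t, square_amp

  type :: base_t
     integer :: id = 0
  end type base_t

  abstract interface
     subroutine amp_proc (x, amp)
       real, intent(in) :: x
       real, intent(out) :: amp
     end subroutine amp_proc
  end interface

  type, extends (base_t) :: driver_t
     ! Procedure pointer component, initially not associated
     procedure(amp_proc), nopass, pointer :: compute => null ()
  end type driver_t

contains

  ! Stand-in for the real matrix element code
  subroutine square_amp (x, amp)
    real, intent(in) :: x
    real, intent(out) :: amp
    amp = x * x
  end subroutine square_amp

end module fptr_sketch

program main
  use fptr_sketch
  implicit none
  class(base_t), allocatable :: object
  real :: amp
  ! Polymorphic object, since the problem was only seen in that case
  allocate (driver_t :: object)
  select type (object)
  type is (driver_t)
     object%compute => square_amp
     call object%compute (3.0, amp)
     print *, "amp =", amp
  end select
end program main
```

The price is that the amplitude comes back through an argument instead of a function result, so it cannot be used directly inside an expression.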
comment:22 Changed 12 years ago by
I sent an email to Janus. I'm completely confused about the status of the different compilers at the moment. Because of SL7 and other distributions (like Debian), I would definitely insist on keeping 4.6.x (let's see how small x can really be).
comment:23 Changed 12 years ago by
So I work around the issue with 4.6.3 in r4060. The matrix element code is accessed only by subroutines, not by functions. (Doesn't affect the O'Mega-generated code, only the automatically generated driver code.)
The tests with 4.6.3 are successful, again.
comment:24 Changed 12 years ago by
r4066 revealed another bug in gfortran 4.6.3.
This time, unrelated to OO stuff. Instead, it is triggered by an allocatable scalar containing an allocatable array. (The latter being in reality the ISO varying string type.)
Here is a minimal example:
module objects
  type :: data_t
     real(4) :: number = 0
     character, dimension(:), allocatable :: chars
  end type data_t
  type :: object_t
     type(data_t), allocatable :: data
  end type object_t
contains
  subroutine sub
    type(object_t) :: object
    call do_something (object)
  end subroutine sub
  subroutine do_something (object)
    type(object_t), intent(in) :: object
  end subroutine do_something
end module objects

program main
  use objects
  call sub
end program main
With gfortran 4.6.3, this compiles, but segfaults immediately when run. Note that there is no real code executed, it is just the memory layout.
This is REALLY annoying. gfortran 4.6 is much worse than 4.5 in that respect: too many bugs in supported features. Fortunately, 4.7 is in a much better shape, but what shall we do?
comment:26 Changed 12 years ago by
Yes, in the concrete case it's trivial. But I fear that issues like this surface every other day ... and in general allocatable scalars are mandatory for data abstraction.
comment:27 Changed 12 years ago by
For gfortran 4.6.2 the problem is the following: WK starts the test with an empty process_def_list, whose entries first and last are set to => null (). However, gfortran 4.6.2 incorrectly interprets them as associated. I don't see a simple way to program around this!? If that problem is really unavoidable and persists in 4.6.2, it does not make much sense to try to get gfortran 4.6.[1-2] to work, right? I mean, I'm fine with vetoing 4.6.[0-2] from now on if WK gives his final statement / opinion about this. WK, do you know a way to work around this problem with 4.6.2?
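To make the pattern concrete, here is a reduced sketch of a list whose first and last pointers are default-initialized to null (). It is reconstructed from the description above, not taken from process_libraries.f90; the names (entry_t, list_t, list_append) are hypothetical, and I have not verified that this reduced case actually triggers the 4.6.2 problem.

```fortran
! Sketch: empty list with default-initialized null pointers.
! All names are hypothetical.
module list_sketch
  implicit none

  type :: entry_t
     type(entry_t), pointer :: next => null ()
  end type entry_t

  type :: list_t
     type(entry_t), pointer :: first => null ()
     type(entry_t), pointer :: last => null ()
  end type list_t

contains

  subroutine list_append (list, new_entry)
    type(list_t), intent(inout) :: list
    type(entry_t), pointer :: new_entry
    ! If an empty list is wrongly seen as associated, this branch
    ! dereferences an invalid pointer.
    if (associated (list%last)) then
       list%last%next => new_entry
    else
       list%first => new_entry
    end if
    list%last => new_entry
  end subroutine list_append

end module list_sketch

program main
  use list_sketch
  implicit none
  type(list_t) :: list
  type(entry_t), pointer :: new_entry
  ! A conforming compiler reports F for both pointers of the empty list
  print *, "empty list: ", associated (list%first), associated (list%last)
  allocate (new_entry)
  call list_append (list, new_entry)
  print *, "after append: ", associated (list%first), associated (list%last)
  deallocate (new_entry)
end program main
```

If the first line already shows T, the append takes the wrong branch, which would fit a segfault in process_def_list_append like the one in the backtrace of comment:15.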
comment:29 Changed 12 years ago by
Summary: | Discuss support for gfortran 4.5, 4.6 → Discuss support for gfortran 4.6 |
---|
Discussion about gfortran 4.5 closed, 4.6 still pending ...
comment:30 Changed 12 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
After the discussions today we decided that supporting gfortran 4.6.0/1/2 is not worth the effort, as those early versions of the compiler have severe bugs and deficiencies. At the moment, the strategy for the future is to keep gfortran 4.6.3+ on board as long as possible, as this will be the default compiler for the next Debian (correct?). I already removed the gfortran 4.6.0 tests from Jenkins so that they do not bother us any longer. The vetoing against gfortran 4.6.0/1/2 is done in r4090.
Unfortunately, 4.6.0 turns out to be disastrous :((((
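For reference, a version veto of this kind can also be expressed as a small preprocessed Fortran program. This is only a sketch and not the actual r4090 change; it assumes the macros __GFORTRAN__, __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__ that gfortran predefines when preprocessing, and the names gfortran_veto and version_ok are made up.

```fortran
! Sketch: reject gfortran older than 4.6.3. Save with a .F90 suffix (or
! compile with -cpp) so that the preprocessor runs. Names are hypothetical.
program gfortran_veto
  implicit none
#ifdef __GFORTRAN__
  if (.not. version_ok (__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__)) then
     print *, "gfortran 4.6.3 or newer is required"
     stop 1
  end if
#endif
  ! Other compilers skip the check above and are accepted here
  print *, "compiler accepted"
contains
  function version_ok (major, minor, patch) result (ok)
    integer, intent(in) :: major, minor, patch
    logical :: ok
    ! Accept 4.6.3+, 4.7+ and any later major version
    ok = major > 4 .or. &
         (major == 4 .and. (minor > 6 .or. (minor == 6 .and. patch >= 3)))
  end function version_ok
end program gfortran_veto
```

A configure test could compile and run this and treat a nonzero exit status as a veto. A check on the compiler's version string would do the same job without running anything.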