whizard is hosted by Hepforge, IPPP Durham

Opened 14 years ago

Closed 13 years ago

Last modified 7 years ago

#161 closed task (fixed)

Resurrect the Cruise Control

Reported by: Juergen Reuter
Owned by: boschmann, ohl
Priority: P1
Milestone: v42-backlog
Component: interfaces
Version: 2.0.4
Severity: critical
Keywords:
Cc:

Description

Create a cruise control. Sometimes new check-ins arrive faster than they can be checked on the different systems. We need specific scripts and control instances to guarantee the sanity of the builds.

Change History (28)

comment:1 Changed 14 years ago by ohl

Isn't that what feature freezes are for?

Ideally we would have a build farm that regularly checks out the latest revision, runs distcheck on all architectures and mails us the error messages. I could set this up for 64bit Debian.

comment:2 Changed 14 years ago by kilian

Priority: changed from P1 to P2
Severity: changed from blocker to major

The message is actually to use branches more freely, and check in to the trunk only if tests are done.

Applies to myself ...

We can't do this for rc1, since this is almost(TM) ready anyway. This is not P1.

comment:3 Changed 14 years ago by ohl

Milestone: changed from v2.0final to v2.0-rc3

As of r1717, there is a new make target

make extra-distcheck

that can be used to run

make distcheck

for different configure options that disable features which might be unavailable on target systems. Currently we have

EXTRA_DISTCHECK_CONFIGURE_FLAGS = \
    "--disable-noweb" \
    "--disable-noweb --disable-omega" \
    "--disable-noweb --disable-shared" \
    "--disable-noweb --disable-static"

--disable-noweb saves time for the other targets and noweb is tested by the normal make distcheck anyway.

This could be used by an automated system started by cron.
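As a rough illustration of what such a cron-driven run might look like, the sketch below loops over the same four flag sets and calls make distcheck once per set. It is not the rule added in r1717: the function name run_extra_distcheck and the MAKE override are assumptions made for this example, while DISTCHECK_CONFIGURE_FLAGS is the standard Automake variable for passing configure options to distcheck.

```shell
#!/bin/sh
# Sketch only: one "make distcheck" per configure-flag set.
# run_extra_distcheck and the MAKE override are hypothetical names,
# not taken from the WHIZARD build system.

MAKE=${MAKE:-make}

run_extra_distcheck() {
    failed=0
    for flags in \
        "--disable-noweb" \
        "--disable-noweb --disable-omega" \
        "--disable-noweb --disable-shared" \
        "--disable-noweb --disable-static"
    do
        echo "distcheck with: $flags"
        # Automake forwards DISTCHECK_CONFIGURE_FLAGS to configure.
        if ! $MAKE distcheck DISTCHECK_CONFIGURE_FLAGS="$flags"; then
            echo "FAILED: $flags"
            failed=1
        fi
    done
    # Non-zero if any flag set failed, so cron can mail the output.
    return "$failed"
}
```

A cron entry would then change into a configured build directory, source this file and call run_extra_distcheck. Continuing after a failure, instead of stopping at the first one, means a single run reports every broken configuration.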

comment:4 Changed 14 years ago by Juergen Reuter

Milestone: changed from v2.0-rc3 to v2.0final

comment:5 Changed 14 years ago by ohl

Milestone: changed from v2.0.0final to golden-classics

comment:6 Changed 14 years ago by Juergen Reuter

Owner: changed from ALL to ohl

Not yet finalized, especially not documented at all, therefore basically non-existent. TO has to provide, or give clearance for HUDSON.

comment:7 Changed 14 years ago by Juergen Reuter

Priority: changed from P2 to P1
Severity: changed from major to blocker

comment:8 Changed 14 years ago by Juergen Reuter

Owner: changed from ohl to boschmann, ohl

comment:9 Changed 14 years ago by Boschmann

I have started to set up a hudson server.

comment:10 Changed 14 years ago by Boschmann

HUDSON server is running.

Implemented so far:

Not yet implemented:

  • No Hudson clients: There is only one server so far, on a single-core AMD x64 CPU running Ubuntu.
  • No tests with missing packages like noweb, ocaml, metapost
  • no multithreaded tests, no make -j

comment:11 Changed 14 years ago by Boschmann

Client/Server and examples are implemented

  • configure, make, make check, make distcheck, whizard examples/*.sin on 32bit linux with gcc 4.5.0 and make -j 3
  • configure, make, make check, make distcheck, whizard examples/*.sin on 64bit linux with gcc 4.5.latest
  • configure, make on 64bit linux with nagfor (no email notification)

Distribution tarball upload is deactivated but logfiles are available at:

http://event-generator.tp1.physik.uni-siegen.de/hudson/artefacts/logfiles/

comment:12 Changed 14 years ago by Juergen Reuter

Sounds good. So we are waiting for your Mac Mini purchase to also include Mac OS X 10.6 Snow Leopard here. Can we do this also for NAG, license-wise? The reason for switching off the email notification for the NAG compiler is DW's code in MIISR, isn't it? I hope this will be fixed this week. Are there any major tasks missing right now? We can discuss the gory details at the next meeting and then eventually close this ticket in early September.

comment:13 Changed 14 years ago by Juergen Reuter

As a note: on the Mac one can install the file system as case-sensitive. However, for the Mac Mini we should not do this, as the default is case-insensitive, and we have to test for this case as well.
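One cheap check that can run on any platform, independent of the Mac itself, is to look for file names in the source tree that differ only by letter case, since those would collide on a case-insensitive volume. The sketch below is an illustration under that assumption and not part of the WHIZARD setup; the function name case_collisions is hypothetical.

```shell
# Sketch only: read one path per line on stdin and print every group
# of paths that differ only by letter case. case_collisions is a
# hypothetical helper name.
case_collisions() {
    awk '
        {
            key = tolower($0)          # case-folded form of the path
            n[key]++
            group[key] = group[key] " " $0
        }
        END {
            for (k in n)
                if (n[k] > 1)
                    print "collision:" group[k]
        }
    '
}
```

It could be fed from find, for example find . -print | case_collisions, and an empty result means no two paths clash. This does not replace a real build on a case-insensitive file system; it only catches the most obvious class of problems early.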

comment:14 Changed 14 years ago by Juergen Reuter

Priority: changed from P1 to P3
Severity: changed from blocker to normal

As this is becoming more and more of a maintenance task, I am ranking it down.

comment:15 Changed 14 years ago by Boschmann

Resolution: fixed
Status: changed from new to closed

MAC MINI is running as hudson8

The only problem so far is that make -j always exits with code 0 on Darwin, even when the build fails. Therefore, the hudson job will never be tagged as "failed" as long as -j is set.

In total, there are no major problems any more so I close the ticket.
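One possible workaround for the make -j exit-code problem described above, sketched here as an assumption and not as what was done on hudson8, is to keep the build log and additionally scan it for the "***" marker that GNU make prints when a target fails. The function names, the MAKE override and the JOBS variable are hypothetical.

```shell
# Sketch only: do not trust the exit status of a parallel make alone.
# log_has_make_error, checked_parallel_make, MAKE and JOBS are
# hypothetical names introduced for this example.

# True if the log contains a GNU make failure line such as
# "make[1]: *** [foo.o] Error 1".
log_has_make_error() {
    grep -Eq '^g?make(\[[0-9]+\])?: \*\*\* ' "$1"
}

# Usage: checked_parallel_make LOGFILE [make arguments...]
checked_parallel_make() {
    log=$1
    shift
    status=0
    ${MAKE:-make} -j "${JOBS:-3}" "$@" >"$log" 2>&1 || status=$?
    # Fail if either the exit status or the log reports an error.
    if [ "$status" -ne 0 ] || log_has_make_error "$log"; then
        return 1
    fi
    return 0
}
```

A job step could then call checked_parallel_make build.log check and rely on its return value. The log scan is a heuristic: it depends on the wording of the GNU make error message and would miss failures that make never reports.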

comment:16 Changed 13 years ago by Juergen Reuter

Priority: changed from P3 to P1
Resolution: fixed
Severity: changed from normal to critical
Status: changed from closed to reopened
Summary: changed from Cruise Control to Resurrect the Cruise Control
Version: changed from 2.0beta to 2.0.4

comment:17 Changed 13 years ago by Juergen Reuter

Actually it would be brilliant to have the new HUDSON concept or even the server running by the next meeting in four weeks. E.g. at the moment 'make check' fails for the NAG compiler.

comment:18 Changed 13 years ago by Boschmann

I have set up a new CC server. Its name is not "hudson" any more, because that name is owned by Oracle, so the community has changed it to "jenkins". The server is not yet configured and it may take a couple of weeks until it is fully operational.

comment:19 Changed 13 years ago by Boschmann

Good news:

The Jenkins team has fixed a major matrix job bug two days ago. I've run some tests and believe that matrix jobs are applicable from now on. This reduces configuration efforts considerably, makes the system more flexible to changes in build parameters and gives well arranged result tables. I was hoping that this bug would get solved, so the actual configuration of the developing CC server is already based on matrix jobs. Now I am confident that I can present a working CC server next week.

comment:20 Changed 13 years ago by Juergen Reuter

What is the status? When is this up and running again??

comment:21 Changed 13 years ago by Boschmann

The i7's and jenkins are running. All jobs are configured in such a way that you can easily extend them to all architectures and compilers you like, but they are running only on linux/amd64 so far. I have given up on the mac mini, because I fear that my grandchildren will graduate earlier than me if I continue to attend to it. Some linux/i586 slaves are online as well, but these are much slower than the i7's and they would slow down the whole run, which is why they currently do not run any jobs.

Examples and "tables" do not work right now because of the "relocatable issue" (Ticket #224). I will watch this discussion before I try to work out a workaround.

comment:22 Changed 13 years ago by Juergen Reuter

So the cruise control (a.k.a. Jenkins) is open and working again. Some minor points (e.g. MAC OS X) and some configuration are still missing. When do we set this up, and close this ticket finally?

comment:23 Changed 13 years ago by Boschmann

I would close the ticket when the examples and the "tables" run as batch jobs. More tests and more architectures can be added easily when needed/available.

comment:24 Changed 13 years ago by Boschmann

Thanks to the new i7 I could set the svn update interval from hourly to every 20 minutes.

comment:25 Changed 13 years ago by Juergen Reuter

So here is a list of the open issues:
-- examples (working with gfortran, the one failing test is a REAL bug; what about NAG? what was the issue there?)
-- trunk-dist-weekly (should add extended numerical tests here, but please could someone fix the LaTeX installation on the slaves!)
-- what about the MAC OS X? would be way cool to include that one again.... but some people in some small towns in mid-Germany are content with being delivered broken hardware :P
-- hm.... is that it? guess so... So not much to do, and the closing of the ticket is near...

comment:26 Changed 13 years ago by Boschmann

Relocation still doesn't work with NAG. This is not really a problem for jenkins since we need relocation for running the examples and we don't need to test the examples with NAG.

trunk-dist-weekly is indeed a problem and I don't know the reason for the malfunction.

MAC OS X is not only failing because of hardware problems; its software configuration is also more taxing than a debian configuration. Since debian 6 you can do everything with apt-get. When I send in the MAC because of a hard disc failure, I will get a clean installation in return, and it will cost me a couple of days to bring it back into shape. Additionally, I doubt that the hardware is designed for nonstop operation. MAC hardware is supposed to be a status symbol and not a tool.

comment:27 Changed 13 years ago by Boschmann

Resolution: fixed
Status: changed from reopened to closed

All major problems are solved. Jenkins does now:

  • svn update every 20 minutes
  • distcheck with gcc-4.5.0, gcc-4.6.0 and nagfor-5.2 at new svn revisions
  • run examples with gcc-4.5.0 every night
  • extra-distcheck with gcc-4.5-LATEST, gcc-4.6-LATEST, gcc-4.7-LATEST and nagfor-5.2-LATEST every weekend

Right now, only 64 bit linux systems are configured. I have changed a lot to make use of the new relocation feature of whizard, so I had better set up the other systems from scratch. Debian 6 includes almost everything I need, so this is much simpler than using debian 5.

I will not resurrect the mac mini, but I can turn it on so everyone who needs a jenkins job on mac os is invited to configure and run it.

A VirtualBox server is configured and online and some virtual nodes are configured, which are called vslaveN with a number N. Unfortunately, the virtual cloud plugin of jenkins doesn't work, so jenkins cannot start and stop virtual machines on demand. So virtualization is almost useless at the moment.

There are two open problems left:

  • find an air conditioned room for jenkins
  • configure the reference value test

The first problem is not really a jenkins problem, but rather a general computing problem at our site. The second problem is no malfunction of jenkins but rather a job that has to be set up. Therefore I have decided to close this ticket, jenkins is working now.

comment:28 Changed 7 years ago by Juergen Reuter

Milestone: changed from golden-classics to v42-backlog

Milestone renamed
