Chapter 3 Working with WHIZARD
WHIZARD can run as a stand-alone program. You (the user) can steer WHIZARD either interactively or by a script file. We will first describe the latter method, since it will be the most common way to interact with the WHIZARD system.
3.1 Hello World
The legacy version series 1 of the program relied on a bunch of input files that the user had to provide in some obfuscated format. This approach is sufficient for straightforward applications. However, once you get experienced with a program, you start thinking about uses that the program’s authors did not foresee. In case of a Monte Carlo package, typical abuses are parameter scans, complex patterns of cuts and reweighting factors, or data analysis without recourse to external packages. This requires more flexibility.
Instead of transferring control over data input to some generic scripting language like PERL or Python (or even C++), which come with their own peculiarities and learning curves, we decided to unify data input and scripting in a dedicated steering language that is particularly adapted to the needs of Monte-Carlo integration, simulation, and simple analysis of the results. Thus we discovered what everybody knew anyway: that W(h)izards communicate in SINDARIN, Scripting INtegration, Data Analysis, Results display and INterfaces.
SINDARIN is a DSL – a domain-specific scripting language – that is designed for the single purpose of steering and talking to WHIZARD. Now since SINDARIN is a programming language, we honor the old tradition of starting with the famous Hello World program. In SINDARIN this reads simply
printf "Hello World!"
Open your favorite editor, type this text, and save it into a file named hello.sin.
Now we assume that you (or your kind system administrator) have installed WHIZARD in your executable path. Then you should open a command shell and execute (we will come to the meaning of the -r option later)
/home/user$ whizard -r hello.sin
and if everything works well, you get the output (the complete output including the WHIZARD banner is shown in Fig. 3.1)
| Writing log to 'whizard.log'
[... here a banner is displayed]
|=============================================================================|
|                               WHIZARD 3.0.3
|=============================================================================|
| Reading model file '/usr/local/share/whizard/models/SM.mdl'
| Preloaded model: SM
! Process library 'default_lib': initialized
! Preloaded library: default_lib
| Reading commands from file 'hello.sin'
Hello World!
| WHIZARD run finished.
|=============================================================================|
If this has just worked for you, you can be confident that you have a working WHIZARD installation, and you have been able to successfully run the program.
3.2 A Simple Calculation
You may object that WHIZARD is not exactly designed for printing out plain text. So let us demonstrate a more useful example.
Looking at the Hello World output, we first observe that the program writes a log file named (by default) whizard.log.
After the welcome banner, WHIZARD tells you that it reads a physics model, and that it initializes and preloads a process library. The process library is initially empty. It is ready for receiving definitions of elementary high-energy physics processes (scattering or decay) that you provide. The processes are set in the context of a definite model of high-energy physics. By default this is the Standard Model, dubbed SM.
Here is the SINDARIN code for defining a SM physics process, computing its cross section, and generating a simulated event sample in Les Houches event format:
process ee = e1, E1 => e2, E2
sqrts = 360 GeV
n_events = 10
sample_format = lhef
simulate (ee)
As before, you save this text in a file (named, e.g., ee.sin) and run it with
/home/user$ whizard -r ee.sin
(We will come to the meaning of the -r option later.) This produces output similar to the following:
| Writing log to 'whizard.log'
[... banner ...]
|=============================================================================|
|                               WHIZARD 3.0.3
|=============================================================================|
| Reading model file '/usr/local/share/whizard/models/SM.mdl'
| Preloaded model: SM
| Process library 'default_lib': initialized
| Preloaded library: default_lib
| Reading commands from file 'ee.sin'
| Process library 'default_lib': recorded process 'ee'
sqrts = 3.600000000000E+02
n_events = 10
| Starting simulation for process 'ee'
| Simulate: process 'ee' needs integration
| Integrate: current process library needs compilation
| Process library 'default_lib': compiling ...
| Process library 'default_lib': writing makefile
| Process library 'default_lib': removing old files
rm -f default_lib.la
rm -f default_lib.lo default_lib_driver.mod opr_ee_i1.mod ee_i1.lo
rm -f ee_i1.f90
| Process library 'default_lib': writing driver
| Process library 'default_lib': creating source code
rm -f ee_i1.f90
rm -f opr_ee_i1.mod
rm -f ee_i1.lo
/usr/local/bin/omega_SM.opt -o ee_i1.f90 -target:whizard -target:parameter_module parameters_SM -target:module opr_ee_i1 -target:md5sum '70DB728462039A6DC1564328E2F3C3A5' -fusion:progress -scatter 'e- e+ -> mu- mu+'
[1/1] e- e+ -> mu- mu+ ... allowed. [time: 0.00 secs, total: 0.00 secs, remaining: 0.00 secs]
all processes done. [total time: 0.00 secs]
SUMMARY: 6 fusions, 2 propagators, 2 diagrams
| Process library 'default_lib': compiling sources
[.....]
| Process library 'default_lib': loading
| Process library 'default_lib': ... success.
| Integrate: compilation done
| RNG: Initializing TAO random-number generator
| RNG: Setting seed for random-number generator to 9616
| Initializing integration for process ee:
| ------------------------------------------------------------------------
| Process [scattering]: 'ee'
|   Library name  = 'default_lib'
|   Process index = 1
|   Process components:
|     1: 'ee_i1':   e-, e+ => mu-, mu+ [omega]
| ------------------------------------------------------------------------
| Beam structure: [any particles]
| Beam data (collision):
|   e-  (mass = 5.1099700E-04 GeV)
|   e+  (mass = 5.1099700E-04 GeV)
|   sqrts = 3.600000000000E+02 GeV
| Phase space: generating configuration ...
| Phase space: ... success.
| Phase space: writing configuration file 'ee_i1.phs'
| Phase space: 2 channels, 2 dimensions
| Phase space: found 2 channels, collected in 2 groves.
| Phase space: Using 2 equivalences between channels.
| Phase space: wood
Warning: No cuts have been defined.
| Starting integration for process 'ee'
| Integrate: iterations not specified, using default
| Integrate: iterations = 3:1000:"gw", 3:10000:""
| Integrator: 2 chains, 2 channels, 2 dimensions
| Integrator: Using VAMP channel equivalences
| Integrator: 1000 initial calls, 20 bins, stratified = T
| Integrator: VAMP
|=============================================================================|
| It      Calls  Integral[fb]  Error[fb]   Err[%]    Acc  Eff[%]   Chi2 N[It] |
|=============================================================================|
   1        784  8.3282892E+02  1.68E+00    0.20    0.06*  39.99
   2        784  8.3118961E+02  1.23E+00    0.15    0.04*  76.34
   3        784  8.3278951E+02  1.36E+00    0.16    0.05   54.45
|-----------------------------------------------------------------------------|
   3       2352  8.3211789E+02  8.01E-01    0.10    0.05   54.45    0.50   3
|-----------------------------------------------------------------------------|
   4       9936  8.3331732E+02  1.22E-01    0.01    0.01*  54.51
   5       9936  8.3341072E+02  1.24E-01    0.01    0.01   54.52
   6       9936  8.3331151E+02  1.23E-01    0.01    0.01*  54.51
|-----------------------------------------------------------------------------|
   6      29808  8.3334611E+02  7.10E-02    0.01    0.01   54.51    0.20   3
|=============================================================================|
[.....]
| Simulate: integration done
| Simulate: using integration grids from file 'ee_m1.vg'
| RNG: Initializing TAO random-number generator
| RNG: Setting seed for random-number generator to 9617
| Simulation: requested number of events = 10
|             corr. to luminosity [fb-1] = 1.2000E-02
| Events: writing to LHEF file 'ee.lhe'
| Events: writing to raw file 'ee.evx'
| Events: generating 10 unweighted, unpolarized events ...
| Events: event normalization mode '1'
|         ... event sample complete.
| Events: closing LHEF file 'ee.lhe'
| Events: closing raw file 'ee.evx'
| There were no errors and 1 warning(s).
| WHIZARD run finished.
|=============================================================================|
The final result is the desired event file, ee.lhe.
Let us discuss the output quickly to walk you through the procedures of a WHIZARD run: after the logfile message and the banner, the reading of the physics model, and the initialization of a process library, the process with tag ’ee’ is recorded. Next, user-defined parameters like the center-of-mass energy and the number of demanded (unweighted) events are displayed. As a next step, WHIZARD starts the simulation of the process with tag ’ee’. It recognizes that there has not yet been an integration over phase space (done by an optional integrate command, cf. Sec. 5.7.1), and consequently starts the integration. It then acknowledges that the process code for the process ’ee’ needs to be compiled first (done by an optional compile command, cf. Sec. 5.4.5). So, WHIZARD compiles the process library, writes the makefile for its steering, and, as a safeguard against garbage, removes possibly existing files. Then, the source code for the library and its processes is generated: for the process code, the default method, the matrix element generator O’Mega, is called (cf. Sec. 9.3), and the sources are compiled.
The next steps are the loading of the process library and the report that the compilation is complete. For the Monte-Carlo integration, a random number generator is initialized. Here, it is the default generator, TAO (for more details, cf. Sec. 6.2), while the random seed is set to a value initialized by the system clock, as no seed has been provided in the SINDARIN input file.
Now, the integration for the process ’ee’ is initialized, and information about the process (its name, the name of its process library, its index inside the library, and the process components of which it consists, cf. Sec. 5.4.4) is displayed. Then, the beam structure is shown, which in this case consists of symmetric partonic electron and positron beams with the center-of-mass energy provided by the user (360 GeV). The next step is the generation of the phase space, for which the default phase space method wood (for more details cf. Sec. 8.3) is selected. The integration is performed, and the result is shown together with its absolute and relative error, accuracy, unweighting efficiency, and χ² quality.
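The summary lines of the integration table can be cross-checked by hand. The following Python sketch assumes that the combined estimate is the inverse-variance weighted mean of the individual iterations and that the Chi2 column is the χ² per degree of freedom. This assumption is inferred from the numbers in the log and is not a statement about how the integrator is implemented. The input values are copied from iterations 4 to 6 of the table above; the function name combine is hypothetical.

```python
import math

# (integral [fb], error [fb]) for iterations 4-6, copied from the log above
iterations = [
    (8.3331732e+02, 1.22e-01),
    (8.3341072e+02, 1.24e-01),
    (8.3331151e+02, 1.23e-01),
]

def combine(results):
    """Inverse-variance weighted mean, its error, and chi2 per degree of freedom.

    Assumption: this is a plausible reading of the summary line, not WHIZARD code.
    """
    weights = [1.0 / err**2 for _, err in results]
    total = sum(weights)
    mean = sum(w * val for w, (val, _) in zip(weights, results)) / total
    error = 1.0 / math.sqrt(total)
    chi2 = sum(w * (val - mean)**2 for w, (val, _) in zip(weights, results))
    return mean, error, chi2 / (len(results) - 1)

mean, error, chi2_dof = combine(iterations)
print(f"{mean:.4f} +- {error:.4f} fb, chi2/dof = {chi2_dof:.2f}")
```

Under this assumption the result agrees with the summary line of the second pass (8.3334611E+02 fb, 7.10E-02 fb, Chi2 0.20) up to the rounding of the tabulated errors.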
The final step is the event generation (cf. Chap. 11). The integration grids are now being used, and the random number generator is initialized again. Finally, event generation of ten unweighted events starts (WHIZARD lets us know to which integrated luminosity that would correspond), and events are written both in an internal (binary) event format and in the demanded LHE format. This concludes the WHIZARD run.
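The luminosity quoted in the log follows from the requested number of events divided by the integrated cross section. As a small worked example in Python (the function name is hypothetical; the numbers are taken from the log above):

```python
def luminosity_for_events(n_events, cross_section_fb):
    """Integrated luminosity in fb^-1 at which n_events are expected."""
    return n_events / cross_section_fb

# 10 events at the final cross-section estimate of 8.3334611E+02 fb
lumi = luminosity_for_events(10, 8.3334611e+02)
print(f"{lumi:.4E}")  # prints 1.2000E-02
```

This reproduces the value 1.2000E-02 fb-1 reported in the line "corr. to luminosity".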
After a more comprehensive introduction into the SINDARIN steering language in the next chapter, Chap. 4, we will discuss all the details of the different steps of this introductory example.
3.3 WHIZARD in a Computing Environment
3.3.1 Working on a Single Computer
After installation, WHIZARD is ready for use. There is a slight complication if WHIZARD has been installed in a location that is not in your standard search paths.
In that case, to successfully run WHIZARD, you may either
In either case, try to call whizard --help in order to check whether this is done correctly.
For a new WHIZARD project, you should set up a new (empty) directory. Depending on the complexity of your task, you may want to set up separate directories for each subproblem that you want to tackle, or even for each separate run. The location of the directories is arbitrary.
To run, WHIZARD needs only a single input file, a SINDARIN command script with extension .sin (by convention). Running WHIZARD is as simple as
your-workspace> whizard your-input.sin
No other configuration files are needed. The total number of auxiliary and output files generated in a single run may get quite large, however, and they may clutter your workspace. This is the reason behind keeping subdirectories on a per-run basis.
3.3.2 Working Parallel on Several Computers
For integration (only VAMP2), WHIZARD supports parallel execution via MPI by communicating between parallel tasks on a single machine or distributed over several machines.
During integration, the calculation of channels is distributed among several workers, where a master worker collects the results and adapts weights and grids.
In worthwhile cases (e.g., a high number of calls in one channel), the calculation of a single grid is additionally distributed.
For that, we provide two different parallelization methods, which can be steered by
Both methods use a full non-blocking communication approach in order to collect the integration results of each channel after each iteration. After finishing the computation of a channel, the associated slave worker spawns a callback mechanism leading to the initialization of a sending process to the master. The master worker organizes, depending on the parallelization method, the correct closing of the sending process for a given channel by a matching receiving process. The callback approach allows us to communicate and produce integration results concurrently, which increases the parallelized portion of the run and thus improves HPC performance and utilization.
The load method has the drawback that it does not work with fewer than three workers. Hence, we recommend using the simple method, e.g., for debugging the parallel setup, and using the load method only for direct production runs.
In order to use these advancements, WHIZARD requires an installed MPI-3.1 capable library (e.g. OpenMPI) and configuration and compilation with the appropriate flags, cf. Sec. 2.3.
MPI support is only active when the integration method is set to VAMP2. Additionally, to preserve the numerical properties of a single task run, it is recommended to use the RNGstream as random number generator.
WHIZARD has then to be called by mpirun
where the number of parallel tasks can be set by -np and a hostfile can be given by --hostfile. It is recommended to use --output-filename which lets mpirun redirect the standard (error) output to a file, for each worker separately.
Notes on Parallelization with MPI
The parallelization of WHIZARD requires that all instances of the parallel run be able to write and read all files produced by WHIZARD in a network file system as the current implementation does not handle parallel I/O. Usually, high-performance clusters have support for at least one network filesystem.
Furthermore, not all functions of WHIZARD are currently supported or
are only supported in a limited way in parallel mode. Currently the
Some features that have been missing in the very first implementation of the parallelized integration have now been made available, like the support of run IDs and the parallelization of the event generation.
A final remark concerns the stability of the numerical results with respect to the number of workers involved. Under certain circumstances, runs with different numbers of workers but an otherwise identical SINDARIN file can yield slightly different numerical (but statistically compatible) results for integration or event generation. This is related to the execution of the computational operations in MPI, which we use to reduce results from all workers. If the order of the numbers in the arithmetic operations changes, for example, because of a different setup of the workers, then the numerical results change slightly, and this change is in turn amplified by the adaptation. Nevertheless, the results are all statistically consistent.
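The underlying effect is that floating-point addition is not associative. The following Python lines are a generic illustration of this property and do not involve WHIZARD or MPI; the three numbers stand in for partial results from three workers.

```python
# Three partial results, as they might arrive from three workers
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # one possible reduction order
right_to_left = a + (b + c)   # another possible reduction order

print(left_to_right == right_to_left)       # prints False
print(abs(left_to_right - right_to_left))   # a difference at the level of 1e-16
```

A difference of this size is harmless by itself, but an adaptive integration feeds each result into the next iteration, so later iterations may differ visibly while remaining statistically compatible.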
3.3.3 Stopping and Resuming WHIZARD Jobs
On a Unix-like system, it is possible to prematurely stop running jobs by a kill(1) command, or by entering Ctrl-C on the terminal.
If the system supports this, WHIZARD traps these signals. It also traps some signals that a batch operating system might issue, e.g., for exceeding a predefined execution time limit. WHIZARD tries to complete the calculation of the current event and gracefully close open files. Then, the program terminates with a message and a nonzero return code. Usually, this should not take more than a fraction of a second.
If, for any reason, the program does not respond to an interrupt, it is always possible to kill it by kill -9. A convenient method, on a terminal, would be to suspend it first by Ctrl-Z and then to kill the suspended process.
The program is usually able to recover after being stopped. Simply run the job again from start, with the same input, leaving all output files generated so far untouched. The results obtained so far will be quickly recovered or gathered from files written in the previous run, and the actual time-consuming calculation is resumed near the point where it was interrupted. If the interruption happened during an integration step, it is resumed after the last complete iteration. If it was during event generation, the previous events are taken from file and event generation is continued.
The same mechanism allows for efficiently redoing a calculation with similar, somewhat modified input. For instance, you might want to add a further observable to event analysis, or write the events in a different format. The time for rerunning the program is determined just by the time it takes to read the existing integration or event files, and the additional calculation is done on the recovered information.
By managing various checksums on its input and output files, WHIZARD detects changes that affect further calculations, so it does a real recalculation only where it is actually needed. This applies to all steps that are potentially time-consuming: matrix-element code generation, compilation, phase-space setup, integration, and event generation. If desired, you can set command-line options or SINDARIN parameters that explicitly discard previously generated information.
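The idea of checksum-based recalculation can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not WHIZARD's implementation: the class StepCache, the step name, and the choice of SHA-256 as hash function are all hypothetical, and the input is represented by a plain string instead of files.

```python
import hashlib

def checksum(text):
    """Hex digest of an input string (stand-in for the contents of a file)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

class StepCache:
    """Remembers the checksum of the input that produced each step's result."""

    def __init__(self):
        self._stored = {}

    def needs_rerun(self, step, step_input):
        # A step is redone only if no result exists or the input has changed
        return self._stored.get(step) != checksum(step_input)

    def record(self, step, step_input):
        self._stored[step] = checksum(step_input)

cache = StepCache()
setup = "process ee = e1, E1 => e2, E2; sqrts = 360 GeV"

first = cache.needs_rerun("integration", setup)       # nothing stored yet
cache.record("integration", setup)
unchanged = cache.needs_rerun("integration", setup)   # same input as before
modified = cache.needs_rerun("integration", setup.replace("360", "500"))
print(first, unchanged, modified)  # prints True False True
```

Only the run with a changed input triggers a recalculation; an identical rerun reuses the stored result.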
3.3.4 Files and Directories: default and customization
WHIZARD jobs take a small set of files as input. In many cases, this is just a single SINDARIN script provided by the user. When running, WHIZARD can produce a set of auxiliary and output files:
A complex workflow with several processes, parameter sets, or runs can easily lead to file-name clashes or a messy working directory. Furthermore, running a batch job on a dedicated computing environment often requires transferring data from a user directory to the server and back.
Custom directory and file names can be used to organize things and facilitate dealing with the environment, along with the available batch-system tools for coordinating file transfer.
3.3.5 Batch jobs on a different machine
It is possible to separate the tasks of process-code compilation, integration, and simulation, and execute them on different machines. To make use of this feature, the local and remote machines, including all installed libraries that are relevant for WHIZARD, must be binary-compatible.
To simplify transferring whole directories, WHIZARD supports the --pack and --unpack options. You may specify any number of these options for a WHIZARD run. (The feature relies on the GNU version of the tar utility.)
runs WHIZARD with the SINDARIN script script1.sin as input, where within the script you have defined
as the target directory for process-compilation files. After completion, the program will tar and gzip the target directory as my_ws.tgz. You should copy this file to the remote machine as one of the job’s input files.
On the remote machine, you can then run the program with
where script2.sin should include script1.sin, and add integration or simulation commands. The contents of my_ws.tgz will thus be unpacked and reused on the remote machine, instead of generating new process code.
3.3.6 Static Linkage
In its default running mode, WHIZARD compiles process-specific matrix element code on the fly and dynamically links the resulting library. On the computing server, this requires availability of the appropriate Fortran compiler, as well as the OCaml compiler suite, and the dynamical linking feature.
Since this may be unavailable or undesired, there is a possibility to distribute WHIZARD as a statically linked executable that contains a pre-compiled library of processes. This removes the need for the Fortran compiler, the OCaml system, and extra dynamic linking. Any external libraries that are accessed (the Fortran runtime environment, and possibly some dynamically linked external libraries and/or the C++ runtime library) must still be available on the target system and be binary-compatible. Otherwise, there is no need for transferring the complete WHIZARD installation or process-code compilation data.
Generating, compiling and linking matrix element code is done in advance on a machine that can access the required tools and produces compatible libraries. This procedure is accomplished by SINDARIN commands, explained below in Sec. 5.4.7.
3.4 Troubleshooting
In this section, we list known issues or problems and give advice on what can be done in case something does not work as intended.
3.4.1 Possible (uncommon) build problems
OCaml versions and O’Mega builds
The matrix element generator O’Mega of WHIZARD is written in the functional programming language OCaml. Unfortunately, the versions of the OCaml compiler from 3.12.0 on broke backwards compatibility. Therefore, versions of O’Mega/WHIZARD up to v2.0.2 only compile with older versions (3.04 to 3.11 work). This has been fixed in all WHIZARD versions from 2.0.3 on.
Identical Build and Source directories
There is a problem that only occurred with version 2.0.0 and has been corrected for all follow-up versions. It can only appear if you compile the WHIZARD sources in the source directory. Then an error like this may occur:
In this case, please unpack a fresh copy of WHIZARD and configure it in a separate directory (not necessarily a subdirectory). Then the compilation will go through:
The developers use this setup to be able to test different compilers. Therefore, building in the source directory is not as thoroughly tested. This behavior has been patched from version 2.0.1 on. But note that in general it is always advised to keep the build and source directories apart from each other.
3.4.2 What happens if WHIZARD throws an error?
Particle name special characters in process declarations
Trying to use a process declaration like
will lead to a SINDARIN syntax error:
WHIZARD tries to interpret the minus and plus signs as operators (KEYWORD: ’-’), so you have to quote the particle names: process foo = "e-", "e+" => "mu-", "mu+".
Missing collider energy
This happens if you forgot to set the collider energy in the integration of a scattering process:
This will solve your problem:
Missing process declaration
If you try to integrate or simulate a process that has not been declared before (and is also not available in a library that might be loaded), WHIZARD will complain:
Note that this could sometimes be a simple typo, e.g., in this case an integrate (f00) instead of integrate (foo).
Ambiguous initial state without beam declaration
When the user declares a process with a flavor sum in the initial state, e.g.
then a fatal error will be issued:
What now? Either a structure function providing a tensor structure in flavors has to be provided like
or, if the partonic process was intended, a specific flavor has to be singled out,
which would take only the up-quarks. Note that a sum over process components with varying initial states is not possible.
Invalid or unsupported beam structure
An error message like
This happens if you try to use a beam structure which is either not supported by WHIZARD (meaning that there is no phase-space parameterization for Monte-Carlo integration available in order to allow an efficient sampling), or a combination of beam structure functions that does not make sense physically. Here is an example for the latter (lepton collider ISR applied to protons, then proton PDFs):
Mismatch in beams
Sometimes you get a rather long error output statement followed by a fatal error:
As WHIZARD indicates, this could have happened because the hard process setup did not match the specification of the beams as in:
In that case, the order of the beam particles was simply wrong. Exchange proton and electron (together with the structure functions) into beams = e, p => none, pdf_builtin, and WHIZARD will be happy.
Unstable heavy beam particles
If you try to use unstable particles as beams that can potentially decay into the final state particles, you might encounter the following error message:
This happens basically only for processes in testing/validation (like t t → b b). In principle, it could also happen in a real physics setup, e.g. when simulating electron pairs at a muon collider:
However, WHIZARD at the moment does not allow a muon width, and so WHIZARD is not able to decay a muon in a scattering process. A possible decay of the beam particle into (part of) the final state might lead to instabilities in the phase space setup. Hence, WHIZARD does not let you perform such an integration right away. If you nevertheless encounter such a rare case in your setup, you can convert this fatal error into a simple warning by setting the flag:
Impossible beam polarization
If you specify a beam polarization that cannot correspond to any physically allowed spin density matrix, e.g.,
WHIZARD will throw a fatal error like this:
Beams with crossing angle
Specifying a crossing angle (e.g. at a linear lepton collider) without explicitly setting the beam momenta,
triggers a fatal error:
In that case the single beam momenta have to be explicitly set:
Phase-space generation failed
Sometimes an error might be issued that WHIZARD could not generate a valid phase-space parameterization:
You see that WHIZARD tried to increase the number of off-shell lines that are taken into account for the phase-space setup. The second most important parameter for the phase-space setup, phs_t_channel, however, is not increased automatically. Its default value is 6, so e.g. for the process e+ e− → 8γ you will run into the problem above. Setting
where <n> is the number of final-state particles will solve the problem.
Non-converging process integration
There could be several reasons for this to happen. The most prominent one is that no cuts have been specified for the process (WHIZARD 2 does not apply default cuts), and there are singular regions in the phase space over which the integration stumbles. If cuts have been specified, it could be that they are not sufficient. E.g., in pp → jj a distance cut between the two jets prevents singular collinear splitting in their generation, but if no pT cut has been set, there is still singular collinear splitting from the beams.
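Why a missing cut spoils convergence can be seen in a toy model. The Python sketch below assumes, purely for illustration, a differential cross section proportional to 1/pT; this is not the actual pp → jj matrix element, and the function name and the upper limit are hypothetical. The integral is known analytically and grows without bound as the lower pT cut is removed.

```python
import math

def toy_cross_section(pt_min, pt_max=100.0):
    """Integral of d(sigma)/d(pT) = 1/pT from pt_min to pt_max (arbitrary units)."""
    return math.log(pt_max / pt_min)

# Lowering the cut makes the toy integral larger and larger
for pt_min in (10.0, 1.0, 1e-3, 1e-6):
    print(f"pT cut = {pt_min:g}: integral = {toy_cross_section(pt_min):.2f}")
```

A Monte-Carlo integrator sampling such an integrand without a cut has no finite value to converge to, which shows up as unstable estimates and large errors from iteration to iteration.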
Why is there no event file?
If no event file has been generated, either WHIZARD stumbled over some error and should have told you, or you simply forgot to set a simulate command for your process. In case there was a simulate command but the process under consideration is not possible (e.g., a typo, e1, E1 => e2, E3 instead of e1, E1 => e3, E3), you get an error like this:
Why is the event file empty?
In order to get events, you need to set either a desired number of events:
or you have to specify a certain integrated luminosity (the default unit being inverse femtobarn):
In case you set both, WHIZARD will take the one that leads to the higher number of events.
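The rule for the case that both settings are given can be written as a one-line comparison. The Python sketch below is an illustration of the statement above, not WHIZARD code: the function name is hypothetical, and any rounding of the luminosity-based number to an integer is deliberately left out because it is not specified here.

```python
def expected_events(n_events, luminosity_fb_inv, cross_section_fb):
    """Larger of the requested event count and luminosity times cross section."""
    from_luminosity = luminosity_fb_inv * cross_section_fb
    return max(n_events, from_luminosity)

sigma = 8.3334611e+02  # fb, cross section from the example in Sec. 3.2

print(expected_events(10, 1.0, sigma))      # luminosity wins: about 833 events
print(expected_events(1000, 0.012, sigma))  # n_events wins: 1000 events
```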
Parton showering fails
For BSM models containing massive stable or long-lived particles parton showering with PYTHIA6 fails:
The solution to that problem is discussed in Sec. 10.7.3.
3.4.3 Debugging, testing, and validation
Catching/tracking arithmetic exceptions
Catching arithmetic exceptions is not automatically supported by Fortran compilers. In general, flags that cause the compiler to keep track of arithmetic exceptions reduce the achievable performance, and hence they should not be used in production runs. For this reason, we refrained from making these flags a default. They can be added using the FCFLAGS = <flags> settings during configuration. For the NAG Fortran compiler we use the flags -C=all -nan -gline for debugging purposes. For the gfortran compilers, the flags -ffpe-trap=invalid,zero,overflow are the corresponding debugging flags. For tests, debugging or first sanity checks on your setup, you might want to make use of these flags in order to track possible numerical exceptions in the produced code. Some compilers started to include IEEE exception handling support (Fortran 2008 status), but we do not use these implementations in the WHIZARD code (yet).