Chapter 11 More on Event Generation

In order to perform a physics analysis with WHIZARD, one has to generate events. This seems to be a trivial statement, but as there have been many questions like "My WHIZARD does not produce plots – what has gone wrong?", we believe that repeating that rule is worthwhile. Of course, it is not mandatory to use WHIZARD’s own analysis set-up: the user can always choose to just generate events and analyze them with an external package such as ROOT or TopDrawer.

Accordingly, we first describe how to generate events and which options exist: different event formats, renaming output files, and using weighted or unweighted events with different normalizations. We also explain how to re-use and manipulate already generated event samples, how to limit the number of events per file, and so on.

11.1 Event generation

To explain how event generation works, we again take our favourite example, e⁺e⁻ → µ⁺ µ⁻,

  process eemm = e1, E1 => e2, E2

The command to trigger generation of events is simulate (<proc_name>) { <options> }, so in our case – neglecting any options for now – simply:

  simulate (eemm)

When you run this SINDARIN file you will experience a fatal error: FATAL ERROR: Colliding beams: sqrts is zero (please set sqrts). This is because WHIZARD needs to compile and integrate the process eemm first before event simulation, because it needs the information of the corresponding cross section, phase space parameterization and grids. It does both automatically, but you have to provide WHIZARD with the beam setup, or at least with the center-of-momentum energy. A corresponding integrate command like

  sqrts = 500 GeV
  integrate (eemm) { iterations = 3:10000 }

obviously has to appear before the corresponding simulate command (otherwise you would be punished by the same error message as before). Putting things in the correct order results in an output like:

| Reading model file '/usr/local/share/whizard/models/SM.mdl'
| Preloaded model: SM
| Process library 'default_lib': initialized
| Preloaded library: default_lib
| Reading commands from file 'bla.sin'
| Process library 'default_lib': recorded process 'eemm'
sqrts =  5.000000000000E+02
| Integrate: current process library needs compilation
| Process library 'default_lib': compiling ...
| Process library 'default_lib': keeping makefile
| Process library 'default_lib': keeping driver
| Process library 'default_lib': active
| Process library 'default_lib': ... success.
| Integrate: compilation done
| RNG: Initializing TAO random-number generator
| RNG: Setting seed for random-number generator to 29912
| Initializing integration for process eemm:
| ------------------------------------------------------------------------
| Process [scattering]: 'eemm'
|   Library name  = 'default_lib'
|   Process index = 1
|   Process components:
|     1: 'eemm_i1':   e-, e+ => mu-, mu+ [omega]
| ------------------------------------------------------------------------
| Beam structure: [any particles]
| Beam data (collision):
|   e-  (mass = 5.1099700E-04 GeV)
|   e+  (mass = 5.1099700E-04 GeV)
|   sqrts = 5.000000000000E+02 GeV
| Phase space: generating configuration ...
| Phase space: ... success.
| Phase space: writing configuration file 'eemm_i1.phs'
| Phase space: 2 channels, 2 dimensions
| Phase space: found 2 channels, collected in 2 groves.
| Phase space: Using 2 equivalences between channels.
| Phase space: wood
Warning: No cuts have been defined.
| OpenMP: Using 8 threads
| Starting integration for process 'eemm'
| Integrate: iterations = 3:10000
| Integrator: 2 chains, 2 channels, 2 dimensions
| Integrator: Using VAMP channel equivalences
| Integrator: 10000 initial calls, 20 bins, stratified = T
| Integrator: VAMP
|=============================================================================|
| It      Calls  Integral[fb]  Error[fb]   Err[%]    Acc  Eff[%]   Chi2 N[It] |
|=============================================================================|
   1       9216  4.2833237E+02  7.14E-02    0.02    0.02*  40.29
   2       9216  4.2829071E+02  7.08E-02    0.02    0.02*  40.29
   3       9216  4.2838304E+02  7.04E-02    0.02    0.02*  40.29
|-----------------------------------------------------------------------------|
   3      27648  4.2833558E+02  4.09E-02    0.01    0.02   40.29    0.43   3
|=============================================================================|
| Time estimate for generating 10000 events: 0d:00h:00m:04s
| Creating integration history display eemm-history.ps and eemm-history.pdf
| Starting simulation for process 'eemm'
| Simulate: using integration grids from file 'eemm_m1.vg'
| RNG: Initializing TAO random-number generator
| RNG: Setting seed for random-number generator to 29913
| OpenMP: Using 8 threads
| Simulation: requested number of events = 0
|             corr. to luminosity [fb-1] =   0.0000E+00
| Events: writing to raw file 'eemm.evx'
| Events: generating 0 unweighted, unpolarized events ...
| Events: event normalization mode '1'
|         ... event sample complete.
| Events: closing raw file 'eemm.evx'
| There were no errors and    1 warning(s).
| WHIZARD run finished.
|=============================================================================|

So, WHIZARD tells you that it has entered simulation mode, but besides this, it has not done anything. The next step is to demand event generation. There are two ways to do this: you can either specify the number of events you want WHIZARD to generate, say 42, or provide the integrated luminosity of some experiment. (Note that if you give both, WHIZARD will take the one which gives the larger event sample. This, of course, depends on the given process(es) and cuts, and on the corresponding cross section(s).) The first of these options is set with the command n_events = <number>, the second with luminosity = <number> <opt. unit>.

Another important point, already stated several times in the manual, is that WHIZARD follows the commands in the steering SINDARIN file in order of appearance. Hence, a number of events or a luminosity given after a simulate command is ignored by that command; it is relevant only for simulate commands that follow further down in the SINDARIN file. So, in our case, try:

 n_events = 500
 luminosity = 10
 simulate (eemm)

By default, numbers for the integrated luminosity are understood as inverse femtobarn. So, for the cross section above, a luminosity of 10 fb⁻¹ would correspond to 4283 events, clearly superseding the demand for 500 events. After reducing the luminosity from ten to one inverse femtobarn, 500 is the larger number of events and is taken by WHIZARD for event generation. Now WHIZARD tells you:

| Simulation: requested number of events = 500
|             corr. to luminosity [fb-1] =   1.1673E+00
| Events: reading from raw file 'eemm.evx'
| Events: reading 500 unweighted, unpolarized events ...
| Events: event normalization mode '1'
| ... event file terminates after 0 events.
| Events: appending to raw file 'eemm.evx'
| Generating remaining  500 events ...
|         ... event sample complete.
| Events: closing raw file 'eemm.evx'

I.e., it evaluates the luminosity to which the sample of 500 events corresponds, which is now, of course, bigger than the 1 fb⁻¹ explicitly given for the luminosity. Furthermore, you can read off that a file eemm.evx has been generated, containing the demanded 500 events. (It was there before, containing zero events, because no n_events or luminosity value had been set. WHIZARD then tried to get the events from file first before generating new ones.) Files with the suffix .evx are binary event files, using a machine-dependent WHIZARD-specific event file format. Before we list the event formats supported by WHIZARD, the next two sections will tell you more about unweighted and weighted events as well as the different possibilities to normalize events in WHIZARD.
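The relation between the two requests is plain arithmetic: the expected event count is the cross section times the integrated luminosity. The following Python sketch reproduces the numbers quoted in this section from the integration result of 428.33558 fb. The function names are illustrative and not part of WHIZARD, and truncating the expected count to an integer is an assumption about the rounding.

```python
# Illustrative arithmetic only; these helpers are not part of WHIZARD.
SIGMA_FB = 428.33558  # integrated cross section of eemm from the log, in fb


def events_for_luminosity(sigma_fb, lumi_inv_fb):
    """Expected number of events for a given integrated luminosity (fb^-1)."""
    return sigma_fb * lumi_inv_fb


def luminosity_for_events(sigma_fb, n_events):
    """Integrated luminosity (fb^-1) that a sample of n_events corresponds to."""
    return n_events / sigma_fb


def requested_sample_size(sigma_fb, n_events, lumi_inv_fb):
    """Larger of the two requests; truncating to an integer is an assumption."""
    return max(n_events, int(events_for_luminosity(sigma_fb, lumi_inv_fb)))


print(requested_sample_size(SIGMA_FB, 500, 10.0))  # luminosity request is larger
print(requested_sample_size(SIGMA_FB, 500, 1.0))   # n_events request is larger
print(f"{luminosity_for_events(SIGMA_FB, 500):.4E}")
```

With luminosity = 10 the luminosity request wins, with luminosity = 1 the explicit n_events = 500 wins, and the last line gives the luminosity that the 500 events correspond to.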

As already explained for the libraries, as well as the phase space and grid files in Chap. 5, WHIZARD tries to re-use as much information as possible. This is of course also true for the event files. There are special MD5 check sums testing the integrity and compatibility of the event files. If you demand, for a process for which an event file already exists (as in the example above, though it was empty), as many or fewer events than were generated before, WHIZARD will not generate again but re-use the existing events. (As already explained, the events are stored in WHIZARD’s own binary event format, i.e. in a so-called .evx file. If you suppress generation of that file, as described in subsection 11.5, WHIZARD has to generate events every time.) From version v2.2.0 of WHIZARD on, the program is also able to read in events from different event formats. However, most event formats do not contain as much information as WHIZARD’s internal format, and a complete reconstruction of the events might not be possible.

Re-using event files is very practical for doing several different analyses with the same data, especially if there are many large data samples. Consider the case that there is an event file with 200 events and you now ask WHIZARD to generate 300 events: it will re-use the 200 events (if the MD5 check sums are OK!), generate the remaining 100 events, and append them to the existing file. If the user, for some reason, wants to regenerate events (i.e. ignore possibly existing events), there is the command-line option whizard --rebuild-events.
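The re-use rule described here can be summarized as a small bookkeeping function. This is a Python sketch of the described behaviour only, not WHIZARD code. The name plan_generation is hypothetical, and treating a failed check sum as "nothing can be re-used" is an assumption.

```python
# Illustrative sketch of the described re-use bookkeeping; not WHIZARD code.
def plan_generation(n_existing, n_requested, checksums_ok=True):
    """Return (n_reused, n_to_generate) for a request of n_requested events."""
    if not checksums_ok:
        # Assumption: an incompatible event file contributes no reusable events.
        return 0, n_requested
    n_reused = min(n_existing, n_requested)
    return n_reused, n_requested - n_reused


print(plan_generation(200, 300))  # file with 200 events, 300 requested
print(plan_generation(500, 300))  # file already holds more than requested
```

The first call corresponds to the 200/300 example in the text: 200 events are re-used and 100 are generated and appended.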

11.2 Unweighted and weighted events

WHIZARD is able to generate unweighted events, i.e. events that are distributed according to the differential cross section and each contribute with the same event weight to the whole sample. This is done by mapping out the phase space of the process under consideration according to its different phase space channels (which each get their own weights), and then unweighting the sample of weighted events. Only a sample of unweighted events can, in principle, be compared directly to a real data sample from some experiment. The seventh column (Eff[%]) in the output of the WHIZARD iteration/adaptation procedure tells you about the efficiency of the grids, i.e. how well the phase space is mapped to a flat function. The better this is achieved, the higher the efficiency becomes, and the closer the weights of the different phase space channels are to uniformity. This means that, for higher efficiency, fewer weighted events ("calls") are needed to generate a single unweighted event. An efficiency of 10 % means that on average ten weighted events are needed to generate one single unweighted event. After the integration is done, WHIZARD uses the duration of calls during the adaptation to estimate the time needed to generate 10,000 unweighted events.

The performance of the adaptive multi-channel Monte Carlo decreases with the number of integration dimensions, i.e. with the number of final state particles. Adding more and more final state particles in general also increases the complexity of phase space, especially its singularity structure. For a 2 → 2 process the efficiency is roughly of the order of several tens of per cent. As a rule of thumb, one can say that with every additional pair of final state particles the average efficiency one can achieve decreases by a factor of five to ten.
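The connection between efficiency and the number of calls per unweighted event can be made concrete with a standard hit-or-miss rejection step. The Python sketch below is a generic illustration with toy weights, not WHIZARD's actual unweighting code. All function names and the toy weight distribution are assumptions.

```python
# Generic hit-or-miss unweighting on toy weights; not WHIZARD's implementation.
import random


def unweighting_efficiency(weights):
    """Average weight divided by maximum weight (a number between 0 and 1)."""
    return (sum(weights) / len(weights)) / max(weights)


def calls_per_unweighted_event(efficiency):
    """Average number of weighted events ("calls") per accepted event."""
    return 1.0 / efficiency


def unweight(weights, rng):
    """Keep event i with probability w_i / w_max; return the kept indices."""
    w_max = max(weights)
    return [i for i, w in enumerate(weights) if rng.random() * w_max < w]


rng = random.Random(12345)
weights = [rng.uniform(0.2, 1.0) for _ in range(10000)]  # toy weights
eff = unweighting_efficiency(weights)
kept = unweight(weights, rng)
print(f"efficiency = {eff:.3f}, accepted fraction = {len(kept) / len(weights):.3f}")
print(f"calls per event at 10 % efficiency: {calls_per_unweighted_event(0.10):.1f}")
```

The accepted fraction fluctuates around the efficiency, which is why a better adapted grid (flatter weights) directly reduces the number of calls needed.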

The default of WHIZARD is to generate unweighted events. One can use the logical variable ?unweighted = false to disable unweighting and generate weighted events. (The command ?unweighted = true is a tautology, because true is the default for this variable.) Note that again this command has to appear before the corresponding simulate command, otherwise it will be ignored or effective only for any simulate command appearing later in the SINDARIN file.

In the unweighting procedure, WHIZARD keeps track of the highest weight that has appeared during the adaptation, and the unweighting efficiency is estimated from the average value of the sampling function compared to the maximum value. In principle, during event generation no events should be generated whose sampling function value exceeds the maximum function value encountered during the grid adaptation. Sometimes, however, numerical fluctuations produce such events. They are called excess events. WHIZARD keeps track of these excess events during event generation and reports them, e.g.:

Warning: Encountered events with excess weight: 9 events (  0.090 %)
| Maximum excess weight = 6.083E-01
| Average excess weight = 2.112E-04

Whenever excess events appear in an event generation, this shows that the adaptation of the sampling function has not been perfect. When the fraction of excess events reaches the percent level, you should inspect the phase-space setup and try to improve its settings to get a better adaptation.

Generating weighted events is, of course, much faster if the same number of events is requested. Each event carries a weight factor which is taken into account for any internal analysis (histograms), and written to file if an external file format has been selected. The file format must support event weights.

In a weighted event sample, there is typically a fraction of events which effectively have weight zero, namely those that have been created by the phase-space sampler but do not pass the requested cuts. In the default setup, those events are silently dropped, such that the events written to file or available for analysis all have nonzero weight. However, dropping such events affects the overall normalization. If this has happened, the program will issue a warning of the form

| Dropped events (weight zero) = 1142 (total 2142)
Warning: All event weights must be rescaled by f = 4.66853408E-01

This factor has to be applied by hand to any external event files (and to internally generated histograms). The program cannot include the factor in the event records, because it is known only after all events have been generated. To avoid this problem, there is the logical flag ?keep_failed_events which tells WHIZARD not to drop events with weight zero. The normalization will be correct, but the event sample will include invalid events which have to be vetoed by their zero weight, before any operations on the event record are performed.
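The numbers in the example warning are consistent with the rescaling factor being the fraction of events that were kept, f = (total − dropped) / total. This reading is inferred from the printed values, not taken from the WHIZARD source. A minimal Python sketch of computing the factor and applying it by hand to externally stored weights (helper names are hypothetical):

```python
# Illustrative only; the formula is inferred from the example warning.
def rescale_factor(n_dropped, n_total):
    """Fraction of generated events that survived with nonzero weight."""
    return (n_total - n_dropped) / n_total


def rescale_weights(weights, factor):
    """Apply the global factor to a list of event weights."""
    return [w * factor for w in weights]


f = rescale_factor(n_dropped=1142, n_total=2142)
print(f"f = {f:.8E}")
print(rescale_weights([2.0, 4.0], f))
```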

11.3 Choice of event normalization

There are basically four different choices to normalize event weights (⟨…⟩ denotes the average):

1. ⟨w_i⟩ = 1, ⟨∑_i w_i⟩ = N
2. ⟨w_i⟩ = σ, ⟨∑_i w_i⟩ = N × σ
3. ⟨w_i⟩ = 1/N, ⟨∑_i w_i⟩ = 1
4. ⟨w_i⟩ = σ/N, ⟨∑_i w_i⟩ = σ

So the four options are to have the average weight equal to unity, to the cross section of the corresponding process, to one over the number of events, or to the cross section over the number of events. In these four cases, the event weights sum up to the event number, the event number times the cross section, unity, and the cross section, respectively. Note that none of these guarantees that all event weights individually lie in the interval 0 ≤ w_i ≤ 1.

The user can steer the normalization of events in SINDARIN input files with the string variable $sample_normalization. The default is $sample_normalization = "auto", which uses option 1 for unweighted and option 2 for weighted events, respectively. Note that this is also what the Les Houches Event Format (LHEF) demands for both types of events. This is WHIZARD’s preferred mode, also because the event normalizations are then independent of the number of events. Hence, event samples can be cut or expanded without further need to adjust the normalization. The unit normalization (option 1) can also be switched on for weighted events by setting the event normalization variable equal to "1". Option 2 can be demanded by setting $sample_normalization = "sigma". Options 3 and 4 can be set by "1/n" and "sigma/n", respectively. WHIZARD accepts both lower-case and upper-case letters for these expressions.
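The four normalization modes and the behaviour of "auto" can be tabulated in a few lines. The Python sketch below only restates the averages listed in this section. The function names are hypothetical, and the cross section and event number in the usage lines are example values.

```python
# Restates the four normalization options of this section; not WHIZARD code.
def average_weight(mode, sigma, n_events, unweighted=True):
    """Average event weight <w_i> for a given $sample_normalization string."""
    key = mode.lower()  # the strings are accepted in lower or upper case
    if key == "auto":
        key = "1" if unweighted else "sigma"
    table = {
        "1": 1.0,                      # option 1
        "sigma": sigma,                # option 2
        "1/n": 1.0 / n_events,         # option 3
        "sigma/n": sigma / n_events,   # option 4
    }
    return table[key]


def expected_weight_sum(mode, sigma, n_events, unweighted=True):
    """Expected sum of all event weights, N times the average weight."""
    return n_events * average_weight(mode, sigma, n_events, unweighted)


for mode in ("1", "sigma", "1/N", "sigma/N"):
    print(mode, expected_weight_sum(mode, sigma=428.3, n_events=500))
```

Only options 1 and 2 give an average weight that does not depend on the number of events, which is why samples normalized this way can be truncated or extended without renormalizing.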

There are several event formats (based upon the old common block definition HEPRUP), such as some of the ASCII formats, LHA, LHE and HepMC, that demand cross sections (and the corresponding MC integration errors) to be given in picobarn. So they are converted from the WHIZARD default of femtobarn to picobarn. The only exception is a (pseudo-)event file for a decay: there, the values in those entries are also scaled down by a factor of 1000, but the unit remains GeV, the default unit.

In the following section we show some examples when discussing the different event formats available in WHIZARD.

11.4 Event selection

The selection expression (cf. Sec. 5.9.2) reduces the event sample during generation or rescanning, selecting only events for which the expression evaluates to true. Apart from internal analysis, the selection also applies to writing external files. For instance, the following code generates an e⁺e⁻ → W⁺W⁻ sample with longitudinally polarized W bosons only:

process ww = "e+", "e-" => "W-", "W+"
polarized "W+"
polarized "W-"
?polarized_events = true
sqrts = 500
selection = all Hel == 0 ["W+":"W-"]
simulate (ww) { n_events = 1000 }

The number of events that end up in the sample on file is equal to the number of events with longitudinally polarized W bosons in the generated sample, so the file will contain fewer than 1000 events.
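The effect of such a selection on the sample size can be pictured as a plain filter over the generated events. The Python sketch below uses a toy list of helicity assignments. The data and the helper name are made up for illustration and do not mirror WHIZARD's internal event record.

```python
# Toy illustration of an "all helicities == 0" selection; not WHIZARD code.
def all_longitudinal(event):
    """True if every W boson in the event has helicity 0."""
    return all(hel == 0 for hel in event.values())


generated = [
    {"W+": 0, "W-": 0},
    {"W+": 1, "W-": 0},
    {"W+": -1, "W-": 1},
    {"W+": 0, "W-": 0},
]
selected = [event for event in generated if all_longitudinal(event)]
print(f"{len(selected)} of {len(generated)} generated events pass the selection")
```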

11.5 Supported event formats

Event formats can be distinguished by whether they are plain text (i.e. ASCII) or binary formats. Besides this, one can classify event formats according to whether they are natively supported by WHIZARD or need some external program or library to be linked. Table 11.1 gives a complete list of all event formats available in WHIZARD. The second column shows whether these are ASCII or binary formats, the third column contains brief remarks about the corresponding format, while the last column tells whether external programs or libraries are needed (which is the case only for the HepMC and LCIO formats).

Format               Type    Remark                            Ext.
ascii                ASCII   WHIZARD verbose format            no
Athena               ASCII   variant of HEPEVT                 no
debug                ASCII   most verbose WHIZARD format       no
evx                  binary  WHIZARD’s home-brew               no
HepMC                ASCII   HepMC format                      yes
HEPEVT               ASCII   WHIZARD 1 style                   no
LCIO                 binary  LCIO format                       yes
LHA                  ASCII   WHIZARD 1/old Les Houches style   no
LHEF                 ASCII   Les Houches accord compliant      no
long                 ASCII   variant of HEPEVT                 no
mokka                ASCII   variant of HEPEVT                 no
short                ASCII   variant of HEPEVT                 no
StdHEP (HEPEVT)      binary  based on HEPEVT common block      no
StdHEP (HEPRUP/EUP)  binary  based on HEPRUP/EUP common block  no
Weight stream        ASCII   just weights                      no

Table 11.1: Event formats supported by WHIZARD, classified according to ASCII/binary formats and whether an external program or library is needed to generate a file of this format. For both the HEPEVT and the LHA format there is a more verbose variant.

The ".evx" format is WHIZARD’s native binary event format. If you demand event generation and do not specify anything further, WHIZARD will write out its events exclusively in this binary format. So in the examples discussed in the previous chapters (where we omitted all details about event formats), in all cases this and only this internal binary format has been generated. The generation of this raw format can be suppressed (e.g. if you want to have only one specific event file type) by setting the variable ?write_raw = false. However, if the raw event file is not present, WHIZARD is not able to re-use existing events (e.g. from an ASCII file) and will regenerate events for a given process. Note that from version v2.2.0 of WHIZARD on, the program is able to (partially) reconstruct complete events also from formats other than its internal format (e.g. LHEF), but this is still under construction and not yet complete.

Other event formats can be written out by setting the variable sample_format = <format>, where <format> can be any of the following supported variables:

ascii: a quite verbose ASCII format which contains lots of information (an example is shown in the appendix).
Standard suffix: .evt
debug: an even more verbose ASCII format intended for debugging which prints out also information about the internal data structures
Standard suffix: .debug
hepevt: ASCII format that writes out a specific incarnation of the HEPEVT common block (WHIZARD 1 back-compatibility)
Standard suffix: .hepevt
hepevt_verb: more verbose version of hepevt (WHIZARD 1 back-compatibility)
Standard suffix: .hepevt.verb
short: abbreviated variant of the previous HEPEVT (WHIZARD 1 back-compatibility)
Standard suffix: .short.evt
long: HEPEVT variant that contains a little bit more information than the short format but less than HEPEVT (WHIZARD 1 back-compatibility)
Standard suffix: .long.evt
athena: HEPEVT variant suitable for read-out in the ATLAS ATHENA software environment (WHIZARD 1 back-compatibility)
Standard suffix: .athena.evt
mokka: HEPEVT variant suitable for read-out in the MOKKA ILC software environment
Standard suffix: .mokka.evt
lcio: LCIO binary format (only available if LCIO is installed and correctly linked)
Standard suffix: .slcio
lha: Implementation of the Les Houches Accord as it was in the old MadEvent and WHIZARD 1
Standard suffix: .lha
lha_verb: more verbose version of lha
Standard suffix: .lha.verb
lhef: Formatted Les Houches Accord implementation that contains the XML headers
Standard suffix: .lhe
hepmc: HepMC ASCII format (only available if HepMC is installed and correctly linked)
Standard suffix: .hepmc
stdhep: StdHEP binary format based on the HEPEVT common block
Standard suffix: .hep
stdhep_up: StdHEP binary format based on the HEPRUP/HEPEUP common blocks
Standard suffix: .up.hep
stdhep_ev4: StdHEP binary format based on the HEPEVT/HEPEV4 common blocks
Standard suffix: .ev4.hep
weight_stream: Format that prints out only the event weight (and maybe alternative ones)
Standard suffix: .weight.dat

Of course, the variable sample_format can contain more than one of the above identifiers, in which case more than one event file format is generated. The list above also shows the standard suffixes for these event formats (remember that the native binary format of WHIZARD has the suffix .evx). The suffix of the different event formats can even be changed by the user by setting the corresponding variable, e.g. $extension_lhef = "foo" or $extension_ascii_short = "bread". The dot is automatically included.

The name of the corresponding event sample is taken to be the name of the first process in the simulate statement. Remember that, by convention, the events for all processes in one simulate statement will be written into one single event file. So simulate (proc1, proc2) will write events for the two processes proc1 and proc2 into one single event file with name proc1.evx. The name can be changed by the user with the command $sample = "<name>".

The commands $sample and sample_format are both accepted as optional arguments of a simulate command, so e.g. simulate (proc) { $sample = "foo" sample_format = hepmc } generates an event sample in the HepMC format for the process proc in the file foo.hepmc.
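The naming rules above (sample name from the first process unless $sample is set, one file per requested format, standard or user-defined suffix) can be collected in a short Python sketch. The suffix table is copied from the list in this section. The function name and the way user extensions are keyed by format identifier are simplifications for illustration, not WHIZARD's interface.

```python
# Illustrative file-name bookkeeping based on the suffix list in this section.
STANDARD_SUFFIX = {
    "ascii": ".evt", "debug": ".debug", "hepevt": ".hepevt",
    "hepevt_verb": ".hepevt.verb", "short": ".short.evt", "long": ".long.evt",
    "athena": ".athena.evt", "mokka": ".mokka.evt", "lcio": ".slcio",
    "lha": ".lha", "lha_verb": ".lha.verb", "lhef": ".lhe", "hepmc": ".hepmc",
    "stdhep": ".hep", "stdhep_up": ".up.hep", "stdhep_ev4": ".ev4.hep",
    "weight_stream": ".weight.dat",
}


def sample_file_names(processes, formats, sample=None, extensions=None):
    """Names of the event files written for one simulate statement."""
    base = sample if sample is not None else processes[0]
    extensions = extensions or {}
    names = []
    for fmt in formats:
        if fmt in extensions:
            # User-defined extension: the dot is added automatically.
            names.append(f"{base}.{extensions[fmt]}")
        else:
            names.append(base + STANDARD_SUFFIX[fmt])
    return names


print(sample_file_names(["proc"], ["hepmc"], sample="foo"))
print(sample_file_names(["proc1", "proc2"], ["lhef", "short"]))
```

The raw .evx file, which is written in addition unless ?write_raw = false is set, is deliberately left out of this sketch.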

Examples of event formats follow; for the specifications of the event formats, consult the corresponding accords and publications ¹:

HEPEVT:

HEPEVT is an ASCII event format that does not contain an event file header. There is a one-line header for each single event, containing four entries. The first is the number of particles in the event (NHEP), which is four for the fictitious example process hh → hh, but could be larger if e.g. beam remnants are demanded to be included in the event. The second and third entries are the number of outgoing particles and beam remnants, respectively. The event weight is the last entry. For each particle in the event there are three lines. The first line contains the status according to the HEPEVT format, ISTHEP, the PDG code, IDHEP, the one or two possible mother particles, JMOHEP, the first and last possible daughter particles, JDAHEP, and the polarization. The second line contains the three momentum components, p_x, p_y, p_z, the particle energy E, and its mass, m. The last line contains the position of the vertex in the event reconstruction.

 4 2 0  3.0574068604E+08
 2 25 0 0 3 4 0
  0.0000000000E+00  0.0000000000E+00  4.8412291828E+02  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
 2 25 0 0 3 4 0
  0.0000000000E+00  0.0000000000E+00 -4.8412291828E+02  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
 1 25 1 2 0 0 0
 -1.4960220911E+02 -4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
 1 25 1 2 0 0 0
  1.4960220911E+02  4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
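A record in this layout is straightforward to read back. The following Python sketch parses the single example event shown above according to the line structure described in this section (one header line, then three lines per particle). The function and dictionary keys are illustrative choices, and the vertex line is kept as an uninterpreted list of numbers.

```python
# Minimal reader for one event in the HEPEVT-style ASCII layout shown above.
import math

HEPEVT_EVENT = """\
 4 2 0  3.0574068604E+08
 2 25 0 0 3 4 0
  0.0000000000E+00  0.0000000000E+00  4.8412291828E+02  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
 2 25 0 0 3 4 0
  0.0000000000E+00  0.0000000000E+00 -4.8412291828E+02  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
 1 25 1 2 0 0 0
 -1.4960220911E+02 -4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
 1 25 1 2 0 0 0
  1.4960220911E+02  4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
"""


def parse_hepevt_event(text):
    """Parse one event: header line followed by three lines per particle."""
    rows = [line.split() for line in text.strip().splitlines()]
    n_particles, n_outgoing, n_remnants = (int(x) for x in rows[0][:3])
    event = {"n_particles": n_particles, "n_outgoing": n_outgoing,
             "n_remnants": n_remnants, "weight": float(rows[0][3]),
             "particles": []}
    for i in range(n_particles):
        ids, mom, vtx = rows[1 + 3 * i: 4 + 3 * i]
        status, pdg, mo1, mo2, da1, da2, pol = (int(x) for x in ids)
        px, py, pz, energy, mass = (float(x) for x in mom)
        event["particles"].append({
            "status": status, "pdg": pdg, "mothers": (mo1, mo2),
            "daughters": (da1, da2), "polarization": pol,
            "momentum": (px, py, pz), "energy": energy, "mass": mass,
            "vertex": [float(x) for x in vtx],
        })
    return event


def invariant_mass(particle):
    """Mass reconstructed from energy and three-momentum."""
    px, py, pz = particle["momentum"]
    return math.sqrt(particle["energy"] ** 2 - (px * px + py * py + pz * pz))


event = parse_hepevt_event(HEPEVT_EVENT)
for p in event["particles"]:
    print(p["status"], p["pdg"], f"{invariant_mass(p):.3f}")
```

Comparing the reconstructed invariant mass with the mass column is a quick consistency check on a file read this way.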

ASCII SHORT:

This is basically the same as the HEPEVT standard, but very much abbreviated. The header line for each event is identical, but the first line per particle contains only the PDG code and the polarization, while the vertex information line is omitted.

 4 2 0  3.0574068604E+08
 25 0
  0.0000000000E+00  0.0000000000E+00  4.8412291828E+02  5.0000000000E+02  1.2500000000E+02
 25 0
  0.0000000000E+00  0.0000000000E+00 -4.8412291828E+02  5.0000000000E+02  1.2500000000E+02
 25 0
 -1.4960220911E+02 -4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02
 25 0
  1.4960220911E+02  4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02

ASCII LONG:

Identical to the ASCII short format, but after each event there is a line containing two values: the value of the sample function to be integrated over phase space, i.e. basically the squared matrix element including all normalization factors, flux factor, structure functions etc.

 4 2 0  3.0574068604E+08
 25 0
  0.0000000000E+00  0.0000000000E+00  4.8412291828E+02  5.0000000000E+02  1.2500000000E+02
 25 0
  0.0000000000E+00  0.0000000000E+00 -4.8412291828E+02  5.0000000000E+02  1.2500000000E+02
 25 0
 -1.4960220911E+02 -4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02
 25 0
  1.4960220911E+02  4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02
  1.0000000000E+00  1.0000000000E+00

ATHENA:

Quite similar to the HEPEVT ASCII format. The header line, however, does contain only two numbers: an event counter, and the number of particles in the event. The first line for each particle lacks the polarization information (irrelevant for the ATHENA environment), but has as leading entry an ordering number counting the particles in the event. The vertex information line has only the four relevant position entries.

 0 4
 1 2 25 0 0 3 4
  0.0000000000E+00  0.0000000000E+00  4.8412291828E+02  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
 2 2 25 0 0 3 4
  0.0000000000E+00  0.0000000000E+00 -4.8412291828E+02  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
 3 1 25 1 2 0 0
 -1.4960220911E+02 -4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
 4 1 25 1 2 0 0
  1.4960220911E+02  4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02
  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00  0.0000000000E+00

MOKKA:

Quite similar to the ASCII short format, but the event entries are the particle status, the PDG code, the first and last daughter, the three spatial components of the momentum, as well as the mass.

 4 2 0  3.0574068604E+08
 2 25 3 4  0.0000000000E+00  0.0000000000E+00  4.8412291828E+02  1.2500000000E+02
 2 25 3 4  0.0000000000E+00  0.0000000000E+00 -4.8412291828E+02  1.2500000000E+02
 1 25 0 0 -1.4960220911E+02 -4.6042825611E+02  0.0000000000E+00  1.2500000000E+02
 1 25 0 0  1.4960220911E+02  4.6042825611E+02  0.0000000000E+00  1.2500000000E+02

LHA:

This is the implementation of the Les Houches Accord, as it was used in WHIZARD 1 and the old MadEvent. There is a first line containing six entries: 1. the number of particles in the event, NUP, 2. the subprocess identification index, IDPRUP, 3. the event weight, XWGTUP, 4. the scale of the process, SCALUP, 5. the value or status of α_QED, AQEDUP, 6. the value for α_s, AQCDUP. The next seven lines each contain as many entries as there are particles in the event: the first one has the PDG codes, IDUP, the next two the first and second mother of the particles, MOTHUP, the fourth and fifth line the two color indices, ICOLUP, the next one the status of the particle, ISTUP, and the last line the polarization information, ISPINUP. At the end of the event there are as many lines as there are particles, each containing the counter of the particle in the event and its four-vector. For more information on this event format confer [51].

 25 25  5.0000000000E+02  5.0000000000E+02 -1 -1 -1 -1 3 1
  1.0000000000E-01  1.0000000000E-03  1.0000000000E+00 42
     4     1  3.0574068604E+08  1.000000E+03 -1.000000E+00 -1.000000E+00
    25    25    25    25
     0     0     1     1
     0     0     2     2
     0     0     0     0
     0     0     0     0
    -1    -1     1     1
     9     9     9     9
     1  5.0000000000E+02  0.0000000000E+00  0.0000000000E+00  4.8412291828E+02
     2  5.0000000000E+02  0.0000000000E+00  0.0000000000E+00 -4.8412291828E+02
     3  5.0000000000E+02 -1.4960220911E+02 -4.6042825611E+02  0.0000000000E+00
     4  5.0000000000E+02  1.4960220911E+02  4.6042825611E+02  0.0000000000E+00

LHEF:

This is the modern version of the Les Houches accord event format (LHEF); for details confer the corresponding publication [55].

<LesHouchesEvents version="1.0">
<header>
  <generator_name>WHIZARD</generator_name>
  <generator_version>3.1.6</generator_version>
</header>
<init>
 25 25  5.0000000000E+02  5.0000000000E+02 -1 -1 -1 -1 3 1
  1.0000000000E-01  1.0000000000E-03  1.0000000000E+00 42
</init>
<event>
 4 42  3.0574068604E+08  1.0000000000E+03 -1.0000000000E+00 -1.0000000000E+00
 25 -1 0 0 0 0  0.0000000000E+00  0.0000000000E+00  4.8412291828E+02  5.0000000000E+02  1.2500000000E+02  0.0000000000E+00  9.0000000000E+00
 25 -1 0 0 0 0  0.0000000000E+00  0.0000000000E+00 -4.8412291828E+02  5.0000000000E+02  1.2500000000E+02  0.0000000000E+00  9.0000000000E+00
 25 1 1 2 0 0 -1.4960220911E+02 -4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02  0.0000000000E+00  9.0000000000E+00
 25 1 1 2 0 0  1.4960220911E+02  4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02  0.0000000000E+00  9.0000000000E+00
</event>
</LesHouchesEvents>

Note that for the LHEF format, there are different versions according to the different stages of agreement. They can be addressed from within the SINDARIN file by setting the string variable $lhef_version to one of (at the moment) three values: "1.0", "2.0", or "3.0". The example above corresponds (as is indicated in the header) to version "1.0" of the LHEF format. Additional information in the form of alternative squared matrix elements or event weights in the event are the most prominent features of the other two, more advanced versions. For more details confer the literature.
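Because a version "1.0" file like the example is plain XML around whitespace-separated numbers, it can be read with Python's standard xml.etree.ElementTree module. The sketch below parses the example event. The column names follow the Les Houches event record (IDUP, ISTUP, MOTHUP, ICOLUP, PUP, VTIMUP, SPINUP), while the function name and dictionary layout are illustrative. It assumes the file is well-formed XML, which is true for this example but may not hold for every LHEF file in the wild.

```python
# Minimal reader for the LHEF 1.0 example above, using only the standard library.
import xml.etree.ElementTree as ET

LHEF_SAMPLE = """<LesHouchesEvents version="1.0">
<header>
  <generator_name>WHIZARD</generator_name>
  <generator_version>3.1.6</generator_version>
</header>
<init>
 25 25  5.0000000000E+02  5.0000000000E+02 -1 -1 -1 -1 3 1
  1.0000000000E-01  1.0000000000E-03  1.0000000000E+00 42
</init>
<event>
 4 42  3.0574068604E+08  1.0000000000E+03 -1.0000000000E+00 -1.0000000000E+00
 25 -1 0 0 0 0  0.0000000000E+00  0.0000000000E+00  4.8412291828E+02  5.0000000000E+02  1.2500000000E+02  0.0000000000E+00  9.0000000000E+00
 25 -1 0 0 0 0  0.0000000000E+00  0.0000000000E+00 -4.8412291828E+02  5.0000000000E+02  1.2500000000E+02  0.0000000000E+00  9.0000000000E+00
 25 1 1 2 0 0 -1.4960220911E+02 -4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02  0.0000000000E+00  9.0000000000E+00
 25 1 1 2 0 0  1.4960220911E+02  4.6042825611E+02  0.0000000000E+00  5.0000000000E+02  1.2500000000E+02  0.0000000000E+00  9.0000000000E+00
</event>
</LesHouchesEvents>"""


def parse_lhef_event(block_text):
    """Parse the text of one <event> block into a dictionary."""
    rows = [line.split() for line in block_text.strip().splitlines()]
    head = rows[0]
    event = {"nup": int(head[0]), "idprup": int(head[1]),
             "xwgtup": float(head[2]), "scalup": float(head[3]),
             "aqedup": float(head[4]), "aqcdup": float(head[5]),
             "particles": []}
    for row in rows[1:1 + event["nup"]]:
        event["particles"].append({
            "idup": int(row[0]), "istup": int(row[1]),
            "mothup": (int(row[2]), int(row[3])),
            "icolup": (int(row[4]), int(row[5])),
            "pup": tuple(float(x) for x in row[6:11]),  # px, py, pz, E, m
            "vtimup": float(row[11]), "spinup": float(row[12]),
        })
    return event


root = ET.fromstring(LHEF_SAMPLE)
events = [parse_lhef_event(node.text) for node in root.iter("event")]
print(root.attrib["version"], root.findtext("header/generator_name"))
print(len(events), "event(s),", events[0]["nup"], "particles in the first one")
```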

Sample files for the default ASCII format as well as for the debug event format are shown in the appendix.

11.6 Interfaces to Parton Showers, Matching and Hadronization

This section describes the interfaces to the internal parton shower as well as to the parton shower and hadronization routines from PYTHIA. Moreover, our implementation of MLM matching, which makes use of the parton showers, is described. Sample SINDARIN files are located in the share/examples directory. All input files come in two versions, one using the internal shower, ending in W.sin, and one using PYTHIA’s shower, ending in P.sin. Thus we state all file names as ending with X.sin, where X has to be replaced by either W or P. The input files include EENoMatchingX.sin and DrellYanNoMatchingX.sin for e⁺ e⁻ → hadrons and pp → Z without matching. The corresponding SINDARIN files with matching enabled are EEMatching2X.sin to EEMatching5X.sin for e⁺ e⁻ → hadrons with a different number of partons included in the matrix element, and DrellYanMatchingX.sin for Drell-Yan with one matched emission.

11.6.1 Parton Showers and Hadronization

From version 2.1 onwards, WHIZARD contains an implementation of an analytic parton shower as presented in [74], providing the opportunity to perform the parton shower from within WHIZARD. Moreover, an interface to PYTHIA is included, which can be used to delegate the parton shower to PYTHIA. The same interface can be used to hadronize the generated events using PYTHIA’s hadronization routines. Note that by PYTHIA’s default, multiple interactions are included when performing initial-state radiation, and hadronic decays are included when performing the hadronization. If required, these additional steps have to be switched off using the corresponding arguments for PYTHIA’s PYGIVE routine via the $ps_PYTHIA_PYGIVE string.

Note that from version 2.2.4 on, the earlier --enable-shower flag has been abandoned, and there is only a flag (--enable-pythia6) to either compile or not compile the internally attached PYTHIA6 package (the last release of the Fortran PYTHIA, v6.427) as well as the interface. The showers and hadronization are controlled by the following SINDARIN keywords:

?ps_fsr_active = true	master switch for final-state parton showers
?ps_isr_active = true	master switch for initial-state parton showers
?ps_taudec_active = true	master switch for τ decays (at the moment only via TAUOLA)
?hadronization_active = true	master switch to enable hadronization
$shower_method = "PYTHIA6"	switch to use PYTHIA6’s parton shower instead of
	WHIZARD’s own shower

If either ?ps_fsr_active or ?ps_isr_active is set to true, the event will be transferred to the internal shower routines or the PYTHIA data structures, and the chosen shower steps (initial- and final-state radiation) will be performed. If hadronization is enabled via the ?hadronization_active switch, WHIZARD will call PYTHIA’s hadronization routine. The hadronization can be applied to events showered using the internal shower or PYTHIA’s shower routines, as well as to unshowered events. Any necessary transfer of event data to PYTHIA is automatically taken care of within WHIZARD’s shower interface. The resulting (showered and/or hadronized) event will be transferred back to WHIZARD, and the former final particles will be marked as intermediate. The analysis can be applied to a showered and/or hadronized event just like in the unshowered/unhadronized case. Any event file format can be used and will contain the showered/hadronized event.
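
A minimal sketch of how these switches might be combined for an e⁺e⁻ process is shown below. The process name eeuu, the energy, and the event count are hypothetical and chosen only for illustration; the PYTHIA6 line assumes that WHIZARD was built with the PYTHIA6 package enabled.

  process eeuu = e1, E1 => u, U      # hypothetical example process

  sqrts = 91 GeV                     # illustrative value

  ?ps_fsr_active = true              # final-state shower on
  ?ps_isr_active = false             # no initial-state shower for lepton beams
  ?hadronization_active = true       # hadronize with PYTHIA's routine
  $shower_method = "PYTHIA6"         # drop this line to keep WHIZARD's own shower

  simulate (eeuu) {
    n_events = 100                   # illustrative value
    sample_format = lhef             # the file contains the showered/hadronized event
  }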

Settings for the internal analytic parton shower are set via the following SINDARIN variables:

ps_mass_cutoff: The cut-off in virtuality below which partons are assumed to radiate no more. Used for both ISR and FSR. Given in GeV. (Default = 1.0)
ps_fsr_lambda: The value for Λ used in calculating the value of the running coupling constant α_S for Final State Radiation. Given in GeV. (Default = 0.29)
ps_isr_lambda: The value for Λ used in calculating the value of the running coupling constant α_S for Initial State Radiation. Given in GeV. (Default = 0.29)
ps_max_n_flavors: Number of quark flavours taken into account during shower evolution. Meaningful choices are 3 to include u,d,s-quarks, 4 to include u,d,s,c-quarks and 5 to include u,d,s,c,b-quarks. (Default = 5)
?ps_isr_alphas_running: Switch to decide between a constant α_S, given by ps_fixed_alphas, and a running α_S, calculated using ps_isr_lambda for ISR. (Default = true)
?ps_fsr_alphas_running: Switch to decide between a constant α_S, given by ps_fixed_alphas, and a running α_S, calculated using ps_fsr_lambda for FSR. (Default = true)
ps_fixed_alphas: Fixed value of α_S for the parton shower. Used if either of the variables ?ps_fsr_alphas_running or ?ps_isr_alphas_running is set to false. (Default = 0.0)
?ps_isr_angular_ordered: Switch for angular ordered ISR. (Default = true)²
ps_isr_primordial_kt_width: The width in GeV of the Gaussian assumed to describe the transverse momentum of partons inside the proton. Other shapes are not yet implemented. (Default = 0.0)
ps_isr_primordial_kt_cutoff: The maximal transverse momentum in GeV of a parton inside the proton. Used as a cut-off for the Gaussian. (Default = 5.0)
ps_isr_z_cutoff: Maximal z-value in initial state branchings. (Default = 0.999)
ps_isr_minenergy: Minimal energy in GeV of an emitted timelike or final parton. Note that the energy is not calculated in the lab frame but in the center-of-mass frame of the two most initial partons resolved so far, so deviations may occur. (Default = 1.0)
ps_isr_tscalefactor: Factor for the starting scale in the initial state shower evolution. (Default = 1.0)
?ps_isr_only_onshell_emitted_partons: Switch to allow only for on-shell emitted partons, thereby rejecting all possible final state parton showers starting from partons emitted during the ISR. (Default = false)
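
As an illustration of how these variables interact, the sketch below switches both showers to a constant α_S. Since ps_fixed_alphas defaults to 0.0, a value has to be supplied explicitly in that case. All numbers are illustrative assumptions, not recommended tunes.

  ?ps_fsr_active = true
  ?ps_isr_active = true

  ?ps_fsr_alphas_running = false     # use constant alpha_S in FSR
  ?ps_isr_alphas_running = false     # use constant alpha_S in ISR
  ps_fixed_alphas = 0.118            # illustrative value; the default 0.0 would switch off radiation

  ps_mass_cutoff = 1.0               # GeV, same as the default
  ps_max_n_flavors = 4               # u, d, s, c only
  ps_isr_primordial_kt_width = 1.5   # GeV, illustrative value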

Settings for PYTHIA are transferred using the following SINDARIN variables:

?ps_PYTHIA_verbose	if set to false, output from PYTHIA will be suppressed
$ps_PYTHIA_PYGIVE	a string containing settings transferred to PYTHIA’s PYGIVE subroutine.
	The format is explained in the PYTHIA manual. The limitation to 100
	characters mentioned there does not apply here, the string is split
	appropriately before being transferred to PYTHIA.

Note that the included version of PYTHIA uses LHAPDF for initial state radiation whenever this is available, but the PDF set has to be set manually in that case using the string variable $ps_PYTHIA_PYGIVE.
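
The following sketch shows how the PYTHIA defaults mentioned in Section 11.6.1 (multiple interactions and hadronic decays) might be switched off. The switch names MSTP(81) and MSTJ(21) and the semicolon separator are quoted here from memory of the PYTHIA 6 conventions and should be treated as assumptions to be checked against the PYTHIA manual.

  ?ps_PYTHIA_verbose = false         # suppress PYTHIA's own output

  # Assumed PYTHIA 6 switches (verify in the PYTHIA manual):
  #   MSTP(81)=0  no multiple interactions
  #   MSTJ(21)=0  no particle decays
  $ps_PYTHIA_PYGIVE = "MSTP(81)=0; MSTJ(21)=0"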

11.6.2 Parton shower – Matrix Element Matching

Along with the inclusion of the parton showers, WHIZARD includes an implementation of the MLM matching procedure. For a detailed description of the implemented steps see [74]. The inclusion of MLM matching still demands some manual settings in the SINDARIN file. For a given base process and a matching of N additional jets, all processes that can be obtained by attaching up to N QCD splittings (either a quark emitting a gluon, or a gluon splitting into two quarks or two gluons) have to be manually specified as additional processes. These additional processes need to be included in the simulate statement along with the original process. The SINDARIN variable mlm_nmaxMEjets has to be set to the maximum number of additional jets N. Moreover, additional cuts have to be specified for the additional processes.

  alias quark = u:d:s:c
  alias antiq = U:D:S:C
  alias j = quark:antiq:g

  ?mlm_matching = true
  mlm_ptmin = 5 GeV
  mlm_etamax = 2.5
  mlm_Rmin = 1

  cuts = all Dist > mlm_Rmin [j, j]
         and all Pt > mlm_ptmin [j]
         and all abs(Eta) < mlm_etamax [j]

Note that the variables mlm_ptmin, mlm_etamax and mlm_Rmin are used by the matching routine. Thus, replacing the variables in the cut expression and omitting the assignment would destroy the matching procedure.
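
Putting the pieces together, a matched setup with one additional jet might look as follows. This is only a sketch: the process names ee_qq and ee_qqg, the collider energy, and the event count are hypothetical choices for illustration, while the aliases, the mlm_ variables, and the cut expression are those of the fragment above.

  alias quark = u:d:s:c
  alias antiq = U:D:S:C
  alias j = quark:antiq:g

  # Base process plus all processes with up to N = 1 extra QCD splitting
  process ee_qq  = e1, E1 => quark, antiq       # hypothetical name
  process ee_qqg = e1, E1 => quark, antiq, g    # hypothetical name

  sqrts = 91 GeV                                # illustrative value

  ?ps_fsr_active = true                         # matching relies on the shower
  ?mlm_matching = true
  mlm_nmaxMEjets = 1                            # N = 1 additional jet
  mlm_ptmin = 5 GeV
  mlm_etamax = 2.5
  mlm_Rmin = 1

  cuts = all Dist > mlm_Rmin [j, j]
         and all Pt > mlm_ptmin [j]
         and all abs(Eta) < mlm_etamax [j]

  # All processes enter the same simulate statement
  simulate (ee_qq, ee_qqg) { n_events = 1000 }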

The complete list of variables introduced to steer the matching procedure is as follows:

?mlm_matching: Master switch to enable MLM matching. (Default = false)
mlm_ptmin: Minimal transverse momentum, also used in the definition of a jet
mlm_etamax: Maximal absolute value of pseudorapidity η, also used in defining a jet
mlm_Rmin: Minimal η−φ distance R_min
mlm_nmaxMEjets: Maximum number of jets N
mlm_ETclusfactor: Factor to vary the jet definition. Should be ≥ 1 for complete coverage of phase space. (Default = 1)
mlm_ETclusminE: Minimal energy in the variation of the jet definition
mlm_etaclusfactor: Factor in the variation of the jet definition. Should be ≤ 1 for complete coverage of phase space. (Default = 1)
mlm_Rclusfactor: Factor in the variation of the jet definition. Should be ≥ 1 for complete coverage of phase space. (Default = 1)

The variation of the jet definition is a tool to assess systematic uncertainties introduced by the matching procedure (see section 3.1 in [74]).

11.7 Rescanning and recalculating events

In the simplest mode of execution, WHIZARD handles its events at the point where they are generated. It can apply event transforms such as decays or shower (see above), it can analyze the events, calculate and plot observables, and it can output them to file. However, it is also possible to apply two different operations to those events in parallel, or to reconsider and rescan an event sample that has been previously generated.

We first discuss the possibilities that simulate offers. For each event, WHIZARD calculates the matrix element for the hard interaction, supplements this by Jacobian and phase-space factors in order to obtain the event weight, optionally applies a rejection step in order to gather uniformly weighted events, and applies the cuts and analysis setup. We may ask about the event matrix element or weight, or the analysis result, that we would have obtained for a different setting. To this end, there is an alt_setup option.

This option allows us to recalculate, event by event, the matrix element, weight, or analysis contribution with a different parameter set but identical kinematics. For instance, we may evaluate a distribution for both zero and non-zero anomalous coupling fw and enter some observable in separate histograms:

  simulate (some_proc) {
      fw = 0
      analysis = record hist1 (eval Pt [H])
    alt_setup = {
       fw = 0.01
       analysis = record hist2 (eval Pt [H])
    }
  }

In fact, the alt_setup object is not restricted to a single code block (enclosed in curly braces) but can take a list of those,

  alt_setup = { fw = 0.01 }, { fw = 0.02 }, ...

Each block provides the environment for a separate evaluation of the event data. The generation of these events, i.e., their kinematics, is still steered by the primary environment.

The alt_setup blocks may modify various settings that affect the evaluation of an event, including physical parameters, PDF choice, cuts and analysis, output format, etc. This must not (i.e., cannot) affect the kinematics of an event, so don’t modify particle masses. When applying cuts, they can only reduce the generated event sample, so they apply on top of the primary cuts for the simulation.

Alternatively, it is possible to rescan a sample that has been generated by a previous simulate command:

  simulate (some_proc) { $sample = "my_events"
    analysis = record hist1 (eval Pt [H])
  }
  ?update_sqme = true
  ?update_weight = true
  rescan "my_events" (some_proc) {
    fw = 0.01
    analysis = record hist2 (eval Pt [H])
  }
  rescan "my_events" (some_proc) {
    fw = 0.05
    analysis = record hist3 (eval Pt [H])
  }

In more complicated situations, rescanning is more transparent and offers greater flexibility than doing all operations at the very point of event generation.

Combining these features with the scan looping construct, we already cover a considerable range of applications. (There are limitations due to the fact that SINDARIN doesn’t provide array objects, yet.) Note that the rescan construct also allows for an alt_setup option.

You may generate a new sample by rescanning, for which you may choose any output format:

  rescan "my_events" (some_proc) {
    selection = all Pt > 100 GeV [H]
    $sample = "new_events"
    sample_format = lhef
  }

The event sample that you rescan need not be an internal raw WHIZARD file, as above. You may rescan a LHEF file,

  rescan "lhef_events" (proc) {
    $rescan_input_format = "lhef"
  }

This file may have any origin, not necessarily from WHIZARD. To understand such an external file, WHIZARD must be able to reconstruct the hard process and match it to a process with a known name (e.g., proc), that has been defined in the SINDARIN script previously.

Within its limits, WHIZARD can thus be used for translating an event sample from one format to another format.
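
For instance, combining the two preceding fragments, a conversion from LHEF to HepMC could be sketched as follows. The sample name converted_events is hypothetical, and the hepmc output format is assumed to be available, which requires WHIZARD to be built with HepMC support.

  rescan "lhef_events" (proc) {
    $rescan_input_format = "lhef"      # read the existing LHEF file
    $sample = "converted_events"       # hypothetical name for the new sample
    sample_format = hepmc              # assumes HepMC support is compiled in
  }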

There are three important switches that control the rescanning behavior. They can be set or unset independently.

?update_sqme (default: false). If true, WHIZARD will recalculate the hard matrix element for each event. When applying an analysis, the recalculated squared matrix element (averaged and summed over quantum numbers as usual) is available as the variable sqme_prc. This may be related to sqme_ref, the corresponding value in the event file, if available. (For the alt_setup option, this switch is implied.)
?update_weight (default: false). If true, WHIZARD will recalculate the event weight according to the current environment and apply this to the event. In particular, the user may apply a reweight expression. In an analysis, the new weight value is available as weight_prc, to be related to weight_ref from the sample. The updated weight will be applied for histograms and averages. An unweighted event sample will thus be transformed into a weighted event sample. (This switch is also implied for the alt_setup option.)
?update_event (default: false). If true, WHIZARD will generate a new decay chain etc., if applicable. That is, it reuses just the particles in the hard process. Otherwise, the complete event is kept as it is written to file.

For these options to make sense, WHIZARD must have access to a full process object, so the SINDARIN script must contain not just a definition but also a compile command for the matrix elements in question.
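
A minimal sketch of such a script, using the eemm process from Section 11.1, is given below. The sample name eemm_events, the histogram name and binning, and the energy are hypothetical; the sketch further assumes that the event file provides a reference value, so that sqme_ref is available.

  process eemm = e1, E1 => e2, E2
  compile                              # make the matrix element available

  sqrts = 500 GeV                      # should match the setup used for the sample

  histogram h_ratio (0, 2, 0.1)        # hypothetical histogram: lower, upper, bin width

  ?update_sqme = true
  ?update_weight = true

  rescan "eemm_events" (eemm) {
    # ratio of recalculated to stored squared matrix element
    analysis = record h_ratio (sqme_prc / sqme_ref)
  }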

If an event file (other than raw format) contains several processes as a mixture, they must be identifiable by a numeric ID. WHIZARD will recognize the processes if their respective SINDARIN definitions contain appropriate process_num_id options, such as

  process foo = u, ubar => d, dbar { process_num_id = 42 }

Certain event-file formats, such as LHEF, support alternative matrix-element values or weights. WHIZARD can thus write both original and recalculated matrix-element and weight values. Other formats support only a single event weight, so the ?update_weight option is necessary for a visible effect.

External event files in formats such as LHEF, HepMC, or LCIO may also carry information about the value of the strong coupling α_s and the energy scale of each event. This information will also be provided by WHIZARD when writing external event files. When such an event file is rescanned, the user has the choice to either use the α_s value that WHIZARD defines in the current context (or the method for obtaining an event-specific running α_s value), or override this for each event by using the value in the event file. The corresponding parameter is ?use_alphas_from_file, which is false by default. Analogously, the parameter ?use_scale_from_file may be set to override the scale definition in the current context. Obviously, these settings influence matrix-element recalculation and therefore require ?update_sqme to be set in order to become operational.
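
As a sketch, reusing the LHEF rescan example from above, taking both α_s and the scale from the file would read:

  ?update_sqme = true                  # required for the two switches below to take effect
  ?use_alphas_from_file = true         # per-event alpha_s from the event file
  ?use_scale_from_file = true          # per-event scale from the event file

  rescan "lhef_events" (proc) {
    $rescan_input_format = "lhef"
  }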

11.8 Negative weight events

For usage at NLO refer to Subsection 5.11.3. In case you have some other mechanism to produce events with negative weights (e.g. with the weight = <expr> command), keep in mind that you should activate ?negative_weights = true and ?unweighted = false. The generation of unweighted events with varying sign (also known as events and counter events) is currently not supported.
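
In a SINDARIN file, the two settings could be sketched as follows. The process eemm from Section 11.1 and the event count are used only for illustration, the weight expression that would actually produce negative weights is left out, and the unweighted flag is written with the leading ? that SINDARIN uses for logical variables.

  ?negative_weights = true             # allow events with negative weight
  ?unweighted = false                  # generate weighted events

  simulate (eemm) { n_events = 1000 }  # illustrative process and event count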

1: Some event formats, based on the HEPEVT or HEPEUP common blocks, use fixed-form ASCII output with a two-digit exponent for real numbers. There are rare cases (mainly, ISR photons) where the event record can contain numbers with absolute value less than 10⁻⁹⁹. Since those numbers are not representable in that format, WHIZARD will set all non-zero numbers below that value to ± 10⁻⁹⁹, when filling either common block. Obviously, such values are physically irrelevant, but in the output they are representable and distinguishable from zero.
2: The FSR is always simulated with angular ordering enabled.