Chapter 4 Steering WHIZARD: SINDARIN Overview
4.1 The command language for WHIZARD
A conventional physics application program gets its data from a set of input files. Alternatively, it is called as a library, so the user has to write their own code to interface it, or it combines these two approaches. WHIZARD 1 was built in this way: there were some input files written by the user, and it could be called either stand-alone or as an external library. WHIZARD 2 is also a stand-alone program. It comes with its own full-fledged script language, called SINDARIN. All interaction between the user and the program is done in SINDARIN expressions, commands, and scripts. Two main reasons led us to this choice:
The SINDARIN language is built specifically around event analysis, suitably extended to support steering, including data types, loops, conditionals, and I/O. It would have been possible to use an established general-purpose language for these tasks. For instance, OCaml, a functional language, would be a suitable candidate, and the matrix-element generator O’Mega is written in that language. Another candidate would be a popular scripting language such as Python. We have started to support interfaces for commonly used languages: examples for C, C++, and Python are found in the share/interfaces subdirectory. However, introducing a special-purpose language has three distinct advantages. First, it is compiled and executed by the very Fortran code that handles the data, and thus accesses the data without interfaces. Second, it can be designed with a syntax especially suited to the task of event handling and Monte-Carlo steering. Third, the user is not forced to learn all those features of a generic language that are of no relevance to the application they are interested in.
4.2 SINDARIN scripts
A SINDARIN script tells the WHIZARD program what it has to do. Typically,
the script is contained in a file which you (the user) create. The file name
is arbitrary; by convention, it has the extension ‘.sin’. WHIZARD takes the file name as its argument on the command line and executes the script it contains:
/home/user$ whizard script.sin
Alternatively, you can call WHIZARD interactively and execute statements line by line; we describe this below in Sec. 14.2. A SINDARIN script is a sequence of statements, similar to the statements in any imperative language such as Fortran or C. Examples of statements are commands like integrate, variable declarations like logical ?flag, or assignments like mH = 130 GeV. The script is free-form, i.e., indentation, extra whitespace, and newlines are syntactically insignificant. In contrast to most languages, there is no statement separator. Statements simply follow each other, separated only by whitespace.
Nevertheless, for clarity we recommend writing one statement per line where possible, and using proper indentation for longer statements and for nested and bracketed expressions. A command may consist of a keyword, a list of arguments in parentheses (…), and an option script, which itself is a sequence of statements.
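The command forms just described can be sketched as follows. The process tag foo and all numerical values are placeholders chosen for illustration, not settings recommended by the manual.

```sindarin
# Hypothetical process tag 'foo'; values are illustrative only.
process foo = e1, E1 => e2, E2     # keyword followed by a definition
sqrts = 500 GeV                    # assignment
integrate (foo) {                  # keyword, argument list, option script
  iterations = 3:5000
}
n_events = 100                     # one statement per line, as recommended
simulate (foo)                     # keyword with an argument list only
```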
As a rule, parentheses () enclose arguments and expressions, as you would expect. Arguments enclosed in square brackets [] also exist. They have a special meaning: they denote subevents (collections of momenta) in event analysis. Braces {} enclose blocks of SINDARIN code. In particular, the option script associated with a command is a block of code that may contain local parameter settings, for instance. Braces always indicate a scoping unit, so parameters will be restored to their previous values when the execution of that command is completed. The script can contain comments. Comments are initiated by either a # or a ! character and extend to the end of the current line.
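A minimal sketch of the three kinds of brackets, of local scoping, and of comments. The process tag foo, the alias muon, and the cut value are assumptions made for this illustration.

```sindarin
# A comment introduced by '#'
! A comment introduced by '!'
process foo = e1, E1 => e2, E2
alias muon = e2:E2
sqrts = 500 GeV
cuts = all Pt > 10 GeV [muon]       # [] encloses a subevent
integrate (foo) { sqrts = 1 TeV }   # {} holds a setting local to this command
show (sqrts)                        # the global value set above applies again
```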
4.3 Errors
Before turning to proper SINDARIN syntax, let us consider error messages. SINDARIN distinguishes syntax errors and runtime errors. Syntax errors are recognized when the script is read and compiled, before any part is executed. Look at this example:
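A script along the following lines, with the command integrate misspelled, serves as such an example. The process definition is assumed for illustration; only the last two lines are fixed by the error message quoted in this section.

```sindarin
process foo = u, ubar => d, dbar   # assumed process definition
sqrts = 1 TeV
integrade (foo)                    # typo: should read 'integrate'
```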
WHIZARD will fail with the error message
sqrts = 1 TeV
integrade (foo)
^^
| Expected syntax: SEQUENCE <cmd_num> = <var_name> '=' <expr>
| Found token: KEYWORD: '('
******************************************************************************
******************************************************************************
*** FATAL ERROR: Syntax error (at or before the location indicated above)
******************************************************************************
******************************************************************************
WHIZARD run aborted.
which tells you that you have misspelled the command integrate, so the compiler tried to interpret it as a variable name in an assignment.
Runtime errors are categorized by their severity. A warning is simply printed:
Warning: No cuts have been defined.
This indicates a condition that is suspicious, but may actually be intended by the user. When an error is encountered, it is printed with more emphasis
******************************************************************************
*** ERROR: Variable 'md' set without declaration
******************************************************************************
and the program tries to continue. However, this usually indicates
that there is something wrong. (The d quark is defined
massless, so md is not a model parameter.) WHIZARD counts errors and warnings and reports the totals at the end of the run,
| There were 1 error(s) and no warnings.
just in case you missed the message. Other errors are considered fatal, and execution stops at this point.
******************************************************************************
******************************************************************************
*** FATAL ERROR: Colliding beams: sqrts is zero (please set sqrts)
******************************************************************************
******************************************************************************
Here, WHIZARD was unable to do anything sensible. But at least (in this case) it told the user what to do to resolve the problem.
4.4 Statements
SINDARIN statements are executed one by one. For an overview, we list the most common statements in the order in which they typically appear in a SINDARIN script, and quote the basic syntax and simple examples. This should give an impression of WHIZARD’s capabilities and of the user interface. The list is not complete. Note that there are no mandatory commands (although an empty SINDARIN script is not really useful). The details and options are explained in later sections.
4.4.1 Process Configuration
model
This assignment sets or resets the current physics model. The Standard Model is already preloaded, so the model assignment applies to non-default models. Obviously, the model must be known to WHIZARD. Example:
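A minimal sketch, assuming the MSSM model shipped with WHIZARD is installed:

```sindarin
model = MSSM    # replaces the preloaded Standard Model as the current model
```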
See Sec. 5.3.
alias
Particles are specified by their names. For most particles, there
are various equivalent names. Names containing special characters
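such as a + sign have to be quoted. The alias assignment defines a name for a list of particles, which can then be used like a simple particle name, for instance to sum over flavors in process definitions or in cut expressions.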
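A short sketch of alias definitions in the Standard Model. The alias names jet and wboson are chosen freely for this illustration.

```sindarin
alias jet = u:d:s:g            # light partons, summed over in later expressions
alias wboson = "W+":"W-"       # names with special characters are quoted
```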
See Sec. 5.2.1.
process
Define a process. You give the process a name ⟨tag⟩ by which it is identified later, and specify the incoming and outgoing particles, and possibly options. You can define an arbitrary number of processes as long as they are distinguished by their names. Example:
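Two process definitions with distinct tags might look as follows; the tags are arbitrary names picked for this sketch.

```sindarin
process w_pair_production = e1, E1 => "W+", "W-"
process muon_pair_production = e1, E1 => e2, E2
```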
See Sec. 5.4.
sqrts
Define the center-of-mass energy for collision processes. The default setup will assume head-on central collisions of two beams. Example:
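For instance, with an illustrative value:

```sindarin
sqrts = 500 GeV
```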
See Sec. 5.5.1.
beams
Declare beam particles and properties. The current value of sqrts is used, unless specified otherwise. Example:
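A sketch for proton beams with WHIZARD's built-in parton distributions. The collider energy is an illustrative choice.

```sindarin
sqrts = 14 TeV
beams = p, p => pdf_builtin    # uses the current value of sqrts
```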
With options, the assignment allows for defining beam structure in some detail. This includes beamstrahlung and ISR for lepton colliders, precise structure function definition for hadron colliders, asymmetric beams, beam polarization, and more. See Sec. 5.5.
4.4.2 Parameters
Parameter settings
Specify a value for a parameter. There are predefined parameters that affect the behavior of a command, model-specific parameters (masses, couplings), and user-defined parameters. The latter have to be declared with a type, which may be int (integer), real, complex, logical, string, or alias. Logical parameter names begin with a question mark, string parameter names with a dollar sign. Examples:
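The three kinds of parameters can be sketched as follows. The user-defined names mass_sum, $note, and q are invented for this illustration.

```sindarin
mW = 80.4 GeV                  # model parameter
?rebuild_grids = true          # predefined logical parameter
real mass_sum = mZ + mW        # user-defined, declared with its type
string $note = "test run"      # user-defined string
alias q = d:s:b                # user-defined alias
```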
The value need not be a literal; it can be an arbitrary expression of the correct type. See Sec. 4.7.
read_slha
This is useful only for supersymmetric models: read a parameter file in the SUSY Les Houches Accord format. The file defines parameter values and, optionally, decay widths, so this command removes the need for writing assignments for each of them.
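A sketch, assuming a supersymmetric model is loaded and that a file with the hypothetical name sps1a.slha exists in the working directory:

```sindarin
model = MSSM
read_slha ("sps1a.slha")    # hypothetical file name
```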
See Sec. 10.2.
show
Print the current value of some data object. This includes not just variables, but also models, libraries, cuts, etc. This is rather a debugging aid, so don’t expect the output to be concise in the latter cases. Example:
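For instance, to inspect two Standard Model parameters and then the whole model:

```sindarin
show (mH, wH)    # Higgs mass and width
show (model)     # verbose output for the current model
```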
See Sec. 5.10.
printf
Pretty-print the data objects according to the given format string. If there are no data objects, just print the format string. This command is borrowed from the C programming language; it is actually an interface to the system’s printf(3) function. The conversion specifiers are restricted to d,i,e,f,g,s, corresponding to the output of integer, real, and string variables. Example:
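A sketch with one real argument and with no arguments at all:

```sindarin
printf "The Higgs mass is %f GeV" (mH)
printf "Starting the integration"
```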
See Sec. 5.10.
4.4.3 Integration
cuts
The cut expression is a logical macro expression that is evaluated for each
phase space point during integration and event generation. You may construct
expressions out of various observables that are computed for the (partonic)
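particle content of the current event. If the expression evaluates to false, the phase-space point is rejected: the matrix element is set to zero and the event is discarded.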
Example for the keyword cuts:
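A minimal sketch that requires every jet to exceed a transverse-momentum threshold. The alias jet and the threshold are illustrative assumptions.

```sindarin
alias jet = u:U:d:D:g
cuts = all Pt > 20 GeV [jet]
```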
See Sec. 5.2.5.
integrate
Compute the total cross section for a process. The command takes into account the definition of the process, the beam setup, cuts, and parameters as defined in the script. Parameters may also be specified as options to the command. Integration is necessary for each process for which you want to know total or differential cross sections, or event samples. Apart from computing a value, it sets up and adapts phase space and integration grids that are used in event generation. If you just need an event sample, you can omit an explicit integrate command; the simulate command will call it automatically. Example:
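A sketch of an integration with the iteration setting passed as a local option. The process tag and the numbers are illustrative.

```sindarin
process w_pair_production = e1, E1 => "W+", "W-"
sqrts = 500 GeV
integrate (w_pair_production) { iterations = 5:20000 }   # 5 iterations, 20000 calls each
```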
See Sec. 5.7.1.
?phs_only/n_calls_test
These are optional settings for the integrate command
described above. The ?phs_only = true (note that
variables starting with a question mark are logicals) option tells
WHIZARD to prepare a process for integration, but instead of
performing the integration, just to generate a phase space
parameterization. n_calls_test = <num> evaluates the sampling
function for random integration channels and random momenta. VAMP
integration grids are neither generated nor used, so the channel
selection corresponds to the first integration pass, before any grids
or channel weights are adapted. The number of sampling points is
given by <num>.
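Both settings can be passed as options to integrate, as in this sketch. The process tag foo and the number of test calls are placeholders.

```sindarin
process foo = e1, E1 => e2, E2
sqrts = 500 GeV
integrate (foo) {
  ?phs_only = true       # set up phase space only, skip the integration
  n_calls_test = 1000    # evaluate the sampling function at 1000 points
}
```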
(Note that there used to be a separate command matrix_element_test until version 2.1.1 of WHIZARD, which has been discarded in order to simplify the SINDARIN syntax.)
4.4.4 Events
histogram
Declare a histogram for event analysis. The histogram is filled by an analysis expression, which is evaluated once for each event during a subsequent simulation step. Example:
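A sketch of a histogram declaration with lower bound, upper bound, and bin width. The tag and the binning are illustrative.

```sindarin
histogram pt_distribution (0 GeV, 500 GeV, 10 GeV)
```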
See Sec. 5.9.3.
plot
Declare a plot for displaying data points. The plot may be filled by an analysis expression that is evaluated for each event; this would result in a scatter plot. More likely, you will use this feature for displaying data such as the energy dependence of a cross section. Example:
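In its simplest form the declaration consists of the keyword and a tag; the tag below is an arbitrary name.

```sindarin
plot total_cross_section
```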
See Sec. 5.9.4.
selection
The selection expression is a logical macro expression that is evaluated once for each event. It is applied to the event record, after all decays have been executed (if any). It is therefore intended e.g. for modelling detector acceptance cuts etc. For unfactorized processes the usage of cuts or selection leads to the same results. Events for which the selection expression evaluates to false are dropped; they are neither analyzed nor written to any user-defined output file. However, the dropped events are written to WHIZARD’s native event file. For unfactorized processes it is therefore preferable to implement all cuts using the cuts keyword for the integration, see cuts above. Example:
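A sketch of an acceptance-like selection on charged leptons. The alias lepton and the threshold are assumptions for illustration.

```sindarin
alias lepton = e1:e2:E1:E2
selection = all Pt > 50 GeV [lepton]
```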
The syntax is generically the same as for the cuts expression, see Sec. 5.2.5. For more information see also Sec. 5.9.
analysis
The analysis expression is a logical macro expression that is evaluated once for each event that passes the integration and selection cuts in a subsequent simulation step. The expression has type logical in analogy with the cut expression; however, its main use will be in side effects caused by embedded record expressions. The record expression books a value, calculated from observables evaluated for the current event, in one of the predefined histograms or plots. Example:
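A sketch that books the photon transverse momentum of each event into a previously declared histogram. The histogram tag is illustrative.

```sindarin
histogram pt_distribution (0 GeV, 500 GeV, 10 GeV)
analysis = record pt_distribution (eval Pt [photon])
```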
See Sec. 5.9.
unstable
Specify that a particle can decay, if it occurs in the final state of a subsequent simulation step. (In the integration step, all final-state particles are considered stable.) The decay channels are processes which should have been declared before by a process command (alternatively, there are options that let WHIZARD take care of this automatically; cf. Sec. 5.8.2). They may be integrated explicitly; otherwise the unstable command will take care of the integration before particle decays are generated. Example:
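A sketch with a single decay channel for the W+ boson. The process tag is an arbitrary name.

```sindarin
process w_plus_decay = Wp => u, dbar   # decay channel, declared beforehand
unstable Wp (w_plus_decay)
```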
Note that the decay is an on-shell approximation. Alternatively, WHIZARD is capable of generating the final state(s) directly, automatically including the particle as an internal resonance together with irreducible background. Depending on the physical problem and on the complexity of the matrix-element calculation, either option may be more appropriate. See Sec. 5.8.2.
n_events
Specify the number of events that a subsequent simulation step should produce. By default, simulated events are unweighted. (Unweighting is done by a rejection operation on weighted events, so the usual caveats on event unweighting by a numerical Monte-Carlo generator do apply.) Example:
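For instance, with an illustrative sample size:

```sindarin
n_events = 20000
```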
See Sec. 5.8.1.
simulate
Generate an event sample. The command allows for analyzing the generated events by the analysis expression. Furthermore, events can be written to file in various formats. Optionally, the partonic events can be showered and hadronized, partly using included external (PYTHIA) or truly external programs called by WHIZARD. Example:
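A sketch of a complete simulation step that also writes the events in LHEF format. The process tag, energy, and sample size are illustrative.

```sindarin
process w_pair_production = e1, E1 => "W+", "W-"
sqrts = 500 GeV
n_events = 1000
sample_format = lhef             # additional output file in LHEF format
simulate (w_pair_production)     # integrates first if that has not been done
```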
See Sec. 5.8.1 and Chapter 11.
graph
Combine existing histograms and plots into a common graph. Also useful for pretty-printing single histograms or plots. Example:
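A sketch that overlays two histograms in one graph. The tags hist1, hist2, and comparison as well as the title are invented for this illustration.

```sindarin
histogram hist1 (0 GeV, 500 GeV, 10 GeV)
histogram hist2 (0 GeV, 500 GeV, 10 GeV)
graph comparison { $title = "Standard Model vs. new physics" } = hist1 & hist2
```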
See Sec. 12.4.
write_analysis
Writes out data tables for the specified analysis objects (plots, graphs, histograms). If the argument is empty or absent, write all analysis objects currently available. The tables are available for feeding external programs. Example:
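A sketch with an explicit list of analysis objects; the tags are illustrative. Calling write_analysis without an argument would write all objects instead.

```sindarin
histogram pt_distribution (0 GeV, 500 GeV, 10 GeV)
plot total_cross_section
write_analysis (pt_distribution, total_cross_section)
```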
See Sec. 5.9.
compile_analysis
Analogous to write_analysis, but the generated data tables are processed by LaTeX and gamelan, which produce PostScript and PDF versions of the displayed data. Example:
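A sketch for a single histogram, with an illustrative tag:

```sindarin
histogram pt_distribution (0 GeV, 500 GeV, 10 GeV)
compile_analysis (pt_distribution)
```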
See Sec. 5.9.
4.5 Control Structures
Like any complete programming language, SINDARIN provides means for branching and looping the program flow.
4.5.1 Conditionals
if
Execute statements conditionally, depending on the value of a logical expression. There may be none or multiple elsif branches, and the else branch is also optional. Example:
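A sketch with one elsif branch. The process tag and the messages are illustrative; the process is defined outside the conditional.

```sindarin
process top_pair_production = e1, E1 => t, tbar
sqrts = 500 GeV
if (sqrts > 2 * mtop) then
  integrate (top_pair_production)
elsif (sqrts > 2 * mW) then
  printf "Below the top threshold, above the W-pair threshold"
else
  printf "Below both thresholds"
endif
```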
The current SINDARIN implementation puts some restrictions on the statements that can appear in a conditional. For instance, process definitions must be done unconditionally.
4.5.2 Loops
scan
Execute the statements repeatedly, once for each value of the scan variable. The statements are executed in a local context, analogous to the option statement list for commands. The value list is a comma-separated list of expressions, where each item evaluates to the value that is assigned to ⟨variable⟩ for this iteration. The type of the variable is not restricted to numeric, scans can be done for various object types. For instance, here is a scan over strings:
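A scan of the following form, which prints the W mass with three different format strings, is consistent with the output quoted below:

```sindarin
scan string $str = ("%.3g", "%.4g", "%.5g") {
  printf $str (mW)
}
```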
The output:
[user variable] $str = "%.3g"
80.4
[user variable] $str = "%.4g"
80.42
[user variable] $str = "%.5g"
80.419
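For a numeric scan variable in particular, there are iterators that implement the usual functionality of for loops. If the scan variable is of type integer, an iterator takes the form ⟨start-value⟩ => ⟨end-value⟩, optionally followed by a step specification: /+ ⟨add-step⟩, /- ⟨subtract-step⟩, /* ⟨multiplicator⟩, or // ⟨divisor⟩.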
The iterator can be put in place of an expression in the ⟨value-list⟩. Here is an example:
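A scan of this form mixes a single value, a range with the default step, and a range with an explicit additive step:

```sindarin
scan int i = (1, (3 => 5), (10 => 20 /+ 4))
```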
which results in the output
[user variable] i = 1
[user variable] i = 3
[user variable] i = 4
[user variable] i = 5
[user variable] i = 10
[user variable] i = 14
[user variable] i = 18
[Note that the ⟨statements⟩ part of the scan construct may be empty or absent.] For real scan variables, there are even more possibilities for iterators: besides the step operators /+, /-, /*, and //, the operators /+/ and /*/ are available.
The first variant is equivalent to /+ 1. The /+ and /- operators are intended to add or subtract the given step once for each iteration. Since in floating-point arithmetic this would be plagued by rounding ambiguities, the actual implementation first determines the (integer) number of iterations from the provided step value, then recomputes the step so that the iterations are evenly spaced with the first and last value included. The /* and // operators are analogous. Here, the initial value is intended to be multiplied by the step value once for each iteration. After determining the integer number of iterations, the actual scan values will be evenly spaced on a logarithmic scale. Finally, the /+/ and /*/ operators allow you to specify the number of iterations (not counting the initial value) directly. The ⟨start-value⟩ and ⟨end-value⟩ are always included, and the intermediate values will be evenly spaced on a linear (/+/) or logarithmic (/*/) scale. Example:
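A sketch of a real-valued scan that combines single values with a linear and a logarithmic iterator. The variable name mh and the values are illustrative, and the statement block is left out, which the scan construct permits.

```sindarin
scan real mh = (130 GeV,
                (140 GeV => 160 GeV /+ 5 GeV),   # linear steps of 5 GeV
                180 GeV,
                (200 GeV => 1 TeV /*/ 10))       # 10 logarithmic steps after the start value
```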
4.5.3 Including Files
include
Include a SINDARIN script from the specified file. The contents must be complete commands; they are compiled and executed as if they were part of the current script. Example:
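For instance, with a hypothetical file name:

```sindarin
include ("default_cuts.sin")    # hypothetical file containing complete commands
```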
4.6 Expressions
SINDARIN expressions are classified by their types. The type of an expression is verified when the script is compiled, before it is executed. This provides some safety against simple coding errors. Within expressions, grouping is done using ordinary parentheses (). For subevent expressions, use square brackets [].
4.6.1 Numeric
The language supports the classical numeric types int, real, and complex.
SINDARIN supports arithmetic expressions similar to conventional
languages. In arithmetic expressions, the three numeric types can be
mixed as appropriate. The computation essentially follows the rules
for mixed arithmetic in Fortran. The arithmetic operators are +, -, *, /, and ^ (exponentiation).
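A small sketch of mixed integer and real arithmetic. The variable names are arbitrary.

```sindarin
int n = 3
real x = 2.5
real y = n * x + x^2 / 4    # the integer n is promoted to real
show (y)
```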
Numeric values can be associated with units. Units evaluate to
numerical factors, and their use is optional, but they can be useful
in the physics context for which WHIZARD is designed. Note that the
default energy/mass unit is GeV.
4.6.2 Logical and String
The language also has the following standard types: logical and string.
There are comparisons, logical operations, string concatenation, and a mechanism for formatting objects as strings for output.
4.6.3 Special
Furthermore, SINDARIN provides a number of data types tailored specifically for Monte Carlo applications:
In the current implementation, SINDARIN has no container data types derived from basic types, such as lists, arrays, or hashes, and there are no user-defined data types. (The subevt type is a container for particles in the context of events, but there is no type for an individual particle: this is represented as a one-particle subevt.) There are also containers for inclusive processes, which are however simply handled as an expansion into several components of a master process tag.
4.7 Variables
SINDARIN supports global variables, variables local to a scoping unit (the option body of a command, the body of a scan loop), and variables local to an expression. Some variables are predefined by the system (intrinsic variables). They are further separated into independent variables that can be reset by the user, and derived or locked variables that are automatically computed by the program, but not directly user-modifiable. On top of that, the user is free to introduce their own variables (user variables).
The names of numerical variables consist of alphanumeric characters and underscores. The first character must not be a digit. Logical variable names are furthermore prefixed by a ? (question mark) sign, while string variable names begin with a $ (dollar) sign. Character case does matter. In this manual we follow the convention that variable names consist of lower-case letters, digits, and underscores only, but you may also use upper-case letters if you wish.
Physics models contain their own, specific set of numeric variables (masses, couplings). They are attached to the model where they are defined, so they appear and disappear with the model that is currently loaded. In particular, if two different models contain a variable with the same name, these two variables are nevertheless distinct: setting one doesn’t affect the other. This feature might be called, in computer-science jargon, a mixin.
User variables – global or local – are declared by their type when they are introduced, and acquire an initial value upon declaration. Examples:
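Declarations for each type can be sketched as follows. All names and values are illustrative; I denotes the predefined imaginary unit.

```sindarin
int i = 3
real my_cut_value = 10 GeV
complex c = 3 - 4 * I
logical ?top_decay_allowed = mH > 2 * mtop
string $hello = "Hello world!"
alias q = d:u:s:c
```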
An existing user variable can be assigned a new value without a declaration, for example i = i + 1. It may also be redeclared if the new declaration specifies the same type; this is equivalent to assigning a new value. Variables local to an expression are introduced by the let ... in construct. Example:
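A sketch in which the integer n exists only inside the expression on the right-hand side. The variables x, y, and a are illustrative.

```sindarin
real x = 3
real y = 4
real a = let int n = 2 in x^n + y^n   # n is local to this expression
show (a)
```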
The explicit int declaration is necessary only if the variable n has not been declared before. An intrinsic variable must not be declared:
let mtop = 175.3 GeV in …
let constructs can be concatenated if several local variables need to be assigned:
let a = 3 in let b = 4 in expression
Variables of type subevt can only be defined in let constructs. Exclusively in the context of particle selections (event analysis), there are observables as special numeric objects. They are used like numeric variables, but they are never declared or assigned. They get their value assigned dynamically, computed from the particle momentum configuration. Hence, they may be understood as (intrinsic and predefined) macros. By convention, observable names begin with a capital letter. Further macros are