5.13.0 ITEM: 1. initialize() callbacks
Before a SLiM simulation can be run, the various classes underlying the simulation need to be set up with an initial configuration. In SLiM 1.8 and earlier, this was done by means of # directives in the simulation’s input file. In SLiM 2.0, simulation parameters are instead configured using Eidos.
Configuration in Eidos is done in initialize() callbacks that run prior to the beginning of simulation execution. In your input file, you can simply write something like this:
initialize() { ... }
The initialize() declaration specifies that the script block is to be executed as an initialize() callback before the simulation starts. The script between the braces {} would set up various aspects of the simulation by calling initialization functions. These are SLiM functions that may be called only in an initialize() callback, and their names begin with initialize to mark them clearly as such. You may also use other Eidos functionality, of course; for example, you might automate generating a large number of subpopulations with complex migration patterns by using a for loop.
One thing worth mentioning is that in the context of an initialize() callback, the sim global for the simulation itself is not defined, and neither are the globals for subpopulations. This is because the state of the simulation is not yet constructed fully, and accessing partially constructed state would not be safe. Globals for objects that you define in your callback by calling initialization functions, such as new mutation types and genomic element types, become available only after the call that defines them.
Once all initialize() callbacks have executed, in the order in which they are specified in the SLiM input file, the simulation will begin. The generation number at which it starts is determined by the Eidos events you have defined; the first generation in which an Eidos event is scheduled to execute is the generation at which the simulation starts. Similarly, the simulation will terminate after the last generation for which a script block (either an event or a callback) is registered to execute, unless the stop() function is called to end the simulation earlier.
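As a minimal sketch of how these pieces fit together, the following model configures a neutral simulation in an initialize() callback and then runs from generation 1 to generation 2000. The rates, the chromosome length, the population size, and the names m1, g1, and p1 are illustrative assumptions, not values prescribed by this section.

```eidos
// Configure the simulation before execution begins; all values are illustrative.
initialize() {
    initializeMutationRate(1e-7);
    initializeMutationType("m1", 0.5, "f", 0.0);    // neutral mutations
    initializeGenomicElementType("g1", m1, 1.0);    // g1 draws only from m1
    initializeGenomicElement(g1, 0, 99999);         // one 100 kb element
    initializeRecombinationRate(1e-8);
}

// The earliest scheduled event sets the starting generation (1 here).
1 early() {
    sim.addSubpop("p1", 500);
}

// The latest scheduled block sets the final generation (2000 here).
2000 late() {
    sim.outputFixedMutations();
}
```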
5.13.1 ITEM: 2. Eidos events
An Eidos event is a block of Eidos code that is executed every generation, within a generation range, to perform a desired task. The syntax of an Eidos event declaration looks like one of these:
[id] [gen1 [: gen2]] first() { ... }
[id] [gen1 [: gen2]] { ... }
[id] [gen1 [: gen2]] early() { ... }
[id] [gen1 [: gen2]] late() { ... }
The first declaration declares a first() event that executes first thing in the generation cycle. The second and third declarations are exactly equivalent, and declare an early() event that executes relatively early in the generation cycle; the early() designation is optional. The fourth declaration declares a late() event that executes near the end of the generation cycle. Exactly when these events run depends upon whether the model is a WF model (see chapter 22 for details on the generation cycle in WF models) or a nonWF model (see chapter 23 for the same about nonWF models).
The id is an optional identifier like s1 (or more generally, sX, where X is an integer greater than or equal to 0) that defines an identifier that can be used to refer to the script block. In most situations it can be omitted, in which case the id is implicitly defined as -1, a placeholder value that essentially represents the lack of an identifier value. Supplying an id is only useful if you wish to manipulate your script blocks programmatically.
Then comes a generation or a range of generations, and then a block of Eidos code enclosed in braces to form a compound statement. A trivial example might look like this:
1000:5000 {
    catn(sim.generation);
}
This would print the generation number in every generation in the specified range, which is obviously not very exciting. The broader point is that the Eidos code in the braces {} is executed early in every generation within the specified range of generations. In this case, the generation range is 1000 to 5000, and so the Eidos event will be executed 4001 times (not 4000!). A range of generations can be given, as in the example above, or a single generation can be given with a single integer:
100 late() {
    print("Finished generation 100!");
}
The generation range may also be incompletely specified, with a somewhat idiosyncratic syntax. A range of 1000: would specify that the event should run in generation 1000 and every subsequent generation until the model finishes; a range of :1000 would similarly specify that the event should run in the first generation executed, and every subsequent generation, up to and including generation 1000. (You might notice that the grammar shown above for the generation range is not quite correct; see section 27.1 for the correct, complete grammar.)
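The open-ended forms can be sketched in a small model such as the one below. The model contents and the generation numbers are assumptions chosen for illustration. Because the earliest event is scheduled for generation 1 and the model is stopped in generation 20, the :5 block runs in generations 1 through 5 and the 16: block runs in generations 16 through 20.

```eidos
initialize() {
    initializeMutationRate(1e-7);
    initializeMutationType("m1", 0.5, "f", 0.0);
    initializeGenomicElementType("g1", m1, 1.0);
    initializeGenomicElement(g1, 0, 99999);
    initializeRecombinationRate(1e-8);
}

1 early() {
    sim.addSubpop("p1", 100);
}

// Open start: runs from the first generation executed through generation 5.
:5 early() {
    catn("early, generation " + sim.generation);
}

// Open end: runs from generation 16 until the model finishes.
16: late() {
    catn("late, generation " + sim.generation);
}

// End the run explicitly in generation 20.
20 late() {
    sim.simulationFinished();
}
```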
In fact, you can omit specifying a generation altogether, in which case the Eidos event runs every generation. Since it takes a little time to set up the Eidos interpreter and interpret a script, it is advisable to use the narrowest range of generations possible; however, that is more of a concern with the callbacks we will look at later in this chapter, since they might be called many times in every generation, whereas first(), early(), and late() events will just be called once per generation.
The generations specified for an Eidos event block can be any positive integer. All scripts that apply to a given time point will be run in the order in which they are given; scripts specified higher in the input file will run before those specified lower. Sometimes it is desirable to have a script block execute in a generation which is not fixed, but instead depends upon some parameter, defined constant, or calculation; this may be achieved by rescheduling the script block with the SLiMSim method rescheduleScriptBlock().
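Rescheduling might look like the following sketch, in which a block named s1 is declared with a placeholder generation and then moved to a generation computed from a defined constant. The constant N, the factor of 10, and the placeholder generation are assumptions made for this example.

```eidos
initialize() {
    defineConstant("N", 100);    // population size (illustrative)
    initializeMutationRate(1e-7);
    initializeMutationType("m1", 0.5, "f", 0.0);
    initializeGenomicElementType("g1", m1, 1.0);
    initializeGenomicElement(g1, 0, 99999);
    initializeRecombinationRate(1e-8);
}

1 early() {
    sim.addSubpop("p1", N);
    // Move block s1 from its placeholder generation to generation 10*N.
    sim.rescheduleScriptBlock(s1, start=10*N, end=10*N);
}

// Declared for generation 10 only as a placeholder; rescheduled above.
s1 10 late() {
    sim.outputFixedMutations();
}
```

Giving the block the identifier s1 is what makes it addressable from script; without an identifier there would be no symbol to pass to rescheduleScriptBlock().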
When Eidos events are executed, several global variables are defined by SLiM for use by the Eidos code. Here is a summary of those SLiM globals:
sim A SLiMSim object representing the current SLiM simulation
g1, ... GenomicElementType objects representing the genomic element types defined
i1, ... InteractionType objects representing the interaction types defined
m1, ... MutationType objects representing the mutation types defined
p1, ... Subpopulation objects representing the subpopulations that exist
s1, ... SLiMEidosBlock objects representing the named events and callbacks defined
self A SLiMEidosBlock object representing the script block currently executing
Note that the sim global is not available in initialize() callbacks, since the simulation has not yet been initialized. Similarly, the globals for subpopulations, mutation types, and genomic element types are only available after the point at which those objects have been defined by an initialize() callback.
5.13.2 ITEM: 3. fitness() callbacks
A fitness() callback is called by SLiM when it is determining the fitness effect of a mutation carried by an individual. Normally, the fitness effect of a mutation is determined by the selection coefficient s of the mutation and the dominance coefficient h of the mutation (the latter used only if the individual is heterozygous for the mutation). More specifically, the standard calculation for the fitness effect of a mutation takes one of two forms. If the individual is homozygous, then the fitness effect is (1+s), or:
w = w * (1.0 + selectionCoeff),
where w is the relative fitness of the individual carrying the mutation. This equation is also used if the chromosome being simulated has no homologue – when the Y sex chromosome is being simulated. If the individual is heterozygous, then the dominance coefficient enters the picture, and the fitness effect is (1+hs) or:
w = w * (1.0 + dominanceCoeff * selectionCoeff).
The dominance coefficient usually comes from the dominanceCoeff property of the mutation’s MutationType; if the focal individual has only one non-null genome, however, such that the mutation is paired with a null genome (i.e., is actually hemizygous or haploid, not heterozygous), the haploidDominanceCoeff property of the MutationType is used instead. See section 22.6 for further discussion of this detail.
That is the standard behavior of SLiM, reviewed here to provide a conceptual baseline. Supplying a fitness() callback allows you to substitute any calculation you wish for the relative fitness effect of a mutation; the new relative fitness effect computation becomes:
w = w * fitness()
where fitness() is the value returned by your callback. This value is a multiplicative fitness effect, so 1.0 is neutral, unlike the selection coefficient scale where 0.0 is neutral; be careful with this distinction!
Like Eidos events, fitness() callbacks are defined as script blocks in the input file, but they use a variation of the syntax for defining an Eidos event:
[id] [gen1 [: gen2]] fitness(<mut-type-id> [, <subpop-id>]) { ... }
For example, if the callback were defined as:
1000:2000 fitness(m2, p3) { 1.0; }
then a relative fitness of 1.0 (i.e. neutral) would be used for all mutations of mutation type m2 in subpopulation p3 from generation 1000 to generation 2000. The very same mutations, if also present in individuals in other subpopulations, would preserve their normal selection coefficient and dominance coefficient in those other subpopulations; this callback would therefore establish spatial heterogeneity in selection, in which mutation type m2 was neutral in subpopulation p3 but under selection in other subpopulations, for the range of generations given.
In addition to standard SLiM globals, a fitness() callback is supplied with some additional information passed through “pseudo-parameters”, variables that are defined by SLiM within the context of the callback’s code to supply the callback with relevant information:
mut A Mutation object, the mutation whose relative fitness is being evaluated
homozygous A value of T (the mutation is homozygous), F (heterozygous), or NULL (it is paired with a null chromosome, and is thus hemizygous or haploid)
relFitness The default relative fitness value calculated by SLiM
individual The individual carrying this mutation (an object of class Individual)
genome1 One genome of the individual carrying this mutation
genome2 The other genome of that individual
subpop The subpopulation in which that individual lives
These may be used in the fitness() callback to compute a fitness value. To implement the standard fitness functions used by SLiM for an autosomal simulation with no null genomes involved, for example, you could do something like this:
fitness(m1) {
    if (homozygous)
        return 1.0 + mut.selectionCoeff;
    else
        return 1.0 + mut.mutationType.dominanceCoeff * mut.selectionCoeff;
}
As mentioned above, a relative fitness of 1.0 is neutral (whereas a selection coefficient of 0.0 is neutral); the 1.0 + in these calculations converts between the selection coefficient scale and the relative fitness scale, and is therefore essential. However, the relFitness pseudo-parameter mentioned above would already contain this value, precomputed by SLiM, so you could simply return relFitness to get that behavior when you want it:
fitness(m1) {
    if (<conditions>)
        <custom fitness calculations...>;
    else
        return relFitness;
}
This would return a modified fitness value in certain conditions, but would return the standard fitness value otherwise.
More than one fitness() callback may be defined to operate in the same generation. As with Eidos events, multiple callbacks will be called in the order in which they were defined in the input file. Furthermore, each callback will be given the relFitness value returned by the previous callback – so the value of relFitness is not necessarily the default value, in fact, but is the result of all previous fitness() callbacks for that individual in that generation. In this way, the effects of multiple callbacks can “stack”.
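A sketch of such stacking is shown below, with two fitness() callbacks declared for the same mutation type. The mutation type m2, its selection coefficient, and the factor of 0.5 are assumptions for illustration. The first callback returns 1 + s for every carrier; the second receives that value as relFitness and halves the effect, so the combined result is 1 + 0.5s.

```eidos
initialize() {
    initializeMutationRate(1e-7);
    initializeMutationType("m1", 0.5, "f", 0.0);    // neutral
    initializeMutationType("m2", 0.5, "f", 0.1);    // beneficial (illustrative)
    initializeGenomicElementType("g1", c(m1, m2), c(0.99, 0.01));
    initializeGenomicElement(g1, 0, 99999);
    initializeRecombinationRate(1e-8);
}

1 early() {
    sim.addSubpop("p1", 500);
}

// Defined first, so called first: use 1 + s whether or not the
// mutation is homozygous.
fitness(m2) {
    return 1.0 + mut.selectionCoeff;
}

// Defined second, so called second: relFitness now holds the value
// returned by the callback above, not SLiM's default.
fitness(m2) {
    return 1.0 + (relFitness - 1.0) * 0.5;
}

2000 late() {
    sim.outputFixedMutations();
}
```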
In SLiM version 2.3 and later, it is possible to define global fitness() callbacks, which are applied exactly once to every individual (within a given subpopulation, if the fitness() callback is declared to be limited to one subpopulation, as usual). Global fitness() callbacks do not reference a particular mutation type, and are not called in reference to any specific mutation in the individual; instead, they provide an opportunity for the model script to define fitness effects that are independent of specific mutations (although their fitness effects may still depend upon some aggregate genetic state). For example, they are useful for defining the fitness effect of an individual’s overall phenotype (perhaps determined by multiple loci, and perhaps by developmental noise, phenotypic plasticity, etc.), or for defining the fitness effects of behavioral interactions between individuals such as competition or altruism. A global fitness() callback is defined by giving NULL as the mutation type identifier in the callback’s declaration. These callbacks will generally be called once per individual in each generation, in an order that is formally undefined. When a global fitness() callback is running, the mut and homozygous variables are defined to be NULL (since there is no focal mutation), and relFitness is defined to be 1.0. The fitness effect for the callback is simply returned as a singleton float value, as usual.
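One way a global fitness() callback might be used is sketched below for a simple quantitative trait. The mutation type m2, the Gaussian fitness function, and the optimum and width values are assumptions chosen for this example. The selection coefficients of m2 mutations are treated as additive effect sizes, their direct fitness effect is neutralized, and fitness is instead computed once per individual from the summed phenotype.

```eidos
initialize() {
    initializeMutationRate(1e-7);
    initializeMutationType("m1", 0.5, "f", 0.0);         // neutral
    initializeMutationType("m2", 0.5, "n", 0.0, 0.5);    // trait effect sizes
    m2.convertToSubstitution = F;    // keep fixed effects in the phenotype
    initializeGenomicElementType("g1", c(m1, m2), c(0.9, 0.1));
    initializeGenomicElement(g1, 0, 99999);
    initializeRecombinationRate(1e-8);
}

1 early() {
    sim.addSubpop("p1", 500);
}

// Make m2 mutations neutral on their own; they act only through the trait.
fitness(m2) {
    return 1.0;
}

// Global callback: called once per individual, with mut and homozygous NULL.
fitness(NULL) {
    phenotype = individual.sumOfMutationsOfType(m2);
    // Gaussian stabilizing selection around an assumed optimum of 5.0,
    // scaled so that the optimum has fitness 1.0.
    return dnorm(phenotype, 5.0, 2.0) / dnorm(5.0, 5.0, 2.0);
}

2000 late() {
    sim.simulationFinished();
}
```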
Beginning in SLiM 3.0, it is also possible to set the fitnessScaling property on a subpopulation to scale the fitness values of every individual in the subpopulation by the same constant amount, or to set the fitnessScaling property on an individual to scale the fitness value of that specific individual. These scaling factors are multiplied together with all other fitness effects for an individual to produce the individual’s final fitness value. The fitnessScaling properties of Subpopulation and Individual can often provide similar functionality to fitness(NULL) callbacks with greater efficiency and simplicity. They are reset to 1.0 in every generation, immediately after fitness values are calculated, so they only need to be set when a value other than 1.0 is desired.
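The following sketch sets the fitnessScaling property of individuals from a late() event in a WF model, as an alternative to a fitness(NULL) callback. The mutation type m2 and the penalty of 0.05 per mutation are assumptions for illustration. Because the property is reset to 1.0 after fitness values are calculated, the event sets it again in every generation.

```eidos
initialize() {
    initializeMutationRate(1e-7);
    initializeMutationType("m1", 0.5, "f", 0.0);    // neutral
    initializeMutationType("m2", 0.5, "f", 0.0);    // counted below
    m2.convertToSubstitution = F;
    initializeGenomicElementType("g1", c(m1, m2), c(0.9, 0.1));
    initializeGenomicElement(g1, 0, 99999);
    initializeRecombinationRate(1e-8);
}

1 early() {
    sim.addSubpop("p1", 500);
}

// Runs every generation: scale each individual's fitness by a factor
// that declines with the number of m2 mutations it carries.
late() {
    inds = p1.individuals;
    counts = inds.countOfMutationsOfType(m2);
    inds.fitnessScaling = 1.0 / (1.0 + 0.05 * counts);
}

2000 late() {
    sim.simulationFinished();
}
```

Setting the property for the whole vector of individuals in one assignment avoids one callback invocation per individual, which is where the efficiency gain is expected to come from.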
One caveat to be aware of in WF models is that fitness() callbacks are called at the end of each generation, just before the next generation begins. If you have a fitness() callback defined for generation 10, for example, it will actually be called at the very end of generation 10, after child generation has finished, after the new children have been promoted to be the next parental generation, and after late() events have been executed. The fitness values calculated will thus be used during generation 11; the fitness values used in generation 10 were calculated at the end of generation 9. (This is primarily so that SLiMgui, which refreshes its display in between generations, has computed fitness values at hand that it can use to display the new parental individuals in the proper colors.) This is not an issue in nonWF models, since fitness values are used in the same generation in which they are calculated.
Many other possibilities can be implemented with a fitness() callback, and/or with the fitnessScaling properties of Subpopulation and Individual. However, since fitness() callbacks involve Eidos code being executed for the evaluation of fitness of every mutation of every individual (within the generation range, mutation type, and subpopulation specified), they can slow down a simulation considerably, so use them as sparingly as possible.
5.13.3 ITEM: 4. mateChoice() callbacks
Normally, WF models in SLiM regulate mate choice according to fitness; individuals of higher fitness are more likely to be chosen as mates. However, one might wish to simulate more complex mate-choice dynamics such as assortative or disassortative mating, mate search algorithms, and so forth. Such dynamics can be handled in WF models with the mateChoice() callback mechanism. (In nonWF models mating is arranged by the script, so there is no need for a callback.)
A mateChoice() callback is established in the input file with a syntax very similar to that of fitness() callbacks:
[id] [gen1 [: gen2]] mateChoice([<subpop-id>]) { ... }
The only difference between the two declarations is that the mateChoice() callback does not allow you to specify a mutation type to which the callback applies, since that makes no sense.
Note that if a subpopulation is given to which the mateChoice() callback is to apply, the callback is used for all matings that will generate a child in the stated subpopulation (as opposed to all matings of parents in the stated subpopulation); this distinction is important when migration causes children in one subpopulation to be generated by matings of parents in a different subpopulation.
When a mateChoice() callback is defined, the first parent in a mating is still chosen proportionally according to fitness (if you wish to influence that choice, you can use a fitness() callback). In a sexual (rather than hermaphroditic) simulation, this will be the female parent; SLiM does not currently support males as the choosy sex. The second parent – the male parent, in a sexual simulation – will then be chosen based upon the results of the mateChoice() callback.
More specifically, the callback must return a vector of weights, one for each individual in the subpopulation; SLiM will then choose a parent with probability proportional to weight. The mateChoice() callback could therefore modify or replace the standard fitness-based weights depending upon some other criterion such as assortativeness. A singleton object of type Individual may be returned instead of a weights vector to indicate that that specific individual has been chosen as the mate (beginning in SLiM 2.3); this could also be achieved by returning a vector of weights in which the chosen mate has a non-zero weight and all other weights are zero, but returning the chosen individual directly is much more efficient. A zero-length return vector – as generated by float(0), for example – indicates that a suitable mate was not found; in that event, a new first parent will be drawn from the subpopulation. Finally, if the callback returns NULL, that signifies that SLiM should use the standard fitness-based weights to choose a mate; the mateChoice() callback did not wish to alter the standard behavior for the current mating (this is equivalent to returning the unmodified vector of weights, but returning NULL is much faster since it allows SLiM to drop into an optimized case). Apart from the special cases described above – a singleton Individual, float(0), and NULL – the returned vector of weights must contain the same number of values as the size of the subpopulation, and all weights must be non-negative. Note that the vector of weights is not required to sum to 1, however; SLiM will convert relative weights on any scale to probabilities for you.
If the sum of the returned weights vector is zero, SLiM treats it as meaning the same thing as a return of float(0) – a suitable mate could not be found, and a new first parent will thus be drawn. (This is a change in policy beginning in SLiM 2.3; prior to that, returning a vector of sum zero was considered a runtime error.) There is a subtle difference in semantics between this and a return of float(0): returning float(0) immediately short-circuits mate choice for the current first parent, whereas returning a vector of zeros allows further applicable mateChoice() callbacks to be called, one of which might “rescue” the first parent by returning a non-zero weights vector or an individual. In most models this distinction is irrelevant, since chaining mateChoice() callbacks is uncommon. When the choice is otherwise unimportant, returning float(0) will be handled more quickly by SLiM.
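The special return values described above can be sketched in a single callback for a hermaphroditic model. The mutation type m2, the preference for carriers, and the 10% rejection probability are assumptions made for this example.

```eidos
initialize() {
    initializeMutationRate(1e-7);
    initializeMutationType("m1", 0.5, "f", 0.0);    // neutral
    initializeMutationType("m2", 0.5, "f", 0.0);    // marks preferred mates
    initializeGenomicElementType("g1", c(m1, m2), c(0.99, 0.01));
    initializeGenomicElement(g1, 0, 99999);
    initializeRecombinationRate(1e-8);
}

1 early() {
    sim.addSubpop("p1", 500);
}

mateChoice() {
    inds = sourceSubpop.individuals;
    carriers = inds[inds.countOfMutationsOfType(m2) > 0];

    // No carriers exist: let SLiM use the standard fitness-based weights.
    if (size(carriers) == 0)
        return NULL;

    // Occasionally report that no suitable mate was found, which makes
    // SLiM draw a new first parent.
    if (runif(1) < 0.1)
        return float(0);

    // Otherwise return the chosen mate directly as a singleton Individual.
    return sample(carriers, 1);
}

2000 late() {
    sim.simulationFinished();
}
```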
In addition to the standard SLiM globals, a mateChoice() callback is supplied with some additional information passed through “pseudo-parameters”:
individual The parent already chosen (the female, in sexual simulations)
genome1 One genome of the parent already chosen
genome2 The other genome of the parent already chosen
subpop The subpopulation into which the offspring will be placed
sourceSubpop The subpopulation from which the parents are being chosen
weights The standard fitness-based weights for all individuals
If sex is enabled, the mateChoice() callback must ensure that the appropriate weights are zero and nonzero to guarantee that all eligible mates are male (since the first parent chosen is always female, as explained above). In other words, weights for females must be 0. The weights vector given to the callback is guaranteed to satisfy this constraint. If sex is not enabled – in a hermaphroditic simulation, in other words – this constraint does not apply.
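When weights are built from scratch rather than derived from the supplied weights vector, the constraint has to be enforced by the callback itself. A minimal sketch, assuming a sexual model with autosomes and a uniform preference that ignores fitness:

```eidos
initialize() {
    initializeMutationRate(1e-7);
    initializeMutationType("m1", 0.5, "f", 0.0);
    initializeGenomicElementType("g1", m1, 1.0);
    initializeGenomicElement(g1, 0, 99999);
    initializeRecombinationRate(1e-8);
    initializeSex("A");    // sexual model, autosome
}

1 early() {
    sim.addSubpop("p1", 500);
}

mateChoice() {
    inds = sourceSubpop.individuals;
    // Start from equal weights for every candidate (illustrative choice).
    w = rep(1.0, size(inds));
    // Required in sexual models: females must have weight 0.
    w[inds.sex == "F"] = 0.0;
    return w;
}

2000 late() {
    sim.simulationFinished();
}
```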
For example, a simple mateChoice() callback might look like this:
1000:2000 mateChoice(p2) {
    return weights ^ 2;
}
This defines a mateChoice() callback for generations 1000 to 2000 for subpopulation p2. The callback simply transforms the standard fitness-based probabilities by squaring them. Code like this could represent a situation in which fitness and mate choice proceed normally in one subpopulation (p1, here, presumably), but are altered by the effects of a social dominance hierarchy or male-male competition in another subpopulation (p2, here), such that the highest-fitness individuals tend to be chosen as mates more often than their (perhaps survival-based) fitness values would otherwise suggest. Note that by basing the returned weights on the weights vector supplied by SLiM, the requirement that females be given weights of 0 is finessed; in other situations, care would need to be taken to ensure that.
More than one mateChoice() callback may be defined to operate in the same generation. As with Eidos events, multiple callbacks will be called in the order in which they were defined. Furthermore, each callback will be given the weights vector returned by the previous callback – so the value of weights is not necessarily the default fitness-based weights, in fact, but is the result of all previous mateChoice() callbacks for the current mate-choice event. In this way, the effects of multiple callbacks can “stack”. If any mateChoice() callback returns float(0), however – indicating that no eligible mates exist, as described above – then the remainder of the callback chain will be short-circuited and a new first parent will immediately be chosen.
Note that matings in SLiM do not proceed in random order. Offspring are generated for each subpopulation in turn, and within each subpopulation the order of offspring generation is also non-random with respect to both the source subpopulation and the sex of the offspring. It is important, therefore, that mateChoice() callbacks are not in any way biased by the offspring generation order; they should not treat matings early in the process any differently than matings late in the process. Any failure to guarantee such invariance could lead to large biases in the simulation outcome. In particular, it is usually dangerous to activate or deactivate mateChoice() callbacks while offspring generation is in progress.
A wide variety of mate choice algorithms can easily be implemented with mateChoice() callbacks. However, mateChoice() callbacks can be particularly slow since they are called for every proposed mating, and the vector of mating weights can be large and slow to process.
5.13.4 ITEM: 5. modifyChild() callbacks
Normally, a SLiM simulation defines child generation with its rules regarding selfing versus crossing, recombination, mutation, and so forth. However, one might wish to modify these rules in particular circumstances – by preventing particular children from being generated, by modifying the generated children in particular ways, or by generating children oneself. All of these dynamics can be handled in SLiM with the modifyChild() callback mechanism.
A modifyChild() callback is established in the input file with a syntax very similar to that of other callbacks:
[id] [gen1 [: gen2]] modifyChild([<subpop-id>]) { ... }
The modifyChild() callback may optionally be restricted to the children generated to occupy a specified subpopulation.
When a modifyChild() callback is called, a parent or parents have already been chosen, and a candidate child has already been generated. The parent or parents, and their genomes, are provided to the callback, as is the generated child and its genomes. The callback may accept the generated child, modify it, substitute completely different genomic information for it, or reject it (causing a new parent or parents to be selected and a new child to be generated, which will again be passed to the callback).
In addition to the standard SLiM globals, a modifyChild() callback is supplied with additional information passed through “pseudo-parameters”:
child The generated child (an object of class Individual)
childGenome1 One genome of the generated child
childGenome2 The other genome of the generated child
childIsFemale T if the child will be female, F if male (defined only if sex is enabled)
parent1 The first parent (an object of class Individual)
parent1Genome1 One genome of the first parent
parent1Genome2 The other genome of the first parent
isCloning T if the child is the result of cloning
isSelfing T if the child is the result of selfing (but see note below)
parent2 The second parent (an object of class Individual)
parent2Genome1 One genome of the second parent
parent2Genome2 The other genome of the second parent
subpop The subpopulation in which the child will live
sourceSubpop The subpopulation of the parents (==subpop if not a migration mating)
These may be used in the modifyChild() callback to decide upon a course of action. The childGenome1 and childGenome2 variables may be modified by the callback; whatever mutations they contain on exit will be used for the new child. Alternatively, they may be left unmodified (to accept the generated child as is). These variables may be thought of as the two gametes that will fuse to produce the fertilized egg that results in a new offspring; childGenome1 is the gamete contributed by the first parent (the female, if sex is turned on), and childGenome2 is the gamete contributed by the second parent (the male, if sex is turned on). The child object itself may also be modified – for example, to set the spatial position of the child.
Importantly, a logical singleton return value is required from modifyChild() callbacks. Normally this should be T, indicating that generation of the child may proceed (with whatever modifications might have been made to the child’s genomes). A return value of F indicates that generation of this child should not continue; this will cause new parent(s) to be drawn, a new child to be generated, and a new call to the modifyChild() callback. A modifyChild() callback that always returns F can cause SLiM to hang, so make sure that your callback has a nonzero probability of returning T for every state your simulation can reach.
Note that isSelfing is T only when a mating was explicitly set up to be a selfing event by SLiM; an individual may also mate with itself by chance (by drawing itself as a mate) even when SLiM did not explicitly set up a selfing event, which one might term de facto selfing. If you need to know whether a mating event was a de facto selfing event, you can compare the parents; self-fertilization will always entail parent1==parent2, even when isSelfing is F. Since selfing is enabled only in non-sexual simulations, isSelfing will always be F in sexual simulations (and de facto selfing is also impossible in sexual simulations).
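A sketch of a modifyChild() callback that rejects some proposed children is given below for a hermaphroditic model. The mutation type m2 and the rule that homozygotes for an m2 mutation are inviable are assumptions made for this example. The callback rejects de facto selfing by comparing the parents, rejects children that carry the same m2 mutation on both genomes, and accepts all other children unmodified.

```eidos
initialize() {
    initializeMutationRate(1e-7);
    initializeMutationType("m1", 0.5, "f", 0.0);    // neutral
    initializeMutationType("m2", 0.5, "f", 0.0);    // lethal when homozygous
    initializeGenomicElementType("g1", c(m1, m2), c(0.99, 0.01));
    initializeGenomicElement(g1, 0, 99999);
    initializeRecombinationRate(1e-8);
}

1 early() {
    sim.addSubpop("p1", 500);
}

modifyChild() {
    // Reject de facto selfing: the same individual drawn as both parents.
    if (parent1 == parent2)
        return F;

    // Reject a child that carries the same m2 mutation on both genomes.
    muts1 = childGenome1.mutationsOfType(m2);
    muts2 = childGenome2.mutationsOfType(m2);
    if (size(setIntersection(muts1, muts2)) > 0)
        return F;

    // Accept the generated child as is.
    return T;
}

2000 late() {
    sim.simulationFinished();
}
```

Both rejection rules leave a nonzero probability of acceptance in this model, since other parents can always be drawn and an m2 mutation can never reach fixation.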
Note that matings in SLiM do not proceed in random order. Offspring are generated for each subpopulation in turn, and within each subpopulation the order of offspring generation is also non-random with respect to the source subpopulation, the sex of the offspring, and the reproductive mode (selfing, cloning, or autogamy). It is important, therefore, that modifyChild() callbacks are not in any way biased by the offspring generation order; they should not treat offspring generated early in the process any differently than offspring generated late in the process. Similar to mateChoice() callbacks, any failure to guarantee such invariance could lead to large biases in the simulation outcome. In particular, it is usually dangerous to activate or deactivate modifyChild() callbacks while offspring generation is in progress. When SLiM sees that mateChoice() or modifyChild() callbacks are defined, it randomizes the order of child generation within each subpopulation, so this issue is mitigated somewhat. However, offspring are still generated for each subpopulation in turn. Furthermore, in generations without active callbacks offspring generation order will not be randomized (making the order of parents nonrandom in the next generation), with possible side effects. In short, order-dependency issues are still possible and must be handled very carefully.
As with the other callback types, multiple modifyChild() callbacks may be registered and active. In this case, all registered and active callbacks will be called for each child generated, in the order that the callbacks were registered. If a modifyChild() callback returns F, however, indicating that the child should not be generated, the remaining callbacks in the chain will not be called.
There are many different ways in which a modifyChild() callback could be used in a simulation; see the recipes in chapter 12 for illustrations of the power of this technique. In nonWF models, modifyChild() callbacks are often unnecessary since each generated child is available to the script in the model’s reproduction() callback anyway; but they may be used if desired.
5.13.5 ITEM: 6. recombination() callbacks
Typically, a simulation sets up a recombination map at the beginning of the run with initializeRecombinationRate(), and that map is used for the duration of the run. Less commonly, the recombination map is changed dynamically from generation to generation, with Chromosome’s method setRecombinationRate(); but still, a single recombination map applies for all individuals in a given generation. However, in unusual circumstances a simulation may need to modify the way that recombination works on an individual basis; for this, the recombination() callback mechanism is provided. This can be useful for models involving chromosomal inversions that prevent recombination within a region for some individuals, for example, or for models of the evolution of recombination.
A recombination() callback is defined with a syntax much like that of other callbacks:
[id] [gen1 [: gen2]] recombination([<subpop-id>]) { ... }
The recombination() callback will be called during the generation of every gamete during the generation(s) in which it is active. It may optionally be restricted to apply only to gametes generated by parents in a specified subpopulation, using the <subpop-id> specifier.
When a recombination() callback is called, a parent has already been chosen to generate a gamete, and candidate recombination breakpoints for use in recombining the parental genomes have been drawn. The genomes of the focal parent are provided to the callback, as is the focal parent itself (as an Individual object) and the subpopulation in which it resides. Furthermore, the proposed breakpoints are provided to the callback. The callback may modify these breakpoints in order to change the breakpoints used, in which case it must return T to indicate that changes were made, or it may leave the proposed breakpoints unmodified, in which case it must return F. (The behavior of SLiM is undefined if the callback returns the wrong logical value.)
In addition to the standard SLiM globals, then, a recombination() callback is supplied with additional information passed through “pseudo-parameters”:
individual The focal parent that is generating a gamete
genome1 One genome of the focal parent; this is the initial copy strand
genome2 The other genome of the focal parent
subpop The subpopulation to which the focal parent belongs
breakpoints An integer vector of crossover breakpoints
These may be used in the recombination() callback to determine the final recombination breakpoints used by SLiM. If values are set into breakpoints, the new values must be of type integer. If breakpoints is modified by the callback, T should be returned, otherwise F should be returned (this is a speed optimization, so that SLiM does not have to spend time checking for changes when no changes have been made).
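As an illustration of the T/F return protocol, here is a minimal sketch of a callback for a chromosomal inversion. It assumes, hypothetically, that mutation type m2 marks the inversion with a marker mutation at position 25000, and that the inverted region spans positions 25000 to 75000; none of these values come from SLiM itself.

```eidos
recombination() {
	// Assumption: m2 at position 25000 marks the inversion (hypothetical model setup).
	// Parents homozygous for either arrangement recombine normally.
	if (genome1.containsMarkerMutation(m2, 25000) ==
		genome2.containsMarkerMutation(m2, 25000))
		return F;	// breakpoints left untouched

	// Heterozygote: find proposed crossovers that fall inside the inversion.
	inInv = (breakpoints > 25000) & (breakpoints <= 75000);
	if (!any(inInv))
		return F;	// nothing to remove, so still F

	// Remove the offending crossovers and report the change.
	breakpoints = breakpoints[!inInv];
	return T;
}
```

Every path that leaves breakpoints unmodified returns F, and the only path that assigns to breakpoints returns T.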
The positions specified in breakpoints mean that a crossover will occur immediately before the specified base position (between the preceding base and the specified base, in other words). The genome specified by genome1 will be used as the initial copy strand when SLiM executes the recombination; this cannot presently be changed by the callback.
In this design, the recombination callback does not specify a custom recombination map. Instead, the callback can add or remove breakpoints at specific locations. To implement a chromosomal inversion, for example, if the parent is heterozygous for the inversion mutation then crossovers within the inversion region are removed by the callback. As another example, to implement a model of the evolution of the overall recombination rate, a model could (1) set the global recombination rate to the highest rate attainable in the simulation, (2) for each individual, within the recombination() callback, calculate the fraction of that maximum rate that the focal individual would experience based upon its genetics, and (3) probabilistically remove proposed crossover points based upon random uniform draws compared to that threshold fraction, thus achieving the individual effective recombination rate desired. Other similar treatments could actually vary the effective recombination map, not just the overall rate, by removing proposed crossovers with probabilities that depend upon their position, allowing for the evolution of localized recombination hot-spots and cold-spots. Crossovers may also be added, not just removed, by recombination() callbacks.
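The three-step scheme for an evolving overall recombination rate might be sketched as follows. This is only an outline under stated assumptions: the model is assumed to use simple crossovers (not the “DSB” model), initializeRecombinationRate() is assumed to have been called with the maximum attainable rate, and each individual's tagF property is assumed to have been set elsewhere in the script to the fraction (0.0 to 1.0) of that maximum that the individual should experience.

```eidos
recombination() {
	n = size(breakpoints);
	if (n == 0)
		return F;	// no crossovers were proposed

	// Assumption: individual.tagF holds this parent's fraction of the maximum rate.
	// Keep each proposed crossover with that probability.
	keep = (runif(n) < individual.tagF);
	if (all(keep))
		return F;	// nothing removed

	breakpoints = breakpoints[keep];
	return T;
}
```

Making the keep probability depend on the values in breakpoints, rather than on a single per-individual fraction, would give the position-dependent variant described above.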
In SLiM 3.3 the recombination model in SLiM was redesigned. This required a corresponding redesign of recombination() callbacks. In particular, the gcStarts and gcEnds pseudo-parameters to recombination() callbacks were removed. In the present design, the callback receives “crossover breakpoints” information only, in the breakpoints pseudo-parameter; it receives no information about gene conversion. However, recombination() callbacks can still be used with the “DSB” recombination model; at the point when the callback is called, the pattern of gene conversion tracts will have been simplified down to a vector of crossover breakpoints. “Complex” gene conversion tracts, however, involving heteroduplex mismatch repair, are not compatible with recombination() callbacks, since there is presently no way for them to be specified to the callback.
Note that the positions in breakpoints are not, in the general case, guaranteed to be sorted or uniqued; in other words, positions may appear out of order, and the same position may appear more than once. After all recombination() callbacks have completed, the positions from breakpoints will be sorted, uniqued, and used as the crossover points in generating the prospective gamete genome. The essential point here is that if the same position occurs more than once, across breakpoints, the multiple occurrences of the position do not cancel; SLiM does not cross over and then “cross back over” given a pair of identical positions. Instead, the multiple occurrences of the position will simply be uniqued down to a single occurrence.
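A short sketch of adding a crossover shows why the uniquing behavior matters. The position 50000 and the probability 0.1 are hypothetical values chosen for illustration, and the position is assumed to lie within the chromosome.

```eidos
recombination() {
	// Hypothetical hotspot: force a crossover before position 50000 in 10% of gametes.
	if (runif(1) < 0.1) {
		// If 50000 was already proposed, the duplicate is uniqued away later;
		// the two occurrences do not cancel each other.
		breakpoints = c(breakpoints, 50000);
		return T;
	}
	return F;
}
```

The callback does not need to sort the vector or check for an existing breakpoint at the same position, since SLiM sorts and uniques the positions after all callbacks have completed.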
As with the other callback types, multiple recombination() callbacks may be registered and active. In this case, all registered and active callbacks will be called for each gamete generated, in the order that the callbacks were registered.
5.13.6 ITEM: 7. interaction() callbacks
The InteractionType class provides various built-in interaction functions that translate from distances to interaction strengths. However, it may sometimes be useful to define a custom function for that purpose; for that reason, SLiM allows interaction() callbacks to be defined that modify the standard interaction strength calculated by InteractionType. In particular, this mechanism allows the strength of interactions to depend upon not only the distance between individuals, but also the genetics and other state of the individuals, the spatial position of the individuals, and other environmental variables.
An interaction() callback is called by SLiM when it is determining the strength of the interaction between one individual (the receiver of the interaction) and another individual (the exerter of the interaction). This may occur when the evaluate() method of InteractionType is called, if immediate evaluation is requested; or it may occur at some point after evaluation of the InteractionType, when the interaction strength is needed, if immediate evaluation was not requested. This means that interaction() callbacks may be called at a variety of points in the generation cycle, unlike the other callback types in SLiM, which are each called at a specific point. If you write an interaction() callback, you need to take this into account; assuming that the generation cycle is at a particular stage, or even that the generation count is the same as it was when evaluate() was called, may be dangerous.
When an interaction strength is needed, the first thing SLiM does is calculate the default interaction strength using the interaction function that has been defined for the InteractionType. If the receiver is the same as the exerter, the interaction strength is always zero; and in spatial simulations if the distance between the receiver and the exerter is greater than the maximum distance set for the InteractionType, the interaction strength is also always zero. In these cases, interaction() callbacks will not be called, and there is no way to redefine these interaction strengths.
Otherwise, SLiM will then call interaction() callbacks that apply to the interaction type and subpopulation for the interaction being evaluated. An interaction() callback is defined with a variation of the syntax used for other callbacks:
[id] [gen1 [: gen2]] interaction(<int-type-id> [, <subpop-id>]) { ... }
For example, if the callback were defined as:
1000:2000 interaction(i2, p3) { return 1.0; }
then an interaction strength of 1.0 would be used for all interactions of interaction type i2 in subpopulation p3 from generation 1000 to generation 2000.
In addition to the standard SLiM globals, an interaction() callback is supplied with some additional information passed through “pseudo-parameters”:
distance The distance from receiver to exerter, in spatial simulations; NAN otherwise
strength The default interaction strength calculated by the interaction function
receiver The individual receiving the interaction (an object of class Individual)
exerter The individual exerting the interaction (an object of class Individual)
subpop The subpopulation in which the receiver and exerter live
These may be used in the interaction() callback to compute an interaction strength. To simply use the default interaction strength that SLiM would use if a callback had not been defined for interaction type i1, for example, you could do this:
interaction(i1) {
return strength;
}
Usually an interaction() callback will modify that default strength based upon factors such as the genetics of the receiver and/or the exerter, the spatial positions of the two individuals, or some other simulation state. Any finite float value greater than or equal to 0.0 may be returned. The value returned will be cached by SLiM; if the interaction strength between the same two individuals is needed again later, the interaction() callback will not be called again until the interaction is next evaluated (something to keep in mind if the interaction strength includes a stochastic component).
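For instance, a callback might weaken the default strength as the two individuals become more dissimilar in some phenotype. The sketch below assumes, hypothetically, that the script stores each individual's phenotype in its tagF property, and the width constant 0.02 is arbitrary.

```eidos
interaction(i1) {
	// Assumption: tagF holds a phenotype value set elsewhere in the script.
	d = receiver.tagF - exerter.tagF;

	// Scale the default strength by a Gaussian function of the phenotypic difference;
	// the multiplier lies in (0, 1], so the result stays finite and non-negative.
	return strength * exp(-(d^2) / 0.02);
}
```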
More than one interaction() callback may be defined to operate in the same generation. As with other callbacks, multiple callbacks will be called in the order in which they were defined in the input file. Furthermore, each callback will be given the strength value returned by the previous callback – so the value of strength is not necessarily the default value, in fact, but is the result of all previous interaction() callbacks for the interaction in question. In this way, the effects of multiple callbacks can “stack”.
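A minimal sketch of stacking, with arbitrary illustrative constants:

```eidos
// Defined first: receives the default strength from the interaction function.
interaction(i1) {
	return strength * 0.5;
}

// Defined second: here strength is the value returned by the callback above.
interaction(i1) {
	return strength + 0.1;
}
```

With both callbacks active, the final strength for an interaction with default strength s would be 0.5 * s + 0.1.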
The interaction() callback mechanism is extremely powerful and flexible, allowing any sort of user-defined interactions whatsoever to be queried dynamically using the methods of InteractionType. However, in the general case a simulation may call for the evaluation of the interaction strength between each individual and every other individual, making the computation of the full interaction network an O(N²) problem. Since interaction() callbacks may be called for each of those N² interaction evaluations, they can slow down a simulation considerably, so it is recommended that they be used sparingly. This is the reason that the various interaction functions of InteractionType were provided; when an interaction does not depend upon individual state, the intention is to avoid the necessity of an interaction() callback altogether. Furthermore, constraining the number of cases in which interaction strengths need to be calculated – using a short maximum interaction distance, querying the nearest neighbors of the focal individual rather than querying all possible interactions with that individual, and specifying the reciprocality and sex segregation of the InteractionType, for example – may greatly decrease the computational overhead of interaction evaluation.
5.13.7 ITEM: 8. reproduction() callbacks
In WF models (the default model type in SLiM), the SLiM core manages the reproduction of individuals in each generation. In nonWF models, however, reproduction is managed by the model script, in reproduction() callbacks. These callbacks may only be defined in nonWF models.
A reproduction() callback is defined with a syntax much like that of other callbacks:
[id] [gen1 [: gen2]] reproduction([<subpop-id> [, <sex>]]) { ... }
The reproduction() callback will be called once for each individual during the generation(s) in which it is active. It may optionally be restricted to apply only to individuals in a specified subpopulation, using the <subpop-id> specifier; this may be a subpopulation specifier such as p1, or NULL indicating no restriction. It may also optionally be restricted to apply only to individuals of a specified sex (in sexual models), using the <sex> specifier; this may be "M" or "F", or NULL indicating no restriction.
When a reproduction() callback is called, SLiM’s expectation is that the callback will trigger the reproduction of a focal individual by making method calls to add new offspring individuals. Typically the offspring added are the offspring of the focal individual, and typically they are added to the subpopulation to which the focal individual belongs, but neither of these is required; a reproduction() callback may add offspring generated by any parent(s), to any subpopulation. The focal individual is provided to the callback (as an Individual object), as are its genomes and the subpopulation in which it resides.
A common alternative pattern is for a reproduction() callback to ignore the focal individual and generate all of the offspring for the current generation, from all parents. The callback then sets self.active to 0, preventing itself from being called again in the current generation; this callback design therefore executes once per generation. This can be useful if individuals influence each other’s offspring generation (as in a monogamous-mating model, for example); it can also simply be more efficient when producing offspring in bulk.
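This once-per-generation pattern might look like the following sketch of monogamous mating. It assumes a hermaphroditic nonWF model with a single subpopulation p1 containing at least two individuals; the choice of one offspring per pair is arbitrary.

```eidos
reproduction() {
	// Shuffle all individuals of p1 into a random order.
	parents = sample(p1.individuals, p1.individualCount);

	// Pair them off; each pair produces one offspring.
	// With an odd count, the last individual goes unpaired.
	for (i in seq(0, p1.individualCount - 2, by=2))
		p1.addCrossed(parents[i], parents[i + 1]);

	// Prevent further calls for the remaining focal individuals this generation.
	self.active = 0;
}
```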
In addition to the standard SLiM globals, then, a reproduction() callback is supplied with additional information passed through “pseudo-parameters”:
individual The focal individual that is expected to reproduce
genome1 One genome of the focal individual
genome2 The other genome of the focal individual
subpop The subpopulation to which the focal individual belongs
At present, the return value from reproduction() callbacks is not used, and must be void (i.e., a value may not be returned). It is possible that other return values will be defined in future.
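The simplest per-individual design can be sketched as follows, assuming a hermaphroditic nonWF model in which each focal individual produces exactly one offspring with a randomly chosen mate from its own subpopulation.

```eidos
reproduction() {
	// Draw one mate at random from the focal individual's subpopulation.
	// Assumption: selfing is acceptable, since the focal individual may be drawn.
	mate = subpop.sampleIndividuals(1);

	// Add one offspring to the focal individual's subpopulation; no value is returned.
	subpop.addCrossed(individual, mate);
}
```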
It is possible, of course, to perform actions unrelated to reproduction inside reproduction() callbacks, but it is not recommended. The first() event phase of the current generation provides an opportunity for actions immediately before reproduction, and the early() event phase of the current generation provides an opportunity for actions immediately after reproduction, so only actions that are intertwined with reproduction itself should occur in reproduction() callbacks. Besides providing conceptual clarity, following this design principle will also decrease the probability of bugs, since actions that are unrelated to reproduction should not influence or be influenced by the dynamics of reproduction.
As with the other callback types, multiple reproduction() callbacks may be registered and active. In this case, all registered and active callbacks will be called for each individual, in the order that the callbacks were registered.
5.13.8 ITEM: 9. mutation() callbacks
SLiM auto-generates new mutations according to the current mutation rate (or rate map) and the genetic structure defined by genomic elements, their genomic element types, the mutation types those genomic element types draw from, and the distribution of fitness effects defined by those mutation types. In nucleotide-based models, the nucleotide sequence and the mutation matrix also play a role in determining both the rate of mutation and the nucleotide mutated to. In some models it can be desirable to modify these dynamics in some way – altering the selection coefficients of new mutations in some way, changing the mutation type used, dictating the nucleotide to be used, replacing the proposed mutation with a pre-existing mutation at the same position, or even suppressing the proposed mutation altogether. To achieve this, one may define a mutation() callback.
A mutation() callback is defined as:
[id] [gen1 [: gen2]] mutation([<mut-type-id> [, <subpop-id>]]) { ... }
The mutation() callback will be called once for each new auto-generated mutation during the generation(s) in which the callback is active. It may optionally be restricted to apply only to mutations of a particular mutation type, using the <mut-type-id> specifier; this may be a mutation type specifier such as m1, or NULL indicating no restriction. It may also optionally be restricted to individuals generated by a specified subpopulation (usually – see below for discussion), using the <subpop-id> specifier; this should be a subpopulation specifier such as p1.
When a mutation() callback is called, a focal mutation (provided to the callback as an object of type Mutation) has just been created by SLiM, referencing a particular position in a parental genome (also provided, as an object of type Genome). The mutation will not be added to that parental genome; rather, the parental genome is being copied, during reproduction, to make a gamete or an offspring genome, and the mutation is, conceptually, a copying error made during that process. It will be added to the offspring genome that is the end result of the copying process (which may also involve recombination with another genome). At the point that the mutation() callback is called, the offspring genome is not yet created, however, and so it cannot be accessed from within the mutation() callback; the mutation() callback can affect only the mutation itself, not the genome to which the mutation will be added.
In addition to the standard SLiM globals, then, a mutation() callback is supplied with additional information passed through “pseudo-parameters”:
mut The focal mutation that is being modified or reviewed
genome The parental genome that is being copied
element The genomic element that controls the mutation site
originalNuc The nucleotide (0/1/2/3 for A/C/G/T) originally at the mutating position
parent The parent which is generating the offspring genome
subpop The subpopulation to which the parent belongs
The mutation() callback has three possible returns: T, F, or (beginning in SLiM 3.5) a singleton object of type Mutation. A return of T indicates that the proposed mutation should be used in generating the offspring genome (perhaps with modifications made by the callback). Conversely, a return of F indicates that the proposed mutation should be suppressed. If a proposed mutation is suppressed, SLiM will not try again; one fewer mutations will be generated during reproduction than would otherwise have been true. Returning F will therefore mean that the realized mutation rate in the model will be lower than the expected mutation rate. Finally, a return of an object of type Mutation replaces the proposed mutation (mut) with the mutation returned; the offspring genomes being generated will contain the returned mutation. The position of the returned mutation must match that of the proposed mutation. This provides a mechanism for a mutation() callback to make SLiM re-use existing mutations instead of generating new mutations, which can be useful.
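As a sketch of the third kind of return, the callback below re-uses an existing mutation of the same type at the same position when one is present in the simulation, and otherwise accepts the proposed mutation. The restriction to m1 and the re-use policy itself are assumptions made for illustration.

```eidos
mutation(m1) {
	// Look for existing m1 mutations at the focal position.
	existing = sim.mutationsOfType(m1);
	existing = existing[existing.position == mut.position];

	// Re-use the first match; its position equals that of mut, as required.
	if (size(existing) > 0)
		return existing[0];

	// Otherwise accept the proposed mutation unmodified.
	return T;
}
```

Returning F at the end instead of T would suppress every mutation that lacks an existing counterpart, lowering the realized mutation rate as described above.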
The callback may perform a variety of actions related to the generated mutation. The selection coefficient of the mutation can be changed with setSelectionCoeff(), and the mutation type of the mutation can be changed with setMutationType(); the drawSelectionCoefficient() method of MutationType may also be useful here. A tag property value may be set for the mutation, and named values may be attached to the mutation with setValue(). In nucleotide-based models, the nucleotide (or nucleotideValue) property of the mutation may also be changed; note that the original nucleotide at the focal position in the parental genome is provided through originalNuc (it could be retrieved with genome.nucleotides(), but SLiM already has it at hand anyway). All of these modifications to the new mutation may be based upon the state of the parent, including its genetic state, or upon any other model state.
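A brief sketch of such modifications follows, using the Mutation method setSelectionCoeff() and the tag property. The rule applied here (doubling the selection coefficient of m2 mutations that arise in subpopulation p2, and recording the parent's index in the tag) is purely hypothetical.

```eidos
mutation(m2) {
	// Hypothetical rule: mutations arising in p2 are twice as strongly selected.
	if (subpop.id == 2)
		mut.setSelectionCoeff(mut.selectionCoeff * 2.0);

	// Record which parent produced the mutation (index within its subpopulation).
	mut.tag = parent.index;

	// Accept the (possibly modified) mutation.
	return T;
}
```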
It is possible, of course, to do actions unrelated to mutation inside mutation() callbacks, but it is not recommended; first(), early(), and late() events should be used for general-purpose scripting. Besides providing conceptual clarity, following this design principle will also decrease the probability of bugs, since actions that are unrelated to mutation should not influence or be influenced by the dynamics of mutation.
The proposed mutation will not appear in the sim.mutations vector of segregating mutations until it has been added to a genome; it will therefore not be visible in that vector within its own mutation() callback invocation, and indeed, may not be visible in subsequent callbacks during the reproduction generation cycle stage until such time as the offspring individual being generated has been completed. If that offspring is ultimately rejected, in particular by a modifyChild() callback, the proposed mutation may not be used by SLiM at all. It may therefore be unwise to assume, in a mutation() callback, that the focal mutation will ultimately be added to the simulation, depending upon the rest of the model’s script.
There is one subtlety to be mentioned here, having to do with subpopulations. The subpop pseudo-parameter discussed above is always the subpopulation of the parent which possesses the genome that is being copied and is mutating; there is no ambiguity about that whatsoever. The <subpop-id> specified in the mutation() callback declaration, however, is a bit more subtle; above it was said that it restricts the callback “to individuals generated by a specified subpopulation”, and that is usually true but requires some explanation. In WF models, recall that migrants are generated in a source subpopulation and placed in a target subpopulation, as a model of juvenile migration; in that context, the <subpop-id> specifies the source subpopulation to which the mutation() callback will be restricted. In nonWF models, offspring are generated by the add...() family of Subpopulation methods, which can cross individuals from two different subpopulations and place the result in a third target subpopulation; in that context, in general, the <subpop-id> specifies the source subpopulation that is generating the particular gamete that is sustaining a mutation during its production. The exception to this rule is addRecombinant(); since there are four different source subpopulations potentially in play there, it was deemed simpler in that case for the <subpop-id> to specify the target subpopulation to which the mutation() callback will be restricted. If restriction to the source subpopulation is needed with addRecombinant(), the subpop pseudo-parameter may be consulted rather than using <subpop-id>.
Note that mutation() callbacks are only called for mutations that are auto-generated by SLiM, as a consequence of the mutation rate and the genetic structure defined. Mutations that are created in script, using addNewMutation() or addNewDrawnMutation(), will not trigger mutation() callbacks; but of course the script may modify or tailor such added mutations in whatever way is desired, so there is no need for callbacks in that situation.
As with the other callback types, multiple mutation() callbacks may be registered and active. In this case, all registered and active callbacks will be called for each generated mutation to which they apply, in the order that the callbacks were registered.
5.13.9 ITEM: 10. survival() callbacks
In nonWF models, a selection phase in the generation cycle results in mortality; individuals survive or die based upon their fitness. In most cases this standard behavior is sufficient; but occasionally it can be useful to observe the survival decisions SLiM makes (to log out information about dying individuals, for example), to modify those decisions (influencing which individuals live and which die, perhaps based upon factors other than genetics), or even to short-circuit mortality completely (moving dead individuals into a “cold storage” subpopulation for later use, perhaps). To accomplish such goals, one can use the survival() callback mechanism to override SLiM’s default behavior. Note that in WF models, since they always model non-overlapping generations, the entire parental generation dies in each generation regardless of fitness; survival() callbacks therefore apply only to nonWF models.
A survival() callback is defined with a syntax much like that of other callbacks:
[id] [gen1 [: gen2]] survival([<subpop-id>]) { ... }
The survival() callback will be called during the selection phase of the generation cycle of nonWF models, during the generation(s) in which it is active. By default it will be called once per individual in the entire population (whether slated for survival or not); it may optionally be restricted to apply only to individuals in a specified subpopulation, using the <subpop-id> specifier.
When a survival() callback is called, a focal individual has already been evaluated by SLiM regarding its survival; a final fitness value for the individual has been calculated, and a random uniform draw in [0,1] has been generated that determines whether the individual is to survive (a draw less than the individual’s fitness) or die (a draw greater than or equal to the individual’s fitness). The focal individual is provided to the callback, as is the subpopulation in which it resides. Furthermore, the preliminary decision (whether the focal individual will survive or not), the focal individual’s fitness, and the random draw made by SLiM to determine survival are also provided to the callback. The callback may return NULL to accept SLiM’s decision, or may return T to indicate that the individual should survive, or F to indicate that it should die, regardless of its fitness and the random deviate drawn. The callback may also return a singleton Subpopulation object to indicate the individual should remain alive but should be moved to that subpopulation (note that calling takeMigrants() during the survival phase is illegal, because SLiM is busy modifying the population’s internal state).
In addition to the standard SLiM globals, then, a survival() callback is supplied with additional information passed through “pseudo-parameters”:
individual The focal individual that will live or die
subpop The subpopulation to which the focal individual belongs
surviving A logical value indicating SLiM’s preliminary decision (T == survival)
fitness The focal individual’s fitness
draw SLiM’s random uniform deviate, which determined the preliminary decision
These may be used in the survival() callback to determine the final decision.
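A purely observational callback, one that logs individuals slated to die but leaves SLiM's decision unchanged, might be sketched like this; the message format is arbitrary.

```eidos
survival() {
	// Log individuals that SLiM has provisionally decided will die.
	if (!surviving)
		catn("generation " + sim.generation + ": individual " + individual.index +
			" in p" + subpop.id + " slated to die (fitness " + fitness +
			", draw " + draw + ")");

	// NULL accepts SLiM's preliminary decision.
	return NULL;
}
```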
While survival() callbacks are still being called, no decisions are put into effect; no individuals actually die, and none are moved to a new Subpopulation if that was requested. In effect, SLiM pre-plans the fate of every individual completely without modifying the model state at all. After all survival() callbacks have completed for every individual, the planned fates for every individual will then be executed, without any opportunity for further intervention through callbacks. It is therefore legal to inspect subpopulations and individuals inside a survival() callback, but it should be understood that previously made decisions about the fates of other individuals will not yet have any visible effect. It is generally a good idea for the decisions rendered by survival() callbacks to be independent anyway, to avoid biases due to order-dependency, since the order in which individuals are evaluated is not guaranteed to be random.
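The “cold storage” idea can be sketched as below, assuming a hypothetical subpopulation p1000 that the script has already created to hold such individuals.

```eidos
survival(p1) {
	// Assumption: p1000 exists and serves as cold storage (hypothetical).
	// Individuals that would have died are moved there instead.
	if (!surviving)
		return p1000;

	// Survivors are left to SLiM's decision.
	return NULL;
}
```

The move itself takes place only after all survival() callbacks have completed, so p1000 will not yet contain the individual when later callbacks run in the same generation. Individuals held in p1000 would face mortality there in subsequent generations, so a real model would need to handle their survival separately.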
It is worth noting that if survival() callbacks are used, “fitness” in the model is then no longer really fitness; the model is making its own decisions about which individuals live and die, and those decisions are the true determinant of fitness in the biological sense. A survival() callback that makes its own decisions regarding survival with no regard for SLiM’s calculated fitness values can completely alter the pattern of selection in a population, rendering all of SLiM’s fitness machinery – selection and dominance coefficients, fitnessScaling values, etc. – completely irrelevant. To avoid highly counterintuitive and confusing effects, it is thus generally a good idea to use survival() callbacks only when it is strictly necessary to achieve a desired outcome.
As with the other callback types, multiple survival() callbacks may be registered and active. In this case, all registered and active callbacks will be called for each individual evaluated, in the order that the callbacks were registered.