Documentation for discsim

This is the documentation for the discsim module, an efficient coalescent simulator for the disc-based extinction/recolonisation continuum model. This module is intended as a companion to the ercs module, which supports simulating the history of a sample under more general models. The ercs module, however, is much less efficient for large neighbourhood sizes, and the discsim module documented here uses a different underlying simulation algorithm. The interfaces of the two modules are very similar (although not identical), so readers unfamiliar with them should first read the documentation for the ercs module. In particular, the history of a sample is represented in the same format, and the ercs.MRCACalculator class can be used to find most recent common ancestors for both simulations.

Besides the specialisation of the simulation to the disc model, there are some other differences between the simulations. Firstly, discsim provides access to the state of the sample during simulations, allowing us to simulate the distribution of the location of ancestors. Secondly, it supports simulating the history of a sample in one or two dimensions, which is achieved by simply providing a 1D or 2D sample of locations. Finally, discsim allows us to simulate either the genetic ancestors of the sample and their history (as before) or the pedigree ancestors of the sample.

discsim – Module reference

discsim.Simulator

class discsim.Simulator(torus_diameter, simulate_pedigree=False)

Simulate the extinction/recolonisation continuum disc model in one or two dimensions, tracking either the genetic ancestors (and their genealogies at multiple loci) or the pedigree ancestors. If simulate_pedigree is True, simulate the locations of the pedigree ancestors of the sample; otherwise, simulate the locations and ancestry of the genetic ancestors of the sample over multiple loci.

sample

The locations of individuals at the beginning of the simulation. This must be either a list of 2-tuples or a list of numbers describing locations within the space defined by the torus. If a list of numbers is provided, the simulation is performed in a 1D environment; if a list of tuples is provided, the simulation occurs in 2D. Dimensions cannot be mixed. The zero’th element of the list must be None.

Default value: None.
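
The format rules above can be checked before a sample is assigned. The following is a minimal sketch, not part of discsim: the helper name make_sample is hypothetical, and the range check assumes that "within the space defined by the torus" means every coordinate lies in the half-open interval [0, torus_diameter).

```python
from numbers import Real


def make_sample(locations, torus_diameter):
    """Return a sample list with the required leading None.

    Hypothetical helper, not part of discsim. Assumes valid
    coordinates lie in the half-open interval [0, torus_diameter).
    """
    locations = list(locations)
    if not locations:
        raise ValueError("sample must contain at least one location")
    # A sample is either all 2-tuples (2D) or all plain numbers (1D).
    is_tuple = [isinstance(loc, tuple) for loc in locations]
    if any(is_tuple) and not all(is_tuple):
        raise ValueError("cannot mix 1D and 2D locations")
    for loc in locations:
        if isinstance(loc, tuple):
            if len(loc) != 2:
                raise ValueError("2D locations must be 2-tuples: %r" % (loc,))
            coords = loc
        else:
            coords = (loc,)
        for c in coords:
            if not isinstance(c, Real) or not 0 <= c < torus_diameter:
                raise ValueError("coordinate outside torus: %r" % (loc,))
    # The zero'th element of the list must be None.
    return [None] + locations


sample_2d = make_sample([(3, 2), (6, 4), (7, 0)], torus_diameter=10)
sample_1d = make_sample([1.0, 2.5, 7.0], torus_diameter=10)
```

The resulting lists could then be assigned to the sample attribute of a simulator created with the same torus_diameter.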

event_classes

The event classes to simulate. This must be a list of ercs.DiscEventClass instances. There must be at least one event class specified. The underlying algorithm will take the first event class and only simulate events in which there is a high probability of a lineage jumping. Subsequent event classes are not treated in any special way, and are simulated directly without conditioning.

Default value: None.

torus_diameter

The diameter of the torus we are simulating on. This defines the size of the 1D or 2D space that lineages can move around in.

Default value: Specified at instantiation time.

num_parents

The number of parents in each event. For a single locus simulation there must be at least one parent and for multi-locus simulations at least two.

Default value: 1 if the simulation is a single-locus genetic simulation; otherwise 2.

recombination_probability

The probability of recombination between adjacent loci at an event.

Default value: 0.5 (free recombination).

num_loci

The number of loci we simulate the history of in a genetic simulation.

Default value: 1.

pixel_size

The length of one edge of a pixel. This is an important performance tuning factor. For a single locus simulation, this should be around 2.25 for best performance; for multilocus (or pedigree) simulations, this should be less than 2.25. For pixel_size less than 1, memory requirements increase sharply and may not justify any increase in speed of simulation.

For 1D simulations, this must be equal to 2.

There is a strict requirement that torus_diameter / pixel_size must be exactly an integer. Floating point division does not always produce exact values, so a pixel size that looks valid in decimal can still fail this requirement. The best solution is to use pixel sizes that are exactly representable in binary (such as 2.125 or 1.5).

Default value: 2
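
The exactness requirement on torus_diameter / pixel_size can be tested ahead of time. This is a minimal sketch, assuming a conservative check with exact rational arithmetic; the helper name num_pixels_per_side is hypothetical, and the test the library itself applies may differ.

```python
from fractions import Fraction


def num_pixels_per_side(torus_diameter, pixel_size):
    """Return torus_diameter / pixel_size if it is exactly an integer.

    Hypothetical helper, not part of discsim. Fraction(x) converts a
    float to the exact binary value it stores, so no rounding is hidden.
    """
    ratio = Fraction(torus_diameter) / Fraction(pixel_size)
    if ratio.denominator != 1:
        raise ValueError(
            "torus_diameter / pixel_size is not an exact integer: "
            "%r / %r" % (torus_diameter, pixel_size))
    return ratio.numerator


pixels_a = num_pixels_per_side(34, 2.125)  # 2.125 is exactly 17/8
pixels_b = num_pixels_per_side(9, 1.5)     # 1.5 is exactly 3/2
```

A value such as 2.1 has no exact binary representation, so num_pixels_per_side(21, 2.1) raises ValueError even though 21 / 2.1 looks like 10 in decimal.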

random_seed

The random seed for the current simulation run.

Default value: A random value chosen by the standard Python random number generator.

max_population_size

The maximum number of extant individuals in the simulation. If the number of individuals we are tracking exceeds this limit, the simulation aborts and raises a _discsim.LibraryError.

Default value: 1000

max_occupancy

The maximum number of individuals that may occupy a simulation pixel. This is defined as the number of individuals that are within distance r of the pixel itself (not just those within the pixel). If this value is exceeded, the simulation aborts and raises a _discsim.LibraryError.

Default value: N times the catchment area of a pixel, where N is the neighbourhood size. This should be sufficient for most purposes, but may need to be increased for very high sampling densities.

run(until=None)

Runs the simulation until the sample has completely coalesced or the specified time is exceeded. If until is not specified, the simulation runs until complete coalescence. Returns True if the sample coalesced, and False otherwise.

Parameters: until (float) – the time to simulate to.
Returns: True if the sample has completely coalesced; False otherwise.
Return type: Boolean
Raises: _discsim.LibraryError when the C library encounters an error.

get_population()

Returns the current state of the population. For a pedigree simulation, this returns the current locations of all individuals; in a genetic simulation, this also returns the ancestral material mappings for each individual.

Returns: the current state of the population.
Return type: A list describing the state of each extant ancestor. For a pedigree simulation, this is a list of locations. For a genetic simulation, this is a list of tuples (x, a), where x is the location of the ancestor and a is its ancestry. The ancestry is a dictionary mapping a locus to the node it occupies in the genealogy for that locus.
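
Because a 2D pedigree location and a genetic (x, a) entry are both 2-tuples, code that handles either kind of simulation has to look at the second element. The sketch below relies only on the two formats described above; the helper names and the hand-written example data (including the locus and node numbers) are hypothetical.

```python
def ancestor_locations(population):
    """Return only the locations from a get_population() result.

    Hypothetical helper. A genetic entry is a tuple (x, a) whose second
    element is a dict; anything else is treated as a bare location.
    """
    locations = []
    for entry in population:
        is_genetic = (
            isinstance(entry, tuple)
            and len(entry) == 2
            and isinstance(entry[1], dict))
        locations.append(entry[0] if is_genetic else entry)
    return locations


def lineages_per_locus(population):
    """Count the ancestors that have an ancestry entry for each locus.

    Only meaningful for the genetic format, a list of (x, a) tuples.
    """
    counts = {}
    for _, ancestry in population:
        for locus in ancestry:
            counts[locus] = counts.get(locus, 0) + 1
    return counts


# Invented data in the genetic format, for illustration only.
genetic = [((3.0, 2.0), {0: 1, 1: 1}), ((6.5, 4.0), {0: 4}), ((7.0, 0.5), {1: 5})]
# Invented data in the 2D pedigree format.
pedigree = [(3.0, 2.0), (6.5, 4.0)]

genetic_locations = ancestor_locations(genetic)
pedigree_locations = ancestor_locations(pedigree)
locus_counts = lineages_per_locus(genetic)
```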

get_history()

Returns the history of the current ancestral population. This is not defined for a pedigree simulation.

Returns: the simulated history of the sample, (pi, tau).
Return type: a tuple (pi, tau); pi is a list of lists of integers, and tau is a list of lists of doubles.
Raises: NotImplementedError if called for a pedigree simulation.
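
The history format is specified in the ercs documentation rather than here, so the following is only a sketch under stated assumptions: pi[l][j] is taken to be the parent of node j at locus l, a parent of 0 marks a root, tau[l][j] is the time at which node j arose, and index 0 of each list is unused. The helper name mrca_time and the example lists are hypothetical; ercs.MRCACalculator remains the supported tool for this task.

```python
def mrca_time(pi_locus, tau_locus, a, b):
    """Return the time of the most recent common ancestor of nodes a and b.

    Hypothetical helper. Assumes pi_locus[j] is the parent of node j,
    0 marks a root, and index 0 is unused. Returns None if the two
    nodes have not coalesced.
    """
    # Collect a and all of its ancestors up to the root.
    ancestors_of_a = set()
    node = a
    while node != 0:
        ancestors_of_a.add(node)
        node = pi_locus[node]
    # Walk up from b until a shared node is found.
    node = b
    while node != 0:
        if node in ancestors_of_a:
            return tau_locus[node]
        node = pi_locus[node]
    return None


# Invented single-locus history: nodes 1 and 2 merge in node 4,
# then nodes 3 and 4 merge in node 5.
pi = [[-1, 4, 4, 5, 5, 0]]
tau = [[-1, 0.0, 0.0, 0.0, 1.5, 4.0]]

t_12 = mrca_time(pi[0], tau[0], 1, 2)
t_13 = mrca_time(pi[0], tau[0], 1, 3)
```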

reset()

Resets the simulation so that we can perform more replicates. This must be called if any attributes of the simulation are changed; otherwise, these changes will have no effect.
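
A replicate loop built on reset() might look like the sketch below. It uses only the attributes and methods documented on this page; the function name run_replicates and its first_seed parameter are hypothetical, and sim is assumed to be a discsim.Simulator whose sample and event_classes have already been set.

```python
def run_replicates(sim, num_replicates, first_seed=1):
    """Run independent replicates on an already configured simulator.

    Hypothetical helper. Returns one (coalesced, time) pair per
    replicate, where time is the simulator time when run() returned.
    """
    results = []
    for j in range(num_replicates):
        # Changing an attribute has no effect until reset() is called.
        sim.random_seed = first_seed + j
        sim.reset()
        coalesced = sim.run()
        results.append((coalesced, sim.get_time()))
    return results
```

Using consecutive seeds keeps the replicates reproducible; any other scheme for choosing distinct seeds would work equally well.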

get_time()

Returns the current time of the simulator.

get_num_reproduction_events()

Returns the number of reproduction events since the beginning of the simulation.
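
The methods above can be combined to observe the sample while a simulation is in progress. This is a minimal sketch, assuming that successive run(until=...) calls with increasing times continue the same simulation; the function name snapshot_population is hypothetical, and sim is assumed to be a configured discsim.Simulator.

```python
def snapshot_population(sim, checkpoints):
    """Record the simulator state at increasing time checkpoints.

    Hypothetical helper. Stops early once run() reports that the
    sample has completely coalesced.
    """
    snapshots = []
    for t in sorted(checkpoints):
        coalesced = sim.run(until=t)
        snapshots.append({
            "time": sim.get_time(),
            "events": sim.get_num_reproduction_events(),
            "population": sim.get_population(),
        })
        if coalesced:
            break
    return snapshots
```

Each snapshot holds the population in whichever format get_population() returns for the simulation type, so the same loop serves pedigree and genetic simulations.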
