Data and models from multi-model inference of non-random mating from an information theoretic approach

This is a co-submission with Multi-model inference of non-random mating from an information theoretic approach [1]. These data correspond to the complete simulated data set together with the set of models defined for analysing the data. The simulated data set was obtained using the program MateSim [2]. The simulated cases correspond to one-sex competition and mate choice models. For each simulation run, the population frequencies (premating individuals) and the sample of 500 mating pairs were generated randomly for a hypothetical trait with two classes in each sex. Some data sets represent species with large population size (n = 10 000), for which the mating process was represented as sampling with replacement and the population frequencies were constant over the mating season. The minimum phenotype frequency (MPF) allowed was 0.1. Five different model cases were simulated, namely random mating, female competition with mate choice (with independent or compound parameters) and male competition with mate choice (with independent or compound parameters). Each case was simulated 1000 times. Other data sets represent monogamous species (with large or small population size), for which the mating process was without replacement (from the point of view of the available phenotypes). These data sets were used to test the performance of the multi-model inference methodology proposed in [1]. The data may be useful for testing new or existing statistics for measuring sexual selection or assortative mating patterns.


DOI of original article: https://doi.org/10.1016/j.tpb.2019.11.002.
E-mail address: acraaj@uvigo.es.

Specifications Table
Subject: Ecology, Evolution, Behaviour and Systematics
Specific subject area: Mate competition and mate choice generate non-random mating that can be detected as sexual selection and/or assortative mating in the mating tables.
Type of data:

Value of the Data
The data represent the effect of mate competition and mate choice for discrete traits under different controlled conditions.
The data are of interest to researchers working on sexual selection and/or assortative mating.
The data can be used to test different methods for estimating sexual selection and assortative mating, or to study how different behavioural and demographic conditions affect the pattern in the data.

Data description
There are two different types of data: the first type corresponds to simulated data and the second type corresponds to the description of various sets of mating models.

1) Simulated data
In the Simulated_Data zip file there are 17 data folders corresponding to different mating scenarios. Each folder includes 1000 files that correspond to different Monte Carlo runs of the same mating scenario.
The name of each folder gives the full information about the mating scenario. For example:
i) Out_MateSim_0_a1_driftnonU_sample500_N10000_MPF0.1: Corresponds to mating with replacement (polygamous populations, or monogamous populations with large population size) and a random mating model (model 0 with a = 1). The population frequencies are different between runs (drift) and non-uniform (nonU). The sample size is 500 matings sampled from a population of size N = 10 000 (sex ratio is 1:1 by default) and the minimum phenotype frequency (MPF) in the population was 0.1.
ii) Out_MateSim_6_ac2-3_driftnonU_sample500_N10000_MPF0.1: Stands for mating with replacement and MateSim program model 6, which implies female competition with parameter a = 2 and mate choice with parameter c = 3. The population frequencies are different between runs (drift) and non-uniform (nonU). The sample size is 500 matings from a population of size N = 10 000 individuals (sex ratio is 1:1 by default), and the minimum phenotype frequency (MPF) in the population was 0.1.
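The naming convention described for the two example folders can be decoded programmatically. The following is a minimal sketch, assuming that every folder name follows the same pattern as those two examples; the function name parse_folder_name and the dictionary keys are hypothetical, and folders for other scenarios may use tags that this pattern does not cover.

```python
import re

# Pattern inferred only from the two folder names quoted in the text
# (assumption: other folders may deviate from it).
FOLDER_RE = re.compile(
    r"^Out_MateSim_(?P<model>\d+)_"
    r"(?P<letters>[a-z]+)(?P<values>\d+(?:\.\d+)?(?:-\d+(?:\.\d+)?)*)_"
    r"(?P<freq>[A-Za-z]+)_"
    r"sample(?P<sample>\d+)_N(?P<N>\d+)_MPF(?P<mpf>\d+(?:\.\d+)?)$"
)


def parse_folder_name(name: str) -> dict:
    """Split a folder name such as Out_MateSim_6_ac2-3_... into its fields."""
    match = FOLDER_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised folder name: {name}")
    letters = list(match["letters"])                      # e.g. ['a', 'c']
    values = [float(v) for v in match["values"].split("-")]  # e.g. [2.0, 3.0]
    if len(letters) != len(values):
        raise ValueError(f"parameter names and values do not pair up: {name}")
    freq_tag = match["freq"]                              # e.g. 'driftnonU'
    return {
        "model": int(match["model"]),
        "params": dict(zip(letters, values)),
        "drift": freq_tag.startswith("drift"),
        "non_uniform": "nonU" in freq_tag,
        "sample_size": int(match["sample"]),
        "population_size": int(match["N"]),
        "mpf": float(match["mpf"]),
    }


info = parse_folder_name("Out_MateSim_6_ac2-3_driftnonU_sample500_N10000_MPF0.1")
print(info["model"], info["params"], info["sample_size"], info["population_size"])
```

For the random mating folder, the same call returns model 0 with the single parameter a = 1.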
The 17 data folders correspond to:
Five models (M0, M6, M7, M8 and M9) with sample size 500 and N = 10 000. Item i) above corresponds to M0 while item ii) corresponds to M6; the cases M7, M8 and M9 are named in the same way with the model number changed, i.e. Out_MateSim_7_ instead of Out_MateSim_6_, and so on. Model M7 is the same as M6 but for males (male competition plus mate choice). Similarly, M9 is the same as M8 but for males.
Three more scenarios with models M0, M6 and M8 as before but with sample size 50 instead of 500.
Scenarios in which the mating is changed to mass-encounter without replacement, again with models M0, M6 and M8. The cases for M6 and M8 with sample size 500 were noted in items iii) and iv). There were three more models with sample size 50. Thus, at this point there are 14 different scenarios.
The last three, up to 17, correspond to the mass-encounter scenario for models M0, M6 and M8 with population size N = 200 and sampling of the whole population (sample = 100 pairs).
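The count of 17 folders can be checked with a short tally. This is only a sketch of the arithmetic in the text: the group labels are paraphrases, and the models assigned to the mass-encounter group with sample size 50 are an assumption, since the text states only that there were three of them.

```python
from itertools import accumulate

# Scenario groups as described in the text; labels are paraphrased.
scenario_groups = [
    ("with replacement, sample 500, N 10000", ["M0", "M6", "M7", "M8", "M9"]),
    ("with replacement, sample 50", ["M0", "M6", "M8"]),
    ("mass-encounter without replacement, sample 500", ["M0", "M6", "M8"]),
    # Assumption: the text gives the number of models (three) but not which ones.
    ("mass-encounter without replacement, sample 50", ["M0", "M6", "M8"]),
    ("mass-encounter, N 200, sample 100 pairs", ["M0", "M6", "M8"]),
]

# Running total of scenarios after each group is added.
running_totals = list(accumulate(len(models) for _, models in scenario_groups))
total_folders = running_totals[-1]

for (label, models), subtotal in zip(scenario_groups, running_totals):
    print(f"{label}: {len(models)} scenarios, running total {subtotal}")
```

The running total reaches 14 after the fourth group and 17 after the last one, in agreement with the text.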

2) Model set
The second kind of data corresponds to the sets of models assayed with the program InfoMating [1]. There are two different sets of models in these data:

2-1) Simulation model set
The file Simulations_Model_Set corresponds to the set of models assayed with the simulated data set.

2-2) Empirical model set
The file Empirical_Model_Set corresponds to the set of models assayed with the empirical data from Ref. [4].

Experimental design, materials, and methods
1) The simulated data were obtained with the program MateSim [2] under the specifications given in the Data description section. Specifically, the command line arguments used to obtain the 1000 files described in item i) of the Data description section (folder Out_MateSim_0_a1_driftnonU_sample500_N10000_MPF0.1) were:
-continuous 0 -numfiles 1000 -femtypes 2 -maletypes 2 -N 10000 -model 0 -matings 500 -uniform 0 -drift 1 -selfactor 2 -choicefactor 3
Note that the values of the competition factor (selfactor) and the choice factor (choicefactor) are irrelevant here, since model 0 implies random mating.
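The argument list quoted for model 0 can be assembled from its parts, which makes it easier to vary one setting at a time. The sketch below uses only the flags that appear in that command line, with the population size written without a space; the helper name matesim_args is hypothetical, and the assumption that other models are obtained by changing only -model (and, where relevant, -matings or -N) is not confirmed by the text.

```python
from typing import List


def matesim_args(model: int = 0, numfiles: int = 1000, pop_size: int = 10000,
                 matings: int = 500, selfactor: int = 2,
                 choicefactor: int = 3) -> List[str]:
    """Return MateSim arguments in the order given in the text for model 0."""
    return [
        "-continuous", "0",          # discrete trait
        "-numfiles", str(numfiles),  # number of Monte Carlo runs
        "-femtypes", "2",            # two phenotypic classes in females
        "-maletypes", "2",           # two phenotypic classes in males
        "-N", str(pop_size),
        "-model", str(model),
        "-matings", str(matings),    # sampled mating pairs
        "-uniform", "0",             # non-uniform population frequencies
        "-drift", "1",               # frequencies differ between runs
        "-selfactor", str(selfactor),
        "-choicefactor", str(choicefactor),
    ]


# Reproduces the argument string quoted for the random mating folder.
print(" ".join(matesim_args(model=0)))
```

The list form can be passed to a process launcher after prepending the path to the MateSim executable, which depends on the local installation.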

2) Model sets
The model sets were generated automatically by the program InfoMating [1]. For example, to generate all the models available in the program for the first data file in the Out_MateSim_6_ac2-3_driftnonU_sample500_N10000_MPF0.1 folder, the command line was:
InfoMating -input MateSim_D_M6_ac2-3_driftnonU10000_1.txt -All 1
where 'MateSim_D_M6_ac2-3_driftnonU10000_1.txt' is the name of the first file in the folder and the tag -All 1 indicates that all available models are to be tested.
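The same command can be prepared for every file in a folder. The following is a minimal sketch that only builds the command lists and does not launch anything; the helper names are hypothetical, the executable is assumed to be reachable as InfoMating, and the assumption that all data files in a folder end in .txt is based solely on the one file name quoted above.

```python
from pathlib import Path
from typing import Iterable, List


def infomating_commands(file_names: Iterable[str],
                        executable: str = "InfoMating") -> List[List[str]]:
    """Build one 'InfoMating -input <file> -All 1' command per data file."""
    return [[executable, "-input", name, "-All", "1"] for name in file_names]


def commands_for_folder(folder: str) -> List[List[str]]:
    """Same as above for every .txt file found in a folder (assumed extension)."""
    names = sorted(path.name for path in Path(folder).glob("*.txt"))
    return infomating_commands(names)


commands = infomating_commands(["MateSim_D_M6_ac2-3_driftnonU10000_1.txt"])
print(" ".join(commands[0]))
```

Each inner list can be handed to subprocess.run with the data folder as working directory, once the location of the executable is known.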