A Bayesian Approach to Selection and Ranking Procedures: The Unequal Variance Case
Multiple comparison procedures (MCP) allow the analysis of differences between means after the conclusion of the experiment, to detect possible groups in a set of levels of unstructured factors. The MCP and the F-test require that certain assumptions be satisfied: the samples should be randomly and independently selected, the residuals must be normally distributed, and the variances must be homogeneous (RAFTER et al.). Because one or more of these assumptions may be violated for a given set of data, it is important to be aware of how this would impact an inferential procedure.
The insensitivity of a procedure to one or more violations of its underlying assumptions is called its robustness. The first assumption is the least likely to be violated, because it is under the control of the researcher. If it is violated, neither the MCP nor the F-test is robust. Most of the procedures seem to be robust under moderate departures from normality, in that the error rate per experiment will be only slightly higher than specified. Some MCP have been specifically developed for use when the variances are not all equal.
Many of the proposed procedures control the overall risk of type I errors but have little statistical power. These procedures control the experimentwise risk of a type I error at approximately the nominal significance level and have the best statistical power among the alternative solutions. Tamhane proposed two approximate approaches, for multiple comparisons with a control and for all-pairwise comparisons, when the variances are unequal.
Demirhan et al., Ramsey et al. and Ramsey et al. studied the influence of violations of the assumptions of normality and homogeneity of variances on the choice of a multiple comparison procedure. Bootstrap resampling methods can be used in studies of multiple comparisons of the means of one-factor levels under heterogeneity of variances, for normal or non-normal probabilistic models (KESELMAN et al.). An alternative is the use of Bayesian procedures. The proposed alternatives were superior to the other procedures studied, in the simulated examples, because they controlled the type I error and presented greater power.
They also have advantages over conventional tests, in that they require neither homogeneity of variances nor balanced data, which is very significant from a practical point of view. Despite the superiority of the Bayesian alternatives, they had not been implemented, which made them difficult to use.
In addition to being free, R has several packages for the most diverse areas and allows users to create their own functions. It also receives contributions from researchers around the world in the form of packages, which drives the development of the program and allows solutions to real problems to be easily found or created by the researchers themselves. This function allows experimental data to be analyzed under homogeneity or heterogeneity of variances, in models with normal distribution, whether or not the data are balanced.
For this, a sample of size n was generated from the multivariate t distribution, whose parameters are specified by expression (1), where k is the number of population means and the matrix parameter is the covariance matrix of the means. From the posterior multivariate t distribution, k chains of means were generated using the Monte Carlo method, assuming a constant mean vector, that is, a vector whose components are all the same. Thus, without loss of generality, a common value was assumed for all k components, imposing the null hypothesis H0 in the Bayesian method. An indicator function is defined to verify whether the value zero belongs to the interval in the jth Monte Carlo sample unit of the posterior chain (expression 6). After performing the above procedures, it was possible to implement a function named Bayes, which receives the arguments presented in Table 1.
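The indicator-based posterior probability described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names are hypothetical, and the intervals are taken as given inputs because the expression that generates their limits is not reproduced in this text.

```python
def contains_zero(lower, upper):
    """Indicator: 1 if zero lies inside [lower, upper], otherwise 0."""
    return 1 if lower <= 0.0 <= upper else 0


def prob_interval_contains_zero(intervals):
    """Average the indicator over the Monte Carlo sample units.

    `intervals` is a list of (lower, upper) pairs, one per sample unit of
    the posterior chain (assumed to be computed elsewhere).
    """
    if not intervals:
        raise ValueError("at least one interval is required")
    hits = sum(contains_zero(lower, upper) for lower, upper in intervals)
    return hits / len(intervals)
```

A small estimated probability suggests that zero is rarely covered, that is, that the two means in the pair differ. The decision threshold itself is not stated here, so it is left to the caller.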
The data are shown in Table 2.
Table 2. Nitrogen content, in mg, of red clover plants inoculated with combinations of R.
Results
The function Bayes(N, alpha, file) receives three arguments: N is the sample size to be simulated, alpha is the significance level, and file is the file with the experimental data. Initially, the analysis of variance was performed and the assumptions of normality and homoscedasticity were verified.
By means of the qpostbayes function, k chains of means were generated using the Monte Carlo method, imposing the null hypothesis H0 in the Bayesian method. The standardized range (amplitude) of the posterior was generated, under H0, from expressions 2 and 3. Inference about the hypothesis was made through two Bayesian tests, d bayes and pbayes. The differences were then compared with the delta value. Any difference greater than delta is considered significantly different from zero, that is, there is a difference between the treatments of that pair.
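The idea of simulating the range of k means under H0 and comparing observed differences with delta can be sketched in Python. Several things here are assumptions rather than the authors' method: the function names are hypothetical, the scale matrix is taken to be diagonal with one standard error per mean, a single degrees-of-freedom value is shared by all means, the range is left unstandardized (expressions 2 and 3 are not reproduced in this text), and delta is taken as the upper alpha quantile of the simulated ranges.

```python
import random


def simulate_null_ranges(std_errors, df, n_draws, seed=0):
    """Return the range (max - min) of k means in each draw under H0."""
    rng = random.Random(seed)
    ranges = []
    for _ in range(n_draws):
        # A shared chi-square divisor turns independent normals into one
        # multivariate t draw with a zero mean vector (the null hypothesis).
        divisor = (rng.gammavariate(df / 2.0, 2.0) / df) ** 0.5
        means = [se * rng.gauss(0.0, 1.0) / divisor for se in std_errors]
        ranges.append(max(means) - min(means))
    return ranges


def upper_quantile(values, alpha):
    """Empirical (1 - alpha) quantile, used here as the delta threshold."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int((1.0 - alpha) * len(ordered)))
    return ordered[index]


def significant_pairs(sample_means, delta):
    """Pairs (i, j) whose absolute difference in means exceeds delta."""
    k = len(sample_means)
    return [(i, j) for i in range(k) for j in range(i + 1, k)
            if abs(sample_means[i] - sample_means[j]) > delta]
```

For example, `upper_quantile(simulate_null_ranges([1.0, 2.0, 0.5], df=10, n_draws=10000), 0.05)` would give a delta for three treatments with those hypothetical standard errors.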
To test the hypothesis of equality of means with the pbayes test, we used the limits generated from equation 5 and calculated the posterior probability of the intervals containing the value zero, according to equation 6. The power of the pbayes test was also calculated. Figure 1 shows the format in which the data must be entered in R. The interface shown to the user for the analysis of variance and for the normality and homoscedasticity assumptions is presented in Figure 2. The Shapiro-Wilk test indicates that the errors follow a normal or approximately normal distribution (p-value greater than alpha).
It is also observed, by the Bartlett test, that the variances are heterogeneous (p-value less than alpha). The output obtained in R for the d bayes test is shown in Figure 3, and the output of the pbayes test, with its power, in Figure 4. The Tukey test was also used as a procedure to perform multiple comparisons for the data set under analysis. The comparative results between the two proposed tests and the Tukey test are presented in Figure 5. In the example presented, the pbayes test was more sensitive in detecting differences between treatments than the d bayes and Tukey tests.
The d bayes test identified fewer differences than the Tukey test, but it is worth noting that, since the data do not show homogeneity of variances, the Tukey test result is not reliable. Selecting an appropriate multiple comparison procedure requires extensive evaluation of the available information on the status of each test. Information on the importance of type I errors, power, computational simplicity, and so on is extremely important to the selection process. In addition, selecting an appropriate multiple comparison procedure depends on data that conform to the validity assumptions.
Routinely selecting a procedure without careful consideration of the available information and the alternatives can severely reduce the reliability and validity of results. Thus, the implementation of these two tests gives the user another choice. The intention is to incorporate the functions developed into an R package and to test their performance for other designs and analysis schemes.
Properties of sufficiency and statistical tests. Controlling the false discovery rate.
Multiple comparisons, multiple tests, and data dredging: a Bayesian perspective with discussion. Oxford: Oxford University Press, p. Bayesian perspectives on multiple comparisons. Journal of Statistical Planning and Inference, 82, p. A Bayesian multiple comparison procedure for ranking the means of normally distributed data. Journal of Statistical Planning and Inference , p.
Multiple Comparisons Using R. Clustering means in anova by simultaneous testing. Biometrics, v. Multiple comparison procedures under heteroscedasticity. Tamkang Journal of Science and Engineering, Vol. Performance of some multiple comparison tests under heteroscedasticity and dependency. A Bayesian approach to multiple comparisons.
Technometrics, 7. Pairwise multiple comparisons in the unequal variance case. Journal of the American Statistical Association, 75. Of Educational Statistics, 1.
The effects estimated by Bayesian models can be used for inference in hypothesis testing [19]. Under both criteria for the choice of statistical models, AIC and DIC, the MTME models showed the best fit, explaining the genetic variability of the experiment and the selection, considering the genetic progeny and environmental interaction effects (Table 2).
Variance components are the variances associated with the random effects of a model. Knowing them is of great importance in genetics and breeding, since the population and the breeding method to be used depend on information that can be obtained from these components. The solution of mixed-model equations depends on knowledge of the variance and covariance matrix, whose structure is known, but its components often are not. At present, the standard method for the estimation of variance components is REML, developed by Patterson and Thompson [ 32 ].
The BLUP method [33] maximizes the correlation between the predicted and the true genotypic value. Additionally, it is unbiased: the expected predicted genotypic value equals the true genotypic value [61]. Further, BLUP allows for the simultaneous use of several sources of information, as well as information originating from experiments carried out in one or various locations and evaluated in one or various harvests [62]. Although the mixed-model methodology under the frequentist approach has several desirable characteristics [49], the adoption of Bayesian statistical inference for genetic evaluation in the breeding of crop species has proven advantageous.
Bayesian models have been used since [63] and have been further exploited in recent years [23-25, 64, 65] owing to great computational advances and new methodological applications and elucidations. Bayesian analysis is based on knowledge of the posterior distribution of the parameters to be estimated. This allows for the construction of exact credibility intervals for the estimates of random variables, variance components, and fixed effects [66].
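As an illustration of how a credibility interval can be read off a posterior distribution, the sketch below computes a highest posterior density (HPD) interval from a list of posterior draws. It assumes a unimodal posterior and uses the common shortest-window rule; the function name and the draws are hypothetical, and this is not necessarily the routine used in the study.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval that contains `mass` of the posterior draws."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must be in (0, 1]")
    ordered = sorted(samples)
    n = len(ordered)
    # Number of draws the interval has to cover.
    window = max(1, math.ceil(mass * n))
    # Slide a window of that size over the sorted draws; keep the narrowest.
    start = min(range(n - window + 1),
                key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[start], ordered[start + window - 1]
```

Unlike an equal-tailed interval, the shortest window shifts toward the mode when the posterior is asymmetric, which is why the two can differ for skewed parameters such as variance components.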
The difference between the mean, mode, and median of the broad-sense heritability estimates (Table 4) reflects some lack of symmetry in the posterior distributions [38]. When the prior distribution is informative, the credibility interval tends to be narrower than the confidence interval. When the mixed-model parameters are assigned non-informative distributions, Bayesian and frequentist inferences should be equivalent [67]. Mathew et al. Schenkel et al. Silva et al. The specific results obtained by the frequentist and Bayesian approaches were similar (Table 3).
This was expected, since non-informative prior distributions were used in the Bayesian analysis. The modes of the marginal posterior distributions of the genetic parameters were similar to the corresponding REML estimates. From the Bayesian point of view, the estimates obtained via REML correspond to the modes of the joint posterior distributions of the variance components obtained by the Bayesian approach, given the use of uniform priors for the fixed effects and variance components [66]. The frequentist and Bayesian MTME models provided higher estimates and lower estimates, which resulted in higher for all evaluated traits.
The genotypic coefficient of variation (CVg) quantifies the magnitude of genetic variation available for selection, and thus high values are desirable [76]. The increase seen in this parameter with the use of the MTME models is therefore important for breeding programs. The residual coefficient of variation (CVe) is a statistical, non-genetic measure of experimental precision.
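Both coefficients are usually computed as the square root of the corresponding variance component expressed as a percentage of the trait mean. The text does not print the formula, so the sketch below assumes that usual definition; the function name and the numbers in the usage note are hypothetical.

```python
import math


def coefficient_of_variation(variance, trait_mean):
    """Return 100 * sqrt(variance) / mean, as a percentage.

    Pass the genotypic variance to obtain CVg, or the residual variance
    to obtain CVe.
    """
    if variance < 0:
        raise ValueError("variance must be non-negative")
    if trait_mean == 0:
        raise ValueError("trait mean must be non-zero")
    return 100.0 * math.sqrt(variance) / trait_mean
```

With a genotypic variance of 4.0 and a trait mean of 50.0, the function returns a CVg of 4.0 percent.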
Studies on genotypic, phenotypic, and environmental correlations in soybean involve traits that are evaluated from flowering to maturity, notably yield and its components [77-80]. Our results corroborate those reported by Cober et al. The authors argued that the genes controlling maturity in soybean have pleiotropic effects on grain yield. Ablett et al. Liu et al. Li et al. According to Pollak et al. This bias was observed in the present study, where the genetic correlation between the DM and SY traits fell outside the parameter space, with a value higher than 1 (Table 5).
Viana et al.
According to Thompson and Meyer [ 86 ], the increase in accuracy obtained with the use of multi-trait BLUP analysis compared with single-trait analysis is proportional to the difference between the genetic and environmental correlations of the analyzed traits. In the context of whole-genome prediction, Jia et al. This fact was observed in our study, in which the DM variable showed high heritability and high correlation with SY, consequently generating significant increases in selection accuracy for SY.
However, for both methodologies, frequentist and Bayesian, there was no significant increase in selection accuracy for the SW trait, as verified by its low correlation with the other evaluated traits. This finding was confirmed by Spearman's rank correlation (Table 6).
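Spearman's rank correlation measures how closely two procedures agree on the ordering of the progeny. A minimal standard-library sketch follows: it ranks each list, giving tied values their average rank, and then applies the Pearson formula to the ranks. The function names are hypothetical, and the software the study used for this statistic is not stated here.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` and `y` would hold the predicted genotypic values of the same progeny under the two models being compared. A value close to 1 means both models rank the progeny almost identically.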
These premises were adopted in the present study, which explains the obtained results. Despite the high agreement between the progeny selected for the DM and SW traits by both procedures, there was little agreement for the SY trait, which resulted in greater predicted gains from selection via MTME.
Piepho et al. Resende et al. Okeke et al. Greater accuracy and efficiency of multiple-trait models were also reported by Viana et al. This is explained by the use of the estimator of selection accuracy. In this regard, Resende et al. The opposite can be considered true for the SW variable, for which the frequentist models obtained better results due to a better adjustment of the normal distribution of the parameters attributed to the data.
These conclusions are also valid for the MTME models, which differed in the accuracies obtained; however, for the SY variable, the FMTME model showed the best fit according to the mean accuracy of the progeny (Table 3). However, the high positive correlation between the DM and SY traits can favor the selection of high-yielding and late-cycle progeny.
In this case, selection indices can help breeders select progeny that exhibit gains for both traits simultaneously [ 90 ]. This is a desirable factor that should be taken into account by breeders. According to Silva et al. Thus, progeny selection in F through a more precise method is relevant. Early in the generation of the base populations of the soybean breeding programs, many populations are commonly obtained at the expense of the number of progeny to be evaluated; i.
Thus, Bayes' theorem is recommended for those situations, as it gives precise solutions to the problem of finite-size samples: for each data set, large or small, there is an exact posterior distribution from which to draw inferences.
The MTME models provided better results than the single-trait models under both the frequentist and Bayesian approaches. Therefore, the former procedures can be efficiently applied in the genetic selection of segregating soybean progeny. However, it is necessary to use an adequate statistical tool that provides algorithms and routines to efficiently perform the analyses.
Though not necessarily easy, the use of Bayesian inference in quantitative genetics in the breeding of crop species [69, 72] is a trend in breeding programs [5]. The authors also developed an R software package that offers specialized and optimized routines to efficiently perform the analyses under the proposed model.
Despite the considerable difference in processing time of the analysis and output size of the results around 1. Furthermore, it provided additional results to those obtained by the frequentist approach, with noteworthy credibility intervals. However, the quality of the informative prior may have questionable origins and may not generate considerable advantages.
However, for potential future studies in plant breeding, the implementation of informative priors fitted to MTME models can be the next step to be assessed.
Abstract
At present, single-trait best linear unbiased prediction (BLUP) is the standard method for genetic selection in soybean.
Introduction
Soybean [ Glycine max L.
Genetic and non-genetic components
Broad-sense heritability per plot, or heritability of total effects from progeny, for the frequentist and Bayesian models was computed based on the approximated estimators, as discussed in Piepho et al.
Genetic correlation
To determine the genetic covariance by the frequentist and Bayesian single-trait models (FSTME and BSTME, respectively), a pairwise analysis of the sum of phenotypic values of the traits was performed.
Table 4. Posterior inferences for the mode, mean, median, and highest posterior density (HPD) interval of the broad-sense heritability per plot, considering the proposed Bayesian single-trait multi-environment (BSTME) and multi-trait multi-environment (BMTME) models for number of days to maturity (DM), seed weight (SW, grams), and average seed yield per plot (SY, grams).
Supporting information
S1 Table. Data set necessary to replicate the findings of our research.
S2 Table. Scripts to run the multi-trait multi-environment models.
S1 Fig. S2 Fig. S3 Fig. S4 Fig. Posterior density for the Bayesian single-trait multi-environment model (BSTME) of the estimate of variance components for average seed yield per plot (SY).
References 1. Soybean breeding. Soybean Breeding.
2. Genetic variation of world soybean maturity date and geographic distribution of maturity groups.
Breed Sci. A comparison of experimental designs for selection in breeding trials with nested treatment structure. Theor Appl Genet. Genotype-by-environment interaction. CRC Press; What should students in plant breeding know about the statistical aspects of genotype x Environment interactions? Crop Sci. 6. Genetic mapping and confirmation of quantitative trait loci for seed protein and oil contents and seed weight in soybean.
7. Dissection of the genetic architecture for soybean seed weight across multiple environments.
Crop Pasture Sci. 8. Genetic parameters and selection of soybean lines based on selection indexes. Genet Mol Res. 9. Selection indices for agronomic traits in segregating populations of soybean. Rev Cienc Agron. Implications of the population effect in the selection of soybean progeny. Plant Breed. Selection of inbred soybean progeny Glycine max : an approach with population effect. Selection Bias and Multiple Trait Evaluation. J Dairy Sci. Multi-trait BLUP in half-sib selection of annual crops. Multiple trait evaluation using relatives records.
J Anim Sci. Schaeffer LR. Elsevier; ; — Mrode RA. The statistical analysis of multi-environment data: Modeling genotype-by-environment interaction and its genetic basis. Front Physiol. Joint prediction of multiple quantitative traits using a Bayesian multivariate antedependence model. Heredity Edinb. Bayesian multi-trait analysis reveals a useful tool to increase oil concentration and to decrease toxicity in Jatropha curcas L. PLoS One. Bayesian mapping of quantitative trait loci QTL controlling soybean cyst nematode resistant.
Multi-trait multi-environment Bayesian model reveals g x e interaction for nitrogen use efficiency components in tropical maize. Hayashi T, Iwata H. A Bayesian method and its variational approximation for prediction of genomic breeding values in multiple traits. BMC Bioinformatics. Inheritance of long juvenile period under short day conditions for the BR soybean Glycine max L. Merrill line. Perspectives for the use of quantitative genetics in breeding of autogamous plants. ISRN Genet. Lavras: UFLA; Stages of Soybean Development. Spec Rep. Patterson HD, Thompson R. Recovery of inter-block information when block sizes are unequal.
Henderson CR. Best linear unbiased estimation and prediction under a selection model. Akaike H. A new look at the statistical model identification. Statistical analysis of repeated measures data using SAS procedures. Wilks SS. Ann Math Stat. R Core Team. Vienna; Mora F, Serra N. Bayesian estimation of genetic parameters for growth, stem straightness, and survival in Eucalyptus globulus on an Andean Foothill site.