Model comparison via Bayes Factor analysis

Results of Bayesian analyses can be utilised for future research in the sense of Dennis Lindley’s motto: “Today’s posterior is tomorrow’s prior” (Lindley, 1972), or as Richard Feynman put it “Yesterday’s sensation is today’s calibration” to which Valentine Telegdi added“…and tomorrow’s background”.

More infos: https://christopher-germann.de/2019/02/24/jasp-a-fresh-way-to-do-statistics

At the conceptual meta level, the primary difference between the frequentists and the Bayesian account is that the former treats data as random and parameters as fixed and the latter regards data as fixed and unknown parameters as random.

A Bayes Factor can range from 0 to ∞ and a value of 1 denotes equivalent support for both competing hypotheses. Moreover, LogBF10 can be expressed as a logarithm ranging from -∞ to ∞. A BF of 0 denotes equal support for H0 and H1.

“The prior distribution is a key part of Bayesian inference and represents the information about an uncertain parameter that is combined with the probability distribution of new data to yield the posterior distribution, which in turn is used for future inferences and decisions.” (Gelman, 2006, p. 1634)

In a seminal paper entitled “Inference, method, and decision: towards a Bayesian philosophy of science” Rosenkrantz (Rosenkrantz, 1980, p. 485) discusses the Popperian concept of verisimilitude (truthlikeness) w.r.t. Bayesian decision making and develops a persuasive cogent argument in favour of diffuse priors (i.e., C-systems with a low λ). In a related publication he states: “If your prior is heavily concentrated about the true value (which amounts to a ‘lucky guess’ in the absence of pertinent data), you stand to be slightly closer to the truth after sampling than someone who adopts a diffuse prior, your advantage dissipating rapidly with sample size. If, however, your initial estimate is in error, you will be farther from the truth after sampling, and if the error is substantial, you will be much farther from the truth. I can express this by saying that a diffuse prior is a better choice at ‘almost all’ values of [q1] or, better, that it semi~dominates any highly peaked (or ‘opinionated’) prior. In practice, a diffuse prior never does much worse than a peaked one and ‘generally’ does much better…” (1946)

http://r-code.ml

Frequentist inference
Fixed-effects ANOVAGeneral linear models: mixing continuous and categorical covariatesOutput plot as PDFSimple linear regressionSkewness & Kurtosis
data(ToothGrowth)

## Example plot from ?ToothGrowth

coplot(len ~ dose | supp, data = ToothGrowth, panel = panel.smooth,
       xlab = "ToothGrowth data: length vs dose, given type of supplement")
## Treat dose as a factor
ToothGrowth$dose = factor(ToothGrowth$dose)
levels(ToothGrowth$dose) = c("Low", "Medium", "High")

summary(aov(len ~ supp*dose, data=ToothGrowth))


#install.packages("xtable")
library(xtable)
xtable(x, caption = NULL, label = NULL, align = NULL, digits = NULL,
       display = NULL, auto = FALSE, ...)

print(xtable(d), type="html")
print(xtable(d), type="latex") # anova table to latex
#https://cran.r-project.org/web/packages/xtable/index.html
#https://rmarkdown.rstudio.com/

data(ToothGrowth)

# model log2 of dose instead of dose directly
ToothGrowth$dose = log2(ToothGrowth$dose)

# Classical analysis for comparison
lmToothGrowth <- lm(len ~ supp + dose + supp:dose, data=ToothGrowth)
summary(lmToothGrowth)
x<- (1:5)
y<- (1:111)
pdf(file=file.choose())
hist(x)
plot(x, type='o')
dev.off()

Notes
file.choose() is a very handy command which saves the work associated with defining absolute and relative paths which can be quite cumbersome.
list.files for non-interactive selection.
choose.files for selecting multiple files interactively.
More
https://www.rdocumentation.org/packages/base/versions/3.5.2/topics/file.choose

# http://wiki.math.yorku.ca/index.php/VR4:_Chapter_1_summary
x <- seq( 1, 20, 0.5)   # sequence from 1 to 20 by 0.5
x                       # print it
w <- 1 + x/2            # a weight vector
y <- x + w*rnorm(x)     # generate y with standard deviations
                          #    equal to w
 
dum <- data.frame( x, y, w )     # make a data frame of 3 columns
dum                     # typing it's name shows the object
rm( x, y, w )           # remove the original variables


fm <- lm ( y ~ x, data = dum)   # fit a simple linear regression (between x and y)
summary( fm )                   # and look at the analysis



fm1 <- lm( y ~ x, data = dum, weight = 1/w^2 )  # a weighted regression  
summary(fm1)
 
lrf <- loess( y ~ x, dum)    # a smooth regression curve using a
                               #   modern local regression function
attach( dum )                # make the columns in the data frame visible
                               #   as variables (Note: before this command, typing x and
                               #   y would yield errors)
plot( x, y)                  # a standard scatterplot to which we will
                               #   fit regression curves 
lines( spline( x, fitted (lrf)), col = 2)  # add in the local regression using spline
                                             #  interpolation between calculated
                                             #  points
abline(0, 1, lty = 3, col = 3)    # add in the true regression line (lty=3: read line type=dotted;
                                    #   col=3: read colour=green; type "?par" for details)
abline(fm, col = 4)               # add in the unweighted regression line
abline(fm1, lty = 4, col = 5)     # add in the weighted regression line
plot( fitted(fm), resid(fm),
       xlab = "Fitted Values",
       ylab = "Residuals")          # a standard regression diagnostic plot
                                    # to check for heteroscedasticity.
                                    # The data were generated from a
                                    # heteroscedastic process. Can you see
                                    # it from this plot?
qqnorm( resid(fm) )               # Normal scores to check for skewness,
                                    #   kurtosis and outliers.  How would you
                                    #   expect heteroscedasticity to show up?
search()                          # Have a look at the search path
detach()                          # Remove the data frame from the search path
search()                          # Have another look. How did it change?
rm(fm, fm1, lrf, dum)             # Clean up (this is not that important in R 
                                    #    for reasons we will understand later)



library(rgl)
rgl.open()
rgl.bg(col='white')
x <- sort(rnorm(1000))
y <- rnorm(1000)
z1 <- atan2(x,y)
z <- rnorm(1000) + atan2(x, y)
plot3d(x, y, z1, col = rainbow(1000), size = 2)
bbox3d()
skew(x, na.rm = TRUE)
kurtosi(x, na.rm = TRUE)

#x A data.frame or matrix
#na.rm how to treat missing data

## The function is currently defined as
function (x, na.rm = TRUE) 
{
    if (length(dim(x)) == 0) {
        if (na.rm) {
            x <- x[!is.na(x)]
        			}
        
        mx <- mean(x)
         sdx <- sd(x,na.rm=na.rm)
        kurt <- sum((x - mx)^4)/(length(x) * sd(x)^4)  -3
        } else {
    
    kurt <- rep(NA,dim(x)[2])
    mx <- colMeans(x,na.rm=na.rm)
     sdx <- sd(x,na.rm=na.rm)
    for (i in 1:dim(x)[2]) {
    kurt[i] <- sum((x[,i] - mx[i])^4,  na.rm = na.rm)/((length(x[,i]) - sum(is.na(x[,i]))) * sdx[i]^4)  -3
            }
    }
    return(kurt)
  }
Bayes Factor analysis
ttestBF function (one sample) BFmcmc functionttestBF function (two sample)Bayesian meta analysisBayes factor robustness analysis, one-sidedAnimating Robustness-Check of Bayes Factor PriorsSkewness & KurtosisanovaBF function
#https://richarddmorey.github.io/BayesFactor/
bf = ttestBF(x = diffScores)
## Equivalently:
## bf = ttestBF(x = sleep$extra[1:10],y=sleep$extra[11:20], paired=TRUE)
bf

Comment

ttestBF function, which performs the “JZS” t test described by Rouder, Speckman, Sun, Morey, and Iverson (2009).

#https://richarddmorey.github.io/BayesFactor/
chains2 = recompute(chains, iterations = 10000)
plot(chains2[,1:2])

Comment
The posterior function returns a object of type BFmcmc, which inherits the methods of the mcmc class from the coda package.

#https://richarddmorey.github.io/BayesFactor/
## Compute Bayes factor
bf = ttestBF(formula = weight ~ feed, data = chickwts)
bf
###
chains = posterior(bf, iterations = 10000)
plot(chains[,2])
#https://richarddmorey.github.io/BayesFactor/
## Bem's t statistics from four selected experiments
t = c(-.15, 2.39, 2.42, 2.43)
N = c(100, 150, 97, 99)
###
## Do analysis again, without nullInterval restriction
bf = meta.ttestBF(t=t, n1=N, rscale=1)

## Obtain posterior samples
chains = posterior(bf, iterations = 10000)
plot(chains)

Comment
ttestBF function, which performs the “JZS” t test described by Rouder, Speckman, Sun, Morey, and Iverson (2009).

## This source code is licensed under the FreeBSD license
## (c) 2013 Felix Schönbrodt

#' @title Plots a comparison of a sequence of priors for t test Bayes factors
#'
#' @details
#'
#'
#' @param ts A vector of t values
#' @param ns A vector of corresponding sample sizes
#' @param rs The sequence of rs that should be tested. r should run up to 2 (higher values are implausible; E.-J. Wagenmakers, personal communication, Aug 22, 2013)
#' @param labels Names for the studies (displayed in the facet headings)
#' @param dots Values of r's which should be marked with a red dot
#' @param plot If TRUE, a ggplot is returned. If false, a data frame with the computed Bayes factors is returned
#' @param sides If set to "two" (default), a two-sided Bayes factor is computed. If set to "one", a one-sided Bayes factor is computed. In this case, it is assumed that positive t values correspond to results in the predicted direction and negative t values to results in the unpredicted direction. For details, see Wagenmakers, E. J., & Morey, R. D. (2013). Simple relation between one-sided and two-sided Bayesian point-null hypothesis tests.
#' @param nrow Number of rows of the faceted plot.
#' @param forH1 Defines the direction of the BF. If forH1 is TRUE, BF > 1 speak in favor of H1 (i.e., the quotient is defined as H1/H0). If forH1 is FALSE, it's the reverse direction.
#'
#' @references
#'
#' Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t-tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin and Review, 16, 225-237.
#' Wagenmakers, E.-J., & Morey, R. D. (2013). Simple relation between one-sided and two-sided Bayesian point-null hypothesis tests. Manuscript submitted for publication
#' Wagenmakers, E.-J., Wetzels, R., Borsboom, D., Kievit, R. & van der Maas, H. L. J. (2011). Yes, psychologists must change the way they analyze their data: Clarifications for Bem, Utts, & Johnson (2011)

BFrobustplot <- function(
    ts, ns, rs=seq(0, 2, length.out=200), dots=1, plot=TRUE,
    labels=c(), sides="two", nrow=2, xticks=3, forH1=TRUE)
{
    library(BayesFactor)
   
    # compute one-sided p-values from ts and ns
    ps <- pt(ts, df=ns-1, lower.tail = FALSE)   # one-sided test
   
    # add the dots location to the sequences of r's
    rs <- c(rs, dots)
       
    res <- data.frame()
    for (r in rs) {
       
        # first: calculate two-sided BF
        B_e0 <- c()
        for (i in 1:length(ts))
            B_e0 <- c(B_e0, exp(ttest.tstat(t = ts[i], n1 = ns[i], rscale=r)$bf))
   
        # second: calculate one-sided BF
        B_r0 <- c() for (i in 1:length(ts)) { if (ts[i] > 0) {
                # correct direction
                B_r0 <- c(B_r0, (2 - 2*ps[i])*B_e0[i])
            } else {
                # wrong direction
                B_r0 <- c(B_r0, (1 - ps[i])*2*B_e0[i])
            }
        }
   
        res0 <- data.frame(t=ts, n=ns, BF_two=B_e0, BF_one=B_r0, r=r) if (length(labels) > 0) {
            res0$labels <- labels
            res0$heading <- factor(1:length(labels), labels=paste0(labels, "\n(t = ", ts, ", df = ", ns-1, ")"), ordered=TRUE)
        } else {
            res0$heading <- factor(1:length(ts), labels=paste0("t = ", ts, ", df = ", ns-1), ordered=TRUE)
        }
        res <- rbind(res, res0)
    }
   
    # define the measure to be plotted: one- or two-sided?
    res$BF <- res[, paste0("BF_", sides)]

   
    # Flip BF if requested
    if (forH1 == FALSE) {
        res$BF <- 1/res$BF
    }
       
   
    if (plot==TRUE) {
        library(ggplot2)
        p1 <- ggplot(res, aes(x=r, y=log(BF))) + geom_line() + facet_wrap(~heading, nrow=nrow) + theme_bw() + ylab("log(BF)")
        p1 <- p1 + geom_hline(yintercept=c(c(-log(c(30, 10, 3)), log(c(3, 10, 30)))), linetype="dotted", color="darkgrey")
        p1 <- p1 + geom_hline(yintercept=log(1), linetype="dashed", color="darkgreen")
   
        # add the dots
        p1 <- p1 + geom_point(data=res[res$r %in% dots,], aes(x=r, y=log(BF)), color="red", size=2)
   
        # add annotation
        p1 <- p1 + annotate("text", x=max(rs)*1.8, y=-2.85, label=paste0("Strong~H[", ifelse(forH1==TRUE,0,1), "]"), hjust=1, vjust=.5, size=3, color="black", parse=TRUE)
        p1 <- p1 + annotate("text", x=max(rs)*1.8, y=-1.7 , label=paste0("Moderate~H[", ifelse(forH1==TRUE,0,1), "]"), hjust=1, vjust=.5, size=3, color="black", parse=TRUE)
        p1 <- p1 + annotate("text", x=max(rs)*1.8, y=-.55 , label=paste0("Anectodal~H[", ifelse(forH1==TRUE,0,1), "]"), hjust=1, vjust=.5, size=3, color="black", parse=TRUE)
        p1 <- p1 + annotate("text", x=max(rs)*1.8, y=2.86 , label=paste0("Strong~H[", ifelse(forH1==TRUE,1,0), "]"), hjust=1, vjust=.5, size=3, color="black", parse=TRUE)
        p1 <- p1 + annotate("text", x=max(rs)*1.8, y=1.7  , label=paste0("Moderate~H[", ifelse(forH1==TRUE,1,0), "]"), hjust=1, vjust=.5, size=3, color="black", parse=TRUE)
        p1 <- p1 + annotate("text", x=max(rs)*1.8, y=.55  , label=paste0("Anectodal~H[", ifelse(forH1==TRUE,1,0), "]"), hjust=1, vjust=.5, vjust=.5, size=3, color="black", parse=TRUE)
       
        # set scale ticks
        p1 <- p1 + scale_y_continuous(breaks=c(c(-log(c(30, 10, 3)), 0, log(c(3, 10, 30)))), labels=c("-log(30)", "-log(10)", "-log(3)", "log(1)", "log(3)", "log(10)", "log(30)"))
        p1 <- p1 + scale_x_continuous(breaks=seq(min(rs), max(rs), length.out=xticks))

        return(p1)
    } else {
        return(res)
    }
}

References
Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t-tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin and Review, 16, 225-237

#https://jimgrange.wordpress.com/2015/11/27/animating-robustness-check-of-bayes-factor-priors/
### plots of prior robustness check

# gif created from PNG plots using http://gifmaker.me/

# clear R's memory
rm(list = ls())

# load the Bayes factor package
library(BayesFactor)

# set working directory (will be used to save files here, so make sure
# this is where you want to save your plots!)
setwd <- "D:/Work//Blog_YouTube code//Blog/Prior Robust Visualisation/plots"

#------------------------------------------------------------------------------
### declare some variables for the analysis

# what is the t-value for the data?
tVal <- 3.098

# how many points in the prior should be explored?
nPoints <- 1000

# what Cauchy rates should be explored?
cauchyRates <- seq(from = 0.01, to = 1.5, length.out = nPoints)

# what effect sizes should be plotted?
effSize <- seq(from = -2, to = 2, length.out = nPoints)

# get the Bayes factor for each prior value
bayesFactors <- sapply(cauchyRates, function(x) 
  exp(ttest.tstat(t = tVal, n1 = 76, rscale = x)[['bf']]))
#------------------------------------------------------------------------------

#------------------------------------------------------------------------------
### do the plotting

# how many plots do we want to produce?
nPlots <- 50
plotWidth <- round(seq(from = 1, to = nPoints, length.out = nPlots), 0)

# loop over each plot
for(i in plotWidth){

  # set up the file
  currFile <- paste(getwd(), "/plot_", i, ".png", sep = "")
  
  # initiate the png file
  png(file = currFile, width = 1200, height = 1000, res = 200)
  
  # change the plotting window so plots appear side-by-side
  par(mfrow = c(1, 2))
    
  #----
  # do the prior density plot
  d <- dcauchy(effSize, scale = cauchyRates[i]) 
  plot(effSize, d, type = "l", ylim = c(0, 5), xlim = c(-2, 2), 
       ylab = "Density", xlab = "Effect Size (d)", lwd = 2, col = "gray48", 
       main = paste("Rate (r) = ", round(cauchyRates[i], 3), sep = ""))
  
  #-----
  # do the Bayes factor plot
  plot(cauchyRates, bayesFactors, type = "l", lwd = 2, col = "gray48", 
       ylim = c(0, max(bayesFactors)), xaxt = "n", 
                xlab = "Cauchy Prior Width (r)", ylab = "Bayes Factor (10)")
  abline(h = 0, lwd = 1)
  abline(h = 6, col = "black", lty = 2, lwd = 2)
  axis(1, at = seq(0, 1.5, 0.25))
    
  # add the BF at the default Cauchy point
  points(0.707, 9.97, col = "black", cex = 1.5, pch = 21, bg = "red")
    
  # add the BF for the Cauchy prior currently being plotted
  points(cauchyRates[i], bayesFactors[i], col = "black", pch = 21, cex = 1.3, 
         bg = "cyan")
    
  # add legend
  legend(x = 0.25, y = 3, legend = c("r = 0.707", paste("r = ", 
                                                      round(cauchyRates[i], 3), 
                                                            sep = "")),
         pch = c(21, 21), lty = c(NA, NA), lwd = c(NA, NA), pt.cex = c(1, 1),
         col = c("black", "black"), pt.bg = c("red", "cyan"), bty = "n")

  # save the current plot
  dev.off()
  
}
#------------------------------------------------------------------------------

skew(x, na.rm = TRUE)
kurtosi(x, na.rm = TRUE)

#x A data.frame or matrix
#na.rm how to treat missing data

## The function is currently defined as
function (x, na.rm = TRUE) 
{
    if (length(dim(x)) == 0) {
        if (na.rm) {
            x <- x[!is.na(x)]
        			}
        
        mx <- mean(x)
         sdx <- sd(x,na.rm=na.rm)
        kurt <- sum((x - mx)^4)/(length(x) * sd(x)^4)  -3
        } else {
    
    kurt <- rep(NA,dim(x)[2])
    mx <- colMeans(x,na.rm=na.rm)
     sdx <- sd(x,na.rm=na.rm)
    for (i in 1:dim(x)[2]) {
    kurt[i] <- sum((x[,i] - mx[i])^4,  na.rm = na.rm)/((length(x[,i]) - sum(is.na(x[,i]))) * sdx[i]^4)  -3
            }
    }
    return(kurt)
  }
#https://richarddmorey.github.io/BayesFactor/
data(ToothGrowth)

## Example plot from ?ToothGrowth

coplot(len ~ dose | supp, data = ToothGrowth, panel = panel.smooth,
       xlab = "ToothGrowth data: length vs dose, given type of supplement")
bf = anovaBF(len ~ supp*dose, data=ToothGrowth)
bf
plot(bf[3:4] / bf[2])

#reduce number of comparisons (alpha inflation)
bf = anovaBF(len ~ supp*dose, data=ToothGrowth, whichModels="top")
bf
bfMainEffects = lmBF(len ~ supp + dose, data = ToothGrowth)
bfInteraction = lmBF(len ~ supp + dose + supp:dose, data = ToothGrowth)
## Compare the two models
bf = bfInteraction / bfMainEffects
bf
##
newbf = recompute(bf, iterations = 500000)
newbf

# posterior function
## Sample from the posterior of the full model
chains = posterior(bfInteraction, iterations = 10000)
## 1:13 are the only "interesting" parameters
summary(chains[,1:13])
#lot posterior of selected effects
plot(chains[,4:6])

Comment
Reduce number of comparisons by specifyig the xact model of interest a priori.

Markov chain Monte Carlo methods
MCMC 1GBEST RBEST - Bayesian MCMC power analysis
# MODIFIED FROM BEST.R FOR ONE GROUP INSTEAD OF TWO.
# Version of Dec 02, 2015.
# John K. Kruschke  
# johnkruschke@gmail.com
# http://www.indiana.edu/~kruschke/BEST/
#
# This program is believed to be free of errors, but it comes with no guarantee! 
# The user bears all responsibility for interpreting the results. 
# Please check the webpage above for updates or corrections.
#
### ***************************************************************
### ******** SEE FILE BEST1Gexample.R FOR INSTRUCTIONS ************
### ***************************************************************

source("openGraphSaveGraph.R") # graphics functions for Windows, Mac OS, Linux

BEST1Gmcmc = function( y , numSavedSteps=100000, thinSteps=1, showMCMC=FALSE) { 
  # This function generates an MCMC sample from the posterior distribution.
  # Description of arguments:
  # showMCMC is a flag for displaying diagnostic graphs of the chains.
  #    If F (the default), no chain graphs are displayed. If T, they are.
  
  require(rjags)

  #------------------------------------------------------------------------------
  # THE MODEL.
  modelString = "
  model {
    for ( i in 1:Ntotal ) {
      y[i] ~ dt( mu , tau , nu )
    }
    mu ~ dnorm( muM , muP )
    tau <- 1/pow( sigma , 2 )
    sigma ~ dunif( sigmaLow , sigmaHigh )
    nu ~ dexp(1/30)
}
  " # close quote for modelString
  # Write out modelString to a text file
  writeLines( modelString , con="BESTmodel.txt" )
  
  #------------------------------------------------------------------------------
  # THE DATA.
  # Load the data:
  Ntotal = length(y)
  # Specify the data in a list, for later shipment to JAGS:
  dataList = list(
    y = y ,
    Ntotal = Ntotal ,
    muM = mean(y) ,
    muP = 0.000001 * 1/sd(y)^2 ,
    sigmaLow = sd(y) / 1000 ,
    sigmaHigh = sd(y) * 1000 
  )
  
  #------------------------------------------------------------------------------
  # INTIALIZE THE CHAINS.
  # Initial values of MCMC chains based on data:
  mu = mean(y) 
  sigma = sd(y) 
  # Regarding initial values in next line: (1) sigma will tend to be too big if 
  # the data have outliers, and (2) nu starts at 5 as a moderate value. These
  # initial values keep the burn-in period moderate.
  initsList = list( mu = mu , sigma = sigma , nu = 5 )
  
  #------------------------------------------------------------------------------
  # RUN THE CHAINS

  parameters = c( "mu" , "sigma" , "nu" )     # The parameters to be monitored
  adaptSteps = 500               # Number of steps to "tune" the samplers
  burnInSteps = 1000
  nChains = 3 
  nIter = ceiling( ( numSavedSteps * thinSteps ) / nChains )
  # Create, initialize, and adapt the model:
  jagsModel = jags.model( "BESTmodel.txt" , data=dataList , inits=initsList , 
                          n.chains=nChains , n.adapt=adaptSteps )
  # Burn-in:
  cat( "Burning in the MCMC chain...\n" )
  update( jagsModel , n.iter=burnInSteps )
  # The saved MCMC chain:
  cat( "Sampling final MCMC chain...\n" )
  codaSamples = coda.samples( jagsModel , variable.names=parameters , 
                              n.iter=nIter , thin=thinSteps )
  # resulting codaSamples object has these indices: 
  #   codaSamples[[ chainIdx ]][ stepIdx , paramIdx ]
  
  #------------------------------------------------------------------------------
  # EXAMINE THE RESULTS
  if ( showMCMC ) {
    openGraph(width=7,height=7)
    autocorr.plot( codaSamples[[1]] , ask=FALSE )
    show( gelman.diag( codaSamples ) )
    effectiveChainLength = effectiveSize( codaSamples ) 
    show( effectiveChainLength )
  }

  # Convert coda-object codaSamples to matrix object for easier handling.
  # But note that this concatenates the different chains into one long chain.
  # Result is mcmcChain[ stepIdx , paramIdx ]
  mcmcChain = as.matrix( codaSamples )
  return( mcmcChain )

} # end function BESTmcmc

#==============================================================================

BEST1Gplot = function( y , mcmcChain , compValm=0.0 , ROPEm=NULL , 
                       compValsd=NULL , ROPEsd=NULL , 
                       ROPEeff=NULL , showCurve=FALSE , pairsPlot=FALSE ) {
  # This function plots the posterior distribution (and data).
  # Description of arguments:
  # y is the data vector.
  # mcmcChain is a list of the type returned by function BEST1G.
  # compValm is hypothesized mean for comparing with estimated mean.
  # ROPEm is a two element vector, such as c(-1,1), specifying the limit
  #   of the ROPE on the mean.
  # ROPEsd is a two element vector, such as c(14,16), specifying the limit
  #   of the ROPE on the difference of standard deviations.
  # ROPEeff is a two element vector, such as c(-1,1), specifying the limit
  #   of the ROPE on the effect size.
  # showCurve is TRUE or FALSE and indicates whether the posterior should
  #   be displayed as a histogram (by default) or by an approximate curve.
  # pairsPlot is TRUE or FALSE and indicates whether scatterplots of pairs
  #   of parameters should be displayed.
  mu = mcmcChain[,"mu"]
  sigma = mcmcChain[,"sigma"]
  nu = mcmcChain[,"nu"]
  
  if ( pairsPlot ) {
    # Plot the parameters pairwise, to see correlations:
    openGraph(width=7,height=7)
    nPtToPlot = 1000
    plotIdx = floor(seq(1,length(mu),by=length(mu)/nPtToPlot))
    panel.cor = function(x, y, digits=2, prefix="", cex.cor, ...) {
      usr = par("usr"); on.exit(par(usr))
      par(usr = c(0, 1, 0, 1))
      r = (cor(x, y))
      txt = format(c(r, 0.123456789), digits=digits)[1]
      txt = paste(prefix, txt, sep="")
      if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
      text(0.5, 0.5, txt, cex=1.25 ) # was cex=cex.cor*r
    }
    pairs( cbind( mu , sigma , log10(nu) )[plotIdx,] ,
           labels=c( expression(mu) ,  
                     expression(sigma) , 
                     expression(log10(nu)) ) , 
           lower.panel=panel.cor , col="skyblue" )
  }

  # Define matrix for storing summary info:
  summaryInfo = NULL
  
  source("plotPost.R")
  # Set up window and layout:
  openGraph(width=6.0,height=5.0)
  layout( matrix( c(3,3,4,4,5,5, 1,1,1,1,2,2) , nrow=6, ncol=2 , byrow=FALSE ) )
  par( mar=c(3.5,3.5,2.5,0.5) , mgp=c(2.25,0.7,0) )
  
  # Select thinned steps in chain for plotting of posterior predictive curves:
  chainLength = NROW( mcmcChain )
  nCurvesToPlot = 30
  stepIdxVec = seq( 1 , chainLength , floor(chainLength/nCurvesToPlot) )
  xRange = range( y )
  xLim = c( xRange[1]-0.1*(xRange[2]-xRange[1]) , 
            xRange[2]+0.1*(xRange[2]-xRange[1]) )
  xVec = seq( xLim[1] , xLim[2] , length=200 )
  maxY = max( dt( 0 , df=max(nu[stepIdxVec]) ) / min(sigma[stepIdxVec]) )
  # Plot data y and smattering of posterior predictive curves:
  stepIdx = 1
  plot( xVec , dt( (xVec-mu[stepIdxVec[stepIdx]])/sigma[stepIdxVec[stepIdx]] , 
                   df=nu[stepIdxVec[stepIdx]] )/sigma[stepIdxVec[stepIdx]] , 
        ylim=c(0,maxY) , cex.lab=1.75 ,
        type="l" , col="skyblue" , lwd=1 , xlab="y" , ylab="p(y)" , 
        main="Data w. Post. Pred." )
  for ( stepIdx in 2:length(stepIdxVec) ) {
    lines(xVec, dt( (xVec-mu[stepIdxVec[stepIdx]])/sigma[stepIdxVec[stepIdx]] , 
                    df=nu[stepIdxVec[stepIdx]] )/sigma[stepIdxVec[stepIdx]] , 
          type="l" , col="skyblue" , lwd=1 )
  }
  histBinWd = median(sigma)/2
  histCenter = mean(mu)
  histBreaks = sort( c( seq( histCenter-histBinWd/2 , min(xVec)-histBinWd/2 ,
                             -histBinWd ),
                        seq( histCenter+histBinWd/2 , max(xVec)+histBinWd/2 ,
                             histBinWd ) , xLim ) )
  histInfo = hist( y , plot=FALSE , breaks=histBreaks )
  yPlotVec = histInfo$density 
  yPlotVec[ yPlotVec==0.0 ] = NA
  xPlotVec = histInfo$mids
  xPlotVec[ yPlotVec==0.0 ] = NA
  points( xPlotVec , yPlotVec , type="h" , lwd=3 , col="red" )
  text( max(xVec) , maxY , bquote(N==.(length(y))) , adj=c(1.1,1.1) )
  
  # Plot posterior distribution of parameter nu:
  paramInfo = plotPost( log10(nu) , col="skyblue" , # breaks=30 ,
                       showCurve=showCurve ,
                       xlab=bquote("log10("*nu*")") , cex.lab = 1.75 , showMode=TRUE ,
                       main="Normality" ) #  (<0.7 suggests kurtosis)
  summaryInfo = rbind( summaryInfo , paramInfo )
  rownames(summaryInfo)[NROW(summaryInfo)] = "log10(nu)"
  
  # Plot posterior distribution of parameters mu1, mu2, and their difference:
  xlim = range( c( mu , compValm ) )
  paramInfo = plotPost( mu ,  xlim=xlim , cex.lab = 1.75 , ROPE=ROPEm ,
                       showCurve=showCurve , compVal = compValm ,
                       xlab=bquote(mu) , main=paste("Mean") , 
                       col="skyblue" )
  summaryInfo = rbind( summaryInfo , paramInfo )
  rownames(summaryInfo)[NROW(summaryInfo)] = "mu"
  
  # Plot posterior distribution of sigma:
  xlim=range( sigma )
  paramInfo = plotPost( sigma ,  xlim=xlim , cex.lab = 1.75 , ROPE=ROPEsd ,
                       showCurve=showCurve , compVal=compValsd ,
                       xlab=bquote(sigma) , main=paste("Std. Dev.") , 
                       col="skyblue" , showMode=TRUE )
  summaryInfo = rbind( summaryInfo , paramInfo )
  rownames(summaryInfo)[NROW(summaryInfo)] = "sigma"
  
  # Plot of estimated effect size:
  effectSize = ( mu - compValm ) / sigma
  paramInfo = plotPost( effectSize , compVal=0 ,  ROPE=ROPEeff ,
                       showCurve=showCurve ,
                       xlab=bquote( (mu-.(compValm)) / sigma ),
                       showMode=TRUE , cex.lab=1.75 , main="Effect Size" , col="skyblue" )
  summaryInfo = rbind( summaryInfo , paramInfo )
  rownames(summaryInfo)[NROW(summaryInfo)] = "effSz"
  
  return( summaryInfo )

  } # end of function BEST1Gplot

#==============================================================================


# MODIFIED 2015 DEC 02.
# John K. Kruschke  
# johnkruschke@gmail.com
# http://www.indiana.edu/~kruschke/BEST/
#
# This program is believed to be free of errors, but it comes with no guarantee! 
# The user bears all responsibility for interpreting the results. 
# Please check the webpage above for updates or corrections.
#
### ***************************************************************
### ******** SEE FILE BESTexample.R FOR INSTRUCTIONS **************
### ***************************************************************

# Load various essential functions:
source("DBDA2E-utilities.R") 

BESTmcmc = function( y1 , y2 , 
                     priorOnly=FALSE , showMCMC=FALSE ,
                     numSavedSteps=20000 , thinSteps=1 ,
                     mu1PriorMean = mean(c(y1,y2)) ,
                     mu1PriorSD = sd(c(y1,y2))*5 ,
                     mu2PriorMean = mean(c(y1,y2)) ,
                     mu2PriorSD = sd(c(y1,y2))*5 ,
                     sigma1PriorMode = sd(c(y1,y2)) ,
                     sigma1PriorSD = sd(c(y1,y2))*5 ,
                     sigma2PriorMode = sd(c(y1,y2)) ,
                     sigma2PriorSD = sd(c(y1,y2))*5 ,
                     nuPriorMean = 30 ,
                     nuPriorSD = 30 ,
                     runjagsMethod=runjagsMethodDefault , 
                     nChains=nChainsDefault ) { 
  # This function generates an MCMC sample from the posterior distribution.
  # Description of arguments:
  # showMCMC is a flag for displaying diagnostic graphs of the chains.
  #    If F (the default), no chain graphs are displayed. If T, they are.
  
  #------------------------------------------------------------------------------
  # THE DATA.
  # Load the data:
  y = c( y1 , y2 ) # combine data into one vector
  x = c( rep(1,length(y1)) , rep(2,length(y2)) ) # create group membership code
  Ntotal = length(y)
  # Specify the data and prior constants in a list, for later shipment to JAGS:
  if ( priorOnly ) {
    dataList = list(
      #  y = y ,
      x = x ,
      Ntotal = Ntotal ,
      mu1PriorMean = mu1PriorMean ,
      mu1PriorSD = mu1PriorSD ,
      mu2PriorMean = mu2PriorMean ,
      mu2PriorSD = mu2PriorSD ,
      Sh1 = gammaShRaFromModeSD( mode=sigma1PriorMode , 
                                 sd=sigma1PriorSD )$shape ,
      Ra1 = gammaShRaFromModeSD( mode=sigma1PriorMode , 
                                 sd=sigma1PriorSD )$rate ,
      Sh2 = gammaShRaFromModeSD( mode=sigma2PriorMode , 
                                 sd=sigma2PriorSD )$shape ,
      Ra2 = gammaShRaFromModeSD( mode=sigma2PriorMode , 
                                 sd=sigma2PriorSD )$rate ,
      ShNu = gammaShRaFromMeanSD( mean=nuPriorMean , sd=nuPriorSD )$shape ,
      RaNu = gammaShRaFromMeanSD( mean=nuPriorMean , sd=nuPriorSD )$rate
    )
  } else {
    dataList = list(
      y = y ,
      x = x ,
      Ntotal = Ntotal ,
      mu1PriorMean = mu1PriorMean ,
      mu1PriorSD = mu1PriorSD ,
      mu2PriorMean = mu2PriorMean ,
      mu2PriorSD = mu2PriorSD ,
      Sh1 = gammaShRaFromModeSD( mode=sigma1PriorMode , 
                                 sd=sigma1PriorSD )$shape ,
      Ra1 = gammaShRaFromModeSD( mode=sigma1PriorMode , 
                                 sd=sigma1PriorSD )$rate ,
      Sh2 = gammaShRaFromModeSD( mode=sigma2PriorMode , 
                                 sd=sigma2PriorSD )$shape ,
      Ra2 = gammaShRaFromModeSD( mode=sigma2PriorMode , 
                                 sd=sigma2PriorSD )$rate ,
      ShNu = gammaShRaFromMeanSD( mean=nuPriorMean , sd=nuPriorSD )$shape ,
      RaNu = gammaShRaFromMeanSD( mean=nuPriorMean , sd=nuPriorSD )$rate
    )
  }
  
  #----------------------------------------------------------------------------
  # THE MODEL.
  modelString = "
  model {
    for ( i in 1:Ntotal ) {
      y[i] ~ dt( mu[x[i]] , 1/sigma[x[i]]^2 , nu )
    }
    mu[1] ~ dnorm( mu1PriorMean , 1/mu1PriorSD^2 )  # prior for mu[1]
    sigma[1] ~ dgamma( Sh1 , Ra1 )     # prior for sigma[1]
    mu[2] ~ dnorm( mu2PriorMean , 1/mu2PriorSD^2 )  # prior for mu[2]
    sigma[2] ~ dgamma( Sh2 , Ra2 )     # prior for sigma[2]
    nu ~ dgamma( ShNu , RaNu ) # prior for nu
  }
  " # close quote for modelString
  # Write out modelString to a text file
  writeLines( modelString , con="BESTmodel.txt" )
  
  #------------------------------------------------------------------------------
  # INTIALIZE THE CHAINS.
  # Initial values of MCMC chains based on data:
  mu = c( mean(y1) , mean(y2) )
  sigma = c( sd(y1) , sd(y2) )
  # Regarding initial values in next line: (1) sigma will tend to be too big if 
  # the data have outliers, and (2) nu starts at 5 as a moderate value. These
  # initial values keep the burn-in period moderate.
  initsList = list( mu = mu , sigma = sigma , nu = 5 )
  
  #------------------------------------------------------------------------------
  # RUN THE CHAINS
  
  parameters = c( "mu" , "sigma" , "nu" )     # The parameters to be monitored
  adaptSteps = 500               # Number of steps to "tune" the samplers
  burnInSteps = 1000
  
  runJagsOut <- run.jags( method=runjagsMethod , model="BESTmodel.txt" , monitor=parameters , data=dataList , inits=initsList , n.chains=nChains , adapt=adaptSteps , burnin=burnInSteps , sample=ceiling(numSavedSteps/nChains) , thin=thinSteps , summarise=FALSE , plots=FALSE ) codaSamples = as.mcmc.list( runJagsOut ) # resulting codaSamples object has these indices: # codaSamples[[ chainIdx ]][ stepIdx , paramIdx ] ## Original version, using rjags: # nChains = 3 # nIter = ceiling( ( numSavedSteps * thinSteps ) / nChains ) # # Create, initialize, and adapt the model: # jagsModel = jags.model( "BESTmodel.txt" , data=dataList , inits=initsList , # n.chains=nChains , n.adapt=adaptSteps ) # # Burn-in: # cat( "Burning in the MCMC chain...\n" ) # update( jagsModel , n.iter=burnInSteps ) # # The saved MCMC chain: # cat( "Sampling final MCMC chain...\n" ) # codaSamples = coda.samples( jagsModel , variable.names=parameters , # n.iter=nIter , thin=thinSteps ) #------------------------------------------------------------------------------ # EXAMINE THE RESULTS if ( showMCMC ) { for ( paramName in varnames(codaSamples) ) { diagMCMC( codaSamples , parName=paramName ) } } # Convert coda-object codaSamples to matrix object for easier handling. # But note that this concatenates the different chains into one long chain. # Result is mcmcChain[ stepIdx , paramIdx ] mcmcChain = as.matrix( codaSamples ) return( mcmcChain ) } # end function BESTmcmc #============================================================================== BESTsummary = function( y1 , y2 , mcmcChain ) { #source("HDIofMCMC.R") # in DBDA2E-utilities.R mcmcSummary = function( paramSampleVec , compVal=NULL ) { meanParam = mean( paramSampleVec ) medianParam = median( paramSampleVec ) dres = density( paramSampleVec ) modeParam = dres$x[which.max(dres$y)] hdiLim = HDIofMCMC( paramSampleVec ) if ( !is.null(compVal) ) { pcgtCompVal = ( 100 * sum( paramSampleVec > compVal ) 
                      / length( paramSampleVec ) )
    } else {
      pcgtCompVal=NA
    }
    return( c( meanParam , medianParam , modeParam , hdiLim , pcgtCompVal ) )
  }
  # Define matrix for storing summary info:
  summaryInfo = matrix( 0 , nrow=9 , ncol=6 , dimnames=list(
    PARAMETER=c( "mu1" , "mu2" , "muDiff" , "sigma1" , "sigma2" , "sigmaDiff" ,
                 "nu" , "nuLog10" , "effSz" ),
    SUMMARY.INFO=c( "mean" , "median" , "mode" , "HDIlow" , "HDIhigh" ,
                    "pcgtZero" ) 
  ) )
  summaryInfo[ "mu1" , ] = mcmcSummary( mcmcChain[,"mu[1]"] )
  summaryInfo[ "mu2" , ] = mcmcSummary( mcmcChain[,"mu[2]"] )
  summaryInfo[ "muDiff" , ] = mcmcSummary( mcmcChain[,"mu[1]"]
                                           - mcmcChain[,"mu[2]"] , 
                                           compVal=0 )
  summaryInfo[ "sigma1" , ] = mcmcSummary( mcmcChain[,"sigma[1]"] )
  summaryInfo[ "sigma2" , ] = mcmcSummary( mcmcChain[,"sigma[2]"] )
  summaryInfo[ "sigmaDiff" , ] = mcmcSummary( mcmcChain[,"sigma[1]"]
                                              - mcmcChain[,"sigma[2]"] , 
                                              compVal=0 )
  summaryInfo[ "nu" , ] = mcmcSummary( mcmcChain[,"nu"] )
  summaryInfo[ "nuLog10" , ] = mcmcSummary( log10(mcmcChain[,"nu"]) )
  
  N1 = length(y1)
  N2 = length(y2)
  effSzChain = ( ( mcmcChain[,"mu[1]"] - mcmcChain[,"mu[2]"] ) 
                 / sqrt( ( mcmcChain[,"sigma[1]"]^2 
                           + mcmcChain[,"sigma[2]"]^2 ) / 2 ) ) 
  summaryInfo[ "effSz" , ] = mcmcSummary( effSzChain , compVal=0 )
  # Or, use sample-size weighted version:
  # effSz = ( mu1 - mu2 ) / sqrt( ( sigma1^2 *(N1-1) + sigma2^2 *(N2-1) ) 
  #                               / (N1+N2-2) )
  # Be sure also to change plot label in BESTplot function, below.
  return( summaryInfo )
}

#==============================================================================

BESTplot = function( y1 , y2 , mcmcChain , ROPEm=NULL , ROPEsd=NULL , 
                     ROPEeff=NULL , showCurve=FALSE , pairsPlot=FALSE ) {
  # This function plots the posterior distribution (and data).
  # Description of arguments:
  # y1 and y2 are the data vectors.
  # mcmcChain is a list of the type returned by function BTT.
  # ROPEm is a two element vector, such as c(-1,1), specifying the limit
  #   of the ROPE on the difference of means.
  # ROPEsd is a two element vector, such as c(-1,1), specifying the limit
  #   of the ROPE on the difference of standard deviations.
  # ROPEeff is a two element vector, such as c(-1,1), specifying the limit
  #   of the ROPE on the effect size.
  # showCurve is TRUE or FALSE and indicates whether the posterior should
  #   be displayed as a histogram (by default) or by an approximate curve.
  # pairsPlot is TRUE or FALSE and indicates whether scatterplots of pairs
  #   of parameters should be displayed.
  mu1 = mcmcChain[,"mu[1]"]
  mu2 = mcmcChain[,"mu[2]"]
  sigma1 = mcmcChain[,"sigma[1]"]
  sigma2 = mcmcChain[,"sigma[2]"]
  nu = mcmcChain[,"nu"]
  if ( pairsPlot ) {
    # Plot the parameters pairwise, to see correlations:
    openGraph(width=7,height=7)
    nPtToPlot = 1000
    plotIdx = floor(seq(1,length(mu1),by=length(mu1)/nPtToPlot))
    panel.cor = function(x, y, digits=2, prefix="", cex.cor, ...) {
      usr = par("usr"); on.exit(par(usr))
      par(usr = c(0, 1, 0, 1))
      r = (cor(x, y))
      txt = format(c(r, 0.123456789), digits=digits)[1]
      txt = paste(prefix, txt, sep="")
      if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
      text(0.5, 0.5, txt, cex=1.25 ) # was cex=cex.cor*r
    }
    pairs( cbind( mu1 , mu2 , sigma1 , sigma2 , log10(nu) )[plotIdx,] ,
           labels=c( expression(mu[1]) , expression(mu[2]) , 
                     expression(sigma[1]) , expression(sigma[2]) , 
                     expression(log10(nu)) ) , 
           lower.panel=panel.cor , col="skyblue" )
  }
  # source("plotPost.R") # in DBDA2E-utilities.R
  # Set up window and layout:
  openGraph(width=6.0,height=8.0)
  layout( matrix( c(4,5,7,8,3,1,2,6,9,10) , nrow=5, byrow=FALSE ) )
  par( mar=c(3.5,3.5,2.5,0.5) , mgp=c(2.25,0.7,0) )
  
  # Select thinned steps in chain for plotting of posterior predictive curves:
  chainLength = NROW( mcmcChain )
  nCurvesToPlot = 30
  stepIdxVec = seq( 1 , chainLength , floor(chainLength/nCurvesToPlot) )
  xRange = range( c(y1,y2) )
  if ( isTRUE( all.equal( xRange[2] , xRange[1] ) ) ) {
    meanSigma = mean( c(sigma1,sigma2) )
    xRange = xRange + c( -meanSigma , meanSigma )
  }
  xLim = c( xRange[1]-0.1*(xRange[2]-xRange[1]) , 
            xRange[2]+0.1*(xRange[2]-xRange[1]) )
  xVec = seq( xLim[1] , xLim[2] , length=200 )
  maxY = max( dt( 0 , df=max(nu[stepIdxVec]) ) /
                min(c(sigma1[stepIdxVec],sigma2[stepIdxVec])) )
  # Plot data y1 and smattering of posterior predictive curves:
  stepIdx = 1
  plot( xVec , dt( (xVec-mu1[stepIdxVec[stepIdx]])/sigma1[stepIdxVec[stepIdx]] , 
                   df=nu[stepIdxVec[stepIdx]] )/sigma1[stepIdxVec[stepIdx]] , 
        ylim=c(0,maxY) , cex.lab=1.75 ,
        type="l" , col="skyblue" , lwd=1 , xlab="y" , ylab="p(y)" , 
        main="Data Group 1 w. Post. Pred." )
  for ( stepIdx in 2:length(stepIdxVec) ) {
    lines(xVec, dt( (xVec-mu1[stepIdxVec[stepIdx]])/sigma1[stepIdxVec[stepIdx]] , 
                    df=nu[stepIdxVec[stepIdx]] )/sigma1[stepIdxVec[stepIdx]] , 
          type="l" , col="skyblue" , lwd=1 )
  }
  histBinWd = median(sigma1)/2
  histCenter = mean(mu1)
  histBreaks = sort( c( seq( histCenter-histBinWd/2 , min(xVec)-histBinWd/2 ,
                             -histBinWd ),
                        seq( histCenter+histBinWd/2 , max(xVec)+histBinWd/2 ,
                             histBinWd ) , xLim ) )
  histInfo = hist( y1 , plot=FALSE , breaks=histBreaks )
  yPlotVec = histInfo$density 
  yPlotVec[ yPlotVec==0.0 ] = NA
  xPlotVec = histInfo$mids
  xPlotVec[ yPlotVec==0.0 ] = NA
  points( xPlotVec , yPlotVec , type="h" , lwd=3 , col="red" )
  text( max(xVec) , maxY , bquote(N[1]==.(length(y1))) , adj=c(1.1,1.1) )
  # Plot data y2 and smattering of posterior predictive curves:
  stepIdx = 1
  plot( xVec , dt( (xVec-mu2[stepIdxVec[stepIdx]])/sigma2[stepIdxVec[stepIdx]] , 
                   df=nu[stepIdxVec[stepIdx]] )/sigma2[stepIdxVec[stepIdx]] , 
        ylim=c(0,maxY) , cex.lab=1.75 , 
        type="l" , col="skyblue" , lwd=1 , xlab="y" , ylab="p(y)" , 
        main="Data Group 2 w. Post. Pred." )
  for ( stepIdx in 2:length(stepIdxVec) ) {
    lines(xVec, dt( (xVec-mu2[stepIdxVec[stepIdx]])/sigma2[stepIdxVec[stepIdx]] , 
                    df=nu[stepIdxVec[stepIdx]] )/sigma2[stepIdxVec[stepIdx]] , 
          type="l" , col="skyblue" , lwd=1 )
  }
  histBinWd = median(sigma2)/2
  histCenter = mean(mu2)
  histBreaks = sort( c( seq( histCenter-histBinWd/2 , min(xVec)-histBinWd/2 ,
                             -histBinWd ),
                        seq( histCenter+histBinWd/2 , max(xVec)+histBinWd/2 ,
                             histBinWd ) , xLim ) )
  histInfo = hist( y2 , plot=FALSE , breaks=histBreaks )
  yPlotVec = histInfo$density 
  yPlotVec[ yPlotVec==0.0 ] = NA
  xPlotVec = histInfo$mids
  xPlotVec[ yPlotVec==0.0 ] = NA
  points( xPlotVec , yPlotVec , type="h" , lwd=3 , col="red" )
  text( max(xVec) , maxY , bquote(N[2]==.(length(y2))) , adj=c(1.1,1.1) )
  
  # Plot posterior distribution of parameter nu:
  histInfo = plotPost( log10(nu) , col="skyblue" , # breaks=30 ,
                       showCurve=showCurve ,
                       xlab=bquote("log10("*nu*")") , cex.lab = 1.75 , 
                       cenTend=c("mode","median","mean")[1] ,
                       main="Normality" ) #  (<0.7 suggests kurtosis)
  
  # Plot posterior distribution of parameters mu1, mu2, and their difference:
  xlim = range( c( mu1 , mu2 ) )
  histInfo = plotPost( mu1 ,  xlim=xlim , cex.lab = 1.75 ,
                       showCurve=showCurve ,
                       xlab=bquote(mu[1]) , main=paste("Group",1,"Mean") , 
                       col="skyblue" )
  histInfo = plotPost( mu2 ,  xlim=xlim , cex.lab = 1.75 ,
                       showCurve=showCurve ,
                       xlab=bquote(mu[2]) , main=paste("Group",2,"Mean") , 
                       col="skyblue" )
  histInfo = plotPost( mu1-mu2 , compVal=0 ,  showCurve=showCurve ,
                       xlab=bquote(mu[1] - mu[2]) , cex.lab = 1.75 , ROPE=ROPEm ,
                       main="Difference of Means" , col="skyblue" )
  
  # Plot posterior distribution of param's sigma1, sigma2, and their difference:
  xlim=range( c( sigma1 , sigma2 ) )
  histInfo = plotPost( sigma1 ,  xlim=xlim , cex.lab = 1.75 ,
                       showCurve=showCurve ,
                       xlab=bquote(sigma[1]) , main=paste("Group",1,"Std. Dev.") , 
                       col="skyblue" , cenTend=c("mode","median","mean")[1] )
  histInfo = plotPost( sigma2 ,  xlim=xlim , cex.lab = 1.75 ,
                       showCurve=showCurve ,
                       xlab=bquote(sigma[2]) , main=paste("Group",2,"Std. Dev.") , 
                       col="skyblue" , cenTend=c("mode","median","mean")[1] )
  histInfo = plotPost( sigma1-sigma2 , 
                       compVal=0 ,  showCurve=showCurve ,
                       xlab=bquote(sigma[1] - sigma[2]) , cex.lab = 1.75 , 
                       ROPE=ROPEsd ,
                       main="Difference of Std. Dev.s" , col="skyblue" , 
                       cenTend=c("mode","median","mean")[1] )
  
  # Plot of estimated effect size. Effect size is d-sub-a from 
  # Macmillan & Creelman, 1991; Simpson & Fitter, 1973; Swets, 1986a, 1986b.
  effectSize = ( mu1 - mu2 ) / sqrt( ( sigma1^2 + sigma2^2 ) / 2 )
  histInfo = plotPost( effectSize , compVal=0 ,  ROPE=ROPEeff ,
                       showCurve=showCurve ,
                       xlab=bquote( (mu[1]-mu[2])
                                    /sqrt((sigma[1]^2 +sigma[2]^2 )/2 ) ),
                       cenTend=c("mode","median","mean")[1] , cex.lab=1.0 , main="Effect Size" , 
                       col="skyblue" )
  # Or use sample-size weighted version:
  # Hedges 1981; Wetzels, Raaijmakers, Jakab & Wagenmakers 2009.
  # N1 = length(y1)
  # N2 = length(y2)
  # effectSize = ( mu1 - mu2 ) / sqrt( ( sigma1^2 *(N1-1) + sigma2^2 *(N2-1) )
  #                                    / (N1+N2-2) )
  # Be sure also to change BESTsummary function, above.
  # histInfo = plotPost( effectSize , compVal=0 ,  ROPE=ROPEeff ,
  #          showCurve=showCurve ,
  #          xlab=bquote( (mu[1]-mu[2])
  #          /sqrt((sigma[1]^2 *(N[1]-1)+sigma[2]^2 *(N[2]-1))/(N[1]+N[2]-2)) ),
  #          cenTend=c("mode","median","mean")[1] , cex.lab=1.0 , main="Effect Size" , col="skyblue" )
  return( BESTsummary( y1 , y2 , mcmcChain ) )
} # end of function BESTplot

#==============================================================================

BESTpower = function( mcmcChain , N1 , N2 , ROPEm , ROPEsd , ROPEeff ,
                      maxHDIWm , maxHDIWsd , maxHDIWeff , nRep=200 ,
                      mcmcLength=10000 , saveName="BESTpower.Rdata" ,
                      showFirstNrep=0 ) {
  # This function estimates power.
  # Description of arguments:
  # mcmcChain is a matrix of the type returned by function BESTmcmc.
  # N1 and N2 are sample sizes for the two groups.
  # ROPEm is a two element vector, such as c(-1,1), specifying the limit
  #   of the ROPE on the difference of means.
  # ROPEsd is a two element vector, such as c(-1,1), specifying the limit
  #   of the ROPE on the difference of standard deviations.
  # ROPEeff is a two element vector, such as c(-1,1), specifying the limit
  #   of the ROPE on the effect size.
  # maxHDIWm is the maximum desired width of the 95% HDI on the difference 
  #   of means.
  # maxHDIWsd is the maximum desired width of the 95% HDI on the difference 
  #   of standard deviations.
  # maxHDIWeff is the maximum desired width of the 95% HDI on the effect size.
  # nRep is the number of simulated experiments used to estimate the power.
  #source("HDIofMCMC.R") # in DBDA2E-utilities.R
  #source("HDIofICDF.R") # in DBDA2E-utilities.R
  chainLength = NROW( mcmcChain )
  # Select thinned steps in chain for posterior predictions:
  stepIdxVec = seq( 1 , chainLength , floor(chainLength/nRep) )
  goalTally = list( HDIm.gt.ROPE = c(0) ,
                    HDIm.lt.ROPE = c(0) ,
                    HDIm.in.ROPE = c(0) ,
                    HDIm.wd.max = c(0) ,
                    HDIsd.gt.ROPE = c(0) ,
                    HDIsd.lt.ROPE = c(0) ,
                    HDIsd.in.ROPE = c(0) ,
                    HDIsd.wd.max = c(0) ,
                    HDIeff.gt.ROPE = c(0) ,
                    HDIeff.lt.ROPE = c(0) ,
                    HDIeff.in.ROPE = c(0) ,
                    HDIeff.wd.max = c(0) )
  power     = list( HDIm.gt.ROPE = c(0,0,0) ,
                    HDIm.lt.ROPE = c(0,0,0) ,
                    HDIm.in.ROPE = c(0,0,0) ,
                    HDIm.wd.max = c(0,0,0) ,
                    HDIsd.gt.ROPE = c(0,0,0) ,
                    HDIsd.lt.ROPE = c(0,0,0) ,
                    HDIsd.in.ROPE = c(0,0,0) ,
                    HDIsd.wd.max = c(0,0,0) ,
                    HDIeff.gt.ROPE = c(0,0,0) ,
                    HDIeff.lt.ROPE = c(0,0,0) ,
                    HDIeff.in.ROPE = c(0,0,0) ,
                    HDIeff.wd.max = c(0,0,0) )
  nSim = 0
  for ( stepIdx in stepIdxVec ) {
    nSim = nSim + 1
    cat( "\n:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::\n" )
    cat( paste( "Power computation: Simulated Experiment" , nSim , "of" , 
                length(stepIdxVec) , ":\n\n" ) )
    # Get parameter values for this simulation:
    mu1Val = mcmcChain[stepIdx,"mu[1]"]
    mu2Val = mcmcChain[stepIdx,"mu[2]"]
    sigma1Val = mcmcChain[stepIdx,"sigma[1]"]
    sigma2Val = mcmcChain[stepIdx,"sigma[2]"]
    nuVal = mcmcChain[stepIdx,"nu"]
    # Generate simulated data:
    y1 = rt( N1 , df=nuVal ) * sigma1Val + mu1Val    
    y2 = rt( N2 , df=nuVal ) * sigma2Val + mu2Val    
    # Get posterior for simulated data:
    simChain = BESTmcmc( y1, y2, numSavedSteps=mcmcLength, thinSteps=1, 
                         showMCMC=FALSE )
    if (nSim <= showFirstNrep ) { BESTplot( y1, y2, simChain, ROPEm=ROPEm, ROPEsd=ROPEsd, ROPEeff=ROPEeff ) } # Assess which goals were achieved and tally them: HDIm = HDIofMCMC( simChain[,"mu[1]"] - simChain[,"mu[2]"] ) if ( HDIm[1] > ROPEm[2] ) { 
      goalTally$HDIm.gt.ROPE = goalTally$HDIm.gt.ROPE + 1 }
    if ( HDIm[2] < ROPEm[1] ) { goalTally$HDIm.lt.ROPE = goalTally$HDIm.lt.ROPE + 1 } if ( HDIm[1] > ROPEm[1] &  HDIm[2] < ROPEm[2] ) { 
      goalTally$HDIm.in.ROPE = goalTally$HDIm.in.ROPE + 1 } 
    if ( HDIm[2] - HDIm[1] < maxHDIWm ) { goalTally$HDIm.wd.max = goalTally$HDIm.wd.max + 1 } HDIsd = HDIofMCMC( simChain[,"sigma[1]"] - simChain[,"sigma[2]"] ) if ( HDIsd[1] > ROPEsd[2] ) { 
      goalTally$HDIsd.gt.ROPE = goalTally$HDIsd.gt.ROPE + 1 }
    if ( HDIsd[2] < ROPEsd[1] ) { goalTally$HDIsd.lt.ROPE = goalTally$HDIsd.lt.ROPE + 1 } if ( HDIsd[1] > ROPEsd[1] &  HDIsd[2] < ROPEsd[2] ) { 
      goalTally$HDIsd.in.ROPE = goalTally$HDIsd.in.ROPE + 1 } 
    if ( HDIsd[2] - HDIsd[1] < maxHDIWsd ) { goalTally$HDIsd.wd.max = goalTally$HDIsd.wd.max + 1 } HDIeff = HDIofMCMC( ( simChain[,"mu[1]"] - simChain[,"mu[2]"] ) / sqrt( ( simChain[,"sigma[1]"]^2 + simChain[,"sigma[2]"]^2 ) / 2 ) ) if ( HDIeff[1] > ROPEeff[2] ) { 
      goalTally$HDIeff.gt.ROPE = goalTally$HDIeff.gt.ROPE + 1 }
    if ( HDIeff[2] < ROPEeff[1] ) { goalTally$HDIeff.lt.ROPE = goalTally$HDIeff.lt.ROPE + 1 } if ( HDIeff[1] > ROPEeff[1] &  HDIeff[2] < ROPEeff[2] ) { 
      goalTally$HDIeff.in.ROPE = goalTally$HDIeff.in.ROPE + 1 } 
    if ( HDIeff[2] - HDIeff[1] < maxHDIWeff ) { goalTally$HDIeff.wd.max = goalTally$HDIeff.wd.max + 1 } for ( i in 1:length(goalTally) ) { s1 = 1 + goalTally[[i]] s2 = 1 + ( nSim - goalTally[[i]] ) power[[i]][1] = s1/(s1+s2) power[[i]][2:3] = HDIofICDF( qbeta , shape1=s1 , shape2=s2 ) } cat( "\nAfter ", nSim , " Simulated Experiments, Power Is: \n( mean, HDI_low, HDI_high )\n" ) for ( i in 1:length(power) ) { cat( names(power[i]) , round(power[[i]],3) , "\n" ) } save( nSim , power , file=saveName ) } return( power ) } # end of function BESTpower #============================================================================== makeData = function( mu1 , sd1 , mu2 , sd2 , nPerGrp , pcntOut=0 , sdOutMult=2.0 , rnd.seed=NULL ) { # Auxilliary function for generating random values from a # mixture of normal distibutions. if(!is.null(rnd.seed)){set.seed(rnd.seed)} # Set seed for random values. nOut = ceiling(nPerGrp*pcntOut/100) # Number of outliers. nIn = nPerGrp - nOut # Number from main distribution. if ( pcntOut > 100 | pcntOut < 0 ) { stop("pcntOut must be between 0 and 100.") } if ( pcntOut > 0 & pcntOut < 1 ) { warning("pcntOut is specified as percentage 0-100, not proportion 0-1.") } if ( pcntOut > 50 ) {
    warning("pcntOut indicates more than 50% outliers; did you intend this?")
  }
  if ( nOut < 2 & pcntOut > 0 ) {
    stop("Combination of nPerGrp and pcntOut yields too few outliers.")
  }
  if ( nIn < 2 ) { stop("Too few non-outliers.") } sdN = function( x ) { sqrt( mean( (x-mean(x))^2 ) ) } # genDat = function( mu , sigma , Nin , Nout , sigmaOutMult=sdOutMult ) { y = rnorm(n=Nin) # Random values in main distribution. y = ((y-mean(y))/sdN(y))*sigma + mu # Translate to exactly realize desired. yOut = NULL if ( Nout > 0 ) {
      yOut = rnorm(n=Nout)                 # Random values for outliers
      yOut=((yOut-mean(yOut))/sdN(yOut))*(sigma*sdOutMult)+mu # Realize exactly.
    }
    y = c(y,yOut)                      # Concatenate main with outliers.
    return(y)
  }
  y1 = genDat( mu=mu1 , sigma=sd1 , Nin=nIn , Nout=nOut )
  y2 = genDat( mu=mu2 , sigma=sd2 , Nin=nIn , Nout=nOut )
  #
  # Set up window and layout:
  openGraph(width=7,height=7) # Plot the data.
  layout(matrix(1:2,nrow=2))
  histInfo = hist( y1 , main="Simulated Data" , col="pink2" , border="white" , 
                   xlim=range(c(y1,y2)) , breaks=30 , prob=TRUE )
  text( max(c(y1,y2)) , max(histInfo$density) , 
        bquote(N==.(nPerGrp)) , adj=c(1,1) )
  xlow=min(histInfo$breaks)
  xhi=max(histInfo$breaks)
  xcomb=seq(xlow,xhi,length=1001)
  lines( xcomb , dnorm(xcomb,mean=mu1,sd=sd1)*nIn/(nIn+nOut) +
           dnorm(xcomb,mean=mu1,sd=sd1*sdOutMult)*nOut/(nIn+nOut) , lwd=3 )
  lines( xcomb , dnorm(xcomb,mean=mu1,sd=sd1)*nIn/(nIn+nOut) , 
         lty="dashed" , col="green", lwd=3)
  lines( xcomb , dnorm(xcomb,mean=mu1,sd=sd1*sdOutMult)*nOut/(nIn+nOut) , 
         lty="dashed" , col="red", lwd=3)
  histInfo = hist( y2 , main="" , col="pink2" , border="white" , 
                   xlim=range(c(y1,y2)) , breaks=30 , prob=TRUE )
  text( max(c(y1,y2)) , max(histInfo$density) , 
        bquote(N==.(nPerGrp)) , adj=c(1,1) )
  xlow=min(histInfo$breaks)
  xhi=max(histInfo$breaks)
  xcomb=seq(xlow,xhi,length=1001)
  lines( xcomb , dnorm(xcomb,mean=mu2,sd=sd2)*nIn/(nIn+nOut) +
           dnorm(xcomb,mean=mu2,sd=sd2*sdOutMult)*nOut/(nIn+nOut) , lwd=3)
  lines( xcomb , dnorm(xcomb,mean=mu2,sd=sd2)*nIn/(nIn+nOut) , 
         lty="dashed" , col="green", lwd=3)
  lines( xcomb , dnorm(xcomb,mean=mu2,sd=sd2*sdOutMult)*nOut/(nIn+nOut) , 
         lty="dashed" , col="red", lwd=3)
  #
  return( list( y1=y1 , y2=y2 ) )
}

#==============================================================================

# Version of May 26, 2012. Re-checked on 2015 May 08.
# John K. Kruschke  
# johnkruschke@gmail.com
# http://www.indiana.edu/~kruschke/BEST/
#
# This program is believed to be free of errors, but it comes with no guarantee! 
# The user bears all responsibility for interpreting the results. 
# Please check the webpage above for updates or corrections.
#
### ***************************************************************
### ******** SEE FILE BESTexample.R FOR INSTRUCTIONS **************
### ***************************************************************

# OPTIONAL: Clear R's memory and graphics:
rm(list=ls())  # Careful! This clears all of R's memory!
graphics.off() # This closes all of R's graphics windows.

# Get the functions loaded into R's working memory:
source("BEST.R")

#-------------------------------------------------------------------------------
# RETROSPECTIVE POWER ANALYSIS.
# !! This section assumes you have already run BESTexample.R !!
# Re-load the saved data and MCMC chain from the previously conducted 
# Bayesian analysis. This re-loads the variables y1, y2, mcmcChain, etc.
load( "BESTexampleMCMC.Rdata" )
power = BESTpower( mcmcChain , N1=length(y1) , N2=length(y2) , 
                  ROPEm=c(-0.1,0.1) , ROPEsd=c(-0.1,0.1) , ROPEeff=c(-0.1,0.1) , 
                  maxHDIWm=2.0 , maxHDIWsd=2.0 , maxHDIWeff=0.2 , nRep=1000 ,
                  mcmcLength=10000 , saveName = "BESTexampleRetroPower.Rdata" ) 

#-------------------------------------------------------------------------------
# PROSPECTIVE POWER ANALYSIS, using fictitious strong data.
# Generate large fictitious data set that expresses hypothesis:
prospectData = makeData( mu1=108, sd1=17, mu2=100, sd2=15, nPerGrp=1000, 
                         pcntOut=10, sdOutMult=2.0, rnd.seed=NULL )
y1pro = prospectData$y1 # Merely renames simulated data for convenience below.
y2pro = prospectData$y2 # Merely renames simulated data for convenience below.
# Generate Bayesian posterior distribution from fictitious data:
# (uses fewer than usual MCMC steps because it only needs nRep credible 
# parameter combinations, not a high-resolution representation)
mcmcChainPro = BESTmcmc( y1pro , y2pro , numSavedSteps=2000 )  
postInfoPro = BESTplot( y1pro , y2pro , mcmcChainPro , pairsPlot=TRUE )  
save( y1pro, y2pro, mcmcChainPro, postInfoPro, 
      file="BESTexampleProPowerMCMC.Rdata" )
# Now compute the prospective power for planned sample sizes:
N1plan = N2plan = 50 # specify planned sample size
powerPro = BESTpower( mcmcChainPro , N1=N1plan , N2=N2plan , showFirstNrep=5 ,
                   ROPEm=c(-1.5,1.5) , ROPEsd=c(-0.0,0.0) , ROPEeff=c(-0.0,0.0) , 
                   maxHDIWm=15.0 , maxHDIWsd=10.0 , maxHDIWeff=1.0 , nRep=1000 ,
                   mcmcLength=10000 , saveName = "BESTexampleProPower.Rdata" ) 

#-------------------------------------------------------------------------------


Various plots
BeanplotsDensity and Rug plotDirac function - Interdisciplinarity versus OverspecialisationGoogle TrendsJASP prior & posterior plot
#https://cran.r-project.org/web/packages/beanplot/beanplot.pdf
par( mfrow = c( 1, 4 ) )
with(dataexp2, beanplot(v00, ylim = c(0,10), col="lightgray", main = "v00", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))
with(dataexp2, beanplot(v01, ylim = c(0,10), col="lightgray", main = "v01", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))
with(dataexp2, beanplot(v10, ylim = c(0,10), col="darkgray", main = "v10", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))
with(dataexp2, beanplot(v11, ylim = c(0,10), col="darkgray", main = "v11", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))


#with ylim and titles
with(dataexp2, beanplot(v00, ylab = "VAS Brightness rating", xlab = "beanplot showing distribution and mean", 
ylim = c(0,10), col="darkgray", main = "v00", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, 
overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))




install.packages ("beanplot")
library("beanplot")
par( mfrow = c( 1, 2 ) )
with(dataexp2, beanplot(v00, col="darkgray", main = "v00", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))
with(dataexp2, beanplot(v01, col="darkgray", main = "v01", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))
par( mfrow = c( 1, 2 ) )
with(dataexp2, beanplot(v10, col="darkgray", main = "v10", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))
with(dataexp2, beanplot(v11, col="darkgray", main = "v11", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))

stripchart(variable ~ factor, vertical=TRUE, method="stack", 
  ylab="variable", data=StackedData)
  
#two sided
with(dataexp1, beanplot(v00, v10, col="darkgray", main = "v00 vs. v10", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "both", jitter = NULL, beanlinewd = 2))

#verticalpar( mfrow = c( 2, 1 ) )
install.packages ("beanplot")
library("beanplot")
par( mfrow = c( 2, 1 ) ) # layout matrix

with(dataexp1, beanplot(v00, v10, main = "v00 vs. v10", kernel = "gaussian", cut = 3, col = list("black", c("grey", "white")), xlab="VAS",
cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = T, side = "both", jitter = NULL, beanlinewd = 2))

with(dataexp1, beanplot(v01, v11, main = "v01 vs. v11", kernel = "gaussian", cut = 3, col = list("black", c("grey", "white")), xlab="VAS",
cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = T, side = "both", jitter = NULL, beanlinewd = 2))


#exp1
dataexp1 <- 
  read.table("http://www.irrational-decisions.com/phd-thesis/dataexp1.csv",
   header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)
   
par( mfrow = c( 1, 2) )
with(dataexp1, beanplot(v00, ylim = c(0,7), col="lightgray", main = "v00", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))
with(dataexp1, beanplot(v10, ylim = c(0,7), col="lightgray", main = "v01", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))

par( mfrow = c( 1, 2) )
with(dataexp1, beanplot(v01, ylim = c(3,10), col="darkgray", main = "v10", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))
with(dataexp1, beanplot(v11, ylim = c(3,10), col="darkgray", main = "v11", kernel = "gaussian", cut = 3, cutmin = -Inf, cutmax = Inf, overallline = "mean", horizontal = FALSE, side = "no", jitter = NULL, beanlinewd = 2))


#histogram and rug plot

plot(density(dataexp2$v00))
rug(dataexp2$v00, col=2, lwd=3.5)

##############

library(MASS)
plot(density(dataexp2$v00, bw=5)) 
rug(dataexp2$v00+ rnorm(length(dataexp2$v00), sd=5), col=2, lwd=3.5)

x = c(5)
y   = c(9)
plot(x,y,type="h",xlim=c(0,10),ylim=c(0,10),lwd=2,col="blue",ylab="knowledge", xlab="diversity")

install.packages("gtrendsR")
library(gtrendsR)
gtrends(c("bayesian inference", "p-value"), time = "all") # since 2004

res <- gtrends(c("Bayesian inference", "Markov Chain Monte Carlo", "ANOVA"), geo = c("US", "GB", "DE", "CA",))

res <- gtrends("Markov Chain Monte Carlo", geo = c("US", "GB", "DE"))

plot(res)

#get vars

length(res)
str(res)
res$interest_over_time
ff<-res$interest_over_time

scatterplot(ff$hits~ff$date| geo, reg.line=FALSE, smooth=FALSE, spread=FALSE, boxplots=FALSE, span=0.5, ellipse=FALSE, levels=c(.5, .9), by.groups=TRUE, data=ff)

res2 <- gtrends(c("p-value", "bayes factor"))

library(plotrix)

.likelihoodShiftedT <- function(par, data) {
    
    -sum(log(dt((data - par[1])/par[2], par[3])/par[2]))
    
}

.dposteriorShiftedT <- function(x, parameters, oneSided) { # function that returns the posterior density if (oneSided == FALSE) { dt((x - parameters[1])/parameters[2], parameters[3])/parameters[2] } else if (oneSided == "right") { ifelse(x >= 0, (dt((x - parameters[1])/parameters[2], parameters[3])/parameters[2])/pt((0 - 
            parameters[1])/parameters[2], parameters[3], lower.tail = FALSE), 
            0)
        
    } else if (oneSided == "left") {
        
        ifelse(x <= 0, (dt((x - parameters[1])/parameters[2], parameters[3])/parameters[2])/pt((0 - 
            parameters[1])/parameters[2], parameters[3], lower.tail = TRUE), 
            0)
        
    }
    
}

.dprior <- function(x, r, oneSided = oneSided) {
    
    # function that returns the prior density
    
    if (oneSided == "right") {
        
        y <- ifelse(x < 0, 0, 2/(pi * r * (1 + (x/r)^2)))
        return(y)
    }
    
    if (oneSided == "left") {
        
        y <- ifelse(x > 0, 0, 2/(pi * r * (1 + (x/r)^2)))
        return(y)
    } else {
        
        return(1/(pi * r * (1 + (x/r)^2)))
    }
}

.plotPosterior.ttest <- function(x = NULL, y = NULL, paired = FALSE, oneSided = FALSE, 
    BF, BFH1H0 = TRUE, iterations = 10000, rscale = "medium", lwd = 2, 
    cexPoints = 1.5, cexAxis = 1.2, cexYlab = 1.5, cexXlab = 1.5, cexTextBF = 1.4, 
    cexCI = 1.1, cexLegend = 1.2, lwdAxis = 1.2, addInformation = TRUE, 
    dontPlotData = FALSE) {
    
    if (addInformation) {
        
        par(mar = c(5.6, 5, 7, 4) + 0.1, las = 1)
        
    } else {
        
        par(mar = c(5.6, 5, 4, 4) + 0.1, las = 1)
    }
    
    if (dontPlotData) {
        
        plot(1, type = "n", xlim = 0:1, ylim = 0:1, bty = "n", axes = FALSE, 
            xlab = "", ylab = "")
        
        axis(1, at = 0:1, labels = FALSE, cex.axis = cexAxis, lwd = lwdAxis, 
            xlab = "")
        axis(2, at = 0:1, labels = FALSE, cex.axis = cexAxis, lwd = lwdAxis, 
            ylab = "")
        
        mtext(text = "Density", side = 2, las = 0, cex = cexYlab, line = 3.25)
        mtext(expression(paste("Effect size", ~delta)), side = 1, cex = cexXlab, 
            line = 2.5)
        
        return()
    }
    
    if (rscale == "medium") {
        r <- sqrt(2)/2
    }
    if (rscale == "wide") {
        r <- 1
    }
    if (rscale == "ultrawide") {
        r <- sqrt(2)
    }
    if (mode(rscale) == "numeric") {
        r <- rscale
    }
    
    if (oneSided == FALSE) {
        nullInterval <- NULL
    }
    if (oneSided == "right") {
        nullInterval <- c(0, Inf)
    }
    if (oneSided == "left") {
        nullInterval <- c(-Inf, 0)
    }
    
    # sample from delta posterior
    samples <- BayesFactor::ttestBF(x = x, y = y, paired = paired, posterior = TRUE, 
        iterations = iterations, rscale = r)
    
    delta <- samples[, "delta"]
    
    # fit shifted t distribution
    if (is.null(y)) {
        
        deltaHat <- mean(x)/sd(x)
        N <- length(x)
        df <- N - 1
        sigmaStart <- 1/N
        
    } else if (paired) {
        
        deltaHat <- mean(x - y)/sd(x - y)
        N <- length(x)
        df <- N - 1
        sigmaStart <- 1/N
        
    } else if (!is.null(y) && !paired) {
        
        N1 <- length(x)
        N2 <- length(y)
        sdPooled <- sqrt(((N1 - 1) * var(x) + (N2 - 1) * var(y))/(N1 + 
            N2))
        deltaHat <- (mean(x) - mean(y))/sdPooled
        df <- N1 + N2 - 2
        sigmaStart <- sqrt(N1 * N2/(N1 + N2))
    }
    
    if (sigmaStart < 0.01) 
        sigmaStart <- 0.01
    
    parameters <- try(silent = TRUE, expr = optim(par = c(deltaHat, sigmaStart, 
        df), fn = .likelihoodShiftedT, data = delta, method = "BFGS")$par)
    
    if (class(parameters) == "try-error") {
        
        parameters <- try(silent = TRUE, expr = optim(par = c(deltaHat, 
            sigmaStart, df), fn = .likelihoodShiftedT, data = delta, method = "Nelder-Mead")$par)
    }
    
    if (BFH1H0) {
        
        BF10 <- BF
        BF01 <- 1/BF10
        
    } else {
        
        BF01 <- BF
        BF10 <- 1/BF01
    }
    
    
    # set limits plot
    xlim <- vector("numeric", 2)
    
    if (oneSided == FALSE) {
        
        xlim[1] <- min(-2, quantile(delta, probs = 0.01)[[1]])
        xlim[2] <- max(2, quantile(delta, probs = 0.99)[[1]])
        
        if (length(x) < 10) {
            
            if (addInformation) {
                
                stretch <- 1.52
            } else {
                
                stretch <- 1.4
            }
            
        } else {
            
            stretch <- 1.2 } } if (oneSided == "right") { # if (length(delta[delta >= 0]) < 10) return('Plotting is not
        # possible: To few posterior samples in tested interval')
        
        xlim[1] <- min(-2, quantile(delta[delta >= 0], probs = 0.01)[[1]])
        xlim[2] <- max(2, quantile(delta[delta >= 0], probs = 0.99)[[1]])
        
        if (any(is.na(xlim))) {
            
            xlim[1] <- min(-2, .qShiftedT(0.01, parameters, oneSided = "right"))
            xlim[2] <- max(2, .qShiftedT(0.99, parameters, oneSided = "right"))
            
        }
        
        stretch <- 1.32
    }
    
    if (oneSided == "left") {
        
        # if (length(delta[delta <= 0]) < 10) return('Plotting is not
        # possible: To few posterior samples in tested interval')
        
        xlim[1] <- min(-2, quantile(delta[delta <= 0], probs = 0.01)[[1]])
        xlim[2] <- max(2, quantile(delta[delta <= 0], probs = 0.99)[[1]])
        
        if (any(is.na(xlim))) {
            
            xlim[1] <- min(-2, .qShiftedT(0.01, parameters, oneSided = "left"))
            xlim[2] <- max(2, .qShiftedT(0.99, parameters, oneSided = "left"))
            
        }
        
        stretch <- 1.32
    }
    
    xticks <- pretty(xlim)
    
    ylim <- vector("numeric", 2)
    
    ylim[1] <- 0
    dmax <- optimize(function(x) .dposteriorShiftedT(x, parameters = parameters, 
        oneSided = oneSided), interval = range(xticks), maximum = TRUE)$objective
    ylim[2] <- max(stretch * .dprior(0, r, oneSided = oneSided), stretch * 
        dmax)  # get maximum density
    
    # calculate position of 'nice' tick marks and create labels
    yticks <- pretty(ylim)
    xlabels <- formatC(xticks, 1, format = "f")
    ylabels <- formatC(yticks, 1, format = "f")
    
    # compute 95% credible interval & median:
    if (oneSided == FALSE) {
        
        CIlow <- quantile(delta, probs = 0.025)[[1]]
        CIhigh <- quantile(delta, probs = 0.975)[[1]]
        medianPosterior <- median(delta)
        
        if (any(is.na(c(CIlow, CIhigh, medianPosterior)))) {
            
            CIlow <- .qShiftedT(0.025, parameters, oneSided = FALSE)
            CIhigh <- .qShiftedT(0.975, parameters, oneSided = FALSE)
            medianPosterior <- .qShiftedT(0.5, parameters, oneSided = FALSE)
        }
    }
    
    if (oneSided == "right") {
        
        CIlow <- quantile(delta[delta >= 0], probs = 0.025)[[1]]
        CIhigh <- quantile(delta[delta >= 0], probs = 0.975)[[1]]
        medianPosterior <- median(delta[delta >= 0])
        
        if (any(is.na(c(CIlow, CIhigh, medianPosterior)))) {
            
            CIlow <- .qShiftedT(0.025, parameters, oneSided = "right")
            CIhigh <- .qShiftedT(0.975, parameters, oneSided = "right")
            medianPosterior <- .qShiftedT(0.5, parameters, oneSided = "right")
        }
    }
    
    if (oneSided == "left") {
        
        CIlow <- quantile(delta[delta <= 0], probs = 0.025)[[1]]
        CIhigh <- quantile(delta[delta <= 0], probs = 0.975)[[1]]
        medianPosterior <- median(delta[delta <= 0])
        
        if (any(is.na(c(CIlow, CIhigh, medianPosterior)))) {
            
            CIlow <- .qShiftedT(0.025, parameters, oneSided = "left")
            CIhigh <- .qShiftedT(0.975, parameters, oneSided = "left")
            medianPosterior <- .qShiftedT(0.5, parameters, oneSided = "left")
        }
        
    }
    
    posteriorLine <- .dposteriorShiftedT(x = seq(min(xticks), max(xticks), 
        length.out = 1000), parameters = parameters, oneSided = oneSided)
    
    xlim <- c(min(CIlow, range(xticks)[1]), max(range(xticks)[2], CIhigh)) plot(1, 1, xlim = xlim, ylim = range(yticks), ylab = "", xlab = "", type = "n", axes = FALSE) lines(seq(min(xticks), max(xticks), length.out = 1000), posteriorLine, lwd = lwd) lines(seq(min(xticks), max(xticks), length.out = 1000), .dprior(seq(min(xticks), max(xticks), length.out = 1000), r = r, oneSided = oneSided), lwd = lwd, lty = 3) axis(1, at = xticks, labels = xlabels, cex.axis = cexAxis, lwd = lwdAxis) axis(2, at = yticks, labels = ylabels, , cex.axis = cexAxis, lwd = lwdAxis) if (nchar(ylabels[length(ylabels)]) > 4) {
        
        mtext(text = "Density", side = 2, las = 0, cex = cexYlab, line = 4)
        
    } else if (nchar(ylabels[length(ylabels)]) == 4) {
        
        mtext(text = "Density", side = 2, las = 0, cex = cexYlab, line = 3.25)
        
    } else if (nchar(ylabels[length(ylabels)]) < 4) {
        
        mtext(text = "Density", side = 2, las = 0, cex = cexYlab, line = 2.85)
        
    }
    
    mtext(expression(paste("Effect size", ~delta)), side = 1, cex = cexXlab, 
        line = 2.5)
    
    points(0, .dprior(0, r, oneSided = oneSided), col = "black", pch = 21, 
        bg = "grey", cex = cexPoints)
    
    if (oneSided == FALSE) {
        
        heightPosteriorAtZero <- .dposteriorShiftedT(0, parameters = parameters, 
            oneSided = oneSided)
        
    } else if (oneSided == "right") {
        
        posteriorLineLargerZero <- posteriorLine[posteriorLine > 0]
        heightPosteriorAtZero <- posteriorLineLargerZero[1]
        
    } else if (oneSided == "left") {
        
        posteriorLineLargerZero <- posteriorLine[posteriorLine > 0]
        heightPosteriorAtZero <- posteriorLineLargerZero[length(posteriorLineLargerZero)]
    }
    
    points(0, heightPosteriorAtZero, col = "black", pch = 21, bg = "grey", 
        cex = cexPoints)
    
    ### 95% credible interval
    
    # enable plotting in margin
    par(xpd = TRUE)
    
    yCI <- grconvertY(dmax, "user", "ndc") + 0.04
    yCI <- grconvertY(yCI, "ndc", "user")
    
    arrows(CIlow, yCI, CIhigh, yCI, angle = 90, code = 3, length = 0.1, 
        lwd = lwd)
    
    medianText <- formatC(medianPosterior, digits = 3, format = "f")
    
    if (addInformation) {
        
        # display BF10 value
        offsetTopPart <- 0.06
        
        yy <- grconvertY(0.75 + offsetTopPart, "ndc", "user")
        yy2 <- grconvertY(0.806 + offsetTopPart, "ndc", "user")
        
        xx <- min(xticks) if (BF10 >= 1000000 | BF01 >= 1000000) {
            
            BF10t <- formatC(BF10, 3, format = "e")
            BF01t <- formatC(BF01, 3, format = "e")
        }
        
        if (BF10 < 1000000 & BF01 < 1000000) {
            
            BF10t <- formatC(BF10, 3, format = "f")
            BF01t <- formatC(BF01, 3, format = "f")
        }
        
        if (oneSided == FALSE) {
            
            text(xx, yy2, bquote(BF[10] == .(BF10t)), cex = cexTextBF, 
                pos = 4)
            text(xx, yy, bquote(BF[0][1] == .(BF01t)), cex = cexTextBF, 
                pos = 4)
        }
        
        if (oneSided == "right") {
            
            text(xx, yy2, bquote(BF["+"][0] == .(BF10t)), cex = cexTextBF, 
                pos = 4)
            text(xx, yy, bquote(BF[0]["+"] == .(BF01t)), cex = cexTextBF, 
                pos = 4)
        }
        
        if (oneSided == "left") {
            
            text(xx, yy2, bquote(BF["-"][0] == .(BF10t)), cex = cexTextBF, 
                pos = 4)
            text(xx, yy, bquote(BF[0]["-"] == .(BF01t)), cex = cexTextBF, 
                pos = 4)
        }
        
        yy <- grconvertY(0.756 + offsetTopPart, "ndc", "user")
        yy2 <- grconvertY(0.812 + offsetTopPart, "ndc", "user")
        
        CIText <- paste("95% CI: [", bquote(.(formatC(CIlow, 3, format = "f"))), 
            ", ", bquote(.(formatC(CIhigh, 3, format = "f"))), "]", sep = "")
        medianLegendText <- paste("median =", medianText)
        
        text(max(xticks), yy2, medianLegendText, cex = 1.1, pos = 2)
        text(max(xticks), yy, CIText, cex = 1.1, pos = 2)
        
        # probability wheel
        if (max(nchar(BF10t), nchar(BF01t)) <= 4) {
            xx <- grconvertX(0.44, "ndc", "user")
        }
        
        if (max(nchar(BF10t), nchar(BF01t)) == 5) {
            xx <- grconvertX(0.44 + 0.001 * 5, "ndc", "user")
        }
        
        if (max(nchar(BF10t), nchar(BF01t)) == 6) {
            xx <- grconvertX(0.44 + 0.001 * 6, "ndc", "user")
        }
        
        if (max(nchar(BF10t), nchar(BF01t)) == 7) {
            xx <- grconvertX(0.44 + 0.002 * max(nchar(BF10t), nchar(BF01t)), 
                "ndc", "user")
        }
        
        if (max(nchar(BF10t), nchar(BF01t)) == 8) {
            xx <- grconvertX(0.44 + 0.003 * max(nchar(BF10t), nchar(BF01t)), "ndc", "user") } if (max(nchar(BF10t), nchar(BF01t)) > 8) {
            xx <- grconvertX(0.44 + 0.005 * max(nchar(BF10t), nchar(BF01t)), 
                "ndc", "user")
        }
        
        yy <- grconvertY(0.788 + offsetTopPart, "ndc", "user")
        
        # make sure that colored area is centered
        radius <- 0.06 * diff(range(xticks))
        A <- radius^2 * pi
        alpha <- 2/(BF01 + 1) * A/radius^2
        startpos <- pi/2 - alpha/2
        
        # draw probability wheel
        plotrix::floating.pie(xx, yy, c(BF10, 1), radius = radius, col = c("darkred", 
            "white"), lwd = 2, startpos = startpos)
        
        yy <- grconvertY(0.865 + offsetTopPart, "ndc", "user")
        yy2 <- grconvertY(0.708 + offsetTopPart, "ndc", "user")
        
        if (oneSided == FALSE) {
            
            text(xx, yy, "data|H1", cex = cexCI)
            text(xx, yy2, "data|H0", cex = cexCI)
        }
        
        if (oneSided == "right") {
            
            text(xx, yy, "data|H+", cex = cexCI)
            text(xx, yy2, "data|H0", cex = cexCI)
        }
        
        if (oneSided == "left") {
            
            text(xx, yy, "data|H-", cex = cexCI)
            text(xx, yy2, "data|H0", cex = cexCI)
        }
        
        # add legend
        CIText <- paste("95% CI: [", bquote(.(formatC(CIlow, 3, format = "f"))), 
            " ; ", bquote(.(formatC(CIhigh, 3, format = "f"))), "]", sep = "")
        
        medianLegendText <- paste("median =", medianText)
    }
    
    mostPosterior <- mean(delta > mean(range(xticks)))
    
    if (mostPosterior >= 0.5) {
        
        legendPosition <- min(xticks)
        legend(legendPosition, max(yticks), legend = c("Posterior", "Prior"), 
            lty = c(1, 3), bty = "n", lwd = c(lwd, lwd), cex = cexLegend, 
            xjust = 0, yjust = 1, x.intersp = 0.6, seg.len = 1.2)
    } else {
        
        legendPosition <- max(xticks)
        legend(legendPosition, max(yticks), legend = c("Posterior", "Prior"), 
            lty = c(1, 3), bty = "n", lwd = c(lwd, lwd), cex = cexLegend, 
            xjust = 1, yjust = 1, x.intersp = 0.6, seg.len = 1.2)
    }
}

### generate data ###

set.seed(1)
x <- rnorm(30, 0.15)

### calculate Bayes factor ###

library(BayesFactor)
BF <- extractBF(ttestBF(x, rscale = "medium"), onlybf = TRUE)

### plot ###

.plotPosterior.ttest(x = x, rscale = "medium", BF = BF)

Simulations
#monte carlo t test simulation


#install.packages("MonteCarlo")
library(MonteCarlo)



# Define function that generates data and applies the method of interest

ttest<-function(n,loc,scale){
  
  # generate sample:
    sample<-rnorm(n, loc, scale)
  
  # calculate test statistic:
    stat<-sqrt(n)*mean(sample)/sd(sample)
  
  # get test decision:
    decision<-abs(stat)>1.96
  
  # return result:
    return(list("decision"=decision))
}

# define parameter grid:

  n_grid<-c(50,100,250,500)
  loc_grid<-seq(0,1,0.2)
  scale_grid<-c(1,2)

# collect parameter grids in list:
  param_list=list("n"=n_grid, "loc"=loc_grid, "scale"=scale_grid)

# run simulation:

  MC_result<-MonteCarlo(func=ttest, nrep=1000, param_list=param_list)

  summary(MC_result)

R to WordPress
#install.packages("RWordPress", repos="http://www.omegahat.org/R", build=TRUE)
library(RWordPress)
options(WordPressLogin=c(user="password"),
        WordPressURL="http://your_wp_installation.org/xmlrpc.php")
#[code lang='r']
#...
#[/code]
knit_hooks$set(output=function(x, options) paste("\\[code\\]\n", x, "\\[/code\\]\n", sep=""))
knit_hooks$set(source=function(x, options) paste("\\[code lang='r'\\]\n", x, "\\[/code\\]\n", sep=""))

knit2wp <- function(file) {
    require(XML)
    content <- readLines(file)
    content <- htmlTreeParse(content, trim=FALSE)

    ## WP will add the h1 header later based on the title, so delete here
    content$children$html$children$body$children$h1 <- NULL
    content <- paste(capture.output(print(content$children$html$children$body,
                                          indent=FALSE, tagSeparator="")),
                     collapse="\n")
    content <- gsub("<?.body>", "", content)         # remove body tag
    
    ## enclose code snippets in SyntaxHighlighter format
    content <- gsub("<?pre><code class=\"r\">", "\\[code lang='r'\\]\\\n",
                    content)
    content <- gsub("<?pre><code class=\"no-highlight\">", "\\[code\\]\\\n",
                    content)
    content <- gsub("<?pre><code>", "\\[code\\]\\\n", content)
    content <- gsub("<?/code></pre>", "\\[/code\\]\\\n", content)
    return(content)
}

newPost(content=list(description=knit2wp('rerWorkflow.html'),
                     title='Workflow: Post R markdown to WordPress',
                     categories=c('R')),
        publish=FALSE)

postID <- 99                    # post id returned by newPost()
editPost(postID,
         content=list(description=knit2wp('rerWorkflow.html'),
                      title='Workflow: Post R markdown to WordPress',
                      categories=c('R')),
         publish=FALSE)


 

Shiny Apps
Exploring the diagnosticity of the p-valueP-value distribution and power curves for an independent two-tailed t-testBIC approximation for ANOVA designsWhen does a significant p-value indicate a true effect?




RGL

Scatterplot3d

Rcmdr

Video lectures
Packages/Libraries



Ask/answer questions on StackOverflow
Trending on GitHub

Leave a Reply