An R algorithm for generating synchronized permutations

 

Phillip I. Good

Information Research, Huntington Beach CA USA

 

and Cliff Lunneborg

Statistics, University of Washington, Seattle WA USA

 

 

Summary: An algorithm is provided for generating synchronized permutations for use in obtaining exact and independent tests of main effects and interaction in a two-way design with two levels of one factor and an unrestricted number of levels of the other.

 

Keywords: permutation tests, exact tests, synchronized permutations, algorithm, 2-way designs.

 

 

 

Introduction

Synchronized permutations are required to obtain exact tests of hypotheses when multiple factors are involved (Pesarin, 2001; Salmaso, 2001).  The analysis of variance cannot be relied on for data from distributions that are heavier in the tails than the normal (Good, 2005, chapter 7).  Only synchronized permutations, in which, for example, exchanges between rows in one column are duplicated in all other columns, provide for the clear separation of main effects and interactions (Good, 2002, 2005).

     While synchronized permutations for testing main effects are readily generated, the rearrangements required for testing interactions are less straightforward to compute.  The unthinking application of a synchronized row permutation followed by a synchronized column permutation can lead to permutations in which main effects and interactions are still confounded.  We illustrate this point in Figures 1a-d, reproduced from Good (2005, chapter 7).

 

 

 

 

 

 


   

 

 Figure 1a.   Part of a Two-Factor Experimental Design.  Shapes correspond to row effects, patterns to column effects, and shape-pattern to interaction.  The usual linear zero-sum rules for additive linear models apply.

 

 

 

 

 

 

 


Figure 1b. The same design after a synchronized exchange of elements between the first and second rows.  This rearrangement can be used for testing for row main effects.  Note that column effects and interactions continue to sum to zero.

 

 

 

 

 


Figure 1c.  In this rearrangement, two synchronized exchanges of elements have taken place: the first between the first and second rows, and the second between the first and second columns.  This rearrangement can be used to test for interactions, as the row and column effects sum to zero.

 

 

 

 

 


Figure 1d.  In this rearrangement, two synchronized exchanges of elements have taken place: the first between the first and second rows, and the second between the first and second columns.  This rearrangement cannot be used to test for interactions, as the row and column effects do not cancel.

 

Algorithm

Generating synchronized permutations that will test for a main effect independently of the main effects of other factors and of interactions is straightforward. 

     Recall that deriving a p-value for a permutation test by Monte Carlo takes four steps:

1.     Compute the test statistic for the original observations

2.     Generate a random rearrangement

3.     Compute the test statistic for the rearrangement

4.     Compare the original value of the test statistic with the distribution of values obtained by repeating steps 2 and 3 a large number of times.
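     The four steps above can be sketched as a small R function.  This is only an illustrative skeleton: the names mc_perm_pvalue, stat, rearrange and B are our own assumptions, the comparison is one-sided (large values of the statistic count as extreme), and the simple two-sample layout used to exercise it is not the two-way design treated in this paper.

```r
# Hypothetical helper (names are assumptions, not part of the algorithm below).
# stat:      function computing the test statistic from the data
# rearrange: function returning one random rearrangement of the data
# B:         number of Monte Carlo rearrangements
mc_perm_pvalue <- function(data, stat, rearrange, B = 1000) {
  observed <- stat(data)                            # step 1
  permuted <- replicate(B, stat(rearrange(data)))   # steps 2 and 3, repeated B times
  # step 4: proportion of values at least as large as the observed one;
  # the observed arrangement is counted once, so the p-value is never zero
  (1 + sum(permuted >= observed)) / (B + 1)
}

# Illustration on a simple two-sample layout
set.seed(1)
x <- c(rnorm(5, mean = 0), rnorm(5, mean = 1))
g <- rep(1:2, each = 5)
diff_means <- function(v) mean(v[g == 2]) - mean(v[g == 1])
p <- mc_perm_pvalue(x, stat = diff_means, rearrange = sample, B = 999)
```

     For the two-way design, rearrange would be replaced by one of the synchronized rearrangements described in this section.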

     To generate a synchronized random rearrangement of columns, we first rearrange the indices and then apply the same rearrangement to the elements of each row.

dim(D)=c(2,C,K)
# where D denotes the array holding study results
#  2 is the number of levels of the first factor
#  C denotes the number of levels of the second factor
#  K denotes the number of observations in each cell
D2=D
dim(D2)=c(2,C*K)      # flatten a working copy; D itself keeps its three dimensions
index=1:(C*K)         # parentheses are needed: 1:C*K means (1:C)*K
Rindex=sample(index)  # one rearrangement of the indices, shared by both rows
RD=D2
for (r in 1:2)
  for (j in 1:(C*K))
    RD[r,j]=D2[r,Rindex[j]]
dim(RD)=c(2,C,K)
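     The same rearrangement can be written without explicit loops.  The sketch below is a vectorized variant under our own assumptions (the function name sync_column_rearrangement is hypothetical); it also checks that the rearrangement is synchronized, using an array in which each second-row value is one more than the first-row value at the same position.

```r
# Hypothetical wrapper; D is a 2 x C x K array as in the text.
sync_column_rearrangement <- function(D) {
  d <- dim(D)
  C <- d[2]
  K <- d[3]
  flat <- matrix(D, nrow = d[1], ncol = C * K)   # each row laid out as C*K values
  Rindex <- sample.int(C * K)                    # one permutation shared by all rows
  array(flat[, Rindex, drop = FALSE], dim = d)
}

set.seed(2)
D <- array(1:24, dim = c(2, 4, 3))   # D[2, c, k] is always D[1, c, k] + 1
RD <- sync_column_rearrangement(D)

# Because both rows are rearranged identically, paired values stay paired
all(RD[2, , ] - RD[1, , ] == 1)
```

     Each row keeps its own observations; only their assignment to columns changes.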

     In the code that follows, we first generate a random arrangement for testing for a main effect of the rows in a two-factor design.  The method employed is not the most efficient for this purpose but offers the advantage that it facilitates the subsequent generation of a rearrangement for use in testing for a two-way interaction. 

 

# To obtain a rearrangement for use in testing the main effect of the rows (the first factor), begin by permuting the elements within each cell
PD=D
dim(PD)=c(2,C,K)
for (r in 1:2)
  for (c in 1:C)
    PD[r,c,]=PD[r,c,sample(K)]   # sample(K) is a random ordering of 1:K, safe even when K is 1

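# Decide how many exchanges will be made.  The weight given to j exchanges is choose(K,j)^(R*C), for j = 0,...,K
R=2                     # number of levels of the first factor
p=choose(K,0:K)^(R*C)   # p[j+1] is the weight for j exchanges; p[1] and p[K+1] equal 1
cp=cumsum(p)            # cumulative weights
x=runif(1,0,cp[K+1])    # a single uniform draw on (0, total weight)
j=0
while(x>cp[j+1]) j=j+1  # on exit, j is the number of exchanges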

# We now synchronize the rearrangements by swapping the first j elements of each cell between rows.  Nothing is swapped when j is 0.
if (j>0) for (c in 1:C){
    temp=PD[1,c,1:j]
    PD[1,c,1:j]=PD[2,c,1:j]
    PD[2,c,1:j]=temp
}
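     As a worked illustration of the weights used to choose the number of exchanges, take K = 3 observations per cell and C = 4 columns.  The sketch below is our own; it draws j with sample() and its prob argument, which we substitute here for the cumulative-sum search in the code above.

```r
K <- 3   # observations per cell
C <- 4   # levels of the second factor
R <- 2   # levels of the first factor

# Weight for j = 0, 1, ..., K exchanges
w <- choose(K, 0:K)^(R * C)
w                 # 1 6561 6561 1
prob <- w / sum(w)

# Draw the number of exchanges in proportion to the weights
set.seed(3)
j <- sample(0:K, size = 1, prob = w)
```

     With these weights, nearly all rearrangements exchange either one or two elements per cell; exchanging none or all of them is chosen only about once in 13,124 draws each.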

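# To generate a random rearrangement for testing interactions, we first rearrange indices and then move the actual values.  The swapped elements (the first j in each cell) and the unswapped elements are rearranged among columns separately, and the same rearrangement is applied in both rows.
PPD=PD
indexs=seq_len(C*j)        # positions of the swapped elements
indexu=seq_len(C*(K-j))    # positions of the unswapped elements
pindexs=sample(indexs)
pindexu=sample(indexu)
for (r in 1:2)
  for (c in 1:C){
    for (k in seq_len(j)) {
      m=pindexs[(c-1)*j+k]
      # position m lies in column ceiling(m/j), at element (m-1)%%j+1
      PPD[r,c,k]=PD[r,ceiling(m/j),(m-1)%%j+1]
    }
    for (k in seq_len(K-j)) {
      m=pindexu[(c-1)*(K-j)+k]
      PPD[r,c,k+j]=PD[r,ceiling(m/(K-j)),(m-1)%%(K-j)+1+j]
    }
  }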

#PD contains a rearrangement for use in testing for a row effect, and PPD contains a rearrangement for use in testing for an interaction.
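     For convenience, the steps can be collected in a single function.  What follows is a sketch under our own assumptions rather than a definitive implementation: the name sync_rearrangements is hypothetical, the number of exchanges is drawn with sample() and its prob argument, and the swapped and unswapped blocks are rearranged with vectorized indexing, which visits positions in a different order from the loops above but likewise applies one random permutation to both rows.

```r
# Hypothetical function; D is a 2 x C x K array.
sync_rearrangements <- function(D) {
  d <- dim(D)
  C <- d[2]
  K <- d[3]

  # Permute the elements within each cell
  PD <- D
  for (r in 1:2) for (c in 1:C) PD[r, c, ] <- D[r, c, sample.int(K)]

  # Number of exchanges, weighted by choose(K, j)^(2 * C)
  j <- sample(0:K, size = 1, prob = choose(K, 0:K)^(2 * C))
  s <- seq_len(j)               # swapped positions within each cell
  u <- setdiff(seq_len(K), s)   # unswapped positions

  # Synchronized exchange between rows: PD serves for the row effect
  if (j > 0) {
    tmp <- PD[1, , s]
    PD[1, , s] <- PD[2, , s]
    PD[2, , s] <- tmp
  }

  # Rearrange each block among columns, identically in both rows
  PPD <- PD
  if (j > 0) {
    ps <- sample.int(C * j)
    for (r in 1:2) PPD[r, , s] <- matrix(PD[r, , s], C, j)[ps]
  }
  if (K - j > 0) {
    pu <- sample.int(C * (K - j))
    for (r in 1:2) PPD[r, , u] <- matrix(PD[r, , u], C, K - j)[pu]
  }
  list(PD = PD, PPD = PPD, j = j)
}

set.seed(4)
D <- array(1:24, dim = c(2, 4, 3))
out <- sync_rearrangements(D)
```

     Both out$PD and out$PPD contain exactly the observations in D, and each row of out$PPD holds the same observations as the corresponding row of out$PD.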

 

Application of the Algorithm

The distribution most commonly met in practice for which traditional ANOV methods fail is the contaminated normal.  A series of small random samples was generated using the code rnorm(24,rbinom(24,2,0.3),1) to provide all the observations in a 2x4 design with 3 observations per cell.  If the data were normal, we would expect to reject in error the hypothesis of no column effect 5% of the time at the 5% level by either traditional ANOV or synchronized permutation methods.  Moreover, if the two tests are independent, we would expect to reject in error both the hypothesis of no column effect at the 10% level and the hypothesis of no row-column interaction at the 10% level 1% of the time (0.10 x 0.10).  Table 1 reveals the results of 10,000 simulations.
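     The data-generating step and the traditional analysis can be sketched as follows.  This is a minimal illustration under our own assumptions: the factor layout (rows varying fastest, then columns, then replicates), the names row_f, col_f and n_sim, and the small number of runs are ours, and only the traditional ANOV side is shown, since the permutation side depends on the choice of test statistic.

```r
set.seed(5)
n_sim <- 200   # far fewer than the 10,000 simulations reported in the text

# Assumed layout of the 24 observations in a 2 x 4 x 3 array
row_f <- factor(rep(1:2, times = 12))
col_f <- factor(rep(rep(1:4, each = 2), times = 3))

# p-value of the traditional F test for the column effect in each simulated data set
p_col <- replicate(n_sim, {
  y <- rnorm(24, rbinom(24, 2, 0.3), 1)   # generating code quoted in the text
  fit <- anova(lm(y ~ row_f * col_f))
  fit["col_f", "Pr(>F)"]
})

# Observed proportion of rejections at the 5% level
reject_rate <- mean(p_col < 0.05)
```

     Comparing reject_rate with the nominal 5% indicates how well the traditional test holds its level for data of this kind.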

 

Address for Correspondence:  Phillip I. Good, Information Research, 205 W. Utica Ave., Huntington Beach CA 92648 USA. E-Mail: frere_untel@hotmail.com

 

 

References

 

Good, P. (2002) Extensions of the concept of exchangeability and their applications, J. Modern Appl. Statist. Methods 1: 243-247.

http://tbf.coe.wayne.edu/jmasm/vol1_no2.pdf.

Good, P. (2005) Permutation, Parametric, and Bootstrap Tests of Hypotheses, Springer-Verlag, NY, 3rd edition.

Pesarin, F. (2001) Multivariate Permutation Tests, Wiley, NY.

Salmaso, L. (2001) Synchronized permutation tests in 2k factorial designs, Int. J. Non Linear Model. Sci. Eng. 3.