Stats and archiving summary

In last week’s meeting we discussed implementing archiving best practices for the large amount of material MesoSpace deals with. From our individual researchers there is raw data (audio and video recordings), transcriptions, and spreadsheets of coded data (frames of reference used). From the stats team we have consolidated data files (mostly Excel spreadsheets that are fed into R).

Problems have occurred in the past when changes in coding and file versions were not noted and applied universally. A solution is to keep an archive with a detailed log that records each version of a file and the changes that produced it. Importantly, a sealed (read-only) archive should be kept and backed up on external hard drives, and only working versions should be kept on the RA laptop as an active workspace. The archive should include the data produced by team members as well as all versions of publications, which serve as a log of our ongoing analyses.
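As a rough illustration of what one entry in such a log could hold, here is a minimal Python sketch. Nothing in it was decided at the meeting: the function names, the JSON-lines log format, and the fields (file name, SHA-256 checksum, date, author, description of the change) are all assumptions made for the example. The checksum is there so that anyone can later confirm that a file in the sealed archive still matches what was logged.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_version(log_path: Path, data_file: Path, author: str, change: str) -> dict:
    """Append one entry to a JSON-lines change log and return it.

    The log layout is hypothetical: one JSON object per line.
    """
    entry = {
        "file": data_file.name,
        "sha256": sha256_of(data_file),
        "date": date.today().isoformat(),
        "author": author,
        "change": change,
    }
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


def is_unchanged(entry: dict, data_file: Path) -> bool:
    """Check that a file still matches the checksum recorded in its log entry."""
    return sha256_of(data_file) == entry["sha256"]
```

A call such as `log_version(Path("archive_log.jsonl"), Path("coded_data.xlsx"), "RA", "recoded intrinsic frames")` (file names made up) would add one line to the log; `is_unchanged` can then be run against the archived copy before it is used in an analysis.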

For the stats team, the problem of simultaneous work was brought up, and we decided to implement a more careful system of work delegation (i.e., researchers have pre-set times to work on shared files, and changes are noted and logged so that overlapping versions are not created). We discussed potentially using GitHub for file storage and sharing, but we decided that it wouldn’t address the simultaneous work issue any better than Dropbox, so we’ll keep using Dropbox.

We discussed additional analyses that are in the works:

  1. Run the generalized linear mixed-effects model (GLMM) on the matcher’s demographic data. This is in response to a suggestion made by Dr. Fertig at our colloquium presentation Feb. 22. To date we’ve run GLMM on the director’s demographic data, but there may be an effect of the matcher’s demographic factors (i.e. the director accommodating the matcher).
  2. Run the analysis (GLMM, similarity matrix) for Sets 1 & 3. To date we’ve run analyses on data from B&C Sets 2 & 4. This new analysis would address concerns about the extent to which there are strategy effects.
  3. Analysis for effect of order of pictures described. (What do directors use when they’re not understood? An interesting follow-up to (1).) Based on what he’s observed in his own Yucatec data set, Juergen predicts that when the director uses relative, the matcher first asks about ‘left’ and ‘right’, and then they use the relative frame. And when the director uses cardinal directions, they then switch to landmark.
  4. Is there a set effect? Ball & Chair was designed largely following the protocol of the designers of Men & Tree, where each set of pictures targets different types of reference frames. For example, Set 1 is resolvable intrinsically. We should see evidence of such an effect in the order of description study (3).
  5. Run analyses with New Animals (recall) data. This step is crucial for the upcoming CogSci submission.
  6. Run analyses that include orientation descriptions (in addition to locative descriptions).
