Data Collection

Dae Houlihan edited this page Nov 11, 2024 · 2 revisions

Warning

This section is under construction. Some of this information is out of date with the latest release of the template. Use with caution.

This guide covers integrating with Prolific and retrieving experimental data.

Prolific Integration

Setup

  1. Create a study on Prolific
  2. Configure URL parameters in Prolific:
    • PROLIFIC_PID
    • STUDY_ID
    • SESSION_ID

[Image: Prolific URL Parameters]

Configuration

  1. Add the completion code in creds.ts:
const prolificCompletionCode = "YOUR-CODE";
  2. Verify URL parameter handling in your code:
const urlParams = new URLSearchParams(window.location.search);
const prolificPID = urlParams.get('PROLIFIC_PID');
const studyID = urlParams.get('STUDY_ID');
const sessionID = urlParams.get('SESSION_ID');
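
Before launching, it can help to sanity-check a study URL offline. The sketch below is a hypothetical Python helper (`missing_prolific_params` is not part of the template) that reports which of the three parameters are absent or empty in a URL. The example URL and its values are made up.

```python
from typing import List
from urllib.parse import parse_qs, urlparse

# The three parameters configured in the Prolific setup above.
REQUIRED_PARAMS = ("PROLIFIC_PID", "STUDY_ID", "SESSION_ID")


def missing_prolific_params(url: str) -> List[str]:
    """Return the required parameter names that are absent or empty in `url`."""
    # parse_qs drops blank values by default, so empty parameters count as missing.
    query = parse_qs(urlparse(url).query)
    return [name for name in REQUIRED_PARAMS if not query.get(name, [""])[0]]


# Hypothetical study URL with placeholder values; SESSION_ID is left out on purpose.
example_url = "https://example.com/?PROLIFIC_PID=abc&STUDY_ID=def"
print(missing_prolific_params(example_url))
```

An empty list means all three parameters are present in the URL.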

Data Storage

Data Structure

The template uses three Firestore collections:

  • exptData: Main experimental data
    {
      uid: string,
      trials: Array<{
        currentTrial: number,
        response: string|number,
        // ... other trial data
      }>,
      // ... metadata
    }
  • userData: User-specific data
  • sharedData: Shared experiment configuration
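
As a rough illustration of the exptData shape above, the following sketch checks a document (already loaded as a Python dict) against the fields listed: a string `uid` and a `trials` list whose entries carry a numeric `currentTrial` and a string or numeric `response`. The function name `validate_expt_doc` is hypothetical, and the checks cover only the fields shown here, not the elided trial data or metadata.

```python
from typing import Any, Dict, List


def validate_expt_doc(doc: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one exptData document (empty if none)."""
    problems: List[str] = []
    if not isinstance(doc.get("uid"), str):
        problems.append("uid must be a string")

    trials = doc.get("trials")
    if not isinstance(trials, list):
        problems.append("trials must be a list")
        return problems

    for i, trial in enumerate(trials):
        if not isinstance(trial, dict):
            problems.append(f"trial {i} must be an object")
            continue
        # bool is a subclass of int in Python, so it is excluded explicitly.
        current = trial.get("currentTrial")
        if isinstance(current, bool) or not isinstance(current, (int, float)):
            problems.append(f"trial {i}: currentTrial must be a number")
        response = trial.get("response")
        if isinstance(response, bool) or not isinstance(response, (str, int, float)):
            problems.append(f"trial {i}: response must be a string or number")
    return problems


print(validate_expt_doc({"uid": "u1", "trials": [{"currentTrial": 0, "response": "a"}]}))
```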

Security Rules

Default Firestore rules:

match /exptData/{uid} {
    allow read: if true;
    allow write: if request.auth.uid == uid;
}

Data Retrieval

Using the Retrieval Script

  1. Generate Firebase Admin credentials:

    • Go to Firebase Console
    • Project Settings > Service Accounts
    • Generate New Private Key
    • Save JSON file securely
  2. Run the retrieval script:

python retrieve_data.py \
    --cred "path/to/firebase-adminsdk.json" \
    --out "path/to/output" \
    --collection 'exptData' 'sharedData'
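
The source of retrieve_data.py is not shown on this page. As a sketch of how a script could accept the same three flags, including several collection names after `--collection`, an argparse parser might look like the following. The help strings and the default collection are assumptions, not the script's actual behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags used in the command above."""
    parser = argparse.ArgumentParser(description="Export Firestore collections.")
    parser.add_argument("--cred", required=True, help="Path to the Admin SDK JSON key")
    parser.add_argument("--out", required=True, help="Output directory")
    # nargs="+" collects one or more space-separated collection names into a list.
    # The default value here is an assumption for illustration.
    parser.add_argument("--collection", nargs="+", default=["exptData"],
                        help="One or more collection names")
    return parser


args = build_parser().parse_args(
    ["--cred", "key.json", "--out", "out", "--collection", "exptData", "sharedData"]
)
print(args.collection)
```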

Data Format

Retrieved data structure:

{
  "participant_id": {
    "trials": [
      {
        "currentTrial": 0,
        "response": "value",
        "timestamp": "2024-01-01T12:00:00Z"
      }
      // ... more trials
    ],
    "metadata": {
      "prolificPID": "...",
      "studyID": "...",
      "sessionID": "...",
      "version": "1.0.0",
      "commitHash": "abc123"
    }
  }
  // ... more participants
}
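
Once an export with this shape has been loaded into a Python dict, a quick completeness check is often the first step. The sketch below counts trials per participant and tallies participants by `commitHash`. The function name `summarize_export` and the sample values are hypothetical, and it assumes the export is valid JSON without the `// ...` placeholder comments shown above.

```python
from collections import Counter
from typing import Any, Dict, Tuple


def summarize_export(export: Dict[str, Dict[str, Any]]) -> Tuple[Dict[str, int], Counter]:
    """Return (trials per participant, participants per commit hash)."""
    trial_counts: Dict[str, int] = {}
    versions: Counter = Counter()
    for participant_id, record in export.items():
        trial_counts[participant_id] = len(record.get("trials", []))
        # Participants without metadata are grouped under "unknown".
        versions[record.get("metadata", {}).get("commitHash", "unknown")] += 1
    return trial_counts, versions


# Made-up export with two participants.
sample = {
    "p1": {
        "trials": [{"currentTrial": 0, "response": "value"},
                   {"currentTrial": 1, "response": 3}],
        "metadata": {"version": "1.0.0", "commitHash": "abc123"},
    },
    "p2": {"trials": [], "metadata": {"version": "1.0.0", "commitHash": "abc123"}},
}
counts, versions = summarize_export(sample)
print(counts, dict(versions))
```

Participants with zero trials, or trial counts that differ from the expected design, are candidates for exclusion or follow-up.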

Data Security

Best Practices

  1. Secure credential storage:

    • Never commit credentials to git
    • Use encrypted storage for admin SDK key
    • Limit access to production data
  2. Data backup:

    • Regular exports
    • Version control for analysis scripts
    • Secure backup storage
  3. Data cleanup:

    • Remove debug data regularly
    • Archive completed studies
    • Maintain audit trail
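
One way to keep the admin SDK key out of the repository is to pass its location through an environment variable instead of hard-coding a path. The sketch below reads `GOOGLE_APPLICATION_CREDENTIALS`, the variable conventionally used by Google client libraries. The helper `resolve_cred_path` is hypothetical and not part of the template.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS"


def resolve_cred_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the credential file path from the environment, or raise if unusable."""
    env = os.environ if env is None else env
    raw = env.get(ENV_VAR)
    if not raw:
        raise RuntimeError(f"{ENV_VAR} is not set")
    path = Path(raw).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No credential file at {path}")
    return path
```

The returned path can then be passed as `--cred` to the retrieval script or as `cred_path` in an analysis script, so the key file itself never needs to live inside the project directory.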

Analysis Pipeline

Example Analysis Script

import pandas as pd
import firebase_admin
from firebase_admin import credentials, firestore

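def load_experiment_data(cred_path):
    """Load every trial from the exptData collection into a long-format DataFrame."""
    # Initialize the Admin SDK only once; a second initialize_app call raises ValueError.
    try:
        firebase_admin.get_app()
    except ValueError:
        firebase_admin.initialize_app(credentials.Certificate(cred_path))
    db = firestore.client()

    # Get all documents from the exptData collection
    docs = db.collection('exptData').stream()

    # Build one row per trial
    data = []
    for doc in docs:
        participant_data = doc.to_dict() or {}
        # Tolerate documents that lack a metadata or trials field
        metadata = participant_data.get('metadata', {})
        for trial in participant_data.get('trials', []):
            trial_data = {
                'participant_id': doc.id,
                **metadata,
                **trial
            }
            data.append(trial_data)

    return pd.DataFrame(data)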

Next Steps