The sound of silence: an EEG study of how musicians time pauses in individual and joint music performance

by Anna Zamm1, Stefan Debener2, Ivana Konvalinka3, Natalie Sebanz1, Günther Knoblich1
1Department of Cognitive Science, Central European University, Vienna, Austria
2Department of Psychology, University of Oldenburg, Oldenburg, Germany
3Section for Cognitive Systems, DTU Compute, Technical University of Denmark, Lyngby, Denmark

Abstract

In our recent publication in Social Cognitive and Affective Neuroscience (SCAN), we investigated how musical partners resolve unmeasured expressive silences in musical interaction. Partners resolved shorter silences more synchronously than longer ones; partners also displayed enhanced neural markers of motor preparation for shorter relative to longer silences. Thus, shorter silences in interaction may facilitate interpersonal coordination. Our SCAN article is summarized below.

Introduction

Silences often arise in social interactions such as conversation and joint music-making: conversation partners often pause between utterances; musicians insert expressive silences between musical phrases. Such silences are typically unmeasured, and thus pose a challenge to coordination: partners must determine when to end silences without disrupting the flow of interaction. How partners resolve unmeasured silence in social interaction is an open and important question for social cognitive neuroscience. Here we investigated this question in the context of musical interaction.

Specifically, we measured trained musicians’ performances of a simple piano melody featuring unmeasured expressive silences, termed fermatas (Fermata | Grove Music). Musicians performed this melody alone and simultaneously with a partner with whom they had to coordinate the duration of silences. Behaviorally, we tested the hypothesis that partners produce shorter and less variable silences relative to individuals as a means of facilitating coordination. This prediction follows from previous work indicating that partners try to reduce unpredictability by making their individual actions more predictable, namely by increasing action speed and reducing variability (Vesper et al., 2011, 2016). Neurally, we tested the hypothesis that dynamics of cortical beta oscillations (13–30 Hz) – typically associated with motor preparation and planning (Engel & Fries, 2010; Tzagarakis et al., 2010; Zaepffel et al., 2013) – reflect preparation and planning processes linked to the resolution of unmeasured musical silences.

Methods

Participants

Forty healthy right-handed adults with at least six years of formal piano training (M = 12.13 years of training, s.d. = 4.27, range = 6–22) completed the study in pairs (N = 20 pairs). Ethical approval was obtained from the local United Ethical Review Committee for Research in Psychology (EPKEB) at Central European University (Budapest).

Experimental Procedure

Participants performed a simple melody featuring unmeasured expressive silences (12 silences notated in the score) from memory in two tasks: Solo (5 trials) and Duet (5 trials) performance (1 trial = one performance of the melody). In Solo performance, pianists performed the melody alone at the rate indicated by a metronome pacing cue, with instructions to “use [their] intuition to determine the length of each pause” and that “each pause should be unique and expressive”. Duet instructions were the same as for the Solo task, with the added goal to “synchronize keystrokes while maintaining the tempo of the metronome (cue)”. Partners could not hear one another’s Solo performances but could hear one another during Duets. To minimize visual communication, partners could not see one another’s hand or torso movements. EEG was recorded during all tasks.

Data acquisition

MIDI: Pianists performed on identical Akai Professional MAX25 USB-powered keyboards. Musical Instrument Digital Interface (MIDI) timing information was sent on a unique channel from each keyboard, and data from the two keyboards were merged via a MIDI merger (MIDI Solutions Inc., Canada), which sent all MIDI data through a MIDI–USB interface to a Linux computer (Fedora 28) running an adapted version of the FTAP MIDI recording software (Finney, 2001). Sounds associated with pianists’ keystrokes were generated by a tone generator (Roland SD50 Mobile Studio Canvas, Roland Corporation, Japan), which received MIDI information about each keystroke from FTAP and produced the corresponding pitch in a piano timbre. Sounds were delivered to pianists via EEG-compatible earbuds (ER3C Tubal Insert Earphones, Etymotic Research Inc., USA) and amplified via a battery-powered headphone distributor/amplifier (M-Audio Bass Traveller, M-Audio Inc., USA).

EEG: Duet partners’ EEG data were recorded synchronously using a hyperscanning setup comprising two 32-channel BrainAmp DC amplifiers (Brain Products GmbH, Germany). Specifically, for each partner, 32 active electrodes (Brain Products GmbH, Germany) were placed in a 32-channel Standard Cap for actiCAP (EASYCAP GmbH, Germany; reference site = FCz, ground = AFz). Each partner’s electrodes were connected to a separate battery-powered actiCAP ControlBox (plugged into the port labelled “Ch 1-32, Splitter Box”). Reference and ground electrodes for each partner were connected to their respective ControlBoxes, ensuring galvanic isolation between partners. Each ControlBox was connected via ribbon cable (plugged into “Ch 1-32, Amplifier” on the ControlBox) to a separate BrainAmp DC amplifier (high-pass filter = 10 s time constant/0.0159 Hz, low-pass filter = 250 Hz, sampling rate = 5000 Hz, 0.1 μV resolution, ±3.28 mV range). Each amplifier was powered by its own PowerPack. Data from each amplifier were sent to the Brain Products USB2 Adapter Box (to fiberoptic inputs 1 and 2, respectively), which provided a shared clock for incoming data. The USB2 Adapter Box sent all EEG data to BrainVision Recorder (BVR) software (v1.20.0801) running on Windows (Win 7 Professional SP1), which recorded both partners’ data to a single custom 64-channel BVR file. With a single exception, impedances were kept below 25 kOhm (the threshold recommended by Brain Products) at the start of each experimental task.

Synchronization of MIDI & EEG: MIDI and EEG recordings were synchronized via transistor–transistor logic (TTL) triggers sent from FTAP to the USB2 Adapter Box. Triggers were sent from the FTAP recording computer over a parallel port connected to the USB2 Adapter Box via a DB26 trigger cable (Brain Products TRIG26 Trigger Cable LPT/BNC). A 0.5-ms TTL trigger was sent at every recorded MIDI keystroke onset; EEG was acquired at 5000 Hz so that these brief pulses could be captured with millisecond precision.
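The alignment logic behind this setup is simple enough to sketch in a few lines of R. The object names below (trigger_samples, midi_onsets_ms) are hypothetical stand-ins for exported BVR marker positions and FTAP keystroke onsets; this is a minimal sketch, not our recording code.

```r
# Minimal MIDI-EEG alignment sketch (assumption: one TTL trigger per keystroke).
# `trigger_samples` = trigger positions from the BVR marker file, in EEG samples;
# `midi_onsets_ms` = FTAP keystroke onsets in ms. Both objects are hypothetical.
eeg_fs <- 5000                                  # EEG sampling rate (Hz)
trigger_ms <- trigger_samples / eeg_fs * 1000   # trigger times in ms

stopifnot(length(trigger_ms) == length(midi_onsets_ms))
offset_ms <- trigger_ms - midi_onsets_ms        # MIDI-to-EEG clock offset
summary(offset_ms)                              # sanity check: spread should be ~1 ms
```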

Data processing & analysis

MIDI data processing was performed in MATLAB using custom scripts. EEG data processing was performed in EEGLAB (Delorme & Makeig, 2004) together with custom scripts. Statistics were computed in R. For details of MIDI and EEG data cleaning, please refer to the original SCAN manuscript. Analysis procedures computed on cleaned data are described briefly below. All behavioral dependent variables (DVs) were averaged across the two members of each pair, yielding one value per pair.

MIDI keystrokes: Three behavioral dependent variables (DVs) were assessed: 1) mean Duet asynchrony, the absolute latency between partners’ keystroke onsets for corresponding melody tones (high asynchronies reflect poor synchronization, low asynchronies good synchronization); 2) pause duration; and 3) pause duration variability (the standard deviation of pause durations normalized by their mean duration, i.e. a coefficient of variation).
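As a rough illustration, the three DVs reduce to a few lines of R for a single pair. The vectors onsets_p1, onsets_p2 and pause_dur are hypothetical; the sketch assumes keystrokes have already been matched to corresponding melody tones.

```r
# Sketch of the three DVs for one pair. `onsets_p1` and `onsets_p2` hold
# keystroke onset times (s) for corresponding melody tones; `pause_dur`
# holds the pair's pause durations (s). All three vectors are hypothetical.
asynchrony <- abs(onsets_p1 - onsets_p2)    # DV 1: absolute keystroke asynchrony
mean_async <- mean(asynchrony)

mean_pause <- mean(pause_dur)               # DV 2: mean pause duration
pause_cv   <- sd(pause_dur) / mean_pause    # DV 3: SD normalized by mean duration
```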

EEG: EEG data were assessed for event-related desynchronization (ERD) of cortical beta oscillations (Pfurtscheller & Lopes da Silva, 1999), a hallmark of motor preparation and planning (Engel & Fries, 2010). In preparation for beta ERD analyses, artefact-corrected data were re-referenced to the linked mastoids (TP9/10) and band-pass filtered in the beta frequency range (13–30 Hz), and beta amplitude was computed as the magnitude of the analytic signal. Beta amplitude time-courses were epoched relative to pause locations (0–6 s relative to the first keystroke release before each pause), and noisy epochs (activity > 3 SD from the mean of the joint probability distribution) were excluded. Pauses (0 s through the first keystroke onset after each pause) were then divided into deciles, and beta amplitude was averaged over samples within each decile. Beta ERD was computed for each decile as the percent amplitude difference between that decile and a pre-pause baseline period (−0.5 to 0 s). For each subject and pair, beta ERD was averaged across electrodes at a parietal region of interest (ROI: P3, P4, Pz, P7, P8) and a central ROI (C3, C4, Cz). Linear changes in beta ERD across deciles of musical pauses in Solo and Duet performance were assessed using a linear mixed model (see below).
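Although our EEG processing was implemented in EEGLAB/MATLAB, the core of the beta ERD computation can be sketched in R for a single cleaned channel. Everything below (the objects eeg, pause_on, pause_off, and the 4th-order Butterworth filter) is an illustrative assumption, not our exact pipeline.

```r
# Illustrative single-channel beta ERD pipeline (not our EEGLAB code).
# `eeg` is a cleaned channel; `pause_on`/`pause_off` are pause onsets and
# offsets in samples. All objects are hypothetical.
library(signal)  # provides butter() and filtfilt()

fs <- 5000
bf <- butter(4, c(13, 30) / (fs / 2), type = "pass")  # beta band-pass (13-30 Hz)
beta <- filtfilt(bf, eeg)

# Analytic signal via an FFT-based Hilbert transform; its magnitude is
# the instantaneous beta amplitude envelope.
analytic <- function(x) {
  n <- length(x); X <- fft(x); h <- numeric(n)
  h[1] <- 1
  if (n %% 2 == 0) {
    h[n / 2 + 1] <- 1; h[2:(n / 2)] <- 2
  } else {
    h[2:((n + 1) / 2)] <- 2
  }
  fft(X * h, inverse = TRUE) / n
}
amp <- Mod(analytic(beta))

# For each pause: average amplitude within 10 equal time bins (deciles),
# then express each decile as percent change from the 0.5-s baseline that
# precedes the pause. Negative values indicate desynchronization (ERD).
erd <- t(mapply(function(on, off) {
  base <- mean(amp[(on - 0.5 * fs):on])
  bins <- split(amp[on:off], cut(seq(on, off), breaks = 10))
  (sapply(bins, mean) - base) / base * 100
}, pause_on, pause_off))   # one row per pause, one column per decile
```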

Results

Duet synchronization is reduced following pauses

Figure 1 displays the grand average asynchrony profile (1A) and each pair’s mean profile (1B). A one-way repeated measures ANOVA on Duet asynchronies with Pause Location (pause vs. non-pause) as a factor indicated that partners displayed significantly higher asynchronies for tones immediately following pauses relative to tones at non-pause locations, F(1, 19) = 265.03, p < 0.0001, η2G = 0.84.
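This analysis maps directly onto R’s afex package, whose “ges” effect size is the generalized eta squared (η2G) reported above. A minimal sketch, assuming a hypothetical long-format data frame async_df with one mean asynchrony per pair and condition:

```r
# Sketch of the one-way repeated measures ANOVA. `async_df` is a hypothetical
# data frame with columns pair, pause_location and asynchrony (one row per
# pair and pause-location condition).
library(afex)
aov_ez(id = "pair", dv = "asynchrony", data = async_df,
       within = "pause_location", anova_table = list(es = "ges"))
```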


Figure 1. A: Grand average asynchrony profile for pause (red) and non-pause (blue) melody tones. B: Each pair’s mean asynchrony profile represented as a z-score. C: Grand average pause durations for Duet and Solo performance. D: Correlation between pause duration and asynchrony for tones immediately following pauses. Reproduced from Soc Cogn Affect Neurosci, Volume 16, Issue 1-2, January-February 2021, Pages 31–42, https://doi.org/10.1093/scan/nsaa096.

Partners modify pause timing in Duets to facilitate synchrony

Mean pause durations in Duet and Solo performance are shown in Figure 1C. A two-way repeated measures ANOVA was computed on pause duration, with factors of Task (Solo/Duet) and Pause Number (1–12, i.e. the ordinal position of the pause in the melody). Pauses were significantly shorter in Duet relative to Solo performance, F(1, 19) = 5.99, p = 0.02, η2G = 0.08; there was also an effect of Pause Number on pause duration, F(5.10, 96.92) = 16.64, p < 0.0001, η2G = 0.09. No other effects or interactions were observed. Figure 1D shows that Duet asynchronies increased significantly with pause duration, rho(18) = 0.824, p < 0.0001.

A separate ANOVA revealed no effect of Task on pause duration variability (p = 0.54).
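The reported rho with 18 degrees of freedom (N = 20 pairs) is consistent with a rank correlation computed across pairs. A one-line sketch, assuming hypothetical per-pair vectors pair_pause_dur and pair_async:

```r
# Sketch of the correlation, assuming hypothetical per-pair means:
# `pair_pause_dur` (mean pause duration) and `pair_async` (mean asynchrony
# for tones following pauses), one value per pair (N = 20).
cor.test(pair_pause_dur, pair_async, method = "spearman")
```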

Beta ERD reflects motor preparation during pauses

Figure 2 displays changes in beta ERD across deciles of musical pauses. A linear mixed model was computed on beta ERD using the following formula: ERD ~ Time Window * Task * ROI * Standardized Pause Duration + (1 + Time Window * Task * ROI | Subject). See the original manuscript for full model details. Significance of fixed effects was estimated using Satterthwaite’s approximation for degrees of freedom. Findings revealed main effects of Time Window (deciles 1–10, treated as a continuous variable), ROI (central/parietal), and Pause Duration, as well as a Time Window × ROI interaction. The full model failed to converge, so the non-significant factor of Task was removed to create a restricted model; the restricted model did converge and confirmed the main effects of Time Window, F(1, 40) = 30.12, p < 0.0001, Pause Duration (standardized), F(1, 7465.9) = 9.931, p = 0.002, and ROI, F(1, 95.7) = 15.095, p = 0.0002, observed in the full model, as well as the interaction between Time Window and ROI, F(1, 198.5) = 5.55, p = 0.02. No other main effects or interactions were significant.
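In lme4/lmerTest syntax (consistent with the Satterthwaite tests described above), the full and restricted models can be sketched as follows; the data frame erd_df and its column names are hypothetical, and we assume Task is dropped from both the fixed and random parts of the restricted model:

```r
# Sketch of the mixed models; anova() on a lmerTest fit uses Satterthwaite
# degrees of freedom by default. `erd_df` and its columns (erd, decile, task,
# roi, pause_z, subject) are hypothetical.
library(lmerTest)

m_full <- lmer(erd ~ decile * task * roi * pause_z +
                 (1 + decile * task * roi | subject), data = erd_df)

m_restricted <- lmer(erd ~ decile * roi * pause_z +
                       (1 + decile * roi | subject), data = erd_df)
anova(m_restricted)  # F tests for fixed effects (Satterthwaite df)
```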


Figure 2. A: Topographies displaying mean ERD% across pause deciles for Solo (top) and Duet (bottom) performance. Outliers were not removed for this visualization (but were for data submitted to analyses, see 2B–C). B: Observed (dashed) and predicted (solid) ERD% for Solo performance. C: Observed (dashed) and predicted ERD% for Duet performance. D: Relationship between standardized pause duration and beta ERD% predicted from the linear mixed model. All predicted data displayed are computed from the full linear mixed effects model (see Results). Reproduced from Soc Cogn Affect Neurosci, Volume 16, Issue 1-2, January-February 2021, Pages 31–42, https://doi.org/10.1093/scan/nsaa096.

Discussion

We investigated the behavioral and neural processes by which musicians resolve the duration of unmeasured expressive silences in music. Our findings revealed that unmeasured silences pose a challenge to interpersonal coordination, as evidenced by reduced Duet synchronization for tones following pauses relative to other melody tones. Partners navigated this challenge by reducing the duration of pauses during Duet relative to Solo performance; tones following shorter pauses were associated with enhanced synchronization relative to tones following longer pauses. Cortical beta oscillations displayed a linear reduction in amplitude at centro-parietal sites over the course of pauses, characteristic of beta ERD, a hallmark of motor planning and preparation (Engel & Fries, 2010). Although ERD did not differ between Solo and Duet performance, ERD was enhanced during shorter relative to longer pauses, suggesting that shorter pauses in Duets may have facilitated action readiness and thereby the synchronous resolution of musical silence. Together, the current findings point to silence as fertile ground for investigating social coordination.

Acknowledgement

We thank Tamas Stolmar of Qualitis LTD (Hungarian Brain Products distributor) for consultation on our hyperscanning set-up.

References

Fermata | Grove Music. (2020).
https://www.oxfordmusiconline.com/grovemusic/view/10.1093/gmo/9781561592630.001.0001/omo-9781561592630-e-0000009487?rskey=h9Bg2M&result=1.

Delorme, A., Makeig, S. (2004).
EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis.
Journal of Neuroscience Methods, 134(1), 9–21. doi: 10.1016/j.jneumeth.2003.10.009

Engel, A.K., Fries, P. (2010).
Beta-band oscillations – signalling the status quo?
Current Opinion in Neurobiology, 20(2), 156–65. doi: 10.1016/j.conb.2010.02.015

Finney, S.A. (2001).
FTAP: a Linux-based program for tapping and music experiments.
Behavior Research Methods, Instruments, and Computers, 33(1), 65–72. doi: 10.3758/BF03195348

Pfurtscheller, G., Lopes da Silva, F.H. (1999).
Event-related EEG/MEG synchronization and desynchronization: basic principles.
Clinical Neurophysiology, 110(11), 1842–1857. doi: 10.1016/S1388-2457(99)00141-8

Tzagarakis, C., Ince, N.F., Leuthold, A.C., Pellizzer, G. (2010).
Beta band activity during motor planning reflects response uncertainty.
Journal of Neuroscience, 30(34), 11270-11277. doi: 10.1523/JNEUROSCI.6026-09.2010

Vesper, C., van der Wel, R.P.R.D., Knoblich, G., Sebanz, N. (2011).
Making oneself predictable: reduced temporal variability facilitates joint action coordination.
Experimental Brain Research, 211(3–4), 517–30. doi: 10.1007/s00221-011-2706-z

Vesper, C., Schmitz, L., Safra, L., Sebanz, N., Knoblich, G. (2016).
The role of shared visual information for joint action coordination.
Cognition, 153, 118–23. doi: 10.1016/j.cognition.2016.05.002

Zaepffel, M., Trachel, R., Kilavik, B.E., Brochier, T. (2013).
Modulations of EEG Beta power during planning and execution of grasping movements.
PLoS One, 8(3), e60060. doi: 10.1371/journal.pone.0060060