A Penny for Your Thoughts: Decoding Speech from Inexpensive Brain Signals
Quentin Auster, Kateryna Shapovalenko, Chuang Ma, Demaio Sun
We explore whether neural networks can decode brain activity into speech by mapping EEG recordings to audio representations. Using EEG data recorded as subjects listened to natural speech, we train a model with a contrastive CLIP loss to align EEG-derived embeddings with embeddings from a pre-trained transformer-based speech model. Building on Meta's state-of-the-art EEG decoder, we introduce three architectural modifications: (i) subject-specific attention layers (+0.15% WER improvement), (ii) personalized spatial attention (+0.45%), and (iii) a dual-path RNN with attention (-1.87%). Two of the three modifications improved performance, highlighting the promise of personalized architectures for brain-to-speech decoding and brain-computer interface applications.
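The contrastive CLIP loss mentioned above can be sketched as a symmetric InfoNCE objective over a batch of paired EEG and audio embeddings: row i of each matrix comes from the same speech segment, so the diagonal of the similarity matrix holds the positive pairs. The following is a minimal NumPy illustration; the batch size, embedding dimension, and temperature value are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def clip_contrastive_loss(eeg_emb, audio_emb, temperature=0.1):
    """Symmetric InfoNCE (CLIP-style) loss between EEG and audio embeddings.

    eeg_emb, audio_emb: (batch, dim) arrays. Row i of each matrix is
    assumed to come from the same speech segment, so the diagonal of the
    similarity matrix contains the positive pairs.
    """
    # L2-normalize so the dot product is cosine similarity.
    eeg = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    aud = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)

    # (batch, batch) similarity logits, scaled by the temperature.
    logits = eeg @ aud.T / temperature

    def cross_entropy(lg):
        # The correct class for row i is column i (the matching segment).
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the EEG->audio and audio->EEG retrieval losses.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

# Toy example: 4 paired 8-dimensional embeddings.
rng = np.random.default_rng(0)
eeg_batch = rng.normal(size=(4, 8))
audio_batch = rng.normal(size=(4, 8))
loss = clip_contrastive_loss(eeg_batch, audio_batch)
print(loss)
```

Minimizing this loss pulls each EEG embedding toward the audio embedding of the segment the subject actually heard, while pushing it away from the other segments in the batch, in both retrieval directions.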