Reconstructing speech envelopes from electroencephalography (EEG) signals is a challenging but valuable task for brain-computer interfaces (BCIs), with applications in assistive communication for individuals with speech impairments. While deep learning has improved reconstruction accuracy, most existing approaches are restricted to single-layer architectures such as convolutional neural networks (CNNs), which limits their ability to capture the full complexity of spatio-temporal and structural EEG patterns. In this work, we systematically extend the VLAAI framework by evaluating 26 architectures that integrate CNNs, long short-term memory networks (LSTMs), and graph convolutional networks (GCNs) in both single-layer and hybrid configurations. Experiments on the 64-channel SparrKULee dataset demonstrate that CNNs remain the strongest standalone models, but that hybrid designs, particularly CNN-LSTM and CNN-GCN-LSTM, achieve competitive or superior performance. These results highlight the importance of combining spatial, temporal, and graph-based processing, and provide practical guidelines for hybrid architecture design. Our study offers the first large-scale comparative analysis of hybrid models for EEG-based speech envelope reconstruction, advancing robust BCI systems for non-invasive speech decoding.
Gottipalli, U. S., Jha, A., Miyapuram, K. P.
