Estimation of neuronal tuning for word meaning from passively recorded naturalistic speech

Preprint Created on 28 Jun 2026 bioRxiv

The ability to derive neural-level language coding models holds great scientific and clinical potential. Current approaches are limited by the scale and ethological validity of input data; applications requiring large, rare, or naturalistic samples in particular would benefit from the ability to infer neural coding from incidental everyday speech. Here we present a novel pipeline designed to leverage spontaneous and incidental naturalistic speech. This pipeline performs transcription, segmentation, and video-assisted diarization, as well as alignment and spike detection of neural data. We apply this pipeline to a dataset derived from 21 patients (6+ days each, over 800 hours and 5 million words total). We benchmark both encoding and decoding models against extensive and rare ground-truth control datasets consisting of human-curated word-level temporal alignment and manually sorted spikes. We further validate our approach by quantifying representational drift, the effect of dataset size, and differences between six brain areas. Together, these findings demonstrate that incidental natural speech is sufficiently processed in the brain to enable the estimation of neural-level embeddings.
