Training neural networks to recognize speech increased their correspondence to the human auditory pathway but did not yield a shared hierarchy of acoustic features

Abstract

The correspondence between the activity of artificial neurons in convolutional neural networks (CNNs) trained to recognize objects in images and neural activity collected throughout the primate visual system has been well documented. Shallower layers of CNNs are typically more similar to early visual areas and deeper layers tend to be mor...

Introduction
  • The use of deep neural networks (DNNs) as models of biological neural networks has been discussed as an opportunity for synergy between neuroscience and artificial intelligence (Barrett et al., 2019; Marblestone et al., 2016; Richards et al., 2019).
  • The paradigm of comparing DNN activity to neural activity has been most thoroughly explored in research on the primate visual system.
  • Similar language has been used to describe how DNNs accomplish recognition tasks (Bengio et al., 2013).
  • Several studies have reported that state-of-the-art (SOTA) machine learning systems, trained only to maximize their performance on a specific task, without any explicit goal to mimic neural activity, appear to learn representations that are similar to those found in the brains of animals engaged in a similar task (Kriegeskorte, 2015).
  • Similar comparisons have been made between modern convnets and the human visual system as recorded with functional magnetic resonance imaging (Khaligh-Razavi and Kriegeskorte, 2014; Agrawal et al., 2014; Eickenberg et al., 2017; Güçlü and van Gerven, 2016).
Highlights
  • The use of deep neural networks (DNNs) as models of biological neural networks has been discussed as an opportunity for synergy between neuroscience and artificial intelligence (Barrett et al., 2019; Marblestone et al., 2016; Richards et al., 2019)
  • Recognition) to 7-Tesla functional magnetic resonance imaging (fMRI) activity collected throughout the human auditory pathway, including subcortical and cortical regions, while participants listened to speech
  • The results of these analyses can be summarized in similarity matrices whose rows correspond to layers of a network and whose columns correspond to the auditory regions of interest (ROIs)
  • We find no evidence of a shared hierarchy, which would manifest itself as a diagonal pattern of high neural similarity scores where shallow layers are more similar to early ROIs and deeper layers are more similar to later ROIs
  • Our experimental results clearly demonstrated that training our convnets on the triphone recognition tasks increased their representational similarity to the collected auditory fMRI activity
  • Layer fc2 yields greater neural similarity for the networks that were trained on two languages, which performed better on the triphone recognition task
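
The diagonal pattern described above can be made concrete with a small check on a layers-by-ROIs matrix. The following is only an illustrative sketch, not the authors' analysis: the function names, the toy matrices, and the heuristic itself (each layer's best-matching ROI should never move to an earlier ROI as depth increases, and should move later at least once) are assumptions introduced here.

```python
import numpy as np


def preferred_rois(similarity: np.ndarray) -> np.ndarray:
    """For each layer (row), return the index of its best-matching ROI (column)."""
    return similarity.argmax(axis=1)


def shows_diagonal_pattern(similarity: np.ndarray) -> bool:
    """Heuristic (hypothetical) test for a shared hierarchy.

    Rows are ordered from shallow to deep layers and columns from early to
    late ROIs. A diagonal pattern implies that the preferred ROI index never
    decreases with depth and is later for the deepest layer than for the
    shallowest one.
    """
    best = preferred_rois(similarity)
    never_moves_back = bool(np.all(np.diff(best) >= 0))
    moves_forward = bool(best[-1] > best[0])
    return never_moves_back and moves_forward


# Toy matrices (made-up values): 4 layers x 3 ROIs.
hierarchical = np.array([
    [0.9, 0.4, 0.1],
    [0.3, 0.8, 0.2],
    [0.1, 0.3, 0.7],
    [0.1, 0.2, 0.6],
])
no_hierarchy = np.array([
    [0.2, 0.5, 0.3],
    [0.3, 0.6, 0.4],
    [0.4, 0.9, 0.5],
    [0.2, 0.5, 0.3],
])

print(shows_diagonal_pattern(hierarchical))  # preferred ROIs: 0, 1, 2, 2
print(shows_diagonal_pattern(no_hierarchy))  # preferred ROIs: 1, 1, 1, 1
```

In the second toy matrix every layer is most similar to the same ROI, so there is no layer-to-ROI ordering even though one layer has higher scores overall.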
Methods
  • Six healthy participants with normal hearing and no known neurological disorders were recruited to participate.
  • All participants provided written informed consent prior to the first MRI session.
  • All participants consented to their data being made publicly available.
  • The native languages of the participants were English, German and Dutch.
Results
  • The results of these analyses can be summarized in similarity matrices whose rows correspond to layers of a network and whose columns correspond to the auditory ROIs. Figure 1 shows the grand mean similarity matrix, the mean similarity matrix for the untrained network, and the mean neural similarity score matrix.
  • Training increased network similarity to the auditory ROIs, as evidenced by the fact that the neural similarity scores for the trained layers are all positive (Figure 1c).
  • The authors find no evidence of a shared hierarchy, which would manifest itself as a diagonal pattern of high neural similarity scores where shallow layers are more similar to early ROIs and deeper layers are more similar to later ROIs.
  • This hypothesized diagonal pattern does not occur in the raw CKA similarity scores for either the trained or the untrained networks (Figure 1a–b).
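
As a rough illustration of how a layers-by-ROIs matrix of CKA scores could be assembled, the sketch below computes centered kernel alignment with a linear kernel. The choice of a linear kernel, the function names, and the random stand-in data are assumptions made for this example; the summary above does not state which kernel or implementation the authors used.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two (n_stimuli, n_features) response matrices.

    Both matrices must have one row per stimulus, in the same order.
    """
    # Center every feature (column) across stimuli.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


def similarity_matrix(layers, rois) -> np.ndarray:
    """Rows correspond to network layers, columns to ROIs."""
    return np.array([[linear_cka(layer, roi) for roi in rois] for layer in layers])


# Random stand-in data: 50 stimuli, 3 layers and 2 ROIs of different sizes.
rng = np.random.default_rng(0)
n_stimuli = 50
layer_activations = [rng.normal(size=(n_stimuli, d)) for d in (8, 16, 4)]
roi_responses = [rng.normal(size=(n_stimuli, v)) for v in (20, 30)]

matrix = similarity_matrix(layer_activations, roi_responses)
print(matrix.shape)  # (3, 2): one row per layer, one column per ROI
```

Linear CKA is bounded between 0 and 1 and is unchanged by rotating or uniformly rescaling either representation, which is why it can compare layers and ROIs with different numbers of units or voxels.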
Conclusion
  • The authors' experimental results clearly demonstrated that training the convnets on the triphone recognition tasks increased their representational similarity to the collected auditory fMRI activity.
  • This demonstrates that the experimental design and analysis were sufficiently sensitive to reveal training-related effects on representational similarity.
  • The first fully-connected layer, fc1, achieved the highest similarity score across all ...
  • Labels of the form “Language 1 to Language 2” indicate that the network was first trained on Language 1 and then freeze trained on Language 2.
  • Layer fc1 shows the highest neural similarity score and there is little evidence for a shared hierarchy.
  • Layer fc2 yields greater neural similarity for the networks that were trained on two languages, which performed better on the triphone recognition task.
Funding
  • This work was supported by NWO Vici-Grant 453-12-002 and the Dutch Province of Limburg, an operating grant from the Canadian Institutes of Health Research (MOP 201309), the Erasmus Mundus Student Exchange Network in Auditory Cognitive Neuroscience, a Mitacs-Accelerate internship, and doctoral scholarships from the Fonds de Recherche du Québec – Nature et technologies and Natural Sciences and Engineering Research Council (CREATE).
Study subjects and analysis
healthy participants: 6
Six healthy participants (aged 28–31, three women, three men) with normal hearing and no known neurological disorders were recruited to participate. All participants provided written informed consent prior to the first MRI session.

The preprint is made available under a CC-BY 4.0 International license.

datasets: 3
The audio corpora from which the stimuli were constructed were the same datasets that were used in Thompson et al. (2019a) and Thompson et al. (2019b), which are owned by Nuance Communications. Each of the three datasets, one each for English, Dutch and German, contained 64–83 hours of spoken text read by several native speakers in a quiet room. The datasets also included phonetic transcriptions established in a forced alignment with text transcriptions.

speakers: 60
The larger the input corpus relative to the desired quilt length, the more effectively the seams of the quilt will be hidden. Therefore, we selected the 60 speakers (30 women and 30 men) with the longest set of utterances in each language. Given all the utterances from a single speaker as input, the quilting procedure generated a one-minute quilt.
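
The speaker selection step could look roughly like the sketch below. It is a hypothetical illustration only: the tuple layout `(speaker_id, gender, seconds)`, the function name, and the reading of "longest set of utterances" as total recorded duration are all assumptions, and the corpus itself is not reproduced here.

```python
from collections import defaultdict


def select_speakers(utterances, n_per_gender=30):
    """Pick the speakers with the most recorded audio, separately per gender.

    `utterances` is an iterable of (speaker_id, gender, seconds) tuples for
    one language. Returns speaker ids grouped by gender, longest first.
    """
    totals = defaultdict(float)
    gender_of = {}
    for speaker_id, gender, seconds in utterances:
        totals[speaker_id] += seconds
        gender_of[speaker_id] = gender

    selected = []
    for gender in sorted(set(gender_of.values())):
        candidates = [s for s in totals if gender_of[s] == gender]
        candidates.sort(key=lambda s: totals[s], reverse=True)
        selected.extend(candidates[:n_per_gender])
    return selected


# Made-up toy corpus: two women and two men, durations in seconds.
toy_utterances = [
    ("f1", "F", 120.0),
    ("f2", "F", 300.0),
    ("m1", "M", 90.0),
    ("m2", "M", 45.0),
    ("m2", "M", 80.0),
    ("f1", "F", 100.0),
]
print(select_speakers(toy_utterances, n_per_gender=1))
```

With `n_per_gender=30` and the utterances of one language as input, this would return 60 speakers, whose pooled utterances could then be passed one speaker at a time to a quilting routine.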

Reference
  • Opinion in Neurobiology 55 (2019) 55–64. doi:10.1016/j.conb.2019.01.007.
  • A. H. Marblestone, G. Wayne, K. P. Kording, Towards an integration of deep learning and neuroscience, Frontiers in Computational Neuroscience (2016) 94. doi:10.3389/fncom.2016.00094.
  • Miller, R. Naud, C. C. Pack, P. Poirazi, P. Roelfsema, J. Sacramento, A. Saxe, B. Scellier, A. C. Schapiro, W. Senn, G. Wayne, D. Yamins, F. Zenke, J. Zylberberg, D. Therien, K. P. Kording, A deep learning framework for neuroscience, Nature Neuroscience 22 (2019) 1761–1770. doi:10.1038/s41593-019-0520-2.
  • Cognitive Sciences 11 (2007) 333–341. doi:10.1016/j.tics.2007.06.010.
  • Intelligence 35 (2013) 1798–1828.
  • Science 1 (2015) 417–446.
  • Representation of Primate IT Cortex for Core Visual Object Recognition, PLoS Computational Biology 10 (2014) e1003963. doi:10.1371/journal.
  • D. L. K. Yamins, H. Hong, C. F. Cadieu, E. A. Solomon, D. Seibert, J. J. DiCarlo, Performance-optimized hierarchical models predict neural responses in higher visual cortex, Proceedings of the National Academy of Sciences of the United States of America 111 (2014) 8619–24. doi:10.1073/pnas.1403112111.
  • S.-M. Khaligh-Razavi, N. Kriegeskorte, Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation, PLoS Computational Biology 10 (2014) e1003915. doi:10.1371/journal.pcbi.1003915.
  • Visual Representation in the Human Brain, arXiv (2014) 1407.5104 [q–
  • lutional network layers map the function of the human visual system, NeuroImage 152 (2017) 184–194. doi:10.1016/j.neuroimage.2016.10.001.
  • U. Güçlü, M. A. J. van Gerven, Increasingly complex representations of natural movies across the dorsal stream are shared between subjects, NeuroImage (2016) 6–13. doi:10.1016/j.neuroimage.2015.12.036.
  • P. Bashivan, K. Kar, J. J. DiCarlo, Neural population control via deep image synthesis, Science 364 (2019). doi:10.1126/science.aav9436.
  • (2016). doi:10.1038/srep27755.
  • U. Güçlü, M. A. J. van Gerven, Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream, The Journal of Neuroscience 35 (2015) 10005–10014. doi:10.1523/JNEUROSCI.
  • Walke, J. Reimer, M. Bethge, A. S. Tolias, A. S. Ecker, How well do deep neural networks trained on object recognition characterize the mouse visual system?, in: Real Neurons & Hidden Units NeurIPS Workshop, 2019.
  • Behavior, Predicts Brain Responses, and Reveals a Cortical Processing Hierarchy, Neuron 98 (2018) 630–644. doi:10.1016/j.neuron.2018.03.044.
  • U. Güçlü, J. Thielen, M. Hanke, M. A. J. van Gerven, Brains on Beats, in: Advances in Neural Information Processing Systems, 2016, p. 1606.02627.
  • Canonical Correlation Analysis for Deep Understanding and Improvement, NeurIPS (2017).
  • A. S. Morcos, M. Raghu, S. Bengio, Insights on representational similarity in neural networks with canonical correlation, NeurIPS (2018).
  • Representations Revisited, ICLR workshop on Debugging Machine Learning Models (2019).
  • Neuroscience 2 (2008).
  • J. A. F. Thompson, M. Schönwiesner, Y. Bengio, D. Willett, How transferable are features in convolutional neural network acoustic models across languages?, Proceedings of the IEEE International Conference on Audio, Speech and Signal Processing (ICASSP) (2019a).
  • J. A. F. Thompson, Y. Bengio, M. Schönwiesner, The effect of task and training on intermediate representations in convolutional neural networks revealed with modified RV similarity analysis, in: Cognitive Computational Neuroscience, 2019b.
  • T. Overath, J. H. McDermott, J. M. Zarate, D. Poeppel, The cortical analysis of speech-specific temporal structure revealed by responses to sound quilts, Nature Neuroscience 18 (2015) 903–911. doi:10.1038/nn.4021.
  • M. A. Griswold, P. M. Jakob, R. M. Heidemann, M. Nittka, V. Jellus, J. Wang, B. Kiefer, A. Haase, Generalized Autocalibrating Partially Parallel Acquisitions (GRAPPA), Magnetic Resonance in Medicine 47 (2002) 1202–1210. doi:10.1002/mrm.10171.
  • S. Moeller, E. Yacoub, C. A. Olman, E. Auerbach, J. Strupp, N. Harel, K. Uğurbil, Multiband Multislice GE-EPI at 7 Tesla, With 16-Fold Acceleration Using Partial Parallel Imaging With Application to High Spatial and Temporal Whole-Brain FMRI, Magnetic Resonance in Medicine 63 (2010). doi:10.1161/CIRCULATIONAHA.110.956839.
  • Resonance in Medicine 67 (2012) 1210–1224. doi:10.1002/mrm.23097.
  • fMRIPrep: a robust preprocessing pipeline for functional MRI, Nature Methods (2018a). doi:10.1038/s41592-018-0235-4.
  • D. E. P. Gomez, D. J. Lurie, Z. Ye, R. A. Poldrack, K. J. Gorgolewski, fMRIPrep, Software (2018b). doi:10.5281/zenodo.852659.
  • Waskom, S. Ghosh, Nipype: a flexible, lightweight and extensible neuroimaging data processing framework in Python, Frontiers in Neuroinformatics 5 (2011) 13. doi:10.3389/fninf.2011.00013.
  • A. Marina, A. Mattfeld, M. Noel, L. Snoek, K. Matsubara, B. Cheung, S. Rothmei, S. Urchs, J. Durnez, F. Mertz, D. Geisler, A. Floren, S. Gerhard, P. Sharp, M. Molina-Romero, A. Weinstein, W. Broderick, V. Saase, S. K. Andberg, R. Harms, K. Schlamp, J. Arias, D. Papadopoulos Orfanos, C. Tarbert, A. Tambini, A. De La Vega, T. Nickson, M. Brett, M. Falkiewicz, K. Podranski, J. Linkersdörfer, G. Flandin, E. Ort, D. Shachnev, D. McNamee, A. Davison, J. Varada, I. Schwabacher, J. Pellman, M. Perez-Guevara, R. Khanuja, N. Pannetier, C. McDermottroe, S. Ghosh, Nipype, Software (2018). doi:10.5281/zenodo.596855.
  • N. J. Tustison, B. B. Avants, P. A. Cook, Y. Zheng, A. Egan, P. A. Yushkevich, J. C. Gee, N4ITK: Improved N3 Bias Correction, IEEE Transactions on Medical Imaging 29 (2010) 1310–1320. doi:10.1109/TMI.2010.
  • (2008) 26–41. doi:10.1016/j.media.2007.06.004.
  • Y. Zhang, M. Brady, S. Smith, Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm, IEEE Trans Med Imag 20 (2001) 45–57.
  • M. Reuter, H. D. Rosas, B. Fischl, Highly accurate inverse consistent registration: A robust approach, NeuroImage 53 (2010) 1181–1196. doi:10.1016/j.neuroimage.2010.07.020.
  • Segmentation and Surface Reconstruction, NeuroImage 9 (1999) 179–194. doi:10.1006/nimg.1998.0395.
  • A. Klein, S. S. Ghosh, F. S. Bao, J. Giard, Y. Häme, E. Stavsky, N. Lee, B. Rossa, M. Reuter, E. C. Neto, A. Keshavan, Mindboggling morphometry of human brains, PLOS Computational Biology 13 (2017) e1005350. doi:10.1371/journal.pcbi.1005350.
  • NeuroImage 47, Supple (2009) S102. doi:10.1016/
  • D. N. Greve, B. Fischl, Accurate and robust brain image alignment using boundary-based registration, NeuroImage 48 (2009) 63–72. doi:10.1016/
  • Brain Images, NeuroImage 17 (2002) 825–841. doi:10.1006/nimg.2002.1132.
  • R. W. Cox, J. S. Hyde, Software tools for analysis and visualization of fMRI data, NMR in Biomedicine 10 (1997) 171–178. doi:10.1002/(SICI)
  • in resting state fMRI, NeuroImage 84 (2014) 320–341. doi:10.1016/j.
  • (2007) 90–101. doi:10.1016/j.neuroimage.2007.04.042.
  • T. D. Satterthwaite, M. A. Elliott, R. T. Gerraty, K. Ruparel, J. Loughead, M. E. Calkins, S. B. Eickhoff, H. Hakonarson, R. C. Gur, R. E. Gur, D. H. Wolf, An improved framework for confound regression and filtering for control of motion artifact in the preprocessing of resting-state functional connectivity data, NeuroImage 64 (2013) 240–256. doi:10.1016/j.
  • C. Lanczos, Evaluation of Noisy Data, Journal of the Society for Industrial and Applied Mathematics Series B Numerical Analysis 1 (1964) 76–85. doi:10.1137/0701007.
  • K. R. Sitek, O. Faruk Gulban, E. Calabrese, G. A. Johnson, S. S. Ghosh, F. De Martino, Mapping the human subcortical auditory system using histology, post mortem MRI and in vivo MRI at 7T, eLife (2019). doi:10.
  • A. Abraham, F. Pedregosa, M. Eickenberg, P. Gervais, A. Mueller, J. Kossaifi, A. Gramfort, B. Thirion, G. Varoquaux, neuroimaging with scikit-learn, Frontiers in Neuroinformatics 8 (2014). doi:10.3389/fninf.2014.00014.
  • C. Cortes, M. Mohri, A. Rostamizadeh, Algorithms for learning kernels based on centered alignment, Journal of Machine Learning Research 13 (2012)
  • Theory (2005) 63–77. doi:10.1007/11564089_7.
  • Methods: The RV-Coefficient, Applied Statistics 25 (1976).
  • Proceeding Series 227 (2007) 823–830. doi:10.1145/1273496.1273600.
  • W. McKinney, Data structures for statistical computing in python, in: Proceedings of the 9th Python in Science Conference, volume 445, Austin, TX, 2010, pp. 51–56.
  • W. McKinney, pandas: a foundational Python library for data analysis and statistics, Python for High Performance and Scientific Computing 14 (2011).
  • Behavior 11 (2017) 253–263. doi:10.1007/s11682-016-9515-8.
  • model of the representation space in human ventral temporal cortex, Neuron 2 (2011).
  • S. Recanatesi, M. Farrell, M. Advani, T. Moore, G. Lajoie, E. Shea-Brown, Dimensionality compression and expansion in Deep Neural Networks (2019).
  • A. Ansuini, A. Laio, J. H. Macke, D. Zoccolan, Intrinsic dimension of data representations in deep neural networks, in: Advances in Neural Information Processing Systems, 2019.
  • L. Wyse, Audio Spectrogram Representations for Processing with Convolutional Neural Networks, in: Proceedings of the First International Workshop on Deep Learning and Music joint with IJCNN, 2017, pp. 37–41.
  • Model for Raw Audio, in: The 9th ISCA Speech Synthesis Workshop, 2016.
Author
JAF Thompson
E Formisano
M Schönwiesner