Convolutional neural network models of V1 responses to complex patterns

Zhang, Yimeng; Lee, Tai Sing; Li, Ming; Liu, Fang; Tang, Shiming

doi:10.1007/s10827-018-0687-7

Published: 05 June 2018

Volume 46, pages 33–54, (2019)

Journal of Computational Neuroscience

Abstract

In this study, we evaluated the convolutional neural network (CNN) method for modeling V1 neurons of awake macaque monkeys in response to a large set of complex pattern stimuli. CNN models outperformed all the other baseline models, such as Gabor-based standard models for V1 cells and various variants of generalized linear models. We then systematically dissected different components of the CNN and found two key factors that made CNNs outperform other models: thresholding nonlinearity and convolution. In addition, we fitted our data using a pre-trained deep CNN via transfer learning. The deep CNN’s higher layers, which encode more complex patterns, outperformed lower ones, and this result was consistent with our earlier work on the complexity of V1 neural code. Our study systematically evaluates the relative merits of different CNN components in the context of V1 neuron modeling.

Notes

The images were rescaled to 2/3 of their original size. We chose this scale because, in another study (Zhang et al. 2016), it gave the highest representational similarity (Kriegeskorte et al. 2008) between the CNN and neural data among all scales explored. We also tried raw images without rescaling in the current study and obtained worse results.
In theory, we should have excluded these neurons from model evaluation; we did not, because with hundreds of neurons in our data set, including or excluding them has a negligible effect.
In practice, we performed PCA only on the pure quadratic terms to reduce their dimensionality to 432, and concatenated the resulting 432-dimensional PCA-reduced quadratic terms with the 400-dimensional linear terms to generate the final 882-dimensional input vectors. This method guarantees that the information from the linear terms, which are heavily used in most V1 models, is preserved. We also tried performing PCA on the linear and pure quadratic terms together; the two methods made little difference in our experiments.
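The feature construction described in the last note can be sketched roughly as follows. This is a minimal illustration rather than the authors' code: it assumes that "pure quadratic terms" means all products x_i x_j with i ≤ j, it computes PCA through an SVD of the centered quadratic terms, and the function name `quadratic_features` and the toy dimensions are hypothetical and much smaller than those in the paper.

```python
import numpy as np

def quadratic_features(X, n_components):
    """Return [linear terms | PCA-reduced pure quadratic terms].

    X has shape (n_samples, n_pixels). The linear terms are passed through
    unchanged, so PCA cannot discard any of their information.
    """
    n_pixels = X.shape[1]
    # Assumed definition of the pure quadratic terms: x_i * x_j for i <= j.
    rows, cols = np.triu_indices(n_pixels)
    Q = X[:, rows] * X[:, cols]
    # PCA via SVD of the centered quadratic terms only.
    Q_centered = Q - Q.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Q_centered, full_matrices=False)
    Q_reduced = Q_centered @ Vt[:n_components].T
    # Concatenate untouched linear terms with the reduced quadratic terms.
    return np.concatenate([X, Q_reduced], axis=1)

# Toy usage: 50 stimuli with 6 pixels each, keeping 4 quadratic components.
rng = np.random.default_rng(0)
X_toy = rng.standard_normal((50, 6))
features = quadratic_features(X_toy, n_components=4)
print(features.shape)  # (50, 10)
```

Leaving the linear block outside the PCA step is what preserves it exactly; running PCA on the concatenated linear and quadratic terms would instead let the projection mix or drop linear directions.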

References

Adelson, E.H., & Bergen, J.R. (1985). Spatiotemporal energy models for the perception of motion. Journal of the Optical Society of America. A, 2(2), 284–299. https://doi.org/10.1364/JOSAA.2.000284. http://josaa.osa.org/abstract.cfm?URI=josaa-2-2-284.
Andrews, B.W., & Pollen, D.A. (1979). Relationship between spatial frequency selectivity and receptive field profile of simple cells. The Journal of Physiology, 287(1), 163–176. https://doi.org/10.1113/jphysiol.1979.sp012652.
Bishop, C.M. (2006). Pattern recognition and machine learning. Information science and statistics. Springer.
Cadena, S.A., Denfield, G.H., Walker, E.Y., Gatys, L.A., Tolias, A.S., Bethge, M., Ecker, A.S. (2017). Deep convolutional models improve predictions of macaque V1 responses to natural images. bioRxiv.
Carandini, M., Demb, J.B., Mante, V., Tolhurst, D.J., Dan, Y., Olshausen, B.A., Gallant, J.L., Rust, N.C. (2005). Do we know what the early visual system does? Journal of Neuroscience, 25(46), 10577–10597. https://doi.org/10.1523/JNEUROSCI.3726-05.2005. http://www.jneurosci.org/cgi/doi/10.1523/JNEUROSCI.3726-05.2005.
Coen-Cagli, R., Kohn, A., Schwartz, O. (2015). Flexible gating of contextual influences in natural vision. Nature Neuroscience, 18, 1648–1655. https://doi.org/10.1038/nn.4128.
Daugman, J.G. (1985). Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America. A, 2(7), 1160–1169.
David, S.V., & Gallant, J.L. (2005). Predicting neuronal responses during natural vision. Network: Computation in Neural Systems, 16(2-3), 239–260. https://doi.org/10.1080/09548980500464030. http://www.tandfonline.com/doi/full/10.1080/09548980500464030.
Dayan, P., & Abbott, L.F. (2001). Theoretical neuroscience. Computational and Mathematical Modeling of Neural Systems.
Finn, I.M., & Ferster, D. (2007). Computational diversity in complex cells of cat primary visual cortex. Journal of Neuroscience, 27(36), 9638–9648. https://doi.org/10.1523/JNEUROSCI.2119-07.2007. http://www.jneurosci.org/content/27/36/9638.
Friedman, J., Hastie, T., Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software Articles, 33(1), 1–22. https://doi.org/10.18637/jss.v033.i01.
Fukushima, K. (1980). Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36(4), 193–202. https://doi.org/10.1007/BF00344251.
Gollisch, T., & Meister, M. (2010). Eye smarter than scientists believed: neural computations in circuits of the retina. Neuron, 65(2), 150–164. https://doi.org/10.1016/j.neuron.2009.12.009.
Goodfellow, I.J., Bengio, Y., Courville, A. (2016). Deep learning. MIT Press.
Heeger, D.J. (1992). Half-squaring in responses of cat striate cells. Visual Neuroscience, 9(5), 427–443. https://doi.org/10.1017/S095252380001124X.
Hegdé, J., & Van Essen, D.C. (2007). A comparative study of shape representation in macaque visual areas V2 and V4. Cerebral Cortex, 17(5), 1100–1116.
Hsu, A., Borst, A., Theunissen, F. (2004). Quantifying variability in neural responses and its application for the validation of model predictions. Network: Computation in Neural Systems, 15(2), 91–109.
Hubel, D.H., & Wiesel, T.N. (1959). Receptive fields of single neurones in the cat’s striate cortex. The Journal of Physiology, 148(3), 574–591. https://doi.org/10.1113/jphysiol.1959.sp006308.
Hubel, D.H., & Wiesel, T.N. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1), 106–154. https://doi.org/10.1113/jphysiol.1962.sp006837.
Hubel, D.H., & Wiesel, T.N. (1968). Receptive fields and functional architecture of monkey striate cortex. The Journal of Physiology, 195(1), 215–243. https://doi.org/10.1113/jphysiol.1968.sp008455.
Jones, J.P., & Palmer, L.A. (1987a). An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. Journal of Neurophysiology, 58(6), 1233–1258.
Jones, J.P., & Palmer, L.A. (1987b). The two-dimensional spatial structure of simple receptive fields in cat striate cortex. Journal of Neurophysiology, 58(6), 1187–1211. http://jn.physiology.org/content/58/6/1187.
Kelly, R.C., Smith, M.A., Kass, R.E., Lee, T.S. (2010). Accounting for network effects in neuronal responses using L1 regularized point process models. In Lafferty, J.D., Williams, C.K.I., Shawe-Taylor, J., Zemel, R.S., Culotta, A. (Eds.) Advances in neural information processing systems 23: 24th annual conference on neural information processing systems 2010. Proceedings of a meeting held 6-9 December 2010, Vancouver, British Columbia, Canada (pp. 1099–1107): Curran Associates, Inc. http://papers.nips.cc/paper/4050-accounting-for-network-effects-in-neuronal-responses-using-l1-regularized-point-process-models.
Kindel, W.F., Christensen, E.D., Zylberberg, J. (2017). Using deep learning to reveal the neural code for images in primary visual cortex. ArXiv e-prints, q-bio.NC.
Kingma, D.P., & Ba, J. (2014). Adam: a method for stochastic optimization. CoRR, arXiv:1412.6980.
Klindt, D., Ecker, A.S., Euler, T., Bethge, M. (2017). Neural system identification for large populations separating “what” and “where”. In Guyon, I.I., von Luxburg, U., Bengio, S., Wallach, H.M., Fergus, R., Vishwanathan, S.V.N., Garnett, R. (Eds.) Advances in neural information processing systems 30: annual conference on neural information processing systems 2017, 4-9 December 2017 (pp. 3509–3519). Long Beach.
Köster, U., & Olshausen, B. (2013). Testing our conceptual understanding of V1 function. q-bio.NC. arXiv:1311.0778.
Kotikalapudi, R. (2017). keras-vis: Keras visualization toolkit. https://github.com/raghakot/keras-vis.
Kriegeskorte, N. (2015). Deep neural networks: a new framework for modeling biological vision and brain information processing. Annual Review of Vision Science, 1(1), 417–446.
Kriegeskorte, N., Mur, M., Bandettini, P. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2, 4. https://doi.org/10.3389/neuro.06.004.2008. https://www.frontiersin.org/article/10.3389/neuro.06.004.2008.
Krizhevsky, A., Sutskever, I., Hinton, G.E. (2012). ImageNet classification with deep convolutional neural networks. In Bartlett, P.L., Pereira, F.C.N., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (Eds.) Advances in neural information processing systems 25: 26th annual conference on neural information processing systems 2012. Proceedings of a meeting held December 3-6, 2012 (pp. 1106–1114). Lake Tahoe.
Li, M., Liu, F., Jiang, H., Lee, T.S., Tang, S. (2017). Long-term two-photon imaging in awake macaque monkey. Neuron, 93(5), 1049–1057.e3.
McCullagh, P., & Nelder, J. (1989). Generalized linear models, 2nd edn. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Taylor & Francis.
McFarland, J.M., Cui, Y., Butts, D.A. (2013). Inferring nonlinear neuronal computation based on physiologically plausible inputs. PLoS Computational Biology, 9(7), e1003143.
McIntosh, L.T., Maheswaranathan, N., Nayebi, A., Ganguli, S., Baccus, S.A. (2017). Deep learning models of the retinal response to natural scenes. ArXiv e-prints, q-bio.NC.
Olah, C., Mordvintsev, A., Schubert, L. (2017). Feature visualization. Distill. https://distill.pub/2017/feature-visualization.
Paninski, L. (2004). Maximum likelihood estimation of cascade point-process neural encoding models. Network: Computation in Neural Systems, 15(4), 243–262.
Park, I.M., & Pillow, J.W. (2011). Bayesian spike-triggered covariance analysis. In Taylor, J.S., Zemel, R.S., Bartlett, P.L., Pereira, F.C.N., Weinberger, K.Q. (Eds.) Advances in neural information processing systems 24: 25th annual conference on neural information processing systems 2011. Proceedings of a meeting held 12-14 December 2011 (pp. 1692–1700). Granada.
Park, I.M., Archer, E., Priebe, N., Pillow, J.W. (2013). Spectral methods for neural characterization using generalized quadratic models. In Burges, C.J.C., Bottou, L., Ghahramani, Z., Weinberger, K.Q. (Eds.) Advances in neural information processing systems 26: 27th annual conference on neural information processing systems 2013. Proceedings of a meeting held December 5-8, 2013 (pp. 2454–2462). Lake Tahoe.
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A. (2017). Automatic differentiation in PyTorch.
Pillow, J.W., Shlens, J., Paninski, L., Sher, A., Litke, A.M., Chichilnisky, E.J., Simoncelli, E.P. (2008). Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454, 995–999. https://doi.org/10.1038/nature07140.
Prenger, R., Wu, M.C.K., David, S.V., Gallant, J.L. (2004). Nonlinear V1 responses to natural scenes revealed by neural network analysis. Neural Networks, 17(5–6), 663–679.
Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2, 1019–1025. https://doi.org/10.1038/14819.
Rowekamp, R.J., & Sharpee, T.O. (2017). Cross-orientation suppression in visual area V2. Nature Communications, 8, 15739.
Rust, N.C., Schwartz, O., Movshon, J.A., Simoncelli, E.P. (2005). Spatiotemporal elements of macaque V1 receptive fields. Neuron, 46(6), 945–956. https://doi.org/10.1016/j.neuron.2005.05.021. http://linkinghub.elsevier.com/retrieve/pii/S089662730500468X.
Schoppe, O., Harper, N.S., Willmore, B.D.B., King, A.J., Schnupp, J.W.H. (2016). Measuring the performance of neural models. Frontiers in Computational Neuroscience, 10, 1929.
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. ArXiv e-prints, cs.CV.
Tang, S., Lee, T.S., Li, M., Zhang, Y., Xu, Y., Liu, F., Teo, B., Jiang, H. (2018). Complex pattern selectivity in macaque primary visual cortex revealed by large-scale two-photon imaging. Current Biology, 28(1), 38–48.e3.
Theunissen, F.E., David, S.V., Singh, N.C., Hsu, A., Vinje, W.E., Gallant, J.L. (2001). Estimating spatio-temporal receptive fields of auditory and visual neurons from their responses to natural stimuli. Network: Computation in Neural Systems, 12(3), 289–316.
Touryan, J., Felsen, G., Dan, Y. (2005). Spatial structure of complex cell receptive fields measured with natural images. Neuron, 45(5), 781–791. https://doi.org/10.1016/j.neuron.2005.01.029. http://linkinghub.elsevier.com/retrieve/pii/S0896627305000619.
Victor, J.D., Mechler, F., Repucci, M.A., Purpura, K.P., Sharpee, T. (2006). Responses of V1 neurons to two-dimensional Hermite functions. Journal of Neurophysiology, 95(1), 379–400. https://doi.org/10.1152/jn.00498.2005. http://jn.physiology.org/content/95/1/379.
Vintch, B., Movshon, J.A., Simoncelli, E.P. (2015). A convolutional subunit model for neuronal responses in macaque V1. Journal of Neuroscience, 35(44), 14829–14841.
Wu, M.C.K., David, S.V., Gallant, J.L. (2006). Complete functional characterization of sensory neurons by system identification. Annual Review of Neuroscience, 29(1), 477–505.
Yamins, D., Hong, H., Cadieu, C.F., DiCarlo, J.J. (2013). Hierarchical modular optimization of convolutional networks achieves representations similar to macaque IT and human ventral stream. In Burges, C.J.C., Bottou, L., Ghahramani, Z., Weinberger, K.Q. (Eds.) Advances in neural information processing systems 26: 27th annual conference on neural information processing systems 2013. Proceedings of a meeting held December 5-8, 2013 (pp. 3093–3101). Lake Tahoe.
Yamins, D.L.K., & DiCarlo, J.J. (2016). Using goal-driven deep learning models to understand sensory cortex. Nature Neuroscience, 19(3), 356–365.
Zhang, Y., Massot, C., Zhi, T., Papandreou, G., Yuille, A., Lee, T.S. (2016). Understanding neural representations in early visual areas using convolutional neural networks. In Neuroscience (SfN).

Acknowledgments

We thank Rob Kass, Dave Touretzky, and the reviewers for careful comments on the manuscript. We thank Wenbiao Gan for the early provision of AAV-GCaMP5, and Peking University Laboratory Animal Center for excellent animal care. We acknowledge the Janelia Farm program for providing the GCaMP5-G construct, specifically Loren L. Looger, Jasper Akerboom, Douglas S. Kim, and the Genetically Encoded Calcium Indicator (GECI) project at Janelia Farm Research Campus, Howard Hughes Medical Institute. This work was supported by the National Natural Science Foundation of China No. 31730109, National Natural Science Foundation of China Outstanding Young Researcher Award 30525016, a project 985 grant of Peking University, Beijing Municipal Commission of Science and Technology under contract No. Z151100000915070, NIH 1R01EY022247 and NSF CISE 1320651 and IARPA D16PC00007 of the U.S.A.

Author information

Authors and Affiliations

Center for the Neural Basis of Cognition and Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
Yimeng Zhang & Tai Sing Lee
Peking University School of Life Sciences and Peking-Tsinghua Center for Life Sciences, Beijing, 100871, China
Ming Li, Fang Liu & Shiming Tang
IDG/McGovern Institute for Brain Research at Peking University, Beijing, 100871, China
Ming Li, Fang Liu & Shiming Tang

Corresponding authors

Correspondence to Yimeng Zhang or Shiming Tang.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Additional information

Action Editor: Zhe Chen

Electronic supplementary material

Below is the link to the electronic supplementary material.

(PDF 1.95 MB)

About this article

Cite this article

Zhang, Y., Lee, T.S., Li, M. et al. Convolutional neural network models of V1 responses to complex patterns. J Comput Neurosci 46, 33–54 (2019). https://doi.org/10.1007/s10827-018-0687-7

Received: 22 December 2017
Revised: 26 April 2018
Accepted: 30 April 2018
Published: 05 June 2018
Issue Date: 15 February 2019
DOI: https://doi.org/10.1007/s10827-018-0687-7
