Convolutional neural network models of V1 responses to complex patterns

Journal of Computational Neuroscience

Abstract

In this study, we evaluated the convolutional neural network (CNN) method for modeling V1 neurons of awake macaque monkeys in response to a large set of complex pattern stimuli. CNN models outperformed all the other baseline models, such as Gabor-based standard models for V1 cells and various variants of generalized linear models. We then systematically dissected different components of the CNN and found two key factors that made CNNs outperform other models: thresholding nonlinearity and convolution. In addition, we fitted our data using a pre-trained deep CNN via transfer learning. The deep CNN’s higher layers, which encode more complex patterns, outperformed lower ones, and this result was consistent with our earlier work on the complexity of V1 neural code. Our study systematically evaluates the relative merits of different CNN components in the context of V1 neuron modeling.
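The two components singled out above, convolution and a thresholding nonlinearity, can be illustrated with a minimal NumPy sketch. This is not the authors' architecture or training code: the 20×20 image size, the three 5×5 kernels, the global max pooling, the linear readout, and all function names are assumptions chosen only for illustration, and the weights are random rather than fitted.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid cross-correlation of one 2-D image with one 2-D kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same kernel weights are reused at every location (convolution).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Thresholding nonlinearity: negative values are set to zero."""
    return np.maximum(x, 0.0)

def predict_response(image, kernels, readout_weights, bias=0.0):
    """Hypothetical one-layer model: convolve, threshold, pool, linear readout."""
    feature_maps = [relu(conv2d_valid(image, k)) for k in kernels]
    pooled = np.array([fm.max() for fm in feature_maps])  # global max pooling
    return float(pooled @ readout_weights + bias)

# Toy inputs (assumed sizes, random values).
rng = np.random.default_rng(0)
image = rng.standard_normal((20, 20))
kernels = rng.standard_normal((3, 5, 5))
readout_weights = rng.standard_normal(3)
response = predict_response(image, kernels, readout_weights, bias=0.1)
print(response)
```

In a fitted model the kernels and readout weights would be learned from recorded responses; here they only show where each component enters the computation.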

Notes

  1. The images were rescaled to 2/3 of their original size. We chose this scale because, in another study (Zhang et al. 2016), it gave the highest representational similarity (Kriegeskorte et al. 2008) between the CNN and the neural data among all scales explored. In the current study we also tried the raw images without rescaling and obtained worse results.

  2. In theory, these neurons should be excluded from model evaluation. We did not exclude them because, with hundreds of neurons in our data set, doing so has a negligible effect on the results.

  3. In practice, we performed PCA only on the pure quadratic terms to reduce their dimensionality to 432, and concatenated the resulting 432-dimensional quadratic terms with the 400-dimensional linear terms to generate the final 832-dimensional input vectors. This method guarantees that the information in the linear terms, which are heavily used in most V1 models, is preserved. We also tried performing PCA on the linear and pure quadratic terms together; the two methods made little difference in our experiments.
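The procedure in note 3 can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' code: the function names are hypothetical, the pure quadratic terms are taken to be the products x_i·x_j with i ≤ j, PCA is done with a plain SVD, and the toy sizes (50 stimuli, 16 pixels, 10 components) are far smaller than those in the note.

```python
import numpy as np

def quadratic_features(x):
    """Pure quadratic terms x_i * x_j (i <= j) for each row of x."""
    rows, cols = np.triu_indices(x.shape[1])
    return x[:, rows] * x[:, cols]

def pca_reduce(features, n_components):
    """Project centered features onto their top principal components."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def build_quadratic_model_input(x, n_quadratic_components):
    """Keep linear terms untouched; append PCA-reduced quadratic terms."""
    quad_reduced = pca_reduce(quadratic_features(x), n_quadratic_components)
    return np.concatenate([x, quad_reduced], axis=1)

# Toy data (assumed sizes): 50 stimuli with 16 pixels each.
rng = np.random.default_rng(0)
stimuli = rng.standard_normal((50, 16))
inputs = build_quadratic_model_input(stimuli, n_quadratic_components=10)
print(inputs.shape)  # (50, 26)
```

Because PCA is applied only to the quadratic block, the first columns of the result are exactly the original linear terms, which is the property the note relies on.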

References

  • Adelson, E.H., & Bergen, J.R. (1985). Spatiotemporal energy models for the perception of motion. Journal of the Optical Society of America. A, 2(2), 284–299. https://doi.org/10.1364/JOSAA.2.000284. http://josaa.osa.org/abstract.cfm?URI=josaa-2-2-284.

  • Andrews, B.W., & Pollen, D.A. (1979). Relationship between spatial frequency selectivity and receptive field profile of simple cells. The Journal of Physiology, 287(1), 163–176. https://doi.org/10.1113/jphysiol.1979.sp012652.

  • Bishop, C.M. (2006). Pattern recognition and machine learning. Information science and statistics. Springer.

  • Cadena, S.A., Denfield, G.H., Walker, E.Y., Gatys, L.A., Tolias, A.S., Bethge, M., Ecker, A.S. (2017). Deep convolutional models improve predictions of macaque V1 responses to natural images. bioRxiv.

  • Carandini, M., Demb, J.B., Mante, V., Tolhurst, D.J., Dan, Y., Olshausen, B.A., Gallant, J.L., Rust, N.C. (2005). Do we know what the early visual system does? Journal of Neuroscience, 25(46), 10577–10597. https://doi.org/10.1523/JNEUROSCI.3726-05.2005. http://www.jneurosci.org/cgi/doi/10.1523/JNEUROSCI.3726-05.2005.

  • Coen-Cagli, R., Kohn, A., Schwartz, O. (2015). Flexible gating of contextual influences in natural vision. Nature Neuroscience, 18, 1648–1655. https://doi.org/10.1038/nn.4128.

  • Daugman, J.G. (1985). Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America. A, 2(7), 1160–1169.

  • David, S.V., & Gallant, J.L. (2005). Predicting neuronal responses during natural vision. Network: Computation in Neural Systems, 16(2-3), 239–260. https://doi.org/10.1080/09548980500464030. http://www.tandfonline.com/doi/full/10.1080/09548980500464030.

  • Dayan, P., & Abbott, L.F. (2001). Theoretical neuroscience: computational and mathematical modeling of neural systems. MIT Press.

  • Finn, I.M., & Ferster, D. (2007). Computational diversity in complex cells of cat primary visual cortex. Journal of Neuroscience, 27(36), 9638–9648. https://doi.org/10.1523/JNEUROSCI.2119-07.2007. http://www.jneurosci.org/content/27/36/9638.

  • Friedman, J., Hastie, T., Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software Articles, 33(1), 1–22. https://doi.org/10.18637/jss.v033.i01.

  • Fukushima, K. (1980). Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36(4), 193–202. https://doi.org/10.1007/BF00344251.

  • Gollisch, T., & Meister, M. (2010). Eye smarter than scientists believed: neural computations in circuits of the retina. Neuron, 65(2), 150–164. https://doi.org/10.1016/j.neuron.2009.12.009.

  • Goodfellow, I.J., Bengio, Y., Courville, A. (2016). Deep learning. MIT Press.

  • Heeger, D.J. (1992). Half-squaring in responses of cat striate cells. Visual Neuroscience, 9(5), 427–443. https://doi.org/10.1017/S095252380001124X.

  • Hegdé, J., & Van Essen, D.C. (2007). A comparative study of shape representation in macaque visual areas V2 and V4. Cerebral Cortex, 17(5), 1100–1116.

  • Hsu, A., Borst, A., Theunissen, F. (2004). Quantifying variability in neural responses and its application for the validation of model predictions. Network: Computation in Neural Systems, 15(2), 91–109.

  • Hubel, D.H., & Wiesel, T.N. (1959). Receptive fields of single neurones in the cat’s striate cortex. The Journal of Physiology, 148(3), 574–591. https://doi.org/10.1113/jphysiol.1959.sp006308.

  • Hubel, D.H., & Wiesel, T.N. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1), 106–154. https://doi.org/10.1113/jphysiol.1962.sp006837.

  • Hubel, D.H., & Wiesel, T.N. (1968). Receptive fields and functional architecture of monkey striate cortex. The Journal of Physiology, 195(1), 215–243. https://doi.org/10.1113/jphysiol.1968.sp008455.

  • Jones, J.P., & Palmer, L.A. (1987a). An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. Journal of Neurophysiology, 58(6), 1233–1258.

  • Jones, J.P., & Palmer, L.A. (1987b). The two-dimensional spatial structure of simple receptive fields in cat striate cortex. Journal of Neurophysiology, 58(6), 1187–1211. http://jn.physiology.org/content/58/6/1187.

  • Kelly, R.C., Smith, M.A., Kass, R.E., Lee, T.S. (2010). Accounting for network effects in neuronal responses using L1 regularized point process models. In Lafferty, J.D., Williams, C.K.I., Shawe-Taylor, J., Zemel, R.S., Culotta, A. (Eds.) Advances in neural information processing systems 23: 24th annual conference on neural information processing systems 2010. Proceedings of a meeting held 6-9 December 2010, Vancouver, British Columbia, Canada (pp. 1099–1107): Curran Associates, Inc. http://papers.nips.cc/paper/4050-accounting-for-network-effects-in-neuronal-responses-using-l1-regularized-point-process-models.

  • Kindel, W.F., Christensen, E.D., Zylberberg, J. (2017). Using deep learning to reveal the neural code for images in primary visual cortex. ArXiv e-prints, q-bio.NC.

  • Kingma, D.P., & Ba, J. (2014). Adam: a method for stochastic optimization. CoRR, arXiv:1412.6980.

  • Klindt, D., Ecker, A.S., Euler, T., Bethge, M. (2017). Neural system identification for large populations separating “what” and “where”. In Guyon, I.I., von Luxburg, U., Bengio, S., Wallach, H.M., Fergus, R., Vishwanathan, S.V.N., Garnett, R. (Eds.) Advances in neural information processing systems 30: annual conference on neural information processing systems 2017, 4-9 December 2017 (pp. 3509–3519). Long Beach.

  • Köster, U., & Olshausen, B. (2013). Testing our conceptual understanding of V1 function. q-bio.NC. arXiv:1311.0778.

  • Kotikalapudi, R. (2017). keras-vis: Keras visualization toolkit. https://github.com/raghakot/keras-vis.

  • Kriegeskorte, N. (2015). Deep neural networks: a new framework for modeling biological vision and brain information processing. Annual Review of Vision Science, 1(1), 417–446.

  • Kriegeskorte, N., Mur, M., Bandettini, P. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2, 4. https://doi.org/10.3389/neuro.06.004.2008. https://www.frontiersin.org/article/10.3389/neuro.06.004.2008.

  • Krizhevsky, A., Sutskever, I., Hinton, G.E. (2012). Imagenet classification with deep convolutional neural networks. In Bartlett, P.L., Pereira, F.C.N., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (Eds.) Advances in neural information processing systems 25: 26th annual conference on neural information processing systems 2012. Proceedings of a meeting held December 3-6, 2012 (pp. 1106–1114). Lake Tahoe.

  • Li, M., Liu, F., Jiang, H., Lee, T.S., Tang, S. (2017). Long-term two-photon imaging in awake macaque monkey. Neuron, 93(5), 1049–1057.e3.

  • McCullagh, P., & Nelder, J. (1989). Generalized linear models, 2nd edn. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Taylor & Francis.

  • McFarland, J.M., Cui, Y., Butts, D.A. (2013). Inferring nonlinear neuronal computation based on physiologically plausible inputs. PLoS Computational Biology, 9(7), e1003143.

  • McIntosh, L.T., Maheswaranathan, N., Nayebi, A., Ganguli, S., Baccus, S.A. (2017). Deep learning models of the retinal response to natural scenes. ArXiv e-prints, q-bio.NC.

  • Olah, C., Mordvintsev, A., Schubert, L. (2017). Feature visualization. Distill. https://distill.pub/2017/feature-visualization.

  • Paninski, L. (2004). Maximum likelihood estimation of cascade point-process neural encoding models. Network: Computation in Neural Systems, 15(4), 243–262.

  • Park, I.M., & Pillow, J.W. (2011). Bayesian spike-triggered covariance analysis. In Taylor, J.S., Zemel, R.S., Bartlett, P.L., Pereira, F.C.N., Weinberger, K.Q. (Eds.) Advances in neural information processing systems 24: 25th annual conference on neural information processing systems 2011. Proceedings of a meeting held 12-14 December 2011 (pp. 1692–1700). Granada.

  • Park, I.M., Archer, E., Priebe, N., Pillow, J.W. (2013). Spectral methods for neural characterization using generalized quadratic models. In Burges, C.J.C., Bottou, L., Ghahramani, Z., Weinberger, K.Q. (Eds.) Advances in neural information processing systems 26: 27th annual conference on neural information processing systems 2013. Proceedings of a meeting held December 5-8, 2013 (pp. 2454–2462). Lake Tahoe.

  • Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A. (2017). Automatic differentiation in PyTorch.

  • Pillow, J.W., Shlens, J., Paninski, L., Sher, A., Litke, A.M., Chichilnisky, E.J., Simoncelli, E.P. (2008). Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454, 995–999. https://doi.org/10.1038/nature07140.

  • Prenger, R., Wu, M.C.K., David, S.V., Gallant, J.L. (2004). Nonlinear V1 responses to natural scenes revealed by neural network analysis. Neural Networks, 17(5–6), 663–679.

  • Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2, 1019–1025. https://doi.org/10.1038/14819.

  • Rowekamp, R.J., & Sharpee, T.O. (2017). Cross-orientation suppression in visual area V2. Nature Communications, 8, 15739.

  • Rust, N.C., Schwartz, O., Movshon, J.A., Simoncelli, E.P. (2005). Spatiotemporal elements of macaque V1 receptive fields. Neuron, 46(6), 945–956. https://doi.org/10.1016/j.neuron.2005.05.021. http://linkinghub.elsevier.com/retrieve/pii/S089662730500468X.

  • Schoppe, O., Harper, N.S., Willmore, B.D.B., King, A.J., Schnupp, J.W.H. (2016). Measuring the performance of neural models. Frontiers in Computational Neuroscience, 10, 1929.

  • Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. ArXiv e-prints, cs.CV.

  • Tang, S., Lee, T.S., Li, M., Zhang, Y., Xu, Y., Liu, F., Teo, B., Jiang, H. (2018). Complex pattern selectivity in macaque primary visual cortex revealed by large-scale two-photon imaging. Current Biology, 28(1), 38–48.e3.

  • Theunissen, F.E., David, S.V., Singh, N.C., Hsu, A., Vinje, W.E., Gallant, J.L. (2001). Estimating spatio-temporal receptive fields of auditory and visual neurons from their responses to natural stimuli. Network: Computation in Neural Systems, 12(3), 289–316.

  • Touryan, J., Felsen, G., Dan, Y. (2005). Spatial structure of complex cell receptive fields measured with natural images. Neuron, 45(5), 781–791. https://doi.org/10.1016/j.neuron.2005.01.029. http://linkinghub.elsevier.com/retrieve/pii/S0896627305000619.

  • Victor, J.D., Mechler, F., Repucci, M.A., Purpura, K.P., Sharpee, T. (2006). Responses of V1 neurons to two-dimensional Hermite functions. Journal of Neurophysiology, 95(1), 379–400. https://doi.org/10.1152/jn.00498.2005. http://jn.physiology.org/content/95/1/379.

  • Vintch, B., Movshon, J.A., Simoncelli, E.P. (2015). A convolutional subunit model for neuronal responses in macaque V1. Journal of Neuroscience, 35(44), 14829–14841.

  • Wu, M.C.K., David, S.V., Gallant, J.L. (2006). Complete functional characterization of sensory neurons by system identification. Annual Review of Neuroscience, 29(1), 477–505.

  • Yamins, D., Hong, H., Cadieu, C.F., DiCarlo, J.J. (2013). Hierarchical modular optimization of convolutional networks achieves representations similar to macaque IT and human ventral stream. In Burges, C.J.C., Bottou, L., Ghahramani, Z., Weinberger, K.Q. (Eds.) Advances in neural information processing systems 26: 27th annual conference on neural information processing systems 2013. Proceedings of a meeting held December 5-8, 2013 (pp. 3093–3101). Lake Tahoe.

  • Yamins, D.L.K., & DiCarlo, J.J. (2016). Using goal-driven deep learning models to understand sensory cortex. Nature Neuroscience, 19(3), 356–365.

  • Zhang, Y., Massot, C., Zhi, T., Papandreou, G., Yuille, A., Lee, T.S. (2016). Understanding neural representations in early visual areas using convolutional neural networks. In Neuroscience (SfN).

Acknowledgments

We thank Rob Kass, Dave Touretzky, and the reviewers for careful comments on the manuscript. We thank Wenbiao Gan for the early provision of AAV-GCaMP5, and the Peking University Laboratory Animal Center for excellent animal care. We acknowledge the Janelia Farm program for providing the GCaMP5-G construct, specifically Loren L. Looger, Jasper Akerboom, Douglas S. Kim, and the Genetically Encoded Calcium Indicator (GECI) project at Janelia Farm Research Campus, Howard Hughes Medical Institute. This work was supported by the National Natural Science Foundation of China No. 31730109, National Natural Science Foundation of China Outstanding Young Researcher Award 30525016, a project 985 grant of Peking University, Beijing Municipal Commission of Science and Technology under contract No. Z151100000915070, NIH 1R01EY022247 and NSF CISE 1320651 and IARPA D16PC00007 of the U.S.A.

Author information

Corresponding authors

Correspondence to Yimeng Zhang or Shiming Tang.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Additional information

Action Editor: Zhe Chen

About this article

Cite this article

Zhang, Y., Lee, T.S., Li, M. et al. Convolutional neural network models of V1 responses to complex patterns. J Comput Neurosci 46, 33–54 (2019). https://doi.org/10.1007/s10827-018-0687-7

  • DOI: https://doi.org/10.1007/s10827-018-0687-7
