Music Genre Classification via Sequential Wavelet Scattering Feature Learning

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11776)

Abstract

Various content-based high-level descriptors are used for musical similarity, classification and recommendation tasks. Our study uses wavelet scattering coefficients as features, which provide both a translation-invariant representation and a characterization of transients in the audio signal, to predict musical genre. The extracted features are fed to sequential architectures to model the temporal dependencies of a musical piece more efficiently. The proposed technique achieves classification results that are competitive with frameworks based on hand-engineered features.
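As a rough illustration of the kind of feature sequence described above, the following sketch computes first-order scattering-style coefficients: a band-pass filter bank, a complex modulus, and a local time average. It is not the authors' implementation. The Gaussian filters, the centre-frequency layout, the non-overlapping averaging frames, and the names `gabor_bank` and `scattering_order1` are all assumptions made for this example, and second-order coefficients are omitted.

```python
import numpy as np

def gabor_bank(n, J, Q):
    """Gaussian band-pass filters in the Fourier domain (illustrative, not true Morlet wavelets)."""
    freqs = np.fft.fftfreq(n)                           # cycles per sample, in [-0.5, 0.5)
    centers = 0.4 * 2.0 ** (-np.arange(J * Q) / Q)      # Q geometrically spaced filters per octave
    widths = centers * (2.0 ** (1.0 / Q) - 1.0)         # bandwidth proportional to centre frequency
    return np.exp(-0.5 * ((freqs[None, :] - centers[:, None]) / widths[:, None]) ** 2)

def scattering_order1(x, J=6, Q=8):
    """Return an array of shape (n_frames, J * Q): one feature vector per frame of 2**J samples."""
    n = len(x)
    bank = gabor_bank(n, J, Q)
    # Wavelet modulus |x * psi|, computed as a circular convolution through the FFT.
    u1 = np.abs(np.fft.ifft(np.fft.fft(x)[None, :] * bank, axis=1))
    frame = 2 ** J
    n_frames = n // frame
    # Local averaging over non-overlapping frames stands in for the low-pass filter phi.
    s1 = u1[:, : n_frames * frame].reshape(-1, n_frames, frame).mean(axis=2)
    return s1.T

# Hypothetical input: white noise standing in for an audio excerpt.
rng = np.random.default_rng(0)
signal = rng.standard_normal(4096)
features = scattering_order1(signal, J=6, Q=8)   # a sequence of frames for a sequential model
```

Each row of `features` is one time step, so the array can be passed frame by frame to a sequential model. The averaging step is what makes the representation insensitive to small shifts, while the modulus of the band-pass outputs retains information about transients within each band.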

Notes

  1. We tried several combinations of the wavelet scattering parameters: N, the support length of the transform (the input signal length); \(2^J\), the maximum scale of the scattering transform; and Q, the number of first-order wavelets per octave. The values were \(N \in \{2^{15}, 2^{16}, 2^{17}\}\), \(2^J \in \{2^{10}, 2^{11}, 2^{12}\}\) and \(Q \in \{8, 10, 12\}\).
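The parameter combinations in this note can be enumerated with a short sketch. It reads J as the exponent, so the maximum scale \(2^J\) takes the values \(2^{10}\), \(2^{11}\) and \(2^{12}\). The `frames` entry assumes the output is subsampled by \(2^J\), which is an assumption of this example rather than something stated in the note. The names `SUPPORTS`, `LOG_SCALES`, `WAVELETS_PER_OCTAVE` and `parameter_grid` are hypothetical.

```python
from itertools import product

SUPPORTS = [2**15, 2**16, 2**17]      # N: samples per analysed excerpt
LOG_SCALES = [10, 11, 12]             # J: the maximum scale is 2**J samples
WAVELETS_PER_OCTAVE = [8, 10, 12]     # Q: first-order wavelets per octave

def parameter_grid():
    """List every (N, J, Q) combination with the resulting number of time frames."""
    grid = []
    for n, j, q in product(SUPPORTS, LOG_SCALES, WAVELETS_PER_OCTAVE):
        # Assumed: one output frame per 2**J input samples.
        grid.append({"N": n, "J": j, "Q": q, "frames": n // 2**j})
    return grid

grid = parameter_grid()
```

Under these assumptions the grid has 27 entries, and the sequence length seen by a downstream model ranges from 8 frames (N = 2^15, J = 12) to 128 frames (N = 2^17, J = 10).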

Corresponding author

Correspondence to Gokhan Bilgin.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Kanalici, E., Bilgin, G. (2019). Music Genre Classification via Sequential Wavelet Scattering Feature Learning. In: Douligeris, C., Karagiannis, D., Apostolou, D. (eds) Knowledge Science, Engineering and Management. KSEM 2019. Lecture Notes in Computer Science, vol 11776. Springer, Cham. https://doi.org/10.1007/978-3-030-29563-9_32

  • DOI: https://doi.org/10.1007/978-3-030-29563-9_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29562-2

  • Online ISBN: 978-3-030-29563-9

  • eBook Packages: Computer Science, Computer Science (R0)
