ABSTRACT
Head-related transfer functions (HRTFs) describe how sound signals bounce, scatter, and diffract as they arrive at the head and travel towards the ears. HRTFs produce distinct sound patterns that ultimately help the brain infer the spatial properties of the sound, such as its direction of arrival, 𝜃. If an earphone can learn the HRTF, it could apply the HRTF to any sound and make that sound appear directional to the user. For instance, a directional voice guide could help a tourist navigate a new city. While past works have estimated human HRTFs, an important gap lies in personalization. Today's HRTFs are global templates that are used in all products; since human HRTFs are unique, a global HRTF offers only a coarse-grained experience. This paper shows that by moving a smartphone around the head, combined with mobile acoustic communications between the phone and the earbuds, it is possible to estimate a user's personal HRTF. Our personalization system, UNIQ, combines techniques from channel estimation, motion tracking, and signal processing, with a focus on modeling signal diffraction along the curvature of the face. The results are promising and could open new doors into the rapidly growing space of immersive AR/VR, earables, smart hearing aids, etc.
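To make the idea of "applying the HRTF to any sound" concrete, the following is a minimal sketch and not UNIQ's implementation. It assumes the HRTF for one direction is available as a pair of time-domain impulse responses (one per ear) and renders a mono signal by convolving it with each. The function names `toy_hrir` and `render_binaural`, the 48 kHz sample rate, and the toy delay and gain values are all hypothetical placeholders; a measured, personalized HRTF would have a far richer shape than a single delayed impulse.

```python
import numpy as np


def toy_hrir(delay_samples: int, gain: float, length: int = 64) -> np.ndarray:
    """Hypothetical stand-in for a measured impulse response:
    a single impulse that is delayed and attenuated."""
    h = np.zeros(length)
    h[delay_samples] = gain
    return h


def render_binaural(mono: np.ndarray,
                    hrir_left: np.ndarray,
                    hrir_right: np.ndarray) -> np.ndarray:
    """Convolve a mono signal with the left and right impulse responses.

    Returns an array of shape (2, len(mono) + len(hrir) - 1),
    row 0 for the left ear and row 1 for the right ear.
    Both impulse responses are assumed to have the same length.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])


fs = 48_000                        # assumed sample rate in Hz
t = np.arange(fs // 100) / fs      # 10 ms of samples
mono = np.sin(2 * np.pi * 1000 * t)

# Assumed scenario: source on the listener's left, so the right ear
# receives the sound later (30 samples, about 0.6 ms) and quieter.
hrir_l = toy_hrir(delay_samples=0, gain=1.0)
hrir_r = toy_hrir(delay_samples=30, gain=0.6)

stereo = render_binaural(mono, hrir_l, hrir_r)
```

With these toy responses the only cues are an interaural time difference and a level difference. A personalized HRTF additionally encodes the direction-dependent spectral shaping caused by diffraction around the listener's own head and ears, which is what a global template can only approximate.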
Index Terms
- Personalizing head related transfer functions for earables