ABSTRACT
Short Message Service (SMS) headers are alphanumeric codes that identify the sender of a message without divulging its contents. A code by itself may reveal little about the intent of the message or the interests of the recipient. We show how embedding techniques can be applied to learn representations of SMS headers used for commercial communication. We use these embeddings to (1) discover insightful header cohorts and (2) create customer embeddings, which are then applied as features in supervised modeling tasks such as lookalike modeling and gender prediction. The experimental results show that the customer embeddings improve the performance of these models and also emerge as top features. This derived intelligence improves customer experience, product offerings, and advertisement yield. To the best of our knowledge, this is the first application of representation learning to SMS headers.