Abstract
This paper presents the automation of a Web advertising recognition algorithm based on regular expressions. Regular expressions, optical character recognition, databases, and automated testing are now critical to many software implementations. The tests were carried out in three Web browsers. As a result, the algorithm detected Spanish-language advertisements that distract users and, above all, extract information from them. The main feature of the algorithm is that it runs automatically and flexibly without requiring access to the code of the page in question, and in the future it could become an application that operates in the background. Relying on optical character recognition yields acceptable efficiency in detecting advertising. This identification may make it possible to build different applications, for both users and brands, with the aim of improving current online marketing models.
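As a rough illustration of the idea summarized above, the following minimal sketch applies regular expressions to text that is assumed to have already been extracted from a page screenshot by an OCR step. The pattern list, the name `find_ad_phrases`, and the sample strings are hypothetical and chosen only for illustration; they are not the expressions used in the paper, and the OCR and browser automation stages are omitted.

```python
import re
from typing import List

# Hypothetical Spanish advertising cues (assumption: not the paper's actual patterns).
AD_PATTERNS = [
    r"\boferta(s)?\b",                    # "oferta", "ofertas"
    r"\bdescuento(s)?\b",                 # "descuento", "descuentos"
    r"\bcompra\s+ahora\b",                # "compra ahora"
    r"\bsuscr[ií]bete\b",                 # "suscríbete", with or without accent
    r"\b\d{1,2}\s?%\s+de\s+descuento\b",  # "50% de descuento"
]

# One combined, case-insensitive expression; alternatives are tried left to right.
AD_REGEX = re.compile("|".join(AD_PATTERNS), re.IGNORECASE)


def find_ad_phrases(ocr_text: str) -> List[str]:
    """Return every advertising-like phrase found in OCR-extracted text."""
    return [match.group(0) for match in AD_REGEX.finditer(ocr_text)]


if __name__ == "__main__":
    # Assumed OCR output for one visible region of a page.
    sample = "Compra ahora y obtén 50% de descuento. Noticias del día."
    print(find_ad_phrases(sample))
```

Because the matching works on recognized text rather than on the page's markup, a sketch like this needs no access to the page source, which is the property the abstract highlights.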
ACKNOWLEDGMENTS
This work was supported by UNAM-PAPIIT IA105320.
Cite this article
Riaño, D., Piñon, R., Molero-Castillo, G. et al. Regular Expressions for Web Advertising Detection Based on an Automatic Sliding Algorithm. Program Comput Soft 46, 652–660 (2020). https://doi.org/10.1134/S0361768820080162