Abstract
A statistical study of the Bhagavad Gita has been conducted. Four measures were derived for the original Sanskrit text and its translations into Hindi, English and French. First, the word frequency distributions of the documents were modelled. A power law was observed, with the longest tail in the Sanskrit text. For the other versions, the distributions closely followed the Zipf–Mandelbrot pattern. Second, the Kullback–Leibler (KL) divergence between the documents was computed; for all three translations, the highest value was recorded with respect to the Sanskrit text. Third, a Shannon entropy-based measure, the vocabulary quotient, which estimates the vocabulary richness of a text, was calculated; it was highest for the Bhagavad Gita in Sanskrit. Finally, word-length distributions were obtained, with the longest word lengths found in Sanskrit. These results are attributed to the inflectional nature of Sanskrit.
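The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: tokens are split on whitespace, the KL divergence compares rank-ordered frequency lists truncated to the ranks both texts share (the abstract does not say how distributions over different vocabularies were aligned), and word length is counted in Unicode code points, which overstates the length of Devanagari words that contain combining marks. The exact definition of the vocabulary quotient is not given in the abstract, so the sketch stops at the Shannon entropy it is based on. All function names are hypothetical.

```python
import math
from collections import Counter


def word_counts(text):
    """Count whitespace-separated tokens (a deliberately naive tokenizer)."""
    return Counter(text.lower().split())


def rank_probabilities(counts):
    """Relative frequencies ordered from the most to the least frequent word."""
    total = sum(counts.values())
    return [c / total for c in sorted(counts.values(), reverse=True)]


def shannon_entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def kl_divergence(p, q):
    """D(P||Q) in bits over the ranks both lists share.

    Assumption: the two texts are compared rank by rank, and each list is
    renormalised after truncation to the shorter length.
    """
    n = min(len(p), len(q))
    p_head, q_head = p[:n], q[:n]
    p_sum, q_sum = sum(p_head), sum(q_head)
    return sum(
        (a / p_sum) * math.log2((a / p_sum) / (b / q_sum))
        for a, b in zip(p_head, q_head)
    )


def word_length_distribution(counts):
    """Map word length (in code points) to the share of tokens of that length."""
    total = sum(counts.values())
    lengths = Counter()
    for word, c in counts.items():
        lengths[len(word)] += c
    return {length: c / total for length, c in sorted(lengths.items())}


if __name__ == "__main__":
    # Toy strings standing in for two versions of a text (not real data).
    text_a = "the field of duty the field of action"
    text_b = "on the field of truth on the battle field"
    p = rank_probabilities(word_counts(text_a))
    q = rank_probabilities(word_counts(text_b))
    print("entropy of A (bits):", shannon_entropy(p))
    print("KL(A || B) (bits):", kl_divergence(p, q))
    print("word lengths in A:", word_length_distribution(word_counts(text_a)))
```

Fitting a Zipf–Mandelbrot curve to the rank-frequency list would require a separate optimisation step, which is left out here.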
Cite this article
Rajput, N.K., Ahuja, B. & Riyal, M.K. A statistical probe into the word frequency and length distributions prevalent in the translations of Bhagavad Gita. Pramana - J Phys 92, 60 (2019). https://doi.org/10.1007/s12043-018-1709-8