Abstract
There is no shortcut to knowledge, and there are no worthwhile data without preprocessing. In the first three sections of this chapter, we discuss situations that necessitate data preprocessing and how to handle them. In the final section, we discuss data manipulation in general, and specifically how to manipulate data in R using the reshape2 and plyr packages and in Python using the pandas module.
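As a rough illustration of the kind of workflow the abstract describes, the following minimal pandas sketch fills in missing values, reshapes a table from wide to long format, and then summarizes it by group. The data frame, its column names, and the choice of mean imputation are assumptions made for this example only; they are not taken from the chapter.

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format data: one row per subject, one column per visit.
wide = pd.DataFrame({
    "subject": ["a", "b", "c"],
    "visit1": [1.0, np.nan, 3.0],
    "visit2": [2.0, 4.0, np.nan],
})

# Preprocessing: replace each missing value with the mean of its column.
# Mean imputation is only one of several options and is assumed here.
visit_cols = ["visit1", "visit2"]
filled = wide.copy()
filled[visit_cols] = filled[visit_cols].fillna(filled[visit_cols].mean())

# Reshaping: wide to long, similar in spirit to melt in reshape2.
long = filled.melt(id_vars="subject", var_name="visit", value_name="score")

# Split-apply-combine: mean score per visit, similar in spirit to plyr.
visit_means = long.groupby("visit")["score"].mean()
print(visit_means)
```

The long format produced by `melt` has one row per (subject, visit) pair, which is what makes the subsequent `groupby` summary a one-liner.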
Notes
1. We use the asterisk in a*ply and elsewhere to indicate a collection of functions obtained by substituting the asterisk with other characters (for example, aaply, adply, and alply).
Copyright information
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Lebanon, G., El-Geish, M. (2018). Processing Data in R and Python. In: Computing with Data. Springer, Cham. https://doi.org/10.1007/978-3-319-98149-9_9
DOI: https://doi.org/10.1007/978-3-319-98149-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98148-2
Online ISBN: 978-3-319-98149-9
eBook Packages: Computer Science, Computer Science (R0)