Processing Data in R and Python

Computing with Data

Abstract

There is no shortcut to knowledge, and there are no worthwhile data without preprocessing. In the first three sections of this chapter, we discuss situations that necessitate data preprocessing and how to handle them. In the final section, we discuss how to manipulate data in general and, specifically, how to manipulate data in R using the reshape2 and plyr packages and in Python using the pandas module.
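
As a rough illustration of the kind of data manipulation the abstract refers to, the following minimal pandas sketch reshapes a small table from wide to long form, aggregates it by group, and reshapes it back. The table, its column names, and its values are hypothetical and chosen only for illustration; they are not taken from the chapter, and the comparisons to reshape2 in the comments are loose analogies rather than exact equivalences.

```python
import pandas as pd

# Hypothetical wide-format table: one row per subject, one column per measurement.
wide = pd.DataFrame({
    "subject": ["a", "b", "c"],
    "height": [1.0, 2.0, 3.0],
    "weight": [10.0, 20.0, 30.0],
})

# Wide -> long: one row per (subject, variable) pair,
# loosely analogous to melting a data frame with reshape2 in R.
long = wide.melt(id_vars="subject", var_name="variable", value_name="value")

# Split-apply-combine: split by variable, apply the mean, combine into a Series.
means = long.groupby("variable")["value"].mean()

# Long -> wide: one column per variable again,
# loosely analogous to casting with reshape2 in R.
restored = long.pivot(index="subject", columns="variable", values="value")

print(long)
print(means)
print(restored)
```

The long form is usually the more convenient one for grouped computations, which is why the aggregation step is applied before casting back to the wide form.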

Notes

  1. We use the asterisk in a*ply and elsewhere to indicate a collection of functions obtained by substituting the asterisk with other characters.

References

  • H. Wickham. Reshaping data with the reshape package. Journal of Statistical Software, 21(12), 2007.

  • H. Wickham. The split-apply-combine strategy for data analysis. Journal of Statistical Software, 40(1), 2011.

  • R. J. A. Little and D. B. Rubin. Statistical Analysis with Missing Data. Wiley, second edition, 2002.

  • P. J. Huber. Robust Statistics. Wiley, 1981.

  • R. Maronna, R. D. Martin, and V. J. Yohai. Robust Statistics: Theory and Methods. Wiley, 2006.

  • M. Kutner, C. Nachtsheim, J. Neter, and W. Li. Applied Linear Statistical Models. McGraw-Hill, fifth edition, 2004.

  • P. Spector. Data Manipulation with R. Springer, 2008.

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Lebanon, G., El-Geish, M. (2018). Processing Data in R and Python. In: Computing with Data. Springer, Cham. https://doi.org/10.1007/978-3-319-98149-9_9

  • DOI: https://doi.org/10.1007/978-3-319-98149-9_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98148-2

  • Online ISBN: 978-3-319-98149-9

  • eBook Packages: Computer Science, Computer Science (R0)
