Mastering Python Scientific Computing
Ebook, 606 pages, 4 hours

About this ebook

If you are a Python programmer and want to get your hands on scientific computing, this book is for you. The book expects you to have had exposure to various concepts of Python programming.
Language: English
Release date: Sep 23, 2015
ISBN: 9781783288830

    Book preview

    Mastering Python Scientific Computing - Mehta Hemant Kumar

    Table of Contents

    Mastering Python Scientific Computing

    Credits

    About the Author

    About the Reviewers

    www.PacktPub.com

    Support files, eBooks, discount offers, and more

    Why subscribe?

    Free access for Packt account holders

    Preface

    What this book covers

    What you need for this book

    Who this book is for

    Conventions

    Reader feedback

    Customer support

    Downloading the example code

    Downloading the color images of this book

    Errata

    Piracy

    Questions

    1. The Landscape of Scientific Computing – and Why Python?

    Definition of scientific computing

    A simple flow of the scientific computation process

    Examples from scientific/engineering domains

    A strategy for solving complex problems

    Approximation, errors, and associated concepts and terms

    Error analysis

    Conditioning, stability, and accuracy

    Backward and forward error analysis

    Is it okay to ignore these errors?

    Computer arithmetic and floating-point numbers

    The background of the Python programming language

    The guiding principles of the Python language

    Why Python for scientific computing?

    Compact and readable code

    Holistic language design

    Free and open source

    Language interoperability

    Portable and extensible

    Hierarchical module system

    Graphical user interface packages

    Data structures

    Python's testing framework

    Available libraries

    The downsides of Python

    Summary

    2. A Deeper Dive into Scientific Workflows and the Ingredients of Scientific Computing Recipes

    Mathematical components of scientific computations

    A system of linear equations

    A system of nonlinear equations

    Optimization

    Interpolation

    Extrapolation

    Numerical integration

    Numerical differentiation

    Differential equations

    The initial value problem

    The boundary value problem

    Random number generator

    Python scientific computing

    Introduction to NumPy

    The SciPy library

    The SciPy Subpackage

    Data analysis using pandas

    A brief idea of interactive programming using IPython

    IPython parallel computing

    IPython Notebook

    Symbolic computing using SymPy

    The features of SymPy

    Why SymPy?

    The plotting library

    Summary

    3. Efficiently Fabricating and Managing Scientific Data

    The basic concepts of data

    Data storage software and toolkits

    Files

    Structured files

    Unstructured files

    Database

    Possible operations on data

    Scientific data format

    Ready-to-use standard datasets

    Data generation

    Synthetic data generation (fabrication)

    Using Python's built-in functions for random number generation

    Bookkeeping functions

    Functions for integer random number generation

    Functions for sequences

    Statistical-distribution-based functions

    Nondeterministic random number generator

    Designing and implementing random number generators based on statistical distributions

    A program with simple logic to generate five-digit random numbers

    A brief note about large-scale datasets

    Summary

    4. Scientific Computing APIs for Python

    Numerical scientific computing in Python

    The NumPy package

    The ndarrays data structure

    File handling

    Some sample NumPy programs

    The SciPy package

    The optimization package

    The interpolation package

    Integration and differential equations in SciPy

    The stats module

    Clustering package and spatial algorithms in SciPy

    Image processing in SciPy

    Sample SciPy programs

    Statistics using SciPy

    Optimization in SciPy

    Image processing using SciPy

    Symbolic computations using SymPy

    Computer Algebra System

    Features of a general-purpose CAS

    A brief idea of SymPy

    Core capability

    Polynomials

    Calculus

    Solving equations

    Discrete math

    Matrices

    Geometry

    Plotting

    Physics

    Statistics

    Printing

    SymPy modules

    Simple exemplary programs

    Basic symbol manipulation

    Expression expansion in SymPy

    Simplification of an expression or formula

    Simple integrations

    APIs and toolkits for data analysis and visualization

    Data analysis and manipulation using pandas

    Important data structures of pandas

    Special features of pandas

    Data visualization using matplotlib

    Interactive computing in Python using IPython

    Sample data analysis and visualization programs

    Summary

    5. Performing Numerical Computing

    The NumPy fundamental objects

    The ndarray object

    The attributes of an array

    Basic operations on arrays

    Special operations on arrays (shape change and conversion)

    Classes associated with arrays

    The matrix sub class

    Masked array

    The structured/record array

    The universal function object

    Attributes

    Methods

    Various available ufunc

    The NumPy mathematical modules

    Introduction to SciPy

    Mathematical functions in SciPy

    Advanced modules/packages

    Integration

    Signal processing (scipy.signal)

    Fourier transforms (scipy.fftpack)

    Spatial data structures and algorithms (scipy.spatial)

    Optimization (scipy.optimize)

    Interpolation (scipy.interpolate)

    Linear algebra (scipy.linalg)

    Sparse eigenvalue problems with ARPACK

    Statistics (scipy.stats)

    Multidimensional image processing (scipy.ndimage)

    Clustering

    Curve fitting

    File I/O (scipy.io)

    Summary

    6. Applying Python for Symbolic Computing

    Symbols, expressions, and basic arithmetic

    Equation solving

    Functions for rational numbers, exponentials, and logarithms

    Polynomials

    Trigonometry and complex numbers

    Linear algebra

    Calculus

    Vectors

    The physics module

    Hydrogen wave functions

    Matrices and Pauli algebra

    The quantum harmonic oscillator in 1-D and 3-D

    Second quantization

    High-energy Physics

    Mechanics

    Pretty printing

    LaTeX Printing

    The cryptography module

    Parsing input

    The logic module

    The geometry module

    Symbolic integrals

    Polynomial manipulation

    Sets

    The simplify and collect operations

    Summary

    7. Data Analysis and Visualization

    Matplotlib

    The architecture of matplotlib

    The scripting layer (pyplot)

    The artist layer

    The backend layer

    Graphics with matplotlib

    Output generation

    The pandas library

    Series

    DataFrame

    Panel

    The common functionality among the data structures

    Time series and date functions

    Handling missing data

    I/O operations

    Working on CSV files

    Ready-to-eat datasets

    The pandas plotting

    IPython

    The IPython console and system shell

    The operating system interface

    Nonblocking plotting

    Debugging

    IPython Notebook

    Summary

    8. Parallel and Large-scale Scientific Computing

    Parallel computing using IPython

    The architecture of IPython parallel computing

    The components of parallel computing

    The IPython engine

    The IPython controller

    IPython view and interfaces

    The IPython client

    Example of performing parallel computing

    A parallel decorator

    IPython's magic functions

    Activating specific views

    Engines and QtConsole

    Advanced features of IPython

    Fault-tolerant execution

    Dynamic load balancing

    Pushing and pulling objects between clients and engines

    Database support for storing the requests and results

    Using MPI in IPython

    Managing dependencies among tasks

    Functional dependency

    Decorators for functional dependency

    Graph dependency

    Impossible dependencies

    The DAG dependency and the NetworkX library

    Using IPython on an Amazon EC2 cluster with StarCluster

    A note on security of IPython

    Well-known parallel programming styles

    Issues in parallel programming

    Parallel programming

    Concurrent programming

    Distributed programming

    Multiprocessing in Python

    Multithreading in Python

    Hadoop-based MapReduce in Python

    Spark in Python

    Summary

    9. Revisiting Real-life Case Studies

    Scientific computing applications developed in Python

    The One Laptop per Child project used Python for their user interface

    ExpEYES – eyes for science

    A weather prediction application in Python

    An aircraft conceptual designing tool and API in Python

    OpenQuake Engine

    SMS Siemag AG application for energy efficiency

    Automated code generator for analysis of High-energy Physics data

    Python for computational chemistry applications

    Python for developing a Blind Audio Tactile Mapping System

    TAPTools for air traffic control

    Energy-efficient lights with an embedded system

    Scientific computing libraries developed in Python

    A maritime designing API by Tribon

    Molecular Modeling Toolkit

    Standard Python packages

    Summary

    10. Best Practices for Scientific Computing

    The best practices for designing

    The implementation of best practices

    The best practices for data management and application deployment

    The best practices to achieving high performance

    The best practices for data privacy and security

    Testing and maintenance best practices

    General Python best practices

    Summary

    Index

    Mastering Python Scientific Computing



    Copyright © 2015 Packt Publishing

    All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

    Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

    Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

    First published: September 2015

    Production reference: 1180915

    Published by Packt Publishing Ltd.

    Livery Place

    35 Livery Street

    Birmingham B3 2PB, UK.

    ISBN 978-1-78328-882-3

    www.packtpub.com

    Credits

    Author

    Hemant Kumar Mehta

    Reviewers

    Austen Groener

    Sachin R. Joglekar

    Commissioning Editor

    Kartikey Pandey

    Acquisition Editor

    Kevin Colaco

    Content Development Editor

    Arshiya Umer

    Technical Editor

    Mohita Vyas

    Copy Editor

    Vikrant Phadke

    Project Coordinator

    Sanjeet Rao

    Proofreader

    Safis Editing

    Indexer

    Tejal Soni

    Graphics

    Jason Monteiro

    Production Coordinator

    Aparna Bhagat

    Cover Work

    Aparna Bhagat

    About the Author

    Hemant Kumar Mehta is a distributed and scientific computing enthusiast. He has more than 13 years of experience in teaching, research, and software development. He received his BSc (Hons.) in computer science, master of computer applications degree, and PhD in computer science from Devi Ahilya University, Indore, India in 1998, 2001, and 2011, respectively. He has worked in diverse international environments as a software developer in MNCs. He is a postdoctoral fellow at an international university of high reputation.

    Hemant has published more than 20 highly cited research papers in reputed national and international conferences and journals sponsored by ACM, IEEE, and Springer. He is the author of Getting Started with Oracle Public Cloud, Packt Publishing. He is also the coauthor of a book named Internet and Web Technology, published by Kaushal Prakashan Mandir, Indore.

    He earned his PhD in the field of cloud computing and big data. Hemant is a member of ACM (Special Interest Group on High-performance Computing Education: SIGHPC-Edu), senior member of IEEE (the computer society, STC on cloud computing, and the big data technical committee), and a senior member of IACSIT, IAENG, and MIR Labs.

    I am extremely thankful to my PhD supervisors, namely Professor Priyesh Kanungo and the late Professor Manohar Chandwani from Devi Ahilya University. Their words work as continuous guiding lights in my career and life.

    I express heartfelt thanks to my dear student and friend, Pawan Pawar, for helping me develop some programs for this book.

    I am also thankful to the entire Packt Publishing team and the reviewers for their tremendous support in maintaining the highest quality of work in this book.

    Most of all, I thank my family. I am infinitely grateful to my parents. I thank my wife, Priya, and darling sons, Luv and Darsh, for whom this acknowledgement cannot be covered in words.

    About the Reviewers

    Austen Groener was raised in Southfield, Massachusetts, USA. He completed his BA in physics from Hartwick College and went on to pursue his MS and PhD in physics from Drexel University in Philadelphia, Pennsylvania, USA. He is a reputed astrophysicist, with research interests surrounding the detailed distribution of dark matter within the largest objects in the universe—galaxy clusters. When he is not studying the cosmos, he enjoys spending his free time developing software tools for other astronomers to use. Austen has a newfound interest in web development.

    I would like to thank my family and friends for their unwavering support. To my wife, Brittany: you are the love of my life, my best friend, and my inspiration.

    Sachin R. Joglekar is a computer science graduate from BITS-Pilani (Goa campus) in India. His areas of interest primarily include machine learning and intelligent systems. He graduated in December 2014. Since then, he has been working as the cofounder of a start-up based in Mumbai. His work involves the design and development of server infrastructure and backend analytics for sensor networks. Sachin has also worked as an open source developer for SymPy, a symbolic computing library written in pure Python. His work at Google Summer of Code 2014 involved developing the vector module for SymPy.

    www.PacktPub.com

    Support files, eBooks, discount offers, and more

    For support files and downloads related to your book, please visit www.PacktPub.com.

    Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at for more details.

    At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

    https://www2.packtpub.com/books/subscription/packtlib

    Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

    Why subscribe?

    Fully searchable across every book published by Packt

    Copy and paste, print, and bookmark content

    On demand and accessible via a web browser

    Free access for Packt account holders

    If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

    To my parents and my gurus, Late Prof. Manohar Chandwani and Prof. Priyesh Kanungo

    Preface

    This book covers the Python APIs and toolkits used to perform scientific computing. It is highly recommended for readers who perform computerized engineering or scientific computations. Scientific computing is an interdisciplinary branch that requires a background in computer science, mathematics, general science (at least any one branch out of physics, chemistry, environmental science, biology, and others), and engineering. Python consists of a large number of packages, APIs, and toolkits for supporting the functionalities required by these diverse scientific and engineering domains.

    A large community of users, lots of help and documentation, a large collection of scientific libraries and environments, great performance, and good support make Python a great choice for scientific computing.

    What this book covers

    Chapter 1, The Landscape of Scientific Computing – and Why Python?, introduces the basic concepts of scientific computing. It also discusses the background of Python, its guiding principles, and why Python is an efficient choice for scientific computing.

    Chapter 2, A Deeper Dive into Scientific Workflows and the Ingredients of Scientific Computing Recipes, discusses the various concepts of mathematical and numerical analysis that are generally required to solve scientific problems. It also covers a brief introduction to the packages, toolkits, and APIs meant for performing scientific computing in the Python language.

    Chapter 3, Efficiently Fabricating and Managing Scientific Data, discusses the data that underlies scientific applications, including the basic concepts, the possible operations, and the formats and software used to store data. It also presents standard datasets and techniques for preparing synthetic data.

    Chapter 4, Scientific Computing APIs for Python, covers the basic concepts, features, and selected sample programs of various scientific computing APIs and toolkits, including NumPy, SciPy, and SymPy. A basic introduction to interactive computing, data analysis, and data visualization is also discussed in this chapter using IPython, matplotlib, and pandas.

    Chapter 5, Performing Numerical Computing, discusses how to perform numerical computations using the NumPy and SciPy packages of Python. This chapter starts with the basics of numerical computation and covers a number of advanced concepts, such as optimization, interpolation, Fourier transformation, signal processing, linear algebra, statistics, spatial algorithms, image processing, file input/output, and others.

    Chapter 6, Applying Python for Symbolic Computing, starts with the fundamentals of the Computer Algebra System (CAS) and performing symbolic computations using SymPy. It covers a vast range of topics on CAS, from using simple expressions and basic arithmetic to advanced concepts of mathematics and physics.

    Chapter 7, Data Analysis and Visualization, presents the concepts and applications of matplotlib and pandas for data analysis and visualization.

    Chapter 8, Parallel and Large-scale Scientific Computing, discusses the concepts of high-performance scientific computing using IPython parallel computing (including its use with MPI), the management of an Amazon EC2 cluster using StarCluster, multiprocessing, multithreading, Hadoop, and Spark.

    Chapter 9, Revisiting Real-life Case Studies, illustrates several case studies of scientific computing applications, libraries, and tools developed using the Python language. The cases are drawn from various engineering and science domains.

    Chapter 10, Best Practices for Scientific Computing, discusses the best practices for scientific computing. It consists of the best practices for designing, coding, data management, application deployment, high-performance computing, security, data privacy, maintenance, and support. We also cover the best practices for general Python-based development.

    What you need for this book

    The example programs given in this book require a computer with Python 2.7.9 or a higher version, along with the following Python libraries: NumPy, SciPy, SymPy, matplotlib, pandas, and IPython. You will also require the IPython.parallel package, pyzmq, SSH for security (if necessary), and Hadoop.
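
    Before working through the examples, it can help to confirm which of these libraries are importable on your machine. The following is a minimal sketch, not part of the book's example code. The names check_packages and REQUIRED are hypothetical helpers introduced only for this illustration, and the script assumes a Python 3 interpreter (it relies on importlib.util.find_spec). Note also that in recent IPython releases the parallel machinery is distributed separately as the ipyparallel package, so IPython.parallel may not be available on a modern installation.

```python
import importlib.util

# Import names for the libraries listed above; pyzmq is imported as "zmq".
# REQUIRED and check_packages are hypothetical names used for illustration.
REQUIRED = ["numpy", "scipy", "sympy", "matplotlib", "pandas", "IPython", "zmq"]


def check_packages(names=REQUIRED):
    """Return a dict mapping each import name to True if it can be located."""
    status = {}
    for name in names:
        try:
            # find_spec returns None when no such top-level module exists
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules
            status[name] = False
    return status


if __name__ == "__main__":
    for name, found in check_packages().items():
        print("{:<12} {}".format(name, "found" if found else "MISSING"))
```

    The check only locates each package without importing it, so it stays fast and has no side effects. Hadoop and SSH are not Python packages and need to be verified separately.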

    Who this book is for

    The book is intended for Python programmers willing to get hands-on exposure to scientific computing. The book expects that you have had exposure to various concepts of Python programming.

    Conventions

    In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

    Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: The functions of the random module are bound methods of a hidden instance of the random.Random class.

    A block of code is set as follows:

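    import random

    print(random.random())                 # float in [0.0, 1.0)
    print(random.uniform(1, 9))            # float between 1 and 9
    print(random.randrange(20))            # integer from 0 to 19
    print(random.randrange(0, 99, 3))      # multiple of 3 from 0 to 96
    print(random.choice('ABCDEFGHIJKLMNOPQRSTUVWXYZ'))  # a random letter, for example 'P'

    items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    random.shuffle(items)                  # shuffles the list in place
    print(items)
    print(random.sample([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 5))  # five distinct elements

    # Weighted choice: repeat each value according to its weight, then pick one
    weighted_choices = [('Three', 3), ('Two', 2), ('One', 1), ('Four', 4)]
    population = [val for val, cnt in weighted_choices for i in range(cnt)]
    print(random.choice(population))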
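
    The sentence above about the hidden random.Random instance can be checked directly. The sketch below is an illustration only, written for Python 3 (random.choices requires Python 3.6 or later); the variable names hidden, rng, first, second, and pick are assumptions made for this example. It shows that a module-level function is bound to a shared Random instance, and that creating your own seeded instance gives a reproducible stream that does not depend on that shared state.

```python
import random

# Module-level functions such as random.uniform are bound methods of one
# shared Random instance created inside the random module.
hidden = random.uniform.__self__
print(isinstance(hidden, random.Random))

# A private generator with its own state, independent of the shared one.
rng = random.Random(42)
first = [rng.randrange(0, 99, 3) for _ in range(5)]

# Re-seeding with the same value replays the same sequence.
rng.seed(42)
second = [rng.randrange(0, 99, 3) for _ in range(5)]
print(first == second)

# Weighted choice without building an expanded population list.
weighted_choices = [('Three', 3), ('Two', 2), ('One', 1), ('Four', 4)]
names, weights = zip(*weighted_choices)
pick = rng.choices(names, weights=weights, k=1)[0]
print(pick)
```

    Using a dedicated Random instance is one way to keep experiments repeatable when other code in the same process also draws random numbers.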

    Note

    Warnings or important notes appear in a box like this.

    Tip

    Tips and tricks appear like this.

    Reader feedback

    Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

    To send us general feedback, simply e-mail <feedback@packtpub.com>, and mention the book's title in the subject of your message.

    If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

    Customer support

    Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

    Downloading the example code

    You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

    Downloading the color images of this book

    We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/8823OS.pdf.

    Errata

    Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

    To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

    Piracy

    Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

    Please contact us at <copyright@packtpub.com> with a link to the suspected pirated material.

    We appreciate your help in protecting our authors and our ability to bring you valuable content.

    Questions

    If you have a problem with any aspect of this book, you can contact us at <questions@packtpub.com>, and we will do our best to address the problem.

    Chapter 1. The Landscape of Scientific Computing – and Why Python?

    Using computerized mathematical modeling and numerical analysis techniques to analyze and solve problems in the science and engineering domains is called scientific computing. Scientific problems include problems from various branches of science, such as earth science, space science, social science, life science, physical science, and formal science. These branches cover almost all the science domains that exist, from traditional science to modern engineering science, such as computer science. Engineering problems include problems from civil and electrical to (the latest) biomedical engineering.

    In this chapter, we will cover the following topics:

    Fundamentals of scientific computing

    The flow of the scientific computation process

    Examples from scientific and engineering domains

    The strategy to solve complex problems

    Approximation, errors, and related terms

    Concepts of error analysis

    Computer arithmetic and floating-point numbers

    A background of Python

    Why choose Python for scientific computing?

    Mathematical modeling refers to
