NumPy

Language: Python

Data Science

NumPy was created in 2005 by Travis Oliphant, who merged the features of the older Numeric and Numarray libraries into a single package. It standardized array computing in Python and became the foundation of the scientific Python ecosystem.

NumPy is the fundamental package for scientific computing in Python. It provides an n-dimensional array object, vectorized operations, linear algebra routines, random number generation, and tools for integrating C/C++ and Fortran code.

Installation

pip: pip install numpy
conda: conda install numpy

Usage

NumPy provides ndarray, a multidimensional array object, and functions for fast operations on arrays. It supports element-wise operations, broadcasting, linear algebra, statistical functions, random sampling, and more.

Creating arrays and basic operations

import numpy as np
arr = np.array([1, 2, 3])
print(arr + 1)  # [2 3 4]
print(arr * 2)  # [2 4 6]

Create a 1D array and perform element-wise addition and multiplication.

2D arrays and matrix multiplication

import numpy as np
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(A @ B)  # rows: [19 22] and [43 50]

Create 2x2 matrices and perform matrix multiplication using the @ operator.
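A common source of confusion is the difference between the * and @ operators on 2D arrays. The following is a minimal sketch using the same two matrices; the variable names elementwise and matrix_product are illustrative only.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# * multiplies matching entries; it is not a matrix product.
elementwise = A * B

# @ computes row-by-column products; np.matmul is the function form.
matrix_product = A @ B
same_as_matmul = np.matmul(A, B)

print(elementwise)
print(matrix_product)
```

For 2D arrays, np.dot(A, B) gives the same result as A @ B.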

Broadcasting example

import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr + np.array([10, 20, 30]))  # rows: [11 22 33] and [14 25 36]

Demonstrates broadcasting: the 1D array of shape (3,) is treated as if it were repeated for each row of the (2, 3) array, without copying data, so the addition is applied element-wise.
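Broadcasting compares shapes from the last axis backward; two axes are compatible when they are equal or when one of them is 1. The sketch below illustrates this with a row, a column, and a shape that fails; the names row, col, row_sum, and col_sum are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])             # shape (3,): matches the last axis
col = np.array([[100], [200]])           # shape (2, 1): stretched across columns

row_sum = arr + row   # row is added to each of the two rows
col_sum = arr + col   # col is added to each of the three columns

print(row_sum)
print(col_sum)

# A trailing axis of length 2 is neither 3 nor 1, so this cannot broadcast.
try:
    arr + np.array([1, 2])
except ValueError as exc:
    print("broadcast failed:", exc)
```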

Statistical functions

import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(np.mean(arr))  # 3.0
print(np.std(arr))   # about 1.414

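Compute the mean and the population standard deviation of a numeric array. np.std divides by N by default (ddof=0); pass ddof=1 for the sample standard deviation.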
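To see how the ddof argument changes the result, the following sketch computes both variants on the same array. The variable names population_std and sample_std are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# ddof=0 (the default): squared deviations are divided by N.
population_std = np.std(arr)

# ddof=1: squared deviations are divided by N - 1.
sample_std = np.std(arr, ddof=1)

print(population_std)
print(sample_std)
```

The sample value is always the larger of the two, and the gap shrinks as the array grows.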

Random sampling

import numpy as np
rand_arr = np.random.randn(3,3)
print(rand_arr)

Generate a 3x3 array of samples from a standard normal distribution (mean 0, standard deviation 1).
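np.random.randn belongs to NumPy's legacy random interface; newer code usually draws samples from a Generator created with np.random.default_rng(). The sketch below shows a few Generator methods as one possible equivalent; the variable names are illustrative.

```python
import numpy as np

# Unseeded generator: values differ from run to run.
rng = np.random.default_rng()

# Counterpart of np.random.randn(3, 3); note the shape is passed as a tuple.
normal_samples = rng.standard_normal((3, 3))

# Uniform floats in the half-open interval [0, 1).
uniform_samples = rng.random((2, 4))

# Integers from 1 to 6; the upper bound 7 is exclusive by default.
dice_rolls = rng.integers(1, 7, size=10)

print(normal_samples)
print(uniform_samples)
print(dice_rolls)
```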

Linear algebra operations

import numpy as np
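A = np.array([[1, 2], [3, 4]])
print(np.linalg.inv(A))  # approximately [[-2, 1], [1.5, -0.5]]
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)
print(eigenvectors)  # each column is one eigenvector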

Compute the inverse and the eigenvalues and eigenvectors of a square matrix. np.linalg.inv raises LinAlgError if the matrix is singular.
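Because these routines work in floating point, results are best checked with a tolerance rather than exact equality. The following sketch verifies both results for the same matrix; the names inverse_ok and eig_ok are illustrative.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

A_inv = np.linalg.inv(A)
eigenvalues, eigenvectors = np.linalg.eig(A)

# A times its inverse should equal the identity, up to rounding error.
inverse_ok = np.allclose(A @ A_inv, np.eye(2))

# Column j of eigenvectors, v_j, should satisfy A @ v_j == eigenvalues[j] * v_j.
# Broadcasting scales column j of eigenvectors by eigenvalues[j].
eig_ok = np.allclose(A @ eigenvectors, eigenvectors * eigenvalues)

print(inverse_ok, eig_ok)
```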

Error Handling

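ValueError: shapes not aligned: The inner dimensions must match for dot products and matrix multiplication; an (m, n) array can only be multiplied by an (n, p) array.
IndexError: index out of bounds: An index exceeds the size of its axis. Valid indices for an axis of length n run from -n to n - 1.
TypeError: unsupported operand type: The operands cannot be combined, for example when an object array contains None. Check the dtype of each array before doing arithmetic.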
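Each of these errors can be reproduced and caught with an ordinary try/except block. The sketch below triggers one of each; the list name caught and the specific arrays are illustrative, and the exact message text may vary between NumPy versions.

```python
import numpy as np

caught = []

# ValueError: (2, 3) @ (2, 3) has inner dimensions 3 and 2, which do not match.
try:
    np.ones((2, 3)) @ np.ones((2, 3))
except ValueError as exc:
    caught.append("ValueError")
    print("ValueError:", exc)

# IndexError: an array of length 3 has no index 5.
try:
    np.arange(3)[5]
except IndexError as exc:
    caught.append("IndexError")
    print("IndexError:", exc)

# TypeError: an object array holding None cannot take part in addition.
try:
    np.array([1, None], dtype=object) + 1
except TypeError as exc:
    caught.append("TypeError")
    print("TypeError:", exc)
```

Transposing the second operand, as in np.ones((2, 3)) @ np.ones((2, 3)).T, is one way to resolve the first case when a (2, 2) result is intended.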

Best Practices

Use vectorized operations instead of Python loops for performance.

Leverage broadcasting for efficient computation on arrays of different shapes.

Prefer NumPy functions over manual Python calculations for large datasets.

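Be mindful of array shapes and memory layout (C-contiguous, or row-major, vs F-contiguous, or column-major), since layout affects how quickly rows and columns can be traversed.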

Use NumPy random functions with fixed seeds for reproducibility in experiments.
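The practices above can be sketched together as follows. This is a minimal illustration, not a benchmark; the function names, the sample data, and the seed value 12345 are all assumptions chosen for the example.

```python
import numpy as np

# 1. Vectorized operations instead of Python loops.
values = np.arange(10_000, dtype=np.float64)

def sum_of_squares_loop(x):
    """One Python-level iteration per element (slow for large arrays)."""
    total = 0.0
    for v in x:
        total += v * v
    return total

def sum_of_squares_vectorized(x):
    """The same sum computed by a single call into compiled code."""
    return float(np.dot(x, x))

loop_result = sum_of_squares_loop(values)
vector_result = sum_of_squares_vectorized(values)

# 2. Broadcasting: subtract each column's mean without tiling it by hand.
data = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
centered = data - data.mean(axis=0)   # shape (3, 2) minus shape (2,)

# 3. Memory layout: a transpose is an F-contiguous view of a C-ordered array.
c_arr = np.zeros((3, 4))
t_view = c_arr.T
back_to_c = np.ascontiguousarray(t_view)   # C-ordered copy

# 4. Fixed seeds: two generators with the same seed yield the same samples.
SEED = 12345  # arbitrary value chosen for illustration
sample_a = np.random.default_rng(SEED).standard_normal(5)
sample_b = np.random.default_rng(SEED).standard_normal(5)

print(loop_result, vector_result)
print(centered)
print(t_view.flags["F_CONTIGUOUS"], back_to_c.flags["C_CONTIGUOUS"])
print(np.array_equal(sample_a, sample_b))
```

To compare the two sum-of-squares functions on a given machine, the standard-library timeit module can time each call.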