5  NumPy

NumPy is a Python library that provides large, multi-dimensional arrays and matrices, along with a collection of mathematical functions that operate on them. It is one of the core libraries for numerical computing in Python and is widely used in data science, machine learning, scientific computing, and engineering.

5.1 Import NumPy

import numpy as np

1D Array

arr = np.array([1, 2, 3, 4, 5])
print(arr)
[1 2 3 4 5]

2D Array (Matrix)

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(matrix)
[[1 2 3]
 [4 5 6]
 [7 8 9]]
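
To check how an array is organized, it can help to look at its attributes. The short sketch below reuses the two arrays from above and inspects `ndim`, `shape`, and `size`, which are standard attributes of every NumPy array.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# ndim is the number of dimensions (axes)
print(arr.ndim, matrix.ndim)      # 1 2

# shape is the length along each axis
print(arr.shape, matrix.shape)    # (5,) (3, 3)

# size is the total number of elements
print(arr.size, matrix.size)      # 5 9
```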

5.2 Array operations: Element-wise

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
result = arr1 + arr2  # element-wise: 1+4, 2+5, 3+6
print(arr1)
print(arr2)
print(result)
[1 2 3]
[4 5 6]
[5 7 9]
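
Addition is not the only element-wise operation. As a small illustration (the variable names here are chosen only for this sketch), the other arithmetic operators also work position by position, while `+` on plain Python lists joins the lists instead.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

difference = arr2 - arr1   # 3, 3, 3
product = arr1 * arr2      # 4, 10, 18
quotient = arr2 / arr1     # 4.0, 2.5, 2.0
squares = arr1 ** 2        # 1, 4, 9

# With regular lists, + concatenates rather than adds
joined = [1, 2, 3] + [4, 5, 6]   # [1, 2, 3, 4, 5, 6]

print(difference, product, quotient, squares)
print(joined)
```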

5.3 Matrix multiplication

mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])
product = np.dot(mat1, mat2)  # rows of mat1 times columns of mat2, not element-wise
print(mat1)
print(mat2)
print(product)
[[1 2]
 [3 4]]
[[5 6]
 [7 8]]
[[19 22]
 [43 50]]
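
Matrix multiplication is easy to confuse with element-wise multiplication. The sketch below uses the same two matrices to contrast them: the `@` operator gives the same matrix product as `np.dot` for 2D arrays, while `*` multiplies matching positions.

```python
import numpy as np

mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])

# Matrix product: each entry is a row of mat1 times a column of mat2
# e.g. top-left entry is 1*5 + 2*7 = 19
matrix_product = mat1 @ mat2

# Element-wise product: matching positions are multiplied
# e.g. top-left entry is 1*5 = 5
elementwise = mat1 * mat2

print(matrix_product)
print(elementwise)
```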

5.4 Slicing and indexing

arr = np.array([10, 20, 30, 40, 50])
sliced_arr = arr[1:4]  # indexes 1, 2, 3; the stop index 4 is excluded
print(arr)
print(sliced_arr)
[10 20 30 40 50]
[20 30 40]
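
Slicing is one of several ways to pick out elements. The following sketch shows a few other common patterns: single indexes, negative indexes, a step, and row and column access in a 2D array. The variable names are illustrative only.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

first = arr[0]           # 10 (indexes start at 0)
last = arr[-1]           # 50 (negative indexes count from the end)
every_other = arr[::2]   # 10, 30, 50 (step of 2)

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

second_row = matrix[1]        # 4, 5, 6
element = matrix[1, 2]        # 6 (row 1, column 2)
first_column = matrix[:, 0]   # 1, 4, 7 (all rows, column 0)

print(first, last, every_other)
print(second_row, element, first_column)
```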

5.5 Example using NumPy

import numpy as np

# Creating a 2x3 matrix
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# Adding a scalar value to the matrix
result = matrix + 10

# Transposing the matrix
transpose = np.transpose(matrix)

print("Original Matrix:\n", matrix)
print("Matrix after adding 10:\n", result)
print("Transposed Matrix:\n", transpose)
Original Matrix:
 [[1 2 3]
 [4 5 6]]
Matrix after adding 10:
 [[11 12 13]
 [14 15 16]]
Transposed Matrix:
 [[1 4]
 [2 5]
 [3 6]]
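
To see what transposing does to the structure of the array, the sketch below compares shapes before and after. It uses the `.T` attribute, which is a shorthand for the transpose of an array.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])
transpose = matrix.T

# Rows become columns, so the shape flips
print(matrix.shape)      # (2, 3)
print(transpose.shape)   # (3, 2)

# The element at [row, column] moves to [column, row]
print(matrix[0, 2], transpose[2, 0])   # 3 3
```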

5.6 Summary statistics

What are summary statistics?

x = [10, 11, 23, 16, 20]
  1. What is the minimum value?
  2. What is the maximum value?
  3. What is the range?
  4. What is the average?
  5. What is the median?
  6. What is the variance?
  7. What is the standard deviation?
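
As a rough worked example, the questions above can be answered for `x` with a few lines of NumPy. The variance and standard deviation here are the population versions (dividing by the number of values), which is NumPy's default behavior.

```python
import numpy as np

x = np.array([10, 11, 23, 16, 20])

minimum = x.min()                  # 10
maximum = x.max()                  # 23
value_range = maximum - minimum    # 13
average = x.mean()                 # 80 / 5 = 16.0
median = np.median(x)              # middle of 10, 11, 16, 20, 23 is 16.0
variance = x.var()                 # (36 + 25 + 49 + 0 + 16) / 5, about 25.2
std_dev = x.std()                  # square root of the variance, about 5.02

print(minimum, maximum, value_range, average, median, variance, std_dev)
```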

5.7 Using NumPy to summarize data

import numpy as np

# Create a 1D array
data = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

# Calculate the average (mean)
mean = np.mean(data)

# Calculate the median
median = np.median(data)

# Calculate the standard deviation
std_dev = np.std(data)

# Calculate the variance
variance = np.var(data)

# Calculate the minimum and maximum values
min_value = np.min(data)
max_value = np.max(data)

# Output the results
print(f"Mean: {mean}")
print(f"Median: {median}")
print(f"Standard Deviation: {std_dev}")
print(f"Variance: {variance}")
print(f"Minimum Value: {min_value}")
print(f"Maximum Value: {max_value}")
Mean: 55.0
Median: 55.0
Standard Deviation: 28.722813232690143
Variance: 825.0
Minimum Value: 10
Maximum Value: 100
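
One detail worth knowing: by default, `np.std` and `np.var` divide by the number of values (the population formulas). If the data is a sample and you want to divide by one less than the number of values, both functions accept a `ddof=1` argument. A brief sketch with the same data:

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

# Default (ddof=0): divide the sum of squared deviations by n = 10
population_var = np.var(data)        # 8250 / 10 = 825.0

# ddof=1: divide by n - 1 = 9 instead
sample_var = np.var(data, ddof=1)    # 8250 / 9, about 916.67
sample_std = np.std(data, ddof=1)    # about 30.28

print(population_var, sample_var, sample_std)
```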

5.8 What are random numbers

A random number is a value produced by a process with no predictable pattern or sequence. In the simplest case, a uniform distribution, each possible value has an equal chance of being chosen. Random numbers are used in applications such as simulations, cryptography, games, and statistical sampling, where unpredictability is essential.

In computing, random numbers are typically generated using algorithms known as pseudo-random number generators (PRNGs), which produce sequences of numbers that appear random, even though they are determined by an initial value known as a seed.
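
The role of the seed can be seen in a small experiment. In the sketch below, the seed value 42 is an arbitrary choice made for illustration; setting the same seed before each call makes the generator repeat the same sequence.

```python
import numpy as np

# Set the seed, then draw 5 integers
np.random.seed(42)
first_run = np.random.randint(0, 10, size=5)

# Reset to the same seed and draw again
np.random.seed(42)
second_run = np.random.randint(0, 10, size=5)

# Same seed, same sequence
print(first_run)
print(second_run)
print(np.array_equal(first_run, second_run))   # True
```

This is why a seed is useful when results need to be reproduced, for example when several people run the same analysis.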

5.9 Generating random numbers

import numpy as np

# Generate a single random integer from 0 to 9 (the upper bound 10 is excluded)
random_integer = np.random.randint(0, 10)

# Generate an array of 5 random integers from 0 to 9
random_integers_array = np.random.randint(0, 10, size=5)

print("Single random integer:", random_integer)
print("Array of random integers:", random_integers_array)
Single random integer: 1
Array of random integers: [9 9 8 7 4]
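
Because the results are random, your output will differ from the numbers shown above. What does not change is the range: `np.random.randint(low, high)` includes `low` but excludes `high`. The sketch below checks this on a larger sample and shows two ways to include the upper value. The seed 0 and the sample size 1000 are arbitrary choices for this illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so the run can be repeated

# Upper bound 10 is excluded: values are never below 0 or above 9
draws = np.random.randint(0, 10, size=1000)
print(draws.min(), draws.max())

# Option 1: raise the upper bound by one to allow 10
draws_with_ten = np.random.randint(0, 11, size=1000)
print(draws_with_ten.min(), draws_with_ten.max())

# Option 2: the newer Generator interface can include the endpoint
rng = np.random.default_rng(0)
draws_inclusive = rng.integers(0, 10, size=1000, endpoint=True)
print(draws_inclusive.min(), draws_inclusive.max())
```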

5.10 Exercises

  1. What is the primary purpose of NumPy in Python? And how do you access the NumPy library in your Python code?

  2. Simone writes the following line of code:

arr = np.array([1, 2, 3])

What kind of array does this create, and how are the elements organized? How could she change the structure to make it a 2D array instead?

  3. What is a NumPy array, and how is it different from a regular list? Why might someone choose to use a NumPy array instead of a list when working with numerical data?

  4. Taylor is editing a list of test scores stored as a NumPy array: 4, 8, 12, 16. She wants to replace the second to fourth items with: 20, 24, 28. What code should she use in Python, and what will the updated array look like?

  5. Raj creates a NumPy array of quiz scores and wants to give everyone a 5-point bonus. He adds the number 5 to the array and sees that each score increases. Why does this work with a NumPy array, and what would happen if he tried the same thing with a regular Python list?

  6. The following code produces an error. What’s the problem and how do you fix it?

a = np.array([1, 2, 3])
b = np.array([1, 2])
print(a + b)

  7. Martha multiplies a NumPy array by 2 to adjust test scores. What happens to the values, and why is this useful for data processing?

  8. What will the following code print?

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])
print(a * b)

  9. This code is supposed to double every element in the array.

arr = np.array([2, 4, 6])
arr * 2
print(arr)

What’s the mistake?

  10. This code is supposed to return the dot product of two matrices. But it gives the wrong output. What should be changed?

a = np.array([[1, 2], [3, 4]])
b = np.array([[2, 0], [1, 2]])
result = np.multiply(a, b)

  11. How do you access the second row of a 2D array named arr?

  12. What does the following code return? Explain.

arr = np.array([5, 10, 15, 20])
print(arr[1:3])

  13. You want to add 5 to every element in a NumPy array without using a loop. Is this possible?

  14. A student writes the following code and is confused when it doesn’t work:

import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr.mean)

What are the two mistakes in this code, and how should they be corrected?

  15. What is a pseudorandom number and why is it used in NumPy?

  16. Pseudorandom numbers are truly unpredictable. True or false?

  17. What does setting a seed with np.random.seed() do?

  18. Why might you use np.random.seed() in a group project or assignment?

  19. You want to generate random integers using NumPy. Which function would you use to create an array of random integers between two values, and what should you keep in mind about the range it covers?

  20. You’re simulating 15 rolls of a 10-sided die using np.random.randint(), where each roll gives a number from 1 to 10. Write code to generate the array of rolls. Then, explain how you could count how many times the number 10 appears.

  21. This code is meant to generate 5 random integers between 10 and 20:

nums = np.random.randint(10, 20, size=2)

What’s the mistake, and how would you correct it?

  22. You have a 2D array with 2 rows and 2 columns. What happens when you use np.transpose() on it?

  23. Given the array below, write the NumPy functions to find the: minimum, maximum, mean, median, variance, and standard deviation.

arr = np.array([5, 10, 15, 20])

  24. A student uses np.random.randint(1, 5, size=10) and is surprised they never see the number 5. Explain why this happens and how they can fix the code to include 5 in the results.