
Arrays (Lists) in Python

one thing after another

Problem

Given 5 numbers, read them in and calculate their average
THEN print out the ones that were above average

Data Structure Needed

Need some way to hold onto all the individual data items after processing them
Making individual identifiers x1, x2, x3, ... is not practical or flexible
The answer is to use an ARRAY
An array is a data structure, bigger than an individual variable or constant

An Array (a List)

You need a way to have many variables, all with the same name but distinguishable!
In math they do it by subscripts or indexes
    x1, x2, x3 and so on
In programming languages it is hard to use smaller fonts, so use a different syntax
    x[1], x[0], table[3], point[i]

Semantics

Elements are numbered from 0 to n-1, where n is the number of elements
For example, a list of 6 elements has the indexes 0, 1, 2, 3, 4, 5

Properties of an array (list)

Heterogeneous (any data type!)
Contiguous
Have random access to any element
Ordered (numbered from 0 to n-1)
Number of elements can change very easily (use method .append)
Python lists are mutable sequences of arbitrary objects
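
A minimal sketch of these properties, using made-up values: one list holds several types, any element can be reached by its index, and the list grows with append.

```python
# A list can hold values of different types at the same time.
items = [3, "nine", True, 3.92]

# Random access: any element can be read or replaced by its index.
first = items[0]
items[1] = 9            # lists are mutable, so this changes the list in place

# The number of elements can grow with append.
items.append("new")
size = len(items)       # now 5 elements, numbered 0 to 4
```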

Syntax

Use [ ] to give an initial value, like x = [1, 3, 5]
To refer to individual elements, use [ ] with the index in the brackets
Most of the time you don't refer to the whole array as one thing, or just by the array name (one time you can is when passing a whole array to a function as an argument)

List Operations you know


Operator                Meaning
<seq> + <seq>           Concatenation
<seq> * <int-expr>      Repetition
<seq>[]                 Indexing
len(<seq>)              Length
<seq>[:]                Slicing
for <var> in <seq>:     Iteration
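
Each of these operations can be tried on two small lists. The values below are only illustrative.

```python
a = [1, 3, 5]
b = [7, 9]

joined = a + b          # concatenation -> [1, 3, 5, 7, 9]
repeated = b * 2        # repetition    -> [7, 9, 7, 9]
second = joined[1]      # indexing      -> 3
count = len(joined)     # length        -> 5
middle = joined[1:4]    # slicing       -> [3, 5, 7]

total = 0
for value in joined:    # iteration visits every element in order
    total = total + value
```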

Python Programming, 2/e

Indexing an Array

The index is also called the subscript
In Python, the first array element always has subscript 0, the second array element has subscript 1, etc.
Subscripts can be variables, but they have to have integer values
k = 4
items = [3, 9, 'a', True, 3.92]
items[k] is 3.92
items[k-2] is items[2], which is 'a'

List Operations

Lists are often built up one piece at a time using append.
nums = []
x = float(input('Enter a number: '))
while x >= 0:
    nums.append(x)
    x = float(input('Enter a number: '))
Here, nums is being used as an accumulator, starting out empty, and each time through the loop a new value is tacked on.

List Operations
Method                  Meaning
<list>.append(x)        Add element x to end of list.
<list>.sort()           Sort (order) the list. In Python 3 a key function may be passed as the key parameter.
<list>.reverse()        Reverse the list.
<list>.index(x)         Returns index of first occurrence of x.
<list>.insert(i, x)     Insert x into list at index i.
<list>.count(x)         Returns the number of occurrences of x in list.
<list>.remove(x)        Deletes the first occurrence of x in list.
<list>.pop(i)           Deletes the ith element of the list and returns its value.
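
A short walk through these methods on a made-up list. The comments show the list after each call, and the last two lines show sorting with a key function.

```python
nums = [4, 1, 3]
nums.append(1)          # [4, 1, 3, 1]
nums.sort()             # [1, 1, 3, 4]
nums.reverse()          # [4, 3, 1, 1]
where = nums.index(1)   # 2, the first position holding 1
nums.insert(1, 9)       # [4, 9, 3, 1, 1]
ones = nums.count(1)    # 2
nums.remove(1)          # [4, 9, 3, 1]
last = nums.pop(3)      # returns 1, leaving [4, 9, 3]

words = ["pear", "fig", "banana"]
words.sort(key=len)     # sort by length -> ["fig", "pear", "banana"]
```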


Using a variable for the size

It is very common to use a variable to store the size of an array
SIZE = 15
arr = []
for i in range(SIZE):
    arr.append(i)
Makes it easy to change if the size of the array needs to be changed

Solution to starting problem
SIZE = 5
n = [0] * SIZE
total = 0
for ct in range(SIZE):
    n[ct] = float(input("enter a number "))
    total = total + n[ct]
cont'd on next slide

Solution to problem cont'd

average = total / SIZE
for ct in range(SIZE):
    if n[ct] > average:
        print(n[ct])
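
The same two-pass idea can be sketched as a function that takes the numbers as a list instead of reading them from the keyboard, which makes it easy to test. The function name `above_average` and the sample data are assumptions made for illustration.

```python
def above_average(numbers):
    """Return (average, list of values strictly greater than the average)."""
    total = 0
    for value in numbers:           # first pass: add everything up
        total = total + value
    average = total / len(numbers)

    result = []
    for value in numbers:           # second pass: needs the stored values
        if value > average:
            result.append(value)
    return average, result

# Hypothetical sample data standing in for the five typed-in numbers.
average, big_ones = above_average([10.0, 20.0, 30.0, 40.0, 50.0])
```

The second pass is the reason the values have to be kept in a list: the average is not known until every number has been seen.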

Scope of counter in a for loop

The counter variable has the usual scope (the body of the function it's in)
    for i in range(5):
The counter does exist after the for loop finishes
What's its value after the loop?
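
One way to check the answer is to run the loop and look at the counter afterwards. This is a small sketch; `after_loop` is a name chosen here for illustration.

```python
for i in range(5):
    pass                # the body does nothing; we only care about i

after_loop = i          # i still exists here and holds the last value, 4
```

Note that if the range is empty, for example range(0), the loop body never runs and the counter is never assigned at all.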

Initialization of arrays

a = [1, 2, 9, 10]   # has 4 elements
a = [0] * 5         # all are zero

Watch out: index out of range!

Subscripts range from 0 to n-1
The interpreter WILL tell you if an index goes out of that range
BUT negative subscripts work as they do with strings (which are, after all, arrays of characters)
x = [5] * 5
x[-1] = 4   # x is [5, 5, 5, 5, 4]
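
A small sketch of both behaviours: negative subscripts count from the right end, and a subscript outside the valid range raises IndexError. The try/except is only there so the example keeps running.

```python
x = [5] * 5

x[-1] = 4               # last element, same as x[4]
x[-5] = 1               # first element, same as x[0]

try:
    x[5] = 0            # valid indexes are 0..4 and -5..-1
    message = "no error"
except IndexError as err:
    message = str(err)  # the interpreter reports the index is out of range
```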

Assigning Values to Individual Array Elements
temps = [0.0] * 5
m = 4
temps[2] = 98.6
temps[3] = 101.2
temps[0] = 99.4
temps[m] = temps[3] / 2.0
temps[1] = temps[3] - 1.2    # What value is assigned?

Address    Element     Value
7000       temps[0]    99.4
7004       temps[1]    ?
7008       temps[2]    98.6
7012       temps[3]    101.2
7016       temps[4]    50.6

What values are assigned?

SIZE = 5
temps = [0.0] * SIZE
for m in range(SIZE):
    temps[m] = 100.0 + m * 0.2
for m in range(SIZE-1, -1, -1):
    print(temps[m])

Address    Element     Value
7000       temps[0]    ?
7004       temps[1]    ?
7008       temps[2]    ?
7012       temps[3]    ?
7016       temps[4]    ?

Indexes

Subscripts can be constants or variables or expressions
If i is 5, a[i-1] refers to a[4] and a[i*2] refers to a[10]
You can use i as a subscript at one point in the program and j as a subscript for the same array later; only the value of the variable matters

Variable Subscripts
temps = [0.0] * 5
m = 3
......

What is temps[m + 1] ?
What is temps[m] + 1 ?

Address    Element     Value
7000       temps[0]    100.0
7004       temps[1]    100.2
7008       temps[2]    100.4
7012       temps[3]    100.6
7016       temps[4]    100.8

Random access of elements

Problem: read in numbers from a file, only single digits, and count them; report how many of each there were
Use an array as a set of counters
    ctr[0] is how many zeros, ctr[1] is how many ones, etc.
ctr[num] += 1 is the crucial statement
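
A minimal sketch of the counter idea. To keep it self-contained, the digits come from a list (`digits`, a made-up stand-in for the file contents) rather than from a file.

```python
# Hypothetical data standing in for the digits read from the file.
digits = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]

ctr = [0] * 10          # one counter per digit 0..9
for num in digits:
    ctr[num] += 1       # the value itself picks which counter to bump

for d in range(10):
    print(d, "appeared", ctr[d], "times")
```

No if statements are needed to decide which counter to update; random access by subscript does the selection.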

Parallel arrays

Sometimes you have data of different types that are associated with each other, like name (string) and GPA (float)
You CAN store them in the same array
    ar = ["John", 3.24, "Mary", 3.9, "Bob", 2.7]
You can also use two different arrays "side by side"

Parallel arrays, cont'd

for i in range(SIZE):
    name[i] = input("Enter name: ")
    gpa[i] = float(input("Enter gpa: "))
Logically the name in position i corresponds to the gpa in position i
Nothing in the syntax forces this to be true, you just have to program it to be so.
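
As an illustration of how parallel arrays are used together, the sketch below finds the position of the highest GPA in one list and uses the same position to look up the name in the other. The data is made up.

```python
name = ["John", "Mary", "Bob"]
gpa = [3.24, 3.9, 2.7]

best = 0                        # position of the highest gpa seen so far
for i in range(1, len(gpa)):
    if gpa[i] > gpa[best]:
        best = i

best_name = name[best]          # same position, other list
```

If one list is sorted or edited without the other, the positions no longer match, which is the risk the slide warns about.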

Parallel Arrays
Parallel arrays are two or more arrays that have the same index range and whose elements contain related information, possibly of different data types

EXAMPLE
SIZE = 50
idNumber = [0] * SIZE        # parallel arrays ([ ] * SIZE would still be empty)
hourlyWage = [0.0] * SIZE

SIZE = 50
idNumber = [0] * SIZE        # Parallel arrays hold
hourlyWage = [0.0] * SIZE    # related information

Element         Value    Element           Value
idNumber[0]     4562     hourlyWage[0]     9.68
idNumber[1]     1235     hourlyWage[1]     45.75
idNumber[2]     6278     hourlyWage[2]     12.71
...             ...      ...               ...
idNumber[48]    8754     hourlyWage[48]    67.96
idNumber[49]    2460     hourlyWage[49]    8.97

Selection sort - 1-d array

Algorithm for the sort
1. find the maximum in the list
2. put it in the highest numbered element by swapping it with the data that was at that location
3. repeat 1 and 2 for the shorter unsorted list, not including the highest numbered location
4. repeat 1-3 until the list goes down to one element

Find the maximum in the list

# n is number of elements
max = a[0]              # value of largest element seen so far
for i in range(1, n):   # note start at 1, not 0
    if max < a[i]:
        max = a[i]
# now max is value of largest element in list

Find the location of the max

max = 0                 # max is now location of the largest seen so far
for i in range(1, n):
    if a[max] < a[i]:
        max = i
# now max is location of the largest in array

Swap with highest numbered

Remember element at right end of list is numbered n-1
temp = a[max]
a[max] = a[n-1]
a[n-1] = temp
# there is a shorter way in Python!

The Python way!

The previous code for finding the max and its location will work in ANY high-level language.
Python has some nice functions and methods to make it easier!
Let's try that.

The Python Way


To find the max of the whole list
mx = max(a)
loc = a.index(mx)
Is using index SAFE here? If it doesn't find mx in a, it will crash!
But you just got mx from the list using the max function, so it IS in the list a.

The Python Way

The swap then becomes
a[loc], a[n-1] = a[n-1], a[loc]
Python hides the temporary third variable

Find next largest element and swap (generic way)

max = 0
for i in range(1, n-1):   # note n-1, not n
    if a[max] < a[i]:
        max = i
temp = a[max]
a[max] = a[n-2]
a[n-2] = temp

Put a loop around the general code to repeat for n-1 passes

for pss in range(n, 1, -1):
    max = 0
    for i in range(1, pss):
        if a[max] <= a[i]:
            max = i
    temp = a[max]
    a[max] = a[pss-1]
    a[pss-1] = temp

The whole thing the Python way

for pss in range(n, 1, -1):   # n-1 passes
    mx = max(a[0:pss])
    loc = a.index(mx)
    a[loc], a[pss-1] = a[pss-1], a[loc]
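
The passes can be wrapped into a function so the sort is easy to reuse and test. This is a sketch of the generic version; the function name `selection_sort` is an assumption, and the location variable is called `loc` rather than `max` so that Python's built-in max function stays usable in the same program.

```python
def selection_sort(a):
    """Sort list a in place, ascending, by repeatedly moving the largest
    remaining value to the end of the unsorted part."""
    for pss in range(len(a), 1, -1):        # unsorted part is a[0:pss]
        loc = 0                             # location of largest so far
        for i in range(1, pss):
            if a[loc] < a[i]:
                loc = i
        a[loc], a[pss - 1] = a[pss - 1], a[loc]

data = [7, 2, 9, 2, 5]
selection_sort(data)    # data is now [2, 2, 5, 7, 9]
```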

2-dimensional arrays

Data sometimes has more structure to it than just "a list"
It has rows and columns
You use two subscripts to locate an item
The first subscript is called the row, the second is called the column

2-dimensional arrays syntax

a = [[0]*5 for i in range(4)]   # 5 columns, 4 rows
Twenty elements, numbered from [0][0] to [3][4]
a = [[0]*COLS for i in range(ROWS)]
Which has ROWS rows and COLS columns in each row (use of variables to make it easy to change the size of the array without having to edit every line of the program)
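
The loop inside the brackets matters. The sketch below contrasts it with the tempting shortcut of repeating one row with *, which produces several references to the same row list. The names `good` and `bad` are only for illustration.

```python
ROWS = 4
COLS = 5

good = [[0] * COLS for i in range(ROWS)]   # 4 separate row lists
good[0][0] = 1                             # changes row 0 only

bad = [[0] * COLS] * ROWS                  # the SAME row list, 4 times
bad[0][0] = 1                              # shows up in every row
```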

EXAMPLE -- Array for monthly high temperatures for all 50 states
NUM_STATES = 50
NUM_MONTHS = 12
stateHighs = [[0]*NUM_MONTHS for i in range(NUM_STATES)]

        [0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10] [11]
[0]
[1]
[2]     66   64   72   78   85   90   99   105  98   90   88   80
 .
 .
[48]
[49]

stateHighs[2][7] (row 2, col 7) might be Arizona's high for August

Processing a 2-d array by rows

finding the total for the first row
total = 0
for i in range(NUM_MONTHS):
    total = total + a[0][i]
finding the total for the second row
total = 0
for i in range(NUM_MONTHS):
    total = total + a[1][i]

Processing a 2-d array by rows

total for ALL elements by adding first row, then second row, etc.
total = 0
for i in range(NUM_STATES):
    for j in range(NUM_MONTHS):
        total = total + a[i][j]

Processing a 2-d array by columns

total for ALL elements by adding first column, second column, etc.
total = 0
for j in range(NUM_MONTHS):
    for i in range(NUM_STATES):
        total = total + a[i][j]
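
Row-wise and column-wise processing can also keep a separate total for each row and each column. The sketch below uses a small made-up table; `row_totals` and `col_totals` are names chosen for illustration.

```python
# Hypothetical small table: 2 rows, 3 columns.
a = [[1, 2, 3],
     [4, 5, 6]]
NUM_ROWS = len(a)
NUM_COLS = len(a[0])

row_totals = [0] * NUM_ROWS     # one total per row
col_totals = [0] * NUM_COLS     # one total per column
for i in range(NUM_ROWS):
    for j in range(NUM_COLS):
        row_totals[i] += a[i][j]
        col_totals[j] += a[i][j]
```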

Finding the average high temperature for Arizona

total = 0
for month in range(NUM_MONTHS):
    total = total + stateHighs[2][month]
average = round(total / NUM_MONTHS)
# average is 85

Passing an array as an argument

Arrays (lists) are passed by reference, so they CAN be changed permanently by the function
Definition: def fun1(arr):
Call the function as: x = fun1(myarr)
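
A small sketch of what "changed permanently" means in practice. Changing elements through the parameter affects the caller's list, while assigning a brand-new list to the parameter name does not. Both function names are made up for this example.

```python
def double_all(arr):
    for i in range(len(arr)):
        arr[i] = arr[i] * 2     # changes the caller's list

def rebind(arr):
    arr = [0, 0, 0]             # only changes the local name

myarr = [1, 2, 3]
double_all(myarr)   # myarr is now [2, 4, 6]
rebind(myarr)       # myarr is still [2, 4, 6]
```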

Arrays versus Files

Arrays are usually smaller than files
Arrays are faster than files
Arrays are temporary, in RAM; files are permanent, on secondary storage
Arrays can do random or sequential access; the files we have seen are only sequential

Using Multidimensional Arrays

Example of three-dimensional array

NUM_DEPTS = 5     # mens, womens, childrens, electronics, furniture
NUM_MONTHS = 12
NUM_STORES = 3    # White Marsh, Owings Mills, Towson
monthlySales = [[[0]*NUM_MONTHS for i in range(NUM_DEPTS)]
                for j in range(NUM_STORES)]

[Figure: 3 STORES sheets, each with 5 DEPTS rows and 12 MONTHS columns]

monthlySales[0][3][7]
sales for electronics in August at White Marsh (subscripts are store, dept, month)

Example of filling a 3-d array

def main():
    NUM_DEPTS = 5     # mens, womens, childrens, electronics, furniture
    NUM_MONTHS = 12
    NUM_STORES = 3    # White Marsh, Owings Mills, Towson
    monthlySales = [[[0]*NUM_MONTHS for i in range(NUM_DEPTS)]
                    for j in range(NUM_STORES)]
    storeNames = ["White Marsh", "Owings Mills", "Towson"]
    deptNames = ["mens", "womens", "childrens", "electronics", "furniture"]
    for store in range(NUM_STORES):
        print(storeNames[store], end=" ")
        for dept in range(NUM_DEPTS):
            print(deptNames[dept], end=" ")
            for month in range(NUM_MONTHS):
                print("for month number ", month+1)
                monthlySales[store][dept][month] = float(input("Enter the sales "))
            print()
        print()
    print(monthlySales)

Find the average of monthlySales

total = 0
for m in range(NUM_MONTHS):
    for d in range(NUM_DEPTS):
        for s in range(NUM_STORES):
            total += monthlySales[s][d][m]
average = total / (NUM_MONTHS * NUM_DEPTS * NUM_STORES)

Problem: student data in a file

The data is laid out as
Name, section, gpa
    John Smith, 15, 3.2
    Ralph Johnson, 12, 3.9
    Bob Brown, 9, 2.5
    Etc.

Read in the data

inf = open("students", "r")
studs = []
for line in inf:
    data = line.strip().split(",")
    studs.append(data)
inf.close()
# studs looks like [["John Smith", " 15", " 3.2"],
# ["Ralph Johnson", " 12", " 3.9"], ["Bob Brown", " 9", " 2.5"]]
# note that every field is still a string

Find the student with highest GPA

max = 0
for j in range(1, len(studs)):
    if float(studs[max][2]) < float(studs[j][2]):   # compare as numbers, not strings
        max = j
# max is now location of highest gpa
# studs[max][0] is the name of the student
# studs[max][1] is the student's section
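
Putting the reading and searching steps together, here is a self-contained sketch that works on a list of text lines instead of an open file, so it can run without a data file. The list `lines` is a stand-in for the file contents, and each field is converted to its proper type as it is stored.

```python
# Hypothetical stand-in for the lines of the students file.
lines = ["John Smith, 15, 3.2", "Ralph Johnson, 12, 3.9", "Bob Brown, 9, 2.5"]

studs = []
for line in lines:
    name, section, gpa = line.split(",")
    # strip spaces from the name, convert the numbers from text
    studs.append([name.strip(), int(section), float(gpa)])

best = 0                        # location of highest gpa seen so far
for j in range(1, len(studs)):
    if studs[best][2] < studs[j][2]:
        best = j

top_name = studs[best][0]
top_section = studs[best][1]
```

Converting while reading means the later comparison works on numbers, so a GPA such as 10.0 would not be ranked below 3.9 the way the text "10.0" would be.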
