Advanced Level Mathematics
Statistics 1
International Examinations

The publishers would like to acknowledge the contributions of the following people to this series of books: Tim Cross, Richard Davies, Maurice Godfrey, Chris Hockley, Lawrence Jarrett, David A. Lee, Jean Matthews, Norman Morris, Charles Parker, Geoff Staley, Rex Stephens, Peter Thomas and Owen Toller.

Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Dubai, Tokyo, Mexico City

Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK
www.cambridge.org
Information on this title: www.cambridge.org/9780521530132

© Cambridge University Press 2002

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2002
10th printing 2010
Printed in the United Kingdom at the University Press, Cambridge
A catalogue record for this publication is available from the British Library
ISBN 978-0-521-53013-2 Paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate. Information regarding prices, travel timetables and other factual information given in this work is correct at the time of first printing but Cambridge University Press does not guarantee the accuracy of such information thereafter.

Cover image: © Tony Stone Images / Art Wolfe

Contents
Introduction
1 Representation of data
2 Measures of location
3 Measures of spread
4 Probability
5 Permutations and combinations
6 Probability distributions
7 The binomial distribution
8 Expectation and variance of a random variable
9 The normal distribution
Revision exercise
Practice examinations
Normal probability function table
Answers
Index

Introduction

Cambridge International Examinations (CIE) Advanced Level Mathematics has been written especially for the CIE international examinations. There is one book corresponding to each syllabus unit, except that units P2 and P3 are contained in a single book. This book is the first Probability and Statistics unit, S1.

The syllabus content is arranged by chapters which are ordered so as to provide a viable teaching course. A few sections include important results that are difficult to prove or outside the syllabus. These sections are marked with an asterisk (*) in the section heading, and there is usually a sentence early on explaining precisely what it is that the student needs to know.

Some paragraphs within the text appear in this type style. These paragraphs are usually outside the main stream of the mathematical argument, but may help to give insight, or suggest extra work or different approaches.

Graphic calculators are not permitted in the examination, but they can be useful aids in learning mathematics. In the book the authors have noted where access to graphic calculators would be especially helpful but have not assumed that they are available to all students. The authors have assumed that students have access to calculators with built-in statistical functions.

Numerical work is presented in a form intended to discourage premature approximation. In ongoing calculations inexact numbers appear in decimal form like 3.456..., signifying that the number is held in a calculator to more places than are given. Numbers are not rounded at this stage; the full display could be either 3.456 123 or 3.456 789. Final answers are then stated with some indication that they are approximate, for example '1.23, correct to 3 significant figures'.

Most chapters contain Practical activities.
These can be used either as an introduction to a topic, or, later on, to reinforce the theory. There are also plenty of exercises, and each chapter ends with a Miscellaneous exercise which includes some questions of examination standard. There is a Revision exercise, and two Practice examination papers. In some exercises a few of the later questions may go beyond the likely requirements of the examination, either in difficulty or in length or both. Some questions are marked with an asterisk, which indicates that they require knowledge of results outside the syllabus.

Cambridge University Press would like to thank OCR (Oxford, Cambridge and RSA Examinations), part of the University of Cambridge Local Examinations Syndicate (UCLES) group, for permission to use past examination questions set in the United Kingdom.

The authors thank OCR and Cambridge University Press for their help in producing this book. However, the responsibility for the text, and for any errors, remains with the authors.

1 Representation of data

This chapter looks at ways of displaying numerical data using diagrams. When you have completed it, you should
• know the difference between quantitative and qualitative data
• be able to make comparisons between sets of data by using diagrams
• be able to construct a stem-and-leaf diagram from raw data
• be able to draw a histogram from a grouped frequency table, and know that the area of each block is proportional to the frequency in that class
• be able to construct a cumulative frequency diagram from a frequency distribution table.

1.1 Introduction

The collection, organisation and analysis of numerical information are all part of the subject called statistics. Pieces of numerical and other information are called data. A more helpful definition of 'data' is 'a series of facts from which conclusions may be drawn'.

In order to collect data you need to observe or to measure some property. This property is called a variable.
The data which follow were taken from the internet, which has many sites containing data sources. In this example a variety of measurements was taken on packets of breakfast cereal in the United States. Each column of Table 1.1 represents a variable. So, for example, 'type', 'sodium' and 'shelf' are all variables. (The amounts for variables 3, 4, 5 and 6 are per serving.)

Datafile name: Cereals
Description: Data which refer to various brands of breakfast cereal in a particular store. A value of -1 for nutrients indicates a missing observation.
Number of cases: 77
Variable names:
1 name: name of cereal
2 type: cold (C) or hot (H)
3 fat: grams of fat
4 sodium: milligrams of sodium
5 carbo: grams of complex carbohydrates
6 sugar: grams of sugars
7 shelf: display shelf (1, 2 or 3, counting from the floor)
8 mass: mass in grams of one serving
9 rating: a measure of the nutritional value of the cereal

[Table 1.1. Datafile 'Cereals': one row for each of the 77 cereals, with columns name, type, fat, sodium, carbo, sugar, shelf, mass and rating.]

The variable 'type' has two different letter codes, H and C.
The variable 'sodium' takes values such as 130, 15, 260 and 140.
The variable 'shelf' takes values 1, 2 or 3.

You can see that there are different types of variable. The variable 'type' is non-numerical: such variables are usually called qualitative. The other two variables are called quantitative, because the values they take are numerical.

Quantitative data can be subdivided into two categories.
For example, 'sodium', the mass of sodium in milligrams, which can take any value in a particular range, is called a continuous variable. 'Display shelf', on the other hand, is a discrete variable: it can only take the integer values 1, 2 or 3, and there are clear steps between its possible values. It would not be sensible, for example, to refer to display shelf number 2.43.

In summary:
A variable is qualitative if it is not possible for it to take a numerical value.
A variable is quantitative if it can take a numerical value.
A quantitative variable which can take any value in a given range is continuous.
A quantitative variable which has clear steps between its possible values is discrete.

1.2 Stem-and-leaf diagrams

The datafile on cereals has one column which gives a rating of the cereals on a scale of 0-100. The ratings are given below.

[...]

These values are what statisticians call raw data. Raw data are the values collected in a survey or experiment before they are categorised or arranged in any way. Usually raw data appear in the form of a list. It is very difficult to draw any conclusions from these raw data just by looking at the numbers. One way of arranging the values that gives some information about the patterns within the data is a stem-and-leaf diagram.

In this case the 'stems' are the tens digits and the 'leaves' are the units digits. You write the stems to the left of a vertical line and the leaves to the right of the line. So, for example, you would write the first value, 68, as 6|8. The leaves belonging to one stem are then written in the same row.

The stem-and-leaf diagram for the ratings data is shown in Fig. 1.2. The key shows what the stems and leaves mean.
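The construction described above (tens digit as stem, units digit as leaf, one row per stem) can be sketched in Python. This is only an illustration: the function name `stem_and_leaf` and the short list of ratings are assumptions made for the example, the list is not the full datafile, and the sketch handles only whole numbers from 0 to 99.

```python
from collections import defaultdict


def stem_and_leaf(values, ordered=False):
    """Return the rows of a stem-and-leaf diagram for integers from 0 to 99.

    Each row has the form 'stem|leaves (count)', as in the key 6|8 = 68.
    """
    leaves = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)  # tens digit is the stem, units digit the leaf
        leaves[stem].append(leaf)

    rows = []
    # Include every stem between the smallest and largest, even if it has no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        row = leaves.get(stem, [])
        if ordered:
            row = sorted(row)  # an ordered diagram lists the leaves in numerical order
        digits = "".join(str(leaf) for leaf in row)
        rows.append(f"{stem}|{digits} ({len(row)})")
    return rows


# Illustrative ratings only (hypothetical sample, not the datafile).
sample_ratings = [68, 34, 59, 94, 34, 30, 33, 37, 49, 53, 18, 51]
for line in stem_and_leaf(sample_ratings, ordered=True):
    print(line)
```

Passing `ordered=False` keeps the leaves in the order in which the values were read, which corresponds to the unordered diagram.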
[Fig. 1.2. Stem-and-leaf diagram of cereal ratings. Key: 6|8 means 68.]

The numbers in the brackets tell you how many leaves belong to each stem. The digits in each stem form a horizontal 'block', similar to a bar on a bar chart, which gives a visual impression of the distribution. In fact, if you rotate a stem-and-leaf diagram anticlockwise through 90°, it looks like a bar chart.

It is also common to rewrite the leaves in numerical order; the stem-and-leaf diagram formed in this way is called an ordered stem-and-leaf diagram. The ordered stem-and-leaf diagram for the cereal ratings is shown in Fig. 1.3.

[Fig. 1.3. Ordered stem-and-leaf diagram of cereal ratings. Key: 6|8 means 68.]

So far the stem-and-leaf diagrams discussed have consisted of data values which are integers between 0 and 100. With suitable adjustments, you can use stem-and-leaf [...]

[...] Draw a cumulative frequency graph. Use it to find the number of kilometres below which
(a) one-quarter,
(b) three-quarters
of the employees travel to work.

6 The lengths of 250 electronic components were measured very accurately. The results are summarised in the following table.

Length (cm)   <7.00   7.00-7.05   7.05-7.10   7.10-7.15   7.15-7.20   >7.20
Frequency     10      [...]       [...]       65          30          3

Given that 10% of the components are scrapped because they are too short and 8% are scrapped because they are too long, use a cumulative frequency graph to estimate limits for the length of an acceptable component.

7 As part of a health study the blood glucose levels of 150 students were measured. The results, in mmol l^-1 correct to 1 decimal place, are summarised in the following table.

Glucose level   <3.0   3.0-3.9   4.0-4.9   5.0-5.9   6.0-6.9   >=7.0
Frequency       [...]

Draw a cumulative frequency graph and use it to find the percentage of students with blood glucose level greater than 5.2.
The number of students with blood glucose level greater than 5.2 is equal to the number with blood glucose level less than a. Find a.

Practical activities

1 One-sidedness
Investigate whether reaction times are different when a person uses only information from one 'side' of their body.
(a) Choose a subject and instruct them to close their left eye. Against a wall hold a ruler pointing vertically downwards with the 0 cm mark at the bottom and ask the subject to place the index finger of their right hand aligned with this 0 cm mark. Explain that you will let go of the ruler without warning, and that the subject must try to pin it against the wall using the index finger of their right hand. Measure and record the distance dropped.
(b) Repeat this for, say, 20 subjects.
(c) Take a further 30 subjects and carry out the experiment again for each of these subjects, but for this second set of 30 make them close their right eye and use their left hand.
(d) Draw a stem-and-leaf diagram for both sets of data and compare the distributions.
(e) Draw two histograms and use these to compare the distributions.
(f) Do subjects seem to react more quickly using their right side than they do using their left side? Are subjects more erratic when using their left side? How does the fact that some people are naturally left-handed affect the results? Would it be more appropriate to investigate 'dominant' side versus 'non-dominant' side rather than left versus right?

2 High jump
Find how high people can jump.
(a) Pick a subject and ask them to stand against a wall and stretch their arm as far up the wall as possible. Make a mark at this point. Then ask the subject, keeping their arm raised, to jump as high as they can. Make a second mark at this highest point. Measure the distance between the two marks. This is a measure of how high they jumped.
(b) Take two samples of students of different ages, say 11 years old and 16 years old, and plot a histogram of the results for each group.
(c) Do the older students jump higher than the younger students?

3 Memory
(a) Place twenty different small objects on a tray, for example a coin, a pebble and so on. Show the tray to a sample of students for one minute and then cover the tray. Give the students five minutes to write down as many objects as they can remember.
(b) Type a list of the same objects on a sheet of paper. Allow each of a different sample of students to study the sheet of paper for one minute and then remove the sheet. Give the students five minutes to write down as many objects as they can remember.
(c) Draw a diagram which enables you to compare the distribution of the number of objects remembered in each of the two situations. Is it easier to remember objects for one situation rather than the other? Does one situation lead to a greater variation in the numbers of objects remembered?

Miscellaneous exercise 1

1 The following gives the scores of a cricketer in 40 consecutive innings.

[...]

Illustrate the data on a stem-and-leaf diagram. State an advantage that the diagram has over the raw data. What information is given by the data that does not appear in the diagram?

2 The service time, t seconds, was recorded for 120 shoppers at the cash register in a shop. The results are summarised in the following grouped frequency table.

t           <30   30-60   60-120   120-180   180-240   240-300   300-360   >360
Frequency   [...]

Draw a histogram of the data. Find the greatest service time exceeded by 30 shoppers.

3 At the start of a new school year, the heights of the 100 new pupils entering the school are measured. The results are summarised in the following table. The 10 pupils in the class written '110-' have heights not less than 110 cm but less than 120 cm.

Height (cm)        100-   110-   120-   130-   140-   150-   160-
Number of pupils   [...]

Use a graph to estimate the height of the tallest pupil of the 18 shortest pupils.
4 The following ordered set of numbers represents the salinity of 30 specimens of water taken from a stretch of sea, near the mouth of a river.

 4.2   4.5   5.8   6.3   7.2   7.9   8.2   8.5   9.3   9.7
10.2  10.3  10.4  10.7  11.4  11.6  11.6  11.7  11.8  11.8
11.9  12.4  12.4  12.5  12.6  12.9  12.9  13.1  13.5  14.3

(a) Form a grouped frequency table for the data using six equal classes.
(b) Draw a cumulative frequency graph. Estimate the 12th highest salinity level. Calculate the percentage error in this estimate.

5 The following are ignition times in seconds, correct to the nearest 0.1, of samples of 80 flammable materials. They are arranged in numerical order by rows.

1.2   1.4   1.4   1.5   1.5   1.6   1.7   1.8   1.8   1.9
2.1   2.2   2.3   2.5   2.5   2.5   2.5   2.6   2.7   2.8
3.1   3.2   3.5   3.6   3.7   3.8   3.8   3.9   3.9   4.0
4.1   4.2   4.3   4.5   4.5   4.6   4.7   4.7   4.8   4.9
5.1   5.1   5.1   5.2   5.2   5.3   5.4   5.5   5.6   5.8
5.9   5.9   6.0   6.3   6.4   6.4   6.4   6.4   6.7   6.8
6.8   6.9   7.3   7.4   7.4   7.6   7.9   8.0   8.6   8.8
[...]  9.2   9.4   9.6   9.7   9.8  10.6  11.2  11.8  12.8

Group the data into eight equal classes, starting with 1.0-2.4 and 2.5-3.9, and form a grouped frequency table. Draw a histogram. State what it indicates about the ignition times.

6 A company employs 2410 people whose annual salaries are summarised as follows.

Salary (in $1000s)   <10   10-20   20-30   30-40   40-50   50-60   60-80   80-100   >100
Number of staff      [...]

(a) Draw a cumulative frequency graph for the grouped data.
(b) Estimate the percentage of staff with salaries between $26 000 and $52 000.
(c) If you were asked to draw a histogram of the data, what problem would arise and how would you overcome it?

7 Construct a grouped frequency table for the following data.

19.12  21.43  20.57  16.97  14.82  19.61  19.35
20.02  12.76  [...]  21.38  20.27  20.21
16.83  21.08  [...]  20.69  15.61  19.41  21.25
19.72  [...]  [...]  20.52  17.30

(a) Draw a histogram of the data.
(b) Draw a cumulative frequency graph.
(c) Draw a stem-and-leaf diagram using leaves in hundredths, separated by commas.

8 Certain insects can cause small growths, called 'galls', on the leaves of trees. The numbers of galls found on 60 leaves of a tree are given below.
[...]

(a) Put the data into a grouped frequency table with classes 0-9, 10-19, ..., 70-99.
(b) Draw a histogram of the data.
(c) Use a cumulative frequency graph to estimate the number of leaves with fewer than 34 galls.
(d) State an assumption required for your estimate in part (c), and briefly discuss its justification in this case.

9 The traffic noise levels on two city streets were measured one weekday, between 5.30 a.m. and 8.30 p.m. There were 92 measurements on each street, made at equal time intervals, and the results are summarised in the following grouped frequency table.

Noise level (dB)     <65   65-67   67-69   69-71   71-73   73-75   75-77   77-79   >79
Street 1 frequency   4     11      [...]   [...]   [...]   9       5       4       2
Street 2 frequency   2     3       7       [...]   27      [...]   [...]   8       7

(a) On the same axes, draw cumulative frequency graphs for the two streets.
(b) Use them to estimate the highest noise levels exceeded on 50 occasions in each street.
(c) Write a brief comparison of the noise levels in the two streets.

10 The time intervals (in seconds) between telephone calls received at an office were monitored on a particular day. The first 51 calls after 9.00 a.m. gave the following 50 intervals.

[...]

Illustrate the data with a stem-and-leaf diagram, and with a histogram, using six equal classes.

11 (For this question use the definition of frequency density on page 12.)
A histogram is drawn to represent a set of data.
(a) The first two classes have boundaries 2.0 and 2.2, and 2.2 and 2.5, with frequencies 5 and 12. The height of the first bar drawn is 2.5 cm. What is the height of the second bar?
(b) The class boundaries of the third bar are 2.5 and 2.7. What is the corresponding frequency if the bar drawn has height 3.5 cm?
(c) The fourth bar has a height of 3 cm and the corresponding frequency is 9. The lower class boundary for this bar is 2.7 cm.
Find the upper class boundary.

2 Measures of location

This chapter describes three different measures of location and their method of calculation. When you have completed it, you should
• know what the median is, and be able to calculate it
• know what the mean is and be able to calculate it efficiently
• know what the mode and the modal class are, and be able to find them
• be able to choose which is the appropriate measure to use in a given situation.

2.1 Introduction

Suppose that you wanted to know the typical playing time for a compact disc (CD). You could start by taking a few CDs and finding out the playing time for each one. You might obtain a list of values such as

49, 56, 55, 68, 61, 57, 61, 52, 63,

where the values have been given in minutes, to the nearest whole minute. You can see that the values are located roughly in the region of 1 hour (rather than, say, 2 hours or 10 minutes). It would be useful to have a single value which gave some idea of this location. A single value would condense the information contained in the data set into a 'typical' value, and would allow you to compare this data set with another one. Such a value is called a measure of location, or a measure of central tendency, or, in everyday language, an average.

2.2 The median

You can get a clearer picture of the location of the playing times by arranging them in ascending order of size:

49, 52, 55, 56, 57, 61, 61, 63, 68.

A simple measure of location is the 'middle' value, the value that has equal numbers of values above and below it. In this case there are nine values and the middle one is 57. This value is called the median.

If there are an even number of values, then there is no single 'middle' value. In the case of the six values

47, 49, 59, 62, 65, 68,

which are the playing times of another six CDs (in order), the median is taken to be halfway between the third and fourth values, which is $\tfrac{1}{2}(59+62)$, or 60.5.
Again, there are equal numbers of values below and above this value: in this case, three.

To find the median of a data set of $n$ values, arrange the values in order of increasing size.
If $n$ is odd, the median is the $\tfrac{1}{2}(n+1)$th value.
If $n$ is even, the median is halfway between the $\tfrac{1}{2}n$th value and the following value.

A convenient way of sorting the values into order of increasing size is to draw an ordered stem-and-leaf diagram. Fig. 2.1 is a stem-and-leaf diagram of the heights of the female students from the 'Brain size' datafile in Table 1.5.

15 | 7 9                          (2)
16 | 0 0 4 4 4 4 5 8 8 8 9 9 9    (13)
17 | 3 3 4 5 9                    (5)
Key: 17|3 = 173 cm

Fig. 2.1. Stem-and-leaf diagram of the heights of female students.

There are 20 students and so the median is calculated from the 10th and 11th values. These two values are the eighth and ninth leaves on stem 16 in Fig. 2.1. The median is $\tfrac{1}{2}(168+168)$, or 168 cm.

2.3 Finding the median from a frequency table

Data sets are often much larger than the ones in the previous section and the values will often have been organised in some way, maybe in a frequency table. As an example, Table 2.2 gives the number of brothers and sisters of the children at a school.

Number of brothers and sisters   Frequency   Cumulative frequency
0                                36          36
1                                94          130
2                                48          178
3                                15          193
4                                7           200
5                                3           203
6                                1           204

Table 2.2. Frequency distribution of the number of brothers and sisters of the children at a school.

To find the median using the method in the previous section you would write out a list of all the individual values, starting with 36 '0's, then 94 '1's and so on, and find the $\tfrac{1}{2}(204)$th, or 102nd, value and the 103rd value. A much easier method is to add a column of cumulative frequencies, as in Table 2.2. From this column you can see that when you have come to the end of the '0's you have not yet reached the 102nd value but, by the end of the '1's, you have reached the 130th value. This means that the 102nd and 103rd values are both 1, so the median is also 1.

In the above example the data have not been grouped, so it is possible to count to the median. Large data sets for continuous variables, however, are nearly always grouped, and the individual values are lost.
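The two procedures described above, the rule for the median of a list and counting through cumulative frequencies in an ungrouped frequency table, can be sketched in Python as follows. The function names `median` and `median_from_frequency_table` are illustrative choices, not part of the text, and the sketch assumes the table's values are listed in increasing order.

```python
from itertools import accumulate


def median(values):
    """Median of a list: the middle value, or halfway between the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[n // 2]  # the (n + 1)/2 th value, counting from 1
    # n even: halfway between the (n/2)th value and the following value
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


def median_from_frequency_table(values, frequencies):
    """Median of ungrouped data given as (value, frequency) columns in increasing order."""
    cumulative = list(accumulate(frequencies))
    n = cumulative[-1]

    def kth(k):
        # The k-th value (counting from 1) lies in the first row whose
        # cumulative frequency reaches k.
        for value, cum_freq in zip(values, cumulative):
            if k <= cum_freq:
                return value

    if n % 2 == 1:
        return kth((n + 1) // 2)
    return (kth(n // 2) + kth(n // 2 + 1)) / 2


print(median([49, 56, 55, 68, 61, 57, 61, 52, 63]))   # nine CD playing times
print(median([47, 49, 59, 62, 65, 68]))               # six CD playing times
print(median_from_frequency_table(range(7), [36, 94, 48, 15, 7, 3, 1]))  # Table 2.2
```

For the data in Table 2.2 the 102nd and 103rd values both fall in the row for 1, because the cumulative frequency passes 102 and 103 only once it reaches 130.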
Because the individual values of grouped data are lost, you cannot find the median exactly and you have to estimate it. Table 2.3 gives the grouped frequency distribution for the playing time of a large selection of CDs.

Playing time (min)   Class boundaries
40-44                $39.5 \le x < 44.5$
45-49                $44.5 \le x < 49.5$
50-54                $49.5 \le x <$ [...]
[...]

[...] evaluate [...]

$\sum x_i = x_1 + x_2 + x_3 + x_4 = 1 + 3 + 4 + 5 = 13$.

$\sum x_i^2 = 1^2 + 3^2 + 4^2 + 5^2 = 51$.

$\sum (x_i - \bar{x}) = (1 - 3.25) + (3 - 3.25) + (4 - 3.25) + (5 - 3.25) = -2.25 - 0.25 + 0.75 + 1.75 = 0$.

The sum $\sum (x_i - \bar{x})$ is always equal to 0. Try to prove this result in the general case.

$\sum (x_i - \bar{x})^2 = (-2.25)^2 + (-0.25)^2 + 0.75^2 + 1.75^2 = 8.75$.

2.6 Calculating the mean from a frequency table

Table 2.5 contains a copy of the data in Table 2.2, which was the frequency distribution of the number of brothers and sisters of the children at a school. Of the 204 values, 36 are '0's, 94 are '1's, 48 are '2's and so on. Their sum will be

$(0 \times 36) + (1 \times 94) + (2 \times 48) + (3 \times 15) + (4 \times 7) + (5 \times 3) + (6 \times 1) = 284$.

You can include this calculation in the table by adding a third column in which each value of the variable, $x_i$, is multiplied by its frequency, $f_i$.

Number of brothers and sisters, $x_i$   Frequency, $f_i$   $x_i f_i$
0                                       36                 0
1                                       94                 94
2                                       48                 96
3                                       15                 45
4                                       7                  28
5                                       3                  15
6                                       1                  6
Totals                                  204                284

Table 2.5. Calculation of the mean for the data in Table 2.2.

The mean is equal to $\frac{284}{204} = 1.39$, correct to 3 significant figures.

Although the number of brothers and sisters of each child must be a whole number, the mean of the data values need not be a whole number.

In this example the answer is a recurring decimal, and so the answer has been rounded to 3 significant figures. This degree of accuracy is suitable for the answers to most statistical calculations. However, it is important to keep more significant figures when values are carried forward for use in further calculations.

The calculation of the mean can be expressed in $\Sigma$-notation as follows:

The mean, $\bar{x}$, of a data set in which the variable takes the value $x_1$ with frequency $f_1$, $x_2$ with frequency $f_2$ and so on is given by

$\bar{x} = \dfrac{x_1 f_1 + x_2 f_2 + \dots + x_n f_n}{f_1 + f_2 + \dots + f_n} = \dfrac{\sum x_i f_i}{\sum f_i}$.

If the data in a frequency table are grouped, you need a single value to represent each [...]

[...] $Q_3 - Q_2 > Q_2 - Q_1$, as in Fig. 3.11, the distribution is said to have positive skew, and the line representing the median would be [...]

[Fig. 3.11. Box-and-whisker plot for a set of data in which $Q_3 - Q_2 > Q_2 - Q_1$.]

If $Q_3 - Q_2$ [...] $Q_3 - Q_2 > Q_2 - Q_1$, indicating positive skew, but for the left whisker to be longer than the right whisker, which would tend to suggest negative skew. In such cases you must make a judgement about which method of assessing skewness you think is the more important. Fortunately data of this sort do not occur commonly.

3.6 Outliers

The quartiles of a data set can also be used to assess whether the data set has any outliers. Outliers are unusual or 'freak' values which differ greatly in magnitude from the majority of the data values. But just how large or small does a value have to be to be an outlier? There is no simple answer to this question, but one 'rule of thumb' developed by the statistician John Tukey is to use 'fences'.

The upper fence is at a value 1.5 times the interquartile range above the upper quartile:

upper fence $= Q_3 + 1.5(Q_3 - Q_1)$.

The lower fence is at a value 1.5 times the interquartile range below the lower quartile:

lower fence $= Q_1 - 1.5(Q_3 - Q_1)$.

John Tukey then said that any value which is bigger than the upper fence or smaller than the lower fence is considered to be an outlier.

For the data of Example 3.4.1, $Q_1 = 2$ and $Q_3 = 8$, so

upper fence $= Q_3 + 1.5(Q_3 - Q_1) = 8 + 1.5 \times (8 - 2) = 17$,
lower fence $= Q_1 - 1.5(Q_3 - Q_1) = 2 - 1.5 \times (8 - 2) = -7$.

In this case there is no value above 17 or below -7, so this data set does not contain any values which could be said to be outliers.

Exercise 3A

1 Find the range and interquartile range of each of the following data sets.
(a) [...]
(b) 76 48 12 69 48 72 81 103 48 67

2 Find the interquartile range of the leaf lengths displayed in Exercise 1A Question 1.

3 The number of times each week that a factory machine broke down was noted over a period of 50 consecutive weeks. The results are given in the following table.

Number of breakdowns   0   1   2   3   4   5   6
Number of weeks        [...]

Find the interquartile range of the number of breakdowns in a week.

4 For the data in Miscellaneous exercise 1 Question 6, find the lower and upper quartiles of the annual salaries.

5 For the data in Miscellaneous exercise 1 Question 9, find the median and interquartile range for the traffic noise levels in the two streets. Use the statistics to compare the noise levels in the two streets.

6 The audience size in a theatre performing a long-running detective play was monitored over a period of one year. The sizes for Monday and Wednesday nights are summarised in the following table.

Audience size          50-99   100-199   200-299   300-399   400-499   500-599
Number of Mondays      [...]
Number of Wednesdays   [...]

Compare the audience sizes on Mondays and Wednesdays.

7 The following stem-and-leaf diagrams refer to the datafile 'Cereals' in Chapter 1. They are the ratings of the cereals with fat content 0 and with fat content 1.

Fat content 0                    Fat content 1
2 | 9                  (1)       2 | 2 3 4 7 8 8 9               (7)
3 | 1 3 5 6            (4)       3 | 0 1 1 2 6 6 6 7 8 9 9 9 9   (13)
4 | 1 1 1 2 2 4 6 7    (8)       4 | 0 7 9                       (3)
5 | 3 3 3 5 8 9        (6)       5 | 0 0 2 2 5 9                 (6)
6 | 0 1 3 5 8          (5)       6 | 8                           (1)
7 | 3 4                (2)
8 |                    (0)
9 | 4                  (1)
Key: 4|7 means 47

Compare the two sets of ratings by finding the ranges, medians and quartiles.

8 Draw box-and-whisker plots for data which have the following five-number summaries, and in each case describe the shape of the data.
(a) [...]
(b) [...]
(c) [...]

9 State, giving reasons, whether box-and-whisker plots or histograms are better for comparing two distributions.

10 The following figures are the amounts spent on food by a family for 13 weeks.
$48.25  $43.70  $52.83  $49.24  $58.28  $55.47  $47.29
$51.82  $58.42  $38.73  $42.76  $50.42  $40.85

(a) Obtain a five-number summary of the data.
(b) Construct a box-and-whisker plot of the data.
(c) Describe any skewness of the data.

11 The lower and upper quartiles for a data set are 56 and 84. Decide which of the following data values would be classified as an outlier according to the criteria of Section 3.6.
(a) 140
(b) 10
(c) 100

12 For the data of Exercise 1A Question 3, construct a box-and-whisker plot.
(a) Calculate the inner and outer fences.
(b) State, giving a reason, whether there are any outliers.
(c) Comment on the shape of the distribution.

3.7 Variance and standard deviation

One of the reasons for using the interquartile range in preference to the range as a measure of spread is that it takes some account of how the interior values are spread rather than concentrating solely on the spread of the extreme values. The interquartile range, however, does not take account of the spread of all of the data values and so, in some sense, it is still an inadequate measure. An alternative measure of spread which does take into account the spread of all the values can be devised by finding how far each data value is from the mean. To do this you would calculate the quantities $x_i - \bar{x}$ for each $x_i$.

The example in Section 2.4 used the playing times, in minutes, of nine CDs.

49  56  55  68  61  57  61  52  63

The mean of these times was found to be 58 minutes. If the mean is subtracted from each of the original data values you get the following values.

-9  -2  -3  10  3  -1  3  -6  5

If you ignore the negative signs, then the resulting values give an idea of the distance of each of the original values from the mean. So these distances would be

9  2  3  10  3  1  3  6  5

The mean of these distances would be a sensible measure of spread. It would represent the mean distance from the mean.
In this case the mean distance would be

$\frac{1}{9}(9+2+3+10+3+1+3+6+5) = \frac{42}{9} = 4.67$, correct to 3 significant figures.

To represent this method with a simple formula it is necessary to use the modulus symbol $|v|$, which denotes the magnitude, or numerical value, of $v$. It is now possible to write a precise formula for the mean distance:

$\frac{1}{n}\sum |x_i - \bar{x}|$.

Unfortunately a formula involving the modulus sign is awkward to handle algebraically. The modulus sign can be avoided by squaring each of the quantities $x_i - \bar{x}$. This leads to the expression

$\frac{1}{n}\sum (x_i - \bar{x})^2$

as a measure of spread. This quantity is called the variance of the data values. It is the mean of the squared distances from the mean. So, for the data on playing times of CDs,

$\frac{1}{n}\sum (x_i - \bar{x})^2 = \frac{1}{9}(9^2+2^2+3^2+10^2+3^2+1^2+3^2+6^2+5^2) = \frac{1}{9}(81+4+9+100+9+1+9+36+25) = \frac{274}{9} = 30.44\ldots$

If the data values $x_1, x_2, \ldots, x_n$ have units associated with them, then the variance will be measured in units$^2$. In the example the data values were measured in minutes and therefore the variance would be measured in minutes$^2$. This can be avoided by taking the positive square root of the variance. The positive square root of the variance is known as the standard deviation, often shortened to 'SD', and it always has the same units as the original data values. The formula for standard deviation is

$\sqrt{\frac{1}{n}\sum (x_i - \bar{x})^2}$.

So the standard deviation of the playing times of the nine CDs is $\sqrt{30.44\ldots} = 5.52$, correct to 3 significant figures.

The calculation of the variance can be quite tedious, particularly when the mean is not a whole number. Fortunately, there is an alternative formula which is easier to use:

variance $= \frac{1}{n}(x_1^2 + x_2^2 + \ldots + x_n^2) - \bar{x}^2$.

This can be written in $\Sigma$-notation as

variance $= \frac{1}{n}\sum x_i^2 - \bar{x}^2$.

Using this alternative formula with the data on the playing times of CDs gives

variance $= \frac{1}{9}(49^2+56^2+55^2+68^2+61^2+57^2+61^2+52^2+63^2) - 58^2$
$= \frac{1}{9}(2401+3136+3025+4624+3721+3249+3721+2704+3969) - 3364$
$= \frac{1}{9} \times 30\,550 - 3364 = 3394.4\ldots - 3364 = 30.44\ldots$

This is the same value found by using the original formula. This does not, of course, prove that the two formulae are always equivalent to each other. A proof is given in Section 3.8.
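The arithmetic for the nine CD playing times can be checked with a short Python sketch. The function names below are illustrative choices rather than anything used in the text, and the variance is the "divide by $n$" version defined in this section.

```python
from math import sqrt, isclose

# Playing times, in minutes, of the nine CDs used in this section.
times = [49, 56, 55, 68, 61, 57, 61, 52, 63]


def mean(values):
    """Arithmetic mean of the data values."""
    return sum(values) / len(values)


def mean_distance(values):
    """Mean of the distances |x - mean| from the mean."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)


def variance_definition(values):
    """Mean of the squared distances from the mean."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def variance_alternative(values):
    """Mean of the squares minus the square of the mean."""
    m = mean(values)
    return sum(x * x for x in values) / len(values) - m ** 2


def standard_deviation(values):
    """Positive square root of the variance."""
    return sqrt(variance_definition(values))


if __name__ == "__main__":
    print("mean:", mean(times))
    print("mean distance:", round(mean_distance(times), 2))
    print("variance (definition):", round(variance_definition(times), 2))
    print("variance (alternative):", round(variance_alternative(times), 2))
    print("standard deviation:", round(standard_deviation(times), 2))
```

Both variance functions agree for this data set, but agreement in one example is a check rather than a proof that the two formulae are equivalent.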
The variance of a set of data values $x_1, x_2, \ldots, x_n$ whose mean is $\bar{x} = \frac{1}{n}\sum x_i$ is given by either of the two alternative formulae

variance $= \frac{1}{n}\sum (x_i - \bar{x})^2$ (3.1)

or

variance $= \frac{1}{n}\sum x_i^2 - \bar{x}^2$. (3.2)

The standard deviation is the square root of the variance.

Example 3.7.1
The 12 boys and 13 girls in a class of 25 students were given a test. The mean mark for the 12 boys was 31 and the standard deviation of the boys' marks was 6.2. The mean mark for the girls was 36 and the standard deviation of the girls' marks was 4.3. Find the mean mark and standard deviation of the marks of the whole class of 25 students.

Let $x_1, x_2, \ldots, x_{12}$ be the marks of the 12 boys in the test and let $y_1, y_2, \ldots, y_{13}$ be the marks of the 13 girls in the test. Since the boys' mean is 31, $\sum x = 12 \times 31 = 372$.

As the standard deviation of the boys' marks is 6.2, the variance is $6.2^2 = 38.44$. Therefore, using Equation 3.2,

$38.44 = \frac{\sum x^2}{12} - 31^2$,

which gives $\sum x^2 = 12 \times (38.44 + 31^2) = 11\,993.28$.

Similarly, $\sum y = 13 \times 36 = 468$, and $\sum y^2 = 13 \times (4.3^2 + 36^2) = 17\,088.37$.

The overall mean is $\frac{\sum x + \sum y}{25} = \frac{372 + 468}{25} = 33.6$.

[...]

You would then assign the probabilities [...] to the outcomes 1, 2, 3, 4, 5 and 6 respectively. These are called the relative frequencies of the outcomes, and you can use them as estimates of the probabilities.

You should realise that if you were to roll the dice another 1000 times, the results would probably not be exactly the same, but you would hope that they would not be too different. You could roll the dice more times and hope to improve the relative frequency as an approximation to the probability.

Sometimes you cannot assign a probability by using symmetry or by carrying out an experiment. For example, there is a probability that my house will be struck by lightning next year and I could insure against this happening. The insurance company will have to have a probability in mind when it calculates the premium I have to pay, but it cannot calculate it by symmetry, or carry out an experiment for a few years.
It will assign its probability using its experience of such matters and its records.

When probabilities are assigned to the outcomes in a sample space,
• each probability must lie between 0 and 1 inclusive, and
• the sum of all the probabilities assigned must be equal to 1.

Example 4.1.4
How would you assign probabilities to the following experiments or activities?
(a) Choosing a card from a standard pack of playing cards.
(b) The combined experiment of tossing a coin and rolling a dice.
(c) Tossing a drawing pin onto a table to see whether it lands point down or point up.
(d) Four international football teams, Argentina, Cameroon, Nigeria and Turkey (A, C, N and T), play a knockout tournament. Who will be the winner?

(a) The sample space would consist of the list of the 52 playing cards {AC, 2C, 3C, ..., KS} in some order. (Here A means ace, C means clubs, and so on.) Assuming that these cards are equally likely to be picked, the probability assigned to each of them is $\frac{1}{52}$.

(b) The sample space is {(H,1), (H,2), (H,3), (H,4), (H,5), (H,6), (T,1), (T,2), (T,3), (T,4), (T,5), (T,6)}, and each of the outcomes would be assigned a probability of $\frac{1}{12}$.

(c) The sample space is {point down, point up}. You would need to carry out an experiment to assign probabilities.

(d) The sample space is {A wins, C wins, N wins, T wins}. You have to assign probabilities subjectively, according to your knowledge of the teams and the game. The probabilities $p_A$, $p_C$, $p_N$ and $p_T$ must all be non-negative and satisfy $p_A + p_C + p_N + p_T = 1$.

4.2 Probabilities of events

Sometimes you may be interested, not in one particular outcome, but in two or three or more of them. For example, suppose you toss a coin twice. You might be interested in whether the result is the same both times. The list of outcomes in which you are interested is called an event and is written in curly brackets. The event that both tosses of the coin give the same result is {(H,H), (T,T)}. Events are often denoted by capital letters.
Thus if $A$ denotes this event then $A$ = {(H,H), (T,T)}. An event can be just one outcome, or a list of outcomes or even no outcomes at all.

You can find the probability of an event by looking at the sample space and adding the probabilities of the outcomes which make up the event. For example, if you were tossing a coin twice, the sample space would be {(H,H), (H,T), (T,H), (T,T)}. There are four outcomes, each equally likely, so they each have probability $\frac{1}{4}$. The event $A$ consists of the two outcomes (H,H) and (T,T), so the probability of $A$ is $\frac{1}{4} + \frac{1}{4}$, or $\frac{1}{2}$.

This is an example of a general rule.

The probability, P($A$), of an event $A$ is the sum of the probabilities of the outcomes which make up $A$.

Often a list of outcomes can be constructed in such a way that all of them are equally likely. If all the outcomes are equally likely, then the probability of any event $A$ can be found by finding the number of outcomes which make up event $A$ and dividing by the total number of outcomes. When the outcomes are not equally likely, then the probability of any event has to be found by adding the individual probabilities of all the outcomes which make up event $A$.

Example 4.2.1
A fair 20-sided dice has eight faces coloured red, ten coloured blue and two coloured green. The dice is rolled.
(a) Find the probability that the bottom face is red.
(b) Let $A$ be the event that the bottom face is not red. Find the probability of $A$.

(a) Each face has an equal probability of being the bottom face: as there are 20 faces, each of them has a probability of $\frac{1}{20}$. There are eight red faces, each with probability $\frac{1}{20}$, so P(red) $= \frac{8}{20} = \frac{2}{5}$.

(b) P($A$) = P(blue or green) = P(blue) + P(green) $= \frac{10}{20} + \frac{2}{20} = \frac{12}{20} = \frac{3}{5}$.

Example 4.2.2
The numbers 1, 2, ..., 9 are written on separate cards. The cards are shuffled and the top one is turned over. Calculate the probability that the number on this card is prime.

The sample space for this activity is {1, 2, 3, 4, 5, 6, 7, 8, 9}. As each outcome is equally likely, each has probability $\frac{1}{9}$.

Let $B$ be the event that the card turned over is prime. Then $B$ = {2, 3, 5, 7}.
P($B$) = sum of the probabilities of the outcomes in $B$ $= \frac{4}{9}$.

Example 4.2.3
A circular wheel is divided into three equal sectors, numbered 1, 2 and 3, as shown in Fig. 4.2. The wheel is spun twice. Each time, the score is the number to which the black arrow points. Calculate the probabilities of the following events:
(a) both scores are the same as each other,
(b) neither score is a 2,
(c) at least one of the scores is a 3,
(d) neither score is a 2 and both scores are the same,
(e) neither score is a 2 or both scores are the same.

Start by writing down the sample space.

{(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)}

Each outcome has probability $\frac{1}{9}$.

(a) Let $A$ be the event that both scores are the same, so $A$ = {(1,1), (2,2), (3,3)}.
P($A$) = sum of the probabilities of the outcomes in $A$ $= \frac{3}{9} = \frac{1}{3}$.

(b) Let $B$ be the event that neither score is a 2, so $B$ = {(1,1), (1,3), (3,1), (3,3)}.
P($B$) = sum of the probabilities of the outcomes in $B$ $= \frac{4}{9}$.

(c) Let $C$ be the event that at least one of the scores is a 3, so $C$ = {(1,3), (2,3), (3,1), (3,2), (3,3)}. Then P($C$) $= \frac{5}{9}$.

(d) Let $D$ be the event that neither score is a 2 and both scores are the same, so $D$ = {(1,1), (3,3)}. Then P($D$) $= \frac{2}{9}$.

(e) Let $E$ be the event that neither score is a 2 or both scores are the same, so $E$ = {(1,1), (1,3), (3,1), (3,3), (2,2)}. Then P($E$) $= \frac{5}{9}$.

Example 4.2.4
Jafar has three playing cards, two queens and a king. Tandi selects one of the cards at random, and returns it to Jafar, who shuffles the cards. Tandi then selects a second card. Tandi wins if both cards selected are kings. Find the probability that Tandi wins.

Imagine that the queens are different and call them $Q_1$ and $Q_2$, and call the king $K$. Then the sample space is

{$(Q_1,Q_1)$, $(Q_1,Q_2)$, $(Q_1,K)$, $(Q_2,Q_1)$, $(Q_2,Q_2)$, $(Q_2,K)$, $(K,Q_1)$, $(K,Q_2)$, $(K,K)$}.

Each outcome has probability $\frac{1}{9}$.

Let $T$ be the event that Tandi wins. Then $T$ = {$(K,K)$} and

P($T$) = sum of the probabilities of the outcomes in $T$
$= \frac{1}{9}$.

The probability that Tandi wins a prize is $\frac{1}{9}$.

Although the event that Tandi wins is just a single outcome, it is still listed in curly brackets.

Example 4.2.5
A dice with six faces has been made from brass and aluminium, and is not fair. The probability of a 6 is $\frac{1}{4}$, the probabilities of 2, 3, 4 and 5 are each $\frac{1}{6}$, and the probability of 1 is $\frac{1}{12}$. The dice is rolled. Find the probability of rolling (a) 1 or 6, (b) an even number.

(a) P(1 or 6) = P(1) + P(6) $= \frac{1}{12} + \frac{1}{4} = \frac{1}{3}$.

(b) P(an even number) = P(2 or 4 or 6) = P(2) + P(4) + P(6) $= \frac{1}{6} + \frac{1}{6} + \frac{1}{4} = \frac{7}{12}$.

Sometimes it is worth using a different approach to calculating the probability of an event.

Example 4.2.6
You draw two cards from an ordinary pack. Find the probability that they are not both kings.

The problem is that the sample space has a large number of outcomes. In fact there are 52 ways of picking the first card, and then 51 ways of picking the second, so there are $52 \times 51 = 2652$ possibilities. The sample space therefore consists of 2652 outcomes, each of which is assigned a probability $\frac{1}{2652}$.

To avoid counting all the outcomes which are not both kings, it is easier to look at the number of outcomes which are both kings. Writing the first card to be drawn as the first of the pair, these outcomes are (KC,KD), (KD,KC), (KC,KH), (KH,KC), (KC,KS), (KS,KC), (KD,KH), (KH,KD), (KD,KS), (KS,KD), (KH,KS) and (KS,KH).

There are thus 12 outcomes that are both kings. So the number which are not both kings is $2652 - 12 = 2640$. All 2640 of these outcomes have probability $\frac{1}{2652}$, so

P(not both kings) $= \frac{2640}{2652} = \frac{220}{221}$.

It is always worth watching for this short cut, and it is also useful to have some language to describe it. If $A$ is an event, the event 'not $A$' is the event consisting of those outcomes in the sample space which are not in $A$. Since the sum of the probabilities assigned to outcomes in the sample space is 1,

P($A$) + P(not $A$) = 1.

The event 'not $A$' is called the complement of the event $A$. The symbol $A'$ is used to denote the complement of $A$.
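The counting in Example 4.2.6 can be checked by brute force. The sketch below lists every ordered pair of different cards and compares the direct count of 'not both kings' with the complement calculation. The card labels and variable names are illustrative assumptions, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import permutations

# Build a standard pack: each card is a (rank, suit) pair.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["C", "D", "H", "S"]
pack = [(rank, suit) for rank in RANKS for suit in SUITS]

# Every ordered pair (first card, second card) of two different cards.
outcomes = list(permutations(pack, 2))

# Outcomes in which both cards are kings.
both_kings = [pair for pair in outcomes if pair[0][0] == "K" and pair[1][0] == "K"]

# Complement rule: P(not both kings) = 1 - P(both kings).
p_both_kings = Fraction(len(both_kings), len(outcomes))
p_not_both_kings = 1 - p_both_kings

# Direct count, for comparison with the complement rule.
not_both_kings = [pair for pair in outcomes if pair not in both_kings]
p_direct = Fraction(len(not_both_kings), len(outcomes))

if __name__ == "__main__":
    print(len(outcomes), len(both_kings))
    print(p_not_both_kings, p_direct)
```

The complement route needs only the 12 'both kings' outcomes, whereas the direct route has to examine all 2652.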
If $A$ is an event, then $A'$ is the complement of $A$, and

P($A$) + P($A'$) = 1. (4.1)

4.3 Addition of probabilities

Consider a game in which a fair cubical dice with faces numbered 1 to 6 is rolled twice. A prize is won if the total score on the two rolls is 4 or if each individual score is over 4. You can write the sample space of all possible outcomes as 36 equally likely pairs,

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) ...
...
(6,1) (6,2) ... (6,6)

each having probability $\frac{1}{36}$.

Let $A$ be the event that the total score is 4 and let $B$ be the event that each roll of the dice gives a score over 4.

Then $A$ = {(1,3), (2,2), (3,1)}, and $B$ = {(5,5), (5,6), (6,5), (6,6)}, so

P($A$) $= \frac{3}{36} = \frac{1}{12}$ and P($B$) $= \frac{4}{36} = \frac{1}{9}$.

A prize is won if $A$ happens or if $B$ happens, so P(a prize is won) = P($A$ or $B$).

This means that a prize will be won if any of the outcomes in {(1,3), (2,2), (3,1), (5,5), (5,6), (6,5), (6,6)} occurs. Therefore

P(a prize is won) = P($A$ or $B$) $= \frac{7}{36}$.

The key point is that P($A$ or $B$) = P($A$) + P($B$). The word 'or' is important. Whenever you see it, it should suggest to you the idea of adding probabilities. Notice, however, that $A$ and $B$ have no outcomes which are common to both events. Two events which have no outcomes common to both are called mutually exclusive events. So the result P($A$ or $B$) = P($A$) + P($B$) is known as the addition law of mutually exclusive events.

The addition law of mutually exclusive events can be extended to apply to more than two events if none of the events has any outcome in common with any of the other events.

If $A_1, A_2, \ldots, A_n$ are $n$ mutually exclusive events, then

P($A_1$ or $A_2$ or ... or $A_n$) = P($A_1$) + P($A_2$) + ... + P($A_n$). (4.2)

The addition law needs to be modified when the events are not mutually exclusive. Here is an example.

Example 4.3.1
Two fair dice are thrown. A prize is won if the total is 10 or if each individual score is over 4. Find the probability that a prize is won.

The sample space is the same set of 36 pairs listed at the start of this section.
Let $C$ be the event that the total score is 10, so $C$ = {(5,5), (4,6), (6,4)}.

Let $B$ be the event that each roll of the dice results in a score over 4, as before, so $B$ = {(5,5), (5,6), (6,5), (6,6)}.

Therefore P($C$) $= \frac{3}{36} = \frac{1}{12}$ and P($B$) $= \frac{4}{36} = \frac{1}{9}$.

A prize is won if $B$ or $C$ occurs, and the possible outcomes which make up this event are {(5,5), (4,6), (6,4), (5,6), (6,5), (6,6)}.

Therefore P($B$ or $C$) $= \frac{6}{36} = \frac{1}{6}$.

But P($B$) + P($C$) $= \frac{1}{9} + \frac{1}{12} = \frac{7}{36}$, so in this case P($B$ or $C$) $\neq$ P($B$) + P($C$). The outcome (5,5) belongs to both events, so adding the two probabilities counts it twice.

For events such as $B$ and $C$, which are not mutually exclusive, the addition rule given by Equation (4.2) is not valid.

The rule can be modified so that it applies to any two events. This will be studied later in the course. Can you see how to modify the rule?

Exercise 4A

1 A fair dice is thrown once. Find the probabilities that the score is
(a) bigger than 3,
(b) bigger than or equal to 3,
(c) an odd number,
(d) a prime number,
(e) bigger than 3 and a prime number,
(f) bigger than 3 or a prime number or both,
(g) bigger than 3 or a prime number, but not both.

2 A card is chosen at random from an ordinary pack. Find the probability that it is
(a) red,
(b) a picture card (K, Q, J),
(c) an honour (A, K, Q, J, 10),
(d) a red honour,
(e) red, or an honour, or both.

3 Two fair dice are thrown simultaneously. Find the probability that
(a) the total is 7,
(b) the total is at least 8,
(c) the total is a prime number,
(d) neither of the scores is a 6,
(e) at least one of the scores is a 6,
(f) exactly one of the scores is a 6,
(g) the two scores are the same,
(h) the difference between the scores is an odd number.

4 A fair dice is thrown twice. If the second score is the same as the first, the second throw does not count, and the dice is thrown again until a different score is obtained. The two different scores are added to give a total.
List the possible outcomes.
Find the probability that
(a) the total is 7,
(b) the total is at least 8,
(c) at least one of the two scores is a 6,
(d) the first score is higher than the last.

5
A bag contains ten counters, of which six are red and four are green. A counter is chosen at random; its colour is noted and it is replaced in the bag. A second counter is then chosen at random. Find the probabilities that
(a) both counters are red,
(b) both counters are green,
(c) just one counter is red,
(d) at least one counter is red,
(e) the second counter is red.

6 Draw a bar chart to illustrate the probabilities of the various total scores when two fair dice are thrown simultaneously.

4.4 Conditional probability

Consider a class of 30 pupils, of whom 17 are girls and 13 are boys. Suppose further that five of the girls and six of the boys are left-handed, and all of the remaining pupils are right-handed. If a pupil is selected at random from the whole class, then the chance that he or she is left-handed is $\frac{5+6}{30} = \frac{11}{30}$. However, suppose now that a pupil is selected at random from the girls in the class. The chance that this girl will be left-handed is $\frac{5}{17}$. So being told that the selected pupil is a girl alters the chance that the pupil will be left-handed. This is an example of conditional probability. The probability has been calculated on the basis of an extra 'condition' which you have been given.

There is some notation which is used for conditional probability. Let $L$ be the event that a left-handed person is chosen, and let $G$ be the event that a girl is chosen. The symbol P($L \mid G$) stands for the probability that the pupil chosen is left-handed given that the pupil chosen is a girl. So in this case P($L \mid G$) $= \frac{5}{17}$, although P($L$) $= \frac{11}{30}$.

It is useful to find a connection between conditional probabilities (where some extra information is known) and probabilities where you have no extra information. Notice that the probability P($L \mid G$) can be written as

P($L \mid G$) $= \frac{5}{17} = \frac{5/30}{17/30}$.

The fraction in the numerator is the probability of choosing a left-handed girl if you are selecting from the whole class, and the denominator is the probability of choosing a girl if you are selecting from the whole class.
In symbols this could be written as

P($L \mid G$) $= \frac{\mathrm{P}(L \text{ and } G)}{\mathrm{P}(G)}$.

This equation can be generalised to any two events $A$ and $B$ for which P($A$) > 0.

If $A$ and $B$ are two events and P($A$) > 0, then the conditional probability of $B$ given $A$ is

P($B \mid A$) $= \frac{\mathrm{P}(A \text{ and } B)}{\mathrm{P}(A)}$. (4.3)

Rewriting this equation gives

P($A$ and $B$) = P($A$) × P($B \mid A$), (4.4)

which is known as the multiplication law of probability.

Suppose a jar contains seven red discs and four white discs. Two discs are selected without replacement. ('Without replacement' means that the first disc is not put back in the jar before the second disc is selected.) Let $R_1$ be the event {the first disc is red}, let $R_2$ be the event {the second disc is red}, let $W_1$ be the event {the first disc is white} and let $W_2$ be the event {the second disc is white}. To find the probability that both of the discs are red you want to find P($R_1$ and $R_2$).

Using the multiplication law, Equation 4.4, to find this probability,

P($R_1$ and $R_2$) = P($R_1$) × P($R_2 \mid R_1$).

Now P($R_1$) $= \frac{7}{11}$, since there are 7 red discs in the jar and 11 discs altogether. The probability P($R_2 \mid R_1$) appears more complicated, but it represents the probability that the second disc selected is red given that the first disc was red. To find this just imagine that one red disc has already been removed from the jar. The jar now contains 6 red discs and 4 white discs. The probability now of getting a red disc is P($R_2 \mid R_1$) $= \frac{6}{10}$.

Therefore, using the multiplication law,

P($R_1$ and $R_2$) = P($R_1$) × P($R_2 \mid R_1$) $= \frac{7}{11} \times \frac{6}{10} = \frac{42}{110}$.

You can represent all the possible outcomes when two discs are selected from the jar in a tree diagram, as in Fig. 4.3. Notice that probabilities on the first 'layer' of branches give the chances of getting a red disc or a white disc when the first disc is selected. The probabilities on the second 'layer' are the conditional probabilities. You can use the tree diagram to calculate the probability of any of the four possibilities, $R_1$ and $R_2$, $R_1$ and $W_2$, $W_1$ and $R_2$, and $W_1$ and $W_2$.

[Fig. 4.3. Tree diagram to show the outcomes when two discs are drawn without replacement from a jar.]
To do this you move along the appropriate route, multiplying the probabilities, as shown in Fig. 4.4.

[Fig. 4.4. Tree diagram to show the calculation of the probabilities of the four outcomes.]

For example, to find the probability of getting a white disc followed by a red disc, P($W_1$ and $R_2$), trace that route on the tree diagram and multiply the relevant probabilities. You could also have found P($W_1$ and $R_2$) by using the multiplication law:

P($W_1$ and $R_2$) = P($W_1$) × P($R_2 \mid W_1$) $= \frac{4}{11} \times \frac{7}{10} = \frac{28}{110}$.

You can now use the addition and multiplication laws together to find the probability of more complex events. For example,

P(both discs are the same colour) = P(($R_1$ and $R_2$) or ($W_1$ and $W_2$)).

The event $R_1$ and $R_2$ is the event that both discs are red, and the event $W_1$ and $W_2$ is the event that both discs are white. These events cannot both be satisfied at the same time, so they must be mutually exclusive. Therefore you can use the addition law, giving

P(($R_1$ and $R_2$) or ($W_1$ and $W_2$)) = P($R_1$ and $R_2$) + P($W_1$ and $W_2$)
= P($R_1$) × P($R_2 \mid R_1$) + P($W_1$) × P($W_2 \mid W_1$)
$= \frac{7}{11} \times \frac{6}{10} + \frac{4}{11} \times \frac{3}{10} = \frac{42}{110} + \frac{12}{110} = \frac{54}{110} = \frac{27}{55}$.

You can also use the tree diagram for this calculation. This time there is more than one route through the tree diagram which satisfies the event whose probability is to be found. As before, you follow the appropriate routes and multiply the probabilities. You then add all the resulting products, as in Fig. 4.5.

[Fig. 4.5. Tree diagram to show the calculation of P(($R_1$ and $R_2$) or ($W_1$ and $W_2$)).]

You can use tree diagrams in any problem in which there is a clear sequence to the outcomes, including problems which are not necessarily to do with selection of objects.

Example 4.4.1
Weather records indicate that the probability that a particular day is dry is $\frac{3}{10}$. Arid F.C. is a football team whose record of success is better on dry days than on wet days.
The probability that Arid win on a dry day is $\frac{3}{8}$, whereas the probability that they win on a wet day is $\frac{3}{11}$. Arid are due to play their next match on Saturday.
(a) What is the probability that Arid will win?
(b) Three Saturdays ago Arid won their match. What is the probability that it was a dry day?

Here the sequence involves first the type of weather and then the result of the football match. The tree diagram in Fig. 4.6 illustrates the information.

[Fig. 4.6. Tree diagram for the football match.]

Notice that the probabilities in bold type were not given in the statement of the question. They have been calculated by using equations like P(wet) + P(dry) = 1.

(a) P(win) = P((dry and win) or (wet and win))
= P(dry and win) + P(wet and win)
= P(dry) × P(win | dry) + P(wet) × P(win | wet)
$= \frac{3}{10} \times \frac{3}{8} + \frac{7}{10} \times \frac{3}{11} = \frac{9}{80} + \frac{21}{110} = \frac{267}{880}$.

(b) In this case you have been asked to calculate a conditional probability. However, here the sequence of events has been reversed and you want to find P(dry | win).

P(dry | win) $= \frac{\mathrm{P}(\text{dry and win})}{\mathrm{P}(\text{win})} = \frac{9/80}{267/880} = \frac{33}{89}$.

You can think of P(dry | win) as being the proportion of times that the weather is dry out of all the times that Arid win.

4.5 Independent events

Consider again a jar containing seven red discs and four white discs. Two discs are selected, but this time with replacement. This means that the first disc is returned to the jar before the second disc is selected.

Let $R_1$ be the event that the first disc is red, $R_2$ be the event that the second disc is red, $W_1$ be the event that the first disc is white and $W_2$ be the event that the second disc is white. You can represent the selection of the two discs with Fig. 4.7, a tree diagram similar to Fig. 4.3 but with different probabilities on the second 'layer'.

The probability P($R_2$)
that the second disc is red can also be found using the addition and multiplication laws.

[Fig. 4.7. Tree diagram to show the outcomes when two discs are drawn with replacement from a jar.]

P($R_2$) = P(($R_1$ and $R_2$) or ($W_1$ and $R_2$)) = P($R_1$ and $R_2$) + P($W_1$ and $R_2$) $= \frac{7}{11} \times \frac{7}{11} + \frac{4}{11} \times \frac{7}{11} = \frac{7}{11}$.

In this case P($R_2$) = P($R_2 \mid R_1$), which means that the first disc's being red has no effect on the chance of the second disc being red. This is what you would expect, since the first disc was replaced before the second was removed.

Two events $A$ and $B$ for which P($B \mid A$) = P($B$) are called independent. Independent events have no effect upon one another. Recall also that, from the definition of conditional probability,

P($B \mid A$) $= \frac{\mathrm{P}(A \text{ and } B)}{\mathrm{P}(A)}$.

So when you equate the two expressions for P($B \mid A$) for independent events, you get

$\frac{\mathrm{P}(A \text{ and } B)}{\mathrm{P}(A)} = \mathrm{P}(B)$, which when rearranged gives P($A$ and $B$) = P($A$) × P($B$).

Independent events are events which have no effect on one another. For two independent events $A$ and $B$,

P($A$ and $B$) = P($A$) × P($B$). (4.5)

This result is called the multiplication law for independent events.

Example 4.5.1
In a carnival game, a contestant has to first spin a fair coin and then roll a fair cubical dice whose faces are numbered 1 to 6. The contestant wins a prize if the coin shows heads and the dice score is below 3. Find the probability that a contestant wins a prize.

P(prize won) = P((coin shows heads) and (dice score is lower than 3)).

The event that the coin shows heads and the event that the dice score is lower than 3 are independent, because the score on the dice can have no effect on the result of the spin of the coin. Therefore the multiplication law for independent events can be used.

P(prize won) = P((coin shows heads) and (dice score is lower than 3))
= P(coin shows heads) × P(dice score is lower than 3)
$= \frac{1}{2} \times \frac{2}{6} = \frac{1}{6}$.

The law of multiplication for independent events can be extended to more than two events, provided they are all independent of one another.

If $A_1$, $A_2$, ...,
$A_n$ are $n$ independent events, then

P($A_1$ and $A_2$ and ... and $A_n$) = P($A_1$) × P($A_2$) × ... × P($A_n$). (4.6)

Example 4.5.2
A fair cubical dice with faces numbered 1 to 6 is thrown four times. Find the probability that three of the four throws result in a 6.

You can use the addition law of mutually exclusive events and the multiplication law of independent events to break the event {three of the four scores are 6} down into smaller sub-events whose probabilities you can easily determine:

P(three of the scores are 6s) = P(($6_1$ and $6_2$ and $6_3$ and $N_4$) or ($6_1$ and $6_2$ and $N_3$ and $6_4$) or ($6_1$ and $N_2$ and $6_3$ and $6_4$) or ($N_1$ and $6_2$ and $6_3$ and $6_4$)),

where, for example, $6_1$ means that the first score was a 6 and $N_3$ means that the third score was not a 6.

Using the addition and multiplication laws,

P(($6_1$ and $6_2$ and $6_3$ and $N_4$) or ($6_1$ and $6_2$ and $N_3$ and $6_4$) or ($6_1$ and $N_2$ and $6_3$ and $6_4$) or ($N_1$ and $6_2$ and $6_3$ and $6_4$))
= P($6_1$ and $6_2$ and $6_3$ and $N_4$) + P($6_1$ and $6_2$ and $N_3$ and $6_4$) + P($6_1$ and $N_2$ and $6_3$ and $6_4$) + P($N_1$ and $6_2$ and $6_3$ and $6_4$)
= P($6_1$) × P($6_2$) × P($6_3$) × P($N_4$) + P($6_1$) × P($6_2$) × P($N_3$) × P($6_4$) + P($6_1$) × [...]

[...] and $Z_3$, using a different procedure from that used to write out Fig. 5.2.

Write the first permutation of $A$, $Z_1$, $Z_2$ and $Z_3$ at the top of the first column. Any permutation can be the first permutation. Leave $A$ in the same position, but write the other permutations of $Z_1$, $Z_2$ and $Z_3$ underneath. Write a permutation not already used at the top of the next column, and repeat writing the other permutations of $Z_1$, $Z_2$ and $Z_3$ underneath while keeping $A$ in the same position. Keep going until you have written all the permutations of $A$, $Z_1$, $Z_2$ and $Z_3$.

$AZ_1Z_2Z_3$  $Z_1AZ_2Z_3$  $Z_1Z_2AZ_3$  $Z_1Z_2Z_3A$
$AZ_1Z_3Z_2$  $Z_1AZ_3Z_2$  $Z_1Z_3AZ_2$  $Z_1Z_3Z_2A$
$AZ_2Z_1Z_3$  $Z_2AZ_1Z_3$  $Z_2Z_1AZ_3$  $Z_2Z_1Z_3A$
$AZ_2Z_3Z_1$  $Z_2AZ_3Z_1$  $Z_2Z_3AZ_1$  $Z_2Z_3Z_1A$
$AZ_3Z_1Z_2$  $Z_3AZ_1Z_2$  $Z_3Z_1AZ_2$  $Z_3Z_1Z_2A$
$AZ_3Z_2Z_1$  $Z_3AZ_2Z_1$  $Z_3Z_2AZ_1$  $Z_3Z_2Z_1A$

Fig. 5.4. All permutations of the letters $A$, $Z_1$, $Z_2$ and $Z_3$.

There are 4! arrangements in Fig. 5.4. Each column has all the permutations of $Z_1$, $Z_2$ and $Z_3$, so there are 3! rows, and so the number of columns altogether is $\frac{4!}{3!} = 4$. Now replace $Z_1$, $Z_2$ and $Z_3$ by $Z$ and you have the permutations of $A$, $Z$, $Z$ and $Z$ in the top row. These are

AZZZ, ZAZZ, ZZAZ, ZZZA.
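The column-counting argument can be mirrored in a few lines of Python. This is only a sketch: the labels `Z1`, `Z2`, `Z3` stand for the subscripted letters, and the helper name `drop_subscript` is an illustrative choice.

```python
from itertools import permutations
from math import factorial

# The four labelled letters: one A and three distinguishable Zs.
labelled = ["A", "Z1", "Z2", "Z3"]

# All 4! arrangements of the labelled letters (every entry of the table).
all_arrangements = list(permutations(labelled))


def drop_subscript(arrangement):
    """Replace Z1, Z2, Z3 by plain Z, keeping the order of the letters."""
    return "".join(symbol[0] for symbol in arrangement)


# Arrangements that look the same once the subscripts are removed
# belong to the same column of the table.
distinct = sorted({drop_subscript(a) for a in all_arrangements})

if __name__ == "__main__":
    print(len(all_arrangements))  # entries in the table
    print(distinct)               # one word per column
    print(factorial(4) // factorial(3))
```

Each distinct word corresponds to one column, so the number of distinct words equals the number of table entries divided by the number of rows.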
You can generalise this argument. Suppose that you have $n$ objects and $r$ of them are identical. Then the number of arrangements in the table equivalent to Fig. 5.4 will be $n!$. When you write down the permutations in the columns corresponding to the arrangement at the top of the column, you find that there are $r!$ of them, so the table will have $r!$ rows. The number of columns (that is, the number of distinct permutations) is therefore $\frac{n!}{r!}$.

This result also generalises.

The number of distinct permutations of $n$ objects, of which $p$ are identical to each other, and then $q$ of the remainder are identical, and $r$ of the remainder are identical, and so on is

$\frac{n!}{p! \times q! \times r! \times \ldots}$, where $p + q + r + \ldots = n$.

Example 5.2.1
Find the number of distinct permutations of the letters of the word MISSISSIPPI.

The number of letters is 11, of which there are 4 Ss, 4 Is, 2 Ps and 1 M. The number of distinct permutations of the letters is therefore

$\frac{11!}{4! \times 4! \times 2! \times 1!} = 34\,650$.

Exercise 5A

1 Seven different cars are to be loaded on to a transporter truck. In how many different ways can the cars be arranged?

2 How many numbers are there between 1245 and 5421 inclusive which contain each of the digits 1, 2, 4 and 5 once and once only?

3 An artist is going to arrange five paintings in a row on a wall. In how many ways can this be done?

4 Ten athletes are running in a 100-metre race. In how many different ways can the first three places be filled?

5 By writing out all the possible arrangements of $D_1E_1E_2D_2$, show that there are $\frac{4!}{2!\,2!} = 6$ different arrangements of the letters of the word DEED.

6 A typist has five letters and five addressed envelopes. In how many different ways can the letters be placed in each envelope without getting every letter in the right envelope? If the letters are placed in the envelopes at random, what is the probability that each letter is in its correct envelope?

7 How many different arrangements can be made of the letters in the word STATISTICS?
8 (a) Calculate the number of arrangements of the letters in the word NUMBER.
(b) How many of the arrangements in part (a) begin and end with a vowel?

9 How many different numbers can be formed by taking one, two, three and four digits from the digits 1, 2, 7 and 8, if repetitions are not allowed?
One of these numbers is chosen at random. What is the probability that it is greater than 200?

5.3 Combinations

In the last section you considered permutations (arrangements) for which the order of the objects is significant when you count the number of different possibilities. In some circumstances, however, the order of selection does not matter. For example, if you were dealt a hand of 13 cards from a standard pack of 52 playing cards, you would not be interested in the order in which you received the cards. When a selection is made from a set of objects and the order of selection is unimportant it is called a combination.

To see the difference between combinations and permutations consider what happens when you select three letters from the four letters A, B, C and D. Here is a procedure for finding all the combinations. It starts by considering permutations, and gives you a method of counting the combinations.

Start with any permutation of three letters from A, B, C and D, and write it at the top of the first column. Write the other permutations of the same three letters underneath it. Write a permutation not already used at the top of the next column, and write the other permutations of the letters underneath. Keep on until you have used all the permutations of three letters from A, B, C and D.

The results are shown in Fig. 5.5.

ABC  ABD  ACD  BCD
ACB  ADB  ADC  BDC
BAC  BAD  CAD  CBD
BCA  BDA  CDA  CDB
CAB  DAB  DAC  DBC
CBA  DBA  DCA  DCB

Fig. 5.5. Procedure for finding the number of combinations.

Each column then corresponds to a single combination because the elements in any one column differ only in the order in which the letters are written.
The permutations are all different, but they all give rise to the same combination at the head of the column. To count the combinations, it is sufficient to count the columns.

There are ${}^4P_3$ permutations of 3 objects from 4 objects, so there are ${}^4P_3$ elements in total in Fig. 5.5. Each column has 3! elements, so, by dividing, you find that there must be $\frac{{}^4P_3}{3!}$ columns, which means $\frac{{}^4P_3}{3!}$ combinations. As

$\frac{{}^4P_3}{3!} = \frac{4!}{(4-3)! \times 3!} = \frac{4 \times 3 \times 2 \times 1}{1 \times (3 \times 2 \times 1)} = 4$,

there are 4 combinations of three letters from the four letters A, B, C and D.

You can apply this reasoning to the problem of finding the number of combinations of $r$ objects taken from $n$ objects. In the table which corresponds to Fig. 5.5, there would be ${}^nP_r$ elements in total and each column would have $r!$ elements. There would therefore be $\frac{{}^nP_r}{r!}$ combinations of $r$ objects taken from $n$ objects.

Writing ${}^nP_r$ in factorials as $\frac{n!}{(n-r)!}$ leads to a simpler expression to remember: the number of combinations of $r$ objects taken from $n$ objects is $\frac{n!}{(n-r)! \times r!}$. This number is written $\binom{n}{r}$. Other notations, such as ${}^nC_r$, are also used, and your calculator probably uses one of them.

A combination is a selection in which the order of the objects selected is unimportant.

The number of different combinations of $r$ objects selected from $n$ distinct objects is $\binom{n}{r}$, where $\binom{n}{r} = \frac{n!}{(n-r)! \times r!}$.

Example 5.3.1
The manager of a football team has a squad of 16 players. He needs to choose 11 to play in a match. How many possible teams can be chosen?

This example is not entirely realistic because players will not be equally capable of playing in every position, but it does show how many possible teams there are. It is important to decide whether this question is about permutations or combinations. Clearly the important issue here is the people in the team and not their order of selection. Therefore this question is about combinations rather than permutations.

The number of teams is $\binom{16}{11} = \frac{16!}{(16-11)! \times 11!} = \frac{16!}{5! \times 11!} = 4368$.

The number of teams is surprisingly large.

You may notice in Example 5.3.1 that if you had chosen the 5 players to drop out of the squad of 16 players, you would in effect be selecting the 11 by another method.
You can choose the 5 players in $\binom{16}{5}$ ways, and
$$\binom{16}{5} = \frac{16!}{(16-5)! \times 5!} = \frac{16!}{11! \times 5!},$$
which is the same as $\binom{16}{11}$.

When you calculate a number such as $\binom{16}{5}$, there is a useful shortcut. Since $16! = 16 \times 15 \times \dots \times 12 \times 11 \times 10 \times \dots \times 2 \times 1$, you can cancel the $11 \times 10 \times \dots \times 2 \times 1$ in the numerator with the 11! in the denominator. Therefore you can write down immediately that
$$\binom{16}{5} = \frac{16 \times 15 \times 14 \times 13 \times 12}{5 \times 4 \times 3 \times 2 \times 1},$$
where you need to make sure that you multiply 5 numbers in the numerator if the denominator is 5!. In general,
$$\binom{n}{r} = \frac{n(n-1)\cdots(n-r+1)}{r!}.$$

Example 5.3.2
A team of 5 people, which must contain 3 men and 2 women, is chosen from 8 men and 7 women. How many different teams can be selected?

The number of different teams of 3 men which can be selected from 8 is $\binom{8}{3}$.

The number of different teams of 2 women which can be selected from 7 is $\binom{7}{2}$.

Any of the $\binom{8}{3}$ men's teams can join up with any of the $\binom{7}{2}$ women's teams to make an acceptable team of 5. Therefore you need to multiply these two quantities together to find the number of different teams possible.

The number of possible teams is
$$\binom{8}{3}\binom{7}{2} = \frac{8 \times 7 \times 6}{3 \times 2 \times 1} \times \frac{7 \times 6}{2 \times 1} = 56 \times 21 = 1176.$$

You can now apply some of these counting methods to probability examples.

Example 5.3.3
Five cards are dealt without replacement from a standard pack of 52 cards. Find the probability that exactly 3 of the 5 cards are hearts.

The sample space is very large. It would consist of a list of all possible sets of 5 cards which you could choose from the 52 cards in the pack. You do not need such a list, however. All that you need to know is how many different sets of cards the sample space contains. You are choosing 5 objects from 52, so the number of ways of making the choice is $\binom{52}{5}$. Each of these choices is equally likely.

Let A be the event that exactly 3 cards of the 5 dealt out are hearts. The method used to find the number of outcomes in the event A is very similar to the technique used in Example 5.3.2. There are $\binom{13}{3}$ ways of choosing the 3 hearts from the 13 hearts in the pack, and $\binom{39}{2}$ ways of choosing the other 2 cards from the 39 cards which are not hearts.

Therefore the number of sets of 5 cards with exactly 3 hearts is $\binom{13}{3} \times \binom{39}{2}$.

The probability that event A happens is
$$\frac{\binom{13}{3} \times \binom{39}{2}}{\binom{52}{5}} = 0.0815, \text{ correct to 3 significant figures.}$$

Exercise 5B

1 How many different three-card hands can be dealt from a pack of 52 cards?
2 From a group of 30 boys and 32 girls, two girls and two boys are to be chosen to represent their school. How many possible selections are there?

3 A history exam paper contains eight questions, four in Part A and four in Part B. Candidates are required to attempt five questions. In how many ways can this be done if
  (a) there are no restrictions,
  (b) at least two questions from Part A and at least two questions from Part B must be attempted?

4 A committee of three people is to be selected from four women and five men. The rules state that there must be at least one man and one woman on the committee. In how many different ways can the committee be chosen?
  Subsequently one of the men and one of the women marry each other. The rules also state that a married couple may not both serve on the committee. In how many ways can the committee be chosen now?

5 A box of one dozen eggs contains one that is bad. If three eggs are chosen at random, what is the probability that one of them will be bad?

6 In a game of bridge the pack of 52 cards is shared equally between all four players. What is the probability that one particular player has no hearts?

7 A bag contains 20 chocolates, 15 toffees and 12 peppermints. If three sweets are chosen at random, what is the probability that they are
  (a) all different, (b) all chocolates, (c) all the same, (d) all not chocolates?

8 Show that $\binom{n}{r} = \binom{n}{n-r}$.

9 Show that the number of permutations of n objects of which r are of one kind and n − r are of another kind is $\binom{n}{r}$.

5.4 Applications of permutations and combinations

In Example 5.1.2 you were asked to find the number of ways that eight people could stand in a line when two people had to stand next to each other. This was an example in which you were asked to find the number of permutations or combinations of a set of objects with some extra condition included. This section will show you how to answer such questions.
Example 5.4.1
Find the number of ways of arranging 6 women and 3 men to stand in a row so that all 3 men are standing together.

You can make this problem simpler by thinking of the 3 men as a single unit. Imagine tying them together, for example! You would then have 7 items (or units), the 6 individual women and the block of 3 men. So one possible arrangement would be

$W_1 \quad W_2 \quad W_3 \quad W_4 \quad W_5 \quad W_6 \quad (M_1 M_2 M_3)$,

where $W_3$, for example, represents the third woman.

The number of permutations of these 7 units is 7!. However, for each of these permutations the men could be arranged (inside the rope) in 3! different ways.

Therefore the total number of permutations in which the 6 women and 3 men can be arranged so that the 3 men are standing together is $7! \times 3! = 30\,240$.

Example 5.4.2
Find the number of ways of arranging 6 women and 3 men in a row so that no two men are standing next to one another.

You can ensure that no two men stand next to one another in the following way.

Arrange the 6 women to stand in a line with a space between each pair of them and two extra spaces, one at each end of the line. One such arrangement is

Space 1   $W_1$   Space 2   $W_2$   Space 3   $W_3$   Space 4   $W_4$   Space 5   $W_5$   Space 6   $W_6$   Space 7

There are 6! arrangements of the 6 women. For any of these 6! arrangements you can now pick a space in which to place the first man, $M_1$. This can be done in 7 ways. Here is the arrangement above with one of the men, $M_1$, placed in Space 2.

Space 1   $W_1$   $M_1$   $W_2$   Space 3   $W_3$   Space 4   $W_4$   Space 5   $W_5$   Space 6   $W_6$   Space 7

By using a similar argument you can see that there will be 6 choices for the position of $M_2$, and 5 choices for the position of $M_3$. Once the 3 men have been placed, the remaining spaces can be 'closed up' or simply ignored. By using this method you can guarantee that no two men can stand next to one another. Also, all possible arrangements will be counted using this method.

Therefore the number of permutations in which no two men stand next to one another is $6! \times 7 \times 6 \times 5 = 151\,200$.
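The two counts in Examples 5.4.1 and 5.4.2 can be checked by listing every arrangement. The sketch below is only an illustrative check, not part of the method in the text: the labels `W1` to `W6` and `M1` to `M3` and the helper `men_positions` are names chosen here for the illustration. It runs through all 9! rows and counts those in which the three men occupy consecutive places and those in which no two men are adjacent.

```python
from itertools import permutations
from math import factorial

# Hypothetical labels for the 6 women and 3 men.
people = ["W1", "W2", "W3", "W4", "W5", "W6", "M1", "M2", "M3"]


def men_positions(row):
    """Return the sorted positions (0 to 8) occupied by the three men."""
    return sorted(i for i, person in enumerate(row) if person.startswith("M"))


all_together = 0   # Example 5.4.1: the three men stand in consecutive places
none_adjacent = 0  # Example 5.4.2: no two men stand next to one another

for row in permutations(people):
    a, b, c = men_positions(row)
    if c - a == 2:
        all_together += 1
    elif b - a > 1 and c - b > 1:
        none_adjacent += 1

# Compare the brute-force counts with the formulas used in the examples.
print(all_together, factorial(7) * factorial(3))
print(none_adjacent, factorial(6) * 7 * 6 * 5)
```

Listing all 362 880 rows is only practical because the group is small; for larger groups the counting arguments in the examples are the sensible route.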
It should be clear that you multiply 6! by 7, 6 and 5 because for every one of the 6! arrangements of the women there will be 7 spaces to choose for $M_1$, and then 6 places to choose for $M_2$, and then 5 places to choose for $M_3$.

It is also worth noting that the answers to Examples 5.4.1 and 5.4.2 when added together do not give 9!, which is the total number of arrangements of 9 people without any restriction at all. This is because there is a third possibility. If two men were standing together and the third man was separated from these two by some women, then it would not be the case that all the men were together, but neither would it be the case that the three men were all apart from one another.

Example 5.4.3
A group of 12 people consisting of 6 married couples is arranged at random in a line for a photograph. Find the probability that each wife is standing next to her husband.

The number of unrestricted arrangements is 12!. Each of them is equally likely.

If each husband and wife 'couple' is to stand together, then you can consider each couple as a unit. There are therefore 6 such units. The number of permutations of these units is 6!. But the first couple $H_1W_1$ can be arranged in 2! ways, either $H_1W_1$ or $W_1H_1$. This applies equally to couples 2, 3, 4, 5 and 6. Therefore the number of arrangements in which each couple stands together is $6! \times (2!)^6$.

Hence
$$P(\text{each couple stands together}) = \frac{6! \times (2!)^6}{12!} = \frac{1}{10\,395} = 0.000\,0962, \text{ correct to 3 significant figures.}$$

Example 5.4.4
Four letters are to be selected from the letters in the word RIGIDITY. How many different combinations are there?

The word RIGIDITY contains three Is together with the five different letters R, G, D, T and Y, so it is simplest to count separately the combinations containing no Is, one I, two Is and three Is.

Case 1  Combinations with no Is
You are selecting 4 letters from the 5 letters R, G, D, T and Y, so the number of combinations is $\binom{5}{4} = 5$. Remember that as this is a problem about combinations you are not interested in the order. All that matters here is which 'team' of letters you choose. The possible teams are

RGDT   RGDY   RGTY   RDTY   GDTY.

Case 2  Combinations with one I
In this case you are selecting 3 letters from the letters R, G, D, T, Y, together with one I. So the number of combinations is $\binom{5}{3} = 10$. Here they are:

RGDI   RGTI   RGYI   RDTI   RDYI   RTYI   GDTI   GDYI   GTYI   DTYI.

Case 3  Combinations with two Is
In this case you are selecting 2 letters from the letters R, G, D, T, Y, together with two Is. So the number of combinations is $\binom{5}{2} = 10$. Here they are:

RGII   RDII   RTII   RYII   GDII   GTII   GYII   DTII   DYII   TYII.
Case 4  Combinations with three Is
In this case you are selecting 1 letter from the letters R, G, D, T, Y, together with three Is. So the number of combinations with three Is is $\binom{5}{1} = 5$. Here they are:

RIII   GIII   DIII   TIII   YIII.

The total number of distinct combinations of four letters selected from the letters of the word RIGIDITY is 5 + 10 + 10 + 5 = 30. All 30 combinations have been listed above.

Exercise 5C

1 The letters of the word CONSTANTINOPLE are written on 14 cards, one on each card. The cards are shuffled and then arranged in a straight line.
  (a) How many different possible arrangements are there?
  (b) How many arrangements begin with P?
  (c) How many arrangements start and end with O?
  (d) How many arrangements are there where no two vowels are next to each other?

2 A coin is tossed 10 times.
  (a) How many different sequences of heads and tails are possible?
  (b) How many different sequences containing six heads and four tails are possible?
  (c) What is the probability of getting six heads and four tails?

3 Eight cards are selected with replacement from a standard pack of 52 playing cards, with 12 picture cards, 20 odd cards and 20 even cards.
  (a) How many different sequences of eight cards are possible?
  (b) How many of the sequences in part (a) will contain three picture cards, three odd-numbered cards and two even-numbered cards?
  (c) Use parts (a) and (b) to determine the probability of getting three picture cards, three odd-numbered cards and two even-numbered cards if eight cards are selected with replacement from a standard pack of 52 playing cards.

4 Eight women and five men are standing in a line.
  (a) How many arrangements are possible if any individual can stand in any position?
  (b) In how many arrangements will all five men be standing next to one another?
  (c) In how many arrangements will no two men be standing next to one another?

5 Each of the digits 1, 1, 2, 3, 3, 4, 6 is written on a separate card. The seven cards are then laid out in a row to form a 7-digit number.
  (a) How many distinct 7-digit numbers are there?
  (b) How many of these 7-digit numbers are even?
  (c) How many of these 7-digit numbers are divisible by 4?
  (d) How many of these 7-digit numbers start and end with the same digit?

6 Three families, the Mehtas, the Mupondas and the Lams, go to the cinema together to watch a film. Mr and Mrs Mehta take their daughter Indira, Mr and Mrs Muponda take their sons Paul and John, and Mrs Lam takes her children Susi, Kim and Lee. The families occupy a single row with eleven seats.
  (a) In how many ways could the eleven people be seated if there were no restriction?
  (b) In how many ways could the eleven people sit down so that the members of each family are all sitting together?
  (c) In how many of the arrangements will no two adults be sitting next to one another?

7 The letters of the word POSSESSES are written on nine cards, one on each card. The cards are shuffled and four of them are selected and arranged in a straight line.
  (a) How many possible selections are there of four letters?
  (b) How many arrangements are there of four letters?

Miscellaneous exercise 5

1 The judges in a 'Beautiful Baby' competition have to arrange 10 babies in order of merit. In how many different ways could this be done? Two babies are to be selected to be photographed. In how many ways can this selection be made?

2 In how many ways can a committee of four men and four women be seated in a row if
  (a) they can sit in any position,
  (b) no one is seated next to a person of the same sex?

3 How many distinct arrangements are there of the letters in the word ABRACADABRA?

4 Six people are going to travel in a six-seater minibus but only three of them can drive. In how many different ways can they seat themselves?

5 There are eight different books on a bookshelf: three of them are hardbacks and the rest are paperbacks.
  (a) In how many different ways can the books be arranged if all the paperbacks are together and all the hardbacks are together?
  (b) In how many different ways can the books be arranged if all the paperbacks are together?
6 Four boys and two girls sit in a line on stools in front of a coffee bar.
  (a) In how many ways can they arrange themselves so that the two girls are together?
  (b) In how many ways can they sit if the two girls are not together? (OCR)

7 Ten people travel in two cars, a saloon and a Mini. If the saloon has seats for six and the Mini has seats for four, find the number of different ways in which the party can travel, assuming that the order of seating in each car does not matter and all the people can drive. (OCR)

8 Giving a brief explanation of your method, calculate the number of different ways in which the letters of the word TRIANGLES can be arranged if no two vowels may come together. (OCR)

9 I have seven fruit bars to last the week. Two are apricot, three fig and two peach. I select one bar each day. In how many different orders can I eat the bars?
  If I select a fruit bar at random each day, what is the probability that I eat the two apricot ones on consecutive days?

10 A class contains 30 children, 18 girls and 12 boys. Four complimentary theatre tickets are distributed at random to the children in the class. What is the probability that
  (a) all four tickets go to girls,
  (b) two boys and two girls receive tickets? (OCR)

11 (a) How many different 7-digit numbers can be formed from the digits 0, 1, 2, 2, 3, 3, 3, assuming that a number cannot start with 0?
  (b) How many of these numbers will end in 0? (OCR)

12 Calculate the number of ways in which three girls and four boys can be seated on a row of seven chairs if each arrangement is to be symmetrical. (OCR)

13 Find the number of ways in which
  (a) 3 people can be arranged in 4 seats,
  (b) 5 people can be arranged in 5 seats.
  In a block of 8 seats, 4 are in row A and 4 are in row B. Find the number of ways of arranging 8 people in the 8 seats given that 3 specified people must be in row A. (OCR)

14 Eight different cards, of which four are red and four are black, are dealt to two players so that each receives a hand of four cards.
  Calculate
  (a) the total number of different hands which a given player could receive,
  (b) the probability that each player receives a hand consisting of four cards all of the same colour. (OCR)

15 A piece of wood of length 10 cm is to be divided into 3 pieces so that the length of each piece is a whole number of cm, for example 2 cm, 3 cm and 5 cm.
  (a) List all the different sets of lengths which could be obtained.
  (b) If one of these sets is selected at random, what is the probability that the lengths of the pieces could be lengths of the sides of a triangle? (OCR)

16 Nine persons are to be seated at three tables holding 2, 3 and 4 persons respectively. In how many ways can the groups sitting at the tables be selected, assuming that the order of sitting at the tables does not matter? (OCR)

17 (a) Calculate the number of different arrangements which can be made using all the letters of the word BANANA.
  (b) The number of combinations of 2 objects from n is equal to the number of combinations of 3 objects from n. Determine n. (OCR)

18 A 'hand' of 5 cards is dealt from an ordinary pack of 52 playing cards. Show that there are nearly 2.6 million distinct hands and that, of these, 575 757 contain no card from the heart suit.
  On three successive occasions a card player is dealt a hand containing no heart. What is the probability of this happening? What conclusion might the player justifiably reach? (OCR)

19 Notice that $7! \times 6! = 10!$. Find three integers, m, n and r, where r > 10, for which $m! \times n! = r!$.

6 Probability distributions

This chapter introduces the idea of a discrete random variable. When you have completed it, you should
• understand what a discrete random variable is
• know the properties of a discrete random variable
• be able to construct a probability distribution table for a discrete random variable.

6.1 Discrete random variables

Most people have played board games at some time.
Here is an example.

Game A  A turn consists of throwing a dice and then moving a number of squares equal to the score on the dice.

'The number of squares moved in a turn' is a variable because it can take different values, namely 1, 2, 3, 4, 5 and 6. However, the value taken at any one turn cannot be predicted, but depends on chance. For these reasons 'the number of squares moved in a turn' is called a 'random variable'.

A random variable is a quantity whose value depends on chance.

'The number of squares moved in a turn' is a discrete random variable because there are clear steps between the different possible values it can take.

Although you cannot predict the result of the next throw of the dice, you do know that, if the dice is fair, the probability of getting each value is $\frac{1}{6}$. A convenient way of expressing this information is to let X stand for 'the number of squares moved in a turn'. Then, for example, $P(X = 3) = \frac{1}{6}$ means 'the probability that X takes the value 3 is $\frac{1}{6}$'. Generalising, $P(X = x)$ means 'the probability that the variable X takes the value x'.

Note how the capital letter stands for the variable itself and the small letter stands for the value which the variable takes.

This notation is used in Table 6.1 to give the possible values for the number of squares moved and the probability of each value. This table is called the 'probability distribution' of X.

x           1     2     3     4     5     6
P(X = x)    1/6   1/6   1/6   1/6   1/6   1/6

Table 6.1. Probability distribution of X, the number of squares moved in a turn for a single throw of a dice.

The probability distribution of a discrete random variable is a listing of the possible values of the variable and the corresponding probabilities.

In some board games, a dice is used in a more complicated way in order to decide how many squares a person should move. Here are two different examples.

Game B  A person is allowed a second throw of the dice if a 6 is thrown and, in this case, moves a number Y of squares equal to the sum of the two scores obtained.
Game C  The dice is thrown twice and the number, W, of squares moved is the sum of the two scores.

Fig. 6.2 is a tree diagram illustrating Game B. As Y is the number of squares moved in a turn, it can take the values 1, 2, 3, 4, 5, 7, 8, 9, 10, 11 and 12. The probability of each of the first five values is $\frac{1}{6}$, as in the previous game. In order to score 7, you have to score 6 followed by 1. Since the two events are independent, the probability of scoring a 6 followed by a 1 is found by multiplying the two probabilities:

$$P(Y = 7) = P(6 \text{ on first throw}) \times P(1 \text{ on second throw}) = \tfrac{1}{6} \times \tfrac{1}{6} = \tfrac{1}{36}.$$

Fig. 6.2. Tree diagram for Game B.

The probability that Y takes each of the values 8, 9, 10, 11 and 12 will also be $\frac{1}{36}$. Table 6.3 gives the probability distribution of Y.

y           1     2     3     4     5     7      8      9      10     11     12     Total
P(Y = y)    1/6   1/6   1/6   1/6   1/6   1/36   1/36   1/36   1/36   1/36   1/36   1

Table 6.3. Probability distribution of Y, the number of squares moved in Game B.

The possible values of W in Game C can be found by constructing a table as shown in Fig. 6.4.

Fig. 6.4. Possible total scores when individual scores are added in Game C (first throw against second throw).

There are 36 outcomes in the table and they are all equally likely so, for example, $P(W = 6) = \frac{5}{36}$ and $P(W = 7) = \frac{6}{36}$. Table 6.5 gives the probability distribution of W.

The fractions could have been cancelled, but in their present forms it is easier to see the shape of the distribution.

w           2      3      4      5      6      7      8      9      10     11     12     Total
P(W = w)    1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36   1

Table 6.5. Probability distribution of W, the number of squares moved in Game C.

Fig. 6.6 allows you to compare the probability distributions of X, Y and W.

Fig. 6.6. Comparing the probability distributions of Games A, B and C.

Looking at Fig. 6.6, which method of scoring will take you round the board most quickly and which most slowly? (A method for finding the answer to this question by calculation is given in Example 8.1.1.)

Examples 6.1.1 and 6.1.2 illustrate some other probability distributions.

Example 6.1.1
A bag contains two red and three blue marbles. Two marbles are selected at random without replacement and the number, X, of blue marbles is counted.
Find the probability distribution of X.

Fig. 6.7 is a tree diagram illustrating this situation. $R_1$ denotes the event that the first marble is red and $R_2$ the event that the second marble is red. Similarly $B_1$ and $B_2$ stand for the events that the first and second marbles respectively are blue.

Fig. 6.7. Tree diagram for Example 6.1.1.

X can take the values 0, 1 and 2.

$$P(X = 0) = P(R_1 \text{ and } R_2) = P(R_1) \times P(R_2 \mid R_1) = \tfrac{2}{5} \times \tfrac{1}{4} = \tfrac{1}{10},$$

$$P(X = 1) = P(B_1 \text{ and } R_2) + P(R_1 \text{ and } B_2) = P(B_1) \times P(R_2 \mid B_1) + P(R_1) \times P(B_2 \mid R_1) = \tfrac{3}{5} \times \tfrac{2}{4} + \tfrac{2}{5} \times \tfrac{3}{4} = \tfrac{3}{5},$$

$$P(X = 2) = P(B_1 \text{ and } B_2) = P(B_1) \times P(B_2 \mid B_1) = \tfrac{3}{5} \times \tfrac{2}{4} = \tfrac{3}{10}.$$

Here is the probability distribution of X.

x           0      1     2      Total
P(X = x)    1/10   3/5   3/10   1

Example 6.1.2
A random variable, X, has the probability distribution shown below.

x           1     2     3     4
P(X = x)    0.1   0.2   0.3   0.4

Two observations are made of X and the random variable Y is equal to the larger minus the smaller; if the two observations are equal, Y takes the value 0. Find the probability distribution of Y. Which value of Y is most likely?

The following table gives the values for X and the corresponding value of Y. Since the two observations of X are independent, you find the probability of each pair by multiplying the probabilities for the two X values, as shown in the last column of the table.

First value of X   Second value of X   Value of Y   Probability
1                  1                   0            0.1 × 0.1 = 0.01
1                  2                   1            0.1 × 0.2 = 0.02
1                  3                   2            0.1 × 0.3 = 0.03
1                  4                   3            0.1 × 0.4 = 0.04
2                  1                   1            0.2 × 0.1 = 0.02
2                  2                   0            0.2 × 0.2 = 0.04
2                  3                   1            0.2 × 0.3 = 0.06
2                  4                   2            0.2 × 0.4 = 0.08
3                  1                   2            0.3 × 0.1 = 0.03
3                  2                   1            0.3 × 0.2 = 0.06
3                  3                   0            0.3 × 0.3 = 0.09
3                  4                   1            0.3 × 0.4 = 0.12
4                  1                   3            0.4 × 0.1 = 0.04
4                  2                   2            0.4 × 0.2 = 0.08
4                  3                   1            0.4 × 0.3 = 0.12
4                  4                   0            0.4 × 0.4 = 0.16

This table shows that Y takes the values 0, 1, 2 and 3. You can find the total probability for each value of Y by adding the individual probabilities:

P(Y = 0) = 0.01 + 0.04 + 0.09 + 0.16 = 0.30,
P(Y = 1) = 0.02 + 0.02 + 0.06 + 0.06 + 0.12 + 0.12 = 0.40,
P(Y = 2) = 0.03 + 0.08 + 0.03 + 0.08 = 0.22,
P(Y = 3) = 0.04 + 0.04 = 0.08.

So Y has the probability distribution shown below.

y           0      1      2      3      Total
P(Y = y)    0.30   0.40   0.22   0.08   1

The most likely value of Y is 1, because it has the highest probability.
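The table-building step in Example 6.1.2 can also be sketched in a few lines of code. The sketch below assumes that X takes the values 1, 2, 3 and 4 with probabilities 0.1, 0.2, 0.3 and 0.4 (only the differences between the values affect Y); the names `p_x` and `p_y` are chosen here for illustration. It multiplies the probabilities for each ordered pair of observations and adds the products for each value of Y, using exact fractions to avoid rounding.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed distribution of X: values 1 to 4 with probabilities 0.1 to 0.4.
p_x = {
    1: Fraction(1, 10),
    2: Fraction(2, 10),
    3: Fraction(3, 10),
    4: Fraction(4, 10),
}

# Y is the larger observation minus the smaller (0 if they are equal).
p_y = defaultdict(Fraction)
for (x1, p1), (x2, p2) in product(p_x.items(), repeat=2):
    # The two observations are independent, so the probabilities multiply.
    p_y[abs(x1 - x2)] += p1 * p2

for y in sorted(p_y):
    print(y, float(p_y[y]))

most_likely = max(p_y, key=p_y.get)
print("most likely value of Y:", most_likely)
```

The same loop works for any other rule combining two independent observations: only the expression `abs(x1 - x2)` needs to change.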
Exercise 6A

1 A fair coin is thrown four times. The random variable X is the number of heads obtained. Tabulate the probability distribution of X.

2 Two fair dice are thrown simultaneously. The random variable D is the difference between the smaller and the larger score, or zero if they are the same. Tabulate the probability distribution of D.

3 A fair dice is thrown once. The random variable X is related to the number N thrown on the dice as follows. If N is even, then X is half N; otherwise X is double N. Tabulate the probability distribution of X.

4 Two fair dice are thrown simultaneously. The random variable H is the highest common factor of the two scores. Tabulate the probability distribution of H, combining together all the possible ways of obtaining the same value.

5 When a four-sided dice is thrown, the score is the number on the bottom face. Two fair four-sided dice, each with faces numbered 1 to 4, are thrown simultaneously. The random variable M is the product of the two scores multiplied together. Tabulate the probability …

t           1   2    3    4    5
P(T = t)    c   2c   2c   2c   c

(a) Since the probabilities must sum to 1,
    c + 2c + 2c + 2c + c = 1, so 8c = 1, giving $c = \frac{1}{8}$.

(b) $P(T \le 3) = P(T = 1) + P(T = 2) + P(T = 3) = c + 2c + 2c = 5c = \frac{5}{8}$.

(c) $P(T > 3) = P(T = 4) + P(T = 5) = 2c + c = 3c = \frac{3}{8}$.

Example 6.2.2
A computer is programmed to give single-digit numbers X between 0 and 9 inclusive in such a way that the probability of getting an odd digit (1, 3, 5, 7, 9) is half the probability of getting an even digit (0, 2, 4, 6, 8). Find the probability distribution of X.

Let the probability of getting an even digit be c. Then the probability of getting an odd digit is $\frac{1}{2}c$. Since the probabilities must sum to 1,

$$c + \tfrac{1}{2}c + c + \tfrac{1}{2}c + c + \tfrac{1}{2}c + c + \tfrac{1}{2}c + c + \tfrac{1}{2}c = 1,$$

so $\frac{15}{2}c = 1$, which gives $c = \frac{2}{15}$.

The probability distribution of X is $P(X = x) = \frac{2}{15}$ for x = 0, 2, 4, 6 and 8, and $P(X = x) = \frac{1}{15}$ for x = 1, 3, 5, 7 and 9.

Exercise 6B

1 In the following probability distribution, c is a constant.
  Find the value of c.

4 The score S on a spinner is a random variable with distribution given by P(S = s) = k (s = 1, 2, 3, 4, 5, 6, 7, 8), where k is a constant. Find the value of k.

5 A cubical dice is biased so that the probability of an odd number is three times the probability of an even number. Find the probability distribution of the score.

6 A cubical dice is biased so that the probability of any particular score between 1 and 6 (inclusive) being obtained is proportional to that score. Find the probability of scoring a 1.

7 For a biased cubical dice the probability of any particular score between 1 and 6 (inclusive) being obtained is inversely proportional to that score. Find the probability of scoring a 1.

8 In the following probability distribution, c is a constant. Find the value of c.

6.3 Using a probability distribution as a model

So far, the discussion of probability distributions in this chapter has been very mathematical. At this point it may be helpful to point out the practical application of probability distributions. Probability distributions are useful because they provide models for experiments.

Consider again the random variable X, the score on a dice, whose probability distribution was given in Table 6.1. Suppose you actually threw a dice 360 times. Since the values 1, 2, 3, 4, 5 and 6 have equal probabilities, you would expect them to occur with approximately equal frequencies of, in this case, $\frac{1}{6} \times 360 = 60$. It is very unlikely that all the observed frequencies will be exactly equal to 60. However, if the model is a suitable one, the observed frequencies should be close to the expected values.

What conclusion would you draw about the dice if the observed frequencies were not close to the expected values?

Now look at the random variable Y, whose probability distribution was given in Table 6.3. For this variable the values are not equally likely and so you would not expect to observe approximately equal frequencies.
In Section 4.1 you met the idea that

$$\text{relative frequency} = \frac{\text{frequency}}{\text{total frequency}} \approx \text{probability}.$$

You can rearrange this equation to give an expression for the frequencies you would expect to observe:

$$\text{frequency} \approx \text{total frequency} \times \text{probability}.$$

For 360 observations of Y the expected frequencies will be about $360 \times \frac{1}{6} = 60$ for y = 1, 2, 3, 4, 5 and $360 \times \frac{1}{36} = 10$ for y = 7, 8, 9, 10, 11 and 12.

What will the expected frequencies be for 360 observations of the random variable W, whose probability distribution is given in Table 6.5?

Exercise 6C

1 A card is chosen at random from a pack and replaced. This experiment is carried out 520 times. State the expected number of times on which the card is
  (a) a club, (b) an ace, (c) a picture card (K, Q, J), (d) either an ace or a club or both, (e) neither an ace nor a club.

2 The biased dice of Exercise 6B Question 5 is rolled 420 times. State how many times you would expect to obtain
  (a) a one, (b) an even number, (c) a prime number.

3 The table below gives the cumulative probability distribution for a random variable R. 'Cumulative' means that the probability given is $P(R \le r)$, not $P(R = r)$.

r             0   1   2   3   4   5
P(R ≤ r)      …

… k for g = 5, 0 for all other values of g. Find the value of k and find the expected frequency of the result G = 3 when 1000 independent observations of G are made.

Miscellaneous exercise 6

1 Three cards are selected at random, without replacement, from a shuffled pack of 52 playing cards. Using a tree diagram, find the probability distribution of the number of honours (A, K, Q, J, 10) obtained.

2 An electronic device produces an output of 0, 1 or 3 volts, each time it is operated, with probabilities …, … and $\frac{1}{6}$ respectively. The random variable X denotes the result of adding the outputs for two such devices which act independently.
  (a) Tabulate the possible values of X with their corresponding probabilities.
  (b) In 360 independent operations of the device, state on how many occasions you would expect the outcome to be 1 volt. (OCR, adapted)

3
The probabilities of the scores on a biased dice are shown in the table below.

Score          1   2   3   4   5   6
Probability    …

  (a) Find the value of k.
  Two players, Hazel and Ross, play a game with this biased dice and a fair dice. Hazel chooses one of the two dice at random and rolls it. If the score is 5 or 6 she wins a point.
  (b) Calculate the probability that Hazel wins a point.
  (c) Hazel chooses a dice, rolls it and wins a point. Find the probability that she chose the biased dice. (OCR)

4 In an experiment, a fair cubical dice was rolled repeatedly until a six resulted, and the number of rolls was recorded. The experiment was conducted 60 times.
  (a) Show that you would expect to get a six on the first roll ten times out of the 60 repetitions of the experiment.
  (b) Find the expected frequency for two rolls, correct to one decimal place. (OCR, adapted)

5 The probability distribution of the random variable Y is given in the following table, where c is a constant. Prove that there is only one possible value of c and state this value.

7 The binomial distribution

This chapter introduces you to a discrete probability distribution called the binomial distribution. When you have completed it, you should
• know the conditions necessary for a random variable to have a binomial distribution
• be able to calculate probabilities for a binomial distribution
• know what the parameters of a binomial distribution are.

7.1 The binomial distribution

The spinner in Fig. 7.1 is an equilateral triangle. When it is spun it comes to rest on one of its three edges. Two of the edges are white and one is black. In Fig. 7.1 the spinner is resting on the black edge. This will be described as 'showing black'.

Fig. 7.1. A triangular spinner.

The spinner is fair, so the probability that the spinner shows black is $\frac{1}{3}$ and the probability that it shows white is $\frac{2}{3}$.

Suppose now that the spinner is spun on 5 separate occasions.
Let the random variable X be the number of times out of 5 that the spinner shows black.

To derive the probability distribution of X, it is helpful to define some terms. The act of spinning the spinner once is called a trial. A simple way of describing the result of each trial is to call it a success (s) when the spinner shows black, and a failure (f) when the spinner shows white. So X could now be defined as the number of successes in the 5 trials.

The event {X = 0} would mean that the spinner did not show black on any of its 5 spins. The notation $f_1$ will be used to mean that the first trial resulted in a failure, $f_2$ will mean that the second trial resulted in a failure, and so on, giving

$$\{X = 0\} = \{f_1 f_2 f_3 f_4 f_5\}.$$

Since the outcomes of the trials are independent,

$$P(X = 0) = P(\text{there are 5 failures}) = P(f_1 f_2 f_3 f_4 f_5) = P(f_1) \times P(f_2) \times P(f_3) \times P(f_4) \times P(f_5) = \tfrac{2}{3} \times \tfrac{2}{3} \times \tfrac{2}{3} \times \tfrac{2}{3} \times \tfrac{2}{3} = \left(\tfrac{2}{3}\right)^5.$$

The probability P(X = 1) is more complicated to calculate. The event {X = 1} means that there is one success and also four failures. One possible sequence of a success and four failures is $s_1 f_2 f_3 f_4 f_5$ (where $s_1$ denotes the event that the first trial was a success).

The probability that the first trial is a success and the other four trials are failures is

$$P(s_1 f_2 f_3 f_4 f_5) = P(s_1) \times P(f_2) \times P(f_3) \times P(f_4) \times P(f_5) = \tfrac{1}{3} \times \left(\tfrac{2}{3}\right)^4.$$

However, there are four other possible sequences for the event {X = 1}. They are

$$f_1 s_2 f_3 f_4 f_5, \quad f_1 f_2 s_3 f_4 f_5, \quad f_1 f_2 f_3 s_4 f_5, \quad f_1 f_2 f_3 f_4 s_5.$$

Therefore

$$P(X = 1) = P(s_1 f_2 f_3 f_4 f_5) + P(f_1 s_2 f_3 f_4 f_5) + P(f_1 f_2 s_3 f_4 f_5) + P(f_1 f_2 f_3 s_4 f_5) + P(f_1 f_2 f_3 f_4 s_5) = 5 \times \tfrac{1}{3} \times \left(\tfrac{2}{3}\right)^4 = \tfrac{80}{243}.$$

Notice that the probability for each individual sequence is the same as for any other sequence, namely $\left(\frac{1}{3}\right)\left(\frac{2}{3}\right)^4$, and that the 5 in the last line corresponds to the number of different sequences which give X = 1. You could have counted the number of sequences by using an argument involving combinations.
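There are 5 positions in each sequence, and one of the positions must be filled with a success (s) and the remaining four must all be failures (f). The choice of which position to place the s in can be made in any one of $\binom{5}{1}$ ways. Therefore

$$P(X = 1) = \binom{5}{1} \times \left(\tfrac{1}{3}\right)^1 \times \left(\tfrac{2}{3}\right)^4.$$

Similarly you could find that for X = 2 there are sequences such as $s_1 s_2 f_3 f_4 f_5$ and $f_1 s_2 f_3 s_4 f_5$. To see how many of these sequences there are, consider the following argument. There are 5 positions which have to be filled with 2 s's and 3 f's. After you choose the places in which to put the 2 s's, there is then no choice as to where the f's go. There are 5 places to choose from for the 2 s's. From Section 5.3, you have seen that the number of choices is $\binom{5}{2}$. Each of these choices has probability $\left(\frac{1}{3}\right)^2 \left(\frac{2}{3}\right)^3$. Using these results,

$$P(X = 2) = \binom{5}{2} \times \left(\tfrac{1}{3}\right)^2 \times \left(\tfrac{2}{3}\right)^3.$$

Continuing in this way, you can find the distribution of X, given in Table 7.2.

x    P(X = x)
0    $\left(\frac{2}{3}\right)^5 = \frac{32}{243}$
1    $\binom{5}{1}\left(\frac{1}{3}\right)^1\left(\frac{2}{3}\right)^4 = \frac{80}{243}$
2    $\binom{5}{2}\left(\frac{1}{3}\right)^2\left(\frac{2}{3}\right)^3 = \frac{80}{243}$
3    $\binom{5}{3}\left(\frac{1}{3}\right)^3\left(\frac{2}{3}\right)^2 = \frac{40}{243}$
4    $\binom{5}{4}\left(\frac{1}{3}\right)^4\left(\frac{2}{3}\right)^1 = \frac{10}{243}$
5    $\left(\frac{1}{3}\right)^5 = \frac{1}{243}$

Table 7.2. Probability distribution for the number of times out of 5 that the spinner shows black.

Notice that the sum of the probabilities, $\sum P(X = x)$, in the distribution table is 1. This is a useful check for the probabilities in any probability distribution table.

Notice also that as $\left(\frac{1}{3}\right)^0 = 1$ and $\left(\frac{2}{3}\right)^0 = 1$ you could write P(X = 0) as $\binom{5}{0}\left(\frac{1}{3}\right)^0\left(\frac{2}{3}\right)^5$ and P(X = 5) as $\binom{5}{5}\left(\frac{1}{3}\right)^5\left(\frac{2}{3}\right)^0$. These results enable you to write P(X = x) as a formula:

$$P(X = x) = \binom{5}{x}\left(\tfrac{1}{3}\right)^x\left(\tfrac{2}{3}\right)^{5-x}.$$

However, the formula on its own is not sufficient, because you must also give the values for which the formula is defined. In this case x can take integer values from 0 to 5 inclusive. So a more concise definition of the distribution of X than Table 7.2 would be

$$P(X = x) = \binom{5}{x}\left(\tfrac{1}{3}\right)^x\left(\tfrac{2}{3}\right)^{5-x} \quad \text{for } x = 0, 1, 2, 3, 4, 5.$$

Although the case of the spinner is not important in itself, it is an example of an important and frequently occurring situation.

• A single trial has just two possible outcomes (often called success, s, and failure, f).
• There is a fixed number of trials, n.
• The outcome of each trial is independent of the outcome of all the other trials.
• The probability of success at each trial, p, is constant.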
The random variable $X$, which represents the number of successes in $n$ trials of this experiment, is said to have a binomial distribution.

A consequence of the last condition is that the probability of failure will also be a constant, equal to $1 - p$. This probability is usually denoted by $q$, which means that $q = 1 - p$.

In a binomial distribution, the random variable $X$ has a probability distribution given by
\[ P(X = x) = \binom{n}{x} p^x q^{n-x} \quad \text{for } x = 0, 1, 2, \ldots, n. \]

You will see the reason for the name 'binomial' a little later.

Provided you are given the values of $n$ and $p$, you can evaluate all of the probabilities in the distribution table. The values of $n$ and $p$ are therefore the essential pieces of information about the probability distribution. In the example, $n$ was 5 and $p$ was $\tfrac{1}{3}$. You did not need to be told $q$, because its value is always $1 - p$, so in the example the value of $q$ was $\tfrac{2}{3}$.

The values of $n$ and $p$ are called the parameters of the binomial distribution. You need to know the parameters of a probability distribution to calculate the probabilities numerically.

To denote that a random variable $X$ has a binomial distribution with parameters $n$ and $p$, you write $X \sim B(n, p)$. So for the probability distribution in Table 7.2 you write $X \sim B\left(5, \tfrac{1}{3}\right)$.

Example 7.1.1
Given that $X \sim B\left(8, \tfrac{1}{4}\right)$, find (a) $P(X = 6)$, (b) $P(X \le 2)$, (c) $P(X > 0)$.

(a) Using the binomial probability formula with $n = 8$ and $p = \tfrac{1}{4}$ you get
\[ P(X = 6) = \binom{8}{6} \times \left(\tfrac{1}{4}\right)^6 \times \left(\tfrac{3}{4}\right)^2 = 0.003\,85, \text{ correct to 3 significant figures.} \]

(c) The easiest way to find $P(X > 0)$ is to use the fact that $P(X > 0)$ is the complement of $P(X = 0)$. So
\[ P(X > 0) = 1 - P(X = 0) = 1 - \binom{8}{0}\left(\tfrac{1}{4}\right)^0\left(\tfrac{3}{4}\right)^8 = 1 - 0.1001\ldots = 0.8998\ldots = 0.900, \text{ correct to 3 significant figures.} \]

To check that the binomial formula does represent a probability distribution you must show that $\sum P(X = x) = 1$. Consider the example involving the spinner, but use $p$ and $q$ instead of $\tfrac{1}{3}$ and $\tfrac{2}{3}$ respectively. Table 7.3 shows the distribution.

x    P(X = x)
0    $q^5$
1    $5pq^4$
2    $10p^2q^3$
3    $10p^3q^2$
4    $5p^4q$
5    $p^5$

Table 7.3.
Probability distribution for the number of times out of 5 that the spinner shows black.

If you sum the probabilities in the right column you get
\[ \sum P(X = x) = q^5 + 5pq^4 + 10p^2q^3 + 10p^3q^2 + 5p^4q + p^5. \]
The right side of this equation is the binomial expansion of $(q + p)^5$ (see P1 Chapter 9). You could check for yourself by multiplying out $(q + p)(q + p)(q + p)(q + p)(q + p)$. So, since $q + p = 1$,
\[ \sum P(X = x) = (q + p)^5 = 1^5 = 1. \]

You can use a similar argument to show that
\[ \sum_{x=0}^{n} P(X = x) = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} = (q + p)^n = 1. \]

The individual probabilities in the binomial distribution are the terms of the binomial expansion of $(q + p)^n$: these are two similar uses of the word 'binomial'.

A summary of the binomial distribution follows.

Binomial distribution
• A single trial has exactly two possible outcomes (success and failure) and these are mutually exclusive.
• A fixed number, $n$, of trials takes place.
• The outcome of each trial is independent of the outcome of all the other trials.
• The probability of success at each trial is constant.

The random variable $X$, which represents the number of successes in the $n$ trials of this experiment, has a probability distribution given by
\[ P(X = x) = \binom{n}{x} p^x q^{n-x} \quad \text{where } x = 0, 1, 2, \ldots, n, \]
where $p$ is the probability of success and $q = 1 - p$ is the probability of failure.

When the random variable $X$ satisfies these conditions, $X \sim B(n, p)$.

Exercise 7A
In this exercise give probabilities correct to 4 decimal places.

1 The random variable $X$ has a binomial distribution with $n = 6$ and $p = 0.2$. Calculate (a) $P(X = 3)$, (b) $P(X = \ldots)$, (c) $P(X = 6)$.

2 Given that $Y \sim B(7, \ldots)$, calculate (a) $P(Y = 4)$, (b) $P(Y = 6)$, (c) $P(Y \ldots)$.

3 Given that $Z \sim B(9, 0.45)$, calculate (a) $P(Z = 3)$, (b) $P(Z = 4 \text{ or } 5)$, (c) $P(Z > 7)$.

4 Given that $D \sim B(12, 0.7)$, calculate (a) $P(D < 4)$, (b) the smallest value of $d$ such that $P(D > d) < 0.90$.

5 Given that $H \sim B(\ldots)$, calculate the probability that $H$ is (a) exactly …, (b) 5 or 6, (c) at least …, (d) more than ….

6 Given that $S \sim B(7, \ldots)$, find the probability that $S$ is (a) exactly 3, (b) at least ….

7 If $X \sim B(n, p)$, show that
\[ P(X = r + 1) = \frac{(n - r)p}{(r + 1)(1 - p)} \times P(X = r) \quad \text{for } r = 0, 1, \ldots, n - 1. \]
If you have access to a spreadsheet, use this formula to construct tables for binomial probabilities.

8 Use the formula of Question 7 to prove that the mode of a binomial distribution (that is, the value of $r$ with the highest probability) satisfies $(n + 1)p - 1 \le \text{mode} \le (n + 1)p$. When is there equality?

7.2 Using the binomial distribution as a model

Before using the binomial distribution as a model for a situation you need to convince yourself that all the conditions are satisfied. The following example illustrates some of the problems that can occur.

Example 7.2.1
A school car park has 5 parking spaces. A student decides to do a survey to see whether this is enough. At the same time each day, she observes the number of spaces which are filled. Let $X$ be the number of spaces filled at this time on a randomly chosen day. Is it reasonable to model the distribution of the random variable $X$ with a binomial distribution?

She looks at each parking space to see whether it is occupied or not. This represents a single trial.

Are there exactly two outcomes for each trial (parking space), and are these mutually exclusive? In other words, is each parking space either occupied by a single car or not? The answer will usually be yes, but sometimes poorly parked vehicles will give the answer no.

Is there a fixed number of trials? The answer is yes. On each day there are 5 parking spaces available so the number of trials is 5.

Are the trials independent? This is not likely. Drivers may be less inclined to park in one of the centre spaces if it is surrounded by cars, because getting out of their own car may be more difficult.

Is the probability $p$ of success (in this case a parking space being filled by a car) constant? Probably not, because people may be more likely to choose the space closest to the school entrance, for example.

You can see that, when you are proposing to model a practical situation with a binomial distribution, many of the assumptions may be questionable and some may not be valid at all.
In this case, however, provided you are aware that the binomial model is far from perfect, you could still use it as a reasonable approximation. You might also have realised that you do not know the value of $p$ in this example, so you would have to estimate it. To do this you would divide the total number of cars observed by the total number of available car parking spaces, which in this case is 5 × (the number of days for which the survey was carried out).

Example 7.2.2
State whether a binomial distribution could be used in each of the following problems. If the binomial distribution is an acceptable model, define the random variable clearly and state its parameters.
(a) A fair cubical dice is rolled 10 times. Find the probability of getting three 4s, four 5s and three 6s.
(b) A fair coin is spun until a head occurs. Find the probability that eight spins are necessary, including the one on which the head occurs.
(c) A jar contains 49 balls numbered 1 to 49. Six of the balls are selected at random. Find the probability that four of the six have an even score.

(a) In this case you are interested in three different outcomes: a 4, a 5 and a 6. A binomial distribution depends on having only two possible outcomes, success and failure, so it cannot be used here.

(b) The binomial distribution requires a fixed number of trials, $n$, and this is not the case here, since the number of trials is unknown. In fact, the number of trials is the random variable of interest here.

(c) Whether a binomial model is appropriate or not depends on whether the selection of the balls is done with replacement or without replacement. If the selection is without replacement, then the outcome of each trial will not be independent of all the other trials. If the selection is with replacement, then define the random variable $X$ to be the number of balls with an even score out of six random selections. $X$ will then have a binomial distribution with parameters 6 and $\tfrac{24}{49}$. You write this as $X \sim B\left(6, \tfrac{24}{49}\right)$.
You are assuming, of course, that the balls are thoroughly mixed before each selection and that every ball has an equal chance of being selected.

Example 7.2.3
A card is selected at random from a standard pack of 52 playing cards. The suit of the card is recorded and the card is replaced. This process is repeated to give a total of 16 selections, and on each occasion the card is replaced in the pack before another selection is made. Calculate the probability that (a) exactly five hearts occur in the 16 selections, (b) at least three hearts occur.

Let $X$ be the number of hearts in 16 random selections (with replacement) of a playing card from a pack. Then $X$ satisfies all the conditions for a binomial distribution.

• Each trial consists of selecting a card from the pack, with replacement.
• Each trial has exactly two possible outcomes, and these are mutually exclusive: getting a heart is a success and not getting a heart is a failure. You may think that there are 52 possible outcomes for each trial, but you are only interested in whether the card is a heart or not a heart.
• The outcome of each trial is independent of any other trial. This is true since each card is replaced before the next one is selected. But you must ensure that each selection is random and that the cards are thoroughly shuffled before each selection.
• The probabilities of success and failure are constant. As the cards are replaced, $P(\text{selecting a heart}) = P(\text{success}) = \tfrac{1}{4}$, so this condition is fulfilled.

$X$ therefore has a binomial distribution with parameters $n = 16$ and $p = \tfrac{1}{4}$. That is, $X \sim B\left(16, \tfrac{1}{4}\right)$.

(a) Using the binomial formula,
\[ P(X = 5) = \binom{16}{5} \times \left(\tfrac{1}{4}\right)^5 \times \left(\tfrac{3}{4}\right)^{11} = 0.180, \text{ correct to 3 significant figures.} \]

(b) To find $P(X \ge 3)$, use the fact that $P(X \ge 3) = 1 - P(X \le 2)$.
\[ P(X \ge 3) = 1 - P(X \le 2) = 1 - \binom{16}{0}\left(\tfrac{1}{4}\right)^0\left(\tfrac{3}{4}\right)^{16} - \binom{16}{1}\left(\tfrac{1}{4}\right)^1\left(\tfrac{3}{4}\right)^{15} - \binom{16}{2}\left(\tfrac{1}{4}\right)^2\left(\tfrac{3}{4}\right)^{14} \]
\[ = 1 - 0.010\,02\ldots - 0.053\,45\ldots - 0.133\,63\ldots = 1 - 0.197\,11\ldots = 0.802\,88\ldots = 0.803, \text{ correct to 3 significant figures.} \]
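The two answers in Example 7.2.3 can be checked numerically. The following Python sketch is one possible way of doing so and is not part of the original text. The function names `binomial_pmf` and `prob_at_least` are invented for this illustration, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Return P(X = x) for X ~ B(n, p), using the binomial formula."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def prob_at_least(k, n, p):
    """Return P(X >= k) as the complement 1 - P(X <= k - 1)."""
    return 1 - sum(binomial_pmf(x, n, p) for x in range(k))


# Example 7.2.3: X ~ B(16, 1/4), the number of hearts in 16 selections.
print(f"{binomial_pmf(5, 16, 0.25):.3f}")   # 0.180
print(f"{prob_at_least(3, 16, 0.25):.3f}")  # 0.803
```

Working with the complement in `prob_at_least` mirrors part (b) of the example: only the three terms for 0, 1 and 2 hearts are added, rather than the fourteen terms for 3 to 16 hearts.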
7.3 Practical activities

1 Penalties or shots
(a) Select a group of students and ask them each to take either 8 penalties at football or 8 shots at basketball. For each student record the number of successful penalties or shots.
(b) Does the binomial distribution provide a reasonable model for these results? Is it necessary to use the same goalkeeper for all of the football penalties?
(c) Does the skill level of each person matter if the binomial distribution is to be a reasonable model? Is the basketball example more likely to be fitted by a binomial model than the football example?

Exercise 7B

1 In a certain school, 30% of the students are in the age group 16-19.
(a) Ten students are chosen at random. What is the probability that fewer than four of them are in the 16-19 age group?
(b) If the ten students were chosen by picking ten who were sitting together at lunch, explain why a binomial distribution might no longer have been suitable.

2 A factory makes large quantities of coloured sweets, and it is known that on average 20% of the sweets are coloured green. A packet contains 20 sweets. Assuming that the packet forms a random sample of the sweets made by the factory, calculate the probability that exactly seven of the sweets are green. If you knew that, in fact, the sweets could have been green, red, orange or brown, would it have invalidated your calculation?

3 Eggs produced at a farm are packaged in boxes of six. Assume that, for any egg, the probability that it is broken when it reaches the retail outlet is 0.1, independent of all other eggs. A box is said to be bad if it contains at least two broken eggs. Calculate the probability that a randomly selected box is bad. Ten boxes are chosen at random. Find the probability that just two of these boxes are bad. It is known that, in fact, breakages are more likely to occur after the eggs have been packed into boxes, and while they are being transported to the retail outlet.
Explain why this fact is likely to invalidate the calculation.

4 On a particular tropical island, the probability that there is a hurricane in any given month can be taken to be 0.08. Use a binomial distribution to calculate the probability that there is a hurricane in more than two months of the year. State two assumptions needed for a binomial distribution to be a good model. Why may one of the assumptions not be valid?

5 It is given that, at a stated time of day, 35% of the adults in the country are wearing jeans. At that time, a sample of twelve adults is selected. Use a binomial distribution to calculate the probability that exactly five out of these twelve are wearing jeans. Explain carefully two assumptions that must be made for your calculation to be valid. (If you say 'sample is random' you must explain what this means in the context of the question.)

6 Explain why a binomial distribution would not be a good model in the following problem. (Do not attempt any calculation.)
Thirteen cards are chosen at random from an ordinary pack. Find the probability that there are four clubs, four diamonds, three hearts and two spades.

7 Explain why the binomial distribution $B(6, 0.5)$ would not be a good model in each of the following situations. (Do not attempt any calculations.)
(a) It is known that 50% of the boys in a certain school are over 170 cm in height. They are arranged, for a school photograph, in order of ascending height. A group of six boys standing next to each other is selected at random. Find the probability that exactly three members of the sample are over 170 cm in height.
(b) It is known that, on average, the temperature in London reaches at least 20 °C on exactly half the days in the year. A day is picked at random from each of the months January, March, May, July, September and November. Find the probability that the temperature in London reaches 20 °C on exactly three of these six days.

8 A bag contains six red and four green counters.
Four counters are selected at random, without replacement. The events $A$, $B$, $C$ and $D$ represent obtaining a red counter on the first, second, third and fourth selection, respectively. Use a tree diagram to show that $P(A) = P(B) = P(C) = P(D) = 0.6$. Explain why the total number of red counters could not be well modelled by the distribution $B(4, 0.6)$.

The purpose of this and the preceding question is to illustrate that the properties 'the probability of a success is constant' and 'the outcomes are independent' are not the same, and you should try to distinguish carefully between them. Notice also that 'the outcomes are independent' is not the same thing as 'sampling with replacement'.

Miscellaneous exercise 7

1 The probability of a novice archer hitting a target with any shot is 0.3. Given that the archer shoots six arrows, find the probability that the target is hit at least twice. (OCR)

2 A computer is programmed to produce at random a single digit from the list 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. The program is run twenty times. Let $Y$ be the number of zeros that occur.
(a) State the distribution of $Y$ and give its parameters.
(b) Calculate $P(Y < 3)$.

3 A dice is biased so that the probability of throwing a 6 is 0.2. The dice is thrown eight times. Let $X$ be the number of '6's thrown.
(a) State the distribution of $X$ and give its parameters.
(b) Calculate $P(X > 3)$.

4 Joseph and four friends each have an independent probability 0.45 of winning a prize. Find the probability that
(a) exactly two of the five friends win a prize,
(b) Joseph and only one friend win a prize. (OCR)

5 A bag contains two biased coins: coin A shows Heads with probability 0.6, and coin B shows Heads with probability 0.25. A coin is chosen at random from the bag, and tossed three times.
(a) Find the probability that the three tosses of the coin show two Heads and one Tail in any order.
(b) Find the probability that the coin chosen was coin A, given that the three tosses result in two Heads and one Tail.
(OCR)

6 (a) A fair coin is tossed 4 times. Calculate the probabilities that the tosses result in 0, 1, 2, 3 and 4 heads.
(b) A fair coin is tossed 8 times. Calculate the probability that the first 4 tosses and the last 4 tosses result in the same number of heads.
(c) Two teams each consist of 3 players. Each player in a team tosses a fair coin once and the team's score is the total number of heads thrown. Find the probability that the teams have the same score. (OCR)

7 State the conditions under which the binomial distribution may be used for the calculation of probabilities.
The probability that a girl chosen at random has a weekend birthday in 1993 is $\tfrac{2}{7}$. Calculate the probability that, among a group of ten girls chosen at random,
(a) none has a weekend birthday in 1993,
(b) exactly one has a weekend birthday in 1993.
Among 100 groups of ten girls, how many groups would you expect to contain more than one girl with a weekend birthday in 1993? (OCR)

8 Show that, when two fair dice are thrown, the probability of obtaining a 'double' is $\tfrac{1}{6}$, where a 'double' is defined as the same score on both dice.
Four players play a board game which requires them to take it in turns to throw two fair dice. Each player throws the two dice once in each round. When a double is thrown the player moves forward six squares. Otherwise the player moves forward one square. Find
(a) the probability that the first double occurs on the third throw of the game,
(b) the probability that exactly one of the four players obtains a double in the first round,
(c) the probability that a double occurs exactly once in … of the first 5 rounds. (OCR)

9 Six hens are observed over a period of 20 days and the number of eggs laid each day is summarised in the following table.

Number of eggs    3    4    5    6
Number of days    …    …    …    …

Show that the mean number of eggs per day is 5.
It may be assumed that a hen never lays more than one egg in any day.
State one other assumption that needs to be made in order to consider a binomial model, with $n = 6$, for the total number of eggs laid in a day. State the probability that a randomly chosen hen lays an egg on a given day. Calculate the expected frequencies of 3, 4, 5 and 6 eggs. (OCR)

10 A Personal Identification Number (PIN) consists of 4 digits in order, each of which is one of the digits 0, 1, 2, …, 9. Susie has difficulty remembering her PIN. She tries to remember it and writes down what she thinks it is. The probability that the first digit is correct is 0.8 and the probability that the second digit is correct is 0.86. The probability that the first two digits are correct is 0.72. Find
(a) the probability that the second digit is correct given that the first digit is correct,
(b) the probability that the first digit is correct and the second digit is incorrect,
(c) the probability that the first digit is incorrect and the second digit is correct,
(d) the probability that the second digit is incorrect given that the first digit is incorrect.
The probability that all four digits are correct is 0.7. On 12 separate occasions Susie writes down independently what she thinks is her PIN. Find the probability that the number of occasions on which all four digits are correct is less than 10. (OCR)

8 Expectation and variance of a random variable

This chapter shows you how to calculate the mean and variance of a discrete random variable. When you have completed it, you should
• know the meaning of the notation $E(X)$ and $\mathrm{Var}(X)$
• be able to calculate the mean, $E(X)$, of a random variable $X$
• be able to calculate the variance, $\mathrm{Var}(X)$, of a random variable $X$
• be able to use the formulae $E(X) = np$ and $\mathrm{Var}(X) = np(1 - p)$ for a binomial distribution.

8.1 Expectation

A computer is programmed to produce a sequence of integers, $X$, from 0 to 3 inclusive, with probabilities as shown below.

x           0      1      2      3
P(X = x)    0.4    0.3    0.2    0.1

Suppose that a sequence of 100 integers is produced by the computer.
What would you expect the mean of these 100 values to be? It is not possible to answer this question exactly because you cannot tell how often each value will actually turn up in the sequence. However, it is possible to obtain an estimate of the mean value. You can estimate the frequency with which each integer occurs using frequency = total frequency × probability (see Section 6.3). You might expect there to be about 100 × 0.4 = 40 '0's, 100 × 0.3 = 30 '1's, 100 × 0.2 = 20 '2's and 100 × 0.1 = 10 '3's. The sum of these integers would be
\[ (0 \times 40) + (1 \times 30) + (2 \times 20) + (3 \times 10) = 100, \]
so their mean would be $\tfrac{100}{100} = 1$.

If you look at this calculation carefully, you will see that it is independent of the number of integers in the sequence. For example, if you had a sequence of 1000 integers, then the sum of the integers would be 10 times as great, but the estimate of the mean would stay the same. The same result can be obtained more directly by multiplying each value by its probability and summing. Using $p_i$ as a shortened form of $P(X = x_i)$ this gives
\[ \sum x_i p_i = (0 \times 0.4) + (1 \times 0.3) + (2 \times 0.2) + (3 \times 0.1) = 1. \]

The value which has just been calculated is a theoretical mean. It is denoted by $\mu$ (which is read as 'mu'), the Greek letter m, standing for 'mean'. The new symbol is used in order to distinguish the mean of a probability distribution from $\bar{x}$, the mean of a data set. The mean, $\mu$, of a probability distribution does not represent the mean of a finite sequence of numbers. It is the value to which the mean tends as the length of the sequence gets larger and larger, or, as mathematicians say, 'tends to infinity'. In practice, it is helpful to think of $\mu$ as the mean you would expect for a very, very long sequence. For this reason, $\mu$ is often called the expectation or expected value of $X$ and is denoted by $E(X)$.

The expectation of a random variable $X$ is defined by $E(X) = \mu = \sum x_i p_i$.

Example 8.1.1
Find the expected value of each of the variables $X$, $Y$ and $W$, which have the probability distributions given below.
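(a)
x           1      2      3      4      5      6
P(X = x)    1/6    1/6    1/6    1/6    1/6    1/6

(b)
y           1      2      3      4      5      7       8       9       10      11      12
P(Y = y)    1/6    1/6    1/6    1/6    1/6    1/36    1/36    1/36    1/36    1/36    1/36

(c)
w           2       3       4       5       6       7       8       9       10      11      12
P(W = w)    1/36    2/36    3/36    4/36    5/36    6/36    5/36    4/36    3/36    2/36    1/36

(a) $E(X) = \sum x_i p_i = \left(1 \times \tfrac{1}{6}\right) + \left(2 \times \tfrac{1}{6}\right) + \left(3 \times \tfrac{1}{6}\right) + \left(4 \times \tfrac{1}{6}\right) + \left(5 \times \tfrac{1}{6}\right) + \left(6 \times \tfrac{1}{6}\right) = 3\tfrac{1}{2}$.

You may have spotted that there is a quicker way to find the mean in this example. Since the distribution is symmetrical about $3\tfrac{1}{2}$, the mean must equal $3\tfrac{1}{2}$.

(b) $E(Y) = \sum y_i p_i = \left(1 \times \tfrac{1}{6}\right) + \left(2 \times \tfrac{1}{6}\right) + \left(3 \times \tfrac{1}{6}\right) + \left(4 \times \tfrac{1}{6}\right) + \left(5 \times \tfrac{1}{6}\right) + \left(7 \times \tfrac{1}{36}\right) + \left(8 \times \tfrac{1}{36}\right) + \left(9 \times \tfrac{1}{36}\right) + \left(10 \times \tfrac{1}{36}\right) + \left(11 \times \tfrac{1}{36}\right) + \left(12 \times \tfrac{1}{36}\right)$
$= (1 + 2 + 3 + 4 + 5) \times \tfrac{1}{6} + (7 + 8 + 9 + 10 + 11 + 12) \times \tfrac{1}{36} = 15 \times \tfrac{1}{6} + 57 \times \tfrac{1}{36} = 4\tfrac{1}{12}$.

(c) As in part (a), the probability distribution is symmetrical, in this case about 7, so $E(W) = 7$.

The variables $X$, $Y$ and $W$ were discussed in Section 6.1 in connection with the number of squares moved in a turn at three different board games. This calculation shows that you move round the board fastest in Game C and slowest in Game A.

Example 8.1.2
A random variable $R$ has the probability distribution shown below.

r           1      2    3      4
P(R = r)    0.1    a    0.3    b

Given that $E(R) = 3$, find $a$ and $b$.

Since $\sum P(R = r) = 1$, $0.1 + a + 0.3 + b = 1$, so $a + b = 0.6$.
Also $E(R) = 3$, so $\sum r P(R = r) = 3$, that is $1 \times 0.1 + 2 \times a + 3 \times 0.3 + 4 \times b = 3$, so $2a + 4b = 2$.
Solving these two equations simultaneously gives $a = 0.2$ and $b = 0.4$.

8.2 The variance of a random variable

Example 8.1.1 showed that the random variables $X$, $Y$ and $W$ have different means. If you compare the probability distributions (which are illustrated in Fig. 6.6), you will see that $X$, $Y$ and $W$ also have different degrees of spread. Just as the spread in a data set can be measured by the standard deviation or variance, so it is possible to define a corresponding measure of spread for a random variable. The symbol used for the standard deviation of a random variable is $\sigma$ (a small Greek s, read as 'sigma') and its square, $\sigma^2$, the variance of a random variable, is denoted by $\mathrm{Var}(X)$.

Before deriving a formula for $\mathrm{Var}(X)$, it is helpful to look at another method of arriving at the formula for $E(X)$. Suppose that you had a sequence of $n$ integers produced by the computer described in Section 8.1, and that the sequence contained $f_0$ '0's, $f_1$ '1's, $f_2$ '2's and $f_3$ '3's. The mean for these $n$ integers is given by
\[ \bar{x} = \frac{(0 \times f_0) + (1 \times f_1) + (2 \times f_2) + (3 \times f_3)}{n} = \frac{\sum x_i f_i}{n}. \]
The right side of the expression can be written slightly differently in the form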
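\[ \bar{x} = \left(0 \times \frac{f_0}{n}\right) + \left(1 \times \frac{f_1}{n}\right) + \left(2 \times \frac{f_2}{n}\right) + \left(3 \times \frac{f_3}{n}\right) = \sum x_i \times \frac{f_i}{n}. \]

Now consider what happens as $n$ becomes very large: the value of $\bar{x}$ tends to $\mu$ and the ratio $\frac{f_i}{n}$, which is the relative frequency, tends to the corresponding theoretical probability, $p_i$. This gives
\[ E(X) = \mu = \sum x_i p_i, \qquad (8.1) \]
which was the result obtained in Section 8.1.

Now consider the formula given in Equation 3.3 for the variance of a data set. Replacing $\sum f_i$ by $n$ and rearranging gives
\[ \text{variance} = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i} = \sum (x_i - \bar{x})^2 \times \frac{f_i}{n}. \]
Again consider what happens when $n$ becomes large. The ratio $\frac{f_i}{n}$ tends to $p_i$, and $\bar{x}$ tends to $\mu$, giving
\[ \sigma^2 = \mathrm{Var}(X) = \sum (x_i - \mu)^2 p_i. \qquad (8.2) \]

Alternatively, starting from Equation 3.4 for the variance of a data set, replacing $\sum f_i$ by $n$ and rearranging gives
\[ \text{variance} = \frac{\sum x_i^2 f_i}{\sum f_i} - \bar{x}^2 = \sum x_i^2 \times \frac{f_i}{n} - \bar{x}^2. \]
When $n$ becomes large, $\frac{f_i}{n}$ tends to $p_i$ and $\bar{x}$ tends to $\mu$, giving
\[ \mathrm{Var}(X) = \sum x_i^2 p_i - \mu^2. \qquad (8.3) \]

The variance of a random variable $X$ is defined by
\[ \sigma^2 = \mathrm{Var}(X) = \sum (x_i - \mu)^2 p_i = \sum x_i^2 p_i - \mu^2. \]
The standard deviation of a random variable is $\sigma$, the square root of $\mathrm{Var}(X)$.

In practice it is usually simpler to calculate $\mathrm{Var}(X)$ from Equation 8.3 rather than from Equation 8.2.

Example 8.2.1
Calculate the standard deviation of the random variable $X$ in Example 8.1.1, using Equation 8.3.

First calculate $\sum x_i^2 p_i$.
\[ \sum x_i^2 p_i = \left(1^2 \times \tfrac{1}{6}\right) + \left(2^2 \times \tfrac{1}{6}\right) + \left(3^2 \times \tfrac{1}{6}\right) + \left(4^2 \times \tfrac{1}{6}\right) + \left(5^2 \times \tfrac{1}{6}\right) + \left(6^2 \times \tfrac{1}{6}\right) = \left(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2\right) \times \tfrac{1}{6} = 15\tfrac{1}{6}. \]
From Example 8.1.1, $\mu = E(X) = 3\tfrac{1}{2}$.
Using Equation 8.3, $\mathrm{Var}(X) = \sum x_i^2 p_i - \mu^2 = 15\tfrac{1}{6} -$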
