Running Head: Motion Picture Industry

Module 2: Data Presentations

MTH410 Quantitative Business Analysis
Colorado State University Global Campus
Nicole Eddy
December 9, 2016

Variables Contributing to the Success of a Motion Picture
Introduction
This report uses descriptive statistics to analyze several variables commonly associated with the success of a
movie, and it explains each statistic as it is introduced. The analysis is based on a sample of 100 motion
pictures released in 2012 and considers 4 variables: opening gross, total gross, number of theaters, and number
of weeks in theaters. Ultimately, one of the best indicators of success for any product produced by any type of
business is profit. However, since profit is not one of the variables available, we will assume that a higher
total gross is what makes a particular movie more successful.
Table 1: Descriptive Statistics
Variable            Mean          Median       Range         Standard Deviation   Minimum Value   Maximum Value
Opening Gross ($)   30,219,994    20,962,718   207,407,098   33,166,537           31,610          207,438,708
Total Gross ($)     100,348,852   64,055,593   594,522,382   95,827,983           28,835,528      623,357,910
Theaters            3,130         3,115        3,480         617                  924             4,404
Weeks               16            15           30            6                    5               35
Mean
The mean is considered one of the most important measures of location for a variable as it provides a
measure of central location for the data (Anderson et al., 2015). Table 1 lists the mean values for the 4
variables of interest. The data in Table 1 were produced with common Excel functions; for example, the mean
was found with the =AVERAGE(DATA RANGE) formula. For the 4 variables relevant to this report, the mean
opening gross was about $30 million; the mean total gross was about $100 million; the mean number of theaters
was 3,130; and the mean number of weeks was 16. In other words, on average a movie in this sample stays in
theaters for about 16 weeks, or roughly 4 months, appears in 3,130 theaters, has an opening gross of about
$30 million, and earns a total gross of about $100 million.
If success were defined as being better than average, then any movie whose opening gross, total gross, number
of theaters, or number of weeks was greater than the mean could be considered successful for that particular
variable. Treating total gross as the most important variable, 31 of the 100 movies in our data set had a total
gross above the mean of about $100 million. Interestingly, a total gross of over $100 million did not mean that
the movie's other 3 variables were also above their respective means. For example, consider Silver Linings
Playbook, whose opening gross was only $443,003, well below the sample mean opening gross of $30,219,994.
This suggests that the movie may have had a weak opening weekend or faced a lot of competition during its
opening, but earned more money later on once it had been recognized as a good movie.
Median
The median is also a measure of central location for a data set. It can be obtained by arranging one's data
in ascending order and then identifying the middle value. In this report, we used the =MEDIAN(DATA RANGE)
formula from Excel to find the median of each of our 4 variables of interest. When a data set has some extreme
values, the median is sometimes preferred to the mean as a measure of central tendency (Anderson et al., 2015).
For our data set, the medians of all 4 variables are smaller than their respective means. This suggests that
the data for each of the 4 variables are skewed to the right.
If one were to arrange the movies in ascending order of each variable in turn, one would find 50 movies with a
total gross above the median of $64,055,593; 50 movies with an opening gross above the median of about
$21.0 million; 50 movies that were shown in 3,115 theaters or more; and 50 movies that were shown for 15 weeks
or more. It is important to remember that the 50 movies above the median are not necessarily the same for each
variable.
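The relationship between the mean and the median can be illustrated with a minimal Python sketch. The figures below are hypothetical values in millions of dollars, not the report's data; a single very large value stands in for a blockbuster and pulls the mean above the median, as happens in a right-skewed sample.

```python
from statistics import mean, median

# Hypothetical total-gross figures in millions of dollars (NOT the report's data).
# One very large value mimics a blockbuster in an otherwise modest sample.
total_gross = [30, 45, 50, 64, 70, 95, 620]

avg = mean(total_gross)    # arithmetic mean, comparable to Excel's AVERAGE
mid = median(total_gross)  # middle value of the sorted data, comparable to Excel's MEDIAN

print(f"mean = {avg:.1f}, median = {mid}")
print("mean exceeds median (hint of right skew):", avg > mid)
```

The median is unaffected by how large the biggest value is, which is why it is often preferred when a sample contains extreme values.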
Range
Another important characteristic to consider when analyzing a dataset is the variability, or dispersion, of the
data. One simple measure of variability is the range. To find the range, we subtract the smallest value in the
data set from the largest value. In our case, this is 623,357,910 - 28,835,528 = 594,522,382 for the total
gross; 207,407,098 for the opening gross; 3,480 for the number of theaters; and 30 for the number of weeks in
theaters. Since computing the range takes into consideration only 2 values of the dataset, it is highly
sensitive to extreme values and is not the best measure of variability compared to others.
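As a quick check, the ranges can be recomputed from the minimum and maximum values reported in Table 1. The sketch below is only an illustration; the dictionary layout and variable names are choices made here, not part of the original analysis.

```python
# Minimum and maximum values as reported in Table 1.
table_1 = {
    "Opening Gross ($)": (31_610, 207_438_708),
    "Total Gross ($)": (28_835_528, 623_357_910),
    "Theaters": (924, 4_404),
    "Weeks": (5, 35),
}

# Range = maximum - minimum; only these two observations are used.
ranges = {name: hi - lo for name, (lo, hi) in table_1.items()}

for name, value in ranges.items():
    print(f"{name}: {value:,}")
```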
Standard Deviation
The standard deviation is the positive square root of the variance and is convenient because it has the same
units as the data it describes. It is a measure of the spread of the data. The variance of a sample is computed
by squaring each deviation about the mean, summing those squared deviations, and dividing the sum by n - 1,
where n is the number of data points available. If one knows only the mean or median of a sample and does not
know something like the standard deviation or variance, then one can have a poorly informed understanding of
the data. For example, data sets whose distributions are bell-shaped, bi-modal, or multi-modal can all have the
same mean. If one knows the standard deviation alongside the mean, then one has a better understanding of how
spread out the data are around the mean.
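Written out, with x_i denoting the individual observations, x-bar the sample mean, and n the sample size (notation chosen here for illustration), the sample variance and sample standard deviation are:

```latex
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1},
\qquad
s = \sqrt{s^2}
```

Each deviation is squared before the sum is taken; the unsquared deviations about the mean always add up to zero, so they cannot measure spread on their own.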
For our movies, the standard deviations are as follows: $33 million for opening gross, $96 million for total
gross, 617 theaters, and 6 weeks. Comparing the standard deviations for opening gross and total gross, we see
that the opening gross has a considerably smaller standard deviation than the total gross. This means that the
opening gross of a movie tends to be less spread out around its mean than the total gross of a movie.
z-Score
Closely related to the standard deviation discussed in the previous section is the z-Score. The z-Score
measures the relative location of a particular value with respect to the mean and is calculated by taking a
datum for a variable, subtracting the mean for that variable from that datum, and dividing the result by the
sample standard deviation. The z-Scores for the highest-grossing movie, The Avengers, are as follows: opening
gross 5.3; total gross 5.5; theaters 2.0; and weeks 1.1. This means that The Avengers was 5.3, 5.5, 2.0, and
1.1 standard deviations above the mean for opening gross, total gross, theaters, and weeks, respectively. This
tells us that the most successful film stood out from the other movies far more in its opening gross and total
gross than it did in the number of theaters or the number of weeks it was in theaters. This suggests that the
most successful movie, in terms of total gross, distinguishes itself from the other movies more in those 2
characteristics than it does in the other 2. In other words, the number of theaters or the number of weeks a
movie is in theaters seems to be a less relevant characteristic to its overall success than the other 2
variables.
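The z-Score calculation can be reproduced from the figures in Table 1. The sketch below assumes that the Table 1 maximum values for opening gross and total gross belong to The Avengers, which is consistent with the z-Scores reported for that film; the function name z_score is a label chosen here for illustration.

```python
def z_score(value, sample_mean, sample_sd):
    """Number of sample standard deviations that value lies from the mean."""
    return (value - sample_mean) / sample_sd

# Means, standard deviations, and maxima taken from Table 1.
# Assumption: the maxima are The Avengers' opening and total gross.
opening_z = z_score(207_438_708, 30_219_994, 33_166_537)
total_z = z_score(623_357_910, 100_348_852, 95_827_983)

print(f"opening gross z = {opening_z:.2f}")
print(f"total gross z = {total_z:.2f}")
```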
Another use of the z-Score is to identify outliers. It is suggested that any observation with a z-Score less
than -3 or greater than 3 be treated as an outlier (Anderson et al., 2015). Tables 2 through 5 list the
outliers for each variable in our data set.
Table 2: Opening Gross
Movie Title                                  z-Score
Marvel's The Avengers                        5.34
The Dark Knight Rises                        3.94
The Hunger Games                             3.69
The Twilight Saga: Breaking Dawn Part 2      3.34

Table 3: Total Gross
Movie Title                                  z-Score
Marvel's The Avengers                        5.46
The Dark Knight Rises                        3.63
The Hunger Games                             3.21

Table 4: Number of Theaters
Movie Title                                  z-Score
Moonrise Kingdom                             -3.57

Table 5: Number of Weeks
Movie Title                                  z-Score
Life of Pi                                   3.46
Each movie listed above is an outlier for the category in which it appears. For example, Moonrise Kingdom has a
z-Score of -3.57, which means that its theater count was about 3.6 standard deviations below the sample mean.
This is understandable, as it was released in only 924 theaters whereas the mean for our data set is 3,130
theaters. Remarkably, despite being shown in the fewest theaters of any movie in our set, it was not the
last-ranked movie in terms of the amount of money it made. It actually ranked 74th, with a total gross of about
45 million dollars.
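The outlier rule can be sketched in a few lines of Python. The theater counts below are hypothetical values invented for illustration, not the report's data; one unusually small count plays the role that Moonrise Kingdom plays in Table 4.

```python
from statistics import mean, stdev

# Hypothetical theater counts for 20 films (NOT the report's data).
theaters = [3000, 3100, 2900, 3200, 2800, 3050, 2950, 3150, 2850, 3300,
            2700, 3000, 3100, 2900, 3250, 2750, 3400, 2600, 3000, 900]

m = mean(theaters)
s = stdev(theaters)  # sample standard deviation (n - 1 denominator)

# Flag observations whose z-Score is below -3 or above 3.
outliers = [(x, round((x - m) / s, 2)) for x in theaters if abs((x - m) / s) > 3]
print(outliers)
```

Only the film shown in 900 theaters is flagged here; every other count lies within about one standard deviation of the mean.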
Correlation Coefficient
The correlation coefficient is computed by dividing the sample covariance by the product of the sample
standard deviation of x and the sample standard deviation of y (Anderson et al., 2015). As the definition
shows, it compares 2 different variables using their covariance and sample standard deviations. In essence, it
is a tool that helps identify a positive, a negative, or no linear correlation between the two variables. If
the sample correlation coefficient is close to 1, this implies a strong positive linear relationship. If the
sample correlation coefficient is close to -1, it implies a strong negative linear relationship. And if the
sample correlation coefficient is close to 0, it implies either no linear relationship or a very weak one.
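The definition above translates directly into code. The following is a minimal sketch, with hypothetical opening and total gross figures in millions of dollars that are not the report's data; the function name sample_correlation is chosen here for illustration.

```python
from statistics import mean, stdev

def sample_correlation(x, y):
    """Sample covariance divided by the product of the sample standard deviations."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    return cov_xy / (stdev(x) * stdev(y))

# Hypothetical figures in millions of dollars (NOT the report's data).
opening = [5, 12, 20, 33, 48, 90]
total = [30, 41, 66, 95, 150, 260]

r = sample_correlation(opening, total)
print(f"r = {r:.2f}")
```

Because the covariance and both standard deviations use the same n - 1 denominator, the result always falls between -1 and 1.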
Table 6 below was generated using Excel's CORREL function, which returns the correlation coefficient of 2
arrays.
Table 6: Correlation Coefficients
Total Gross & Opening Gross   Total Gross & Theaters   Total Gross & Weeks
0.93                          0.58                     0.42
The above correlation coefficients show that there is a strong linear relationship between the total gross of a
film and its opening gross, a weaker linear relationship between a film's total gross and the number of
theaters it is shown in, and an even weaker linear relationship between the total gross of a film and the
number of weeks it is in theaters. It is important to remember that we have only found positive linear
relationships, which does not mean that an increase in one of the variables causes an increase in the other. In
other words, the two variables are correlated, but one does not necessarily cause the other. To support these
results, Figures 1 through 3 are scatter diagrams that plot the total gross against each of the 3 other
variables, each with a trend line that models the best linear fit to the data. In Figure 1, which corresponds
to the correlation coefficient closest to 1, many more of the data points fall near the line than in the other
2 figures, which plot total gross versus theaters and total gross versus weeks. Again, this supports the claim
that there is a strong positive linear relationship between the total and opening gross of a film. In other
words, if a film made a lot of money in its opening weekend, then it was likely to have made more money overall.

Figure 1: Total Gross ($) vs. Opening Gross ($) [scatter diagram with trend line]
Figure 2: Total Gross ($) vs. Theaters [scatter diagram with trend line]
Figure 3: Total Gross ($) vs. Weeks [scatter diagram with trend line]
Conclusion
In summary, we have looked at measures of location, measures of variability, measures of relative
location, and measures of association to ascertain some important characteristics of successful movies in the
film industry. We found that the average movie (n=100) was in theaters for approximately 16 ± 6 weeks
(mean ± standard deviation), appeared in 3,130 ± 617 theaters, had an opening gross of about 30 ± 33 million
dollars, and had a total gross of about 100 ± 96 million dollars. We also found that the most successful movie
in our sample, based on total gross, was The Avengers, with a total gross z-Score of 5.46. Separately, we
noticed that a movie did not necessarily have to be shown in a lot of theaters to be somewhat successful, as
with Moonrise Kingdom, whose z-Score for the number of theaters was -3.57. Lastly, when we compared the total
gross with each of the 3 other variables using the correlation coefficient, we found a positive linear
relationship in each comparison; the strongest in our sample was between the opening gross and total gross of a
film, with a correlation coefficient of 0.93.
Using this sample of movies released in 2012, we can make some tentative inferences about the larger
population of movies. Of course, the sample is most characteristic of the conditions under which it was
obtained, so conclusions drawn from it become less reliable when applied to less closely related data, for
example a sample of movies released 10 years from now.
References
Anderson, D. R., Sweeney, D. J., Williams, T. A., Camm, J. D., & Cochran, J. J. (2015). Essentials of statistics
for business and economics (7th ed.). Mason, OH: South-Western Cengage Learning.
