Mailbox# 272

The data on this sheet are from 831 coffee-growing households in Southern Mexico. Yields refers to the total kilograms of coffee produced per hectare; farm_size refers to the farm's size in hectares of coffee; hh_migrant equals 1 if the household has a migrant living elsewhere in Mexico or the U.S. and zero otherwise.

1. Many people think that small farms tend to be more productive than large farms because they use their land more efficiently. Is this true for coffee growers in Southern Mexico? Do large farms have higher yields?

a. Make a scatter plot in Excel. Put farm size on the horizontal axis and yields on the vertical axis.
What type of relationship, if any, is suggested by the scatter plot?
The scatter plot does not suggest even a roughly linear relationship, so the correlation coefficient does not tell us much.
Does the scatterplot reveal any outliers?
Yes
b. Calculate the correlation coefficient between yields and farm size.
What is the nature of the relationship (positive or negative?), and is the correlation coefficient statistically significant at the 1 percent level (see Table C.5 in the textbook)?
The relationship is negative, which indicates an inverse relationship between the variables: as farm size increases, yields tend to decrease. The correlation coefficient is -0.0326. The critical value at alpha = 1% with more than 100 degrees of freedom is less than 0.254. Since |r_obs| < r_crit, we fail to reject H0.
Conclusion: There is no evidence of a relationship between total kilograms of coffee produced per hectare (yield) and farm size in hectares of coffee (farm_size).
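As a rough check on the significance test above, the critical value of r can be computed rather than bounded from Table C.5. The Python sketch below assumes that all 831 households enter the correlation (no missing values) and uses the standard normal quantile as a large-sample stand-in for the t critical value. Both are assumptions made for illustration, not figures taken from this sheet.

```python
import math
from statistics import NormalDist

r_obs = -0.0326   # correlation reported above
n = 831           # assumed: every household has both variables
df = n - 2

# Large-sample approximation: with df = 829 the t critical value is
# very close to the standard normal quantile (two-sided, alpha = 1%).
t_crit = NormalDist().inv_cdf(1 - 0.01 / 2)

# Convert the critical t into a critical correlation.
r_crit = t_crit / math.sqrt(t_crit**2 + df)

# t statistic implied by the observed correlation.
t_obs = r_obs * math.sqrt(df) / math.sqrt(1 - r_obs**2)

print(f"r_crit ~ {r_crit:.3f}, t_obs = {t_obs:.2f}")
print("reject H0" if abs(r_obs) > r_crit else "fail to reject H0")
```

Under these assumptions the critical correlation is roughly 0.09, which is below the 0.254 bound from the table and still well above |r_obs|, so the conclusion is unchanged.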
c. How much does the correlation coefficient change when excluding farms larger than 25 hectares? In Stata you can do this with the following syntax: corr yield farm_size if farm_size <25
The correlation when excluding farms larger than 25 hectares is 0.0092. The coefficient has become positive, which would indicate a positive relationship between farm size and yields: as farm size increases, yields tend to increase as well. However, the absolute value of the correlation coefficient does not change much (|0.0092| compared with |-0.0326|), and compared with the critical value at alpha = 1% with more than 100 degrees of freedom (less than 0.254), this correlation coefficient is not significant.

d. Regress yields on farm size and interpret the coefficient on the farm size variable.
Slope of the regression line: b = -1.68665 (negative, matching the sign of the full-sample correlation), which means that each 1-hectare increase in farm size is associated with a decrease in yield of 1.68665 kilograms of coffee per hectare. The Y-intercept is a = 231.7886.
Equation of the regression line: Y^ = -1.68665 X + 231.7886

e. How does removing the outlier farms affect this result? (Hint: do "regress yield farm_size if farm_size <25").
Removing the outliers changes the slope, which becomes 0.6745775. Each 1-hectare increase in farm size is now associated with an increase in yield of 0.6745775 kilograms of coffee per hectare.

f. How large is a 0.67 increase in yields as a percent of the average yield of all sample farms? Is this an economically important effect?
A 0.67 increase in yields is around 0.3% of the average yield. A 0.3% increase in yields for each additional hectare of farm size would be economically important only if the farm is very large and the price per kilogram of coffee is good. It would not be important if the farm is not large and the price per kilogram of coffee is not good.

2. Many people think that migration of members from agrarian households can improve agricultural productivity. The thought is that migrants send money back to their home household, which enables the household to finance improvements in the farm. Regress yields on hh_migrant.

a. What is the average yield of households that do not have a migrant?
The average yield of households that do not have a migrant is 246.3474 kilograms per hectare.

b. What is the average difference in yields between households with and without a migrant?
With a migrant, the average yield is (-36.45724) + 246.3474 = 209.8902 kilograms per hectare. Without a migrant, the average yield is 246.3474 kilograms per hectare. So the average difference in yields between households with and without a migrant is 36.45724 kilograms per hectare, with migrant households yielding less.

c. How large is the "migrant" effect on yields? (Discuss the migrant effect as a percentage change relative to the average non-migrant yield.)
The migrant effect on yields is 36.45724/246.3474 = 0.1480, or 14.80%. This means that yields are about 14.8% lower for households with a migrant than for households without one.

d. Suppose that the only thing we know about a household is whether or not it has a migrant. If we used that information to predict the yield of the household's coffee farm, how far off would our prediction be on average?
In this case, R2 = 0.0108 and adjusted R2 = 0.0096, which means that the independent variable, hh_migrant, explains 0.96% of the variability of the dependent variable, yields. The regression model is statistically significant, F(1, 829) = 9.08, p = 0.0027, so hh_migrant is a statistically significant predictor of yields. Even so, because the model explains only about 1% of the variation, a prediction based on migrant status alone would be off, on average, by nearly as much as a prediction that uses only the overall mean yield.
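The question about average prediction error is usually answered with the Root MSE line of the Stata regress output, which is not reproduced on this sheet. The Python sketch below does not recover that number. It only checks that the reported R2 is consistent with the reported F statistic, and shows how close the Root MSE must be to the standard deviation of yields given the adjusted R2. The sample size of 831 is taken from the data description and is assumed to apply to this regression.

```python
import math

n = 831           # households in the sample (consistent with F(1, 829))
f_stat = 9.08     # F statistic reported above
r2 = 0.0108       # R-squared reported above
df_resid = n - 2  # one regressor plus a constant

# With a single regressor, R2 = F / (F + residual df).
r2_from_f = f_stat / (f_stat + df_resid)

# Adjusted R2 from R2, for comparison with the reported 0.0096.
adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid

# Root MSE divided by the sample standard deviation of yields.
rmse_over_sd = math.sqrt(1 - adj_r2)

print(f"R2 from F:    {r2_from_f:.4f}")
print(f"adjusted R2:  {adj_r2:.4f}")
print(f"Root MSE / SD of yields: {rmse_over_sd:.4f}")
```

The ratio comes out just under 1, meaning that knowing migrant status shrinks the typical prediction error by only about half a percent relative to predicting the overall mean for every household.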