(a) When there is a non-linear relationship; (b) when distinct subgroups are present.
In both of these examples the correlation coefficient quoted is spurious.
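Both situations can be reproduced with a few made-up numbers. The sketch below is only an illustration: the data are invented, and `pearson_r` is a small helper written for this example, not part of any statistics package. It shows (a) a curved relationship where y depends entirely on x yet r is zero, and (b) two subgroups with no association inside either group that give a large r when pooled.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# (a) Non-linear relationship (invented data): y is fully determined by x,
# but the relationship is a curve, so the linear correlation is zero.
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [x ** 2 for x in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# (b) Distinct subgroups (invented data): no linear association inside
# either group, but the gap between the groups produces a large pooled r.
x_a, y_a = [1, 2, 3], [2, 1, 2]
x_b, y_b = [11, 12, 13], [12, 11, 12]
r_a = pearson_r(x_a, y_a)
r_b = pearson_r(x_b, y_b)
r_pooled = pearson_r(x_a + x_b, y_a + y_b)

print(f"non-linear relationship: r = {r_curve:.3f}")
print(f"subgroup A alone:        r = {r_a:.3f}")
print(f"subgroup B alone:        r = {r_b:.3f}")
print(f"both subgroups pooled:   r = {r_pooled:.3f}")
```

In both cases the single number r gives a poor summary of the data, which is why a scatter plot should be inspected before a correlation coefficient is quoted.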
Spurious correlations crop up all the time:
The price of petrol shows a positive correlation with the divorce rate over time.
Number of deaths from heart attacks in a population rises with incidence of long-sightedness over time.
Maximum daily air temperature and number of deaths of cattle were positively correlated during March 2001.
If we repeatedly measure two variables on the same individual over a period
of time, e.g. a child's height and ability to read, then we will tend to see a
correlation simply because both change as time passes.
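The effect of a shared time trend can be sketched with invented numbers. In the example below, the height and reading scores are hypothetical: each is a steady yearly increase plus an unrelated wiggle, so time is the only thing the two series have in common. Correlating the year-to-year changes instead of the raw values is one common check; it is an addition for illustration, not a method taken from these notes.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def changes(series):
    """Differences between consecutive measurements."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


years = list(range(10))  # ten yearly measurements on one child

# Hypothetical data: a steady yearly increase plus an unrelated wiggle.
wiggle_height = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
wiggle_reading = [1, 1, -1, -1, 1, 1, -1, -1, 1, 1]
height = [100 + 6 * t + w for t, w in zip(years, wiggle_height)]
reading = [20 + 10 * t + w for t, w in zip(years, wiggle_reading)]

# Raw values: both rise with age, so r is very close to 1.
r_levels = pearson_r(height, reading)

# Year-to-year changes: the shared trend is removed and r collapses.
r_changes = pearson_r(changes(height), changes(reading))

print(f"r for raw values:           {r_levels:.3f}")
print(f"r for year-to-year changes: {r_changes:.3f}")
```

The large r for the raw values reflects nothing more than the fact that both quantities increase as the child grows older.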
Spearman rank correlation coefficient
Linear regression
Regression analysis fits the best line to the observed data and allows us to
make predictions about one variable from the values of the other.
One variable (the independent variable) is assumed to predict the other (the
dependent variable). The results are not the same if we swap the variables.
The values of the independent variable may be selected by the investigator.
These values do not have to be normally distributed.
There are other assumptions and requirements of a regression analysis (the
relationship is approximately linear, the residuals are normally
distributed, etc.).
Regression analysis is best carried out under the guidance of a statistician.
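As a rough illustration of the points above, the following sketch fits a least-squares line to five invented data points. The helper `fit_line` and the data are assumptions made for this example only. It shows how a fitted line is used for prediction, and that regressing y on x gives a different line from regressing x on y.

```python
def fit_line(xs, ys):
    """Least-squares regression of ys on xs; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# x as the independent variable: y = a_yx + b_yx * x
a_yx, b_yx = fit_line(x, y)

# Variables swapped, y as the independent variable: x = a_xy + b_xy * y
a_xy, b_xy = fit_line(y, x)

# Prediction of y for a new value of x, using the first fit.
predicted_y = a_yx + b_yx * 6

# Residuals of the first fit: observed y minus fitted y.
residuals = [yi - (a_yx + b_yx * xi) for xi, yi in zip(x, y)]

print(f"y on x: y = {a_yx:.2f} + {b_yx:.2f} x")
print(f"x on y: x = {a_xy:.2f} + {b_xy:.2f} y")
print(f"predicted y at x = 6: {predicted_y:.2f}")
print("residuals:", [round(r, 2) for r in residuals])
```

Rearranging the second line to express y in terms of x does not reproduce the first line, so the choice of independent variable matters. The residuals computed here are the quantities whose distribution a regression analysis makes assumptions about.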