Template:Normal distribution rank regression on Y

Rank Regression on Y
Performing rank regression on Y requires that a straight line be fitted to a set of data points such that the sum of the squares of the vertical deviations from the points to the line is minimized.

The least squares parameter estimation method (regression analysis) was discussed in the Parameter Estimation chapter, where the following equations for regression on Y were derived:


 * $$\begin{align}\hat{a}= & \bar{y}-\hat{b}\bar{x} \\
=& \frac{\sum_{i=1}^N y_{i}}{N}-\hat{b}\frac{\sum_{i=1}^{N}x_{i}}{N}\\ \end{align} $$

and:


 * $$\hat{b}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}{{y}_{i}}-\tfrac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}}{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,x_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}} \right)}^{2}}}{N}}$$
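
As an illustration, the intercept and slope estimators can be computed directly from the sums that appear in the equations. The following is a minimal Python sketch; the function name `least_squares_on_y` is hypothetical and is not part of any particular software package.

```python
def least_squares_on_y(x, y):
    """Return (a_hat, b_hat) for the line y = a + b*x.

    The line minimizes the sum of squared vertical deviations,
    using the closed-form sums from the regression-on-Y equations.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)

    # Slope: (sum(xy) - sum(x)sum(y)/N) / (sum(x^2) - (sum(x))^2/N)
    b_hat = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: y_bar - b_hat * x_bar
    a_hat = sum_y / n - b_hat * sum_x / n
    return a_hat, b_hat
```

For points that lie exactly on a line, such as `x = [1, 2, 3, 4]` and `y = [3, 5, 7, 9]`, the function recovers that line's intercept and slope.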

In the case of the normal distribution, the equations for $${{y}_{i}}$$  and  $${{x}_{i}}$$  are:


 * $${{y}_{i}}={{\Phi }^{-1}}\left[ F({{t}_{i}}) \right]$$

and:


 * $${{x}_{i}}={{t}_{i}}$$

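where the values for $$F({{t}_{i}})$$  are estimated from the median ranks. Because the normal cdf linearizes to $$y=-\tfrac{\mu }{\sigma }+\tfrac{1}{\sigma }t$$, the intercept and slope of the fitted line are $$a=-\tfrac{\mu }{\sigma }$$  and  $$b=\tfrac{1}{\sigma }$$. Once $$\widehat{a}$$  and  $$\widehat{b}$$  are obtained, the parameter estimates follow as $$\widehat{\sigma }=1/\widehat{b}$$  and  $$\widehat{\mu }=-\widehat{a}/\widehat{b}$$.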
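
The whole procedure can be sketched in Python under a few stated assumptions. The data set is assumed to be complete (no suspensions). The median ranks are approximated with Benard's formula, $$(i-0.3)/(N+0.4)$$, rather than computed exactly. The fitted line is converted to parameters through $$a=-\mu /\sigma $$ and $$b=1/\sigma $$. The function name `normal_rank_regression_on_y` is hypothetical.

```python
from statistics import NormalDist


def normal_rank_regression_on_y(times):
    """Estimate (mu, sigma) of a normal distribution by rank regression on Y.

    Assumes complete (uncensored) data and uses Benard's approximation
    for the median ranks; both are simplifying assumptions of this sketch.
    """
    t = sorted(times)
    n = len(t)
    if n < 2:
        raise ValueError("at least two failure times are required")

    # Median rank estimate of F(t_i) for the i-th ordered time (Benard).
    median_ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

    # y_i = Phi^{-1}[F(t_i)], x_i = t_i
    std_normal = NormalDist()
    y = [std_normal.inv_cdf(f) for f in median_ranks]
    x = t

    # Least squares fit of y = a + b*x (regression on Y).
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b_hat = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a_hat = sum_y / n - b_hat * sum_x / n

    # a = -mu/sigma and b = 1/sigma, so sigma = 1/b and mu = -a/b.
    sigma_hat = 1.0 / b_hat
    mu_hat = -a_hat / b_hat
    return mu_hat, sigma_hat
```

For evenly spaced times such as `[10, 20, 30, 40, 50]`, the median ranks are symmetric about 0.5, so the estimated mean coincides with the middle observation.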

The Correlation Coefficient

The sample correlation coefficient, $$\hat{\rho }$$, which estimates the population correlation coefficient $$\rho $$, is given by:


 * $$\hat{\rho }=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,({{x}_{i}}-\overline{x})({{y}_{i}}-\overline{y})}{\sqrt{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{({{x}_{i}}-\overline{x})}^{2}}\cdot \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{({{y}_{i}}-\overline{y})}^{2}}}}$$
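
The correlation coefficient formula translates directly into code. The following is a minimal Python sketch; the function name `correlation_coefficient` is hypothetical. Here `x` and `y` are the plotted values, which for the normal distribution are $${{x}_{i}}={{t}_{i}}$$ and $${{y}_{i}}={{\Phi }^{-1}}\left[ F({{t}_{i}}) \right]$$.

```python
from math import sqrt


def correlation_coefficient(x, y):
    """Return the sample correlation coefficient of the paired values x, y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Numerator: sum of products of deviations from the means.
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the summed squared deviations.
    den = sqrt(sum((xi - x_bar) ** 2 for xi in x)
               * sum((yi - y_bar) ** 2 for yi in y))
    return num / den
```

Values close to ±1 indicate that the plotted points fall nearly on a straight line, while values near 0 indicate little linear relationship between them.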

Example 2: