Regression Diagnostics (regress)
The regress function computes diagnostics for bivariate linear regression. It takes three parameters:
- The numeric field of the independent variable (x).
- The numeric field of the dependent variable (y).
- The sample size of the regression.
Sample syntax
select regress(petal_length_d, sepal_length_d, 150) as regress_sig,
regress_rsquared,
regress_r,
regress_slope
from iris
Result set
The result set for the regress function is a single record that contains the selected regression diagnostics. The regress function itself returns the statistical significance of the regression analysis. The following regression diagnostics can also be selected:
- regress_slope (slope)
- regress_intercept (y-intercept)
- regress_rsquared (R squared)
- regress_r (correlation coefficient)
- regress_mse (mean square error)
- regress_sse (sum of squared errors)
- regress_ssr (sum of squares due to regression)
- regress_ssto (total sum of squares)
Figure: regress result in Apache Zeppelin
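As an illustrative variation on the sample syntax, the query below selects a different subset of the diagnostics listed above. It is only a sketch: it assumes the same iris table, fields, and sample size as the sample query, and that any listed diagnostic can be selected alongside the regress call in the same way as regress_rsquared, regress_r, and regress_slope.

```sql
-- Assumption: same table and fields as the sample query above.
-- regress(...) returns the significance; the other names select extra diagnostics.
select regress(petal_length_d, sepal_length_d, 150) as regress_sig,
regress_intercept,
regress_mse,
regress_ssr,
regress_ssto
from iris
```

As with the sample query, the result would be expected to be a single record with one column per selected diagnostic.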
Visualization
Sample visualization of the regress function using Apache Zeppelin’s Number visualization.
