Consider a multiple linear regression model with $k$ predictor variables:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \epsilon$$
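Let each of the $k$ predictor variables, $x_1$, $x_2$, ..., $x_k$, have $n$ levels. Then $x_{ij}$ represents the $i$th level of the $j$th predictor variable $x_j$. For example, $x_{51}$ represents the fifth level of the
first predictor variable $x_1$, while $x_{19}$ represents the first level of the
ninth predictor variable, $x_9$. Observations, $y_1$, $y_2$, ..., $y_n$, recorded for each of these $n$ levels can be expressed in the following
way:
$$\begin{aligned}
y_1 &= \beta_0 + \beta_1 x_{11} + \beta_2 x_{12} + \dots + \beta_k x_{1k} + \epsilon_1 \\
y_2 &= \beta_0 + \beta_1 x_{21} + \beta_2 x_{22} + \dots + \beta_k x_{2k} + \epsilon_2 \\
&\;\;\vdots \\
y_i &= \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + \epsilon_i \\
&\;\;\vdots \\
y_n &= \beta_0 + \beta_1 x_{n1} + \beta_2 x_{n2} + \dots + \beta_k x_{nk} + \epsilon_n
\end{aligned}$$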
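The system of $n$ equations shown previously can be
represented in matrix notation as follows:
$$y = X\beta + \epsilon \tag{7}$$
where:
$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}, \quad
\epsilon = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}$$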

The matrix $X$ in Eqn. (7)
is referred to as the design matrix. It contains information
about the levels of the predictor variables at which the observations
are obtained.
The vector $\beta$ contains all the regression coefficients.
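The layout of the design matrix can be sketched in a few lines of NumPy. The factor levels below are hypothetical values chosen for illustration (they are not the data of Table 5.1), and the variable names are assumptions of this sketch.

```python
import numpy as np

# Hypothetical levels of two predictor variables (NOT the data of Table 5.1).
x1 = np.array([41.0, 43.0, 46.0, 49.0, 52.0, 55.0, 58.0, 60.0])
x2 = np.array([29.0, 31.0, 28.0, 33.0, 30.0, 34.0, 32.0, 35.0])

# Design matrix: a leading column of ones for the intercept term,
# followed by one column per predictor variable.
X = np.column_stack([np.ones_like(x1), x1, x2])

print(X.shape)  # (8, 3): n = 8 observations, k + 1 = 3 coefficients
```

Each row of `X` corresponds to one observation and each column after the first to one predictor variable, matching the structure of $X$ in Eqn. (7).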
To obtain the regression model, $\beta$ must be known. $\beta$ is estimated using the method of least squares.
The following equation is used:
$$\hat{\beta} = (X'X)^{-1}X'y \tag{8}$$
where $'$ represents the transpose of the matrix
while $^{-1}$ represents the matrix inverse. Knowing
the estimates, $\hat{\beta}$, the multiple linear regression model
can now be estimated as:
$$\hat{y} = X\hat{\beta} \tag{9}$$
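As a minimal sketch of Eqns. (8) and (9), the estimates can be computed with NumPy. The data are hypothetical (not Table 5.1). The sketch solves the normal equations $(X'X)\hat{\beta} = X'y$ with `np.linalg.solve` instead of forming the inverse explicitly; this is a numerical choice of the sketch, not something the text prescribes, and it gives the same result as Eqn. (8) when $X'X$ is invertible.

```python
import numpy as np

# Hypothetical factor levels and responses (NOT the data of Table 5.1).
x1 = np.array([41.0, 43.0, 46.0, 49.0, 52.0, 55.0, 58.0, 60.0])
x2 = np.array([29.0, 31.0, 28.0, 33.0, 30.0, 34.0, 32.0, 35.0])
y = np.array([250.0, 275.0, 245.0, 310.0, 280.0, 335.0, 315.0, 355.0])

X = np.column_stack([np.ones_like(x1), x1, x2])

# Eqn. (8): beta_hat = (X'X)^{-1} X'y, obtained by solving (X'X) beta_hat = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Eqn. (9): fitted values from the estimated model.
y_hat = X @ beta_hat

print("estimated coefficients:", beta_hat)
print("fitted values:", y_hat)
```

For ill-conditioned design matrices, `np.linalg.lstsq(X, y, rcond=None)` is usually the more robust way to obtain the same estimates.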
The estimated regression model is also referred to as the fitted
model. The observations, $y_i$, may be different from the fitted
values $\hat{y}_i$ obtained from this model. The difference
between these two values is the residual, $e_i$. The vector of residuals, $e$, is obtained as:
$$e = y - \hat{y} \tag{10}$$
The fitted model of Eqn. (9) can also be written as follows, using $\hat{\beta} = (X'X)^{-1}X'y$ from Eqn. (8):
$$\hat{y} = X\hat{\beta} = X(X'X)^{-1}X'y = Hy \tag{11}$$
where $H = X(X'X)^{-1}X'$. The matrix, $H$, is referred to as the hat matrix.
It transforms the vector of the observed response values, $y$, to the vector of fitted values, $\hat{y}$.
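Eqns. (10) and (11) can be illustrated as follows. The data are again hypothetical (not Table 5.1). Forming the hat matrix explicitly is done here only to mirror the formula; for large $n$ it is an $n \times n$ matrix and would normally be avoided.

```python
import numpy as np

# Hypothetical factor levels and responses (NOT the data of Table 5.1).
x1 = np.array([41.0, 43.0, 46.0, 49.0, 52.0, 55.0, 58.0, 60.0])
x2 = np.array([29.0, 31.0, 28.0, 33.0, 30.0, 34.0, 32.0, 35.0])
y = np.array([250.0, 275.0, 245.0, 310.0, 280.0, 335.0, 315.0, 355.0])

X = np.column_stack([np.ones_like(x1), x1, x2])

# Hat matrix H = X (X'X)^{-1} X', computed via a linear solve.
H = X @ np.linalg.solve(X.T @ X, X.T)

# Eqn. (11): H maps the observed responses to the fitted values.
y_hat = H @ y

# Eqn. (10): residuals are observed minus fitted values.
e = y - y_hat

print("fitted values:", y_hat)
print("residuals:", e)
```

Two standard properties make useful sanity checks: $H$ is symmetric and idempotent ($HH = H$), and its trace equals the number of estimated coefficients, $k + 1$.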
An analyst studying a chemical process expects the yield to be affected by the levels of two factors, $x_1$ and $x_2$. Observations recorded for various levels of the two factors are shown in Table 5.1. The analyst wants to fit a first-order regression model to the data. Interaction between $x_1$ and $x_2$ is not expected based on knowledge of similar processes. Units of the factor levels and the yield are ignored for the analysis.
Table 5.1: Observed yield data for various levels of two factors.
The data of Table 5.1 can be entered into DOE++
using the Multiple Regression tool as shown in Figure 5.7.
A scatter plot for the data in Table 5.1 is shown in Figure 5.8.
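The first-order regression model applicable to this data set having two
predictor variables is:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$
where the dependent variable, $Y$, represents the yield and the predictor variables, $x_1$ and $x_2$, represent the two factors respectively. The $X$ and $y$ matrices for the data are formed from Table 5.1 as in Eqn. (7): $X$ holds a column of ones followed by the observed levels of $x_1$ and $x_2$, and $y$ holds the observed yields.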
Figure 5.7: Multiple Regression tool in DOE++ with the data in Table 5.1.
Figure 5.8: Three dimensional scatter plot for the observed data in Table 5.1.

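The least square estimates, $\hat{\beta}$, can now be obtained using Eqn. (8):
$$\hat{\beta} = (X'X)^{-1}X'y$$
The elements of $\hat{\beta}$ are the estimated regression coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\beta}_2$. The fitted regression model is:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$$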
In DOE++, the fitted regression model can be viewed using the Show Analysis Summary icon in the Control Panel. The model is shown in Figure 5.9.
Figure 5.9: Equation of the fitted regression model for the data in Table 5.1.
Figure 5.10: Fitted regression plane for the data of Table 5.1.
In DOE++, fitted values and residuals are available using the Diagnostic
icon in the Control Panel. The values are shown in Figure 5.11.
The fitted regression model can also be used to predict response values.
For example, to obtain the response value for a new observation corresponding
to 47 units of $x_1$ and 31 units of $x_2$, the value is calculated using:
$$\hat{y}(47, 31) = \hat{\beta}_0 + \hat{\beta}_1 (47) + \hat{\beta}_2 (31)$$
Figure 5.11: Fitted values and residuals for the data in Table 5.1.
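The prediction step for a new observation at 47 units of the first factor and 31 units of the second can be sketched as below. The model is fitted to hypothetical data (not Table 5.1), so the predicted value will not match the one DOE++ reports for the example.

```python
import numpy as np

# Hypothetical factor levels and responses (NOT the data of Table 5.1).
x1 = np.array([41.0, 43.0, 46.0, 49.0, 52.0, 55.0, 58.0, 60.0])
x2 = np.array([29.0, 31.0, 28.0, 33.0, 30.0, 34.0, 32.0, 35.0])
y = np.array([250.0, 275.0, 245.0, 310.0, 280.0, 335.0, 315.0, 355.0])

X = np.column_stack([np.ones_like(x1), x1, x2])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# New observation: a leading 1 for the intercept, then 47 units of x1
# and 31 units of x2.
x_new = np.array([1.0, 47.0, 31.0])

# Predicted response: beta_0 + beta_1 * 47 + beta_2 * 31.
y_new = x_new @ beta_hat

print(f"predicted response at (47, 31): {y_new:.2f}")
```

The new point lies inside the range of the hypothetical factor levels; predictions far outside the observed ranges are extrapolations and are less trustworthy.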
The least square estimates, $\hat{\beta}_0$, $\hat{\beta}_1$, ..., $\hat{\beta}_k$, are unbiased estimators of $\beta_0$, $\beta_1$, ..., $\beta_k$, provided that the random error terms,
$\epsilon_i$, are normally and independently distributed.
The variances of the $\hat{\beta}_j$s are obtained using the $(X'X)^{-1}$ matrix. The variance-covariance matrix
of the estimated regression coefficients is obtained as follows:
$$C = \hat{\sigma}^2 (X'X)^{-1} \tag{12}$$
$C$ is a symmetric matrix whose diagonal
elements, $C_{jj}$, represent the variance of the estimated
$j$th regression coefficient, $\hat{\beta}_j$. The off-diagonal elements, $C_{ij}$, represent the covariance between
the $i$th and $j$th estimated regression coefficients,
$\hat{\beta}_i$ and $\hat{\beta}_j$. The value of $\hat{\sigma}^2$ is obtained using the error mean square,
$MS_E$, which can be calculated as discussed
in the beginning of Chapter 5, Multiple
Linear Regression Analysis. The variance-covariance matrix for the
data in Table 5.1 is shown in Figure 5.12. It is
available in DOE++ using the Show Analysis Summary icon in the Control
Panel. Calculations to obtain the matrix are given in Example 5.3 in Chapter
5, Test
on Individual Regression Coefficients. The positive square root of
$C_{jj}$ represents the estimated standard
deviation of the $j$th regression coefficient, $\hat{\beta}_j$, and is called the estimated standard
error of $\hat{\beta}_j$ (abbreviated $se(\hat{\beta}_j)$).
$$se(\hat{\beta}_j) = \sqrt{C_{jj}} \tag{13}$$
Figure 5.12: The variance-covariance matrix for the data of Table 5.1.
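Eqns. (12) and (13) can be sketched as follows, using hypothetical data (not Table 5.1). The error mean square is computed with the standard definition $MS_E = SS_E / (n - (k + 1))$, which this section references but does not restate, so treat that line as an assumption about the intended formula.

```python
import numpy as np

# Hypothetical factor levels and responses (NOT the data of Table 5.1).
x1 = np.array([41.0, 43.0, 46.0, 49.0, 52.0, 55.0, 58.0, 60.0])
x2 = np.array([29.0, 31.0, 28.0, 33.0, 30.0, 34.0, 32.0, 35.0])
y = np.array([250.0, 275.0, 245.0, 310.0, 280.0, 335.0, 315.0, 355.0])

X = np.column_stack([np.ones_like(x1), x1, x2])
n, p = X.shape  # p = k + 1 estimated coefficients

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat

# Error mean square (assumed definition): SS_E / (n - (k + 1)).
mse = (e @ e) / (n - p)

# Eqn. (12): variance-covariance matrix of the estimated coefficients.
C = mse * np.linalg.inv(X.T @ X)

# Eqn. (13): standard errors are the square roots of the diagonal of C.
se = np.sqrt(np.diag(C))

print("variance-covariance matrix:\n", C)
print("standard errors:", se)
```

The off-diagonal entries of `C` are the covariances between pairs of estimated coefficients, so `C` should come out symmetric up to floating-point rounding.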