This chapter expands on the analysis of simple linear regression models and discusses the analysis of multiple linear regression models. A major portion of the results displayed in DOE++ is explained in this chapter because these results are associated with multiple linear regression. One application of multiple linear regression models is Response Surface Methodology (RSM). RSM is a method used to locate the optimum value of the response and is one of the final stages of experimentation. It is discussed in Chapter 9. Towards the end of this chapter, the use of indicator variables to represent qualitative factors in regression models is explained. Indicator variables are important for understanding ANOVA models, which are the models used to analyze data obtained from experiments. These models can be thought of as first order multiple linear regression models in which all the factors are treated as qualitative factors. ANOVA models are discussed in Chapter 6, Analysis of Experiments.
A linear regression model that contains more than one predictor variable is called a multiple linear regression model. The following model is a multiple linear regression model with two predictor variables, $x_1$ and $x_2$.
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon \qquad (1)$$
The model is linear because it is linear in the parameters $\beta_0$, $\beta_1$ and $\beta_2$. The model describes a plane in the three dimensional space of $Y$, $x_1$ and $x_2$. The parameter $\beta_0$ is the intercept of this plane. Parameters $\beta_1$ and $\beta_2$ are referred to as partial regression coefficients. Parameter $\beta_1$ represents the change in the mean response corresponding to a unit change in $x_1$ when $x_2$ is held constant. Parameter $\beta_2$ represents the change in the mean response corresponding to a unit change in $x_2$ when $x_1$ is held constant.
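The interpretation of the partial regression coefficients can be checked numerically. The following is a minimal sketch, assuming a first order model with two predictors whose mean response is `b0 + b1*x1 + b2*x2`; the function name and the coefficient values are hypothetical and are not taken from this chapter.

```python
def mean_response(x1, x2, b0=10.0, b1=2.0, b2=-3.0):
    """Mean response of a first order model with two predictors.

    The coefficient values are hypothetical, chosen only for illustration.
    """
    return b0 + b1 * x1 + b2 * x2


# Unit change in x1 while x2 is held constant, at two different levels of x2.
delta_x1_low = mean_response(4.0, 1.0) - mean_response(3.0, 1.0)
delta_x1_high = mean_response(4.0, 5.0) - mean_response(3.0, 5.0)

# Unit change in x2 while x1 is held constant.
delta_x2 = mean_response(3.0, 2.0) - mean_response(3.0, 1.0)

print(delta_x1_low, delta_x1_high, delta_x2)
```

In this sketch the change produced by a unit step in `x1` equals `b1` regardless of the level at which `x2` is held, and the change produced by a unit step in `x2` equals `b2`.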
Consider the following example of a multiple linear regression model
with two predictor variables, $x_1$ and $x_2$:
(2)
This regression model is a first order multiple linear regression model because the maximum power of the variables in the model is one. The regression plane corresponding to this model is shown in Figure 5.1. Also shown is an observed data point and the corresponding random error, $\epsilon$. The true regression model is rarely known, and therefore the values of the random error terms corresponding to observed data points remain unknown. However, the regression model can be estimated by calculating the parameters of the model for an observed data set. This is explained in Chapter 5, Estimating Regression Models Using Least Squares.
Figure 5.1: Regression plane for the model of Eqn. (2).
Figure 5.2 shows the contour plot for the regression model of Eqn. (2). The contour plot shows lines of constant mean response values as a function of $x_1$ and $x_2$. The contour lines for the given regression model are straight lines, as seen on the plot. Straight contour lines result for first order regression models with no interaction terms.
Figure 5.2: Contour plot for the model of Eqn. (2).
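The reason the contour lines are straight can be seen by solving a first order model for one predictor at a fixed mean response. The sketch below assumes a mean response of the form `b0 + b1*x1 + b2*x2` with hypothetical coefficient values (not the ones used for the figures) and computes the slope of several contour lines.

```python
import numpy as np

# Hypothetical coefficients of a first order model without interaction.
b0, b1, b2 = 10.0, 2.0, 4.0


def contour_x2(x1, level):
    """Value of x2 that keeps the mean response equal to `level` for given x1."""
    return (level - b0 - b1 * x1) / b2


x1 = np.linspace(0.0, 10.0, 6)
slopes = {}
for level in (20.0, 30.0, 40.0):
    x2 = contour_x2(x1, level)
    # Slope between consecutive points along the contour line.
    slopes[level] = np.diff(x2) / np.diff(x1)

for level, s in slopes.items():
    print(level, s)
```

Every contour has the same constant slope, `-b1 / b2`, so under these assumptions the contours are parallel straight lines.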
A linear regression model may also take the following form:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \epsilon \qquad (3)$$
A cross-product term, $x_1 x_2$, is included in the model. This term represents an interaction effect between the two variables $x_1$ and $x_2$. Interaction means that the effect produced by a change in one predictor variable on the response depends on the level of the other predictor variable(s). The regression plane and contour plot for an example of a linear regression model with interaction are shown in Figures 5.3 and 5.4, respectively.
Figure 5.3: Regression plane for the example model with interaction.
Figure 5.4: Contour plot for the example model with interaction.
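The meaning of interaction can be illustrated with a small calculation. The sketch below assumes a mean response of the form `b0 + b1*x1 + b2*x2 + b12*x1*x2`; the coefficient values and the name `b12` for the cross-product coefficient are hypothetical and are not taken from the example plotted in the figures.

```python
def mean_response(x1, x2, b0=10.0, b1=2.0, b2=3.0, b12=0.5):
    """Mean response of a two-predictor model with a cross-product term.

    All coefficient values are hypothetical.
    """
    return b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2


# Effect of a unit change in x1 at two different levels of x2.
effect_at_x2_0 = mean_response(1.0, 0.0) - mean_response(0.0, 0.0)
effect_at_x2_4 = mean_response(1.0, 4.0) - mean_response(0.0, 4.0)

print(effect_at_x2_0, effect_at_x2_4)
```

With these assumed values the effect of a unit change in `x1` is `b1 + b12*x2`, so it differs between the two levels of `x2`. Setting `b12=0.0` makes the two effects equal, which corresponds to a model with no interaction.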
Now consider the regression model shown next:
(4)
This model is also a linear regression model and is referred to as a
polynomial regression model. Polynomial regression models contain
squared and higher order terms of the predictor variables, which make the response
surface curvilinear. As an example of a polynomial regression model with
an interaction term, consider the following equation:
(5)
This model is a second order model because the maximum power of the terms in the model is two. The regression surface for this model is shown in Figure 5.5. Such regression models are used in RSM to find the optimum value of the response, $Y$ (for details see Chapter 9, Response Surface Methods). Notice that, although the shape of the regression surface is curvilinear, the regression model of Eqn. (5) is still linear because the model is linear in the parameters. The contour plot for this model is shown in Figure 5.6.
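The distinction between a curvilinear surface and a model that is linear in the parameters can be checked numerically. The following sketch assumes a second order model in two predictors with an intercept, two first order terms, two squared terms and one cross-product term; the coefficient vectors are hypothetical and are not the values of Eqn. (5).

```python
import numpy as np


def terms(x1, x2):
    """Regressor values: intercept, x1, x2, x1^2, x2^2 and x1*x2."""
    return np.array([1.0, x1, x2, x1**2, x2**2, x1 * x2])


def mean_response(beta, x1, x2):
    """Mean response as the dot product of the regressor values and beta."""
    return float(terms(x1, x2) @ beta)


# Two hypothetical coefficient vectors.
beta_a = np.array([50.0, 2.0, 3.0, -1.0, -0.5, 0.25])
beta_b = np.array([5.0, -1.0, 0.5, 0.2, 0.1, -0.3])

# Linear in the parameters: a linear combination of coefficient vectors
# gives the same linear combination of mean responses at any fixed point.
x1, x2 = 1.5, -2.0
combined = mean_response(2.0 * beta_a + 3.0 * beta_b, x1, x2)
separate = 2.0 * mean_response(beta_a, x1, x2) + 3.0 * mean_response(beta_b, x1, x2)
linear_in_beta = bool(np.isclose(combined, separate))

# Curvilinear in the predictors: equal steps in x1 (x2 fixed at 0)
# do not produce equal changes in the mean response.
values = [mean_response(beta_a, v, 0.0) for v in (0.0, 1.0, 2.0)]
steps = [values[1] - values[0], values[2] - values[1]]

print(linear_in_beta, steps)
```

The first check holds for any coefficient vectors because the mean response is a dot product with the parameter vector. The second shows the curvature of the surface: successive unit steps in `x1` change the mean response by different amounts.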
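All multiple linear regression models can be expressed in the following
general form:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \epsilon \qquad (6)$$
where $k$ denotes the number of terms in the
model. For example, the model of Eqn. (5) can be
written in the general form by treating each squared and cross-product term as a separate predictor variable, for example $x_3 = x_1^2$, $x_4 = x_2^2$ and $x_5 = x_1 x_2$, which gives a model of the form of Eqn. (6) with $k = 5$.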
Figure 5.5: Regression surface for the model of Eqn. (5).
Figure 5.6: Contour plot for the model of Eqn. (5).
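Writing a second order model in the general form amounts to adding one column per term to the data. The sketch below assumes the substitutions `x3 = x1**2`, `x4 = x2**2` and `x5 = x1*x2`, uses hypothetical coefficient values, and generates error-free responses on a small grid so that an ordinary least squares fit (here NumPy's `numpy.linalg.lstsq`) should return the same coefficients.

```python
import numpy as np

# Hypothetical coefficients beta_0 ... beta_5 of the general form.
beta_true = np.array([50.0, 2.0, 3.0, -1.0, -0.5, 0.25])

# A 3 x 3 grid of (x1, x2) settings.
levels = np.array([0.0, 1.0, 2.0])
x1, x2 = [g.ravel() for g in np.meshgrid(levels, levels)]

# Assumed substitutions that turn the second order model into the general form.
x3 = x1**2
x4 = x2**2
x5 = x1 * x2

# Design matrix: a column of ones for the intercept, then x1 ... x5.
X = np.column_stack([np.ones_like(x1), x1, x2, x3, x4, x5])

# Error-free responses, so the fit should reproduce beta_true.
y = X @ beta_true

beta_hat, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta_hat, 6))
```

Because the substituted columns are treated like any other predictor, the same linear fitting routine handles first order, interaction and polynomial models alike.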