Date of Award

6-2022

Document Type

Open Access

Degree Name

Bachelor of Science

Department

Mathematics

First Advisor

Roger Hoerl

Keywords

Mixture variable, Process variable, Overfitting, Bootstrapping, Fractional design

Abstract

Mixture variables are unique in that their components must sum to 1, which causes problems when mixture and process variables interact. The best model is the fully linearized model, but it can become large quickly. We began by comparing linear and nonlinear models on multiple data sets. After seeing that nonlinear models appear to be the best alternatives, we used systematically selected fractions of each data set to obtain in-sample and out-of-sample RMSEs. These allow us to see whether there is evidence of overfitting, how well each model predicts out of sample, and how well it fits the training data. To increase the number of data points, we used bootstrapping to create a random sample whose size is proportional to that of the full data set. The resulting RMSEs indicated that Zhong’s model and the fully linearized model showed extreme evidence of overfitting, so we considered the other three models to be better options. For these three models, there was evidence of overfitting with two data sets. The models seemed to do better on the Prescott data, which is interesting because it is the largest data set and the only one in which the mixture variables have constraints and the process variables have 3 levels. Within the Prescott data sets, the SHB nonlinear model appears to perform best, with the least evidence of overfitting, so we conclude that it is in general the best alternative to the fully linearized model.
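The in-sample versus out-of-sample comparison described in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the data are synthetic, the design matrix is a simple linear mixture-by-process form chosen only for illustration, the "every other run" training fraction is a stand-in for a systematically selected fraction, and all function names (`rmse`, `in_out_rmse`, `bootstrap_sample`) are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted responses."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def in_out_rmse(X, y, train_idx):
    """Fit by least squares on the training rows, then return
    (in-sample RMSE, out-of-sample RMSE on the remaining rows)."""
    train_idx = np.asarray(train_idx)
    test_mask = np.ones(len(y), dtype=bool)
    test_mask[train_idx] = False
    coef, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)
    return (rmse(y[train_idx], X[train_idx] @ coef),
            rmse(y[test_mask], X[test_mask] @ coef))

def bootstrap_sample(X, y, factor, rng):
    """Resample rows with replacement; the sample size is `factor` times
    the size of the full data set (an assumed reading of 'proportional')."""
    n = int(round(factor * len(y)))
    idx = rng.integers(0, len(y), size=n)
    return X[idx], y[idx]

# Synthetic example (assumption): 3 mixture components that sum to 1
# and one two-level process variable z.
rng = np.random.default_rng(0)
mix = rng.dirichlet(np.ones(3), size=24)        # each row sums to 1
z = rng.choice([-1.0, 1.0], size=24)
X = np.column_stack([mix, mix * z[:, None]])    # mixture terms and mixture-by-process terms
beta = np.array([1.0, 2.0, 3.0, 0.5, -0.5, 0.25])
y = X @ beta + rng.normal(0.0, 0.1, size=24)

train_idx = np.arange(0, 24, 2)                 # stand-in for a systematic fraction
rmse_in, rmse_out = in_out_rmse(X, y, train_idx)
Xb, yb = bootstrap_sample(X, y, factor=2, rng=rng)
print(f"in-sample RMSE: {rmse_in:.3f}, out-of-sample RMSE: {rmse_out:.3f}")
```

An out-of-sample RMSE that is much larger than the in-sample RMSE is the kind of gap the abstract treats as evidence of overfitting.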

Rights Statement

In Copyright - Educational Use Permitted.