Passive data collection leads to a number of problems in statistical modeling. Observed changes in a response variable may be correlated with, but not caused by, observed changes in individual factors (process variables). Simultaneous changes in multiple factors may produce interactions that are difficult to separate into individual effects. Observations may be dependent, while a model of the data considers them to be independent.
Designed experiments address these problems. In a designed experiment, the data-producing process is actively manipulated to improve the quality of information and to eliminate redundant data. A common goal of all experimental designs is to collect data as parsimoniously as possible while providing sufficient information to accurately estimate model parameters.
For example, a simple model of a response y in an experiment with two controlled factors x1 and x2 might look like this:
y = β0 + β1x1 + β2x2 + β3x1x2 + ε
Here ε includes both experimental error and the effects of any uncontrolled factors in the experiment. The terms β1x1 and β2x2 are main effects and the term β3x1x2 is a two-way interaction effect. A designed experiment would systematically manipulate x1 and x2 while measuring y, with the objective of accurately estimating β0, β1, β2, and β3.
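As a rough illustration of the idea, the following Python sketch builds a replicated two-level full factorial design for x1 and x2, simulates a response from the model above, and estimates the four coefficients by ordinary least squares. Everything specific in it is an assumption made for the example rather than something taken from the text: the -1/+1 factor coding, the three replicates, the coefficient values in TRUE_BETA, the noise level, and the helper names full_factorial_2level and fit_interaction_model.

```python
import itertools
import numpy as np


def full_factorial_2level(n_factors):
    """Return all 2**n_factors runs, with each factor coded as -1 or +1."""
    return np.array(list(itertools.product([-1.0, 1.0], repeat=n_factors)))


def fit_interaction_model(x1, x2, y):
    """Least-squares estimates of (b0, b1, b2, b3) in
    y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error."""
    # Columns: intercept, two main effects, two-way interaction.
    X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_hat


# Hypothetical "true" coefficients, used only to simulate a response.
TRUE_BETA = np.array([10.0, 2.0, -1.5, 0.5])

rng = np.random.default_rng(0)

# Three replicates of the four factorial runs (12 observations in total).
design = np.tile(full_factorial_2level(2), (3, 1))
x1, x2 = design[:, 0], design[:, 1]

# The simulated error term stands in for experimental error and
# uncontrolled factors.
noise = rng.normal(scale=0.1, size=len(design))
y = (TRUE_BETA[0] + TRUE_BETA[1] * x1 + TRUE_BETA[2] * x2
     + TRUE_BETA[3] * x1 * x2 + noise)

beta_hat = fit_interaction_model(x1, x2, y)
print("estimated coefficients:", beta_hat)
```

With this coding, the intercept, main-effect, and interaction columns of the model matrix are mutually orthogonal, so each coefficient is estimated independently of the others. That is one way a designed experiment can separate individual effects that passively collected data may leave confounded.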