Generalized Linear Models Dialog

Forward Search in Generalized Linear Models

Forward Library Help

Model Page:

Data Set:

The data to be used in the Forward Search. By default only the data frames in the current working directory are displayed. However, any data frame on the search path (for instance fuel.frame) may be selected by typing its name into the Data Set combo box.

Weights:

Select the column that specifies weights to be applied to all observations used in the analysis. To weight all rows equally, leave this blank.

Use Example Data

Selecting this option causes the Data Set combo box to list the data frames containing the example data provided with the Forward Library. Additionally, when this option is selected the formula, family and link will be filled in automatically.

Omit Rows with Missing Values

If this option is selected, rows in the data frame that contain missing values (NAs) will be removed. Otherwise an error will be generated if missing values are present.

Dependent:

Specifies the dependent variable in the model formula.

Independent:

Specifies the independent (explanatory) variable(s) in the model formula. Use control-clicks to select multiple variables. For more complicated models, select the data (in the Data Set combo box) and press the Create Formula button.

Family:

Select the distribution family for the model.

Link:

Select the link function for the model. The link function of the expected response is modeled as a sum of linear terms. The possible link functions depend on the family.
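
As a rough illustration of the relationship between a link function and the sum of linear terms, the following Python sketch uses the logit link, a common choice for the binomial family. It is not part of the Forward Library; the function names, coefficients and covariate values are hypothetical.

```python
import math

def logit(mu):
    """Logit link: maps a mean in (0, 1) to the linear-predictor scale."""
    return math.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse logit: maps a linear predictor back to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def linear_predictor(coefs, x):
    """Sum of linear terms: intercept plus coefficient times each covariate."""
    return coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))

# Hypothetical coefficients and covariate values, for illustration only.
eta = linear_predictor([-1.0, 0.5], [2.0])  # -1.0 + 0.5 * 2.0
mu = inv_logit(eta)                         # fitted mean on the response scale
```

Applying logit to mu recovers eta, which is the sense in which the link of the mean equals the sum of linear terms.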

Lambda:

The value of the transformation parameter lambda for the Gamma family with Box-Cox link.
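
To show the role of lambda, the sketch below implements the standard Box-Cox transformation as a link applied to a positive mean. Whether the Forward Library uses exactly this parameterization is an assumption; the function name is hypothetical and the code is in Python rather than S.

```python
import math

def box_cox_link(mu, lam):
    """Standard Box-Cox transform of a positive mean mu.

    Assumed form: (mu**lam - 1) / lam, with log(mu) as the limit at lam = 0.
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    if lam == 0:
        return math.log(mu)
    return (mu ** lam - 1.0) / lam
```

Under this form, lambda = 1 gives a shifted identity link and lambda = 0 gives the log link.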

LMS Subsets:

The initial subset for the Forward Search in Generalized Linear Models is found by fitting the model with the Forward Library function lmsglm. This option allows the user to control how many subsets are used in the Least Median of Squares criterion. The choices are standard (which uses 100 subsets) and all. Additionally, a specific number of subsets may be specified by typing a number into the LMS Subsets combo box.
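
To show how the number of subsets enters a Least Median of Squares search, here is a minimal Python sketch for an ordinary straight-line fit. It is not the lmsglm algorithm, which works with generalized linear models; the function name and arguments are hypothetical. Each sampled pair of points defines a candidate line, and the line with the smallest median squared residual is kept.

```python
import random
import statistics

def lms_line(x, y, n_subsets=100, seed=0):
    """Approximate LMS fit of y = a + b*x from randomly sampled pairs of points.

    Returns (criterion, a, b), or None if no sampled pair defined a line.
    """
    rng = random.Random(seed)
    n = len(x)
    best = None
    for _ in range(n_subsets):
        i, j = rng.sample(range(n), 2)
        if x[i] == x[j]:
            continue  # this pair does not determine a unique line
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        # LMS criterion: median of the squared residuals over all observations
        crit = statistics.median((yk - a - b * xk) ** 2 for xk, yk in zip(x, y))
        if best is None or crit < best[0]:
            best = (crit, a, b)
    return best
```

More subsets make it more likely that at least one candidate is free of outliers, at the cost of more computation; enumerating every subset removes the sampling variability entirely.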

Balance Search

This option is only available for models with binary response. Balancing the search causes the success/failure ratio in each subset of the search to be held as close as possible to the success/failure ratio for the entire data.
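
The idea of holding the success/failure ratio close to that of the full data can be sketched as follows. This Python fragment only computes the target counts for a subset of a given size; how the Forward Library chooses which observations to include is not described here, and the function name is hypothetical.

```python
def balanced_counts(n_success, n_failure, m):
    """Return (successes, failures) for a subset of size m whose success
    share is as close as possible to the share in the full data."""
    n = n_success + n_failure
    k = round(m * n_success / n)                  # ideal number of successes
    k = max(m - n_failure, min(k, n_success, m))  # keep both counts feasible
    return k, m - k

# With 30 successes and 70 failures overall, a subset of 10 aims for 3 and 7.
counts = balanced_counts(30, 70, 10)
```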

Show Progress in Report

Prints a message in the report window for every ten iterations completed in lmsglm (the initial robust estimate) and in the forward search.

Options Page:

Max Iterations:

Enter an integer value specifying the maximum number of iterations to perform for the maximum likelihood estimation procedure. If convergence has not been reached after this number of iterations, the procedure stops. The default value appears in the field.

Convergence Tolerance:

Enter a positive number used as the tolerance for the convergence criterion in the algorithm. This relative offset criterion measures the numerical imprecision in the parameter estimates compared to the statistical variability. Smaller values of Convergence Tolerance require more iterations while larger values result in convergence being declared earlier. The default value appears in the field.
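
The interplay between Max Iterations and Convergence Tolerance can be sketched with a small iterative fit. The Python example below fits an intercept-only logistic model by Newton steps and stops on a relative change in deviance, which is a stand-in for, not a reproduction of, the relative offset criterion described above. The function name is hypothetical, and the defaults of 10 and 1e-4 are the glm values quoted in this help, not the dialog defaults.

```python
import math

def fit_intercept_logistic(y, max_iter=10, tol=1e-4):
    """Newton iterations for an intercept-only logistic model.

    Assumes y contains both 0s and 1s. Returns (beta, iterations, converged).
    """
    n, s = len(y), sum(y)
    beta = 0.0
    dev_old = float("inf")
    for it in range(1, max_iter + 1):
        mu = 1.0 / (1.0 + math.exp(-beta))
        beta += (s - n * mu) / (n * mu * (1.0 - mu))  # Newton step
        mu = 1.0 / (1.0 + math.exp(-beta))
        dev = -2.0 * (s * math.log(mu) + (n - s) * math.log(1.0 - mu))
        # Stand-in criterion: relative change in deviance below the tolerance
        if abs(dev - dev_old) / (abs(dev) + 0.1) < tol:
            return beta, it, True
        dev_old = dev
    return beta, max_iter, False  # iteration limit reached without convergence
```

A smaller tolerance forces more iterations before the loop exits, and a low iteration limit can stop the procedure before the tolerance is met.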

Remark

The default values for these parameters differ from those used by the function glm, which uses 10 for the maximum number of iterations and 1e-4 for the convergence tolerance.

Results Page:

Short Output

This will display a brief summary of the Forward Search.

Long Output

This option provides a more detailed summary of the Forward Search. The last Steps (see below) of the diagnostic statistics monitored during the search are displayed.

Steps:

Controls how many steps are shown in the summary when Long Output is selected. The default behavior ("auto") is to show either the last 5 steps or the last 10% of the search, whichever is greater. Selecting "all" will display the statistics for the entire search and selecting "user" will allow you to enter the desired number of steps in the Number of Steps field.
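
The three settings can be summarized in a short Python sketch. The function and argument names are hypothetical, and two details are assumptions rather than documented behavior: the 10% figure is rounded up, and the result is capped at the length of the search.

```python
def steps_shown(n_steps, steps="auto", number_of_steps=None):
    """Number of trailing steps displayed in the Long Output summary."""
    if steps == "all":
        return n_steps
    if steps == "user":
        return min(number_of_steps, n_steps)
    ten_percent = (n_steps + 9) // 10        # 10% of the search, rounded up
    return min(n_steps, max(5, ten_percent))
```

For a search of 30 steps the "auto" rule gives 5, while for a search of 100 steps it gives 10.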

Number of Steps:

If the "user" option is selected in Steps then the desired number of Steps should be entered here.

Omit Perfect Fit Steps

Selecting this option causes the Long Output to omit steps for which there is a perfect fit. Note that a perfect fit can only occur in a model with a binary response.

Save As:

The Forward Search is saved as an S object with (S version 3) class "fwdglm" in the current working directory. If you do not wish to save the results simply leave this field blank.

Plots Page:

Deviance Residuals

Plots the deviance residuals at each step of the forward search. Note that each line in the plot corresponds to the residual for one observation.
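
For reference, the deviance residual of a single binary observation can be computed as in the Python sketch below. It covers only the Bernoulli case, since other families use different expressions, and the function name is hypothetical.

```python
import math

def binary_deviance_residual(y, mu):
    """Deviance residual for one binary observation.

    y is 0 or 1; mu is the fitted probability, strictly between 0 and 1.
    """
    loglik = math.log(mu) if y == 1 else math.log(1.0 - mu)
    sign = 1.0 if y == 1 else -1.0  # sign of y - mu
    return sign * math.sqrt(-2.0 * loglik)
```

The squared deviance residuals sum to the deviance of the fit, so a large absolute residual marks an observation that the fitted model explains poorly.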

Threshold for Labels in the Plot:

Only observations whose deviance residual exceeds this threshold (at some point during the forward search) will be labeled in the Deviance Residuals plot.
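
The labeling rule can be sketched in Python as follows. The data layout (one list of residuals per observation, indexed by step) and the function name are hypothetical, and comparing the absolute value of the residual with the threshold is an assumption.

```python
def labeled_observations(trajectories, threshold):
    """Return the labels of observations whose absolute residual exceeds
    the threshold at any step of the search."""
    return [label for label, path in trajectories.items()
            if any(abs(r) > threshold for r in path)]

# Hypothetical residual paths for three observations over three steps.
paths = {"obs1": [0.2, 0.4, 0.3], "obs2": [-2.6, -1.0, -0.5], "obs3": [1.0, 1.9, 2.1]}
labels = labeled_observations(paths, 2.0)
```

Here obs2 is labeled even though its residual is small by the end of the search, because it exceeded the threshold at an earlier step.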

Max and Min Residuals

For each step in the forward search, plots two pairs of quantities: the maximum deviance residual in the subset together with the mth overall ordered deviance residual, and the minimum deviance residual in the complement of the subset together with the (m+1)th overall ordered deviance residual.
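
The four quantities for a single step can be sketched in Python as below. The sketch assumes that m is the number of observations in the subset at that step and that residuals are ordered by absolute value; both are assumptions, and the function name is hypothetical.

```python
def max_min_residuals(residuals, subset):
    """Quantities for one step of the search.

    residuals: residuals of all n observations at this step.
    subset: indices of the observations currently in the subset (0 < m < n).
    """
    in_subset = set(subset)
    m = len(in_subset)
    abs_res = [abs(r) for r in residuals]
    ordered = sorted(abs_res)
    inside = [abs_res[i] for i in in_subset]
    outside = [abs_res[i] for i in range(len(abs_res)) if i not in in_subset]
    return {
        "max_in_subset": max(inside),
        "mth_ordered": ordered[m - 1],
        "min_outside": min(outside),
        "m_plus_1_ordered": ordered[m],
    }
```

When the subset holds exactly the m smallest residuals, the first pair coincides and so does the second; a gap within a pair indicates that an observation outside the subset currently fits better than one inside it.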

Leverage

Plots the leverage for each step of the forward search.

Omit Perfect Fit Steps

Selecting this option causes the plots to omit steps for which there is a perfect fit. Note that a perfect fit can only occur in a model with a binary response.

Coefficients and t Statistics

Plots the coefficients and t statistics for each step of the forward search.

Include Labels on Plot

If selected, the lines in the coefficients and t statistics plots will be labeled.

ylim: t Statistic:

The range for the plot of the t statistics. This should be a numeric vector of length 2 containing the lower and upper bounds for the plot. For example, "c(-3.5, 3.5)" (without the quotes) would set the range to show values between -3.5 and 3.5.

Confidence Bands for t Statistic:

The value (on the scale of the t statistics) used to draw the confidence interval on the plot of the t statistics.

Cook's + Modified Cook's Distance

Plots the Cook's distance and the modified Cook's distance for each step of the forward search.

Deviances

Plots the deviance, the deviance explained, the dispersion parameter and the pseudo R-squared for each step of the forward search.

Glm Weights

Plots the weights from the generalized linear model fit during each step of the forward search.

Link Test

Plots the goodness of link test at each step of the forward search.

ylim: Link Test:

The range for the goodness of link test plot. This should be a numeric vector of length 2 containing the lower and upper bounds for the plot. For example, "c(-10, 10)" (without the quotes) would set the range to show values between -10 and 10.

Confidence Level:

The value used to draw the confidence interval on the goodness of link test plot.