Should regression coefficients be reported as unstandardized or standardized?

mjf


2014-04-09

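Should regression coefficients be reported as unstandardized or standardized? Can an unstandardized coefficient be 0? And do standardized coefficients not come with t-values?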


All replies

ReneeBK

2014-4-10 01:19:23

In statistics, standardized coefficients or beta coefficients are the estimates resulting from an analysis carried out on independent variables that have been standardized so that their variances are 1. Therefore, standardized coefficients refer to how many standard deviations a dependent variable will change per standard deviation increase in the predictor variable. Standardization of the coefficient is usually done to answer the question of which of the independent variables has a greater effect on the dependent variable in a multiple regression analysis, when the variables are measured in different units of measurement (for example, income measured in dollars and family size measured in number of individuals).

Some statistical software packages like PSPP, SPSS and SYSTAT label the standardized regression coefficients as "Beta" while the unstandardized coefficients are labeled "B". Others, like DAP/SAS, label them "Standardized Coefficient". Sometimes the unstandardized coefficients are also labeled as "B" or "b".

A regression carried out on original (unstandardized) variables produces unstandardized coefficients. A regression carried out on standardized variables produces standardized coefficients. Values for standardized and unstandardized coefficients can also be derived subsequent to either type of analysis.

Before solving a multiple regression problem, all variables (independent and dependent) can be standardized. Each variable can be standardized by subtracting its mean from each of its values and then dividing these new values by the standard deviation of the variable. Standardizing all variables in a multiple regression yields standardized regression coefficients that show the change in the dependent variable measured in standard deviations.
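
To make the two routes concrete, here is a minimal NumPy sketch. The data are simulated and the helper names (`ols`, `zscore`) are my own, not from any package mentioned above. It fits the regression once on raw variables and once on z-scored variables, then checks that the standardized slopes can also be recovered from the raw ones as b * sd(x) / sd(y). It also prints the slope t-values from both fits, which come out the same in this setup, so standardizing does not remove the t-values.

```python
import numpy as np

def ols(X, y):
    """Least-squares fit with an intercept.
    Returns (coefficients, t_values), intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = A.shape[0] - A.shape[1]            # residual degrees of freedom
    sigma2 = resid @ resid / dof             # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)    # covariance of the estimates
    return coef, coef / np.sqrt(np.diag(cov))

def zscore(a):
    """Subtract the mean and divide by the sample standard deviation."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

# Simulated data (made-up numbers, for illustration only)
rng = np.random.default_rng(0)
n = 200
income = rng.normal(50, 15, n)                      # thousands of dollars
family_size = rng.integers(1, 7, n).astype(float)   # number of individuals
y = 0.4 * income + 2.0 * family_size + rng.normal(0, 5, n)

X = np.column_stack([income, family_size])
b, t_raw = ols(X, y)                      # unstandardized fit
beta, t_std = ols(zscore(X), zscore(y))   # standardized fit

# Standardized slopes derived after the fact from the raw slopes
beta_from_b = b[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)

print("unstandardized slopes:", b[1:])
print("standardized slopes:  ", beta[1:])
print("derived from raw fit: ", beta_from_b)
print("slope t-values (raw): ", t_raw[1:])
print("slope t-values (std): ", t_std[1:])
```

In the standardized fit the intercept is zero up to rounding, because every variable has been centered.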

Advantages

Advocates of standardized coefficients note that the coefficients do not depend on the independent variables' units of measurement, which makes comparisons easy.

Disadvantages

Critics voice concerns that such a standardization can be misleading. Since standardizing a variable removes the unit of measurement from its value, a standardized coefficient for a given relationship only represents its strength relative to the variation in the distributions. This invites bias due to sampling error when one standardizes variables using means and standard deviations based on small samples. Furthermore, a change of one standard deviation in one variable is only equivalent to a change of one standard deviation in another predictor insofar as the shapes of the two variables' distributions resemble one another. The meaning of a standard deviation may vary markedly between non-normal distributions (e.g., when skewed or otherwise asymmetrical). This underscores the importance of normality assumptions in parametric statistics, and poses an additional problem when interpreting standardized coefficient estimates that even nonparametric regression does not solve when dealing with non-normal distributions.


ReneeBK

2014-4-10 05:09:33

I'm curious about why you want standardized coefficients. If it is to (try to) judge the relative importance of predictor variables, you might find the following excerpt from John Fox's book interesting.

In his book "Applied Regression Analysis and Generalized Linear Models" (2008, Sage), John Fox is very cautious about the use of standardized regression coefficients. He gives this interesting example. When two variables are measured on the same scale (e.g., years of education and years of employment), the relative impact of the two can be compared directly. But suppose those two variables differ substantially in the amount of spread. In that case, comparison of the standardized regression coefficients would likely yield a very different story than comparison of the raw regression coefficients. Fox then says:
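
A quick simulation shows the kind of situation being described. This is my own sketch with made-up numbers, not data from Fox's book: both predictors are in years and have the same true slope of 1.0, but employment has five times the spread of education.

```python
import numpy as np

# Simulated data; every number here is an assumption for illustration
rng = np.random.default_rng(1)
n = 5_000
education = rng.normal(12, 2, n)      # years, standard deviation 2
employment = rng.normal(15, 10, n)    # years, standard deviation 10
income = 1.0 * education + 1.0 * employment + rng.normal(0, 5, n)

def slopes(X, y):
    """Least-squares slopes (intercept fitted but dropped from the result)."""
    A = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0][1:]

X = np.column_stack([education, employment])
b = slopes(X, income)                                   # raw slopes
beta = b * X.std(axis=0, ddof=1) / income.std(ddof=1)   # standardized slopes

print("raw slopes (education, employment):         ", b)
print("standardized slopes (education, employment):", beta)
```

The raw slopes come out close to each other, while the standardized slope for employment is several times larger than the one for education, purely because of the difference in spread.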

"If expressing coefficients relative to a measure of spread potentially distorts their comparison when two explanatory variables are commensurable [i.e., measured on the same scale], then why should the procedure magically allow us to compare coefficients [for variables] that are measured in different units?" (p. 95)

Good question!

A page later, Fox adds the following:

"A common misuse of standardized coefficients is to employ them to make comparisons of the effects of the same explanatory variable in two or more samples drawn from different populations.  If the explanatory variable in question has different spreads in these samples, then spurious differences between coefficients may result, even when _unstandardized_ coefficients are similar; on the other hand, differences in unstandardized coefficients can be masked by compensating differences in dispersion." (p. 96)

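The cross-sample problem can be sketched the same way. Again this is a simulation with numbers I made up: the same true slope (2.0) and the same noise level in two samples, where the only difference is the spread of the explanatory variable.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000

def fit(x_sd):
    """Simulate one sample with the given predictor spread.
    Returns (unstandardized slope, standardized slope)."""
    x = rng.normal(0, x_sd, n)
    y = 2.0 * x + rng.normal(0, 4, n)   # same true slope in every sample
    b = np.polyfit(x, y, 1)[0]          # degree-1 fit, slope comes first
    return b, b * x.std(ddof=1) / y.std(ddof=1)

b_a, beta_a = fit(x_sd=1.0)   # sample A: narrow spread in x
b_b, beta_b = fit(x_sd=3.0)   # sample B: wide spread in x

print("unstandardized slopes:", b_a, b_b)
print("standardized slopes:  ", beta_a, beta_b)
```

The unstandardized slopes are similar in both samples, but the standardized slope is clearly larger in sample B, which is the spurious difference the quote warns about.
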
And finally, this comment on whether or not Y has to be standardized:

"The usual practice standardizes the response variable as well, but this is an inessential element of the computation of standardized coefficients, because the _relative_ size of the slope coefficients does not change when Y is rescaled." (p. 95)

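The last point follows from one line of algebra. The notation is mine, not Fox's: b_j is the unstandardized slope of predictor j, s_j its standard deviation, s_y the standard deviation of the response, and beta_j the standardized slope.

```latex
\[
\beta_j = b_j \,\frac{s_j}{s_y}
\qquad\Longrightarrow\qquad
\frac{\beta_j}{\beta_k} = \frac{b_j\, s_j}{b_k\, s_k}
\]
```

The s_y term cancels in the ratio, so rescaling Y multiplies every slope by the same constant and leaves their relative sizes unchanged.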