I’m running a linear regression with cluster-robust standard errors and I have the following conceptual problem:
I have five regressors, of which four are statistically significant, while the remaining regressor is not. When I add $K$ dummy variables to the model in order to control for effects not captured by the $5$ initial explanatory variables, I see that:
 Some dummy variables were statistically significant
 The regressor that initially was not significant becomes significant.
What is the reason for the second result? What does it mean?
Answer
What you have described is a classic example of the phenomenon “confounding.” For the sake of argument, suppose you want to know what factors affect the price of a car, and the original model you fitted was:
$Price_i = \beta_0 + \beta_1 MPG_i + \beta_2 Weight_i + \beta_3 Length_i + \beta_4 GearRatio_i + \varepsilon_i$
where $MPG$ is how many miles per gallon the car gets.
The regression results are as follows:
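      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  4,    69) =   10.93
       Model |   246385405     4  61596351.2           Prob > F      =  0.0000
    Residual |   388679991    69  5633043.35           R-squared     =  0.3880
-------------+------------------------------           Adj R-squared =  0.3525
       Total |   635065396    73  8699525.97           Root MSE      =  2373.4

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |   -90.8697   82.54167    -1.10   0.275    -255.5358    73.79643
      weight |   5.330082   1.259779     4.23   0.000     2.816892    7.843272
      length |  -112.6501   39.26864    -2.87   0.005    -190.9889   -34.31134
  gear_ratio |   1747.338   940.8806     1.86   0.068    -129.6674    3624.343
       _cons |   7909.196   6803.245     1.16   0.249    -5662.907     21481.3
------------------------------------------------------------------------------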
$Weight$ and $Length$ are significantly associated with price at the 5% level, whereas $GearRatio$ is significant at the 10% level. In this example, I will use 10% as the significance level, as is often done in econometrics, instead of the 5% customary in statistics/biostatistics.
Now suppose you realize that the country of origin of the car might have something to do with the price, so you enter “Country of origin” ($Country$)–a variable with 4 categories: 1. USA, 2. Japan, 3. Germany, and 4. France/Italy–into your model as dummy variables with “USA” as the reference/omitted category. The resulting model is as follows:
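      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  7,    66) =    7.05
       Model |   271664993     7  38809284.6           Prob > F      =  0.0000
    Residual |   363400404    66  5506066.72           R-squared     =  0.4278
-------------+------------------------------           Adj R-squared =  0.3671
       Total |   635065396    73  8699525.97           Root MSE      =  2346.5

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -43.63664   88.87729    -0.49   0.625    -221.0859    133.8126
      weight |   5.627906   1.277128     4.41   0.000     3.078037    8.177775
      length |  -108.6306   40.96925    -2.65   0.010    -190.4283   -26.83285
  gear_ratio |   1036.988   1011.416     1.03   0.309     -982.369    3056.344
             |
     country |
     Germany |   1474.478   786.7092     1.87   0.065    -96.23774    3045.193
       Japan |   1508.771   931.8605     1.62   0.110    -351.7485    3369.291
France/Italy |   1513.169   1660.423     0.91   0.365    -1801.972    4828.311
             |
       _cons |   6825.621   6936.845     0.98   0.329    -7024.236    20675.48
------------------------------------------------------------------------------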
When we added $Country$ to the model, $GearRatio$ was no longer significant at the 10% level and $MPG$ moved even further from significance (p was 0.28 in the original model and became 0.63 after adding $Country$). We also note that the only significant category of $Country$ was $Germany$.
How do we interpret these results?
 Recall that dummy variables are entered into the model as a set of $(N-1)$ dummy variables, where $N$ is the number of categories in the original variable. Recall also that dummies are interpreted relative to the excluded (reference) category. It is therefore normal for some dummy variables not to be significant in the model if the difference between that category and the reference category is not significant. In our example, German cars are on average USD 1,474.48 more expensive than American cars, holding the other regressors constant, whereas Japanese and French/Italian cars are both not significantly different from American cars in terms of $Price$. If you want to know whether the effect of the construct you entered as dummy variables is significant or not, you will need to do an F-test of the joint significance of your dummies, because the p-value given in the model only tells you whether the given category differs from the reference category, not whether $Country$ as a whole is significantly associated with $Price$:
test Germany Japan FranceItaly
( 1) Germany = 0
( 2) Japan = 0
( 3) FranceItaly = 0
F( 3, 66) = 1.53
Prob > F = 0.2148
It turns out $Country$ as a whole is not a significant predictor of price (p=0.21), although German cars are significantly more expensive than American cars in this model.
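As a rough check, the joint F statistic above can be reproduced from the residual sums of squares in the two regression tables, treating the model without $Country$ as the restricted model. A minimal Python sketch (the function name `joint_f` is made up for illustration; the numbers are copied from the Stata output):

```python
def joint_f(rss_restricted, rss_full, n_restrictions, df_resid_full):
    """F statistic for testing that a set of coefficients is jointly zero."""
    # Increase in residual sum of squares per dropped coefficient
    numerator = (rss_restricted - rss_full) / n_restrictions
    # Residual variance estimate from the full (unrestricted) model
    denominator = rss_full / df_resid_full
    return numerator / denominator

# Residual SS without country (69 df) and with the 3 country dummies (66 df)
f_stat = joint_f(388679991, 363400404, n_restrictions=3, df_resid_full=66)
print(round(f_stat, 2))  # 1.53
```

Comparing this statistic with an F(3, 66) distribution gives the p-value reported by `test`.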
 We also noted that a variable that was significant ($GearRatio$, at the 10% level) became non-significant after adding $Country$. This means that in the model that omitted $Country$, the parameter estimate for $GearRatio$ “absorbed” part of the effect of $Country$. That is, $Country$ is associated with both $GearRatio$ and $Price$, and failing to control for $Country$ biased the parameter estimate of $GearRatio$, making it look more significant than it really is. The “significant” effect of $GearRatio$ on $Price$ in the original model was largely reflecting the effect of $Country$ on $Price$; once $Country$ is controlled for, there is no clear evidence that $GearRatio$ is related to the $Price$ of a car.
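This “absorbing” is usually formalized as omitted-variable bias. A sketch for the simplest case, with one included regressor $x$ and one omitted regressor $z$ (the symbols below are notation introduced here for illustration, not taken from the output above):

$$\text{long regression: } y_i = \hat\beta_0 + \hat\beta_1 x_i + \hat\beta_2 z_i + \hat\varepsilon_i, \qquad \text{auxiliary regression: } z_i = \hat\delta_0 + \hat\delta_1 x_i + \hat v_i$$

$$\text{short regression of } y \text{ on } x \text{ alone: } \quad \tilde\beta_1 = \hat\beta_1 + \hat\beta_2 \, \hat\delta_1$$

The slope in the short regression therefore equals the slope from the full model plus the omitted variable's coefficient times its association with the included regressor. The extra term vanishes only if $z$ has no partial effect on $y$ ($\hat\beta_2 = 0$) or is uncorrelated with $x$ ($\hat\delta_1 = 0$).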
Of course, the reverse can be true too: something that was not significant can become significant after adding variables to the model. The logic is the same. The originally non-significant variable was associated with the omitted variable, so its coefficient reflected the effect of the omitted variable in addition to its own effect (plus some other unobservables, which we will ignore for the sake of argument). When you add the omitted variable (the dummies) to the model, the originally non-significant variable no longer picks up the partial effect of the omitted variable and instead reflects the “true” effect of that variable, which, it turns out, is significantly associated with the outcome.
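A small simulation can make this concrete. The sketch below uses made-up data, not the auto dataset: `x` has a true coefficient of 1, while the omitted variable `z` is positively correlated with `x` and has a coefficient of -2, so the two effects cancel when `z` is left out. All names and parameter values are assumptions chosen for illustration, and the standard errors are the classical (non-robust) ones, computed with the standard library only to keep the example self-contained.

```python
import random
from statistics import fmean

def center(v):
    """Subtract the mean (equivalent to partialling out the intercept)."""
    m = fmean(v)
    return [a - m for a in v]

def residualize(v, z):
    """Residuals from regressing v on z plus an intercept."""
    vc, zc = center(v), center(z)
    g = sum(a * b for a, b in zip(zc, vc)) / sum(a * a for a in zc)
    return [b - g * a for a, b in zip(zc, vc)]

def slope_and_t(x, y, n_params):
    """OLS slope of y on x (both already partialled out) and its t statistic.

    n_params is the number of coefficients in the full regression,
    used for the residual degrees of freedom.
    """
    sxx = sum(a * a for a in x)
    b = sum(a * c for a, c in zip(x, y)) / sxx
    rss = sum((c - b * a) ** 2 for a, c in zip(x, y))
    se = (rss / (len(x) - n_params) / sxx) ** 0.5
    return b, b / se

random.seed(0)
n = 2000
z = [random.gauss(0, 1) for _ in range(n)]           # omitted variable
x = [zi + random.gauss(0, 1) for zi in z]            # correlated with z
y = [1.0 * xi - 2.0 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]

# Short regression: y on x and an intercept (z omitted)
short_b, short_t = slope_and_t(center(x), center(y), n_params=2)
# Long regression: y on x, z and an intercept, via the Frisch-Waugh-Lovell theorem
long_b, long_t = slope_and_t(residualize(x, z), residualize(y, z), n_params=3)

print(f"without z: slope = {short_b:.3f}, t = {short_t:.2f}")
print(f"with z:    slope = {long_b:.3f}, t = {long_t:.2f}")
```

With these settings the slope without `z` should sit close to 0, while the slope with `z` included should be close to 1 with a large t statistic; the exact numbers depend on the seed.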
(Data: Stata built-in dataset “1978 Automobile Data” from http://www.stata-press.com/data/r13/auto.dta)
Attribution
Source: Link, Question Author: Luca Dibo, Answer Author: Marquis de Carabas