Weights and weighting in STATA

by lem, Thu 26 Aug 2010 - 15:03

I posted this question in the general questions forum. Perhaps I should have posted it here.
Hello everyone. I have a problem. I am working with an INSEE demographic survey (TEO). The survey provides weights (weight) as well as normalized weights (scaled to the sample size).
When I estimate a PROBIT model in STATA, I use the FWEIGHT option. My sample of 8,000 individuals thus becomes 4,400,000 individuals. So far so good. The problem is that the coefficients become HYPER significant (39 coefficients, all significant at the 1% level or better). I think the variance shrinks with the weights, making any parameter significant. Is this an error in STATA? Can these results be trusted? The paper by Laurent Davezies and Xavier D'Haultfoeuille, "Faut-il pondérer ? ...Ou l'éternelle question de l'économètre confronté à un problème de sondage" (CREST, June 2009), makes me doubt the quality of the fit.
By contrast, if I use the PWEIGHT option, the coefficient values stay the same (as with FWEIGHT), but the significance of the parameters "seems" more normal (much to my regret, since some of the explanatory variables become non-significant...).
How can I make sure the procedure is correct?
Thanks for your help.

by c@ssoulet, Fri 27 Aug 2010 - 9:31

This should help clarify things...

Choosing the Correct Weight Syntax

One of the most common mistakes made when analyzing data from sample surveys is specifying an incorrect type of weight for the sampling weights. Only one of the four weight keywords provided by Stata, pweight, is correct to use for sampling weights. The purpose of each type of weight follows.

Sampling or Probability weights: pweight
Stata has a special weight, pweight, to specify probability weights. Probability weights are another name for sampling weights. The pweight command causes Stata to use the sampling weight as the number of subjects in the population that each observation represents when computing estimates such as proportions, means and regression parameters. A robust variance estimation technique will automatically be used to adjust for the design characteristics so that variances, standard errors and confidence intervals are correct.
If pweight = 100, the observation had a probability of 1/100 of being included in the sample.

Frequency Weights: fweight
Frequency weights are integers that indicate the number of times the observation was actually observed. They are used when your data set has been collapsed and contains a variable that tells how many times each record occurred.
For example, if fweight = 127, there are 127 observations with the same values. If no weight is specified, fweight = 1.
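
A minimal sketch of what a frequency weight means, written in Python rather than Stata. The data and variable names are made up for illustration: a record with frequency f is treated exactly as f identical rows, so the sample size the software "sees" is the sum of the weights.

```python
# Hypothetical collapsed data set: (value, frequency weight) pairs.
collapsed = [(0, 3), (1, 5), (2, 2)]

# Expand each record into `freq` identical rows, which is what a
# frequency weight claims was actually observed.
expanded = [value for value, freq in collapsed for _ in range(freq)]

# Mean computed with frequency weights on the collapsed data.
weighted_mean = sum(v * f for v, f in collapsed) / sum(f for _, f in collapsed)

# Mean computed on the expanded (row-per-observation) data.
expanded_mean = sum(expanded) / len(expanded)

# Number of observations implied by the frequency weights.
n_implied = sum(f for _, f in collapsed)

print(weighted_mean, expanded_mean, n_implied)
```

The two means agree, and the implied number of observations is the sum of the weights, not the number of records.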

Analytic Weights: aweight
Typically the observations are means, and aweight = the number of elements that went into each mean.

Analytic weights are used when you want to compute a linear regression on data that are observed means.
Do not use aweights to specify sampling weights. This is because the formulas that use aweights assume that larger weights designate more accurately measured observations. Conversely, one observation from a sample survey is no more accurately measured than any other observation. Hence, using the aweight command to specify sampling weights will cause Stata to estimate incorrect values of the variance and standard errors of estimates, and p-values for hypothesis tests.

Importance Weights: iweight
Stata has a special weight command, iweight, which can be used by programmers who need to implement their own analytical techniques by using some of the available estimation commands. Special care should be taken when using importance weights to understand how they are used in the formulas for estimates and variance. This information is available in the Methods and Formulas section in the Stata manual for each estimation command. In general, these formulas will be incorrect for computing the variance for data from a sample survey.

by lem, Fri 27 Aug 2010 - 9:44

Thank you for taking the time to reply.
My problem is not choosing the right weight type (indeed, for my work only fweight or pweight should/can be used).
My problem is that the standard errors of the coefficients become so small that every variable used is significant when fweight is used. I suspect that STATA is not computing the variance correctly, because it shrinks too much. In other words, all my variables are hyper significant when I use fweight. I will shortly post a comparison of the equations.
Thanks again for your willingness to help.
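
To illustrate why the standard errors shrink so sharply under fweight, here is a rough sketch in Python (not Stata). It uses a weighted mean instead of a probit, simulated 0/1 data, equal weights of 550 (4,400,000 / 8,000), and no strata or clusters; all of these are simplifying assumptions, not details from the survey. The two formulas are the textbook frequency-weighted variance and the linearization (robust) variance; that Stata's output reduces to them in this simplified case is also an assumption.

```python
import math

n = 8000                       # number of sampled individuals
weight = 550                   # assumed equal weight: 4,400,000 / 8,000
y = [i % 2 for i in range(n)]  # simulated binary outcome (illustrative only)
w = [weight] * n

total_w = sum(w)               # 4,400,000: the "sample size" under fweight
mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w

# Frequency-weight view: every record counts as `weight` real observations,
# so the variance of the mean is divided by the sum of the weights.
s2_fw = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, y)) / (total_w - 1)
se_fweight = math.sqrt(s2_fw / total_w)

# Sampling-weight view: linearization variance based on the n sampled
# units only (no strata, no clusters assumed).
z = [wi * (yi - mean) / total_w for wi, yi in zip(w, y)]
se_pweight = math.sqrt(n / (n - 1) * sum(zi ** 2 for zi in z))

ratio = se_pweight / se_fweight
print(mean, se_fweight, se_pweight, ratio)
```

In this sketch the point estimate is the same under both views, while the standard error under the frequency interpretation is smaller by a factor close to sqrt(550), about 23.45. Test statistics inflated by a factor of that order would make almost any coefficient look significant, which is consistent with identical coefficients but very different significance levels.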
