Thursday, December 6, 2012

Stata tip: Quickly, and in one command, rename all variable labels of variables generated with the 'xi' command to reflect the value labels of the xi'd variable

When you use the xi command on categorical variables, even those that have a value label attached, the dummy variables it creates are labeled with the numeric category value rather than the text of the value label:

. xi: svy, subpop(rural): reg logpccd rural_regressors refrigerator car livingroomsper numchild11 numchild11_sq i.oblast i.typeofdwell i.roof i.coldwatermeterinstalled i.soc_

. d

_Ioblast_3      byte   %8.0g                  oblast==3
_Ioblast_4      byte   %8.0g                  oblast==4
_Ioblast_5      byte   %8.0g                  oblast==5
_Ioblast_6      byte   %8.0g                  oblast==6
_Ioblast_7      byte   %8.0g                  oblast==7
_Ioblast_8      byte   %8.0g                  oblast==8
_Ioblast_11     byte   %8.0g                  oblast==11
_Itypeofdwe_2   byte   %8.0g                  typeofdwelling==2
_Itypeofdwe_3   byte   %8.0g                  typeofdwelling==3
_Itypeofdwe_4   byte   %8.0g                  typeofdwelling==4
_Itypeofdwe_5   byte   %8.0g                  typeofdwelling==5
_Itypeofdwe_6   byte   %8.0g                  typeofdwelling==6
_Itypeofdwe_7   byte   %8.0g                  typeofdwelling==7
_Itypeofdwe_8   byte   %8.0g                  typeofdwelling==8
_Itypeofdwe_9   byte   %8.0g                  typeofdwelling==9


These variable labels are not much more informative than the variable names. Below is a program that automatically rewrites the variable labels of the dummy variables created by the xi command so that they include the corresponding value label, as follows:

_Ioblast_3      byte   %8.0g                  oblast=Jalalabat
_Ioblast_4      byte   %8.0g                  oblast=Naryn
_Ioblast_5      byte   %8.0g                  oblast=Batken
_Ioblast_6      byte   %8.0g                  oblast=Osh
_Ioblast_7      byte   %8.0g                  oblast=City of Osh
_Ioblast_8      byte   %8.0g                  oblast=Chui
_Ioblast_11     byte   %8.0g                  oblast=City of Bishkek
_Itypeofdwe_2   byte   %8.0g                  typeofdwelling=Apartment or room in a residential hotel
_Itypeofdwe_3   byte   %8.0g                  typeofdwelling=Separate house
_Itypeofdwe_4   byte   %8.0g                  typeofdwelling=Part of a house
_Itypeofdwe_5   byte   %8.0g                  typeofdwelling=Dormitory
_Itypeofdwe_6   byte   %8.0g                  typeofdwelling=Lodge or a tied cottage (temporary tenure dwelling)
_Itypeofdwe_7   byte   %8.0g                  typeofdwelling=Other non-residential premises used for residence
_Itypeofdwe_8   byte   %8.0g                  typeofdwelling=Other residential premises
_Itypeofdwe_9   byte   %8.0g                  typeofdwelling=Barracks
 

There are actually two programs - mine is a wrapper for a program that Nicholas J. Cox wrote. Both of these are below.

Usage (run this after the xi command):

. varsformyrelabel
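
As a hedged end-to-end illustration, here is a minimal sketch using the auto dataset that ships with Stata. It assumes both programs listed below have already been defined in the current session. The value label name `repair` and its texts are made up for this example, because rep78 has no value label of its own.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* Hypothetical value label for rep78 (invented for this illustration)
label define repair 1 "Poor" 2 "Fair" 3 "Average" 4 "Good" 5 "Excellent"
label values rep78 repair

* xi creates the _I* dummy variables, labeled like rep78==2
xi: regress price mpg i.rep78 i.foreign
describe _I*

* Relabel the dummies using the value labels, then inspect them again
varsformyrelabel
describe _I*
```

After the second describe, the labels should read along the lines of rep78=Fair and foreign=Foreign instead of rep78==2 and foreign==1.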

Programs:

program define varsformyrelabel

    // Written by Shafique Jamal (shafique.jamal@gmail.com), 12-07-2012
    // UPDATE 12-07-2012: Need to change how the variable name for the list of `allunxidvariables' is determined. Need to get it from the variable label, rather than the variable name
    //

    // Get list of variables that were xi'd
    local xivars : char _dta[__xi__Vars__To__Drop__]
    // di `"xivars:`xivars'"'
   
    // Now just need to get list of un-xi'd variables from this list
    // Here is the first one
    local currentdummyvar : word 1 of `xivars'
    // di `"currentdummyvar:`currentdummyvar'"'
   
    // This will get the full variable name
    local currentunxidvar = regexr("`: variable label `currentdummyvar''","==.*$","")
    // di `"currentunxidvar:`currentunxidvar'"'
    local allunxidvars "`currentunxidvar'"
    // di `"allunxidvars:`allunxidvars'"'
   
    // This will get the _I`var' name, without the _# suffix - I need this for the first argument to the myrelabel routine. Variable name gets shortened
    local currentunxidvarwith_I = regexr("`currentdummyvar'","_[0-9]+$","")
    // di `"currentunxidvarwith_I:`currentunxidvarwith_I'"'
    local allunxidvarswith_I "`currentunxidvarwith_I'"
    // di `"allunxidvarswith_I:`allunxidvarswith_I'"'
   
    // Now loop through the rest
    local count = 0
    foreach var of local xivars {
        local count = `count' + 1
        if (`count' != 1) {
            local w : word `count' of `xivars'
            // di "w: `w'"
           
            // check whether the next xi'd var is related to the current one
            // if (regexm("`w'","^_I`currentunxidvar'_[0-9]+$")) { // yes, this is part of the same family as the current _I.... variable under consideration
            if (regexm("`: variable label `w''","^`currentunxidvar'==.*$")) { // yes, this is part of the same family as the current _I.... variable under consideration
                // di "skip"
            }
            else { // no, it is different. add to the list
                // this gets the full variable name
                local currentunxidvar = regexr("`: variable label `w''","==.*$","")
                // di `"currentunxidvar:`currentunxidvar'"'
                local allunxidvars "`allunxidvars' `currentunxidvar'"
                // di `"allunxidvars:`allunxidvars'"'
               
                // This gets the _Ivar name
                local currentunxidvarwith_I = regexr("`w'","_[0-9]+$","")
                // di `"currentunxidvarwith_I:`currentunxidvarwith_I'"'
                local allunxidvarswith_I "`allunxidvarswith_I'  `currentunxidvarwith_I'"
                // di `"allunxidvarswith_I:`allunxidvarswith_I'"'
            }       
        }
    }

    di "allunxidvars: `allunxidvars'"
    di `"allunxidvarswith_I:`allunxidvarswith_I'"'
    local count = 0   
    foreach var of local allunxidvars {
        local count = `count' + 1
        local varwith_I : word `count' of `allunxidvarswith_I'
        myrelabel `varwith_I'_* `var'
    }
   
end


program def myrelabel
*! NJC 1.0.0 15 July 2003
    version 7
    syntax varlist(numeric)

    tokenize `varlist'
    local nvars : word count `varlist'
    local last ``nvars''
    local vallabel : value label `last'
    if "`vallabel'" == "" {
        di as err "`last' not labelled"
        exit 498
    }

    local `nvars'
    local varlist "`*'"

    foreach v of local varlist {
        local varlabel : variable label `v'
        local eqs = index(`"`varlabel'"', "==")
        if `eqs' {
            local value = real(substr(`"`varlabel'"', `eqs' + 2, .))
            if `value' < . {
                local label : label `vallabel' `value'
                label var `v' `"`last'=`label'"'
            }
        }
    }

end
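
To make clear what the two programs automate, here is a rough sketch of the same relabeling done by hand for a single dummy variable. It assumes the oblast example from the top of this post, where _Ioblast_3 carries the label oblast==3. The local macro names `vallabel` and `text` are just illustrative.

```stata
* Name of the value label attached to the original variable
local vallabel : value label oblast

* Text that this value label assigns to the value 3
local text : label `vallabel' 3

* Replace the xi-generated variable label "oblast==3"
label variable _Ioblast_3 "oblast=`text'"
```

The wrapper works out the original variable name from each dummy's variable label, and myrelabel repeats these three steps for every dummy in the family.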

