Monday, May 14, 2012

Stata tip: Plotting the coefficients estimated from a regression (bar graph in Stata)

Suppose you want to make a bar chart of the coefficients (betas) that the regression (reg) command leaves in the ereturn list. This is useful when you want to visualize the relative weight each coefficient carries in your estimate. For example, suppose you want to predict consumption from three assets (car, satellite dish, generator) plus household size, e.g. if you are working on a Proxy Means Test (PMT) formula. Assume the first three are dummy/binary indicators.

The coefficients estimated from the regression give you an indication of how important each factor is. For example, if the coefficients are, respectively, +5, +1, +3 and -15, then you know that household size dominates the calculation: one additional member reduces predicted consumption by 15, which is more than owning all three assets adds (5 + 1 + 3 = 9).
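To make that comparison concrete, here is a rough sketch on simulated data. Everything in it is made up for illustration: the variable names, the sample size, the seed, and the "true" coefficients (chosen to match the +5, +1, +3, -15 example above). It also assumes a Stata version that has runiform() and rnormal().

```stata
* Hypothetical simulated data; names and coefficients are illustrative only
clear
set seed 12345
set obs 1000
gen car       = runiform() < 0.3
gen dish      = runiform() < 0.5
gen generator = runiform() < 0.2
gen hhsize    = 1 + floor(6*runiform())
gen consumption = 100 + 5*car + 1*dish + 3*generator - 15*hhsize + rnormal()

regress consumption car dish generator hhsize

* Compare the combined asset effect with the effect of one extra member
display "All three assets together: " _b[car] + _b[dish] + _b[generator]
display "One additional member:     " _b[hhsize]
```

The estimated coefficients should land close to the values used to build consumption, so the household size coefficient should outweigh the three asset coefficients combined.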

Here is some code for a program (.ado file) that you can call after running the reg command. It creates a dataset with one variable for each variable in the regression (including the constant) and a single observation holding the coefficients. There is more text after the code below. Yes, I know this code is horribly inefficient; I just wanted something quick, which means I got something quick and dirty:

program define dataset_coefficients
   
    syntax , Filename(string) [Separator(string)]
    version 9.1
    if ("`separator'" == "") {
        local separator  ","
    }
    // Get the names of the variables to write out. An omitted dummy shows up as "o._I..." in e(b); the "o." prefix is stripped below so the name can be used in the macro that holds the variable label
    local varnames : coln e(b)
    local coefs ""
    foreach varn of local varnames {
        local coef = _coef[`varn']
        local coefs "`coefs' `coef'"
        local varn1 = regexr("`varn'","o._I","_I")
        if ("`varn'" != "_cons") {
            local varlab_`varn1' : variable label `varn1'
        }
        else {
            local varlab_constant "constant"
        }
    }
    preserve
    drop *
    // Generate the new variable names, and apply the labels
    local variablenamestoplot ""
    foreach varn of local varnames {
        local varn1 = regexr("`varn'","o._I","_I")
        if ("`varn'" != "_cons") {
            gen `varn1' = .
            label var `varn1' `"`varlab_`varn1''"'
            local variablenamestoplot "`variablenamestoplot' `varn1'"
        }
        else {
            gen constant = .
            label var constant "constant"
            local variablenamestoplot "`variablenamestoplot' constant"
        }
    }
    // Apply the values to the variables as observations
    set obs 1
    local count = 0
    foreach varn of local varnames {
        local count = `count' + 1
        local coef1 : word `count' of `coefs'
        local varn1 = regexr("`varn'","o._I","_I")
        if ("`varn'" != "_cons") {
            // Not the constant: store the coefficient in the variable of the same name
            replace `varn1' = `coef1' in 1
        }
        else {
            replace constant = `coef1' in 1
        }
    }
    cap drop __*
    // global variablenamestoplot "`variablenamestoplot'"
    // char [variablenamestoplot] "`variablenamestoplot'"
    notes : `variablenamestoplot'
    save "`filename'" , replace
    restore
end
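The key step in the program is reading the coefficient names from the column names of e(b) and then looking up each coefficient by name. A minimal sketch of just that step is below. The auto dataset that ships with Stata and the choice of regressors are my assumptions for illustration, not part of the original example.

```stata
* Illustrative only: auto.dta and these regressors are assumptions
sysuse auto, clear
regress price mpg weight foreign

* The column names of e(b) are the regressor names, with _cons last
local varnames : colnames e(b)
foreach varn of local varnames {
    display "`varn'" _col(15) _b[`varn']
}
```

With the program saved as dataset_coefficients.ado on your ado path, the call after reg would be something like: dataset_coefficients, filename("plotme.dta")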

After calling this (with plotme.dta as the filename, in this example), you can load the dataset and plot the coefficients on a bar graph using the following commands:

use plotme.dta, clear
// Build the list of variables to plot. I can't just use * because I get an error like "__00000 not found", and I don't want to plot the constant.
local listofvars ""
foreach var of varlist * {
        if ("`var'" != "constant") {
            local listofvars "`listofvars' `var'"
        }
}
graph bar (asis) `listofvars', blabel(name, pos(outside) orient(vertical)) legend(off) title("Coefficients")
graph export coef.png, replace

Let me know how this works for you.

3 comments:

Benjamin Josiah said...

Thanks for the code

So I saved it as randomname.ado in my ado directory, which is c:\ado, and then go into Stata and type "randomname" after my regression?

Is this what I am supposed to do? I tried it and it didn't work.

Shafique Jamal said...

Sorry for being so late (more than a year) with this reply - didn't see your comment until just now. In case this is useful for anyone I'm replying now.

So you need to put this file, named dataset_coefficients.ado, in your Stata personal ado directory, which you can find using the information at this link:

http://www.stata.com/support/faqs/programming/personal-ado-directory/
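You can also ask Stata directly where that directory is. A small sketch using standard Stata commands (the program name is the one from this post; whether it is found depends on where you saved the file):

```stata
* List Stata's system directories; the PERSONAL line is the personal ado directory
sysdir

* Show just the personal ado directory
display c(sysdir_personal)

* Check whether Stata can find the program (prints its location if found)
capture noisily which dataset_coefficients
```

If the last command reports that the command was not found, the .ado file is not in a directory that Stata searches.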