How to get y axis range in Stata

Suppose I am using some twoway graph command in Stata. Without any action on my part, Stata will choose reasonable ranges for both the y and x axes, based on the minimum and maximum y and x values in my data and on some algorithm that decides when it would be prettier for the range to extend to a number like '0' instead of '0.0139'. Wonderful! Great.

Now suppose that after (or while) I draw my graph, I want to slap some very important text onto it, and I want to be choosy about precisely where the text appears. Having the minimum and maximum values of the displayed axes would be useful: how can I get these min and max numbers? (Either before or while calling the graph command.)

NB: I am not asking how to set the y or x axis ranges.

Tupelo answered 3/2, 2022 at 19:38 Comment(4)
Good question. I don't think that's possible. The closest approximation might be to take the min and max of your variables. - Epirus
@Epirus Yes. My motivation in a current, very specific case is that Stata sometimes 'prettifies' range (and tick/label) choices, which makes the min and max of y and x unhelpful. :/ Aside: a good question, but no upvote!? Sheesh! Rough crowd. :D - Tupelo
It's a good question. When I have a related problem, I usually turn it round and decide on the range I want based on the empirical range and specific axis labels. - Moneybags
I do think it's possible, but my solution is very inefficient. - Nereid

Since this issue has been a bit of a headache for me for quite some time, and I believe there is no good solution out there yet, I wanted to write up two ways in which I was able to solve a problem similar to the one described in the post. Specifically, I used these to solve the issue of gray shading for part of the graph.

  1. Define a global macro in the code that generates the axis labels. This is the less elegant way to do it, but it works well. Locate the tickset_g.class file in your ado path; the graph twoway command uses it to draw the axes of any graph. There, in the draw program, I defined a global macro that takes the value of the omin and omax locals after they have been set to the minimum of the axis range and the data range (the command that does this is local omin = min(.scale.min,omin), and analogously for the max), since the latter sometimes exceeds the former. You could also define the global further up in that code block to get only the axis extent. You can then access the axis range through the globals after the graph command (and use something like addplot to add to the previously drawn graph). Two caveats for this approach. First, using global macros is, as far as I understand, bad practice and can be dangerous; I used names with the prefix userwritten that I was sure wouldn't appear in any other program. Second, depending on your organization's policies, you may not have the administrator privileges needed to alter this file. However, it is the simpler way. If you prefer a more elegant approach along the lines of what Nick Cox suggested, then you can:
  2. Use the undocumented gdi natscale command to define your own axis labels. The gdi commands are the internal commands that generate what you see as graph output (cf. https://www.stata.com/meeting/dcconf09/dc09_radyakin.pdf). The tickset_g.class uses the gdi natscale command to generate the nice numbers on the axes. Basic documentation is available with help _natscale. Basically, you enter the minimum and maximum (e.g. from a summarize return) and a suggested number of steps, and the command returns a min, max, and delta to be used in the x|ylabel option (there are several possible ways, all rather straightforward once you have those numbers, so I won't spell them out for brevity). You would have to adjust this approach if you use a scale transformation.
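
For concreteness, here is a minimal sketch of the second approach. It rests on several assumptions that are not from the answer itself: that _natscale takes three numeric arguments (min, max, suggested number of steps) and returns r(min), r(max), and r(delta), as help _natscale describes; that 5 is a sensible target number of ticks; and that the auto example dataset stands in for your data. Because _natscale is undocumented, its behavior may differ across Stata versions, and the limits it returns are not guaranteed to match what twoway would pick by default. The point is that you set the labels yourself, so you know the limits before drawing.

```stata
* Sketch only: assumes _natscale returns r(min), r(max), r(delta)
sysuse auto, clear

* Nice limits for the y variable, from its data range
quietly summarize mpg
local dmin = r(min)
local dmax = r(max)
_natscale `dmin' `dmax' 5      // 5 = assumed target number of ticks
local ymin   = r(min)
local ymax   = r(max)
local ydelta = r(delta)

* Same for the x variable
quietly summarize weight
local dmin = r(min)
local dmax = r(max)
_natscale `dmin' `dmax' 5
local xmin   = r(min)
local xmax   = r(max)
local xdelta = r(delta)

* The limits are now known before the graph is drawn, so text can be
* anchored to them (here: the top-left corner of the labeled range)
twoway scatter mpg weight,                                   ///
    ylabel(`ymin'(`ydelta')`ymax')                           ///
    xlabel(`xmin'(`xdelta')`xmax')                           ///
    text(`ymax' `xmin' "Very important text", placement(se))
```

If the data extend beyond the labeled range, the axis itself will still stretch to cover them, so compare the summarize results with the locals above before relying on them as the true axis extent.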

Hope this helps!

Ona answered 8/12, 2022 at 18:22 Comment(0)

I like Nick's suggestion, but if you're really determined, it seems that you can find these values by inspecting the output after you set trace on. Here's some inefficient code that seems to do exactly what you want. Three notes:

  1. When I import the log file, I get this message: Note: Unmatched quote while processing row XXXX; this can be due to a formatting problem in the file or because a quoted data element spans multiple lines. You should carefully inspect your data after importing. Consider using option bindquote(strict) if quoted data spans multiple lines or option bindquote(nobind) if quotes are not used for binding data.
  2. Sometimes the data fall outside of the min and max range values that are chosen for the graph's axis labels (but you can easily test for this).
  3. The log linesize is actually important to my code below because the key values must fall on the same line as the strings that I use to identify the helpful rows.
* start a log (critical step for my solution)
cap log close _all
set linesize 255
log using "log", replace text   


* make up some data: 
clear 
set obs 3
gen xvar = rnormal(0,10) 
gen yvar = rnormal(0,.01) 


* turn trace on, run the -twoway- call, and then turn trace off
set trace on 
twoway scatter yvar xvar 
set trace off
cap log close _all


* now read the log file in and find the desired info 
import delimited "log.log",  clear  
egen my_string = concat(v*)
keep if  regexm(my_string,"forvalues yf") | regexm(my_string,"forvalues xf")
drop if  regexm(my_string,"delta") 
split my_string, parse("=") gen(new)
gen     axis = "vertical"   if regexm(my_string,"yf")
replace axis = "horizontal" if regexm(my_string,"xf")
keep axis new*
duplicates drop 
loc my_regex = "(.*[0-9]+)\((.*[0-9]+)\)(.*[0-9]+)"
gen min     = regexs(1) if regexm(new3,"`my_regex'")
gen delta   = regexs(2) if regexm(new3,"`my_regex'")
gen max_temp= regexs(3) if regexm(new3,"`my_regex'")
* name max_temp explicitly: "max" only worked through variable abbreviation
destring min max_temp delta, replace
gen max     = min + delta* int((max_temp-min)/delta) 


*here is the info you want: 
list axis min delta max
Nereid answered 27/9, 2022 at 16:19 Comment(0)
