In ggplot, how to draw a circle/disk with a line that divides its area according to a given ratio and colored points inside?
I want to visualize proportions using points inside a circle. For example, let's say that I have 100 points that I wish to scatter (somewhat randomly jittered) in a circle.

[Image: 100 black-and-white points scattered inside a circle]

Next, I want to use this diagram to represent the proportion of people who voted Biden/Harris in the 2020 US presidential election, in each state.

Example #1 -- Michigan
Biden got 50.62% of Michigan's votes. I'm going to draw a horizontal diameter that splits the circle into two halves, and then color the points under the diameter in blue (the Democrats' color).

[Image: Michigan example]


Example #2 -- Wyoming
Unlike Michigan, in Wyoming Biden got only 26.55% of the votes, which is approximately a quarter of the vote. In this case I'd draw a horizontal chord that divides the circle such that the disk's area under the chord is 25% of the entire disk area. Then I'll color the respective points in that area in blue. Since I have 100 points in total, 25 points represent the 25% who voted Biden in Wyoming.

[Image: Wyoming example]


My question: How can I do this with ggplot? I researched this issue, and there's a lot of geometry going on here. First, the kind of area I'm talking about is called a "circular segment". Second, there are many formulas to calculate its area, if we know some other parameters about the shape (such as the radius length, etc.). See this nice demo.

However, my goal isn't to solve geometry problems, but just to represent proportions in a very specific way:

  1. draw a circle
  2. sprinkle X number of points inside
  3. draw a (real or invisible) horizontal line that divides the circle/disk area according to a given proportion
  4. ensure that the points are arranged respective to the split. That is, if we want to represent a 30%-70% split, then have 30% of the points under the line that divides the disk.
  5. color the points under the line.

I understand that this is a somewhat exotic visualization, but I'll be thankful for any help with this.


EDIT


I've found a reference to a JavaScript package that does something very similar to what I'm asking.

Science answered 31/7, 2021 at 20:23 Comment(2)
I understand that your goal isn't to solve geometry problems, but I think that's what you'll have to do in order to solve this problem ... what have you tried to get started? en.wikipedia.org/wiki/Circular_segment - Bowerman
I also think that this is a really poor way to visualize proportions. - Alveolus

I took a crack at this for fun. There's a lot more that could be done. I agree that this is not a great way to visualize proportions, but if it's engaging your audience ...

Formulas for determining appropriate heights are taken from Wikipedia. In particular we need the formulas

a/A = (theta - sin(theta))/(2*pi)
h = 1-cos(theta/2)

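where a is the area of the segment; A is the whole area of the circle; theta is the angle (in radians) subtended at the center by the arc that defines the segment (see Wikipedia for pictures); and h is the height of the segment. The formula for h as written assumes a circle of radius 1, which is what the code below uses.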
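For reference, here is a short sketch of where the two formulas come from, assuming a circle of radius R (a symbol not used in the text above; the code takes R = 1). The segment is the circular sector of angle theta minus the triangle formed by the chord and the two radii:

```latex
\begin{aligned}
a &= \underbrace{\tfrac{1}{2}R^{2}\theta}_{\text{sector}}
   - \underbrace{\tfrac{1}{2}R^{2}\sin\theta}_{\text{triangle}}
   = \tfrac{R^{2}}{2}\,(\theta - \sin\theta),
\qquad A = \pi R^{2} \\[4pt]
\frac{a}{A} &= \frac{\theta - \sin\theta}{2\pi} \\[4pt]
h &= R - R\cos\!\left(\tfrac{\theta}{2}\right)
   = R\left(1 - \cos\tfrac{\theta}{2}\right)
\end{aligned}
```

The ratio a/A does not depend on R, and for theta greater than pi the sine term is negative, so the same expression also covers segments larger than a half disk. There is no closed-form inverse for theta given a/A, which is why the code below solves for it numerically.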

Machinery for finding heights.

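## area fraction a/A of the segment as a function of the angle theta
afun <- function(x) (x-sin(x))/(2*pi)
## curve(afun, from=0, to = 2*pi)
## invert afun numerically: returns the angle theta whose segment has area fraction a
find_a <- function(a) {
    uniroot(
        function(x) afun(x) - a,
        interval=c(0, 2*pi))$root
}
## height of the unit-circle segment with area fraction a
find_h <- function(a) {
    1 - cos(find_a(a)/2)
}
vfind_h <- Vectorize(find_h)  ## so that a vector of fractions can be passed
## find_a(0.5)
## find_h(0.5)
## curve(vfind_h(x), from = 0, to= 1)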
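One way to sanity-check the root-finding is to plug the resulting heights back into the standard formula for the segment area in terms of h and confirm that the original fractions come back. The sketch below is self-contained; `seg_height` and `seg_fraction` are hypothetical names introduced here for illustration, and `seg_height` simply repeats the approach above with a tighter tolerance.

```r
## Hypothetical helper: height of the unit-circle segment with area fraction a
seg_height <- function(a) {
  theta <- uniroot(function(x) (x - sin(x)) / (2 * pi) - a,
                   interval = c(0, 2 * pi), tol = 1e-10)$root
  1 - cos(theta / 2)
}

## Hypothetical helper: area fraction of a unit-circle segment of height h,
## using area = acos(1 - h) - (1 - h) * sqrt(2 * h - h^2) for radius 1
seg_fraction <- function(h) {
  (acos(1 - h) - (1 - h) * sqrt(2 * h - h^2)) / pi
}

a <- c(0.1, 0.25, 0.5, 0.8)
h <- sapply(a, seg_height)

## the 'recovered' column should match 'a' up to numerical tolerance
print(round(cbind(a, h, recovered = seg_fraction(h)), 4))
```

For a = 0.5 the height should come out as 1 (the chord is a diameter), which is a quick check that the angle-to-height conversion is wired up correctly.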

Set up a circle.

dd <- data.frame(x=0,y=0,r=1)
library(ggforce)
library(ggplot2); theme_set(theme_void())
gg0 <- ggplot(dd) + geom_circle(aes(x0=x,y0=y,r=r)) + coord_fixed()

Finish: scatter the points and color them by segment.

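props <- c(0.2,0.5,0.3)  ## proportions
n <- 100                 ## number of points to scatter
cprop <- cumsum(props)[-length(props)]  ## cumulative area fractions at the dividing chords
h <- vfind_h(cprop)      ## segment heights, measured down from the top of the circle
set.seed(101)
r <- runif(n)
th <- runif(n, 0, 2 * pi)

## sqrt(r) spreads the points uniformly over the disk area;
## using r directly would cluster them around the center
dd2 <- data.frame(x = sqrt(r) * cos(th),
                  y = sqrt(r) * sin(th))
## cut() sorts its breaks, so the bands run from y = -1 up to y = 1
dd2$g <- cut(dd2$y, c(1, 1-h, -1))
gg0 + geom_point(data=dd2, aes(x, y, colour = g), size=3)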

There are a bunch of tweaks that would make this better: meaningful names for the categories; reversing the axis order to match the plot; maybe adding segments delimiting the sections, or (more work) polygons so you can shade the sections.
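As a rough sketch of the "segments delimiting the sections" tweak: each dividing chord sits at y = 1 - h and, on a unit circle, extends from -sqrt(1 - y^2) to +sqrt(1 - y^2). The helper name `seg_height` and the data frame name `chords` are made up for this example; the block recomputes the heights so that it runs on its own.

```r
## Hypothetical helper: height of the unit-circle segment with area fraction a
seg_height <- function(a) {
  theta <- uniroot(function(x) (x - sin(x)) / (2 * pi) - a,
                   interval = c(0, 2 * pi), tol = 1e-10)$root
  1 - cos(theta / 2)
}

props <- c(0.2, 0.5, 0.3)
cprop <- cumsum(props)[-length(props)]  # area fractions above each chord
h <- sapply(cprop, seg_height)

y <- 1 - h             # vertical position of each chord
half <- sqrt(1 - y^2)  # half-length of the chord at that height
chords <- data.frame(x = -half, xend = half, y = y, yend = y)
print(chords)

## With ggplot2 loaded and gg0 defined as above, the chords can be added with:
## gg0 + geom_segment(data = chords, aes(x = x, y = y, xend = xend, yend = yend))
```

Because the endpoints are computed from the circle equation, the lines stop exactly at the circle outline rather than spanning the whole panel, as a plain horizontal reference line would.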

You should definitely check this for mistakes — e.g. there are places where I may have used a set of values where I should have used their first differences, or vice versa (values vs cumulative sum). But this should get you started.
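With purely random scatter, the number of points per band only matches props on average. If exact counts are wanted, one possible approach (a sketch, not part of the code above) is to fix the count for each band first and then fill each band by rejection sampling: draw candidates in the band's bounding rectangle and keep those that fall inside the circle. The names `seg_height`, `sample_band`, `counts` and `dd3` are hypothetical, and the sketch assumes that `round(n * props)` sums to n, which holds for this example but not in general.

```r
## Hypothetical helper: height of the unit-circle segment with area fraction a
seg_height <- function(a) {
  theta <- uniroot(function(x) (x - sin(x)) / (2 * pi) - a,
                   interval = c(0, 2 * pi), tol = 1e-10)$root
  1 - cos(theta / 2)
}

## Hypothetical helper: k points uniform over the part of the unit disk
## with lo <= y <= hi, by rejection from the rectangle [-1, 1] x [lo, hi]
sample_band <- function(k, lo, hi) {
  pts <- matrix(numeric(0), ncol = 2)
  while (nrow(pts) < k) {
    x <- runif(2 * k + 10, -1, 1)
    y <- runif(2 * k + 10, lo, hi)
    keep <- x^2 + y^2 <= 1
    pts <- rbind(pts, cbind(x[keep], y[keep]))
  }
  pts[seq_len(k), , drop = FALSE]
}

set.seed(101)
props <- c(0.2, 0.5, 0.3)
n <- 100
counts <- round(n * props)  # assumed to sum to n
h <- sapply(cumsum(props)[-length(props)], seg_height)
breaks <- c(1, 1 - h, -1)   # band boundaries, from the top down

bands <- lapply(seq_along(props), function(i) {
  p <- sample_band(counts[i], lo = breaks[i + 1], hi = breaks[i])
  data.frame(x = p[, 1], y = p[, 2], g = rep(i, nrow(p)))
})
dd3 <- do.call(rbind, bands)
dd3$g <- factor(dd3$g)
print(table(dd3$g))
```

The resulting data frame can be plotted the same way as dd2, e.g. gg0 + geom_point(data = dd3, aes(x, y, colour = g), size = 3). Points are still uniformly scattered within each band, so the picture looks random while the counts are exact.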

[Image: circle with points representing proportions]

Bowerman answered 1/8, 2021 at 0:3 Comment(4)
Thanks @Ben, this is incredibly helpful. Trying to wrap my head around the code. May I ask for a hint: where should I modify the code to ensure that the number of points in each segment represents the respective proportion? For example, in the plot you generated, the top segment (blue) accounts for 20% of the disk area, yet there are only 8 blue points (rather than 20 out of a total of 100). If I wanted the number of points in each segment to reflect the proportions given in the props vector, how should I do that? - Science
I adapted the code from this answer to deal with the clustering around the center. As a side effect, the modified code also changed the number of points in each color/segment, bringing it closer to the desired proportions in props. Still, as can be seen here, this is not an accurate representation of props. The modified code: r <- runif(n); th <- runif(n, 0, 2 * pi); dd2 <- data.frame(x = sqrt(r) * cos(th), y = sqrt(r) * sin(th)) - Science
Please feel free to edit ... also, are the inaccuracies just binomial sampling inaccuracies? Do you get closer to the expected proportions with larger N? - Bowerman
Yes, I do get more accurate proportions with larger N... - Science
