The Bechdel Test
A Data Visualization about Women and the Top 50 Movies of 2014
The objective of this project is to use effective visual elements to communicate large quantities of data. I chose to focus on the topic of women in movies because I wanted to reveal how Hollywood's portrayal of women and their concerns is far from reality, and hopefully spark a discussion about how women could be better represented in the future.
I found some interesting facts about how women were portrayed in the top movies of 2014. For example, only 28.1% of movie characters were female, and nearly half of the movies failed the Bechdel Test.
As I delved into the data, I realized that I needed data that supported apples-to-apples comparisons. I decided to focus on the Bechdel Test because its results are available for each individual movie, whereas the other statistics were not tied to specific movies. Since nearly half of the top 100 movies of 2014 fail the Bechdel Test, I wanted to get an idea of how this might influence the people watching these movies. So, first I had to figure out who was watching the movies. I gathered data about domestic gross income to tell me which movies got the most views, about MPAA ratings to give me an idea of the age groups that viewed the movies, and about genres to tell me about the interests of those watching.
One of the first steps in visualizing the data set was to determine the scale of the information I wanted to use. I had data about the genres, gross domestic income, MPAA ratings, movie rank, and movie title for 100 movies. I needed to find a way to simplify the scale of the information to make it more digestible. Eventually I cut the set down to 50 movies, which decreased the range of the domestic gross income, and reduced the number of genres from 9 to 3. This made the information easier to organize.
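The simplification described above could be sketched in a few lines of Python. This is only an illustration: the records are placeholders rather than the real 2014 data, and the nine genre names and the three broader groups in GENRE_GROUPS are assumptions, since the actual grouping is not listed here.

```python
# Placeholder records; titles, ranks, and genres are NOT the real 2014 data.
movies = [
    {"rank": 1, "title": "Movie A", "genre": "Sci-Fi"},
    {"rank": 2, "title": "Movie B", "genre": "Romance"},
    {"rank": 51, "title": "Movie C", "genre": "Comedy"},
]

# Assumed mapping from nine detailed genres to three broad groups.
GENRE_GROUPS = {
    "Action": "Action/Adventure",
    "Adventure": "Action/Adventure",
    "Sci-Fi": "Action/Adventure",
    "Comedy": "Comedy/Family",
    "Animation": "Comedy/Family",
    "Family": "Comedy/Family",
    "Drama": "Drama",
    "Romance": "Drama",
    "Thriller": "Drama",
}


def simplify(records, top_n=50):
    """Keep the top_n ranked movies and collapse each genre into its group."""
    kept = [m for m in records if m["rank"] <= top_n]
    return [{**m, "genre": GENRE_GROUPS[m["genre"]]} for m in kept]


simplified = simplify(movies)
```

Cutting the list by rank also narrows the gross range automatically, because the lowest-grossing movies are the ones that drop out.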
I began plotting my data points to get an idea of what the data could look like and also to determine if there were any clear patterns.
Once I was able to get a basic visualization of the data, I found that:
- Most of the top movies are PG-13
- About half of the movies fail the Bechdel Test
- Most of the top R-rated films fail the Bechdel Test
- Few G-rated movies are top movies
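Patterns like the ones in this list can also be checked with a quick tally of Bechdel results per MPAA rating before plotting anything. The sketch below uses placeholder rows, not the real data, and the function name tally_by_rating is just an illustrative choice.

```python
from collections import Counter

# Placeholder rows of (MPAA rating, passes Bechdel Test); NOT the real data.
rows = [
    ("PG-13", True),
    ("PG-13", False),
    ("R", False),
    ("R", False),
    ("PG", True),
]


def tally_by_rating(records):
    """Count Bechdel Test passes and fails for each rating."""
    counts = Counter(records)
    ratings = sorted({rating for rating, _ in records})
    return {
        rating: {"pass": counts[(rating, True)], "fail": counts[(rating, False)]}
        for rating in ratings
    }


tally = tally_by_rating(rows)
```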
I also found that some design methods, although simple thus far, communicate better than others. Using color to indicate the pass/fail status of the Bechdel Test is powerful and clear. Using an axis to indicate gross is clear as well. Using opacity to indicate gross income, however, becomes unclear once it is removed from an axis, because it is difficult to differentiate subtle degrees of change.
Plotting the Data on a Coordinate System
According to Nathan Yau’s book, Data Points: Visualization That Means Something, the three types of coordinate systems are Cartesian, polar, and geographic. I experimented with the Cartesian and polar coordinate systems, trying to come up with a breadth of ideas to help me find the best fit for my information.
Through the process of elimination, I found that the best coordinate system for my project was the Cartesian system. My data is not cyclical and does not repeat itself like a circle does, so the polar coordinate system was unnecessary. My data also has little to do with geography, which eliminated the geographic coordinate system. Finally, as I began plotting my data, I naturally used the Cartesian method to understand it, and it worked because I was able to identify patterns in the data.
As I experimented with several different versions of the Cartesian coordinate system, I found that I always returned to using gross or rank as one axis and rating as the other. It is easy to understand these data points on an axis because each is a range (i.e. 1–100, $1–350 million, or G–R), which we learned to understand from using number lines in elementary school.
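Placing these ranges on axes comes down to mapping each value to a position. Below is a minimal sketch of that idea, assuming a continuous scale for gross and evenly spaced slots for the ratings. The pixel sizes, the 0 to 350 domain, and the helper names linear_scale and ordinal_scale are assumptions for illustration, not part of the original project files.

```python
def linear_scale(domain, range_):
    """Return a function mapping a value in domain linearly onto range_."""
    d0, d1 = domain
    r0, r1 = range_

    def scale(value):
        t = (value - d0) / (d1 - d0)  # 0.0 at the domain start, 1.0 at the end
        return r0 + t * (r1 - r0)

    return scale


def ordinal_scale(categories, range_):
    """Return a function placing ordered categories at even steps along range_."""
    r0, r1 = range_
    step = (r1 - r0) / (len(categories) - 1)

    def scale(category):
        return r0 + categories.index(category) * step

    return scale


RATINGS = ["G", "PG", "PG-13", "R"]

# Assumed axis sizes: gross in $ millions across 700 px, ratings across 300 px.
x = linear_scale((0, 350), (0, 700))
y = ordinal_scale(RATINGS, (0, 300))

position = (x(175), y("PG-13"))
```

Rank could use the same linear_scale with a domain of 1 to 100 in place of gross.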
I also often avoided the use of genre altogether. This is because I feel like genre tells me the least about the type of people who watch movies. I was more interested to understand the target audiences based on MPAA ratings.
On top of that, I ditched the idea of using the gender of the person who was listed first on the movie poster because I don’t know why some names are listed before others. If they were listed alphabetically, then the order wouldn’t be very relevant to the point I want to make about women and movies. I did find, though, that mostly male names were listed first.
Experimenting with Representation
Next I explored visual elements such as color, shape, line, scale, contrast, etc. to communicate the different types of information. I looked at ways to communicate genre, gender, rating, total gross, and the Bechdel test results.
According to Donald A. Norman’s book, Things That Make Us Smart: Defending Human Attributes In The Age Of The Machine, we come up with external cognitive artifacts to help us remember, think, and reason through information. These cognitive artifacts are considered the “representing world: a set of symbols, each standing for something— representing something— in the represented world.” Norman walks through symbols and methods used to help the brain comprehend information.
As I worked on this project, I thought about Norman’s methods and about what I wanted to communicate: how stunning it is that half of today’s most popular movies fail the Bechdel Test. As a result, I wanted to format my diagram so that this was the clearest and most quickly digestible part of the data.
These brainstorming exercises helped me discover different ways to represent data points. I found that position could help me more clearly demonstrate the difference between Bechdel Test passes and fails. I also found that I could use shape and color to depict the difference in genres.
I started thinking about how I could better use icons and color in my project. If I found icons that effectively communicated the Bechdel Test results, then I could use color in a different area like genre or rating. Some of my iterations are below.
I decided that I wanted to use scale to indicate gross income, so if I was going to use icons to depict the Bechdel Test or genre, I needed them to be similar in shape so that their scale could be easily comparable. In the above sketches, I tried making all of the icons unique while still keeping them in a circular shape so their scale would be easy to compare.
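One common convention when using scale for a quantity is to make the area of a circular icon, rather than its radius, proportional to the value, so that a movie with a quarter of the gross covers a quarter of the area. The sketch below shows that convention as an assumption. It is not necessarily how the icons in this project were sized, and radius_for, max_gross, and max_radius are illustrative names and values.

```python
import math


def radius_for(gross, max_gross, max_radius):
    """Radius whose circle area is proportional to gross.

    Area grows with the square of the radius, so the radius is scaled
    by the square root of the gross ratio.
    """
    return max_radius * math.sqrt(gross / max_gross)


# Assumed values: a $350M top movie drawn with a 40 px radius.
largest = radius_for(350, 350, 40)
quarter = radius_for(87.5, 350, 40)
area_ratio = (quarter / largest) ** 2
```

Scaling the radius directly would exaggerate the differences, because doubling the radius makes the icon four times as large in area.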
I found that using the above icons, although they were shaped only slightly differently from each other, was distracting. So, I stuck with simpler, more similar icons.
Out of the three diagrams above, I chose the one on the left because it most clearly showed the concept of the Bechdel Test. The only problem was that I didn’t have a way to show the genre at this point. The color relationships in the center diagram were somewhat ineffective in showing the MPAA ratings, so I didn’t pursue that avenue further. The diagram on the right was interesting, but took too much time to comprehend. There were too many axes, which made it complicated and difficult to encode.
My next challenge was to come up with icons that informed the audience about the topic of the data visualization. I wanted to maintain icons whose exterior shape was consistent, but whose interior shape could change based on the genre of the movie. I came up with the camera idea below.
The icons above seemed to communicate successfully. Color also seemed to help. By using genre icons, I could decide whether to allocate color toward reinforcing genre or the Bechdel Test. I also found that the icon with the “X” over the mouth was powerful in communicating how women in movies do not speak about regular things that interest women.
Moving Through the Data
My next challenge was to figure out how to move an audience through the data. I divided the data into 5–6 different diagrams that all organized the same information differently. Then I added complexity as the user moved through the data.
After getting some opinions from my classmates, I found that starting the data walkthrough with a list of movies in alphabetical order didn’t make much sense. Few people think of top movies being listed in alphabetical order. So for my next iteration, I started out by listing movies based on their rank/gross. I also changed the background color to black to help support the dark cinematic feel of a movie theatre.
Here I started the walkthrough by using scale to illustrate gross. The highest-grossing movie was at the bottom of the list. Then I progressed through the piece by adding color, more icon detail, and positioning to explain genre, MPAA rating, and the Bechdel Test.
Some of the feedback I got on the above rendition was that it was confusing to have the “top” grossing movie at the bottom of a list. So I took note of that and made changes in my next rendition below. I also got feedback that I had lost the “punch” of the final Bechdel Test scene by getting rid of the pink, so I changed that as well. The red, blue, and yellow are interesting because they are more informative about the genre. However, out of gross, MPAA rating, and Bechdel Test results, genre is the least important to me because it tells me the least about the people watching the movies. So it was a good idea to transfer the colors over to the Bechdel section to give it the needed punch.
Planning for Interaction
Next I wanted to make the data visualization more interactive. And, although I did not have the time to make it truly interactive, I came up with a few ways to give the gist of an interactive experience. One way was that I wanted the user to be able to use the mouse to hover over a camera icon at any time during the walkthrough and get all the relevant details about the movie it represented. (See below.)
One of the problems I ran into for my presentation was that I was trying to present very vertical data on a very horizontal screen. That meant that if I were to show the whole of my data on the screen, it would be too small to see. So, I had to decide whether it was important to see everything at once. To answer that question, I decided to build in a zoom function that lets the user choose how closely they want to examine the information. I think this method would work because it allows the user to explore the information the way they wish to explore it.
After I presented this slideshow to my class, I got feedback about the following:
- My presentation needs a better intro and ending page
- The way I zoom in and out of information is a bit jarring; maybe I should keep the horizontal line consistently in the center of the screen
- It is difficult to differentiate between the genre icons unless they are zoomed in
- People want more context about the Bechdel Test
Next, I will prepare for my next presentation by giving more context at the beginning of the presentation and trying to make the horizontal bar an immovable constant in the piece so that viewers don’t get disoriented by the constant zooming in and out.