rokazhao/wikipedia-globe

wikipedia-globe

The files in the public directory are deployed to: https://cse442.pages.cs.washington.edu/25au/a3/wikipedia-globe

What question is being answered by your visualization?

We show how often a user-supplied keyword occurs in each country's Wikipedia article. Because most countries have very long Wikipedia articles, the counts for most keywords are far from uniformly distributed. For example, the keyword "China" appears far more often in the article for China itself than in the articles for other countries. Because of this, we use a logarithmic scale, which keeps the color encoding readable even across a very broad range of occurrence values.
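To illustrate why the logarithmic scale helps, here is a minimal plain-JavaScript sketch. The helper name `logNormalize`, the use of `Math.log1p` (so a count of zero maps to zero), and the counts themselves are illustrative assumptions, not the notebook's actual code or data; a D3 log scale could serve the same purpose.

```javascript
// Hypothetical helper: maps a raw occurrence count to [0, 1] on a log scale.
// log1p(x) = ln(1 + x), so a count of 0 maps to exactly 0 instead of -Infinity.
function logNormalize(count, maxCount) {
  if (maxCount <= 0) return 0;
  return Math.log1p(count) / Math.log1p(maxCount);
}

// Made-up counts for illustration only; these are not values from the project.
const counts = { China: 1000, Mongolia: 50, Iceland: 1, Chile: 0 };
const maxCount = Math.max(...Object.values(counts));

const linear = {};
const logged = {};
for (const [country, count] of Object.entries(counts)) {
  linear[country] = count / maxCount;              // linear scale: most values crowd near 0
  logged[country] = logNormalize(count, maxCount); // log scale: mid-range values stay visible
}

console.log(linear);
console.log(logged);
```

On a linear scale the hypothetical count of 50 would sit at 5% of the color range and be hard to tell apart from zero, while the log scale places it past the midpoint.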

A rationale for your design decisions. How did you choose your particular visual encodings and interaction techniques? What alternatives did you consider and how did you arrive at your ultimate choices?

We knew we wanted to look at data on an international level, which led us to choose an interactive globe as our main visualization. This allows the user to see patterns between countries, represented by the occurrences of the keyword in multiple distinct articles. For example, entering "China" highlights countries that share history with China, since the keyword appears in their articles significantly more often than in the articles of countries that are historically distant from China. We felt that a color encoding was the best way to display this: a sequential color encoding is easy for the viewer to interpret, and it lets us represent the percentage of words in an article that match the keyword, which together with the logarithmic scaling gives a normalized view of non-uniform data. We also chose a slightly lower-resolution version of the globe, since a version that included small city-states and island countries drastically reduced performance. Finally, some landforms displayed on the map were not officially tied to a country, so we manually added labels for the major missing ones.
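The encoding described above can be sketched as follows, in plain JavaScript with no dependencies. Everything named here is an assumption for illustration: the two endpoint colors, the helpers `interpolateRgb`, `ratePer10k`, and `colorForRate`, and the choice to express the share as matches per 10,000 words before taking the logarithm. The actual notebook may use different colors and D3's own scale and interpolator functions.

```javascript
// Linear interpolation between two RGB colors; t is clamped to [0, 1].
function interpolateRgb(from, to, t) {
  const clamped = Math.min(1, Math.max(0, t));
  const channel = (i) => Math.round(from[i] + (to[i] - from[i]) * clamped);
  return `rgb(${channel(0)}, ${channel(1)}, ${channel(2)})`;
}

// Hypothetical endpoints of a sequential light-to-dark blue ramp.
const LIGHT = [239, 243, 255];
const DARK = [8, 81, 156];

// Share of an article's words that match the keyword, per 10,000 words.
function ratePer10k(matches, totalWords) {
  return totalWords > 0 ? (matches / totalWords) * 10000 : 0;
}

// Log-scales the rate against the largest rate on the globe, then picks a color.
function colorForRate(rate, maxRate) {
  if (rate <= 0 || maxRate <= 0) return interpolateRgb(LIGHT, DARK, 0);
  const t = Math.log1p(rate) / Math.log1p(maxRate);
  return interpolateRgb(LIGHT, DARK, t);
}

// Made-up numbers: 30 matches in a 20,000-word article, with a globe-wide maximum rate of 100.
const rate = ratePer10k(30, 20000);
console.log(rate, colorForRate(rate, 100));
```

Using a rate rather than a raw count keeps long articles from looking more relevant simply because they contain more words.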

Resources used:

https://observablehq.com/@michael-keith/draggable-globe-in-d3: We used this to get the map for the visualization.

https://observablehq.com/@d3/color-legend: We used this to get a better legend for the visualization.

https://www.wikipedia.org/: We used Wikipedia to get the data for words in each article.

An overview of your development process. Describe how the work was split among the team members. Include a commentary on the development process, including answers to the following questions: Roughly how much time did you spend developing your application (in people-hours)? What aspects took the most time?

Andrew pulled the text of each country's Wikipedia article and stored it, allowing us to calculate the frequency of each word. We were then all responsible for combining the processed data with the globe model we found and the user's search input, which created the basic lookup system. We then collaborated on adding the color encodings and finalizing the interactive elements of the map, such as making the globe clickable. Additionally, Jayden and Noah helped determine how to embed the Observable notebook on the GitLab page. We found that it was easier to work in person, since it gave us the opportunity to talk through encoding options, such as the log scale, and see which design choice we felt best communicated our ideas. We spent a total of 12 hours on this project. The most time-intensive parts were accessing and cleaning the data from Wikipedia and then connecting that data to the globe view. The other design elements were easier to implement but took more time to tweak for readability and expressiveness.
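The word-frequency step mentioned above could look roughly like the sketch below. The function names `tokenize`, `wordFrequencies`, and `keywordCount`, the tokenization rule (lowercase runs of letters and digits), and the sample sentence are all assumptions for illustration; the project's actual cleaning steps are not described in this README.

```javascript
// Splits text into lowercase words. The Unicode-aware pattern keeps accented letters intact.
function tokenize(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? [];
}

// Builds a word -> occurrence count table for one article.
function wordFrequencies(text) {
  const freq = new Map();
  for (const word of tokenize(text)) {
    freq.set(word, (freq.get(word) ?? 0) + 1);
  }
  return freq;
}

// Looks up a single-word keyword, ignoring case and surrounding whitespace.
function keywordCount(freq, keyword) {
  return freq.get(keyword.trim().toLowerCase()) ?? 0;
}

// Made-up sample text standing in for a stored article.
const sample = "China borders Mongolia. Trade with China grew; China's exports rose.";
const freq = wordFrequencies(sample);
console.log(keywordCount(freq, "China")); // 3 (the possessive "China's" splits into "china" and "s")
```

Precomputing one table per article means each search is a single lookup per country. Note that this sketch only handles single-word keywords; multi-word phrases would need a different matching strategy.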

About

Interactive visualization tool showing prevalence of specific keywords in a country's Wikipedia page
