Exploring the WCMA Dataset (and C3)

This week I am using C3, a D3-based library that makes generating basic charts much easier. This will let me both get a better grasp of the WCMA dataset and get more practice making web visualizations.

Artwork Classifications

Originally I wanted to count the mediums, but after looking at them I realized they would be harder to categorize, since the descriptions vary from “damask with fish, with resist dye, indigo and a mask” to “watercolor on paper, mounted to panel”. I would need to analyze the medium descriptions further to see how they could be grouped, which may be a project I take on in the future. For now, though, each piece has a classification such as “WCMA-PRINTS” and “WCMA_SCULPTURE”, which is much more consistent and easier to work with.
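To make the counting step concrete, here is a minimal sketch of how the pieces could be tallied by classification and handed to C3 as a bar chart. The sample records, the `classification` field name, the `#classification-bar` selector, and the helper names are all assumptions for illustration; the real dataset's schema may differ.

```javascript
// Hypothetical sample records. The field name `classification` and these
// titles are assumptions for illustration, not the real WCMA schema or data.
const samplePieces = [
  { title: "Piece A", classification: "WCMA-PRINTS" },
  { title: "Piece B", classification: "WCMA-PRINTS" },
  { title: "Piece C", classification: "WCMA-PHOTO" },
  { title: "Piece D", classification: "WCMA_SCULPTURE" },
  { title: "Piece E", classification: "WCMA-PRINTS" },
];

// Tally pieces per classification and return [name, count] pairs,
// sorted from most to least common.
function countByClassification(pieces) {
  const counts = new Map();
  for (const piece of pieces) {
    const key = piece.classification || "UNKNOWN";
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Build a C3 bar chart config with one bar per classification.
function barChartConfig(pairs, selector) {
  return {
    bindto: selector, // hypothetical element id on the page
    data: {
      x: "x",
      columns: [
        ["x", ...pairs.map(([name]) => name)],
        ["pieces", ...pairs.map(([, count]) => count)],
      ],
      type: "bar",
    },
    axis: { x: { type: "category" } },
    legend: { show: false },
  };
}

const classificationCounts = countByClassification(samplePieces);
const barConfig = barChartConfig(classificationCounts, "#classification-bar");

// Only render when C3 is actually loaded in a browser page.
if (typeof c3 !== "undefined") {
  c3.generate(barConfig);
}
```

Keeping the tallying separate from the chart config means the same counts can be reused for other chart types later.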

From this bar chart, we can see that PRINTS outnumber the other classifications by far; WCMA’s collection also has a sizable number of PHOTO and PRENDERGAST pieces. While the bar chart gives a general sense of how many pieces fall within each classification, it doesn’t answer two questions: how many pieces are there in total, and what fraction of the entire collection does each classification make up? This is an instance where a pie chart may come in handy.
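The two questions above come down to a total and a share per classification. The sketch below computes both and builds a C3 pie chart config from the same counts. The numbers are placeholders I made up for the example, not real WCMA figures, and the `#classification-pie` selector and helper names are likewise hypothetical.

```javascript
// Placeholder [name, count] pairs. These numbers are invented for
// illustration and are NOT the real WCMA counts.
const exampleCounts = [
  ["PRINTS", 50],
  ["PHOTO", 30],
  ["PRENDERGAST", 20],
  ["SCULPTURE", 60],
  ["OTHER", 40],
];

// Total number of pieces across all classifications.
function totalPieces(pairs) {
  return pairs.reduce((sum, [, count]) => sum + count, 0);
}

// Fraction of the whole collection that each classification represents.
function classificationShares(pairs) {
  const total = totalPieces(pairs);
  return pairs.map(([name, count]) => ({
    name,
    count,
    share: total === 0 ? 0 : count / total,
  }));
}

// Build a C3 pie chart config; C3 works out the slice percentages
// from the raw counts in `columns`.
function pieChartConfig(pairs, selector) {
  return {
    bindto: selector, // hypothetical element id on the page
    data: {
      columns: pairs.map(([name, count]) => [name, count]),
      type: "pie",
    },
  };
}

const shares = classificationShares(exampleCounts);
const pieConfig = pieChartConfig(exampleCounts, "#classification-pie");

// Only render when C3 is actually loaded in a browser page.
if (typeof c3 !== "undefined") {
  c3.generate(pieConfig);
}
```

C3 labels the slices with percentages on its own, so the explicit `classificationShares` helper is mainly useful for checking the chart against the raw numbers.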

Here, we see that PRINTS makes up a little more than a fourth of the WCMA collection. Together, the three largest classifications (PRINTS, PHOTO, and PRENDERGAST) account for more than half of the entire collection. The remainder is roughly evenly distributed among the rest of the classifications.

These two charts give a general sense of the makeup of WCMA’s collection, but we have to keep in mind that an artwork may fall under multiple classifications (for example, ANCIENT and SCULPTURE), while the dataset records only one classification for each piece. What would these distributions look like if each piece were given multiple classifications?

While the collection is skewed, by count, toward the top three classifications, only a very small subset of the entire collection is displayed at the museum at any given time. The distribution of artwork on display may look different from that of the collection as a whole, which is another question worth looking into.
