Spotify Wrapped: R & ggplot2 Edition
About
While Spotify presents us with Spotify Wrapped at the end of each year, which summarizes our top songs, artists, and minutes spent listening, I was curious about how my Spotify listening patterns have evolved over a much longer time frame. Furthermore, the data analysis presented through Spotify Wrapped is fairly shallow. Spotify will provide your personal Spotify data free to use and analyze, so I requested mine and analyzed it.
I was mainly curious about things like the time of day I listened to songs, top artists over time, top songs over time, and the number of songs I listened to each day. However, there are some other variables I wanted to explore, such as how tracks end (e.g., stop button, next song) as well as how far I get through a song.
Exporting your Spotify Data
To export your Spotify data, you will need to do so from the Spotify app. Once your data is requested, you will receive an email with instructions on how to download it. Depending on the length of your Spotify history, there will be one or multiple data files. In my case, there were six files, which I loaded into R using the jsonlite package then combined them using rbind.
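The loading step described above can be sketched roughly as follows. This is an illustration rather than the code used for this post: the file names and the fields (`ts`, `ms_played`, `master_metadata_track_name`) are assumptions modeled on Spotify's extended streaming history, and two toy files are written to a temporary folder so the example runs on its own.

```r
library(jsonlite)

# Toy stand-in for the unzipped export: two small JSON files in a temp folder.
# File names and field names are assumptions, not taken from the real export.
export_dir <- tempfile("spotify_export_")
dir.create(export_dir)
write_json(
  data.frame(
    ts = c("2015-03-01T08:15:00Z", "2015-03-01T08:19:00Z"),
    ms_played = c(215000L, 30000L),
    master_metadata_track_name = c("Track A", "Track B")
  ),
  file.path(export_dir, "Streaming_History_Audio_0.json")
)
write_json(
  data.frame(
    ts = c("2016-07-04T21:00:00Z", "2016-07-04T21:04:00Z"),
    ms_played = c(180000L, 240000L),
    master_metadata_track_name = c("Track C", "Track A")
  ),
  file.path(export_dir, "Streaming_History_Audio_1.json")
)

# Read every JSON file into a data frame, then stack them row-wise.
files   <- list.files(export_dir, pattern = "\\.json$", full.names = TRUE)
streams <- do.call(rbind, lapply(files, fromJSON))
```

With a real export, `export_dir` would point at the unzipped download folder instead. Note that `rbind` expects every file to have the same columns.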
Daily Songs
This plot visualizes the number of songs I listened to daily from 2015 to 2024, using a 30-day rolling average. The rolling average smooths out day-to-day fluctuations in my listening habits, providing a clearer view of broader trends. Each point on the graph represents the average number of songs played per day over the preceding 30 days, helping to highlight periods of more consistent or heavy listening activity.
Peaks in the plot indicate times when I was regularly playing 75-100 songs a day on average, suggesting phases of increased music engagement, possibly driven by discovering new music, particular life events, or seasonal patterns. Valleys in the plot reflect periods of reduced listening, where the average dropped closer to 25-30 songs per day.
This plot reveals interesting patterns in my listening habits over time, showing that my highest levels of music engagement occurred between 2015 and 2019, while there were more fluctuations in subsequent years, and a drop-off from 2022-2024.
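A trailing 30-day average like the one described in this section could be computed along these lines. The sketch uses base R only and simulated plays; the `plays` data frame and its `date` column are hypothetical, and the window here is assumed to cover the current day plus the 29 days before it.

```r
# Hypothetical input: one row per play, with the calendar date it occurred on.
set.seed(42)
days  <- seq(as.Date("2015-01-01"), as.Date("2015-04-30"), by = "day")
plays <- data.frame(date = sample(days, 5000, replace = TRUE))

# Count plays per day; the factor levels keep days with zero plays.
counts <- table(factor(format(plays$date), levels = format(days)))
daily  <- data.frame(date = days, songs = as.integer(counts))

# Trailing 30-day mean: each value averages that day and the 29 days before it.
# The first 29 days have no full window and stay NA.
daily$roll30 <- as.numeric(
  stats::filter(daily$songs, rep(1 / 30, 30), sides = 1)
)
```

Keeping zero-play days in `daily` matters: dropping them would make the window span more than 30 calendar days and inflate the average. The result can be drawn with something like `ggplot(daily, aes(date, roll30)) + geom_line()`.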
Monthly Songs
This plot visualizes the total number of songs played per month from 2015 to 2024, faceted by year. The data reveals interesting listening patterns over the years, highlighting seasonal trends, fluctuations in music engagement, and potential yearly shifts.
2015-2016: These years show steady growth in monthly listening habits, with peaks around the middle of each year. Notably, 2016 had more consistent engagement, with no extreme lows, suggesting a balanced pattern of music listening throughout the year.
2017-2019: The trend continues to rise, especially in 2019, which stands out due to a significant spike in April. This could indicate a specific event, music discovery phase, or personal life moment that led to an unusual increase in music activity during that time. Other months remain relatively balanced.
2020: In 2020, there was a significant shift in listening habits, possibly tied to the global pandemic, with peaks in June and December. The data suggests increased engagement during lockdown months, potentially as a form of coping or escape during periods of restricted movement.
2021-2022: These years reflect more irregular patterns, with fewer pronounced peaks compared to earlier years. December 2021 shows an increase, possibly reflecting end-of-year music engagement, but 2022 reveals a decline overall, with no standout months for high engagement.
2022-2024: During these years, the number of songs I listened to seems to even out, and remains consistent. This is likely due to me solidifying a routine around the job I started in 2022.
Key Observations:
Mid-year peaks are common across many years, particularly in the spring and summer months.
April 2019 had the most dramatic increase in songs played across the entire dataset.
2020 shows the impact of external factors like the pandemic on listening habits, with noticeable peaks in the middle and end of the year.
Listening patterns appear to stabilize post-2020, with more moderate monthly engagement.
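For readers who want to build a similar chart, monthly totals faceted by year could be produced roughly like this. The data are simulated and the column names are hypothetical; only the overall shape (count per month, one panel per year) follows the description in this section.

```r
library(ggplot2)

# Hypothetical input: one row per play with its date.
set.seed(7)
days  <- seq(as.Date("2015-01-01"), as.Date("2016-12-31"), by = "day")
plays <- data.frame(date = sample(days, 20000, replace = TRUE))

# Split each date into year and month, then count plays per (year, month).
plays$year  <- as.integer(format(plays$date, "%Y"))
plays$month <- factor(month.abb[as.integer(format(plays$date, "%m"))],
                      levels = month.abb)
monthly <- as.data.frame(table(year = plays$year, month = plays$month),
                         responseName = "songs")

# One bar per month, one panel per year.
p <- ggplot(monthly, aes(x = month, y = songs)) +
  geom_col() +
  facet_wrap(~ year) +
  labs(x = NULL, y = "Songs played")
```

Building `month` as a factor with `month.abb` levels keeps the bars in calendar order instead of alphabetical order.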
Listening Calendar
This heatmap provides a detailed look into Spotify listening habits broken down by hour of the day and day of the week over an eight-year period (2014-2022). The intensity of the green color corresponds to the total number of minutes listened, with darker shades representing higher listening times.
Key Observations:
Late evening hours (20:00 - 22:00) show the most intense activity, especially on Fridays and Saturdays, suggesting that music listening peaks during these hours, likely due to leisure time at the end of the workweek.
Mid-week (Monday to Thursday) has more balanced listening patterns throughout the day, with slightly elevated activity during the early morning (8:00 - 10:00) and again in the evening (18:00 - 21:00), possibly reflecting music consumption during commute or post-work relaxation.
Sundays appear to have consistently lighter listening activity compared to other days, with no strong peaks across the day.
The hours between 00:00 and 6:00 generally show lower engagement, which is expected during typical sleeping hours, though slight spikes suggest occasional late-night music sessions.
Overall Trends: This visualization highlights a clear pattern of higher music consumption in the evenings, particularly on weekends, while weekdays maintain a steadier, but less intense, level of engagement. The correlation between weekday routines and weekend relaxation is evident, providing insight into how listening habits align with work and free time.
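An hour-by-weekday heatmap of this kind could be assembled as sketched below. The four plays are made up, and the field names `ts` and `ms_played` are assumptions about the export format. The timestamps are treated as UTC here; for a real analysis they would probably need converting to local time before extracting the hour.

```r
library(ggplot2)

# Hypothetical plays: timestamp plus milliseconds played (assumed field names).
streams <- data.frame(
  ts = c("2019-04-05T20:15:00Z", "2019-04-05T21:40:00Z",
         "2019-04-06T20:05:00Z", "2019-04-08T08:30:00Z"),
  ms_played = c(240000, 180000, 300000, 120000)
)

# Parse the timestamp, then derive hour of day and day of week.
# "%u" gives the weekday as 1 (Monday) to 7 (Sunday), independent of locale.
when <- as.POSIXct(streams$ts, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
streams$hour    <- as.integer(format(when, "%H"))
streams$weekday <- factor(format(when, "%u"), levels = as.character(1:7),
                          labels = c("Mon", "Tue", "Wed", "Thu",
                                     "Fri", "Sat", "Sun"))
streams$minutes <- streams$ms_played / 60000

# Total minutes listened in each (weekday, hour) cell.
heat <- aggregate(minutes ~ weekday + hour, data = streams, FUN = sum)

# Darker green means more minutes listened.
p <- ggplot(heat, aes(x = hour, y = weekday, fill = minutes)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "darkgreen")
```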
Artists
Top 20 Artists - All Time
This list reflects my top artists based on hours listened from 2014 to 2022. With a diverse range of genres spanning trance, classical, and pop, the data showcases both my passion for electronic music and occasional dips into classical and mainstream pop.
Armin van Buuren dominates the list with a staggering 434 hours, underscoring his influence as a pivotal figure in my listening habits. His long-standing career in trance music and epic live sets clearly resonate with me.
Next are other trance acts like GAIA, an alias of Armin van Buuren (89 hours), and Above & Beyond (86 hours), further reinforcing my strong inclination toward uplifting and progressive trance.
Artists like Gareth Emery (70 hours), Alex M.O.R.P.H. (69 hours), and ReOrder (43 hours) also reflect my affinity for high-energy and emotionally driven music within the trance genre.
Interestingly, classical composers such as Gustav Mahler (49 hours) and Johann Strauss II (27 hours) also feature prominently on the list, adding depth to my listening preferences. This reflects periods of exploration into more structured and orchestral music.
On the pop side, Lady Gaga (44 hours), Halsey (39 hours), and Ariana Grande (17 hours) make appearances, indicating a penchant for modern pop music during different phases of my listening journey.
In summary, this data provides a snapshot of my eclectic musical tastes, which are heavily rooted in trance but complemented by occasional classical and pop influences. These top 20 artists represent over 1,000 hours of music listening over the past eight years, with a clear focus on powerful, melodic, and emotionally resonant music.
Top 20 Artists of All Time
Data reflects listening habits from 2014 to 2022
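A ranking by hours listened can be derived by summing play time per artist. The sketch below uses placeholder artists and assumes the export stores the artist in `master_metadata_album_artist_name` and play time in `ms_played`; both names are assumptions.

```r
# Hypothetical plays with placeholder artist names.
streams <- data.frame(
  master_metadata_album_artist_name = c("Artist A", "Artist B", "Artist A",
                                        "Artist C", "Artist B", "Artist A"),
  ms_played = c(3600000, 1800000, 5400000, 900000, 1800000, 1800000)
)

# Sum milliseconds per artist, then convert to hours (3.6 million ms per hour).
by_artist <- aggregate(ms_played ~ master_metadata_album_artist_name,
                       data = streams, FUN = sum)
names(by_artist) <- c("artist", "ms_played")
by_artist$hours <- by_artist$ms_played / 3.6e6

# Sort from most to least listened and keep the top 20.
top_artists <- head(by_artist[order(-by_artist$hours), c("artist", "hours")], 20)
```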
Top Artists Over Time
The figure below visualizes my listening habits over time, highlighting the top 10 most-listened artists. Key artists like Armin van Buuren, Gustav Mahler, Lady Gaga, and Halsey are highlighted in different colors, while other trance artists are represented in grey.
Key Insights:
Armin van Buuren (in green) dominates the plot, particularly from 2015 to 2016, with a notable peak in 2015 where I listened to him for over 30 hours in a single month. This period reflects a strong preference for Armin’s trance music, possibly driven by new album releases or live performances.
Gustav Mahler (in orange) shows sharp listening spikes in 2020 and 2022, indicating periods of deep interest in classical music, potentially due to certain life events or moments when I craved orchestral compositions for focus or relaxation.
Halsey (in red) has sporadic spikes, particularly in 2019 and 2020, which coincide with the release of her album “Manic” in January 2020 (“Hopeless Fountain Kingdom” was released earlier, in 2017). These spikes reflect moments of increased interest in her pop and alternative music style.
Lady Gaga (in purple) shows more consistent but lower engagement, with her peak periods occurring around 2016 and 2020, which aligns with the release of her albums “Joanne” and “Chromatica”.
Broader Patterns:
Trance Artists: The grey lines represent other prominent trance artists such as GAIA, Above & Beyond, and Aly & Fila, showing periodic increases that align with my long-standing affinity for trance and progressive music. These artists maintain steady but lower listening hours compared to Armin van Buuren.
Diverse Listening: The data showcases my eclectic listening habits, ranging from classical composers like Mahler to pop icons such as Lady Gaga and Halsey, along with heavy trance influences. This diversity suggests different moods or phases in my listening behavior, with spikes often correlating with new music releases or life events.
Summary:
Overall, this plot paints a detailed picture of how my listening preferences have shifted over time, with a core focus on trance artists while making room for classical and pop. The periods of intense engagement with particular artists highlight moments of musical discovery or renewed interest in specific genres.
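The "few coloured lines, everything else grey" effect can be approximated in plain ggplot2, as sketched below with invented numbers and placeholder artist names. The gghighlight package listed in the references offers a more compact way to get a similar result; this version swaps in a manual colour scale so that it depends only on ggplot2.

```r
library(ggplot2)

# Hypothetical monthly listening hours per artist.
monthly <- data.frame(
  month  = rep(as.Date(c("2015-01-01", "2015-02-01", "2015-03-01")), times = 3),
  artist = rep(c("Artist A", "Artist B", "Artist C"), each = 3),
  hours  = c(30, 22, 18, 2, 9, 4, 5, 6, 5)
)

# Artists to draw in colour; everyone else falls back to grey.
highlight <- c("Artist A" = "darkgreen", "Artist B" = "orange")
monthly$group <- ifelse(monthly$artist %in% names(highlight),
                        monthly$artist, "Other")

# One line per artist, coloured by highlight group.
p <- ggplot(monthly, aes(x = month, y = hours,
                         group = artist, colour = group)) +
  geom_line() +
  scale_colour_manual(values = c(highlight, Other = "grey70")) +
  labs(x = NULL, y = "Hours per month", colour = NULL)
```

Mapping `group = artist` keeps each grey artist as its own line even though they share one colour.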
Listening Device
The following table breaks down the total time I spent listening to music across different platforms.
Windows stands out as the dominant platform, accounting for 130,415 minutes of listening time (approximately 2,174 hours or 91 days). This suggests that the majority of my listening occurs while using my computer, likely during work or leisure activities.
Android follows with 93,496 minutes (about 1,558 hours or 65 days), showing that a substantial amount of my listening takes place on the go via my mobile device.
iPhone also plays a significant role with 65,375 minutes (approximately 1,090 hours or 45 days), indicating that I switch between Android and iPhone platforms for mobile listening.
Google Home contributes a minimal 56 minutes, showing very limited use of smart home devices for music.
In total, I’ve spent 289,342 minutes (around 4,822 hours or 201 days) listening to music across these platforms, with a clear preference for desktop and mobile devices.
Time Spent Listening by Platform
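The hour and day figures quoted above follow directly from the minute totals. As a quick check, using only the four minute counts stated in this section:

```r
# Minutes per platform, as quoted in the text above.
minutes <- c(Windows = 130415, Android = 93496, iPhone = 65375,
             `Google Home` = 56)

# Convert to hours (60 minutes) and days (24 hours), rounded to whole units.
platforms <- data.frame(
  platform = names(minutes),
  minutes  = unname(minutes),
  hours    = round(unname(minutes) / 60),
  days     = round(unname(minutes) / 60 / 24)
)

# Grand total across all platforms.
total_minutes <- sum(minutes)
total_hours   <- round(total_minutes / 60)
total_days    <- round(total_minutes / 60 / 24)
```

The rounded results match the figures in the text: 2,174, 1,558, and 1,090 hours for Windows, Android, and iPhone, and 289,342 minutes (4,822 hours, 201 days) overall.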
Track Ends
The histogram below visualizes the track end reasons for songs that were played for 10 minutes or less. The stacked bars represent different actions that caused the track to end, with colors indicating specific reasons. The majority of track ends are due to:
Track Done (orange): This is by far the most common reason, accounting for most of the track ends across all time intervals. This suggests that many songs were allowed to play until their completion.
Back Button (red): The back button is frequently used within the first minute of a song, likely due to quick dissatisfaction or a desire to repeat a previous track.
Forward Button (green): Occurrences of using the forward button are more evenly distributed across the first few minutes of a song, indicating active song-skipping behavior.
Other reasons like App Load, Stop Button, and Unknown appear sporadically and account for far fewer instances overall.
This chart reveals that most of my track ends occur early in the song (within the first minute), suggesting quick decisions to either skip or repeat tracks. Beyond the first minute, a greater proportion of tracks are allowed to play to completion.
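The counts behind a stacked histogram like this could be prepared as follows. The plays are invented, and the field names `ms_played` and `reason_end` and the reason codes are assumptions about the export format. Plays longer than 10 minutes are dropped, and the rest are grouped into one-minute bins.

```r
# Hypothetical plays: how long each track ran and why it ended.
streams <- data.frame(
  ms_played  = c(15000, 45000, 200000, 215000, 30000, 700000),
  reason_end = c("fwdbtn", "backbtn", "trackdone",
                 "trackdone", "fwdbtn", "trackdone")
)

# Keep only plays of 10 minutes or less.
short <- streams[streams$ms_played <= 10 * 60 * 1000, ]

# Whole minutes elapsed when the track ended (0 = within the first minute).
short$minute <- floor(short$ms_played / 60000)

# Count plays per (minute bin, end reason); this is what the stacked bars show.
ends <- as.data.frame(
  table(minute = short$minute, reason_end = short$reason_end),
  responseName = "count"
)
```

From here, `ggplot(ends, aes(minute, count, fill = reason_end)) + geom_col()` would stack the reasons within each minute bin.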
The data reveals that 72.93% of track starts are due to the completion of a previous song (i.e., trackdone), which highlights my tendency to let music play continuously, likely through albums or playlists. The second most common reason for starting a song is clickrow at 17.78%, suggesting active selection of tracks, whether in playlists or while browsing music.
Other reasons like appload (starting a track upon app load) and backbtn (starting a track by hitting the back button) play a smaller role but still represent a portion of my active engagement with Spotify.
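Shares like these come from a simple frequency table over the start reason. The sketch below uses ten invented plays, so its percentages are illustrative and do not reproduce the figures quoted above.

```r
# Hypothetical start reasons for ten plays.
reason_start <- c(rep("trackdone", 7), rep("clickrow", 2), "backbtn")

# Count each reason, most frequent first.
counts <- sort(table(reason_start), decreasing = TRUE)

# Assemble count and percentage columns.
summary_tbl <- data.frame(
  reason_start = names(counts),
  count        = as.integer(counts),
  percent      = round(100 * as.integer(counts) / sum(counts), 2)
)
```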
Reason Start | Count | Percent |
---|---|---|
Songs like “911” and “Chromatica I” were repeated most frequently using the back button, suggesting they are personal favorites or tracks I tend to revisit often, potentially to relisten to a specific part of the song. This behavior indicates a high engagement level with these tracks, particularly for songs from Lady Gaga’s “Chromatica” album. This is mostly because the change from Chromatica I to 911 is a banger.
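A per-track count of back-button starts could be obtained by filtering on the start reason and tabulating track names. The rows and track names below are placeholders and the field names are assumptions. A `backbtn` start may also mean stepping back to the previous track rather than restarting the current one, so the counts are only a rough proxy for repeats.

```r
# Hypothetical plays with placeholder track names.
streams <- data.frame(
  master_metadata_track_name = c("Song X", "Song Y", "Song X",
                                 "Song Z", "Song X", "Song Y"),
  reason_start = c("backbtn", "backbtn", "backbtn",
                   "trackdone", "backbtn", "clickrow")
)

# Keep plays that were started with the back button.
back <- streams[streams$reason_start == "backbtn", ]

# Count per track, most frequent first.
back_counts <- sort(table(back$master_metadata_track_name), decreasing = TRUE)
top_back <- data.frame(track = names(back_counts),
                       count = as.integer(back_counts))
```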
Track Name | Count |
---|---|
References
Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686 https://doi.org/10.21105/joss.01686.
Ooms J (2014). “The jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” arXiv:1403.2805 [stat.CO]. https://arxiv.org/abs/1403.2805.
Xie Y (2022). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.40, https://yihui.org/knitr/.
Xie Y (2015). Dynamic Documents with R and knitr, 2nd edition. Chapman and Hall/CRC, Boca Raton, Florida. ISBN 978-1498716963, https://yihui.org/knitr/.
Xie Y (2014). “knitr: A Comprehensive Tool for Reproducible Research in R.” In Stodden V, Leisch F, Peng RD (eds.), Implementing Reproducible Computational Research. Chapman and Hall/CRC. ISBN 978-1466561595, http://www.crcpress.com/product/isbn/9781466561595.
Yutani H (2022). gghighlight: Highlight Lines and Points in ‘ggplot2’. R package version 0.4.0, https://CRAN.R-project.org/package=gghighlight.
Müller K (2022). hms: Pretty Time of Day. R package version 1.1.2, https://CRAN.R-project.org/package=hms.
Grolemund G, Wickham H (2011). “Dates and Times Made Easy with lubridate.” Journal of Statistical Software, 40(3), 1-25. https://www.jstatsoft.org/v40/i03/.
Müller K (2020). here: A Simpler Way to Find Your Files. R package version 1.0.1, https://CRAN.R-project.org/package=here.
Iannone R, Cheng J, Schloerke B, Hughes E (2022). gt: Easily Create Presentation-Ready Display Tables. R package version 0.7.0, https://CRAN.R-project.org/package=gt.
Chang W (2022). webshot: Take Screenshots of Web Pages. R package version 0.5.4, https://CRAN.R-project.org/package=webshot.