Last weekend, I participated in my first hackathon, HackMIT's Blueprint 2022 competition. I had a lot of fun, and I was really happy with the project that my team ended up working on: a tool that analyzes your Discord chat history to find fun trends, patterns, and more.
Discord is unusual in that it saves your complete chat history in the cloud. Unlike messaging apps such as iMessage or Signal, it keeps no local database that you can dump to export every conversation. Instead, we must download the messages from the cloud with a scraper that pretends to be a client and requests every message. We could have written such a tool for Discord Wrapped, but I knew that a project called DiscordChatExporter already does just that. So we removed chat export from our project's scope and instead expect users to follow a two-step process: use DiscordChatExporter to download their data, then use Discord Wrapped to analyze it. This saved us a lot of time during the hackathon because we could focus on data analysis rather than authentication, network requests, storage, and so on.
Design of Discord Wrapped
Because this project was made under significant time pressure, it was important for us to be able to work on different parts of the codebase simultaneously. Additionally, we wanted to allow for extensibility in the future, because we would surely think of new features and analyses that could be performed. So we split up the project into two main parts: the framework and the visualization modules.
The framework of Discord Wrapped includes the app's shell, FAQ, filepicker, and code to create a stack of visualization cards. It does not perform any analysis on the data, but it provides a standard, consistent interface for modules to interact with to both access data and return visualizations. It can also gracefully handle failure of child modules by hiding only the ones that have errored. The visualization modules are responsible for performing one specific analysis on the data, like "what are the top words used in this conversation?". This modularized design allowed us to quickly develop new features without waiting on other developers to finish their work.
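To make the split concrete, here is a minimal sketch of how a framework might run visualization modules and hide the ones that fail. Every name in it (`Message`, `Card`, `VisualizationModule`, `buildCards`, and the two example modules) is hypothetical and chosen for illustration; it is not taken from the actual Discord Wrapped codebase.

```typescript
// Hypothetical shapes for illustration; the real project's types may differ.
interface Message {
  author: string;
  content: string;
}

interface Card {
  title: string;
  body: string;
}

// The contract between framework and modules: messages in, one card out.
interface VisualizationModule {
  name: string;
  analyze(messages: readonly Message[]): Card;
}

// Example module: "what are the top words used in this conversation?"
const topWords: VisualizationModule = {
  name: "top-words",
  analyze(messages) {
    const counts = new Map<string, number>();
    for (const m of messages) {
      for (const word of m.content.toLowerCase().match(/[a-z']+/g) ?? []) {
        counts.set(word, (counts.get(word) ?? 0) + 1);
      }
    }
    // Sort by count (descending), breaking ties alphabetically.
    const top = Array.from(counts.entries())
      .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
      .slice(0, 3)
      .map(([word, n]) => `${word} (${n})`);
    return { title: "Top words", body: top.join(", ") };
  },
};

// A deliberately broken module, to show failure isolation.
const broken: VisualizationModule = {
  name: "broken",
  analyze() {
    throw new Error("analysis failed");
  },
};

// Framework side: run every module, keep the cards that succeed,
// and record (rather than render) any module that throws.
function buildCards(
  modules: VisualizationModule[],
  messages: readonly Message[],
): { cards: Card[]; hidden: string[] } {
  const cards: Card[] = [];
  const hidden: string[] = [];
  for (const mod of modules) {
    try {
      cards.push(mod.analyze(messages));
    } catch (err) {
      hidden.push(mod.name);
    }
  }
  return { cards, hidden };
}

const sample: Message[] = [
  { author: "alice", content: "hello hello world" },
  { author: "bob", content: "Hello world again" },
];

const { cards, hidden } = buildCards([topWords, broken], sample);
```

With this shape, adding a new analysis means writing one more object that satisfies `VisualizationModule`, and a bug in it cannot take down the other cards.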
To speed up development time, we also made extensive use of libraries, including React, Tailwind, and Plotly. React handles the app's state and UI, Tailwind handles styling, and Plotly creates the histograms that are shown in some visualizations. The group had some pre-existing experience with React and Tailwind, but none with Plotly. Luckily, Plotly is similar to other plotting libraries like Matplotlib and was easy to learn and implement within the time constraints.
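As a rough illustration of how little code a Plotly histogram needs, the sketch below builds the `data` and `layout` objects for a "messages by hour" chart. The helper names (`hoursOfDay`, `hourHistogram`) and the choice of chart are assumptions made for this example, not a description of the visualizations that shipped.

```typescript
// Hypothetical helper: the UTC hour (0-23) of each ISO 8601 timestamp.
function hoursOfDay(timestamps: string[]): number[] {
  return timestamps.map((t) => new Date(t).getUTCHours());
}

// Builds the trace and layout for a histogram. Plotly bins the raw `x`
// values itself; `xbins` pins one bin per hour of the day.
function hourHistogram(timestamps: string[]) {
  return {
    data: [
      {
        type: "histogram" as const,
        x: hoursOfDay(timestamps),
        xbins: { start: -0.5, end: 23.5, size: 1 },
      },
    ],
    layout: {
      title: { text: "Messages by hour (UTC)" },
      xaxis: { title: { text: "Hour" } },
      yaxis: { title: { text: "Messages" } },
    },
  };
}

const figure = hourHistogram([
  "2022-02-19T14:05:00Z",
  "2022-02-19T14:45:00Z",
  "2022-02-20T09:10:00Z",
]);

// In the browser, these objects would be handed to Plotly, e.g.
// Plotly.newPlot(element, figure.data, figure.layout).
```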
Discord chat histories can span years and contain hundreds of thousands of private messages. Discord Wrapped must, by definition, analyze every message in order to be useful, so it's important that such intimate data is handled with care. We made the conscious decision to process all data on the client-side; the need for trust is removed if we never have your data at all. The code is also open-source, so you can see how it works and verify that it's safe.
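A minimal sketch of what client-side processing can look like: the export is parsed and analyzed entirely in memory, with no network request involved. The `ChatExport` shape below is an assumed subset of DiscordChatExporter's JSON output and should be checked against a real export; `parseExport` and `messagesPerAuthor` are hypothetical names.

```typescript
// Assumed subset of a DiscordChatExporter JSON export (an assumption,
// not a verified schema).
interface ExportedMessage {
  timestamp: string;
  content: string;
  author: { name: string };
}

interface ChatExport {
  messages: ExportedMessage[];
}

// Parses the file's text locally and rejects anything that does not
// look like an export. Nothing here touches the network.
function parseExport(text: string): ChatExport {
  const parsed: unknown = JSON.parse(text);
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    !Array.isArray((parsed as { messages?: unknown }).messages)
  ) {
    throw new Error("Not a recognizable chat export");
  }
  return parsed as ChatExport;
}

// Example analysis that runs on the parsed data in memory.
function messagesPerAuthor(chat: ChatExport): Map<string, number> {
  const counts = new Map<string, number>();
  for (const m of chat.messages) {
    counts.set(m.author.name, (counts.get(m.author.name) ?? 0) + 1);
  }
  return counts;
}

// In the browser, `text` would come from the file picker, e.g.
// `await file.text()` on the selected File. A literal stands in here.
const sampleJson = JSON.stringify({
  messages: [
    { timestamp: "2022-02-19T14:05:00Z", content: "hi", author: { name: "alice" } },
    { timestamp: "2022-02-19T14:06:00Z", content: "hey", author: { name: "bob" } },
    { timestamp: "2022-02-19T14:07:00Z", content: "how are you?", author: { name: "alice" } },
  ],
});

const perAuthor = messagesPerAuthor(parseExport(sampleJson));
```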
That being said, it can be useful to collect some limited forms of information to understand how people are using your tools. Discord Wrapped collects the following information:
- Anonymous analytics (via GoatCounter)
  - Collected so that we know how many people are using Discord Wrapped, and where they're coming from.
- Crash & error data (via Sentry)
  - Collected so that we can diagnose issues. It's only sent if you actually experience an error; no data is sent during normal usage.
There are a few features we would have loved to add but didn't have time for:
- Filter specific analyses by user, channel, or timeframe
- Automatically collect chat data via a bot account
- Share your results with friends via a private link or image export
- Allow users to create and share their own visualizations