Measuring Trends in the Fediverse

posted on 2023-01-26

Note: This is the blog version of a thread on the diffusion and fragmentation of trends on Mastodon, originally published here

Introduction

Trends (e.g. Twitter's trending topics) are an affordance of social media platforms that enables users to discover content and connect with other users. Journalists and politicians, among others, also use them to gauge which topics the general public is interested in (which makes them a huge attack vector for Dark Participation and propagandistic communication).

On centralized social networks, these trends are defined and moderated by the platform owners. While the published trends should be taken with a grain of salt, they can still be a measure of which topics are currently relevant to a lot of users.

On decentralized networks, trends become a bit harder to measure. There is no central authority on what is popular on Mastodon (or the Fediverse in general).

Luckily, Mastodon’s API is pretty well documented and open! The GET /api/v1/trends endpoints can give you an idea of what is happening on the network. So I collected some anonymous, public data from a bunch of (400+) popular instances and have some plots to share!

Using the API, I crawled the three kinds of trends available on Mastodon (hashtags, links and posts) across 400+ instances (all with more than 500 users) for a couple of days. For hashtags, I aggregated the uses and number of accounts across all instances; for links, I also extracted the top-level domain. The trending posts were anonymized (only saving interaction metrics, the users’ home instances and the hashed URI).
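To make the crawl a bit more concrete, here is a minimal sketch of how one instance could be queried and how a trending post could be reduced to an anonymized record. This is not the crawler used for this post: the function names, the record keys and the User-Agent string are made up for illustration, and it assumes the home instance can be read from the host of the post's uri.

```python
import hashlib
import json
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# The three kinds of trends exposed under /api/v1/trends.
TREND_KINDS = ("tags", "links", "statuses")


def trends_url(instance: str, kind: str) -> str:
    """Build the URL of one trends endpoint on one instance."""
    if kind not in TREND_KINDS:
        raise ValueError(f"unknown trend kind: {kind!r}")
    return f"https://{instance}/api/v1/trends/{kind}"


def fetch_trends(instance: str, kind: str, timeout: float = 10.0) -> list:
    """Fetch one trends list and return the decoded JSON array."""
    # The User-Agent value is a placeholder, not a required header.
    request = Request(trends_url(instance, kind),
                      headers={"User-Agent": "trend-crawler-sketch"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def anonymize_status(status: dict) -> dict:
    """Keep only interaction metrics, the home instance and a hashed URI."""
    uri = status["uri"]
    return {
        # Assumption: the host of the post's uri is its home instance.
        "home_instance": urlparse(uri).hostname,
        # A SHA-256 digest lets us match the same post across instances
        # without storing the URI itself.
        "uri_hash": hashlib.sha256(uri.encode("utf-8")).hexdigest(),
        "reblogs": status["reblogs_count"],
        "favourites": status["favourites_count"],
        "replies": status["replies_count"],
    }
```

Calling fetch_trends("example.social", "statuses") and mapping anonymize_status over the result would give one snapshot for one instance; a real crawl would also need error handling and polite rate limiting.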

Hashtags

Let’s start with hashtags. Mastodon's Explore page shows users hashtags that “are gaining traction among people on this and other servers of the decentralized network.” As expected with decentralized structures, there definitely is a long tail of hashtags that are presented as “trending tags” to users (several thousand distinct trending hashtags per day, between roughly 2k and 7k in my crawl).

Date        Unique Hashtags  Uses (mean)  Accounts (mean)  Share of Instances (%)
2023-01-08             3261       207.86           149.26                    4.11
2023-01-09             4446       318.71           207.64                    5.46
2023-01-10             5272       441.28           299.87                    7.06
2023-01-11             5941       539.39           368.35                    8.07
2023-01-12             6284       516.90           342.80                    8.26
2023-01-13             6710       631.85           441.59                    8.37
2023-01-14             7000       624.19           434.99                    8.58
2023-01-15             6866       501.99           361.84                    8.25
2023-01-16             6319       451.59           329.82                    7.56
2023-01-17             6044       319.93           228.17                    6.89
2023-01-18             5285       236.44           168.02                    5.50
2023-01-19             4790       248.10           182.00                    5.00
2023-01-20             4107       276.59           204.07                    4.63
2023-01-21             3444       204.77           155.23                    4.05
2023-01-22             2276       262.35           184.72                    3.08
2023-01-23             2284       150.15           101.20                    3.09

This graph shows all trending hashtags on all crawled instances per day, with the share of instances the hashtag is trending on shown on the y axis. Size/opacity shows the cumulative number of uses across all instances. For each day, the hashtag with the most uses is labeled.

You can see that some tags that are used by a huge number of people only diffuse to a small(ish) share of instances. For example, #ukraine is used >90k times per day, but only trends on ~50% of instances.

Trending Tags over Time
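The per-hashtag numbers behind a plot like this can be sketched as follows. This is an illustration, not the code used for this post: it assumes each instance's response to GET /api/v1/trends/tags has already been collected into a dict keyed by instance name, that each tag carries a history list with day, uses and accounts given as strings, and that tag names can be merged case-insensitively. The function name and output keys are made up.

```python
from collections import defaultdict


def aggregate_tags(trends_by_instance: dict, day: str) -> dict:
    """Sum uses/accounts per hashtag and compute the share of instances it trends on."""
    totals = defaultdict(lambda: {"uses": 0, "accounts": 0, "instances": set()})
    for instance, tags in trends_by_instance.items():
        for tag in tags:
            name = tag["name"].lower()  # merge "Ukraine" and "ukraine"
            for entry in tag.get("history", []):
                # "day" is a UNIX timestamp string; counts are strings too.
                if entry["day"] == day:
                    totals[name]["uses"] += int(entry["uses"])
                    totals[name]["accounts"] += int(entry["accounts"])
            totals[name]["instances"].add(instance)

    n_instances = len(trends_by_instance)
    return {
        name: {
            "uses": t["uses"],
            "accounts": t["accounts"],
            "share_of_instances": 100 * len(t["instances"]) / n_instances,
        }
        for name, t in totals.items()
    }
```

One caveat on this design: every instance reports only what it has seen itself, so summing across instances may count the same post or account more than once. The sums are better read as a measure of exposure than as exact totals.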

Similarly, when aggregating the domains of the trending links, there are some news outlets that are shared often and trend on many instances (e.g. The Washington Post), while some regional outlets (e.g. @tagesschau) are also widely shared, but only trend on a smaller share of instances.

Trending Domains over Time

The full data shows that established news outlets from the US and from Germany often produce trending links. To illustrate this, I aggregated the top-level domains of the trending links:

TLD          n  p (%)   Accounts       Uses
.com     9,613  54.76  3,513,595  3,723,705
.org     1,873  10.67    225,914    239,558
.de      1,603   9.13    577,777    614,198
.uk        664   3.78    223,719    237,466
.fr        484   2.76     75,397     80,553
.net       359   2.05     42,600     45,085
.social    313   1.78     34,548     35,394
.edu       160   0.91     16,243     16,785
.at        155   0.88      9,500      9,867
.eu        149   0.85     15,715     17,105
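A rough sketch of how the n and p columns of such a table could be computed from a list of trending link URLs. The helper names are made up, and the top-level domain is simply taken as the last label of the hostname (so a bbc.co.uk link counts as .uk), which is an assumption about how the aggregation was done.

```python
from collections import Counter
from urllib.parse import urlparse


def top_level_domain(url: str) -> str:
    """Return the last label of the URL's hostname, e.g. '.com'."""
    host = urlparse(url).hostname or ""
    if "." not in host:
        return ""  # no hostname, or a bare host such as 'localhost'
    return "." + host.rsplit(".", 1)[-1]


def tld_table(urls: list) -> list:
    """Rows of (tld, n, p), with p as the percentage of all links, sorted by n."""
    counts = Counter(top_level_domain(url) for url in urls)
    total = sum(counts.values())
    return [(tld, n, round(100 * n / total, 2)) for tld, n in counts.most_common()]
```

For example, tld_table applied to two .com links, one .de link and one .co.uk link yields a .com row with n = 2 and p = 50.0, followed by .de and .uk rows with p = 25.0 each.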

Posts

Trending posts are defined as “posts from this and other servers in the decentralized network” that “are gaining traction.”

Looking at the posts with the most reblogs that are trending across instances shows how the big instances often produce these viral posts.

Trending Instances over Time (Note: the crawler for this crashed on the 22nd)

On the topic of trending posts: The interaction metrics (reblogs, replies, favs) differ depending on the instance the API request is made to. Although they should sync up over time, the state of these metrics is not always perfectly up to date.

To illustrate this, these plots show the interaction metrics for a bunch of trending posts across many instances. The red line is the “actual” metric for the post (drawn from the “home-instance” of the post). They are all lagging behind!

Plots: Favourites, Reblogs, Replies
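One way to quantify this lag, sketched under assumptions: for each remote instance, compute what fraction of the home-instance count it currently reports. The function names are hypothetical; the metric names follow the favourites_count, reblogs_count and replies_count fields of a Mastodon status, and the home-instance view is treated as the reference value.

```python
METRICS = ("favourites_count", "reblogs_count", "replies_count")


def seen_share(home_status: dict, remote_status: dict) -> dict:
    """Per metric, the fraction of the home-instance count a remote instance reports."""
    shares = {}
    for metric in METRICS:
        home_value = home_status[metric]
        # With a home count of zero there is nothing to lag behind.
        shares[metric] = remote_status[metric] / home_value if home_value else None
    return shares


def lag_by_instance(home_status: dict, remote_views: dict) -> dict:
    """Apply seen_share to every remote instance's view of the same post."""
    return {
        instance: seen_share(home_status, view)
        for instance, view in remote_views.items()
    }
```

A value of 0.75 for favourites_count means the remote instance shows three quarters of the favourites the home instance knows about; values below 1.0 correspond to points under the red line.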

Executive Summary

With decentralized social media, our understanding of trends/virality has to change. Trending on Twitter and trending on Mastodon are two very different beasts (even if they look similar).

For us as media scholars, it also means that we need to take good care in sampling the instances we get our data from. Triangulating post metrics across instances is going to help us as well, I think.