Measuring Trends in the Fediverse

posted on 2023-01-26

Note: This is the blog version of a thread on the diffusion and fragmentation of trends on Mastodon, originally published here

Introduction

Trends (e.g. Twitter's trending topics) are an affordance of social media platforms that enables users to discover content and connect with other users. They are also used by journalists and politicians, among others, to gauge which topics the general public is interested in (which makes them a huge attack vector for Dark Participation and propagandistic communication).

On centralized social networks, these trends are defined and moderated by the platform owners. While trends published by platform owners should be taken with a grain of salt, they can still be a measure of which topics are currently relevant to a lot of users.

On decentralized networks, the concept of trends becomes a bit harder to measure. There is no central authority on what is popular on Mastodon (or the Fediverse in general).

Luckily, Mastodon’s API is pretty well documented and open! The GET /api/v1/trends endpoints can give you an idea of what is happening on the network. So I collected some anonymous, public data from a bunch of (400+) popular instances and have some plots to share!

Using the API, I crawled the three kinds of trends available on Mastodon (hashtags, links and posts) across 400+ instances (all with more than 500 users) for a couple of days. For hashtags, I aggregated the uses and the number of accounts across all instances; for links, I also extracted the top-level domain. The trending posts were anonymized (only saving interaction metrics, the users’ home instances and the hashed URI).
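To make the collection step more concrete, here is a minimal sketch in Python of how one might query the three trends endpoints and reduce a trending post to the fields mentioned above. This is not the crawler used for this post: the function names, the choice of SHA-256, and deriving the home instance from the host of the post's `uri` are all assumptions for illustration. Some instances may also restrict these endpoints, which the sketch does not handle.

```python
import hashlib
import json
from urllib.parse import urlparse
from urllib.request import urlopen

# The three kinds of trends exposed under /api/v1/trends/.
TREND_KINDS = ("tags", "links", "statuses")


def fetch_trends(instance, kind, limit=20):
    """Fetch one kind of trend from one instance (makes a network request)."""
    if kind not in TREND_KINDS:
        raise ValueError(f"unknown trend kind: {kind}")
    url = f"https://{instance}/api/v1/trends/{kind}?limit={limit}"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def anonymize_status(status):
    """Keep only interaction metrics, the home instance and a hashed URI.

    Assumption: the home instance is the host part of the status URI.
    """
    uri = status["uri"]
    return {
        "uri_hash": hashlib.sha256(uri.encode("utf-8")).hexdigest(),
        "home_instance": urlparse(uri).hostname,
        "reblogs": status["reblogs_count"],
        "replies": status["replies_count"],
        "favourites": status["favourites_count"],
    }
```

Something like `[anonymize_status(s) for s in fetch_trends("example.social", "statuses")]` would then yield records without post content or account details (`example.social` is a placeholder). Note that an unsalted hash of a public URI can be matched against candidate URIs, so this is closer to pseudonymization than to strict anonymization.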

Hashtags

Let’s start with hashtags. Mastodon's Explore page shows users hashtags that “are gaining traction among people on this and other servers of the decentralized network.” As expected with decentralized structures, there is definitely a long tail of hashtags that are presented as “trending tags” to users (roughly 2k to 7k distinct trending hashtags per day).

| Date | Unique Hashtags | Uses (mean) | Accounts (mean) | Share of Instances (%) |
|---|---|---|---|---|
| 2023-01-08 | 3261 | 207.86 | 149.26 | 4.11 |
| 2023-01-09 | 4446 | 318.71 | 207.64 | 5.46 |
| 2023-01-10 | 5272 | 441.28 | 299.87 | 7.06 |
| 2023-01-11 | 5941 | 539.39 | 368.35 | 8.07 |
| 2023-01-12 | 6284 | 516.90 | 342.80 | 8.26 |
| 2023-01-13 | 6710 | 631.85 | 441.59 | 8.37 |
| 2023-01-14 | 7000 | 624.19 | 434.99 | 8.58 |
| 2023-01-15 | 6866 | 501.99 | 361.84 | 8.25 |
| 2023-01-16 | 6319 | 451.59 | 329.82 | 7.56 |
| 2023-01-17 | 6044 | 319.93 | 228.17 | 6.89 |
| 2023-01-18 | 5285 | 236.44 | 168.02 | 5.50 |
| 2023-01-19 | 4790 | 248.10 | 182.00 | 5.00 |
| 2023-01-20 | 4107 | 276.59 | 204.07 | 4.63 |
| 2023-01-21 | 3444 | 204.77 | 155.23 | 4.05 |
| 2023-01-22 | 2276 | 262.35 | 184.72 | 3.08 |
| 2023-01-23 | 2284 | 150.15 | 101.20 | 3.09 |

This graph shows all trending hashtags on all crawled instances per day. The y axis shows the share of instances on which a hashtag is trending. Size and opacity show the cumulative number of uses across all instances. For each day, the hashtag with the most uses is labeled.

You can see that some tags that are used by a huge number of people only diffuse to a small(ish) share of instances. For example, #ukraine is used >90k times per day, but only trends on ~50% of instances.

Trending Tags over Time
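The per-hashtag numbers behind a plot like this (uses, accounts and share of instances) can be sketched as follows. This is a hedged illustration, not the original analysis code: `aggregate_tags` and its input format are hypothetical, and it assumes each tag's `history` list starts with the most recent day and reports `uses` and `accounts` as strings, as in the Mastodon API documentation examples.

```python
from collections import defaultdict


def aggregate_tags(trends_by_instance):
    """Aggregate trending tags across instances.

    `trends_by_instance` maps an instance name to the list of tag objects
    returned by its trends endpoint. Returns, per lowercased tag name, the
    summed uses and accounts for the latest day, the number of instances the
    tag trends on, and that number as a percentage of all crawled instances.
    """
    totals = defaultdict(lambda: {"uses": 0, "accounts": 0, "instances": 0})
    for tags in trends_by_instance.values():
        for tag in tags:
            history = tag.get("history") or []
            if not history:
                continue
            latest = history[0]  # assumption: newest day comes first
            entry = totals[tag["name"].lower()]
            entry["uses"] += int(latest["uses"])
            entry["accounts"] += int(latest["accounts"])
            entry["instances"] += 1

    n_instances = len(trends_by_instance)
    return {
        name: {**entry, "share": 100 * entry["instances"] / n_instances}
        for name, entry in totals.items()
    }
```

Summing the counts reported by each instance is a simplification: instances report what they have seen of the network, so the same use may be counted by more than one instance.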

Similarly, when aggregating the domains of the trending links, there are some news outlets that are shared often and trend on many instances (e.g. The Washington Post), while some regional outlets (e.g. @tagesschau) are also widely shared, but only on a smaller share of instances.

Trending Domains over Time

The full data shows that established news outlets from the US and from Germany often produce trending links. To illustrate this, I aggregated the top-level domains of the trending links:

| TLD | n | p (%) | Accounts | Uses |
|---|---|---|---|---|
| .com | 9,613 | 54.76 | 3,513,595 | 3,723,705 |
| .org | 1,873 | 10.67 | 225,914 | 239,558 |
| .de | 1,603 | 9.13 | 577,777 | 614,198 |
| .uk | 664 | 3.78 | 223,719 | 237,466 |
| .fr | 484 | 2.76 | 75,397 | 80,553 |
| .net | 359 | 2.05 | 42,600 | 45,085 |
| .social | 313 | 1.78 | 34,548 | 35,394 |
| .edu | 160 | 0.91 | 16,243 | 16,785 |
| .at | 155 | 0.88 | 9,500 | 9,867 |
| .eu | 149 | 0.85 | 15,715 | 17,105 |
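A table like this could be produced roughly as follows. This is a sketch under assumptions rather than the code used here: it treats the last dot-separated label of the host as the top-level domain (which matches the table listing `.uk` rather than `.co.uk`), and the helper names are made up for illustration. It only computes the count and percentage columns.

```python
from collections import Counter
from urllib.parse import urlparse


def top_level_domain(url):
    """Return the last label of the URL's host (e.g. '.de'), or None."""
    host = urlparse(url).hostname
    if not host or "." not in host:
        return None
    return "." + host.rsplit(".", 1)[-1]


def tld_table(urls):
    """Count trending links per TLD and compute each TLD's share in percent."""
    counts = Counter(
        tld for tld in (top_level_domain(url) for url in urls) if tld
    )
    total = sum(counts.values())
    return [
        (tld, n, round(100 * n / total, 2))
        for tld, n in counts.most_common()
    ]
```

Called with the URLs of all trending links, `tld_table` returns rows sorted by count, with the share of each TLD among all links that have a recognizable host.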

Posts

Trending posts are defined as “posts from this and other servers in the decentralized network” that “are gaining traction.”

Looking at the posts with the most reblogs that are trending across instances shows how the big instances often produce these viral posts.

Trending Instances over Time (Note: the crawler for this crashed on the 22nd)

On the topic of trending posts: The interaction metrics (reblogs, replies, favs) differ depending on the instance the API request is made to. Although they should sync up over time, the state of these metrics is not always perfectly up to date.

To illustrate this, these plots show the interaction metrics for a bunch of trending posts across many instances. The red line is the “actual” metric for the post (drawn from the “home-instance” of the post). They are all lagging behind!

Plot panels: Favourites, Reblogs, Replies
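The comparison behind these plots can be sketched in a few lines of Python. This is only an illustration of the idea: `metric_lag` is a hypothetical helper, and it assumes you already hold the status object as reported by the post's home instance plus the copies of the same post as reported by other instances (field names follow Mastodon's status entity).

```python
def metric_lag(home_status, remote_statuses, metric="reblogs_count"):
    """Compare a metric on remote copies of a post with its home instance.

    `remote_statuses` maps an instance name to that instance's copy of the
    status. Returns, per instance, how far its count is behind the count on
    the home instance (positive means the remote copy is lagging).
    """
    actual = home_status[metric]
    return {
        instance: actual - status[metric]
        for instance, status in remote_statuses.items()
    }
```

Running this for `reblogs_count`, `replies_count` and `favourites_count` gives one value per instance and metric, which is essentially the distance between each point and the red line in the plots.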

Executive Summary

With decentralized social media, our understanding of trends and virality has to change. Trending on Twitter and trending on Mastodon are two very different beasts (even if they look similar).

For us as media scholars, it also means that we need to take good care when sampling the instances we get our data from. Triangulating post metrics across instances is going to help us as well, I think.


Mastodon Profile Redirect

posted on 2022-11-09

As the #twitterMigration is in full swing, a lot of people post links to their Mastodon profiles on social media. If that profile is not hosted on the instance your own profile is hosted on (which will be true more often than not), a click on the "Follow" button will bring you to a page similar to this one …

Solve WORDLE with R

posted on 2022-01-12

As seemingly everyone is playing WORDLE, I thought it would be fun to code something in R to help with the guesswork! While this technically can be considered cheating, it was still a fun exercise to code in R.

Automated Science

posted on 2019-07-12

The other day, this Nature publication was making the rounds, in which Tshitoyan et al. use word embeddings to draw latent knowledge out of the existing literature of the …

timeSeriesLib

posted on 2016-07-27

Convert case-based to time-based data. This Python library converts case-based to time-based data, which can be used in time series analysis. Usage …

Minifier for Alfred

posted on 2016-02-28

If you are a web developer, chances are, you stumbled over this piece of software. This minifier removes all unnecessary line breaks, comments and whitespace

Sass Helper for Alfred

posted on 2014-11-26

I just discovered the truly awesome CSS language extension Sass and, as usual, I wanted to use it via Alfred App. So I glued together this little workflow: Sass

Video Downloader for Alfred

posted on 2014-09-22

This workflow lets you download videos or just the audio track of a video from YouTube, Vimeo, Vine and many more [^sources] to your computer.

The workflow

Update, Move and New Look

posted on 2014-04-03

Goodbye Tumblr, hello Wordpress.org!

In the course of my move… read more.

ColorClock screensaver

posted on 2014-02-12

Time and color, combined into a screensaver!

This Mac OSX… read more.