I recently set up a new Lemmy instance and was surprised when my feed was mostly empty. I’ve since learned that a key part of Lemmy’s federation is based on a user from your instance subscribing to communities on other instances. Only then will your instance pull in posts from the subscribed community to your “All” feed.
This means that subscribing to new communities is especially important if you’re on a young Lemmy instance since it helps to build out everyone’s feed on that instance.
I’ve found discovering new communities to subscribe to on other instances can be difficult. To help me search for new communities I may be interested in, I tried aggregating as much of the Lemmy fediverse together into a single feed by subscribing to the widest range of Lemmy communities possible. This offers a Lemmy feed that’s kind of like reddit.com/r/all. If it’s interesting to anyone else, you can find the instance here: https://lemmy.directory.
Hopefully this offers another way to find new communities to subscribe to on other instances.
Here’s a better description of my understanding of how Lemmy federates communities and why you might be interested in checking out lemmy.directory: https://lemmy.directory/post/34207.
Hope this helps ease the orientation to how Lemmy federation and communities work.
You might want to update your instance link to be https://lemmy.directory/home/data_type/Post/listing_type/All/sort/Active/page/1. Ironically, it defaults to the local feed which is… empty. You could probably make nginx rewrite the homepage to be the all feed as well so the simple/nice url does what you want.
You might also want to add a section to your post writeup about federation load and why Lemmy doesn’t do this by default. In a world where Lemmy is very successful and there are lots of instances (many thousands) that subscribe to all communities like you’re doing…
Very cool project though. Having an “all” instance sounds like a great service for discoverability. I’d also be interested to see a writeup from you about the hosting requirements. Does it take a lot of CPU/ram/disk to receive the full lemmyverse right now, or is it trivial? Your performance profile could be an interesting leading indicator of replication load as the network grows.
Do instances fully replicate and locally store remote subscribed communities? My understanding is they are still solely hosted on the original instance; subscribing just opens a window to the community by making your instance aware it exists.
To a first approximation yes, they replicate the posts and comments made after the time of subscription. Images aren’t replicated, but posts, comments, votes, and mod actions do replicate.
I don’t know what you consider “a window” to be, and maybe this is a fuzzy description of some steps of community discovery… but it’s definitely not a complete or coherent view of how instances interact through federation.
Thank you for taking the time to answer. I hope you might be willing to clarify a bit more for me. By “window”, I meant just… having access to a remote community via an API gateway, I guess.
I was under the impression that if I try to subscribe to a remote community hosted on lemmy.world from vlemmy.net, that is simply registering the URL of that community into some local directory in my instance, not duplicating the entire community contents into vlemmy.net. And then when I view a thread in that remote community, I am just retrieving the thread data from the host server at lemmy.world straight to my browser, not loading some local duplicate of the thread from vlemmy.net. Seems like it would get out of sync quickly if we are all reading separate local copies of the original.
So based on your answer, I am still misunderstanding something. What is the purpose of all the duplication then? Is it just for local caching purposes? Does this not needlessly drive up the amount of traffic because each instance is frantically trying to keep up to date with every other instance, rather than just letting each instance handle the requests for its own communities?
Pretty much everything in your summary was wrong. I can’t reasonably type out a complete primer on federation here, but in short… When the first user on an instance subscribes to a remote community, the subscribing server tells the community-hosting server “send me future updates about posts, comments, and votes and such for this community”. The subscribing server then stores them locally.
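To make the subscription step above concrete, here is a minimal toy sketch in Python. It is not Lemmy's actual code or data model: the `Instance` class, its methods, and the `technology` community name are all made up for illustration, and real federation sends ActivityPub messages over HTTP rather than calling methods directly. It only shows the shape of the idea: the first local subscriber registers the instance as a follower, and from then on the host pushes new activity, which the subscriber stores locally.

```python
from collections import defaultdict


class Instance:
    """Toy stand-in for a Lemmy server (hypothetical, not Lemmy's real model)."""

    def __init__(self, domain):
        self.domain = domain
        # community hosted here -> remote instances that asked for updates
        self.followers = defaultdict(set)
        # "community@domain" -> activities stored in this instance's own db
        self.local_copy = defaultdict(list)
        # remote communities this instance already follows
        self.following = set()

    def subscribe(self, host, community):
        """A local user subscribes to `community` hosted on `host`."""
        key = f"{community}@{host.domain}"
        if key not in self.following:
            # Only the first subscriber needs to tell the host
            # "send me future updates for this community".
            self.following.add(key)
            host.followers[community].add(self)

    def publish(self, community, activity):
        """A new post, comment, or vote on a community hosted here."""
        key = f"{community}@{self.domain}"
        self.local_copy[key].append(activity)
        # Push the activity once to every subscribed instance.
        for follower in self.followers[community]:
            follower.receive(key, activity)

    def receive(self, key, activity):
        """Store a federated activity locally."""
        self.local_copy[key].append(activity)

    def read(self, key):
        """Serve a read from the local copy, without contacting the host."""
        return list(self.local_copy[key])


world = Instance("lemmy.world")
vlemmy = Instance("vlemmy.net")

world.publish("technology", "post made before anyone subscribed")
vlemmy.subscribe(world, "technology")
world.publish("technology", "post made after subscribing")

print(vlemmy.read("technology@lemmy.world"))
# -> ['post made after subscribing']
```

Note that the earlier post never reaches vlemmy.net in this sketch, which matches the "posts and comments made after the time of subscription" description above.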
Why do this actual replication rather than just an API gateway? Consider lemmy.ml recently, when it was struggling and frequently down due to overload. lemmy.world has commonly had around 4k active users this week. But it only takes one batch of federated replication messages for the community’s instance to serve the browse traffic of all those users… then the reads come out of lemmy.world’s db. This spreads the browse workload around the federated network.
Can federated networks have problems where the replication traffic becomes “too much”? Yes, they must be carefully designed to avoid that problem. For some apps, single-user instances can even be anti-efficient, since the federation workload for the instance can exceed the browse workload from its one user; for multi-user instances, though, the federation messages are a rounding error compared to the browse work. Replication overload is a problem that federated networks generally weather, and the replication offers benefits that on balance outweigh the costs.
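A rough back-of-the-envelope version of that tradeoff, as a Python sketch. Only the 4k user count comes from the comment above; the page views per user and the number of new activities are invented purely for illustration, and the model ignores batching, retries, and images. The function names are hypothetical.

```python
def gateway_requests_to_host(users, page_views_per_user):
    """API-gateway model: every page view is served by the hosting instance."""
    return users * page_views_per_user


def replication_messages_from_host(new_activities, subscribed_instances):
    """Replication model: each new activity is sent once per subscribed instance."""
    return new_activities * subscribed_instances


# Assumed numbers, not measurements.
PAGE_VIEWS_PER_USER = 20   # hypothetical
NEW_ACTIVITIES = 500       # hypothetical posts/comments/votes in the same period

# A large instance with ~4k active users following a remote community.
big_gateway = gateway_requests_to_host(4000, PAGE_VIEWS_PER_USER)
big_replicated = replication_messages_from_host(NEW_ACTIVITIES, 1)
print(big_gateway, big_replicated)      # 80000 500

# A single-user instance following the same community.
solo_gateway = gateway_requests_to_host(1, PAGE_VIEWS_PER_USER)
solo_replicated = replication_messages_from_host(NEW_ACTIVITIES, 1)
print(solo_gateway, solo_replicated)    # 20 500
```

Under these made-up numbers the host sends 500 messages instead of answering 80,000 reads for the large instance, while for the single-user instance the 500 messages exceed the 20 reads they replace. The second argument also shows why the host's outgoing load scales with the number of subscribed instances times the posting rate of the community.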
No worries. I appreciate the time you have taken to explain things. I have watched a few videos and Googled around, but unfortunately most of the results I find are either way too vague (Lemmy is part of the Fediverse. What is that? We don’t know either!) or give the analogy of “It’s like email” and then proceed to basically explain the API gateway thing I was assuming.
I will dig into this some more now that I know what I’m looking for; thank you. I’m hopeful there will be some more/clearer/accurate resources for the Reddit refugees before the current frenzy dies down to help build up the network.
Thanks for the tip, I updated the link! And I’ll work on drafting some more text about the implications of subscribing to everything.
The rate at which a small instance would need to send off copies would be tied to the frequency of posts to the community, right? I’m interested in aggregating the lemmyverse, but wouldn’t want to overburden other instances in doing so. Although, honestly, I’m more worried about overburdening my own instance in trying. Were you suggesting that it might be rude to subscribe to a community just for discoverability?
So far the hosting requirements seem moderate, although I’d imagine if more people browse the feed the CPU and database reads will spike. I’ll keep an eye on them. I’m not sure if this project is feasible/needed yet, but I thought it might be interesting to try. And if it ever helps, I’d imagine it would be while a bunch of new instances are starting up.