I’m really enjoying lemmy. I think we’ve got some growing pains in UI/UX and we’re missing some key features (like community migration and actual redundancy). But how are we going to collectively pay for this? I saw an (unverified) post that Reddit received $400M from ads last year. Lemmy isn’t going to be free. Can someone with actual server experience chime in with some back-of-the-napkin math on how expensive it would be if everyone migrated from Reddit?
By not all ending up on the same server.
A small cloud server + a domain name costs less than a Netflix subscription. The developers have taken care to package lemmy in ways that are relatively straightforward to deploy, so a dedicated person with a small amount of experience can have an instance up and running in an evening. As long as a small percentage of users are willing to pay the cost of a Netflix subscription to keep a server running, the financial burden will be spread out.
I think this underestimates how users will naturally gravitate towards more centralized instances, or they’ll give up because the bigger instances are closed. Someone’s gotta pay for it, and it’s going to cost more than a Netflix subscription. Servers aren’t cheap.
This also ignores that the system isn’t horizontally scalable at all, so scaling up gets even more expensive
I think this underestimates how users will naturally gravitate towards more centralized instances, or they’ll give up because the bigger instances are closed.
(This is purely my personal opinion, of course!) In the scenario in which a few large instances dominate, the idea of the fediverse has failed. One may estimate the likelihood of success or failure given how they expect humans to behave, but in the end experiment beats theory. I think that for the fediverse to work a significant cultural shift has to occur, but I don’t think that it is an impossible shift. I would like the fediverse to succeed, and so I choose to take part in the experiment.
This also ignores that the system isn’t horizontally scalable at all, so scaling up gets even more expensive
Yes, that might cause some serious issues. The project is still in an early development phase, and I don’t understand the technical aspects well enough yet to identify whether there is an obvious, fundamentally insurmountable barrier when it comes to scalability. My optimistic hope is that the developers are able to improve horizontal scalability fast enough to meet the demand for scale. If it turns out to be impossible to scale, then only sufficiently rich parties would be able to run viable instances, and that could be a reason for failure.
What does ‘horizontally scalable’ mean here? I haven’t come across that before.
This is what I think, but if anyone understands it differently please correct me.
Vertical scaling refers to scaling within a single instance. More users join and they post more content, increasing the amount of disk space needed to hold that content, the network bandwidth needed to handle many users downloading comments and images at once, and the processing power required.
Horizontal scaling refers to the lemmyverse growing through the addition of new instances. The problem with this kind of scaling comes from the resources an instance has to spend on its interactions with other instances. You might create a small instance without a lot of users, but the instance may still need a lot of resources if it pulls in a lot of information (posts, comments, user information, etc.) from other, larger instances. For example, at some point a community on lemmy.ml might become so popular that subscribing to it from a small instance would be too much of a burden on that instance, because of the amount of disk space required to store the constant stream of new posts. Horizontal scaling becomes a problem when the lemmyverse gets so large that a machine with only modest resources can no longer take part, because its disk fills up in a few hours or days.
You can summarize by thinking of vertical scaling as “make the machine bigger / more powerful” and horizontal scaling as “add more machines”.
Kind of like building a very large/tall building vs having multiple buildings!
I don’t believe this is how it works though.
Let’s say your tiny 3-person instance is connected to a big one. I believe it only pulls in content from the communities somebody from the small instance is subscribed to. Correct me if I’m wrong.
That’s what they’re saying.
Essentially - if someone from the small instance subscribes to a community that has a ton of data (huge post volume, images, whatever), the small instance needs to pull that data over from the larger instance. At some point there may be communities so large that small instances can’t pull them in without tanking.
Could that be solved by caching? Can’t the smaller instance avoid some duplication?
That’s what I’ve gathered, but I don’t believe there’s a way for instance owners to limit what’s fetched - a user crafts the query and the server does the needful.
I imagine this could amount to a denial of service attack of sorts, if some high-churn communities are imported into tiny instances. How bad that could be, I have no idea - I’m speaking pretty theoretically, here. Text is tiny, after all, so it’s probably not much of a concern, since most of the media is actually handled elsewhere…
I’m not a web developer. I’m sort of a sysadmin, so I have some experience maintaining machines running web apps for other people. And you are right… text will not create massive amounts of data. But a lot of tiny transactions can bring machines down surprisingly fast, even if the total amount of data is relatively small.
I guess we are here to experience it first hand. I don’t think anybody, not even the developers, has a clear idea of how well this will scale. There is only one way to find out lol
Some things can go faster if you add more workers; some things can only go faster if you make the workers bigger or faster.
If you’re tidying a garden you can get it all done more quickly, and tackle bigger gardens, by getting your friends to help. That’s horizontal scaling.
If you need to get a parcel from your house to Burkina Faso the only way to do that more quickly is to use a bigger, faster machine. That’s vertical scaling.
The way Lemmy is designed right now (says the OP; I don’t know the details), you can only support more users by making the server bigger and more expensive, not by using lots of smaller servers.
Edit: note that Lemmy as a whole scales horizontally: more instances == more users, but each instance has to scale vertically.
It’s certainly a challenge. I run three instances on the #fediverse: two small ones and a larger one with 450 users. I have a donation page; initially people were enthusiastic and I covered costs. Now I have regular monthly donors, but not enough to cover costs, so I am subsidising it. I took the decision when I launched that this could happen and that it’s my problem.
I think many instances will fall away in the coming months due to costs, especially those with thousands of users and the associated costs.
We need to come up with a new funding model, where people appreciate that you get nothing for nothing. All the large corporations sell your data to advertisers for revenue. The general public do not appreciate that they are selling their soul.
Is there an approximate specs per number of users guide to size a lemmy instance?
Yeah, I’d love to get an approximate sense of how much these instances cost
I haven’t seen one yet. Disk usage this morning on lemmy.world was reported at about 4 GB over 11 days (probably low usage). The 100 GB drive would probably fill up in 275 days or so if usage did not increase. And if it’s not redundant and dies, all that content is lost.
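For anyone who wants that back-of-the-napkin math spelled out, here it is as a quick Python sketch using the numbers above (it assumes the growth rate stays flat, which it almost certainly won’t):

```python
# Back-of-the-napkin storage math using the reported lemmy.world numbers.
used_gb = 4        # reported disk usage after 11 days
days_elapsed = 11
disk_gb = 100      # total drive size

daily_growth_gb = used_gb / days_elapsed        # ~0.36 GB/day
days_until_full = disk_gb / daily_growth_gb     # assumes growth stays flat

print(f"growth: {daily_growth_gb:.2f} GB/day")
print(f"{disk_gb} GB drive full after roughly {days_until_full:.0f} days")
```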
So storage will be a huge issue for lemmy unless I’m missing something.
Can content be stored somewhere like S3 instead of spinning disk? It would certainly be more robust and cheaper.
100GB is practically nothing nowadays.
There are people (myself included, not to brag) running home servers with literally hundreds of terabytes of data. At that ~0.3 GB/day number, I alone could host 3,500 years worth of data. Get some of those r/DataHoarders and r/HomeLab guys on here and Lemmy would never run out of space.
As everyone else has already said, that’s a very good question, one that doesn’t necessarily have an answer, but I’m not too concerned.
I’d point out (rather excitedly) that this really isn’t unlike how the Internet used to be up until the late 00s or very early 2010s and the rise of insta, FB, birdsite, digg and reddit. EVERYONE had to shoulder hosting costs (unless you were on Geocities or Myspace, in which case it was ads).
Yes, we’ve had bulletin boards and discussion forums since Perl and CGI were a thing; each was self-hosted at the hoster’s expense. Newsgroup and IRC servers too - THOSE all acted like “federated” instances: common newsgroups and chat channels would be synchronized and replicated from server to server EXACTLY how federated Lemmy/Kbin/etc. instances do it now.
And the infrastructure costs were a struggle then, and they will be now. Back then, having a capable CGI forum host, or colocating your server in someone’s data center, cost a lot - decent hosting/co-lo plans started at $50/month and went up from there. Most hosting plans had steep bandwidth caps, think something like 5 GB included and +$5 per GB - and if you hosted a popular site, 40-50 GB of traffic wasn’t abnormal. If you ran a newsgroup server you frequently had to futz with how long newsgroup messages were retained to save disk space; like 48 hrs or less (then the data would be purged).
What you can get for $50/month THESE days is quite a lot more capable, and you can run a low-retention instance for a lot less. Bandwidth and disk space are ludicrously cheap (at least compared to 10-15+ years ago). If your instance has few users, few communities, and reasonable data retention/cloning, you could run a Lemmy, Mastodon, or Calckey server on an old computer you have kicking around and host it from your home internet connection with a dynamic DNS mapping.
Obviously the big instances with gobs of users will struggle with how they pay for the server infrastructure - some will use crowdfunding, patrons, donations etc. Others will run ads, or subscriptions.
My home instance lemmy.ca is at 1400 users (as of right now) and is on a $25-30/month hosting plan, and so far the site is doing just fine (or seems to be). I’d guess that a massive instance like lemmy.ml might be north of $100-200. But if you think about it, all you need is 20 people donating $10/month. I donate yearly to Wikipedia. As they discuss in this thread here https://lemmy.ca/post/599590, Mastodon gets €28k a month in donations and pays for two(?) full-time developers, so it’s not like there aren’t people donating to open source projects… and so far Fediverse servers are doing fine.
Do you happen to know what service that’s with? Trying to see what resources that takes. ($25-$30 can mean very different specs depending on where it is hosted). Ty.
Dunno. Hey @smorks@lemmy.ca, what is lemmy.ca’s host provider/plan (unless it’s top secret Canadian Moose Power Secrets)?
It’s currently running on a $14 USD/month plan with 4 CPUs and 2 GB of RAM, but I’m going to bump it up to the $28/month plan with 6 CPUs and 4 GB.
That’s why I’m running my own server. Mastodon is much bigger than Lemmy and it does fine with community run servers.
Yeah technical users and people who are friends with technical users can just use a micro instance.
Yep, I’m using lemmy.world mostly as a trial period, but I plan on getting on my own instance to help reduce the load on the community any way I can. Will probably get a few friends on it as well.
You bring up a very good point. Currently lemmy.ml has thousands of users. Lemmy.world has thousands of users. The hardware they have selected to run their instances is adequate for now, but what is the plan for scaling out if the user base grows? Is there one? They have a donation page on each lemmy instance (click or tap the heart icon), but that can’t be enough to pay for the cost of running something used by millions of people, even if only hundreds of thousands are ever online at any given time.
In terms of UI/UX, @dessalines@lemmy.ml has mentioned in a post that they are currently working on major performance improvements and enhancements.
Ideally, I think no one instance should have a million users to begin with.
User caps might be a good way to decentralize and ensure that we don’t end up with just a few mega-instances. If there were a page showing available instances with percent of max users then people could use that when selecting.
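Just to sketch the idea, something like the snippet below could build that listing. The caps and the instance list are made up for illustration, and I’m assuming the standard Fediverse NodeInfo endpoint, which Lemmy serves as far as I know.

```python
# Rough sketch of a "percent of max users" instance listing.
# The caps and instance URLs are hypothetical; /nodeinfo/2.0.json is the
# standard Fediverse NodeInfo endpoint (served by Lemmy, to my knowledge).
import requests

# hypothetical per-instance user caps that admins might publish
instances = {
    "https://lemmy.ml": 50_000,
    "https://lemmy.world": 50_000,
    "https://lemmy.ca": 10_000,
}

for url, cap in instances.items():
    info = requests.get(f"{url}/nodeinfo/2.0.json", timeout=10).json()
    users = info["usage"]["users"]["total"]
    print(f"{url}: {users} users ({users / cap:.0%} of a {cap}-user cap)")
```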
Ideally, yes. If that can be the reality, and I suppose that is how it should work with federation, then server costs should never get out of hand.
For that to happen, I believe that interacting with people from other instances and moving your community and account from one instance to another have to become possible / easier.
At present, people flock to the instances with the most users, as those often have more local content (local content is generally easier to find than federated content) and they often have a smaller risk of shutting down. If I create a community on a smaller instance, the chances of it being found and interacted with are also much smaller than if it had been created on a bigger instance (because of, as I said, local content being easier to find).
Sure, I can create an account on myfirstlemmyinstance.com (example URL, not an actual instance) with 10 users, but if my instance decides to shut down, my community of, say, 500 users will now have to move somewhere else and all old content will be deleted.
A “transfer my community” feature that allowed an entire community to be moved between instances would certainly help. That’s a great idea.
From what I’ve seen so far looking through the Postgres db, every instance has data from most other instances. I see users in my local Postgres db from other instances. So, theoretically moving a community from one instance to another could be as simple as changing a few values in the database. Of course in practice it’s never that simple. 😀
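Purely as an illustration of what “changing a few values” might mean, something like this sketch - the table and column names are what I think I saw in the schema and may differ between versions, it completely ignores how the change would propagate over federation, and you definitely shouldn’t run it against a live instance:

```python
# Illustrative only: repoint a community's home instance by editing the
# Postgres db directly. Table/column names (community, actor_id, local)
# are assumptions based on poking at the schema and may not match your
# Lemmy version; federation would still need to agree with the change.
import psycopg2

conn = psycopg2.connect("dbname=lemmy user=lemmy")
with conn, conn.cursor() as cur:
    cur.execute(
        """
        UPDATE community
           SET actor_id = %s,    -- hypothetical new home for the community
               local    = false  -- no longer hosted on this instance
         WHERE name = %s AND local = true
        """,
        ("https://newinstance.example/c/mycommunity", "mycommunity"),
    )
conn.close()
```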
Wouldn’t it require changing all those values in the database of all instances with subscribers to that community?
Good question. I don’t know. Hypothetically speaking, if the parent instance of the community changes the relevant data in the database to another instance, would federation take over and automatically propagate the change? 🤷🏻♂️
Sounds like an interesting experiment at least, or a possible major bug waiting to happen.
Idk about everyone else, but when I was on reddit, once I had set up the subreddits I wanted to see, I really spent 99% of my time on just those. Every so often I would leave or join subreddits, but it was rare. So if people are not doing searches that often, the lag is more tolerable. Plus, won’t content from larger and older instances be indexed by search engines eventually? Right now, because so many communities are being created on so many different instances, it’s more obvious that the searching is laggy, but things will surely settle down as time passes.
It is an old programming trope that premature optimization is a waste of time. As Lemmy scales, several bottlenecks will be hit. Some might be predictable, but many will only become evident after crossing a specific threshold. There are a lot of guides for scaling Mastodon servers after hitting certain bottlenecks, but this is all uncharted territory for Lemmy and we’re going to find out the fun way.
Real devs do it in prod!
I’m seriously tempted to write some performance tests in jmeter, locust, or k6, and fire up some live traffic simulations / simulated load against my lemmy instance to see what happens. But at the same time that would feel too much like work and I don’t want to work over the weekend.
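If anyone wants a starting point, a minimal locust script might look something like this (the endpoints are the public v3 API paths as I understand them; adjust for your instance and Lemmy version):

```python
# locustfile.py - minimal sketch of simulated read traffic against a
# Lemmy instance. Run with: locust -f locustfile.py --host https://my.instance
# Endpoints are the public v3 API paths; adjust for your Lemmy version.
from locust import HttpUser, task, between


class LemmyReader(HttpUser):
    wait_time = between(1, 5)  # each simulated user pauses 1-5 s between requests

    @task(3)
    def browse_posts(self):
        # front-page-style post listing
        self.client.get("/api/v3/post/list?sort=Hot&limit=20")

    @task(1)
    def load_site_info(self):
        self.client.get("/api/v3/site")
```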
Sounds like a great github issue though that we can fund via bountysource or someone with more free time can take a look. Mind creating it?
For an example of a problem that will only come up once it’s popular enough, I think hexbear has found that when comment threads get too long (like 800+ nested comments), lemmy starts to break
I think the biggest cost will be image/video storage. The text takes very little space in today’s standards. The good thing is that symmetric fibre internet connections are becoming more common so it may be possible for members of the instance to contribute unused disk space to help with its image/video storage. This plus limiting the image/video sizes (and maybe forbidding video uploads altogether) will allow the instances to scale with user count.
This works on the same principles as FidoNet, Usenet, email, etc. These protocols are more like fundamental services. The idea behind them was that instead of running a bunch of proprietary garbage, you would run things that support A LOT of standard protocols. Why? Because NO ONE should be allowed to own our communications but ourselves.
The corps did not build these networks, we did. Software will improve over time, OSS shows the way.
In all honesty, there are a ton of us tech enthusiasts who have no problem paying $10-20 per month to run an instance out of our own pockets. We get the ability to subscribe to the content we used to use Reddit for, and we can have a few folks hop on with us. Multiply that by a bunch, add in community-funded instances, and we’ll be fine.
Gotta consider server costs were only a fraction of Reddit’s costs. Salaries are quite pricey, and we have lots of folks volunteering time which will make it all work.
I signed up for the lemmy.ml Patreon and am happy to support an open, federated site like this. I’d never pay for Reddit Gold, Twitter Blue, Discord Nitro, or any of those other nasty pay-to-win commercialized things, but I’ll pay to keep an open platform from implementing stupid “premium” bullshit.
Don’t forget to make a donation to your instance if you love it. For most of us it’s a bit early, but I give my 10 EUR per year to my Mastodon admin. Also, if you can, choose an instance run by a non-profit rather than a person, as it eases the whole donation mechanism and gives you the right to check where your membership dues go.
Is there a list of non-profits that run an instance?
The distributed nature of Lemmy should make things more manageable. Personally, I’m running an instance on a dedicated machine I already pay for, so it’s not costing me anything unless storage skyrockets. Many other instance hosts are also hobbyists that don’t mind covering the costs, and may take some form of donations locally on their sidebars.
There probably should be a built-in feature for instance admins to enable a local donation button to contribute to their costs, though. While Lemmy is fairly resource-efficient, larger instances are eventually going to require pretty beefy VMs to keep up with the traffic, image uploads, etc. I could see some instances randomly vanishing when their owners can’t/don’t keep up with their bills (which would force users over to other instances), but ideally if any instance owners can’t afford to cover it, they hand control over to another community member to pick it up.
In general, Fedi admins simply close registrations when they can’t keep up with an influx of new users, and point people to other, smaller instances
This is pretty much my exact same situation as well, plus I get so few opportunities to “pay it forward” so to speak, and now is finally my chance to do so.
The cost will be spread out, and people can monetize how they see fit. I’m wondering if there will be additional benefits you can add to your instance for a charge that people might be willing to pay.
I’m considering offering an Element server and maybe email on mine with a shared username for each service. That’s going to take time to setup, though.
We must prove that it’s valued and let the monetization come later. I’m working on this in my spare time. Once I can grow, maybe I can put more effort into it. I think it’ll be a lot of people like me for a while financing it out of pocket.
I just signed up for the Lemmy.ml Patreon. I wish they had a $5 per month option, but I can just skip ordering DoorDash one extra time and help pay for this instance. I use the hell out of it so it’s the least I can do.
Donating $20 a month over here, hope it helps