As the Fediverse grows, rules and regulations become more important. For example, is Lemmy GDPR compliant? If not, are admins aware of the possible consequences? What does this mean for the growth of Lemmy?
Former (small-scale) data protection officer here. While I am long out of the data protection game and there are surely a lot more qualified people out there, maybe I can clear up a few misconceptions and answer a few questions that come up regularly:
(BTW: My first language is not English and all my comments/books on that topic are not in English so excuse me if my translations are sometimes not 100% accurate)
-
Does the GDPR even apply to an instance hosted outside the European Union? It absolutely does. And in fact it is harder to comply with the GDPR outside of the European Union. The GDPR applies to all data collectors (from now on DCs) that collect data of European citizens. While Art. 2(2)(a) GDPR limits the application of the GDPR to activities within the scope of EU law, the collection of EU citizens’ information clearly falls under EU law as long as the EU citizen is within the EU during the collection process.
-
So why is it harder to comply with EU law outside of the EU? Because of local laws. A good example is US homeland security laws, which contradict the GDPR (and various other EU laws) and therefore make it impossible to host EU data in the US while complying with the GDPR. Facebook had a pretty costly experience in that regard recently. To comply with the GDPR one would need to keep EU citizens out of their service AND defederate all EU instances. More on that later.
-
Does the GDPR even apply to Lemmy posts? It absolutely does! Art. 4(1) GDPR states clearly that all information relating to an “online identifier” (aka username) is already protected. So the IP addresses, etc. collected by the initial server aren’t even the only personal data. This makes the whole topic a clusterfuck in terms of federation.
-
But what about my small/medium size instance? I am not a business! I make no money. The GDPR does not care a bit about one’s intentions here - it applies to all instances that are beyond “personal or intrafamily” data collection. This basically means that you can absolutely do what you want with the data you collected at the last family reunion. Maybe one can even get away with an invitation-only private instance that only caters to a group of friends who know each other. But any DC running a public instance is, by definition, not a private DC anymore. Therefore the GDPR absolutely does apply.
-
Can I simply ask the user for permission to use their data indefinitely and however I want? One surely can ask that. But that automatically invalidates the agreement. (Funnily enough this is exactly what reddit does and why reddit is not in compliance. Which might turn out costly.) The consent always has to be revocable, amongst other things.
-
So what does the GDPR stipulate? There are three main topics we need to look at: data deletion, traceability of data transfers and, connected to this, information about data usage.
Let’s start with traceability. Because that makes the federation a federation!
-
What does traceability of data transfers mean? It basically means that a DC must record its data transfers to third parties and ensure that data is handled there according to the consent agreement with the user and the GDPR. Usually a data transfer agreement is necessary to ensure the rights of all parties. This is what makes it so difficult for a federated system: In theory an instance would need a data transfer agreement with ALL instances that federate data from it. And these instances would then need to make sure that they either don’t transfer the data onward OR that their transfer partner is covered by the original data transfer agreement as well as their own one. A recipe for a pretty nice clusterfuck.
-
What does data deletion mean? Under the GDPR every user has the right to have his data deleted from a DC. This does not include data necessary for legal obligations but basically everything else. So the user can at any point revoke his consent and make the instance delete all their data.
-
Okay, I deleted the data on my instance, do I now comply with the GDPR? Surely I can simply ask the user to go to the other instances and ask them to remove the data? No. And here is another problem: The original DC (the user’s instance) is responsible for the data handled through transfer. That’s why one needs a transfer agreement: to ensure that the data is deleted on all instances it was transferred to. There are two exceptions here: “Involuntary data transfer” is generally seen as not being part of the data handling. But that mainly applies to data scrapers like the web archive and similar usage where the data is transferred through general usage of a page that the DC cannot reasonably prevent without massively limiting the usage of their service. That would very likely not apply to a service that provides a specialised API for the transfer. The other one is a data transfer partner not complying. In that case the user can sue the DC, but the DC can sue the transfer partner for breach of contract.
-
What does right to information usage mean? Basically a user has a right to know what happened to their data. So in case of the federation: To which instances was my data transferred? How did they use it? Did they transfer it onward?
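To make the traceability and information points a bit more concrete, here is a minimal sketch of a per-instance transfer log. It is not something Lemmy provides as far as this thread establishes; the class `TransferLog` and its methods are hypothetical names, and a real implementation would need persistent storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TransferLog:
    """Hypothetical record of which remote instances received a user's content."""

    # Maps a local username to {remote instance domain: time of first transfer}.
    _entries: dict = field(default_factory=dict)

    def record(self, username, remote_instance):
        """Note that content by `username` was federated to `remote_instance`."""
        per_user = self._entries.setdefault(username, {})
        # Keep only the first transfer time; repeats do not overwrite it.
        per_user.setdefault(remote_instance, datetime.now(timezone.utc))

    def recipients(self, username):
        """Answer 'which instances received my data?' for one user."""
        return sorted(self._entries.get(username, {}))


log = TransferLog()
log.record("alice", "lemmy.example")
log.record("alice", "other.example")
log.record("alice", "lemmy.example")  # repeated transfer, no duplicate entry
print(log.recipients("alice"))
```

Such a log only covers the first hop. Whether the receiving instances passed the data on is exactly the part that would need agreements between instances.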
-
The end: What does that mean for Lemmy? To be honest: I cannot fathom a way to put Lemmy in a position of full GDPR compliance. There might be one, but I can’t imagine one that does not entail full defederation. But Lemmy can and must urgently improve its GDPR compliance as far as possible:
- We need tooling for administrators to easily remove a user’s personal information from their own instances. Currently this is still very bothersome and time-consuming manual work as far as I know.
- We need a tool to federate deletion requests. So once the administrator of the “original instance” deletes the data, a request is sent out to all instances and they automatically delete the user data then.
- We need a system to deal with instances that do not follow deletion requests. This, for example, could include a “karma” system - once you are caught not deleting user data, you get bad karma. And with enough bad karma you get defederated by more and more instances.
- We need a tool to inform people which instances did federate their data.
- We need to optimize data frugality: The less data is collected the better it is.
- We should consider data transfer agreements between the instances being set up automatically.
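The federated deletion request and the “karma” idea from the list above could look roughly like the following sketch. Everything here is an assumption for illustration: `send_delete` is a hypothetical transport callback, `KARMA_THRESHOLD` is an arbitrary cut-off, and the instance names are made up. The sketch deliberately treats an unreachable instance differently from one that refuses.

```python
from collections import Counter

KARMA_THRESHOLD = 3  # assumed cut-off; a real network would have to agree on one


def federate_deletion(username, instances, send_delete, bad_karma):
    """Ask every instance to delete `username`'s data.

    `send_delete(instance, username)` is a hypothetical callback: it returns
    True if the remote instance confirms deletion, False if it refuses, and
    raises ConnectionError if the instance cannot be reached.
    Returns two lists: (refused, unreachable).
    """
    refused, unreachable = [], []
    for instance in instances:
        try:
            confirmed = send_delete(instance, username)
        except ConnectionError:
            # Being offline is not the same as ignoring the request.
            unreachable.append(instance)
            continue
        if not confirmed:
            bad_karma[instance] += 1  # only explicit refusals cost karma
            refused.append(instance)
    return refused, unreachable


def should_defederate(instance, bad_karma):
    """True once an instance has refused too many deletion requests."""
    return bad_karma[instance] >= KARMA_THRESHOLD


def fake_send(instance, username):
    # Stand-in for a real network call, used only for this demo.
    if instance == "offline.example":
        raise ConnectionError(instance)
    return instance != "stubborn.example"


karma = Counter()
for _ in range(3):
    refused, unreachable = federate_deletion(
        "alice",
        ["good.example", "stubborn.example", "offline.example"],
        fake_send,
        karma,
    )
print(refused, unreachable, should_defederate("stubborn.example", karma))
```

How karma would be shared between instances, and who gets to report a refusal, is left open here; those are policy questions rather than code.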
In theory even then someone can sue an instance owner. Even then we are not 100% in compliance. But it is a far better position in court if one can argue that they did basically everything they could to ensure the user’s rights compared to “I don’t give a f****, your honour”.
Additionally we should lobby for change in the GDPR to include better rules for federated systems. Also because E-Mail as another federated system is not in compliance - that can easily be weaponized as a good point.
Just wanted to let you know your English is significantly better than many native speakers. Thank you for the great and amazingly detailed response!
Thank you. But especially with the legalese English it is sometimes fairly hard to find the proper translation.
Well I’m not an attorney but from my read you did great. 👍
Best answer so far thank you!
This is a great answer, you (or someone else) should make sure the devs see this! Maybe as a GitHub issue
-
It isn’t up to Lemmy to be GDPR compliant, but the individual instances.
People are struggling really bad to understand the concept of software federation
Both ways are a wheel with a hub in the center and spokes out to the wheel. The users are the spoke/wheel location, the “corporation” is the spoke/hub connection
The Old Way was users connecting to a corporation that provided a service. The corporation controls almost everything.
The New Way is that users control almost everything and connect to the hub which allows them to connect with each other.
Lemmy is the hub, instances are the users, and communities are the data shared.
Has this actually been court-tested? I get the feeling that this is all really quite grey until something in the Fediverse actually gets sued over this.
For example: when you create something (a comment, a post, a community), the “true” version exists on your home-instance, but copies also get sent and saved across the entire Fediverse. Is an instance really able to be GDPR compliant if it’s constantly “backing up” data to non-compliant instances?
On the one hand, you could make the case that these outside instances are separate entities. Like the equivalent of a webarchive. Simply being public on the internet means other people can save copies and that’s obviously all fair play under the GDPR.
On the other hand, you could make the case that saving copies to the outside instances is a lot like using third-party cookies. It’s not technically “strictly necessary” for the instance to send your data to outside instances, even though it would seriously complicate the underlying design to allow specific users to opt-out of federating their content specifically.
There’s no reason why activitypub would be considered any different from email, nntp, or even search engines and internet archives. When a website or email server gets a GDPR request it’s not propagated in any way, and it would be a stretch to expect it to.
There’s no reason why activitypub would be considered any different from email
Are you sure? Email only sends your message to servers which you explicitly ask it to. If you only trust protonmail, you can choose to only send emails to other protonmail addresses. If protonmail chose to share your emails with other third parties regardless, I can’t help but think maybe that breaches the GDPR.
Lemmy, by design, propagates copies to instances based on opaque factors outside of the user’s control, even when the UI suggests that you are sending content locally. In the case of posting a comment to a community hosted on your home instance: Lemmy will send a copy to whichever servers happen to have users that are currently subscribed to that community. It’s a very opaque outcome and pretty far from the outcome you’d experience when sending an email message to someone using the same email provider.
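As a rough illustration of the fan-out described above, here is a sketch that derives the set of receiving servers from a community’s subscriber list. The function name, the `user@domain` handle format and the example domains are assumptions; this is not Lemmy’s actual delivery code.

```python
def delivery_targets(subscribers, home_instance):
    """Return the remote domains that would receive a copy of a new comment.

    `subscribers` is a list of handles in the assumed form "user@domain".
    The poster never picks these domains; they follow from who subscribed.
    """
    domains = {handle.rsplit("@", 1)[1] for handle in subscribers}
    domains.discard(home_instance)  # local subscribers need no federation
    return domains


targets = delivery_targets(
    ["ann@lemmy.one", "bob@lemmy.world", "cat@lemmy.one", "dan@home.example"],
    home_instance="home.example",
)
print(sorted(targets))
```

The point of the sketch is that the recipient set changes whenever someone on a new server subscribes, which is what makes the outcome opaque to the person posting.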
even search engines and internet archives
Yes, but these are genuinely disconnected entities who come across the data as a user might. Lemmy doesn’t personally phone up Google and send them a copy of your comment as soon as you post it, but that’s basically exactly what happens when Lemmy federates a comment with other instances via ActivityPub.
FWIW: I think Lemmy as a piece of software is actually very aligned with the interests of the EU more generally and I think it would be a bad idea for them to come down on federated social media as a GDPR issue. I nevertheless worry that it represents untested waters and can certainly imagine a reality where it receives a raw deal from regulators.
Wouldn’t this be solvable by one of those cookie banners or some sort of waiver? After all, the only personal information I can think of that is shared is your username, which anyone can see if they just go to your instance. The post and the comments are public, aren’t they?
I would imagine that the caching that Lemmy does has been tested in court, since the intent of the cache isn’t to create a permanent copy of the data. It would likely only become a problem with GDPR if that data would stay across the instances.
As far as the federated server is concerned, the copy it has is canonical and kept forever until such a time that it receives an edit/delete signal from the original instance. I’m not really sure if you could plausibly call that caching, but I’m not a GDPR lawyer (or any variety of legal professional, for that matter) 🤷
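A simplified sketch of that behaviour on the receiving side: the local copy stays until a Delete activity from the original author arrives, and is then replaced by a tombstone. The `Create`, `Update`, `Delete` and `Tombstone` type names and the `attributedTo` property come from ActivityPub/ActivityStreams; the `handle_activity` function, the in-memory `store` and the URLs are assumptions, and signature verification is left out entirely.

```python
def handle_activity(store, activity):
    """Apply one incoming activity to the local copy store (object id -> object)."""
    kind = activity["type"]
    obj = activity["object"]
    if kind in ("Create", "Update"):
        store[obj["id"]] = obj  # the copy is kept until told otherwise
        return
    if kind == "Delete":
        # The object may be embedded or referenced by its id only.
        object_id = obj["id"] if isinstance(obj, dict) else obj
        local = store.get(object_id)
        # Only honour deletes coming from the original author.
        if local is not None and local.get("attributedTo") == activity["actor"]:
            store[object_id] = {"id": object_id, "type": "Tombstone"}


store = {}
note = {
    "id": "https://home.example/comment/1",
    "type": "Note",
    "attributedTo": "https://home.example/u/alice",
    "content": "hello",
}
handle_activity(store, {"type": "Create", "actor": note["attributedTo"], "object": note})
handle_activity(store, {"type": "Delete", "actor": "https://evil.example/u/mallory",
                        "object": note["id"]})
after_forged_delete = store[note["id"]]["type"]
handle_activity(store, {"type": "Delete", "actor": note["attributedTo"],
                        "object": note["id"]})
print(after_forged_delete, store[note["id"]]["type"])
```

Nothing in this flow expires on its own, which is why it is hard to describe the remote copy as a cache.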
the copy it has is canonical and kept forever until such a time that it receives an edit/delete signal from the original instance.
I don’t see this staying in Lemmy as the federation grows. I can’t see admins being able to sustain these costs.
Well… that’s just kind of how it has to work. Storage is cheaper than bandwidth and it’s not a close contest. Historically, storage costs have fallen faster than networks have grown and it is probably safe to assume that this trend will continue indefinitely.
FWIW: The stuff that gets federated is all text. Image uploads aren’t federated at all – those are just shared as URLs which point to the instance wherein they were originally uploaded. This is actually why things like avatars are currently so unreliable on Lemmy – they can’t scale well without there being local copies.
deleted by creator
I think this might be a reductive view.
the fediverse uses ActivityPub.
ActivityPub is a W3C Recommendation and this organisation cares about privacy.
it’s likely that the protocol will, if it doesn’t already, take care of it.
even if it’s true that it’s up to single instances, there are two further questions here (beyond how much it’s enforceable)
should fediverse help admin in the task?
should fediverse help users to protect their privacy?
and to me the answer to both is yes.
You need the protocol to implement cross-honoring of deletion requests, which is the default now. However, that deletion request could be ignored.
As others noted, it gets complicated if two instances defederate from each other, as the communication link which would process these requests has been severed.
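One way an instance could cope with a temporarily severed link is to queue deletion requests and retry them later. The following is only a sketch under that assumption; `PendingDeletes` and the `send` callback are hypothetical, and it does nothing for a permanent defederation.

```python
from collections import deque


class PendingDeletes:
    """Hypothetical retry queue for deletion requests that could not be delivered."""

    def __init__(self, send):
        # `send(instance, object_id)` is assumed to raise ConnectionError
        # when the remote instance cannot be reached.
        self._send = send
        self._queue = deque()

    def request(self, instance, object_id):
        self._queue.append((instance, object_id))

    def flush(self):
        """Try every queued request once; keep the failures. Returns the number delivered."""
        delivered = 0
        for _ in range(len(self._queue)):
            instance, object_id = self._queue.popleft()
            try:
                self._send(instance, object_id)
                delivered += 1
            except ConnectionError:
                self._queue.append((instance, object_id))  # retry on next flush
        return delivered

    def __len__(self):
        return len(self._queue)


reachable = set()


def fake_send(instance, object_id):
    # Stand-in for a network call, used only for this demo.
    if instance not in reachable:
        raise ConnectionError(instance)


pending = PendingDeletes(fake_send)
pending.request("far.example", "https://home.example/comment/1")
during_partition = pending.flush()
reachable.add("far.example")
after_partition = pending.flush()
print(during_partition, after_partition, len(pending))
```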
this to me is good though.
ActivityPub takes care of it.
this means that the fediverse is gdpr friendly.
easier situation.
out of curiosity, is it resistant to temporary partitions?
GDPR Art 4.(1) ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;
Posts in the Lemmy instances contain information relating to an identifiable natural person (by their user handle), as they contain the person’s ideas and opinions. Therefore the Lemmy instances are handling personal data and must comply with the GDPR.
Lemmy can avoid the impossibly heavy burden of compliance by becoming an underground illegal service and/or IP banning the European Union and/or abolishing the European Union.
That is a terrible option, cuts off a huge amount of potential users, and basically impossible to do fediverse wide. In fact, The European Union actually has official Fediverse accounts (on Mastodon, custom instance), and if the EU itself is willing to use a platform, that means it’s probably not gonna be taken down by the EU.
Recent events show the Lemmyverse cares neither about new users nor federation. Everything is designed to work off a single exclusionary instance or a small cabal of large instances
Can you elaborate on this? How does it not care about those things?
If you’re not in the main instance, you’re going to be handicapped in your ability to stay in the loop. First because now everything goes through federation, which was a design afterthought for Lemmy, and that means stuff outside the instance always takes second place to what is inside the instance. Then you have issues like defederation, which add extra layers of censorship for everything outside your instance. One of the biggest problems is accessing outside communities. First you have to actually go to other instances and find them. They won’t show up until at least one person subscribes. And this has to be done in every instance for every other instance and every community in each of their instances before they would even become visible. Of course, this is such a high bar that by the time you do all this, you’ll realize 99.99% of users will not go through this trouble. They will just go to the biggest community on the biggest instance.
Last problem, if you go to your instance/c/acommunity , you’ll see only that instance’s “acommunity” There is no way to refer to “acommunity” for the entire fediverse. There is no fediverse community. Only parallel, same named but unrelated communities that would require extra steps to view all at once if it were even possible.
There is a proposal, an old proposal, to create a multireddit-like feature for Lemmy. But first, the devs do not want to tear down this barrier, so they won’t do it. But even if they did, it would not work. Since you’d have to take extra action to agglomerate selected communities with a multireddit, you would be one of very few people to do so because agglomeration would still not be the default. And that means most communities would remain empty deserts anyway.
First because now everything goes through federation, which was a design afterthought for Lemmy, and that means stuff outside the instance always takes second place to what is inside the instance.
At least for me, that hasn’t seemed to be a problem. I found everything I wanted to subscribe to from my smaller server via Lemmyverse.net, and now when I look at my subscriptions page, I see all the newest posts from all those different communities. Unless you mean that it prioritizes local content on the ‘All’ page instead of subscriptions.
First you have to actually go to other instances and find them. They won’t show up until at least one person subscribes.
That isn’t ideal, I will admit. Without Lemmyverse.net it would be difficult to find everything that interested me.
But first, the devs do not want to tear down this barrier, so they won’t do it.
If Lemmy won’t, then I suspect that would leave the door open for Kbin to implement.
It’s not going to be a problem to find the communities. Since people on arandominstance.com won’t be posting on arandominstance.com/c/interestingtopic
They will know that if they did, no one would ever see it, except for the dozen other people on arandominstance.com
Instead, they’ll Google for the biggest /c/interestingtopic , find on what instance it is and go post there
We don’t get to the part of having difficulty finding them because they don’t get created in the first place
This is a natural phenomenon called the Pareto principle. Roughly 80% of the users will sign up for the top 20% of instances. It happens everywhere in nature and it’s unavoidable. But I don’t think it will be a problem, federation is designed to work like this. You’re not forced to stay on lemmy.world, you can move whenever you want.
lemmy.world should really shut off signups to allow other instances to grow
This is a really bad idea as it would discourage new users to even sign up. lemmy.world could offer you the instance list when signing up but stopping altogether is just too much
Lemmy is developed using EU funds and many of the biggest instances are in the EU
Lemmy can avoid the impossibly heavy burden of compliance by becoming an underground illegal service and/or IP banning the European Union and/or abolishing the European Union.
abolishing the European Union
Ah, yes. I believe this was step 4 of setting up your self-hosted instance.
Yes, someone please automate this
Whether GDPR applies to an individual instance will be up to those running the instance to decide.
If you decide it does, then you need to do a few things. Number one is read up advice on compliance with GDPR.
Being able to delete data alone doesn’t mean GDPR compliance. I’m thinking about the need for privacy notices on sign up, retention schedules for data, lawful basis of processing, records of processing activities… Data subjects have numerous rights; which ones apply depends on the lawful basis you’re processing under.
I’d suggest that larger general instances might want to read up more urgently than smaller single focus “hobby” instances.
edit: the more I think about this, the more I think there is a moral responsibility for the developers to help those running instances comply. If GDPR does not apply to an instance, it is still good practice to allow users to delete their data, etc… Also, art. 20 of GDPR is the right to portability. Interesting to see how this applies to fediverse platforms like Lemmy.
[This comment has been deleted by an automated system]
My statement about it being up to those running instances is meant in the sense that it’s up to them to read the legislation and come to a conclusion. If I were hosting an instance I’d certainly assume it applied, though I doubt there has been any case testing its implementation in this sort of situation.
I can see someone starting a lawsuit against a standards incompliant server that ignores deletes and edits, though.
I wonder if the first data breach will draw the attention of a regulator. We’re all using essentially alpha software, with no privacy notice, I doubt there are RoPAs or DPIAs, I doubt there is a DPO… all those things might upset someone like the ICO in the UK if a breach were to occur.
Edit: saying that, I’m not sure any breach would even be reportable given what data is collected by Lemmy.
[This comment has been deleted by an automated system]
Isn’t the key operating word here business?
With no advertising on the line and no operations currently in place operating at anything but a loss there isn’t a commercial interest at stake.
Nope, GDPR does not limit itself to businesses. It applies to all data collection.
Governments also have to comply with GDPR. They also have no commercial interest in the personal data.
deleted by creator
Regarding GDPR, one thing I’ve done as an instance admin is making clear in our privacy policies that lemmy allows you to send and receive social content and interactions across the internet in a way that’s similar to email.
Haven’t checked it yet myself, but thought it would be useful to have the actual link here ;)
I have read it now, I like it. It’s pretty much exactly the stance I took in other GDPR-Lemmy discussions. Just way more elaborate and better written. I’ll save that link for the next time it comes up :D
Actually I think it needs a lot of work regarding wording, because it doesn’t really read as too professional. But I plan on keeping the core concept. My assumption here is that the GDPR is fine with people sending emails with details over to servers that the email provider has no control over, so the same should apply to the fediverse.
Does Lemmy even need to be GDPR compliant? It’s not a company, it’s private individuals.
I (with my own single user instance), do not. As soon as you offer your service to other users, it’s different. If you are a company or not, does not matter.
Edit: So to clarify, the Lemmy developers need not worry, the instance admins do. That said, IME (literally, our local DPA contacted us about compliance issues where I work), they (DPAs) are interested in helping people be compliant, not suing them.
This isn’t true since your single user instance is federated. For example, this comment is going to end up on your instance, and it could have my personal data.
edit: here’s a meta-link to this comment on your instance: https://lemmy.cwagner.me/comment/2786 – despite it originating from lemmy.one and the post being lemmy.ml from a user on lemmy.world (interestingly every person involved in this interaction is on a different instance)
That is a very different way of looking at it. I take the view of this Lemmy privacy policy that you are essentially sending your comment to me, just like an e-mail.
Though unlike an email, it’s public on my instance for now, so yeah, you have a point there.
My eventual plan is to make my instance only visible for logged in users (= only me), but I heard that for now that (the private instance flag) is not possible with federation.
deleted by creator
You can disable most endpoints in your application firewall, or put them behind a whitelist. For federation to succeed you don’t need all that many publicly reachable endpoints (mostly a bunch of inboxes and the data for your own user account).
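The allowlist idea can be sketched as a simple path check. The path patterns below are illustrative assumptions and NOT a verified list of Lemmy’s routes (the user name `admin` is made up as well); check the routes of your own Lemmy version before building firewall rules from anything like this.

```python
import re

# Hypothetical patterns for endpoints that stay publicly reachable.
# Paths are matched without their query string.
PUBLIC_PATTERNS = [
    re.compile(r"/inbox"),                    # assumed shared inbox
    re.compile(r"/(u|c)/[^/]+/inbox"),        # assumed per-user/community inboxes
    re.compile(r"/u/admin"),                  # assumed profile of the single local user
    re.compile(r"/\.well-known/webfinger"),   # account discovery
]


def is_allowed(path, client_ip=None, trusted_ips=frozenset()):
    """Allow trusted clients everywhere, everyone else only on public endpoints."""
    if client_ip in trusted_ips:
        return True
    return any(pattern.fullmatch(path) for pattern in PUBLIC_PATTERNS)


print(is_allowed("/c/news/inbox"), is_allowed("/comment/2786"))
```

In practice the same rules would live in the reverse proxy or application firewall rather than in Python; the function only shows the decision being made.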
Is there a guide somewhere? Because experimenting when federation is already as unstable as it is, is hard.
My post will end up on your server but also on the server this community is hosted on, from which it’ll end up on hundreds or thousands of other servers. I’ve never agreed to any of their privacy policies and terms of service and neither has anyone else here.
Just like with e-mail, yes. Sending an e-mail to user@example.org does not make you agree to the example.org TOS and PP. Or more relevant to federation, sending an e-mail to a mailing list will end up on hundreds of servers. This is not that new a concept.
deleted by creator
Thanks, I’ll bookmark this and have a look when I have some time :D
[This comment has been deleted by an automated system]
It doesn’t apply to purely personal use. See Article 2(2)(c). For shits and giggles would fall under that.
[This comment has been deleted by an automated system]
I agree. I was replying to your comment that GDPR applies to private data collection for shits and giggles, which isn’t correct. For Lemmy, I’m certain it applies. GDPR applies to small churches even
For now anyways, I can see that changing in the future. Company centric instances with communities for each of their product lines.
If I understand correctly, the GDPR includes provisions or restrictions on transferring personal data from the EU to third countries. So, I’m wondering if Lemmy and Fediverse replication follow these GDPR regulations.
Lemmy is GDPR compliant, as far as I know.
Admins can entirely purge you off their instance, should you ask them to, and other servers do not store any personal details that GDPR would require be deletable. By most interpretations.
It can be argued that previously federated data that is now out of reach and as such cannot be deleted, could constitute a breach of GDPR.
deleted by creator
There’s not just ignoring the request.
An instance can simply be offline when the request is made. Or be defederated.
[This comment has been deleted by an automated system]
Other servers do store personal data. Any post or comment made by a user is personal data as it contains the thoughts/ideas of that user.
GDPR Art 4.(1) ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;
That’s one interpretation. One I alluded to.
But you can also argue that if the person who made the comment is unidentifiable, there is no “natural person” to make the data GDPR related.
Well that depends on the comment, doesn’t it? As far as I understand it, if I posted personal information about you, such as your name, home address, etc, in a comment, you could demand from the admin to remove that comment as it would contain personal information you don’t want in the open.
Personal data posted by the user also falls into this, so they might have to force deletion on any instance hosted by organizations. Individuals or small teams running instances which don’t take money don’t need to comply with GDPR.
Individuals or small teams running instances which don’t take money don’t need to comply with GDPR.
Are you sure about that? So if I hosted a website that shows your name and address, you could do nothing to make me take it down because I’m not an organisation or company?
Yeah, but I imagine that could be handled via email. The tricky thing is to verify that the email is coming from the account in question, but that could be done by posting or commenting a specific phrase.
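The “post a specific phrase” check could be sketched like this. The function names and the account handle format are assumptions; the admin would send the generated phrase to the email address that made the request and then look for it in a comment from the account in question.

```python
import secrets


def new_challenge():
    """Create a one-off phrase the requester has to post from their account."""
    return "delete-request-" + secrets.token_hex(8)


def comment_proves_ownership(challenge, claimed_account, comment_author, comment_body):
    """True if the claimed account itself posted a comment containing the phrase."""
    return comment_author == claimed_account and challenge in comment_body


challenge = new_challenge()
ok = comment_proves_ownership(
    challenge,
    claimed_account="alice@lemmy.example",
    comment_author="alice@lemmy.example",
    comment_body=f"Verifying my deletion request: {challenge}",
)
print(ok)
```

The phrase has to be random and used only once, otherwise anyone who saw an earlier verification comment could reuse it.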
I am pretty sure it isn’t fully compliant and needs to be. Could be significant issues if not.
There are definitely references to GDPR but there are basics which are not even there yet.
It is a serious business, so hopefully a high priority on the backlog.
It will be interesting to see how it is addressed centrally and across instances, and how quickly it is tested.
I also created a topic about this one, and there were quite a few different opinions in it. Check it here: https://lemmy.ml/post/1409164