Hey folks! Just realized something that makes Lemmy different from Reddit. Because of the federation, your votes are not technically anonymous on Lemmy. At least, I think.
Although there’s no UI to look at a user’s voting history yet, one could conceivably be built by an instance. Perhaps coincidentally, I hear there’s instances out there populated by mostly bots?
From a technical standpoint, it’s not different from Reddit. The only difference here is that normal people can host their own instances, whereas Reddit is only hosted by the company and they can keep it under wraps.
Agreed from a technical standpoint.
But the implications are still interesting. One might (big might) trust Reddit as an organization not to use this data for evil, but with federation, there’s nothing stopping an instance from simply releasing all users’ voting history to be public.
Of course, my instance didn’t even ask for an email to sign up, so my entire account is anonymous that way.
I wonder if there are technical ways to federate votes anonymously?
Yeah, I wonder how you can federate anonymously while still maintaining defenses against vote manipulation.
I think you could probably do something like have the votes be reported in aggregate by the instance.
Any individual instance admin could use defences against vote manipulation by their own users, and other instances’ admins could use defences against one particular instance being widely used for vote manipulation.
I know some privacy oriented services (Brave Browser comes to mind) aggregate telemetry data like that to preserve privacy. Perhaps something like that is possible for Lemmy as well.
Someone could just run a rogue instance and host all their bots on there, hiding it from everyone else.
Right, but that’s where defederation comes in. Good faith admins can detect their own users and selectively ban them, while bad-faith admins running a server full of brigaders can be defederated if, for example, they detect anomalous patterns coming from that instance.
but with federation, there’s nothing stopping an instance from simply releasing all users’ voting history to be public.
Which kbin.social does.
Maybe you could hash the user and post together somehow, so the result is hashed but also unique per post. If you only hashed the username, the user’s entire voting history would be exposed if the hash were reversed.
Could be hashed and salted, with a random salt.
The trouble is, then, that it’s harder to disallow users from voting multiple times if the voting user isn’t on the post’s home instance.
Couldn’t someone vote multiple times anyway by just having a bunch of different accounts?
Yes, true, the current system does allow that. But the current system also doesn’t allow users to accidentally vote twice (and it remembers your vote)— this is the feature I think would be more challenging to implement if we were to hash & salt the user’s ID.
That’s always been a problem on Reddit and is on Lemmy now too though
Hashing can’t effectively protect known values. If you want to know if someone voted for a post you can just hash their username and post ID. This is trivial and cheap.
If you want to know who voted on a post you just find every username you can find and hash it. It isn’t super cheap but isn’t very expensive either. There are only about 8 billion people on the planet, and many bitcoin rigs can compute that many hashes in seconds. Sure, you can use a more expensive hash and there may be more accounts than people but it will remain feasible.
This is the same reason you can’t hash phone numbers in a meaningful way.
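To make that concrete, here is a minimal sketch of the enumeration attack. The token scheme (SHA-256 over `username:post_id`), the function names, and the usernames are all made up for illustration; nothing here is how Lemmy actually works.

```python
import hashlib


def vote_token(username: str, post_id: str) -> str:
    """Hypothetical 'anonymized' vote: SHA-256 of 'username:post_id'."""
    return hashlib.sha256(f"{username}:{post_id}".encode("utf-8")).hexdigest()


def unmask_voters(tokens: set[str], known_usernames: list[str], post_id: str) -> list[str]:
    """Recover voters by hashing every known username and comparing."""
    return [u for u in known_usernames if vote_token(u, post_id) in tokens]


# Tokens an instance might federate instead of usernames (made-up data).
post_id = "post-61"
published_tokens = {vote_token("alice", post_id), vote_token("carol", post_id)}

# Anyone who can list usernames can undo the "anonymization".
directory = ["alice", "bob", "carol", "dave"]
recovered = unmask_voters(published_tokens, directory, post_id)
print(recovered)  # ['alice', 'carol']
```

A per-post salt doesn't change this, because the salt has to be known to whoever checks for duplicate votes.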
The best option is probably just for the instance to report counts and you just have to trust it. If it is noticed that an instance seems to be inflating votes you stop counting its votes. People can work together to create blocklists for known cheating instances. Your instance would still know this but at least it is within your trust, not federated publicly.
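A rough sketch of what "report counts and trust the instance" could look like on the receiving side. The `InstanceReport` shape, the `tally` function, and the instance names are assumptions for illustration, not an existing Lemmy or ActivityPub feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceReport:
    """Hypothetical per-instance summary: counts only, no user IDs."""
    instance: str
    upvotes: int
    downvotes: int


def tally(reports: list[InstanceReport], blocklist: set[str]) -> tuple[int, int]:
    """Sum the reported counts, skipping instances we no longer trust."""
    up = down = 0
    for report in reports:
        if report.instance in blocklist:
            continue  # stop counting votes from known cheating instances
        up += report.upvotes
        down += report.downvotes
    return up, down


reports = [
    InstanceReport("lemmy.example", upvotes=40, downvotes=3),
    InstanceReport("kbin.example", upvotes=12, downvotes=1),
    InstanceReport("bots.example", upvotes=900, downvotes=0),
]
totals = tally(reports, blocklist={"bots.example"})
print(totals)  # (52, 4)
```

The blocklist could be shared between admins, the same way defederation lists are passed around now.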
Nah, if you can properly hash a password such that it doesn’t match the same properly hashed password from a different website then you can properly hash usernames in this case such that others couldn’t reverse it or put in the same input and get the same output you created. The technology is there. It’s more of a question if it’s really worth it. At least for now I’m not concerned with a malicious admin leaking someone’s vote history.
No, hashing passwords is a different case because you know who the user is, so you can use a unique per-user salt. The password itself is also high entropy. For this use case you can have, at best, a per-post salt.
Think about it. The task that you are asking for is to quickly check if a user has voted for a post to prevent duplicates. So literally the operation you want is the same as the one you are trying to prevent. If you can enumerate users then you can, by definition, check if they have voted for a post.
But the implications are still interesting. One might (big might) trust Reddit as an organization not to use this data for evil, but with federation, there’s nothing stopping an instance from simply releasing all users’ voting history to be public.
Another potential privacy issue is that deleted content stays on the server, and I believe it’s similar with posted images.
I think this issue is overblown. Instances of Lemmy might run modified code and choose to save things that the user intended to delete, of course, but the default setup of Lemmy seems reasonable to me in terms of how it treats deletion.
Currently it keeps deleted posts forever to allow users to un-delete if they choose, but deleting your account clears everything. And I believe there’s work in progress to discard deleted posts after 30 days. Details here: https://github.com/LemmyNet/lemmy/issues/2977
Thank you for pointing this out. I was looking into privacy in relation to Lemmy and came across this post where I got the wrong idea I guess. I couldn’t find much else online at the time
And I believe there’s work in progress to discard deleted posts after 30 days.
That would be a nice addition
This keeps on being asserted but it is far from true. If defederation happens or your local goes offline, posts/comment history/profile/votes will remain on other widely used instances and out of your control.
A large instance has already defederated from 2 other larger instances. If you run a personal instance I feel it will become very, very common to be locked out of managing your data.
You can expect defederation to happen all the time as that is a deliberate part of the open federated model.
And that is to say nothing about federation simply breaking sometimes.
I already have been locked out of content that exists on other instances that will remain forever and I’ve only been around a short while. I don’t care personally, but people keep asserting this claim that only bad actors or scrapers will dupe your data. Federated data is very different than a non-federated copy for many reasons and that matters to some people. Everyone should understand deleting your account, or modifying your content will often not remove your content outside your instance, and many people engage outside their local. It will likely exist in federated, Lemmy searchable form forever in some capacity (in the current iteration anyway).
Not trying to spread FUD, but if we want to maintain users they have to be educated as they will find out eventually and not be happy.
I have some working drafts on policies for admins to help them navigate and explain their responsibilities to their users.
It is a bit of a weird read outside of the context, but this is an optional primer I have drafted that will hopefully help explain the distinctions:
https://github.com/BanzooIO/federated_policies_and_tos/blob/main/optional-privacy-policy-intro.md
Yes, that’s a fair point. Just because you send a “I have deleted this message” signal out into the universe doesn’t mean that everyone will receive or obey it.
I assumed that was understood.
But that’s very different from instances intentionally and malevolently keeping data despite indicating to users that it was deleted, which is what I think folks’ privacy concerns are about.
EDIT: What I mean is that the federation model is inherently non-private in a certain sense (but in the same sense that someone could take a screenshot of your Reddit comment and your deleting your comment won’t delete their copy). But Lemmy is not egregiously misusing data.
This is largely assumed by someone like yourself or I who understands the implications. I am finding it evident that a lot of people are not aware.
There is also a distinction between a potential screenshot, a scrape or archive no one visits, and a federated copy on a widely used instance you have lost access to.
I edited my comment above to include a project I am working on to hopefully help admins get this across and educate users on how to appropriately engage to their comfort level.
I appreciate your commitment to this privacy consideration. I personally don’t think it’s the hill I’d prefer to die on, but I welcome your contributions.
In fact, Reddit has suspended people for upvoting before.
True, but in Unidan’s defense, it was a jackdaw, not a crow.
We need Unidan back now more than ever 🤗🐦⬛
Multidan.
You’re kidding surely. That’s actually awful. Any source for this? Would love to read more about it.
Not from normal upvoting, but vote manipulation like was mentioned above with unidan. Basically using multiple accounts to upvote your own post for visibility.
That’s not really true, since on reddit only the one host can see the votes, as opposed to anyone who is willing to put the effort in.
That’s exactly what I mean when I said:
whereas Reddit is only hosted by the company and they can keep it under wraps.
Yes, I thought that was implied. Voting is part of moderation. All moderation must be radically transparent. Your voting history is the weight of your credibility and reputation on Lemmy.
@interdimensionalmeme @o_o Agreed! Your voting history will help others gauge how objective (or not) you are.
And they will also be needed to distinguish votes by spammers and bots
All lemmy users will have to host a cache of the entire fediverse and run their own content sorting algorithm. Moderation and voting will be subscription based
This is something that we could leave up to the admins and their tools to figure out rather than making everyone’s voting history public.
All lemmy users will have to host a cache of the entire fediverse and run their own content sorting algorithm. Moderation and voting will be subscription based
Damn, that sounds awesome. I hadn’t envisioned that scenario but it makes a lot of sense.
It also discourages people from upvoting more controversial topics, for better or for worse.
I just hope it doesn’t turn into Twitter’s culture of ruining people’s lives by showing they liked a sus tweet 5 years ago, LMAO.
It would be nice if we could upvote interesting posts rather than posts that we agree with.
Isn’t that what upvotes are supposed to be for? Just not how people actually use them…
This would be interesting to see if certain articles were pushed up or down by bot accounts.
This could be anywhere from news articles being buried/promoted or by hidden advertising within a post.
True! Also instances could each do their own brand of “vote manipulation mitigation” by counting or ignoring different sources of votes.
Other cool features come to mind, like having a separate vote count for voters from the local instance.
Other cool features come to mind, like having a separate vote count for voters from the local instance.
That’d be cool. Like a big colorful number representing the total votes and then a smaller number right below that in parenthesis and a gray color to indicate local instance votes.
That’s interesting, tho I’d rather have that info seen by admins only, it could lead to brigading/harassment otherwise IMO.
How do you identify a bot account?
If they vote in a way I don’t like, they’re a bot. If they don’t vote in a way that I like, also a bot. In fact, if they vote at all, probably a bot
Undercook the fish? Bot, believe it or not.
When us older folks say “Anything you put on the public internet should be considered public and recorded forever”, it’s because of that.
Because of the federation, your votes are not technically anonymous on Lemmy. At least, I think.
I was a little skeptical of this assertion without any sources, but 10 minutes of source scanning does seem to support it:
- The ActivityPub message for a vote does seem to transmit the ID of the voter: https://github.com/LemmyNet/lemmy/blob/v0.17.3-rc.2/crates/apub/src/activities/voting/vote.rs#L26
- And the table for storing votes stores that id as well: https://github.com/LemmyNet/lemmy/blob/v0.17.3-rc.2/migrations/2019-03-03-163336_create_post/up.sql#L17 (or did at one point, it’s hard to get a complete schema just out of git).
I haven’t looked for APIs to extract this data, it might only be available to an instance admin… but yeah Lemmy does not seem to aggregate vote histories, but rather stores them on a per-user basis.
Good on you for actually checking and not blindly assuming like me! Hahaha glad to see my assumptions bore out this time.
But yeah, even if lemmy doesn’t aggregate it, it would be possible to set up a bot pretending to be an instance which collects and aggregates vote histories.
You can’t aggregate them internally, anyway. You need to be able to know if someone already voted on something.
I think activitypub needs to be extended so that the likes and reduces only need to be sent to the host of the content, with federation then being told just the aggregate number. Then the only servers that need to know identity of votes are the host server (necessary to ensure nobody can multi vote) and optionally the server the user voted on (could just relay the information to the host server and not store it locally, but then it’d be harder to tell what you’ve already upvoted – could use local storage but I think lots of people use social media on multiple devices).
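Something like this, as a toy sketch. The `PostHost` class and its method names are hypothetical and this is not part of ActivityPub today; it only shows the host keeping voter identities for de-duplication while federating nothing but totals.

```python
class PostHost:
    """Hypothetical host server: knows who voted, federates only counts."""

    def __init__(self) -> None:
        # post_id -> {voter: score}; never leaves the host in this sketch
        self._votes: dict[str, dict[str, int]] = {}

    def cast(self, post_id: str, voter: str, score: int) -> None:
        if score not in (1, -1):
            raise ValueError("score must be +1 or -1")
        # Re-voting overwrites the earlier vote, so nobody can multi-vote.
        self._votes.setdefault(post_id, {})[voter] = score

    def federated_summary(self, post_id: str) -> dict:
        """What other instances would be told: aggregate numbers only."""
        scores = list(self._votes.get(post_id, {}).values())
        return {
            "post": post_id,
            "up": sum(1 for s in scores if s == 1),
            "down": sum(1 for s in scores if s == -1),
        }


host = PostHost()
host.cast("post-61", "alice@lemmy.example", 1)
host.cast("post-61", "alice@lemmy.example", 1)   # duplicate, counted once
host.cast("post-61", "bob@kbin.example", -1)
summary = host.federated_summary("post-61")
print(summary)  # {'post': 'post-61', 'up': 1, 'down': 1}
```

You'd still have to trust the host's numbers, which is the same trade-off as the aggregate-count idea.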
Is that ID local to the instance or universal across the fediverse?
Thanks to Marius, looks like when you interact with a server you get assigned a local id that’s tied to your username/email combo. So no global ID but all your actions on a server are tied to you. It wouldn’t take a lot to link that to your account on your home server.
Overall idk if it’s a big deal. Upvotes and such need to have a user ID tied to them in order to keep things straight. Unless someone wants to update the Lemmy software to start encrypting the GUID on the table an instance owner will always be able to see what you’ve voted on ON THEIR instance. Reddit totally can as well, but they had a reason not to put their users on blast (At least until lately lol).
It’d take some collusion to put that all together across the fediverse.
The tables that store votes are named comment_like and post_like. Each row has the vote’s unique id, the local user id (which is linked to its “global id”), the local post id (also linked to its “global id”), the value of the vote (+/-1), and the date. So votes are indeed not anonymous at all.
(I have been running my server for less than a week, and the two tables already hold 114 950 votes between them. Certainly enough to do a bunch of analysis.)
Example data from the comment vote table (ids have been changed):
 id | person_id | comment_id | post_id | score | published
----+-----------+------------+---------+-------+----------------------------
  1 |        10 |          3 |      61 |     1 | 2023-06-17 20:01:20.948684
  2 |        34 |          1 |      22 |     1 | 2023-06-17 20:01:26.346783
  3 |        12 |          2 |      54 |     1 | 2023-06-17 20:01:27.627144
  4 |        20 |          7 |      91 |     1 | 2023-06-17 20:01:36.570636
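To show how little work it takes to turn rows like these into a per-user history, here is a small sketch. The rows are the same altered sample values as above, hard-coded as tuples; the `history_by_person` helper is made up for illustration, and a real admin would just query the database directly.

```python
from collections import defaultdict

# (id, person_id, comment_id, post_id, score, published), sample values only
rows = [
    (1, 10, 3, 61, 1, "2023-06-17 20:01:20.948684"),
    (2, 34, 1, 22, 1, "2023-06-17 20:01:26.346783"),
    (3, 12, 2, 54, 1, "2023-06-17 20:01:27.627144"),
    (4, 20, 7, 91, 1, "2023-06-17 20:01:36.570636"),
]


def history_by_person(vote_rows):
    """Group votes by person_id: each user's list of (comment_id, score)."""
    history = defaultdict(list)
    for _id, person_id, comment_id, _post_id, score, _published in vote_rows:
        history[person_id].append((comment_id, score))
    return dict(history)


history = history_by_person(rows)
print(history[10])  # [(3, 1)]
```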
As far as I know (another assumption haha), there’s no universal IDs across the fediverse.
Is that ID local to the instance or universal across the fediverse?
I dunno, this is as far as I chased it. It should give you a starting point to dig further if you choose… but I don’t plan to at the moment.
On the other hand, I don’t really trust Reddit with my upvotes/downvotes all that much more than random users, so I already refrain from voting on content I wouldn’t want to be associated with…
Of course, Reddit can still see what posts I view, while that isn’t the case for Lemmy (at least since I self-host an instance).
Can’t the other instances see what you saw via federation? Or does a self hosted lemmy instance federate all comments from all threads once it knows about a server?
Or does a self hosted lemmy instance federate all comments from all threads once it knows about a server?
Only in subscribed communities, but yes.
They’re definitely not anonymous, and Kbin actually does have the UI to show who is upvoting and downvoting any post if you view it on there.
That’s only on KBin though and it’s a feature design of the KBin software. That’s because it’s based on Twitter where likes are public. Lemmy doesn’t publicly show those actions. Idk how far it goes in terms of Kbin tracking guests though.
It’s not specific to kbin actually, it’s just how the activitypub protocol reports likes/upvotes. I can see votes on lemmy threads made by lemmy users for example because that data needs to be reported as part of the protocol.
I love it. I’ve already used that feature to block someone who was stalking my posts and downvoting them. Then I got curious and checked out a bunch of posts on the front page that had downvotes but didn’t really warrant them. I found there were about 5 accounts who were heavy downvoters for apparently no reason. They also got blocked.
I agree, I think it’s useful! I also blocked a person yesterday who was downvote stalking me, they clearly didn’t want to see the things I was posting anyway.
Can someone show me how to find this? I can’t figure it out.
Also curious if it’s possible to see who is subscribed to a magazine or who is following me.
More > activity > reduces for Kbin.
Scroll to the bottom of the thread and find reduces for the thread.
Kbin is influenced by wykop.pl, which has a list of who downvotes or upvotes. It helps with finding bots or mobs. Wykop was originally a Digg clone. It was huge in Poland for some time until it went full right-wing populist and the community got split into those who stayed and those who migrated to Reddit.
Digg got a polish clone. Digg dies. Polish clone is huge. Some people move to Reddit. Polish clone eventually becomes a trash can. Wykop migrants are fed up by reddit, they move to kbin that is a Polish reddit clone but better.
What a circle.
It’s an unavoidable feature of federation that up-/downvotes are public. Kbin chooses to make that fact visible, but if it didn’t, it would be easy to make a tool to get that information from any fediverse instance. Kbin making that fact visible at least alerts you that your votes are public.
I’m sorry for the stupid question but can someone explain the difference between lemmy and Kbin?
I just recently created a kbin account and downloaded the kbin app but see a lot about lemmy on here. Is kbin a subset of lemmy? If I want a wider variety of content would I go to lemmy or kbin?
Again, sorry for the stupid question.
Not a stupid question at all!
Lemmy and Kbin are two different systems that talk to each other. Like how Gmail and Outlook are two different systems, but you can still send emails between them.
So you can make posts over there on Kbin and I can upvote them from over here on Lemmy.
Make sense?
Just from what I understand myself, it’s that they are two different software setups. But they both use the ActivityPub standard that all federated content is using. KBin is different though in that it’s trying to be more like Twitter with Sub Reddits, than like Reddit with Tweeting. And Lemmy is just purely trying to recreate the Reddit experience. So like on Twitter, Mastodon, and KBin, an upvote is more of a like and people can see what accounts “like.” KBin and Mastodon share the same ‘microblogging’ feature which is like Twitter. From what I understand they share microblogs across the fediverse.
And Lemmy and Kbin share communities/magazines together thanks to federation. So you’re on a magazine in KBin but I’m on Lemmy on Lemmy World looking at this community from Lemmy.ml interacting with you.
Overall I think if you like Twitter and Reddit and are fine with your entire history of actions being public KBin is perfect. If you just want a Twitter experience, Mastodon. If you want some more obscurity with your account like Reddit and only the Reddit experience, Lemmy.
Right now I think KBin’s feature parity isn’t too far off from Mastodon and Lemmy. But like the rule of anything that combines two tools, it can’t be better than both separately. I think as Lemmy and Mastodon mature into their niches KBin will almost exclusively be playing catch-up with both in the long term.
Most Lemmy instances and kbin are connected/federated to each other meaning that the content is available to both. For example, the thread you currently are replying to is on the Lemmy.ml instance. So in general you don’t need to worry if the content is on a Lemmy instance or on kbin. You will see it anyway. So you can just pick the instance you like the UI best and use that.
There is a small caveat, in that a magazine/community will only start federating/being visible on a remote instance after someone visits that community for the first time. E.g. if someone creates a new community on kbin, then it is only visible on kbin until someone goes to newcommunity@kbin.social on their instance. And also the other way around: if someone makes a new community on Lemmy.world you won’t see it on kbin until you or someone else goes to newcommunity@lemmy.world on kbin.
@Mirror_I_rorrIMG kbin and lemmy are two different clients (imagine Outlook or Apple Mail) for the same service (the threadiverse, or email in the prior example). Pick one, they federate with each other (cross-pollinate).
Hmmmm so I see that you pinged me in this post, but I didn’t get a notification for it. Wonder how that works.
Unfortunately from what I’ve seen, that doesn’t actually result in a notification
Pretty sure you need the server name in the ping for it to work. Kinda like email, lol.
Ohh that makes sense. Thank you for explaining.
It’s also technically not on Reddit either. Letting people see what you up/downvoted is a setting that you can toggle as desired, although I believe that it defaults to “off”.
Fine by me. I’ll stand by my votes!
Has also been handy for a makeshift bookmark, so I can return to the stuff I liked.
Where can you see your vote history? I use Firefox and Jerboa to browse.
IIRC, navigate to your own profile and look at the “tabs” beneath your profile blurb. I forget which one, but there’ll be a list of what you’ve boosted.
Hope that makes sense; post-op and woozy . I also had Firefox in mind when I answered
You’re the best, thanks
Because in order to federate, user data needs to be shared, including the username. It prevents double votes from the same account and also allows votes to be rescinded at the user’s discretion, at which point the instance will send effectively an undo signal for the activity.
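For anyone curious what that looks like on the wire, here is a simplified sketch built from the ActivityStreams vocabulary (`Like`, `Undo`, `actor`, `object`). The helper functions and all URLs are placeholders I made up, and real Lemmy messages carry more fields than this.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def make_like(actor: str, obj: str, activity_id: str) -> dict:
    """Build a minimal Like activity; the voter is named in 'actor'."""
    return {
        "@context": AS_CONTEXT,
        "id": activity_id,
        "type": "Like",
        "actor": actor,
        "object": obj,
    }


def make_undo(activity: dict, undo_id: str) -> dict:
    """Wrap an earlier activity in an Undo to rescind it."""
    return {
        "@context": AS_CONTEXT,
        "id": undo_id,
        "type": "Undo",
        "actor": activity["actor"],
        "object": activity,
    }


# Placeholder URLs, purely illustrative.
like = make_like(
    actor="https://lemmy.example/u/alice",
    obj="https://kbin.example/m/privacy/t/61",
    activity_id="https://lemmy.example/activities/like/1",
)
undo = make_undo(like, "https://lemmy.example/activities/undo/1")
print(json.dumps(undo, indent=2))
```

Either way, every instance that receives the activity sees exactly which account voted.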
The first rule of the Fediverse is to make sure you’re perfectly okay with the entire internet seeing your account activity, including votes and favourites. If you’re unsure or are worried you might regret it, best not to post it at all. This is equally true on all social media but the way the Fediverse works just makes it more obvious.
Wasn’t part of the reason for anonymous voting and even “blurred” voting so accounts that were shadowbanned wouldn’t know it and go create a new bot account to spam more?
If I remember correctly, another reason was so that bots wouldn’t get usable feedback from voting.
I didn’t know, that’s awesome! Downvote shouldn’t be the “fuck you I disagree” button, save that garbage for Twitter and Reddit. Downvote is there for democratically killing malevolent bullshit. Expose the names!
It’s sometimes hard to separate those feelings
Maybe a 3 button setup
- Agreed and acceptable content
- Disagree and acceptable content
- Bad content
Default rankings are based on minimum bad-content/maximum agreed count.
And controversial ranking is based on minimum bad-content/maximum good-content count (agree+disagree). This way even comments that people disagree with can be exposed so long as it’s still good content.
Brigaders who blindly vote everything bad content to maliciously influence rankings can be identified and removed for manipulation, while people who vehemently disagree with an idea can still have that outlet without influencing the community/magazine haphazardly.
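One possible reading of those two rankings, as a sketch: sort by fewest "bad content" votes first, then break ties by agree count (default) or by agree+disagree count (controversial). The `Tally` class and the numbers are invented for illustration; the idea could be weighted in plenty of other ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tally:
    """Hypothetical three-button vote counts for one comment."""
    name: str
    agree: int
    disagree: int
    bad: int


def default_ranking(comments: list[Tally]) -> list[str]:
    """Fewest 'bad' votes first, then most 'agree' votes."""
    return [c.name for c in sorted(comments, key=lambda c: (c.bad, -c.agree))]


def controversial_ranking(comments: list[Tally]) -> list[str]:
    """Fewest 'bad' votes first, then most acceptable-content votes."""
    return [
        c.name
        for c in sorted(comments, key=lambda c: (c.bad, -(c.agree + c.disagree)))
    ]


comments = [
    Tally("A", agree=10, disagree=2, bad=0),
    Tally("B", agree=3, disagree=12, bad=0),
    Tally("C", agree=20, disagree=1, bad=5),
]
print(default_ranking(comments))        # ['A', 'B', 'C']
print(controversial_ranking(comments))  # ['B', 'A', 'C']
```

Comment B is mostly disagreed with but never flagged as bad, so it still surfaces in the controversial view.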
Yeah I never understood the pearl clutching about downvotes. It just reeks of “everyone who drives faster than me is a lunatic, everyone who drives slower than me is a granny” mentality.
Sometimes reporting technically covers the last one. But usually not. Not all subs have rules against bigotry, trolling, dog whistles, general assholery, etc. I strongly hold it’s important that downvoting is an option to deal with these kinda things. It’s a way to show everyone that the comment isn’t acceptable.
Plus even when reporting is an option, it may not be fast enough. Can’t really automate removals, either, as people will abuse that.
Arguably “disagree but acceptable” should just not upvote. In a certain sense, that’s already a middle option.
Fuck you, I disagree.
Is that better?
In terms of privacy this is pretty bad though. From the upvote/downvote activity you can very easily build a whole graph of stuff you like/dislike. For advertisers that’s juicy data.
Or upvote/downvote randomly, or on a lot of stuff, so the interest pool is huge. I upvote stuff I see just to participate and give the poster a “someone has seen this post” feel-good high.