• 7 Posts
  • 37 Comments
Joined 4 years ago
Cake day: June 30th, 2020


  • Once I had to use the internet without an ad blocker (shiver). It was horrible. I still have nightmares.

    Joking aside, I couldn’t believe how crammed-full and chaotic sites were without an ad blocker. I have no evidence to support this other than my own experience, but I think, for me, ad blockers are good for my mental health. Being constantly exposed to all those messages trying to exploit insecurities can’t be good for people.

    Anyways, ad blockers are the best.


  • This seems very obvious to me, not that it isn’t worth highlighting, particularly in a world with open models and weights, which we should desperately want. The “don’t worry, watermarks will be a thing” response just seems like an attempt to have some answer that dampens concerns. I don’t imagine most people working in the AI space actually think this would work. I could be wrong.


  • Yeah, I 100% understand and to a large extent agree with this. I think money should be involved; creators should get paid. I don’t think PeerTube has become “the answer” yet, and there is some combination of market-level event and technology/feature set that needs to be in place to create enough momentum for people to move off YouTube. It will happen eventually (I think), but what exists today isn’t enough of a pull to overcome the momentum YouTube has. That doesn’t mean that “we” should give up.






  • I’ve used Standard Notes for years. They are great: very privacy-friendly, with lots of good features. I’ve also used Obsidian, like others have mentioned, but I didn’t use 95% of the features in either Standard Notes or Obsidian. Nowadays I just use plain markdown files and store them in a git repo: low complexity, and I like the simplicity of it. 100% recommend.



  • Hey, I am a machine learning engineer who works with people data. Generally you measure bias in the training data, the validation sets, and the outcomes (in an ongoing fashion; AIF360 is a common library and approach). There are lots of ways to measure bias and/or fairness. Just checking whether a feature was used isn’t considered “enough” by any standard or practitioner. There are also ways to detect and mitigate some of the proxy relationships you’re pointing to. That being said, I am 100% skeptical that any hiring algorithm isn’t going to be extremely biased. A lot of big companies have tried and quit because, despite following all the right steps, the models were still biased: https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G. Also, many of the metrics used to report fairness have some deep flaws (e.g., disparate impact).

    All that being said, the current state is that there are no reporting requirements, so 90% of the time vendors don’t do even the minimum; if they did, it would cost a lot more and get in the way of the “AI will solve all your problems with no effort” narrative they want to put forward. So I am happy to see any regulation coming into place, even if it won’t be perfect.
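    For anyone wondering what the disparate impact metric actually computes, here is a toy sketch (the data and group labels are made up for illustration; this is not any vendor’s implementation). It is just the ratio of selection rates between an unprivileged and a privileged group, which is part of why it’s flawed: it says nothing about why the rates differ.

```python
def disparate_impact(outcomes, groups, privileged):
    """Selection rate of the unprivileged group divided by the
    selection rate of the privileged group. Ratios below 0.8 are
    commonly flagged under the 'four-fifths rule'."""
    priv = [o for o, g in zip(outcomes, groups) if g == privileged]
    unpriv = [o for o, g in zip(outcomes, groups) if g != privileged]
    return (sum(unpriv) / len(unpriv)) / (sum(priv) / len(priv))

# Toy hiring data: 1 = hired, 0 = rejected; group "a" treated as privileged.
outcomes = [1, 1, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(round(disparate_impact(outcomes, groups, privileged="a"), 2))  # 0.33
```

    Here group “a” was hired at a 75% rate and group “b” at 25%, so the ratio is about 0.33 — well below the 0.8 threshold, and yet the number alone tells you nothing about the cause.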




  • I use Kagi. I think it depends on your level of concern, as it does with most things. Kagi has a pretty nicely written privacy policy. They do require an account, but I signed up with a masked email and CC. For my use I find their privacy policy enough, given the other measures I take, but the main reason I like Kagi is zero ads or prioritized placement. Experiencing search without ads is a pretty awesome experience, in my opinion. There are other ways to get free search with the ads stripped out, but this feels fundamentally different from a service purpose-built to be ad-free and private. I am happy to pay for ad-free platforms versus using platforms that are trying to do privacy-preserving ads, but this is more of a personal stance and preference. I know your question was more about privacy than ads, but I find the two closely linked. I’ve attached a summary of their privacy policy below:

    • Searches are anonymous and private to you. Kagi does not see what you are searching at all.
    • We do not log or store your IP address. Your IP address is used only temporarily when enriching location/maps searches, and is not shared with any other party.
    • We only store cookies needed for site functionality. We do not use any web browser analytics or other frontend telemetry.
    • We do not display any ads, or have any first-party or third-party tracking in service of ads.
    • We do not share customer data with third parties, except as needed to perform explicitly accessed services. In those cases, we will share the minimum amount of data needed to provide the service, and will do so in an anonymous way.
    • We collect only the data needed to provide and protect the service.
    • We proxy all images to prevent tracking from third parties.
    • We use HTTPS encryption everywhere. All passwords are hashed and salted.

    https://kagi.com/privacy






  • I think this is probably true for most providers: they could add logs if they were legally required to, but don’t actively keep them. I think way too much stock is put in the “we don’t log” claims that are common among privacy tools. Most VPN providers can log if they have to, and often do log some data for service-abuse and load monitoring, but quibble over what “we don’t log” means.

    I used to work for a VPN provider where we kept statements in our privacy policy about some logging, and users ripped us apart, despite those statements being truthful and other providers being dishonest (or at least confusing). Because so many providers project false confidence by plastering “we don’t log” all over their sites, the user base buys into those statements as 100% true (and unchangeable), and providers that try to give a realistic view of what can happen get slammed.

    I am happy to see that Proton put the statement up. I would have preferred they had statements up already, but just because another provider says they don’t log, I wouldn’t trust those statements. For me, I am not too worried if a provider can log some data, like IP addresses, when they receive an unavoidable court order ( https://en.wikipedia.org/wiki/United_States_Foreign_Intelligence_Surveillance_Court ), as I generally expect this to be true for all services, and my threat model isn’t avoiding three-letter agencies. If your threat model requires avoiding three-letter agencies, then trusting almost any service provider is going to be difficult. Obviously you should be using Tor to connect to anything, but you would have to assume almost everything with a server is either compromised or can be served certain court orders. Services like Briar seem like your best bet ( https://briarproject.org/ ).


  • I am not happy with it yet, but that is because I want it to be perfect and it never will be. I do find that I engage with content at a larger scale, and with more variety, than when I go to a single source. I am using the NLTK-based features from newspaper for keyword extraction, plus the trending-sources feature, to monitor a few hundred sources. Currently I store all the metadata + links (URLs) + Wikipedia links in a pandas DataFrame (which is becoming a problem) and visualize trends and data about the news in a Jupyter notebook. For the enhanced summaries + named entity extraction I am using spaCy (https://spacy.io/); from there I use SPARQL ( https://en.wikipedia.org/wiki/SPARQL ) to query DBpedia (https://en.wikipedia.org/wiki/DBpedia) to augment entity knowledge (e.g., adding data about a company’s size and industry, or summary explanations of scientific concepts). The named entity matching and augmentation is the portion that needs the most work. Newspaper has some nice caching features, so I query all sources every day but only pull in new articles.
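    To give a rough idea of the DBpedia augmentation step, it mostly boils down to building a SPARQL query per matched entity. This is just an illustrative sketch (the entity name and query shape are simplified stand-ins, not my actual pipeline code):

```python
def dbpedia_abstract_query(entity: str) -> str:
    """Build a SPARQL query fetching the English abstract of a
    DBpedia resource. DBpedia resource names use underscores in
    place of spaces (a simplification; real matching is messier)."""
    resource = entity.strip().replace(" ", "_")
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <http://dbpedia.org/resource/{resource}> dbo:abstract ?abstract .\n"
        '  FILTER (lang(?abstract) = "en")\n'
        "} LIMIT 1"
    )

print(dbpedia_abstract_query("Alan Turing"))
```

    Actually executing the query goes through the public endpoint at https://dbpedia.org/sparql (e.g., via the SPARQLWrapper library); I’ve left that out to keep the sketch offline.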

    I might play around with moving portions of the data into a graph DB, and with better ways to query based on concepts. Right now I just write Python code to query the pandas DataFrame based on different parameters.
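    To make “query the pandas DataFrame based on different parameters” concrete, here is a stripped-down sketch of that layer (the columns and rows are made up for illustration; my actual schema has a lot more in it):

```python
import pandas as pd

# Made-up stand-in for the article-metadata store described above.
articles = pd.DataFrame([
    {"title": "Story A", "source": "feed1", "keywords": ["ai", "policy"]},
    {"title": "Story B", "source": "feed2", "keywords": ["privacy"]},
    {"title": "Story C", "source": "feed1", "keywords": ["ai", "privacy"]},
])

def by_keyword(df: pd.DataFrame, kw: str) -> pd.DataFrame:
    """Rows whose extracted keyword list contains kw."""
    return df[df["keywords"].apply(lambda ks: kw in ks)]

print(by_keyword(articles, "privacy")["title"].tolist())  # ['Story B', 'Story C']
```

    This works fine at small scale, but list-valued columns like `keywords` are part of why the DataFrame approach is becoming a problem — it’s the kind of relationship a graph DB would model more naturally.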

    Are you happy with your solution? Can you share a bit more about your pipeline?