@infinitepcg

infinitepcg@lemmy.world · 7 months ago

This looks like an embarrassing mistake. If someone were to try to “tank” Twitter, it wouldn’t really make sense to do this on purpose.

infinitepcg@lemmy.world · 9 months ago (edited)

Pretty much the hardware version of && false

infinitepcg@lemmy.world · 9 months ago

it happened again with the Intuitive Machines lander that landed on the moon last week

infinitepcg@lemmy.world · 9 months ago

The article just says that the account is suspended. There is no official statement from Twitter and no indication that they suspended the account on purpose. The most likely reason is that the account was mass-reported by trolls and got suspended automatically.

infinitepcg@lemmy.world · 10 months ago

Nobody knows if and when programming will be automated in a meaningful way. But once we have the tech to do it, we can automate pretty much all work. So I think this will not be a problem for programmers until it’s a problem for everyone.

infinitepcg@lemmy.world · 10 months ago

I think it’s reasonable to not short stocks. I just find it a bit weird to see people confidently proclaim that a company is overvalued but then not short the stock, which would be the rational thing to do.

infinitepcg@lemmy.world · 10 months ago

It’s hard to tell how much a platform is worth. Arguably, the value of Twitter was $44B, since someone was willing to pay that.

The good news is, if you’re really certain that Reddit is overvalued, you’ll soon be able to short it and get rich if you end up being right!

infinitepcg@lemmy.world · 10 months ago

I don’t think the number of bots matters much; there are many more real people on Twitter than on Mastodon. It’s not an issue for Twitter because they already are the platform where everyone else is. I’m optimistic about Mastodon. It already has the better UX and the better business model, and I think it will slowly attract more users over time and eventually reach the relevance that Twitter had at its peak.

infinitepcg@lemmy.world · 10 months ago

When the Apple car is released, the EU will invent 350 kW DC fast charging via USB-C 🙏

infinitepcg@lemmy.world · 10 months ago

The difficult thing is gaining users, not writing the code.

infinitepcg@lemmy.world · 10 months ago (edited)

I’ve been on Mastodon for over a year and I never experienced anything that could be classified as a technical glitch. From a tech / UI perspective it feels very polished to me.

I guess the only exception would be that old posts are sometimes missing on profiles from different servers.

infinitepcg@lemmy.world · 11 months ago

I agree with your point on biodiversity and yes, climate change poses an existential threat to individual people, but not to civilization as a whole.

infinitepcg@lemmy.world · 11 months ago

No, I’m certain that human civilization would survive.

infinitepcg@lemmy.world · 11 months ago

I don’t think this kind of catastrophizing helps. Climate change certainly doesn’t “threaten the fundamental existence of organized human society”. Sure, we should do more about it and future generations would be better off if we were to lessen the impact, but it is not an existential threat.

infinitepcg@lemmy.world · 1 year ago

I’ve definitely had both. Sometimes the hosts of the actual podcast read an ad using their own voices. In this case everyone gets the same audio file and crowdsourcing the timestamps would work.

For dynamically inserted ads, it will be more complicated. Maybe a system like Content ID that has a library of known ads and detects them in the audio.
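For the simpler case, where everyone gets the same audio file, crowdsourced skipping could be sketched roughly like this. It's only a minimal illustration: the `skip_ads` function and the `(start, end)` segment format are made up for this example, not taken from any existing tool, and it doesn't handle dynamically inserted ads at all.

```python
def skip_ads(position, ad_segments):
    """Return the playback position after skipping any ad segment containing it.

    ad_segments is a list of (start, end) times in seconds. This is
    hypothetical crowdsourced data for one specific audio file.
    """
    for start, end in sorted(ad_segments):
        if start <= position < end:
            # Jump to the end of this ad, then keep checking later segments
            # in case another ad starts right where this one ends.
            position = end
    return position


# Made-up segments: two back-to-back ads at the start, one mid-roll ad.
segments = [(0.0, 30.0), (30.0, 45.0), (600.0, 660.0)]
resume_at = skip_ads(10.0, segments)
```

A player would call something like this whenever the playback position changes. Positions outside every segment are returned unchanged.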

infinitepcg@lemmy.world · 1 year ago

I wonder if something like SponsorBlock is feasible for podcasts 🤔

infinitepcg@lemmy.world · 1 year ago (edited)

This article is full of errors!

At its core, an LLM is a big (“large”) list of phrases and sentences

Definitely not! An LLM is the combination of an architecture and its model parameters. It’s just a bunch of numbers, no list of sentences, no database. (Seems like the author confused the word “LLM” with the dataset of the LLM???)

an LLM is a storage space (“database”) containing as many sample documents as possible

Nope. This applies to the dataset, not the model. I guess you can argue that memorization happens sometimes, so it might have some features of a database. But it isn’t one.

Additional data (like the topic, mood, tone, source, or any number of other ways to categorize the documents) can be provided

LLMs are trained in an unsupervised fashion. Just sequences of tokens, no labels.
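To illustrate what "no labels" means here, this is a rough sketch of how training examples for next-token prediction can be derived from raw text alone. The function name and the toy word-level tokens are made up for the example; real models use subword tokenizers and process whole sequences in parallel.

```python
def next_token_pairs(tokens):
    """Build (context, target) training pairs from a single token sequence."""
    # For each position i, the context is everything before it and the
    # target is the token at position i. No human annotation is involved.
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Toy word-level "tokens", an assumption for readability.
tokens = ["the", "cat", "sat", "down"]
pairs = next_token_pairs(tokens)

for context, target in pairs:
    print(context, "->", target)
```

The "label" for each example is just the next token in the text, which is why this setup is often called self-supervised.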

Typically, an LLM will cover a single context, e.g. only social media

I’m not aware of any LLM that does this. What’s the “context” of GPT-4?

software developers have gone to great lengths to collect an unfathomable number of sample texts and meticulously categorize those samples in as many ways as possible

The closest real thing is the RLHF process that is used to fine-tune an existing LLM for a specific application (like ChatGPT). The dataset for the LLM is not annotated or categorized in any way.

a GPT uses the words and proximity data stored in LLMs

This is confusing. “GPT” is the architecture of the LLM.

it is impossible for it to create something never seen before

This isn’t accurate. Depending on the temperature setting, an LLM can output literally any word at any time with a non-zero probability. It can absolutely produce things it hasn’t seen.
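Here is a minimal sketch of why that is, assuming plain softmax sampling over the model's output scores. The logits below are invented for a four-token vocabulary. In practice, truncation schemes like top-k or top-p sampling, or floating-point underflow, can still push some probabilities to exactly zero.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature > 0."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a four-token vocabulary.
logits = [4.0, 2.0, 0.5, -1.0]
cold = softmax_with_temperature(logits, 0.5)  # sharper distribution
hot = softmax_with_temperature(logits, 2.0)   # flatter distribution
```

Every entry in both `cold` and `hot` is greater than zero. Raising the temperature moves probability mass from the most likely token to the unlikely ones, so rare continuations are sampled more often.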

Also, I think it’s too simple to just assert that LLMs are not intelligent. It mostly depends on your definition of intelligence, and there are lots of philosophical discussions to be had (see also the AI effect).

infinitepcg@lemmy.world · 1 year ago

I assume these would be credentials in the training data, not something it got from other ChatGPT users?

infinitepcg@lemmy.world · 1 year ago

I’m pretty sure it doesn’t remember conversations from other users (or even your own). That’s just not how it works.

infinitepcg@lemmy.world · 1 year ago (edited)

Whether something is derivative or not is one of the key questions used to determine whether the free use of someone else’s copyrighted work is fair, as in fair use.

I think training an AI model is not fair use. It’s either a derivative work and needs a license, or it’s not a derivative work and can be used without a license. In both cases it’s not fair use (in the legal sense of “fair use”).

I’m not sure if you’re making an argument about what the law currently says or what it should say. In my opinion the law should be updated to clarify if you need a license to use copyrighted material as training data.

The amount that artists would be paid would be determined by negotiation between the artist (the rights holder) and the entity using their work

Sure, my point is that such an agreement will never be made. It’s a good deal for AI companies to use the data for free, but if they can’t do that, they will not be interested.

Either way, I think there is no way for artists to win this. It’s completely possible to train large image generators without copyrighted material. These datasets are so large that paying artists per image will never be feasible.