- How to get chapter wise summary of the novels from chatgpt, so that I don’t miss anything from the book : r/ChatGPTPro
“But, even GPT 4 is still currently really bad at this, and often makes stuff up, even with The Bible, which is the book it probably has the most training data on. Once it starts making stuff up, then usually the rest of the response goes downhill very quickly.”
“The best solution I’ve found so far is to copy chunks of the book at a time (keeping the lengths roughly the same) – using api with low creativity helps for me too. I’ve also found 3.5 does as good of a job as 4 (and I prefer them over Bard and Claude)”
– interesting to see the conversation (even though it’s a year old)
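In the spirit of that second comment, here is a minimal sketch of the chunk-and-summarize approach using the OpenAI Python SDK. The model name, chunk size, prompt wording, and file name are my own illustrative choices, and “low creativity” is mapped to a low temperature setting.

```python
# A rough sketch of the chunk-and-summarize approach from the thread.
# Model, chunk size, and prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def chunk_text(text: str, max_chars: int = 8_000) -> list[str]:
    """Split the book into roughly equal-length chunks."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def summarize_chunk(chunk: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0.2,  # low "creativity" to discourage made-up details
        messages=[
            {"role": "system",
             "content": "Summarize the following novel excerpt. "
                        "Do not invent events that are not in the text."},
            {"role": "user", "content": chunk},
        ],
    )
    return response.choices[0].message.content


with open("novel.txt") as f:  # hypothetical input file
    summaries = [summarize_chunk(c) for c in chunk_text(f.read())]

print("\n\n".join(summaries))
```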
- Welcome to the Artificial Intelligence Incident Database
- Making AI Less “Thirsty”: Uncovering and Addressing the Secret Water Footprint of AI Models
- How much energy can AI use? Breaking down the toll of each ChatGPT query – The Washington Post
- fast.ai – Making neural nets uncool again
- LinkedIn Is Training AI on User Data Before Updating Its Terms of Service
- wordfreq/SUNSET.md at master · rspeer/wordfreq
The open Web (via OSCAR) was one of wordfreq’s data sources. Now the Web at large is full of slop generated by large language models, written by no one to communicate nothing. Including this slop in the data skews the word frequencies.
Sure, there was spam in the wordfreq data sources, but it was manageable and often identifiable. Large language models generate text that masquerades as real language with intention behind it, even though there is none, and their output crops up everywhere.
As one example, Philip Shapira reports that ChatGPT (OpenAI’s popular brand of generative language model circa 2024) is obsessed with the word “delve” in a way that people never have been, and caused its overall frequency to increase by an order of magnitude.
h/t funnymonkey
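For anyone curious what wordfreq actually reports, here is a small sketch, assuming the library is installed with its final (pre-sunset) data files. zipf_frequency is wordfreq’s log-scale convenience measure; word choices are arbitrary examples.

```python
# Look up how common a few English words are in wordfreq's frozen data.
from wordfreq import word_frequency, zipf_frequency

for word in ["delve", "cat", "the"]:
    print(word,
          f"freq={word_frequency(word, 'en'):.2e}",   # proportion of tokens
          f"zipf={zipf_frequency(word, 'en'):.2f}")    # log10 per billion words
```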
- simonw/tools: Assorted tools
Browser-based AI tools
- The Math Behind LLMs, Transformer Model | Medium
Positional Encoding adds information about each token’s position in the input sequence.
Multi-Head Attention captures detailed information about the relationships between tokens. It does this several ways at once (multiple heads), each looking at the relationships from a different angle, to learn as much about them as possible.
FFNN, etc. refines the representation of the input before passing it to the decoder, which then assists in producing our output. As mentioned above, I’ll write an additional article describing the decoder and link it here.
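To make those encoder pieces concrete, here is a minimal NumPy sketch of sinusoidal positional encoding and a single attention head. The shapes, model dimension, and random inputs are illustrative assumptions, not taken from the article; multi-head attention just runs several such heads in parallel over different learned projections.

```python
# Minimal sketch: sinusoidal positional encoding + one attention head.
import numpy as np


def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return the (seq_len, d_model) sinusoidal position matrix."""
    positions = np.arange(seq_len)[:, None]              # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                    # (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])                 # even dims: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])                 # odd dims: cosine
    return pe


def scaled_dot_product_attention(q, k, v):
    """One head: softmax(QK^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)        # token-to-token affinities
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)             # softmax over keys
    return weights @ v                                     # weighted mix of values


# Toy usage: 6 tokens with 16-dimensional embeddings (arbitrary numbers).
x = np.random.randn(6, 16) + positional_encoding(6, 16)
out = scaled_dot_product_attention(x, x, x)               # self-attention
print(out.shape)                                           # (6, 16)
```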
- Welcome to LessWrong! — LessWrong
LessWrong is an online forum and community dedicated to improving human reasoning and decision-making. We seek to hold true beliefs and to be effective at accomplishing our goals. Each day, we aim to be less wrong about the world than the day before.
- Ban warnings fly as users dare to probe the “thoughts” of OpenAI’s latest model | Ars Technica
“For example, in the future we may wish to monitor the chain of thought for signs of manipulating the user,” the company writes. “However, for this to work the model must have freedom to express its thoughts in unaltered form, so we cannot train any policy compliance or user preferences onto the chain of thought. We also do not want to make an unaligned chain of thought directly visible to users.”
- The growing energy footprint of artificial intelligence: Joule
- The Energy Footprint of Humans and Large Language Models – Communications of the ACM
A nice breakdown with some actual contextualization. It’s harder to jump from electricity to jet fuel, but still much better than just saying “so much energy!!!”
“The cost of training a big foundation model can be daunting. The electricity required to train GPT-3 was estimated at around 1,287,000 kWh [5]. Estimates for Llama 3 are a little above 500,000 kWh[b], a value that is in the ballpark of the energy use of a seven-hour flight of a big airliner. However, in contrast to a single long flight that is not reusable, a foundation model once trained will instantiate a set of weights that can be shared and reused in many different instances.”
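A quick back-of-the-envelope check on that flight comparison. The widebody burn rate of roughly 7 tonnes of jet fuel per hour is my own assumption; the fuel’s specific energy and the kWh conversion are standard figures.

```python
# Sanity-check the "seven-hour flight" comparison in the quote above.
FUEL_BURN_KG_PER_HOUR = 7_000   # assumed widebody cruise burn rate
JET_FUEL_MJ_PER_KG = 43         # specific energy of jet fuel
MJ_PER_KWH = 3.6                # 1 kWh = 3.6 MJ

flight_hours = 7
fuel_kg = flight_hours * FUEL_BURN_KG_PER_HOUR
flight_kwh = fuel_kg * JET_FUEL_MJ_PER_KG / MJ_PER_KWH

print(f"7-hour flight:           ~{flight_kwh:,.0f} kWh")   # ~585,000 kWh
print("GPT-3 training estimate:  1,287,000 kWh")
print("Llama 3 training estimate: ~500,000 kWh")
```

Under those assumptions the flight lands around 585,000 kWh, which is indeed in the same ballpark as the Llama 3 estimate.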
- Semantically related words for “cat_NOUN”
Click on the raw vector output to see why things are hard to explain.
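The point is easy to reproduce locally. The sketch below uses gensim’s downloadable GloVe vectors as an assumed stand-in for the linked demo’s POS-tagged entries like “cat_NOUN”: the raw vector is just a list of opaque floats, while the nearest-neighbour view is at least interpretable.

```python
# Raw vectors vs. nearest neighbours, using pretrained GloVe embeddings.
import gensim.downloader as api

kv = api.load("glove-wiki-gigaword-100")   # downloads ~130 MB on first use

print(kv["cat"][:10])                       # first 10 of 100 unexplainable floats
for word, score in kv.most_similar("cat", topn=5):
    print(f"{word:10s} cosine={score:.3f}")
```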
- Making Orbit Animations with CSS Custom Properties / Coder’s Block
- Pilots on Facebook are sharing stories + images of #SpaceJunk breakups and alerts they have seen.