

Attention in transformers, step-by-step | Deep Learning Chapter 6
About Attention in transformers, step-by-step | Deep Learning Chapter 6
Demystifying attention, the key mechanism inside transformers and LLMs, including self-attention, multiple heads, and cross-attention.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support
Special thanks to these supporters: https://www.3blue1brown.com/lessons/attention#thanks
An equally valuable form of support is to simply share the videos.

The first pass for the translated subtitles here is machine-generated and, therefore, notably imperfect. To contribute edits or fixes, visit https://www.criblate.com

Russian audio track: Vlad Burmistrov (Влад Бурмистров).

------------------

Here are a few other relevant resources:

Build a GPT from scratch, by Andrej Karpathy
https://youtu.be/kCc8FmEb1nY

If you want a conceptual understanding of language models from the ground up, @vcubingx just started a short series of videos on the topic:
https://youtu.be/1il-s4mgNdI?si=XaVxj6bsdy3VkgEX

If you're interested in the herculean task of interpreting what these large networks might actually be doing, the Transformer Circuits posts by Anthropic are great. In particular, it was only after reading one of these that I started thinking of the combination of the value and output matrices as being a combined low-rank map from the embedding space to itself, which, at least in my mind, made things much clearer than other sources.
https://transformer-circuits.pub/2021/framework/index.html

Site with exercises related to ML programming and GPTs
https://www.gptandchill.ai/codingproblems

History of language models by Brit Cruise, @ArtOfTheProblem
https://youtu.be/OFS90-FX6pg

An early paper on how directions in embedding spaces have meaning:
https://arxiv.org/pdf/1301.3781.pdf

------------------

Timestamps:
0:00 - Recap on embeddings
1:39 - Motivating examples
4:29 - The attention pattern
11:08 - Masking
12:42 - Context size
13:10 - Values
15:44 - Counting parameters
18:21 - Cross-attention
19:19 - Multiple heads
22:16 - The output matrix
23:19 - Going deeper
24:54 - Ending

------------------

These animations are largely made using a custom Python library, manim. See the FAQ comments here:
https://3b1b.co/faq#manim
https://github.com/3b1b/manim
https://github.com/ManimCommunity/manim/

All code for specific videos is visible here:
https://github.com/3b1b/videos/

The music is by Vincent Rubinetti.
https://www.vincentrubinetti.com
https://vincerubinetti.bandcamp.com/album/the-music-of-3blue1brown
https://open.spotify.com/album/1dVyjwS8FBqXhRunaG5W5u

------------------

3blue1brown is a channel about animating math, in all senses of the word animate. If you're reading the bottom of a video description, I'm guessing you're more interested than the average viewer in lessons here. It would mean a lot to me if you chose to stay up to date on new ones, either by subscribing here on YouTube or otherwise following on whichever platform below you check most regularly.

Mailing list: https://3blue1brown.substack.com
Twitter: https://twitter.com/3blue1brown
Instagram: https://www.instagram.com/3blue1brown
Reddit: https://www.reddit.com/r/3blue1brown
Facebook: https://www.facebook.com/3blue1brown
Patreon: https://patreon.com/3blue1brown
Website: https://www.3blue1brown.com
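
The description's remark about the value and output matrices acting as a combined low-rank map from the embedding space to itself can be made concrete with a small sketch. This is only an illustration under stated assumptions, not code from the video: the sizes `D_EMBED` and `D_HEAD`, the names `value_down` and `output_up`, and the random integer entries are all hypothetical. It multiplies a map into a small head space by a map back out, then checks that the resulting square matrix has rank at most the head dimension.

```python
from fractions import Fraction
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(m):
    """Exact matrix rank via Gaussian elimination over Fractions."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate column c from every row below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


# Toy sizes (assumptions for illustration; real models are far larger).
D_EMBED, D_HEAD = 6, 2

rng = random.Random(0)


def random_matrix(n_rows, n_cols):
    """Small random integer matrix standing in for learned weights."""
    return [[rng.randint(-3, 3) for _ in range(n_cols)] for _ in range(n_rows)]


value_down = random_matrix(D_HEAD, D_EMBED)  # embedding space -> head space
output_up = random_matrix(D_EMBED, D_HEAD)   # head space -> embedding space

# The combined map sends embeddings to embeddings, but it factors
# through the smaller head space, so its rank cannot exceed D_HEAD.
combined = matmul(output_up, value_down)
combined_rank = rank(combined)

factored_params = 2 * D_EMBED * D_HEAD  # parameters in the two thin factors
full_params = D_EMBED * D_EMBED         # parameters in an unconstrained square map

print(f"combined shape: {len(combined)}x{len(combined[0])}")
print(f"rank of combined map: {combined_rank} (at most {D_HEAD})")
print(f"parameters: factored={factored_params}, full={full_params}")
```

The factored form is what keeps the parameter count down: two thin matrices cost `2 * D_EMBED * D_HEAD` entries rather than `D_EMBED * D_EMBED`, at the price of restricting the map to low rank.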