38 items tagged “youtube”
2024
Project: VERDAD—tracking misinformation in radio broadcasts using Gemini 1.5
I’m starting a new interview series called Project. The idea is to interview people who are building interesting data projects and talk about what they’ve built, how they built it, and what they learned along the way.
[... 1,025 words]Apple’s Knowledge Navigator concept video (1987) (via) I learned about this video today while engaged in my irresistible bad habit of arguing about whether or not "agents" means anything useful.
It turns out CEO John Sculley's Apple in 1987 promoted a concept called Knowledge Navigator (incorporating input from Alan Kay) which imagined a future where computers hosted intelligent "agents" that could speak directly to their operators and perform tasks such as research and calendar management.
This video was produced for John Sculley's keynote at the 1987 Educom higher education conference imagining a tablet-style computer with an agent called "Phil".
It's fascinating how close we are getting to this nearly 40 year old concept with the most recent demos from AI labs like OpenAI. Their Introducing GPT-4o video feels very similar in all sorts of ways.
I Was A Teenage Foot Clan Ninja.
My name is Danny Pennington, I am 48 years old, and between 1988 in 1995 I was a ninja in the Foot Clan.
I enjoyed this TMNT parody a lot.
YouTube Thumbnail Viewer.
I wanted to find the best quality thumbnail image for a YouTube video, so I could use it as a social media card. I know from past experience that GPT-4 has memorized the various URL patterns for img.youtube.com
, so I asked it to guess the URL for my specific video.
This piqued my interest as to what the other patterns were, so I got it to spit those out too. Then, to save myself from needing to look those up again in the future, I asked it to build me a little HTML and JavaScript tool for turning a YouTube video URL into a set of visible thumbnails.
I iterated on the code a bit more after pasting it into Claude and ended up with this, now hosted in my tools collection.
How to succeed in MrBeast production (leaked PDF). Whether or not you enjoy MrBeast’s format of YouTube videos (here’s a 2022 Rolling Stone profile if you’re unfamiliar), this leaked onboarding document for new members of his production company is a compelling read.
It’s a snapshot of what it takes to run a massive scale viral YouTube operation in the 2020s, as well as a detailed description of a very specific company culture evolved to fulfill that mission.
It starts in the most on-brand MrBeast way possible:
I genuinely believe if you attently read and understand the knowledge here you will be much better set up for success. So, if you read this book and pass a quiz I’ll give you $1,000.
Everything is focused very specifically on YouTube as a format:
Your goal here is to make the best YOUTUBE videos possible. That’s the number one goal of this production company. It’s not to make the best produced videos. Not to make the funniest videos. Not to make the best looking videos. Not the highest quality videos.. It’s to make the best YOUTUBE videos possible.
The MrBeast definition of A, B and C-team players is one I haven’t heard before:
A-Players are obsessive, learn from mistakes, coachable, intelligent, don’t make excuses, believe in Youtube, see the value of this company, and are the best in the goddamn world at their job. B-Players are new people that need to be trained into A-Players, and C-Players are just average employees. […] They arn’t obsessive and learning. C-Players are poisonous and should be transitioned to a different company IMMEDIATELY. (It’s okay we give everyone severance, they’ll be fine).
The key characteristic outlined here, if you read between the hustle-culture lines, is learning. Employees who constantly learn are valued. Employees who don’t are not.
There’s a lot of stuff in there about YouTube virality, starting with the Click Thru Rate (CTR) for the all-important video thumbnails:
This is what dictates what we do for videos. “I Spent 50 Hours In My Front Yard” is lame and you wouldn’t click it. But you would hypothetically click “I Spent 50 Hours In Ketchup”. Both are relatively similar in time/effort but the ketchup one is easily 100x more viral. An image of someone sitting in ketchup in a bathtub is exponentially more interesting than someone sitting in their front yard.
The creative process for every video they produce starts with the title and thumbnail. These set the expectations for the viewer, and everything that follows needs to be defined with those in mind. If a viewer feels their expectations are not being matched, they’ll click away - driving down the crucial Average View Duration that informs how much the video is promoted by YouTube’s all-important mystical algorithms.
MrBeast videos have a strictly defined formula, outlined in detail on pages 6-10.
The first minute captures the viewer’s attention and demonstrates that their expectations from the thumbnail will be met. Losing 21 million viewers in the first minute after 60 million initial clicks is considered a reasonably good result! Minutes 1-3, 3-6 and 6-end all have their own clearly defined responsibilities as well.
Ideally, a video will feature something they call the “wow factor”:
An example of the “wow factor” would be our 100 days in the circle video. We offered someone $500,000 if they could live in a circle in a field for 100 days (video) and instead of starting with his house in the circle that he would live in, we bring it in on a crane 30 seconds into the video. Why? Because who the fuck else on Youtube can do that lol.
Chapter 2 (pages 10-24) is about creating content. This is crammed with insights into what it takes to produce surprising, spectacular and very expensive content for YouTube.
A lot of this is about coordination and intense management of your dependencies:
I want you to look them in the eyes and tell them they are the bottleneck and take it a step further and explain why they are the bottleneck so you both are on the same page. “Tyler, you are my bottleneck. I have 45 days to make this video happen and I can not begin to work on it until I know what the contents of the video is. I need you to confirm you understand this is important and we need to set a date on when the creative will be done.” […] Every single day you must check in on Tyler and make sure he is still on track to hit the target date.
It also introduces the concept of “critical components”:
Critical components are the things that are essential to your video. If I want to put 100 people on an island and give it away to one of them, then securing an island is a critical component. It doesn’t matter how well planned the challenges on the island are, how good the weather is, etc. Without that island there is no video.
[…]
Critical Components can come from literally anywhere and once something you’re working on is labeled as such, you treat it like your baby. WITHOUT WHAT YOU’RE WORKING ON WE DO NOT HAVE A VIDEO! Protect it at all costs, check in on it 10x a day, obsess over it, make a backup, if it requires shipping pay someone to pick it up and drive it, don’t trust standard shipping, and speak up the second anything goes wrong. The literal second. Never coin flip a Critical Component (that means you’re coinfliping the video aka a million plus dollars)
There’s a bunch of stuff about communication, with a strong bias towards “higher forms of communication”: in-person beats a phone call beats a text message beats an email.
Unsurprisingly for this organization, video is a highly valued tool for documenting work:
Which is more important, that one person has a good mental grip of something or that their entire team of 10 people have a good mental grip on something? Obviously the team. And the easiest way to bring your team up to the same page is to freaken video everything and store it where they can constantly reference it. A lot of problems can be solved if we just video sets and ask for videos when ordering things.
I enjoyed this note:
Since we are on the topic of communication, written communication also does not constitute communication unless they confirm they read it.
And this bit about the value of consultants:
Consultants are literally cheat codes. Need to make the world's largest slice of cake? Start off by calling the person who made the previous world’s largest slice of cake lol. He’s already done countless tests and can save you weeks worth of work. […] In every single freakin task assigned to you, always always always ask yourself first if you can find a consultant to help you.
Here’s a darker note from the section “Random things you should know”:
Do not leave consteatants waiting in the sun (ideally waiting in general) for more than 3 hours. Squid game it cost us $500,000 and boys vs girls it got a lot of people out. Ask James to know more
And to finish, this note on budgeting:
I want money spent to be shown on camera ideally. If you’re spending over $10,000 on something and it won’t be shown on camera, seriously think about it.
I’m always interested in finding management advice from unexpected sources. For example, I love The Eleven Laws of Showrunning as a case study in managing and successfully delegating for a large, creative project.
I don’t think this MrBeast document has as many lessons directly relevant to my own work, but as an honest peek under the hood of a weirdly shaped and absurdly ambitious enterprise it’s legitimately fascinating.
Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos to Train AI. This article has been getting a lot of attention over the past couple of days.
The story itself is nothing new: the Pile is four years old now, and has been widely used for training LLMs since before anyone even cared what an LLM was. It turns out one of the components of the Pile is a set of ~170,000 YouTube video captions (just the captions, not the actual video) and this story by Annie Gilbertson and Alex Reisner highlights that and interviews some of the creators who were included in the data, as well as providing a search tool for seeing if a specific creator has content that was included.
What's notable is the response. Marques Brownlee (19m subscribers) posted a video about it. Abigail Thorn (Philosophy Tube, 1.57m subscribers) tweeted this:
Very sad to have to say this - an AI company called EleutherAI stole tens of thousands of YouTube videos - including many of mine. I’m one of the creators Proof News spoke to. The stolen data was sold to Apple, Nvidia, and other companies to build AI
When I was told about this I lay on the floor and cried, it’s so violating, it made me want to quit writing forever. The reason I got back up was because I know my audience come to my show for real connection and ideas, not cheapfake AI garbage, and I know they’ll stay with me
Framing the data as "sold to Apple..." is a slight misrepresentation here - EleutherAI have been giving the Pile away for free since 2020. It's a good illustration of the emotional impact here though: many creative people do not want their work used in this way, especially without their permission.
It's interesting seeing how attitudes to this stuff change over time. Four years ago the fact that a bunch of academic researchers were sharing and training models using 170,000 YouTube subtitles would likely not have caught any attention at all. Today, people care!
Tom Scott, and the formidable power of escalating streaks
Ten years ago yesterday, Tom Scott posted this video to YouTube about “Special Crossings For Horses In Britain”. It was the first in his Things You Might Not Know series, but more importantly it was the start of a streak.
[... 1,352 words]After ten years, it’s time to stop making videos. Ten years ago, my friend Tom Scott started a deliberate streak of posting YouTube videos—initially about one a day before settling into a cadence of one a week. He kept that up for the full ten years, growing his subscribers to over 6 million in the process.
Today he’s ending that streak, in unparalleled style.
(I’m proud to have made an appearance in video number 13, talking about Zeppelins.)
2023
Exploring MusicCaps, the evaluation data released to accompany Google’s MusicLM text-to-music model
Google Research just released MusicLM: Generating Music From Text. It’s a new generative AI model that takes a descriptive prompt and produces a “high-fidelity” music track. Here’s the paper (and a more readable version using arXiv Vanity).
[... 1,323 words]2022
lite-youtube-embed (via) Handy Web Component wrapper around the standard YouTube iframe embed which knocks over 500KB of JavaScript off the initial page load—I just added this to the datasette.io homepage and increased the Lighthouse performance score from 51 to 93!
2021
Weeknotes: Datasette and Git scraping at NICAR, VaccinateCA
This week I virtually attended the NICAR data journalism conference and made a ton of progress on the Django backend for VaccinateCA (see last week).
[... 773 words]2020
Scaling Datastores at Slack with Vitess (via) Slack spent three years migrating 99% of their MySQL query load to run against Vitess, the open source MySQL sharding system originally built by YouTube. “Today, we serve 2.3 million QPS at peak. 2M of those queries are reads and 300K are writes. Our median query latency is 2 ms, and our p99 query latency is 11 ms.”
One academic who interviewed attendees of a flat-earth convention found that, almost to a person, they'd discovered the subculture via YouTube recommendations.
Happy Birthday Sea Lions! (via) Today, June 15th, is Sea Lion birthday—half of all California Sea Lions are born today thanks to clever co-ordinated delayed implantation by Sea Lion females. Natalie has started making nature videos and I’ve been tagging along as her camera-person—this three minute video, shot at Pier 39 in San Francisco, celebrates Sea Lion birthday and explains how it works.
2019
There’s a spectrum on YouTube between the calm section — the Walter Cronkite, Carl Sagan part — and Crazytown, where the extreme stuff is. If I’m YouTube and I want you to watch more, I’m always going to steer you toward Crazytown.
— Tristan Harris, former design ethicist at Google
A Conspiracy To Kill IE6 (via) Cracking story by Chris Zacharias about how a team of engineers at YouTube back in 2009 took advantage of some exploits in YouTube’s organization structure (left over from their acquisition by Google) to ship a vague IE6 deprecation warning banner on one of the world’s highest traffic websites, inspiring many other similar banners and resulting in a 10% drop in global IE6 traffic.
Vitess (via) I remember looking at Vitess when it was first released by YouTube in 2012. The idea of a proven horizontally scalable sharding mechanism for MySQL was exciting, but I was put off by the need for a custom Go or Java client library. Apparently that changed with Vitess 2.1 in April 2017, the first version to introduce a MySQL protocol compatible proxy which can be connected to by existing code written in any language. Vitess 3.0 came out last December so now the MySQL proxy layer is much more stable. Vitess is used in production by a bunch of other companies now (including Slack and Square) so it’s definitely worth a closer look.
2018
It seems as if you are never ‘hardcore’ enough for YouTube’s recommendation algorithm. It promotes, recommends and disseminates videos in a manner that appears to constantly up the stakes. Given its billion or so users, YouTube may be one of the most powerful radicalising instruments of the 21st century.
2017
Something is wrong on the internet. James Bridle takes a fascinating and deeply troubling dive into the world of Kids’ YouTube videos, which appear to be increasingly algorithmically generated and are evolving in a very dark direction.
In the official timeline, Peppa is appropriately reassured by a kindly dentist. In the version above, she is basically tortured, before turning into a series of Iron Man robots and performing the Learn Colours dance. A search for “peppa pig dentist” returns the above video on the front page, and it only gets worse from here.
2014
A Zeppelin, A Cat, and The World’s First In-Flight Radio Message. Tom Scott asked me for “something you might not know” at our leaving party in London before we moved to California. I went with the story of Kiddo the cat and the first attempt at an aerial Atlantic crossing. Here’s the resulting YouTube video.
2012
What platform was YouTube using before they were acquired by Google?
It was written in Python—I don’t think they used any particular framework (they started the site in 2005).
[... 37 words]2009
Google container data center tour (on YouTube). 45,000 servers in 45 shipping containers, along with some serious looking plumbing.
Apparently [unladen-swallow] is already 30% faster than CPython, and this version is being used to run some of the Python code on YouTube.
2008
YouTube Enables Deep Linking Within Videos. Add #t=1m45s to the end of a YouTube URL to jump to that spot. I’d be a lot more impressed by this if visiting a YouTube link in the UK didn’t use IP geo targetting to redirect me to uk.youtube.com, losing the fragment identifier and hence the #t specifier in the process.
Popular Websites Vulnerable to Cross-Site Request Forgery Attacks. Ed Felten and Bill Zeller announce four CSRF holes, in ING Direct, YouTube, MetaFilter and the New York Times. The ING Direct hole allowed transfer of funds out of a user’s bank accounts! The first three were fixed before publication; the New York Times hole still exists (despite being reported a year ago), and allows you to silently steal e-mail addresses by CSRFing the “E-mail this” feature.
Wario Land: Shake It—Amazing footage! Some virals really do deserve linking to.
YouTube Playlist: DjangoCon 2008 Sessions. YouTube’s tag and search indexes appear to lag behind the main site by quite a while; this appears to be the definitive index page for videos of talks at DjangoCon.
YouTube: djangocon tag. Google have started posting videos of presentations at DjangoCon on YouTube.
“THIS IS NOT MLM!!!”—An Appreciation. Merlin Mann explains his fascination with the “cash gifting” pyramid scams that keep cropping up on YouTube.