Simon Willison’s Weblog

Subscribe
Atom feed for html

72 items tagged “html”

2024

Using static websites for tiny archives (via) Alex Chan:

Over the last year or so, I’ve been creating static websites to browse my local archives. I’ve done this for a variety of collections, including:

  • paperwork I’ve scanned
  • documents I’ve created
  • screenshots I’ve taken
  • web pages I’ve bookmarked
  • video and audio files I’ve saved

This is such a neat idea. These tiny little personal archive websites aren't even served through a localhost web server - they exist as folders on disk, and Alex browses them by opening up the index.html file directly in a browser.

# 17th October 2024, 11:02 pm / html, archives

HTML for People (via) Blake Watson's brand new HTML tutorial, presented as a free online book (CC BY-NC-SA 4.0, on GitHub). This seems very modern and well thought-out to me. It focuses exclusively on HTML, skipping JavaScript entirely and teaching with Simple.css to avoid needing to dig into CSS while still producing sites that are pleasing to look at. It even touches on Web Components (described as Custom HTML tags) towards the end.

# 11th October 2024, 1:51 am / css, html, web-components

Reckoning. Alex Russell is a self-confessed Cassandra - doomed to speak truth that the wider Web industry stubbornly ignores. With this latest series of posts he is spitting fire.

The series is an "investigation into JavaScript-first frontend culture and how it broke US public services", in four parts.

In Part 2 — Object Lesson Alex profiles BenefitsCal, the California state portal for accessing SNAP food benefits (aka "food stamps"). On a 9Mbps connection, as can be expected in rural parts of California with populations most likely to need these services, the site takes 29.5 seconds to become usefully interactive, fetching more than 20MB of JavaScript (which isn't even correctly compressed) for a giant SPA that incoroprates React, Vue, the AWS JavaScript SDK, six user-agent parsing libraries and a whole lot more.

It doesn't have to be like this! GetCalFresh.org, the Code for America alternative to BenefitsCal, becomes interactive after 4 seconds. Despite not being the "official" site it has driven nearly half of all signups for California benefits.

The fundamental problem here is the Web industry's obsession with SPAs and JavaScript-first development - techniques that make sense for a tiny fraction of applications (Alex calls out document editors, chat and videoconferencing and maps, geospatial, and BI visualisations as apppropriate applications) but massively increase the cost and complexity for the vast majority of sites - especially sites primarily used on mobile and that shouldn't expect lengthy session times or multiple repeat visits.

There's so much great, quotable content in here. Don't miss out on the footnotes, like this one:

The JavaScript community's omertà regarding the consistent failure of frontend frameworks to deliver reasonable results at acceptable cost is likely to be remembered as one of the most shameful aspects of frontend's lost decade.

Had the risks been prominently signposted, dozens of teams I've worked with personally could have avoided months of painful remediation, and hundreds more sites I've traced could have avoided material revenue losses.

Too many engineering leaders have found their teams beached and unproductive for no reason other than the JavaScript community's dedication to a marketing-over-results ethos of toxic positivity.

In Part 4 — The Way Out Alex recommends the gov.uk Service Manual as a guide for building civic Web services that avoid these traps, thanks to the policy described in their Building a resilient frontend using progressive enhancement document.

# 18th August 2024, 4:37 pm / accessibility, progressive-enhancement, javascript, government, html, web-performance, alex-russell, gov-uk

This month in Servo: parallel tables and more (via) New in Servo:

Parallel table layout is now enabled (@mrobinson, #32477), spreading the work for laying out rows and their columns over all available CPU cores. This change is a great example of the strengths of Rayon and the opportunistic parallelism in Servo's layout engine.

The commit landing the change is quite short, and much of the work is done by refactoring the code to use .par_iter().enumerate().map(...) - par_iter() is the Rayon method that allows parallel iteration over a collection using multiple threads, hence multiple CPU cores.

# 31st July 2024, 3:03 pm / rust, html, servo, concurrency

For some reason, many people still believe that browsers need to include non-standard hacks in HTML parsing to display the web correctly.

In reality, the HTML parsing spec is exhaustively detailed. If you implement it as described, you will have a web-compatible parser.

Andreas Kling

# 23rd June 2024, 11:59 pm / browsers, web-standards, html, andreas-kling, ladybird

Streaming HTML out of order without JavaScript (via) A really interesting new browser capability. If you serve the following HTML:

<template shadowrootmode="open">
  <slot name="item-1">Loading...</slot>
</template>

Then later in the same page stream an element specifying that slot:

<span slot="item-1">Item number 1</span>

The previous slot will be replaced while the page continues to load.

I tried the demo in the most recent Chrome, Safari and Firefox (and Mobile Safari) and it worked in all of them.

The key feature is shadowrootmode=open, which looks like it was added to Firefox 123 on February 19th 2024 - the other two browsers are listed on caniuse.com as gaining it around March last year.

# 1st March 2024, 4:59 pm / web-components, browsers, html

htmz (via) Astonishingly clever browser platform hack by Lean Rada.

Add this to a page:

<iframe hidden name=htmz onload="setTimeout(() => document.querySelector( this.contentWindow.location.hash || null)?.replaceWith( ...this.contentDocument.body.childNodes ))"></iframe>

Then elsewhere add a link like this:

<a href="/flower.html#my-element" target=htmz>Flower</a>

Clicking that link will fetch content from /flower.html and replace the element with ID of my-element with that content.

# 20th February 2024, 1:21 am / iframes, html, javascript

Portable EPUBs. Will Crichton digs into the reasons people still prefer PDF over HTML as a format for sharing digital documents, concluding that the key issues are that HTML documents are not fully self-contained and may not be rendered consistently.

He proposes “Portable EPUBs” as the solution, defining a subset of the existing EPUB standard with some additional restrictions around avoiding loading extra assets over a network, sticking to a smaller (as-yet undefined) subset of HTML and encouraging interactive components to be built using self-contained Web Components.

Will also built his own lightweight EPUB reading system, called Bene—which is used to render this Portable EPUBs article. It provides a “download” link in the top right which produces the .epub file itself.

There’s a lot to like here. I’m constantly infuriated at the number of documents out there that are PDFs but really should be web pages (academic papers are a particularly bad example here), so I’m very excited by any initiatives that might help push things in the other direction.

# 25th January 2024, 8:32 pm / web-components, html, pdf

2023

You can stop using user-scalable=no and maximum-scale=1 in viewport meta tags now. Luke Plant points out that your meta viewport tag should stick to just “width=device-width, initial-scale=1” these days—the user-scalable=no and maximum-scale=1 attributes are no longer necessary, and have a negative impact on accessibility, especially for Android users.

# 4th August 2023, 11:41 pm / lukeplant, html, accessibility, mobile, mobileweb

The Page With No Code (via) A fun demo by Dan Q, who created a web page with no HTML at all—but in Firefox it still renders content, thanks to a data URI base64 encoded stylesheet served in a link: header that uses html::before, html::after, body::before and body::after with content: properties to serve the content. It even has a background image, encoded as a base64 SVG nested inside another data URI.

# 21st January 2023, 6:59 pm / css, datauri, html

2022

Introducing sqlite-html: query, parse, and generate HTML in SQLite (via) Another brilliant SQLite extension module from Alex Garcia, this time written in Go. sqlite-html adds a whole family of functions to SQLite for parsing and constructing HTML strings, built on the Go goquery and cascadia libraries. Once again, Alex uses an Observable notebook to describe the new features, with embedded interactive examples that are backed by a Datasette instance running in Fly.

# 3rd August 2022, 5:31 pm / sqlite, datasette, go, html, alex-garcia

Fastest way to turn HTML into text in Python (via) A light benchmark of the new-to-me selectolax Python library shows it performing extremely well for tasks such as extracting just the text from an HTML string, after first manipulating the DOM. selectolax is a Python binding over the Modest and Lexbor HTML parsing engines, which are written in no-outside-dependency C.

# 27th July 2022, 5:55 pm / html, python

HTML event handler attributes: down the rabbit hole (via) onclick="myfunction(event)" is an idiom for passing the click event to a function - but how does it work? It turns out the answer is buried deep in the HTML spec - the browser wraps that string of code in a function(event) { ... that string ... } function and makes the event available to its local scope that way.

# 26th April 2022, 8:35 pm / javascript, html, domscripting

2020

pup. This is a great idea: a command-line tool for parsing HTML on stdin using CSS selectors. It’s like jq but for HTML. Supports a sensible collection of selectors and has a number of output options for the selected nodes, including plain text and JSON. It also works as a simple pretty-printer for HTML.

# 14th February 2020, 4:25 pm / parsing, html, cli

2019

Using the HTML lang attribute (via) TIL the HTML lang attribute is used by screen readers to understand how to provide the correct accent and pronunciation.

# 18th April 2019, 9:09 pm / screenreaders, html, accessibility, localisation

2018

If you wrap your main content – that is, the stuff that isn’t navigation, logo and main header etc – in a

tag, a screen reader user can jump immediately to it using a keyboard shortcut. Imagine how useful that is – they don’t have to listen to all the content before it, or tab through it to get to the main meat of your page.

Bruce Lawson

# 19th December 2018, 1:07 pm / html, accessibility, bruce-lawson

kennethreitz/requests-html: HTML Parsing for Humans™ (via) Neat and tiny wrapper around requests, lxml and html2text that provides a Kenneth Reitz grade API design for intuitively fetching and scraping web pages. The inclusion of html2text means you can use a CSS selector to select a specific HTML element and then convert that to the equivalent markdown in a one-liner.

# 25th February 2018, 4:49 pm / scraping, html, python, requests

2017

Can I use... input type=color. TIL <input type="color"> has reached 78.83% support globally already - biggest gap right now is Mobile Safari.

# 29th November 2017, 9:56 pm / html

2013

Why can’t I do style=“padding: 20px” and a border in the same div?

You can’t have two style attributes on the same element—but you can have two styles rules inside the same attribute. Try this instead:

[... 48 words]

Should I store markdown instead of HTML into database fields?

You should store the exact format that was entered by the user.

[... 95 words]

What are the different ways in which web sites can be developed?

There are a few languages that provide an alternative syntax that compiles to HTML (Haml is quite a popular one) but generally you need to have a very good understanding of HTML in order to do any web development at all, no matter what server-side technology you use. Likewise for CSS—Sass and LESS provide alternative syntax that compiles to CSS, but they are no replacement for understanding how CSS actually works.

[... 94 words]

What data structures are used to implement the DOM tree?

You may enjoy this post from Hixie back in 2002 which illustrates how different browsers deal with incorrectly nested HTML. IE6 used to create a tree that wasn’t actually a tree! http://ln.hixie.ch/?start=103791...

[... 49 words]

2012

What’s the best way to handle logins?

First, make sure you’re storing the password as a salted hash, using a deliberately slow hashing algorithm such as bcrypt, scrypt or PBKDF2—here are some recent articles to get you up to speed:

[... 176 words]

What is the difference between XHTML 1.0 strict and transitional?

Not a lot. XHTML transitional lets you use a few presentational attributes and elements that aren’t available in XHTML strict. Here’s a more detailed overview from back in 2005: http://24ways.org/2005/transitio...

[... 59 words]

2011

Could browsers be made to scroll down (e.g. by 67%) if you add #67% to a URL?

I’d say no.

[... 89 words]

2010

Is there any consensus yet on link rel=shorturl vs rev=canonical?

It’s pretty clear from the answers that rev=canonical v.s. rel=canonical is way too confusing—so it’s down to rel=shortlink v.s. rel=shorturl.

[... 38 words]

The Web for me is still URLs and HTML. I don’t want a Web which can only be understood by running a JavaScript interpreter against it.

Me, on Twitter

# 27th September 2010, 4:37 pm / html, javascript, urls, recovered

Paper 5 | Scribd (via) A more impressive example of Scribd’s new HTML/CSS document viewer: a mathematics-heavy LaTeX paper by one of Scribd’s engineers.

# 7th May 2010, 12:12 pm / css, html, html5, latex, scribd, recovered

Scribd in HTML5. Outstanding piece of engineering work from Scribd—they can now render documents using HTML, webfonts and a ton of CSS absolute positioning (using ems rather than pixels) instead of Flash. Nothing to do with HTML5 of course, which is rapidly replacing Ajax as the most mis-applied terminology on the Web. That nit-pick feels pretty insignificant compared to their overall achievement though—being able to convert any formatted document (.doc, pdf etc) in to HTML and CSS that displays correctly is a real leap forward.

# 7th May 2010, 12:09 pm / css, css3, html, html5, scribd, webfonts, recovered

Want to know if your ‘HTML application’ is part of the web? Link me into it. Not just link me to it; link me into it. Not just to the black-box frontpage. Link me to a piece of content. Show me that it can be crawled, show me that we can draw strands of silk between the resources presented in your app. That is the web: The beautiful interconnection of navigable content

Ben Ward

# 6th May 2010, 8:53 pm / ben-ward, html, links, web, webapps, recovered