Archive for Thursday, 10th October 2024

Thursday, 10th October 2024

Announcing Deno 2. The big focus of Deno 2 is compatibility with the existing Node.js and npm ecosystem:

Deno 2 takes all of the features developers love about Deno 1.x — zero-config, all-in-one toolchain for JavaScript and TypeScript development, web standard API support, secure by default — and makes it fully backwards compatible with Node and npm (in ESM).

The npm support is documented here. You can write a script like this:

import * as emoji from "npm:node-emoji";
console.log(emoji.emojify(`:sauropod: :heart:  npm`));

And when you run it Deno will automatically fetch and cache the required dependencies:

deno run main.js

Another new feature that caught my eye was this:

deno jupyter now supports outputting images, graphs, and HTML

Deno has apparently shipped with a Jupyter notebook kernel for a while, and it's had a major upgrade in this release.

Here's Ryan Dahl's demo of the new notebook support in his Deno 2 release video.

I tried this out myself, and it's really neat. First you need to install the kernel:

deno juptyer --install

I was curious to find out what this actually did, so I dug around in the code and then further in the Rust runtimed dependency. It turns out installing Jupyter kernels, at least on macOS, involves creating a directory in ~/Library/Jupyter/kernels/deno and writing a kernel.json file containing the following:

{
  "argv": [
    "/opt/homebrew/bin/deno",
    "jupyter",
    "--kernel",
    "--conn",
    "{connection_file}"
  ],
  "display_name": "Deno",
  "language": "typescript"
}

That file is picked up by any Jupyter servers running on your machine, and tells them to run deno jupyter --kernel ... to start a kernel.

I started Jupyter like this:

jupyter-notebook /tmp

Then started a new notebook, selected the Deno kernel and it worked as advertised:

Jupyter notebook running the Deno kernel. I run 4 + 5 and get 9, then Deno.version and get back 2.0.0. I import Observable Plot and the penguins data, then render a plot which shows as a scatter chart.

import * as Plot from "npm:@observablehq/plot";
import { document, penguins } from "jsr:@ry/jupyter-helper";
let p = await penguins();

Plot.plot({
  marks: [
    Plot.dot(p.toRecords(), {
      x: "culmen_depth_mm",
      y: "culmen_length_mm",
      fill: "species",
    }),
  ],
  document,
});

# 4:11 am / javascript, nodejs, npm, jupyter, typescript, deno, observable-plot

TIL Livestreaming a community election event on YouTube — I live in El Granada, California. Wikipedia calls us [a census designated place](https://en.wikipedia.org/wiki/El_Granada,_California) - we don't have a mayor or city council. But we do have a [Community Services District](https://granada.ca.gov/) - originally responsible for our sewers, and since 2014 also responsible for our parks. And we get to vote for the board members [in the upcoming November election](https://granada.ca.gov/2024-candidate-listing)!

10th Oct 2024, 5:09 am

Bridging Language Gaps in Multilingual Embeddings via Contrastive Learning (via) Most text embeddings models suffer from a "language gap", where phrases in different languages with the same semantic meaning end up with embedding vectors that aren't clustered together.

Jina claim their new jina-embeddings-v3 (CC BY-NC 4.0, which means you need to license it for commercial use if you're not using their API) is much better on this front, thanks to a training technique called "contrastive learning".

There are 30 languages represented in our contrastive learning dataset, but 97% of pairs and triplets are in just one language, with only 3% involving cross-language pairs or triplets. But this 3% is enough to produce a dramatic result: Embeddings show very little language clustering and semantically similar texts produce close embeddings regardless of their language

Scatter plot diagram, titled Desired Outcome: Clustering by Meaning. My dog is blue and Mein Hund ist blau are located near to each other, and so are Meine Katze ist rot and My cat is red

# 4 pm / machine-learning, ai, embeddings, jina

← Wednesday, 9th October 2024

Friday, 11th October 2024 →

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Simon Willison’s Weblog

Thursday, 10th October 2024