Simon Willison’s Weblog

1,085 items tagged “python”

The Python programming language.

2021

Async functions require an event loop to run. Flask, as a WSGI application, uses one worker to handle one request/response cycle. When a request comes in to an async view, Flask will start an event loop in a thread, run the view function there, then return the result.

Each request still ties up one worker, even for async views. The upside is that you can run async code within a view, for example to make multiple concurrent database queries, HTTP requests to an external API, etc. However, the number of requests your application can handle at one time will remain the same.

Using async and await in Flask 2.0

# 12th May 2021, 5:59 pm / flask, async, python
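
The behaviour described in the quote can be sketched with the standard library alone. This is not Flask's actual code: `async_view` and `run_async_view_in_thread` are hypothetical names, and `asyncio.sleep` stands in for real database or HTTP calls. It only illustrates the shape of "start an event loop in a thread, run the view there, then return the result".

```python
import asyncio
import threading


async def async_view():
    # Hypothetical view: two concurrent "queries" run inside one request.
    async def query(name, delay):
        await asyncio.sleep(delay)  # stand-in for real I/O
        return name

    return await asyncio.gather(query("users", 0.01), query("orders", 0.01))


def run_async_view_in_thread(view):
    """Run an async view on its own event loop in a thread, blocking until done."""
    result = {}

    def target():
        result["value"] = asyncio.run(view())

    thread = threading.Thread(target=target)
    thread.start()
    thread.join()  # the calling worker stays tied up for the whole request
    return result["value"]


print(run_async_view_in_thread(async_view))
```

The two queries overlap inside the view, but the caller still blocks on `thread.join()`, which is why the number of requests handled at one time stays the same.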

New Major Versions Released! Flask 2.0, Werkzeug 2.0, Jinja 3.0, Click 8.0, ItsDangerous 2.0, and MarkupSafe 2.0. Huge set of releases from the Pallets team. Python 3.6+ required and comprehensive type annotations. Flask now supports async views, Jinja async templates (used extensively by Datasette) “no longer requires patching”, Click has a bunch of new code around shell tab completion, ItsDangerous supports key rotation and so much more.

# 12th May 2021, 5:37 pm / flask, jinja, python, async

cinder: Instagram’s performance oriented fork of CPython (via) Instagram forked CPython to add some performance-oriented features they wanted, including a method-at-a-time JIT compiler and a mechanism for eagerly evaluating coroutines (avoiding the overhead of creating a coroutine if an awaited function returns a value without itself needing to await). They’re open sourcing the code to help start conversations about implementing some of these features in CPython itself. I particularly enjoyed the warning that accompanies the repo: this is not intended to be a supported release, and if you decide to run it in production you are on your own!

# 4th May 2021, 10:13 pm / open-source, python

Hello, HPy (via) HPy provides a new way to write C extensions for Python in a way that is compatible with multiple Python implementations at once, including PyPy.

# 29th March 2021, 2:40 pm / python, pypy

Homebrew Python Is Not For You. If you’ve been running into frustrations with your Homebrew Python environments breaking over the past few months (the dreaded “Reason: image not found” error) Justin Mayer has a good explanation. Python in Homebrew is designed to work as a dependency for their other packages, and recent policy changes that they made to support smoother upgrades have had catastrophic effects on those of us who try to use it for development environments.

# 25th March 2021, 3:14 pm / homebrew, python

When you have to mock a collaborator, avoid using the Mock object directly. Either use mock.create_autospec() or mock.patch(autospec=True) if at all possible. Autospeccing from the real collaborator means that if the collaborator's interface changes, your tests will fail. Manually speccing or not speccing at all means that changes in the collaborator's interface will not break your tests that use the collaborator: you could have 100% test coverage and your library would fall over when used!

Thea Flowers

# 17th March 2021, 4:44 pm / mocking, testing, python
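
A small illustration of the difference the quote describes, using only `unittest.mock`. The `Mailer` class is a hypothetical collaborator invented for this sketch.

```python
from unittest import mock


class Mailer:
    """Hypothetical collaborator whose interface the tests should track."""

    def send(self, to, subject):
        raise NotImplementedError("real network call")


# A bare Mock accepts anything, so interface drift goes unnoticed.
plain = mock.Mock()
plain.send("a@example.com")  # missing argument, silently accepted
plain.sned("oops")           # misspelled method, silently accepted

# An autospecced mock enforces the real signature and attribute names.
specced = mock.create_autospec(Mailer, instance=True)
specced.send("a@example.com", "Hello")  # matches Mailer.send, so it is fine

try:
    specced.send("a@example.com")  # missing 'subject'
except TypeError as exc:
    error = exc

print(type(error).__name__)
```

If `Mailer.send` later gains or loses a parameter, calls through `specced` start failing in the tests, while calls through `plain` keep passing.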

sqlite-uuid (via) Another Python package that wraps a SQLite module written in C: this one provides access to UUID functions as SQLite functions.

# 15th March 2021, 2:55 am / sqlite, python

sqlite-spellfix (via) I really like this pattern: “pip install sqlite-spellfix” gets you a Python module which includes a compiled (on your system when pip install ran) copy of the SQLite spellfix1 module, plus a utility variable containing its path so you can easily load it into a SQLite connection.

# 15th March 2021, 2:52 am / sqlite, python

unasync (via) Today I started wondering out loud if one could write code that takes an asyncio Python library and transforms it into the synchronous equivalent by using some regular expressions to strip out the “await ...” keywords and suchlike. Turns out that can indeed work, and Ratan Kulshreshtha built it! unasync uses the standard library tokenize module to run some transformations against an async library and spit out the sync version automatically. I’m now considering using this for sqlite-utils.

# 27th February 2021, 10:20 pm / async, python
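
A toy sketch of the token-based idea, assuming the simplest possible transformation: drop every `async` and `await` keyword. The function name `strip_async` is made up here and this is not unasync's implementation; a real tool has to handle considerably more than this.

```python
import io
import tokenize


def strip_async(source: str) -> str:
    """Return source with 'async' and 'await' keyword tokens removed."""
    kept = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in ("async", "await"):
            continue  # drop the keyword, keep everything around it
        kept.append((tok.type, tok.string))
    # With (type, string) pairs untokenize emits valid code with loose spacing.
    return tokenize.untokenize(kept)


async_source = "async def double(x):\n    return await helper(x)\n"
sync_source = strip_async(async_source)

# The generated code calls helper() directly, so helper must be synchronous.
namespace = {"helper": lambda x: x * 2}
exec(sync_source, namespace)
print(namespace["double"](3))
```

Working on tokens rather than raw text means the word "await" inside a string or comment is left alone, which a plain regular expression would not guarantee.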

Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Other Companies (via) Alex Birsan describes a new category of security vulnerability he discovered in the npm, pip and gem packaging ecosystems: if a company uses a private repository with internal package names, uploading a package with the same name to the public repository can often result in an attacker being able to execute their own code inside the networks of their target. Alex scored over $130,000 in bug bounties from this one, from a number of name-brand companies. Of particular note for Python developers: the --extra-index-url argument to pip will consult both public and private registries and install the package with the highest version number!

# 10th February 2021, 8:42 pm / security, pip, python, npm
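
The "highest version number wins" behaviour can be modelled in a few lines. This is a deliberately simplified model and not pip's resolver: the package name, index contents and the `pick_candidate` helper are all hypothetical, and versions are assumed to be plain dotted integers.

```python
def parse_version(text):
    """Turn '1.3.0' into (1, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def pick_candidate(package, indexes):
    """Pool candidates from every index and return the highest version."""
    candidates = [
        (parse_version(version), index_name)
        for index_name, packages in indexes.items()
        for version in packages.get(package, [])
    ]
    version, index_name = max(candidates)
    return ".".join(map(str, version)), index_name


indexes = {
    "private": {"acme-internal-utils": ["1.2.0", "1.3.0"]},
    "public": {"acme-internal-utils": ["99.0.0"]},  # attacker's upload
}

print(pick_candidate("acme-internal-utils", indexes))
```

Because both indexes feed one candidate pool, the attacker only needs a larger version number on the public index to be selected.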

Weeknotes: datasette-export-notebook, PyInstaller packaged Datasette, CBSAs

What a terrible week. I’ve found it hard to concentrate on anything substantial. In a mostly futile attempt to distract myself from doomscrolling I’ve mainly been building some experimental output plugins, fiddling with PyInstaller and messing around with shapefiles.

[... 732 words]

2020

datasette-ripgrep: deploy a regular expression search engine for your source code

This week I built datasette-ripgrep—a web application for running regular expression searches against source code, built on top of the amazing ripgrep command-line tool.

[... 1,362 words]

Unravelling `not` in Python (via) Part of a series where Brett Cannon looks at how fundamental Python syntactic sugar works, including a clearly explained dive into the underlying op codes and C implementation.

# 27th November 2020, 5:59 pm / c, python, brett-cannon

Hunting for Malicious Packages on PyPI (via) Jordan Wright installed all 268,000 Python packages from PyPI in containers, and ran Sysdig to capture syscalls made during installation to see if any of them were making extra network calls or reading from or writing to the filesystem. Absolutely brilliant piece of security engineering and research.

# 14th November 2020, 4:48 am / security, pypi, python

selenium-wire. Really useful scraping tool: enhances the Python Selenium bindings to run against a proxy which then allows Python scraping code to look at captured requests. This is great if a site you are working with triggers Ajax requests and you want to extract data from the raw JSON that came back.

# 2nd November 2020, 6:58 pm / scraping, data-journalism, python, selenium

Inevitably we got round to talking about async.

As much of an unneeded complication as it is for so many day-to-day use-cases, it’s important for Python because, if and when you do need the high throughput handling of these io-bound use-cases, you don’t want to have to switch language.

The same for Django: most of what you’re doing has no need of async but you don’t want to have to change web framework just because you need a sprinkling of non-blocking IO.

Carlton Gibson

# 27th September 2020, 3:09 pm / async, django, python

Array programming with NumPy—the NumPy paper (via) The NumPy paper is out, published in Nature. I found this enlightening: for an academic paper it’s very understandable, and it filled in quite a few gaps in my mental model of what NumPy is and which problems it addresses, as well as its relationship to the many other tools in the scientific Python stack.

# 17th September 2020, 4:34 pm / python, scipy, numpy

The “await me maybe” pattern for Python asyncio

I’ve identified a pattern for handling potentially-asynchronous callback functions in Python which I’m calling the “await me maybe” pattern. It works by letting you return a value, a callable function that returns a value OR an awaitable function that returns that value.

[... 787 words]
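
A minimal sketch of a helper that follows the description above: accept a plain value, a callable, or something awaitable, and always hand back the final value. The example callbacks are hypothetical.

```python
import asyncio


async def await_me_maybe(value):
    """Resolve a value, a callable returning a value, or an awaitable function."""
    if callable(value):
        value = value()  # call plain functions and async functions alike
    if asyncio.iscoroutine(value):
        value = await value  # an async function produced a coroutine
    return value


def sync_callback():
    return "from sync function"


async def async_callback():
    return "from async function"


async def main():
    return [
        await await_me_maybe("plain value"),
        await await_me_maybe(sync_callback),
        await await_me_maybe(async_callback),
    ]


print(asyncio.run(main()))
```

The caller always writes `await await_me_maybe(...)`, so code that accepts callbacks does not need to know which of the three forms it was given.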

Announcing the Consortium for Python Data API Standards (via) Interesting effort to unify the fragmented DataFrame API ecosystem, where increasing numbers of libraries offer APIs inspired by Pandas that imitate each other but aren’t 100% compatible. The announcement includes some very clever code to support the effort: custom tooling to compare the existing APIs, and an ingenious GitHub Actions setup to run traces (via sys.settrace), derive type signatures and commit those generated signatures back to a repository.

# 19th August 2020, 5:48 am / standards, data-science, github-actions, python
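
To give a feel for deriving type signatures from traces, here is a small sketch built on `sys.settrace`. It is not the consortium's tooling: `record_signatures` and `scale` are hypothetical, and the sketch only records positional argument types at call time.

```python
import sys


def record_signatures(func, *args, **kwargs):
    """Call func under a tracer and record argument type names per function."""
    seen = {}

    def tracer(frame, event, arg):
        if event == "call":
            code = frame.f_code
            arg_names = code.co_varnames[:code.co_argcount]
            seen[code.co_name] = {
                name: type(frame.f_locals[name]).__name__ for name in arg_names
            }
        return None  # no per-line tracing needed

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return seen


def scale(values, factor):
    return sum(values) * factor


print(record_signatures(scale, [1, 2], 2.5))
```

Only Python-level function calls trigger the tracer, so built-ins such as `sum` do not show up in the recorded signatures.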

Pysa: An open source static analysis tool to detect and prevent security issues in Python code (via) Interesting new static analysis tool for auditing Python for security vulnerabilities—things like SQL injection and os.execute() calls. Built by Facebook and tested extensively on Instagram, a multi-million line Django application.

# 7th August 2020, 8:50 pm / security, python, facebook, staticanalysis, django

Better Python Decorators with wrapt (via) Adam Johnson explains the intricacies of decorating a Python function without breaking the ability to correctly introspect it, and discusses how Scout use the wrapt library by Graham Dumpleton to implement their instrumentation library.

# 2nd July 2020, 9:48 pm / decorators, python, adam-johnson
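
The introspection problem is easy to reproduce with the standard library. The sketch below does not use wrapt: it contrasts a naive decorator with one that uses `functools.wraps`, which covers the common cases, while wrapt is aimed at the cases that remain. All function names here are hypothetical.

```python
import functools
import inspect


def naive_instrument(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper  # name, docstring and signature of func are lost


def careful_instrument(func):
    @functools.wraps(func)  # copies metadata and sets __wrapped__
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@naive_instrument
def add(a, b=1):
    """Add two numbers."""
    return a + b


@careful_instrument
def add_wrapped(a, b=1):
    """Add two numbers."""
    return a + b


print(add.__name__, inspect.signature(add))
print(add_wrapped.__name__, inspect.signature(add_wrapped))
```

`inspect.signature` follows the `__wrapped__` attribute that `functools.wraps` sets, which is why the second function reports its real parameters.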

click-app. While working on sqlite-generate today I built a cookiecutter template for building the skeleton for Click command-line utilities. It’s based on datasette-plugin so it automatically sets up GitHub Actions for running tests and deploying packages to PyPI.

# 23rd June 2020, 2:21 am / cookiecutter, projects, python

Practical Python Programming (via) David Beazley has been developing and presenting this three day Python course (aimed at people with some prior programming experience) for over thirteen years, and he’s just released the course materials under a Creative Commons license for the first time.

# 29th May 2020, 1:15 pm / david-beazley, python

Waiting in asyncio. Handy cheatsheet explaining the differences between asyncio.gather(), asyncio.wait_for(), asyncio.as_completed() and asyncio.wait() by Hynek Schlawack.

# 26th May 2020, 3:28 pm / async, python, hynek-schlawack
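
A compact side-by-side of the four functions, as a sketch rather than a reproduction of the cheatsheet. The `work` coroutine and its delays are made up, and `asyncio.sleep` stands in for real I/O.

```python
import asyncio


async def work(name, delay):
    await asyncio.sleep(delay)
    return name


async def main():
    # gather: run concurrently, results come back in argument order.
    gathered = await asyncio.gather(work("a", 0.02), work("b", 0.01))

    # wait_for: wrap a single awaitable with a timeout.
    try:
        await asyncio.wait_for(work("slow", 1), timeout=0.01)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True

    # as_completed: iterate over results in the order they finish.
    finished = []
    for future in asyncio.as_completed([work("a", 0.05), work("b", 0.01)]):
        finished.append(await future)

    # wait: takes tasks and returns (done, pending) sets of tasks.
    tasks = [asyncio.create_task(work("a", 0.01)),
             asyncio.create_task(work("b", 0.02))]
    done, pending = await asyncio.wait(tasks)
    done_names = {task.result() for task in done}

    return gathered, timed_out, finished, done_names, pending


print(asyncio.run(main()))
```

Note that `asyncio.wait()` is given tasks created with `asyncio.create_task()`; recent Python versions no longer accept bare coroutines there.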

pyp: Easily run Python at the shell (via) Fascinating little CLI utility which uses some deeply clever AST introspection to enable little Python one-liners that act as replacements for all manner of pipe-oriented unix utilities. Took me a while to understand how it works from the README, but then I looked at the code and the entire thing is only 380 lines long. There’s also a useful --explain option which outputs the Python source code that it would execute for a given command.

# 9th May 2020, 9:05 pm / shell, python

A hands-on introduction to static code analysis. Useful tutorial on using the Python standard library tokenize and ast modules to find specific patterns in Python source code, using the visitor pattern.

# 5th May 2020, 12:15 am / compilers, python, staticanalysis
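
As a small taste of the visitor pattern with the `ast` module, the sketch below reports every line where a given function name is called. The `CallFinder` class and the sample source are hypothetical and not taken from the tutorial.

```python
import ast


class CallFinder(ast.NodeVisitor):
    """Collect line numbers of direct calls to a named function."""

    def __init__(self, target):
        self.target = target
        self.lines = []

    def visit_Call(self, node):
        # Only match plain names such as eval(...), not attribute calls.
        if isinstance(node.func, ast.Name) and node.func.id == self.target:
            self.lines.append(node.lineno)
        self.generic_visit(node)  # keep walking into nested calls


source = (
    "x = eval('1 + 1')\n"
    "print(x)\n"
    "y = [eval(s) for s in ('2', '3')]\n"
)

finder = CallFinder("eval")
finder.visit(ast.parse(source))
print(finder.lines)  # [1, 3]
```

`NodeVisitor` dispatches on node class name, so adding a `visit_Import` or `visit_FunctionDef` method is enough to look for other patterns.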

How to get Rich with Python (a terminal rendering library). Will McGugan introduces Rich, his new Python library for rendering content on the terminal. This is a very cool piece of software—out of the box it supports coloured text, emoji, tables, rendering Markdown, syntax highlighting code, rendering Python tracebacks, progress bars and more. “pip install rich” and then “python -m rich” to render a “test card” demo demonstrating the features of the library.

# 4th May 2020, 11:27 pm / cli, python, will-mcgugan

How to install and upgrade Datasette using pipx (via) I’ve been using pipx to run Datasette for a while now—it’s a neat Python packaging tool which installs a Python CLI command with all of its dependencies in its own isolated virtual environment. Today, thanks to Twitter, I figured out how to install and upgrade plugins in the same environment—so I added a section to the Datasette installation documentation about it.

# 4th May 2020, 7:23 pm / datasette, pip, python

Weeknotes: Covid-19, First Python Notebook, more Dogsheep, Tailscale

My covid-19.datasettes.com project publishes information on COVID-19 cases around the world. The project started out using data from Johns Hopkins CSSE, but last week the New York Times started publishing high quality USA county- and state-level daily numbers to their own repository. Here’s the change that added the NY Times data.

[... 993 words]

How to cheat at unit tests with pytest and Black

I’ve been making a lot of progress on Datasette Cloud this week. As an application that provides private hosted Datasette instances (initially targeted at data journalists and newsrooms) the majority of the code I’ve written deals with permissions: allowing people to form teams, invite team members, promote and demote team administrators and suchlike.

[... 933 words]