Which is the best open source tool to populate my database with test data for my load test?
11th February 2012
My answer to Which is the best open source tool to populate my database with test data for my load test? on Quora
I’ve seen tools that do this, but to be honest it’s very simple to write your own script (especially if you’re using an ORM). The other benefit of writing your own script is that you’ll have a much better chance of accurately representing your expected data, sizes etc.
A couple of techniques that are pretty useful:
- Build up lists of common first names and last names, then generate user names by picking a random first name and a random last name.
- Build a utility function that generates 6 letter random strings, then generate email addresses as random-6-letter-string@random-domain.
- For relationships, one technique is to populate one table, then pull all of the primary keys out into a list and pick them at random from that list when creating other records. You might want to bias that selection towards some records to get something closer to a realistic bell curve rather than a purely random selection.
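Here is a minimal sketch of those techniques in Python, using only the standard library. The name lists, domains, the `users` and `posts` record shapes and the `skew` parameter are all made up for illustration. The biasing trick (raising a random number to a power) is just one simple way to favour some records over others; it gives a skewed distribution, not a true bell curve.

```python
import random
import string

# Hypothetical sample data; real lists would be much longer.
FIRST_NAMES = ["Alice", "Bob", "Carol", "Dave", "Eve"]
LAST_NAMES = ["Smith", "Jones", "Garcia", "Chen", "Patel"]
DOMAINS = ["example.com", "example.org", "example.net"]


def random_name(rng):
    """Pick a random first name and a random last name."""
    return f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"


def random_string(rng, length=6):
    """Generate a random lowercase string (6 letters by default)."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def random_email(rng):
    """Build an address as random-6-letter-string@random-domain."""
    return f"{random_string(rng)}@{rng.choice(DOMAINS)}"


def make_users(rng, count):
    """Populate the first 'table' (here just a list of dicts)."""
    return [
        {"id": i, "name": random_name(rng), "email": random_email(rng)}
        for i in range(1, count + 1)
    ]


def pick_biased_key(rng, keys, skew=2.0):
    """Pick a key, favouring those near the start of the list.

    rng.random() ** skew clusters near 0 when skew > 1, so early keys
    are chosen more often. skew=1.0 gives a uniform selection.
    """
    index = int(rng.random() ** skew * len(keys))
    return keys[min(index, len(keys) - 1)]


def make_posts(rng, user_ids, count):
    """Create related records that point at randomly chosen user ids."""
    return [
        {"id": i, "user_id": pick_biased_key(rng, user_ids), "title": random_string(rng)}
        for i in range(1, count + 1)
    ]


rng = random.Random(42)  # fixed seed so a load test run is repeatable
users = make_users(rng, 100)
user_ids = [user["id"] for user in users]  # primary keys pulled into a list
posts = make_posts(rng, user_ids, 1000)
print(users[0])
print(posts[0])
```

In a real script the dicts would be replaced by ORM objects or `INSERT` statements, and `user_ids` would come from a query against the table you just populated.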
There are libraries that can help with this (e.g. built-in routines for generating fake email addresses etc). If you’re using Ruby, http://faker.rubyforge.org/ is worth a look (a port of Data::Faker from Perl). There’s a Python port here: https://github.com/threadsafelab...