Putting photos in folders

I’m printing index sheets for the FSPS photos, so that each street’s group of photos (e.g. Ainslie Road) in the archive folders is divided by a set of A4 colour-printed pages with thumbnails of the photos. These don’t actually have each photo’s URL or filename, which I’ve been a bit disappointed about, but they do have the URL of the street’s page. That is enough to get pretty close to an individual photo, and I think it’s good enough. If I were starting this project again I might do things a bit differently, but I’m far enough in now to want to maintain consistency.

I do want to sort out a better URL rewrite for page IDs. At the moment I am including page ID URLs such as https://archives.org.au/Special:Redirect/page/1100 but this would be neater as https://archives.org.au/P1100 (which would prohibit having wiki pages at that URL, but I think that’s okay).
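The mapping itself is simple enough to sketch. This is only an illustration of the idea, not the actual rewrite rule (which would belong in the web server or wiki configuration); the function name and the exact `/P<digits>` pattern are assumptions for the example.

```python
import re

BASE_URL = "https://archives.org.au"

# Assumed short form: "/P" followed by one or more digits, and nothing else.
SHORT_ID = re.compile(r"/P([0-9]+)")


def expand_short_url(path):
    """Return the Special:Redirect URL for a short /P<id> path, or None."""
    match = SHORT_ID.fullmatch(path)
    if match is None:
        # Not a page-ID path, so leave it for the wiki to handle as usual.
        return None
    page_id = int(match.group(1))
    return f"{BASE_URL}/Special:Redirect/page/{page_id}"


print(expand_short_url("/P1100"))
```

Anything that does not match the pattern falls through untouched, which is also where the cost shows up: a wiki page whose title is exactly P1100 could no longer be reached at that URL.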

Checksums for Flickr photos

I’ve been working a bit on PhpFlickr CLI lately, which is my little tool for interacting with Flickr. So far, it only adds checksums to photos (as machine tags), as a precursor to making a duplicate-finder for Flickr. I also want to use the checksums for adding links to and from Wikimedia Commons (so that photos I’ve uploaded there are linked to their versions on Flickr, and on Flickr they’re linked to Commons).
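Roughly, the idea looks like the following sketch. It is not PhpFlickr CLI’s actual code: the `checksum:sha1=…` tag name and the function names are assumptions, chosen only because Flickr machine tags take the form namespace:predicate=value.

```python
import hashlib
from collections import defaultdict


def file_checksum(path, algorithm="sha1", chunk_size=65536):
    """Hash a file in chunks so that large photos need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_machine_tag(path, algorithm="sha1"):
    """Build a machine tag (hypothetical naming) such as 'checksum:sha1=<hex>'."""
    return f"checksum:{algorithm}={file_checksum(path, algorithm)}"


def group_duplicates(tags_by_photo):
    """Given {photo_id: checksum_tag}, return only the tags shared by 2+ photos."""
    groups = defaultdict(list)
    for photo_id, tag in tags_by_photo.items():
        groups[tag].append(photo_id)
    return {tag: ids for tag, ids in groups.items() if len(ids) > 1}
```

Once every photo carries such a tag, finding duplicates is just a matter of grouping by tag, and the same checksum could be compared against a file on Commons to pair up the two copies.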

Things should be ‘projects’, not ‘systems’. They should end, so they can be forgotten. They must be in a fit state to be ended and forgotten. Books work with that idea, but websites are trickier. That’s slightly annoying, but there are great tools for making it easier. Not as easy as sticking a book in a cupboard for a century though. Hmm. I think I need another beer…

Self-hosted websites are doomed to die

I keep wanting to be able to recommend the ‘best’ way for people (who don’t like command lines) to get research stuff online. Is it Flickr, Zenodo, Internet Archive, Wikimedia, and Github? Or is it a shared hosting account on Dreamhost, running MediaWiki, WordPress, and Piwigo? I’d rather the latter! Is it really that hard to set up your own website? (I don’t think so, but I probably can’t see what I can’t see.)

Anyway, even if running your own website, one should still be putting stuff on Wikimedia projects. And even if not using it for everything, Flickr is a good place for photos (in Australia) because you can add them to the Australia in Pictures group and they’ll turn up in searches on Trove. The Internet Archive, even if not a primary and cited place for research materials, is a great place to upload wikis’ public page dumps. So it really seems that the remaining trouble with self-hosting websites is that they’re fragile and subject to complete loss if you abandon them (i.e. stop paying the bills).

My current mitigation for my own sites’ reliance on me is to create annual dumps in multiple formats, including uploading public stuff to IA, printing some things, and burning everything to Blu-ray discs that get stored in polypropylene sleeves in the dark, in places where I can forget to throw them out. (Of course, I deal in tiny amounts of data, and no video.)

What was it Robert Graves said in I, Claudius about the best way to ensure the survival of a document being to just leave it sitting on one’s desk and not try at all to do anything special, because it’s all perfectly random anyway as to what persists, and we cannot influence the universe in any meaningful way?


I’ve been attempting to write to people again lately. As in, proper letters on paper and in envelopes and stuck through holes in walls and doors. It doesn’t work though. Ten years ago I wrote to people, and it was reasonably easy although one had to ignore the anachronistic self-consciousness. Now, it feels like writing a telegram, for all the relevance it has to modern life. And doing so on some sort of rare letterpress’d form at that — the mechanics have become harder, the whole thing far less familiar. Where even is there a post box around here? Do stamps still come in booklets? What’s it even cost to send a letter? Only people having weddings send things in the post these days.

I once wrote a little system for writing email-letters. It was a bit like Gmail’s system of having the reply-box at the bottom of the to-and-fro conversation, except it went to further extremes of actually deleting the quoted reply text from emails, and of actually tracking correspondents as entities in their own right and not just by email address. It also prohibited writing to more than one person at once.

It feels like there’s a place for a letter-writing system that really is just email but also isn’t one’s normal email client (be that Fastmail, Gmail, Thunderbird, or whatever). Writing to a friend should be a different act to tapping off a note to a colleague or haggling with a civil servant. The user interface should reflect that. It should be simpler, calmer, and prioritise longer paragraphs and better grammar. (I’ve read similar sentiments relating to the design of the Discourse forum software; the developers of that want the software to shunt people towards better discussions, and I’m pretty sure Google don’t have anything like that idea with the Gmail interface. No one wants to write a letter on a blotter edged with full-colour advertisements for Fletcher’s Fantastic Fumigator, and Google want you to use the exact same interface for work and for social interaction. Doesn’t seem like a good idea to me.)

I’d still be using my email archiver, but it dates from an age before two-factor authentication, and improvements in the security of email providers broke it and I’ve not yet gotten around to fixing it. Perhaps it’s time to do so.

CFB Folder 1 done

The first folder of the C.F. Barker Archives’ material is done: finished scanning and initial entry into ArchivesWiki. This is my attempt to use MediaWiki as a digital archive platform for physical records (and digitally-created ones, although they don’t feature as much in the physical folders). It’s reasonably satisfactory so far, although there’s lots that’s a bit frustrating. I’m attempting to document what I’m doing (in a Wikibook), and there’s more to figure out.

There are a few key parts to it; two stand out as a bit weird. Firstly, the structure of access control is that completely separate wikis are created for each group of access required. This can make it tricky linking things together, but makes for much clearer separation of privacy, and almost removes the possibility of things being inadvertently made public when they shouldn’t be. The second is that the File namespace is not used at all for file descriptions. Files are considered more like ‘attachments’ and their metadata is contained on main-namespace pages, where the files are displayed. This means that files are not considered to be archival items (except of course when they are; i.e. digitally-created ones!), but just representations of them, and for example multiple file types or differently cropped photos can all appear on a single item’s record. The basic idea is to have a single page that encapsulates the entire item (it doesn’t matter if the item is just a single photograph, and the system also works when the ‘item’ is an aggregate item of, for example, a whole box of photos being accessioned into ArchivesWiki).

Stop inventing new ways of doing things

Every so often I write this same thing. It’s Monday (any Monday) and so it’s time to write it again.

The solution to very few problems is to write more code. Usually, it is better to write things in English, explaining whatever the thing is.

This is mainly because it takes a good long while to write code, and continues to take time for as long as the codebase exists. This time is better spent actually doing something — code is pretty much always ‘meta’ work, work that supports other work.

And don’t go saying that all work is like that, because it’s just not. (Hurrumph.) Working on preserving, describing, and storing all the books of the realm is work that has value in itself; writing the software for doing that is meta-work. I’d rather work on the former.

Help archive Wikimedia Commons!

WikiTeam has released an update of the chronological archive of all Wikimedia Commons files, up to 2013. Now ~34 TB in total.

Just seed one or more of these torrents (typically 20-40 GB) and you’ll be like a brick in the Library of Alexandria (or something), doing your bit for permanent preservation of this massive archive.

From this post to wikimedia-l.

What goes Where on the Web

Every now and then I recap on where and what I store online. Today I do so again, while I’m rather feeling that there should be discrete and specific tools for each of the things.

Firstly there are the self-hosted items:

  1. WordPress for blogging (where photo and file attachments should be customized to the exact use in question, not linked from external sites). It is also my OpenID provider.
  2. Piwigo as the primary location for all photographs.
  3. MoonMoon for feed reading (and, hopefully one day, archiving).
  4. MediaWiki for family history sites that are closed-access.
  5. My personal DokuWiki for things that need to be collaboratively edited.

Then the third-party hosts:

  1. OpenStreetMap for map data (GPX traces) and blogging about map-making.
  2. Wikimedia Commons for media of general interest.
  3. The NLA’s Trove for correcting newspaper texts.
  4. Wikisource as a library.
  5. Twitter (although I’m not really sure why I list this here at all).

Finally, I’m still trying to figure out the best system for:

  1. Public family history research. There’s some discussion about this on Meta.

What can I put on Commons?

A strange log, completely devoured by worms. One wonders how it got where it is.

I never quite know what to upload to Wikimedia Commons. They say it accepts files that provide knowledge and are instructional or informative, but that seems so broad. Can I, for example, upload the photo above? It’s just a log that I thought was interesting because it’s so worm-eaten, and so neatly cut at each end (and the further end has two S-shaped steel hooks embedded in it), but is it within the scope of Commons? I’ve no idea. I suspect that it’s not, because it’s not a very good photo and it’s not of interest to anyone other than me. I could upload it, and it might stay there for a while, but surely someone will come along at some point (perhaps years down the track, when I’m no longer interested in it) and do away with it?

So I figure I’m better off uploading it here, where it can stay and be safely ignored by the world. I do just wonder, though, whether much the same line of reasoning can be used for very many photos that might be suitable for Commons. Actually, I don’t wonder it: I do not upload much there because I think what I’ve got to offer really does mostly fall into the same category as this log photo.

So I’ll stick to my own wiki, for now. Plenty of work for me on Wikisource, anyway…