Wrong date? Just add 3½ days

More PHP date weirdness, this time in the Cargo extension for MediaWiki:

+		// 'o' is better than 'Y' because it does not add leading
+		// zeroes to years with fewer than four digits.
+		// For some reason, though, this fails for some years -
+		// returning one year lower than it's supposed to - unless you
+		// add the equivalent of 3 days or more to the number of
+		// seconds. Is that a leap day thing? Weird PHP bug? Who knows.
+		// Anyway, it's easy to get around.
+		$yearString = date( 'o', $seconds + 300000 );
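
For what it’s worth, this probably isn’t a bug: ‘o’ is PHP’s ISO-8601 week-numbering year, which for the first few days of January can belong to the previous year. Since ISO week 1 always contains 4 January, nudging the timestamp forward by a few days hides the mismatch. A quick demonstration in plain PHP (nothing Cargo-specific):

<?php
// 1 January 2017 was a Sunday, which falls in ISO week 52 of 2016:
echo date( 'Y', strtotime( '2017-01-01' ) ), "\n"; // 2017
echo date( 'o', strtotime( '2017-01-01' ) ), "\n"; // 2016

// Adding ~3½ days lands on 4 January, which is always in ISO week 1:
echo date( 'o', strtotime( '2017-01-01' ) + 300000 ), "\n"; // 2017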

MediaWiki with two database servers

I’ve been trying to replicate locally a bug with MediaWiki’s GlobalPreferences extension. The bug is about the increased number of database reads that happen when the extension is loaded, and the increase happens not on the database table that stores the global preferences (as might be expected) but rather on the ‘local’ tables. However, locally I’ve had all of these running on the same database server, which makes it hard to use the standard monitoring tools to see the difference; so I set things up with two database servers locally.

Firstly, this was a matter of starting a new MySQL server in a Docker container (accessible at 127.0.0.1:3305 and with its data in a local directory so I could destroy and recreate the container as required):

docker run -it -e MYSQL_ROOT_PASSWORD=pwd123 -p3305:3306 -v$PWD/mysqldata:/var/lib/mysql mysql

(Note that because we’re keeping local data, root’s password is only set on the first set-up, and so the MYSQL_ROOT_PASSWORD can be left off future invocations of this command.)
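
Before MediaWiki can use it, the second database needs to exist on the new server. A one-off sketch using PHP’s PDO (with the root password from the docker command above; the global preferences table itself is then created per the GlobalPreferences installation instructions):

<?php
// Create the 'wikimeta' database on the new server (port 3305).
$pdo = new PDO( 'mysql:host=127.0.0.1;port=3305', 'root', 'pwd123' );
$pdo->exec( 'CREATE DATABASE IF NOT EXISTS wikimeta' );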

Then it’s a matter of setting up MediaWiki to use the two servers:

$wgLBFactoryConf = [
	'class' => 'LBFactory_Multi',
	'sectionsByDB' => [
		// Map of database names to section names.
		'mediawiki_wiki1' => 's1',
		'wikimeta' => 's2',
	],
	'sectionLoads' => [
		// Map of sections to server-name/load pairs.
		'DEFAULT' => [ 'localdb'  => 0 ],
		's1' => [ 'localdb'  => 0 ],
		's2' => [ 'metadb' => 0 ],
	],
	'hostsByName' => [
		// Map of server-names to IP addresses (and, in this case, ports).
		'localdb' => '127.0.0.1:3306',
		'metadb' => '127.0.0.1:3305',
	],
	'serverTemplate' => [
		'dbname'        => $wgDBname,
		'user'          => $wgDBuser,
		'password'      => $wgDBpassword,
		'type'          => 'mysql',
		'flags'         => DBO_DEFAULT,
		'max lag'       => 30,
	],
];
$wgGlobalPreferencesDB = 'wikimeta';
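
To actually see the reads being split, one crude approach (a sketch; it assumes root access to both servers with the credentials from above, so adjust to suit) is to compare each server’s Com_select counter before and after loading a wiki page:

<?php
// Print the cumulative SELECT count for each of the two servers.
// Run it, load a page in the browser, run it again, and compare.
foreach ( [ 'localdb' => 3306, 'metadb' => 3305 ] as $name => $port ) {
	$pdo = new PDO( "mysql:host=127.0.0.1;port=$port", 'root', 'pwd123' );
	$status = $pdo->query( "SHOW GLOBAL STATUS LIKE 'Com_select'" )->fetch();
	echo "$name: {$status['Value']} SELECTs so far\n";
}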

New MediaWiki extension: AutoCategoriseUploads

New MediaWiki extension: AutoCategoriseUploads. It “automatically adds categories to new file uploads based on keyword metadata found in the file. The following metadata types are supported: XMP (many file types, including JPG, PNG, PDF, etc.); IPTC (JPG); ID3 (MP3)”.

Unfortunately there’s no code yet in the repository, so there’s nothing to test. Sounds interesting though.

Extension:DocBookExport

There’s a new extension recently added to mediawiki.org, called DocBookExport. It provides a way of defining a book’s structure (a set of pages, plus a title and other metadata) and then piping the pages’ HTML through Pandoc and out into DocBook format, from where it can be turned into PDF or just downloaded as-is.
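
The heart of that pipeline is easy to approximate (a sketch only, not DocBookExport’s actual code; it assumes Pandoc is installed, and uses MediaWiki’s action=render to fetch a page’s bare HTML; the wiki URL and page title are placeholders):

<?php
// Fetch a page's rendered HTML and convert it to DocBook via Pandoc.
$url = 'http://localhost/wiki/index.php?title=Some_page&action=render';
file_put_contents( '/tmp/page.html', file_get_contents( $url ) );
passthru( 'pandoc -f html -t docbook /tmp/page.html -o /tmp/page.xml' );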

There are a few issues with getting the extension to run (e.g. it wants to write to its own directory, rather than a normal place for temporary files), and I haven’t actually managed to get it fully functioning. But the idea is interesting. Certainly, there are some limitations with Pandoc, but mostly it’s remarkably good at converting things.

It seems that DocBookExport, and any other MediaWiki export or format conversion system, works best when the wiki pages (and their templates etc.) are written with the output formats in mind. Then, one can avoid things such as web-only formatting conventions that make PDF (or epub, or man page) generation trickier.

CFB Folder 1 done

The first folder of the C.F. Barker Archives’ material is done: finished scanning and initial entry into ArchivesWiki. This is my attempt to use MediaWiki as a digital archive platform for physical records (and digitally-created ones, although they don’t feature as much in the physical folders). It’s reasonably satisfactory so far, although there’s lots that’s a bit frustrating. I’m attempting to document what I’m doing (in a Wikibook), and there’s more to figure out.

There are a few key parts to it; two stand out as a bit weird.

Firstly, access control is structured so that a completely separate wiki is created for each group of access required. This can make it tricky to link things together, but it makes for much clearer separation of privacy, and almost removes the possibility of things being inadvertently made public when they shouldn’t be.

The second is that the File namespace is not used at all for file descriptions. Files are treated more like ‘attachments’, and their metadata is kept on main-namespace pages, where the files are displayed. This means that files are not considered to be archival items (except, of course, when they are: i.e. digitally-created ones!), but just representations of them; for example, multiple file types or differently-cropped photos can all appear on a single item’s record. The basic idea is to have a single page that encapsulates the entire item. It doesn’t matter whether the item is a single photograph, and the system also works when the ‘item’ is an aggregate of, for example, a whole box of photos being accessioned into ArchivesWiki.

Display Title extension

The MediaWiki Display Title extension is pretty cool. It uses a page’s display title in all links to that page. That might not sound like much, but it’s really useful to only have to change the title in one place and have it show correctly all over the wiki. (This is much the same as DokuWiki with the useheading configuration variable set to 1.)

This is the sort of extension that I really like: it does a small thing, but does it well, and it makes sense as an addition to the core software. It’s not trying to do something completely different and just sit on top of or inside MediaWiki. It’s also not something that everyone would want, and so does belong as an extension and not an addition to core (even though the display title feature is part of core).

The other thing the Display Title extension provides is a parser function for retrieving the display title of any page: {{#getdisplaytitle:A page name}}, so you can use the display title without creating a link.

Jazz and the MediaWiki package

And rain, I mustn’t forget the rain. I’m worrying about the roof, although far less than I used to (it’s a different roof). The jazz is the radio; it’s on.

But the main point this morning is exploring the mediawiki-lts package maintained by Legoktm. I’ve been meaning to look at it for a while, and switch my (non-playground) wikis over to it, but there’s never enough time. Not that there’s enough time now, but I’m just trying to get it running locally for two wikis (yes, the smallest possible farm).

So, in simple steps, I first added the PPA:

sudo add-apt-repository ppa:legoktm/mediawiki-lts

This created /etc/apt/sources.list.d/legoktm-ubuntu-mediawiki-lts-xenial.list. Then I updated the package info:

sudo apt-get update

And installed the package:

sudo apt install mediawiki

At this point, the installation prompt for MediaWiki 1.27.3 was available at http://localhost/mediawiki/ (which luckily doesn’t conflict with anything I already had locally) and I stepped through the installer, creating a new database and DB user via phpMyAdmin as I went, and answering all the questions appropriately. (It’s actually been a while since I last saw the installer properly.) The only tricky thing I found was that it asks for the “Directory for deleted files” but not for the directory for all files. That matters because I want the files to be stored in a particular place and not in /usr/share/mediawiki/images/, especially as I want there to be two different wikis that don’t share files.

I made a typo in my database username in the installation form, and got an “Access denied for user x to database y” error. I hit the browser’s back button, and then the installer’s back buttons, to go back to the relevant page in the installer, fixed the typo, and proceeded. It remembered everything correctly, and this time installed the database tables, with only one error: “Notice: JobQueueGroup::__destruct: 1 buffered job(s) of type(s) RecentChangesUpdateJob never inserted. in /usr/share/mediawiki/includes/jobqueue/JobQueueGroup.php on line 447”. Didn’t seem to matter.

At the end of the installer, it prompted me to download LocalSettings.php and put it at /etc/mediawiki/LocalSettings.php, which I did:

 sudo mv ~/LocalSettings.php /etc/mediawiki/.
 sudo chown root:root /etc/mediawiki/LocalSettings.php
 sudo chmod 644 /etc/mediawiki/LocalSettings.php

And then I had a working wiki at http://localhost/mediawiki/index.php!

Configuring

I wanted a different URL, so edited /etc/apache2/sites-available/000-default.conf (in order to not modify the package-provided /etc/mediawiki/mediawiki.conf) to add:

Alias /mywiki /var/lib/mediawiki

And changed the following in LocalSettings.php:

$wgScriptPath = "/mywiki";

The multiple wikis will have to wait until later, as will the backup regime.
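
For when that time comes, the usual trick is to have the single /etc/mediawiki/LocalSettings.php dispatch on the requested hostname. A minimal sketch, with made-up hostnames and database names:

// At the top of /etc/mediawiki/LocalSettings.php: pick per-wiki
// settings based on which host is being requested.
$host = isset( $_SERVER['SERVER_NAME'] ) ? $_SERVER['SERVER_NAME'] : 'localhost';
if ( $host === 'wiki1.localhost' ) {
	$wgDBname = 'wiki1';
	$wgScriptPath = '/wiki1';
} else {
	$wgDBname = 'wiki2';
	$wgScriptPath = '/wiki2';
}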

MediaWiki Documentation Day 2017

It’s MediaWiki Documentation Day 2017!

So I’ve been documenting a couple of things, and I’ve added a bit to the Xtools manual.

The latter is actually really useful: not so much from the end-user’s point of view, because I dare say they’ll never read it, but because I always like writing documentation before coding. It makes the goal so much clearer in my mind, and then the coding is much easier. With agreed-upon documentation, writing tests is easier; with tests written, writing the code is easier.

Time for a beer — and I’ll drink to DFD (document first development)! Oh, and semantic linebreaks are great.

Editing MediaWiki pages in an external editor

I’ve been working on a MediaWiki gadget lately, for editing Wikisource authors’ metadata without leaving the author page. It’s fun working with and learning more about OOjs UI, but it’s also a pain, because gadget code is kept in JavaScript pages in the MediaWiki namespace, and so every single time you want to change something it’s a matter of saving the whole page, then clicking ‘edit’ again, and scrolling back down to find the spot you were at. The other end of things—the re-loading of whatever test page is running the gadget—is annoying and slow enough, without having to do much the same thing at the source end too.

So I’ve added a feature to the ExternalArticles extension that allows a whole directory of text files to be imported at once (namespaces are handled as subdirectories). More importantly, it also ‘watches’ the directories, and every time a file is updated (e.g. with Ctrl-S in a text editor or IDE) it is re-imported. This means I can have MediaWiki:Gadget-Author.js and MediaWiki:Gadget-Author.css open in PhpStorm and just edit from there. I even have these files open inside a MediaWiki project, so autocompletion and documentation look-up work as usual for all the library code. It’s even quite a speedy set-up, luckily: I haven’t yet noticed having to wait at any time between saving some code, alt-tabbing to the browser, and hitting F5.
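
The watching part is conceptually simple. A minimal sketch of the idea (not the actual ExternalArticles code): poll the files’ modification times and re-run core’s importTextFiles.php maintenance script on anything that has changed (its flags may vary between MediaWiki versions):

<?php
// Poll a directory and re-import any file that changes. Page titles
// are taken from filenames, e.g. 'MediaWiki:Gadget-Author.js'.
$dir = '/path/to/wiki-pages'; // hypothetical location
$seen = [];
while ( true ) {
	foreach ( glob( "$dir/*" ) as $file ) {
		$mtime = filemtime( $file );
		if ( !isset( $seen[$file] ) || $seen[$file] < $mtime ) {
			$seen[$file] = $mtime;
			passthru( 'php maintenance/importTextFiles.php --overwrite '
				. escapeshellarg( $file ) );
		}
	}
	clearstatcache(); // so filemtime() sees fresh values next time round
	sleep( 1 );
}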

I dare say my bodged-together script has many flaws, but it’s working for me for now!

What goes Where on the Web

Every now and then I recap where and what I store online. Today I do so again, while I’m rather feeling that there should be discrete and specific tools for each of these things.

Firstly there are the self-hosted items:

  1. WordPress for blogging (where photo and file attachments should be customized to the exact use in question, not linked from external sites). It’s also my OpenID provider.
  2. Piwigo as the primary location for all photographs.
  3. MoonMoon for feed reading (and, hopefully one day, archiving).
  4. MediaWiki for family history sites that are closed-access.
  5. My personal DokuWiki for things that need to be collaboratively edited.

Then the third-party hosts:

  1. OpenStreetMap for map data (GPX traces) and blogging about map-making.
  2. Wikimedia Commons for media of general interest.
  3. The NLA’s Trove for correcting newspaper texts.
  4. Wikisource as a library.
  5. Twitter (although I’m not really sure why I list this here at all).

Finally, I’m still trying to figure out the best system for:

  1. Public family history research. There’s some discussion about this on Meta.