Fremantle
· Wikimedia · Western Australia · datasets · open data ·
I noticed the other day that the Western Australian Biographical Index is licensed under CC-BY, so I thought I'd try to copy relevant entries to Freopedia (and other things). I downloaded the 18 CSV files:
A-final.csv DE-final.csv H-final.csv L-final_edited.csv O-final.csv S-final.csv B-final.csv F-final.csv IJ-final.csv M-final.csv PQ-final.csv T-final.csv C-final.csv G-final.csv K-final.csv N-final.csv R-final.csv UVXYZ-final.csv
Combined them into one file, keeping the header row from the first file only and dropping the others (after confirming that every file actually has one):
$ awk '(NR == 1) || (FNR > 1)' *.csv > wabi.csv
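One way to do the header check before combining is sketched below. It assumes the files are plain text with the header on the first line; the function name distinct_headers is made up for this example.

```shell
# Print each distinct first line found in the given files, prefixed by the
# number of files that start with it. One line of output means that all
# files share the same header row.
distinct_headers() {
    awk 'FNR == 1' "$@" | sort | uniq -c
}

# Usage, from the directory holding the downloads:
#   distinct_headers *-final*.csv
# The narrower pattern (rather than *.csv) is only a precaution, so that an
# existing wabi.csv from an earlier run is not picked up as an input.
```

If this prints more than one line, the awk command above would silently drop a header that differs from the one it keeps, so it is worth looking at before combining.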
Importing this into OpenRefine gave 85,403 records.
Duplicates were found by sorting, applying "reorder rows permanently", and then "edit cells" > "blank down". The resulting blanks can then be faceted on, which turned up 421 duplicates, e.g. PQ/P2626 (where the second of the two records below is the correct one):
POCOCK Ruth Elsie May b. 1900. m. N.S.W. 1928 Edwin Lennard MINCHIN
vs.
PLUSH Edward, son of Thomas Hall (artist). arr 18.3.1886 per Albany (steerage) from SA - listed as G. Plush. m. 1.1.1890 (Perth C/E) Amelia GOLDING, dtr. of William (gardener). PERTH painter. Joined the Police force 1886.
That meant there were 84,982 unique cards (85,403 minus the 421 duplicates).
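The duplicate count can also be roughly cross-checked outside OpenRefine. This is only a sketch and rests on two assumptions that I have not checked against the real files: that the card identifier is the first column, and that no record contains a line break inside a quoted field. The function names are hypothetical.

```shell
# List each identifier that appears on more than one row.
# Assumes the identifier is column 1 and every record is a single line.
duplicate_ids() {
    awk -F, 'NR > 1 { print $1 }' "$1" | sort | uniq -d
}

# Count the surplus rows: for every identifier, the number of rows beyond
# the first. This is the figure that "blank down" plus a blank facet gives.
count_extra_rows() {
    awk -F, 'NR > 1 { seen[$1]++ }
             END { extra = 0
                   for (id in seen) extra += seen[id] - 1
                   print extra }' "$1"
}

# Usage:
#   duplicate_ids wabi.csv
#   count_extra_rows wabi.csv
```

Note that the two functions measure different things: an identifier that appears three times is listed once by duplicate_ids but adds two to count_extra_rows.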
These were imported into a Mix'n'Match catalogue: https://mix-n-match.toolforge.org/#/catalog/6490 (the card text had to be truncated for this).
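A truncation step along these lines could look like the following. It is not the method actually used for the catalogue, just a minimal sketch: the length limit is passed in as a parameter because the right value depends on what Mix'n'Match accepts, and truncate_text is a made-up name.

```shell
# Shorten each input line to at most MAX characters (MAX should be well
# above 3). Longer lines are cut, backed up to the previous space so that
# no word is split (this can drop one whole word), and marked with "...".
# Whether "characters" means bytes or Unicode characters depends on the
# awk implementation and locale.
truncate_text() {
    awk -v max="$1" '{
        if (length($0) <= max) { print; next }
        s = substr($0, 1, max - 3)
        sub(/ [^ ]*$/, "", s)
        print s "..."
    }'
}

# Usage, with a hypothetical limit of 250 and one card text per line:
#   truncate_text 250 < card_texts.txt
```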
I proposed a new property on Wikidata, and it was approved and created a week or so after.
Now the task is to link items to the WABI, probably starting with any mentioning Fremantle.