New maintenance script to clean up rows with invalid DB keys
author: This, that and the other <at.light@live.com.au>
Fri, 10 Mar 2017 13:27:27 +0000 (00:27 +1100)
committer: This, that and the other <at.light@live.com.au>
Fri, 10 Mar 2017 13:27:27 +0000 (00:27 +1100)
commit: 6519c42d248a78d2d42edee1beb21f926d227044
tree: e3a253d152f9be7a37e6097e4a25edf2dd196cae
parent: 97906621aa0a2abc566d831dd83a725c4c13ade1

The TitleValue constructor, used by the link cache among other things,
throws an exception for DB keys that fail a simple sanity test (a key fails
if it starts or ends with _, or contains a space, tab, CR or LF character).
This has broken certain special pages on a number of WMF sites; see T99736,
T146778 and T155091.
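
For illustration only, the sanity test described above can be sketched as a
small predicate. This is not the TitleValue code itself (which is PHP); the
function name is_invalid_db_key and the constant FORBIDDEN_CHARS are
hypothetical, and only the conditions listed in this message are checked.

```python
# Characters that this commit message lists as invalid anywhere in a DB key.
# (Hypothetical constant name; the real check lives in MediaWiki's PHP code.)
FORBIDDEN_CHARS = (" ", "\t", "\r", "\n")


def is_invalid_db_key(db_key: str) -> bool:
    """Return True if db_key fails the sanity test described above."""
    # A key must not start or end with an underscore.
    if db_key.startswith("_") or db_key.endswith("_"):
        return True
    # A key must not contain a space, tab, CR or LF character.
    return any(ch in db_key for ch in FORBIDDEN_CHARS)
```

Under these assumptions, "Foo_bar" passes, while "_Foo", "Foo_" and
"Foo bar" are flagged as invalid.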

The new cleanupInvalidDbKeys.php script allows these bogus entries to be
removed from the DB, making sure these exceptions won't be thrown in the
future. It cleans the title columns of the page, archive, redirect,
logging, category, protected_titles, recentchanges, watchlist, pagelinks,
templatelinks, and categorylinks tables.

The script doesn't support batching; most wikis should have fewer than 500
broken entries in each table. If need be, the script can be run several
times.

To make the LIKE queries work properly, I had to fix the broken escaping
behaviour of Database::buildLike(), which previously had a habit of
double-escaping its input. An ESCAPE clause is now added to change the
escape character from the problematic default backslash, and tests have
been added to cover the changes.
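
As a rough sketch of the idea (not the actual Database::buildLike()
implementation), the following uses Python's sqlite3 module to show how an
explicit ESCAPE clause lets a LIKE pattern match a literal underscore. The
backtick escape character, the demo_titles table and all function names
here are assumptions made for the example.

```python
import sqlite3

# Assumption for this sketch: a backtick replaces backslash as escape char.
ESCAPE_CHAR = "`"


def escape_like(literal: str) -> str:
    """Escape LIKE wildcards in literal so they match themselves."""
    # Escape the escape character first, so that the escape characters
    # added for % and _ below are not themselves escaped a second time.
    out = literal.replace(ESCAPE_CHAR, ESCAPE_CHAR * 2)
    out = out.replace("%", ESCAPE_CHAR + "%")
    out = out.replace("_", ESCAPE_CHAR + "_")
    return out


def find_titles_with_prefix(conn: sqlite3.Connection, prefix: str) -> list:
    """Return titles in the hypothetical demo_titles table starting with prefix."""
    pattern = escape_like(prefix) + "%"  # trailing % is a real wildcard
    sql = (
        "SELECT title FROM demo_titles "
        "WHERE title LIKE ? ESCAPE '`' ORDER BY title"
    )
    return [row[0] for row in conn.execute(sql, (pattern,))]


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with one title that has a leading underscore."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo_titles (title TEXT)")
    conn.executemany(
        "INSERT INTO demo_titles VALUES (?)",
        [("_Leading",), ("Normal_title",), ("XLeading",)],
    )
    return conn
```

Calling find_titles_with_prefix(make_demo_db(), "_") selects only the row
whose title begins with a literal underscore; without the escaping step,
the bare _ would act as a single-character wildcard and match every row.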

Bug: T155091
Change-Id: I908e795e884e35be91852c0eaf056d6acfda31d8
autoload.php
includes/libs/rdbms/database/Database.php
includes/libs/rdbms/database/DatabaseMssql.php
includes/libs/rdbms/database/DatabaseMysqlBase.php
includes/libs/rdbms/database/DatabaseSqlite.php
maintenance/cleanupInvalidDbKeys.php [new file with mode: 0644]
tests/phpunit/includes/db/DatabaseSQLTest.php