Proper namespace handling for WikiImporter
author This, that and the other <at.light@live.com.au>
Wed, 10 Dec 2014 11:24:47 +0000 (22:24 +1100)
committer This, that and the other <at.light@live.com.au>
Wed, 10 Dec 2014 11:24:47 +0000 (22:24 +1100)
commit 37b4cd5da2c3adfd0207ad74a2212fe7292341ca
tree 66e6f5a5aa07d50e721603c3e5c43bed0268f84d
parent b0b745537265d4a673a2575245e0a8604b486ea0
Proper namespace handling for WikiImporter

Up until now, the import backend has tried to resolve titles in the XML
data using the regular Title class. This is a disastrous idea, as local
namespace names often do not match foreign namespace titles.

There is enough metadata present in XML dumps generated by modern MW
versions for the target namespace ID and name to be reliably determined.
This metadata is contained in the <siteinfo> and <ns> tags, which
(unbelievably enough) were totally ignored by WikiImporter until now.
Fallbacks are provided for older XML dump versions which may be missing
some or all of this metadata.
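
The resolution order described above can be illustrated with a rough,
language-neutral sketch. It is written in Python purely for brevity and
is not the PHP code in this change; the names ForeignTitleSketch and
resolve_foreign_title, and the exact fallback behaviour, are assumptions
made for the example. It assumes the <siteinfo> namespaces have already
been parsed into a map from namespace ID to namespace name.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ForeignTitleSketch:
    """Hypothetical stand-in for a title as it exists on the foreign wiki."""
    namespace_id: Optional[int]  # None means the ID could not be determined
    namespace_name: str          # "" for the main namespace or when unknown
    page_name: str               # title text without the namespace prefix


def resolve_foreign_title(
    full_title: str,
    foreign_namespaces: Dict[int, str],
    ns_id: Optional[int] = None,
) -> ForeignTitleSketch:
    """Resolve a title using only metadata from the dump itself."""
    # Case 1: the page carries an <ns> value, so the ID is known directly.
    if ns_id is not None:
        name = foreign_namespaces.get(ns_id, "")
        prefix = name + ":"
        if name and full_title.startswith(prefix):
            return ForeignTitleSketch(ns_id, name, full_title[len(prefix):])
        return ForeignTitleSketch(ns_id, name, full_title)

    # Case 2: no <ns>, but <siteinfo> lists the foreign namespace names,
    # so the prefix can be matched against that list.
    prefix, sep, rest = full_title.partition(":")
    if sep:
        for candidate_id, name in foreign_namespaces.items():
            if name and name == prefix:
                return ForeignTitleSketch(candidate_id, name, rest)

    # Case 3: a namespace list exists but nothing matched, so treat the
    # page as belonging to the main namespace.
    if foreign_namespaces:
        return ForeignTitleSketch(0, "", full_title)

    # Case 4: no metadata at all (very old dump): the ID stays unknown.
    return ForeignTitleSketch(None, "", full_title)


# Example with namespace names as a German-language wiki might list them.
german = {0: "", 1: "Diskussion", 4: "Wikipedia", 10: "Vorlage"}
with_ns = resolve_foreign_title("Vorlage:Infobox", german, ns_id=10)
without_ns = resolve_foreign_title("Vorlage:Infobox", german)
no_metadata = resolve_foreign_title("Vorlage:Infobox", {})
```

The point of the sketch is that the local wiki's namespace names are
never consulted: "Vorlage" is recognised because the dump says so.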

The ForeignTitle class is introduced. This is intended specifically for
the resolution of titles on foreign wikis. In the future, an
InterwikiTitle class could be added, which would inherit from ForeignTitle
and add members for the interwiki prefix and fragment.

Factory classes to generate ForeignTitle objects from string data, and
Title objects from ForeignTitle objects, are also added.
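
As a rough illustration of the second kind of factory (foreign title in,
local title out), the sketch below maps by namespace ID rather than by
name. It is Python pseudocode for the idea only, not the PHP classes
added here; ForeignTitleSketch, to_local_title and the fallback rule are
hypothetical names and assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ForeignTitleSketch:
    """Hypothetical stand-in for a title as it exists on the foreign wiki."""
    namespace_id: Optional[int]
    namespace_name: str
    page_name: str


def to_local_title(
    foreign: ForeignTitleSketch, local_namespaces: Dict[int, str]
) -> str:
    """Build a local title string from a foreign title.

    If the foreign namespace ID is known and also exists locally, the
    local name for that ID is used. Otherwise (an assumption of this
    sketch) the foreign prefix is kept as part of the page name.
    """
    ns_id = foreign.namespace_id
    if ns_id is not None and ns_id in local_namespaces:
        local_name = local_namespaces[ns_id]
        if local_name:
            return f"{local_name}:{foreign.page_name}"
        return foreign.page_name

    # Fallback: no usable ID, so keep the foreign text unchanged.
    if foreign.namespace_name:
        return f"{foreign.namespace_name}:{foreign.page_name}"
    return foreign.page_name


# Local wiki uses English namespace names and has no namespace 100.
local = {0: "", 1: "Talk", 10: "Template"}
template = to_local_title(ForeignTitleSketch(10, "Vorlage", "Infobox"), local)
portal = to_local_title(ForeignTitleSketch(100, "Portal", "Physik"), local)
article = to_local_title(ForeignTitleSketch(0, "", "Berlin"), local)
```

Keeping the two steps separate means different import strategies can be
swapped in on the local side without touching how the dump is parsed.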

The 'AfterImportPage' hook has been modified so the second argument is a
ForeignTitle object instead of a Title (the documentation was wrong:
it was never a string). LiquidThreads, SMW and FacetedSearch all use this
hook but none of them use the $origTitle parameter.

Bug: T32723
Bug: T42192
Change-Id: Iaa58e1b9fd7287cdf999cef6a6f3bb63cd2a4778
19 files changed:
autoload.php
docs/hooks.txt
includes/Import.php
includes/specials/SpecialImport.php
includes/title/ForeignTitle.php [new file with mode: 0644]
includes/title/ForeignTitleFactory.php [new file with mode: 0644]
includes/title/ImportTitleFactory.php [new file with mode: 0644]
includes/title/NaiveForeignTitleFactory.php [new file with mode: 0644]
includes/title/NaiveImportTitleFactory.php [new file with mode: 0644]
includes/title/NamespaceAwareForeignTitleFactory.php [new file with mode: 0644]
includes/title/NamespaceImportTitleFactory.php [new file with mode: 0644]
includes/title/SubpageImportTitleFactory.php [new file with mode: 0644]
tests/phpunit/includes/ImportTest.php
tests/phpunit/includes/title/ForeignTitleTest.php [new file with mode: 0644]
tests/phpunit/includes/title/NaiveForeignTitleFactoryTest.php [new file with mode: 0644]
tests/phpunit/includes/title/NaiveImportTitleFactoryTest.php [new file with mode: 0644]
tests/phpunit/includes/title/NamespaceAwareForeignTitleFactoryTest.php [new file with mode: 0644]
tests/phpunit/includes/title/NamespaceImportTitleFactoryTest.php [new file with mode: 0644]
tests/phpunit/includes/title/SubpageImportTitleFactoryTest.php [new file with mode: 0644]