Use new externallinks.el_index_60 field
author Brad Jorsch <bjorsch@wikimedia.org>
Sat, 19 Nov 2016 00:50:43 +0000 (19:50 -0500)
committer Tim Starling <tstarling@wikimedia.org>
Mon, 12 Nov 2018 22:33:18 +0000 (22:33 +0000)
commit d65e96b7638279f7884dec827b75096bd8f2683a
tree 70cdc325efb6544faa3d6c2da2d075a5d37b2dbc
parent d4e88508689aebfc9653f3a260428458e6fe15ad

This adds a method to LinkFilter that builds the query conditions needed
to use the new field properly, and adjusts existing code to call it.
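
To illustrate what such query conditions might look like, here is a minimal
Python sketch. It is not the LinkFilter code from this patch. It assumes that
el_index_60 holds the first 60 characters of el_index and that a prefix search
constrains both columns. The helper names like_escape and like_conditions are
hypothetical.

```python
# Illustrative sketch only; not MediaWiki's implementation.
# Assumption: el_index_60 stores the first 60 characters of el_index.

def like_escape(text: str) -> str:
    """Escape LIKE wildcards, using backslash as the escape character."""
    return (
        text.replace("\\", "\\\\")
        .replace("%", "\\%")
        .replace("_", "\\_")
    )


def like_conditions(index_prefix: str, limit: int = 60) -> dict:
    """Return LIKE patterns for a prefix search, keyed by column name.

    The short column gets the prefix truncated to `limit` characters so an
    index on it can narrow the scan; the full column gets the whole prefix
    so that rows sharing only the first `limit` characters are filtered out.
    """
    short_prefix = index_prefix[:limit]  # truncate before escaping
    return {
        "el_index_60": like_escape(short_prefix) + "%",
        "el_index": like_escape(index_prefix) + "%",
    }


if __name__ == "__main__":
    for column, pattern in like_conditions("http://com.example.").items():
        print(column, "LIKE", repr(pattern))
```

The patterns are meant to be passed as bound parameters rather than spliced
into SQL text. The sketch counts characters, whereas a real schema may
truncate by bytes.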

This also takes the opportunity to clean up the calculation of el_index:
IPs are handled more sensibly and IDNs are canonicalized.

Weird edge cases for invalid hosts like "http://.example.com", and
corresponding searches like "http://*..example.com", are now handled
consistently instead of being treated as if the extra dot were omitted.
Explicit specification of the DNS root, as in "http://example.com./", is
canonicalized to the usual implicit specification.
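
The host handling described above can be sketched in Python as follows. This
is an illustration under stated assumptions, not the code from this patch.
It assumes the index form of a host is its labels in reverse order followed
by a trailing dot, and that IPv4 literals keep their natural octet order (one
reading of "handled more sensibly"). IDN canonicalization is left out because
the commit message does not say which form is chosen. The function name
index_host is hypothetical.

```python
# Illustrative sketch only; not MediaWiki's implementation.
import ipaddress


def index_host(host: str) -> str:
    """Return a reversed-label index form of `host` with a trailing dot."""
    host = host.lower()

    # Assumption: an IPv4 literal is not reversed like a DNS name.
    try:
        ipaddress.IPv4Address(host)
        return host + "."
    except ValueError:
        pass

    # An explicit DNS root ("example.com.") is the same host as the
    # implicit form ("example.com"), so drop one trailing dot.
    if host.endswith("."):
        host = host[:-1]

    # Empty labels (as in ".example.com") are kept, not silently dropped,
    # so an invalid host does not collide with the valid one.
    labels = host.split(".")
    return ".".join(reversed(labels)) + "."


if __name__ == "__main__":
    for h in ("www.example.com", "example.com.", ".example.com", "192.0.2.1"):
        print(repr(h), "->", repr(index_host(h)))
```

With this sketch, "example.com" and "example.com." map to the same value,
while ".example.com" maps to a different one because its empty label is
preserved.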

Note that this patch will break link searches for links where the host
is an IP or IDN until refreshExternallinksIndex.php is run.

Bug: T59176
Bug: T130482
Change-Id: I84d224ef23de22dfe179009ec3a11fd0e4b5f56d
19 files changed:
RELEASE-NOTES-1.33
autoload.php
includes/GlobalFunctions.php
includes/LinkFilter.php
includes/api/ApiQueryBase.php
includes/api/ApiQueryExtLinksUsage.php
includes/api/ApiQueryExternalLinks.php
includes/deferred/LinksUpdate.php
includes/installer/DatabaseUpdater.php
includes/parser/Parser.php
includes/specials/SpecialLinkSearch.php
maintenance/cleanupSpam.php
maintenance/deleteSelfExternals.php
maintenance/mssql/tables.sql
maintenance/refreshExternallinksIndex.php [new file with mode: 0644]
maintenance/tables.sql
tests/phpunit/includes/GlobalFunctions/GlobalTest.php
tests/phpunit/includes/LinkFilterTest.php
tests/phpunit/includes/parser/ParserMethodsTest.php