title.txt

The MediaWiki software's "Title" class represents article
titles, which are used for many purposes: as the human-readable
text title of the article, in the URL used to access the article,
the wikitext link to the article, the key into the article
database, and so on. The class is instantiated from one of
these forms and can be queried for the others, and for other
attributes of the title. This is intended to be an
immutable "value" class, so there are no mutator functions.

To get a new instance, call Title::newFromText(). Once instantiated,
the non-static accessor methods can be used, such as getText(),
getDBKey(), getNamespace(), etc. Note that Title::newFromText() may
return null if the text is illegal according to the rules below.

The prefix rules: a title consists of an optional interwiki
prefix (such as "m:" for meta or "de:" for German), followed
by an optional namespace, followed by the remainder of the
title. Both interwiki prefixes and namespace prefixes have
the same rules: they contain only letters, digits, space, and
underscore, must start with a letter, are case-insensitive,
and treat spaces and underscores as interchangeable. Prefixes
end with a ":". A prefix is only recognized if it is one of those
specifically allowed by the software. For example, "de:name"
is a link to the article "name" in the German Wikipedia, because
"de" is recognized as one of the allowable interwikis. The
title "talk:name" is a link to the article "name" in the "talk"
namespace of the current wiki, because "talk" is a recognized
namespace. Both may be present, and if so, the interwiki must
come first, for example, "m:talk:name". If a title begins with
a colon as its first character, no prefixes are scanned for,
and the colon is just removed. Note that because of these
rules, it is possible to have articles with colons in their
names. "E. Coli 0157:H7" is a valid title, as is "2001: A Space
Odyssey", because "E. Coli 0157" and "2001" are not valid
interwikis or namespaces.
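
The prefix rules above can be modelled in a few lines. The following
is a minimal Python sketch, not MediaWiki's own PHP code: the
INTERWIKIS and NAMESPACES tables and the split_prefixes() name are
hypothetical, and "letters" is narrowed to ASCII letters for brevity.

```python
import re

# Hypothetical prefix tables; a real wiki has its own configured lists.
INTERWIKIS = {"m", "de"}
NAMESPACES = {"talk", "user"}

# A candidate prefix starts with a letter, continues with letters, digits,
# spaces or underscores, and ends with a colon. ASCII-only is an assumption.
PREFIX_RE = re.compile(r"^([A-Za-z][A-Za-z0-9 _]*):(.*)$", re.DOTALL)


def _canonical(prefix):
    """Fold case and treat spaces and underscores as the same character."""
    return prefix.strip(" _").replace(" ", "_").lower()


def split_prefixes(text):
    """Return (interwiki, namespace, remainder) for a raw title string."""
    interwiki = namespace = None
    if text.startswith(":"):
        # A leading colon disables prefix scanning and is simply dropped.
        return interwiki, namespace, text[1:]
    rest = text
    match = PREFIX_RE.match(rest)
    # The interwiki prefix, if any, must come first.
    if match and _canonical(match.group(1)) in INTERWIKIS:
        interwiki, rest = _canonical(match.group(1)), match.group(2)
        match = PREFIX_RE.match(rest)
    # Only a recognized namespace is split off; anything else stays put.
    if match and _canonical(match.group(1)) in NAMESPACES:
        namespace, rest = _canonical(match.group(1)), match.group(2)
    return interwiki, namespace, rest
```

With these tables, split_prefixes("m:talk:name") separates both
prefixes, while "E. Coli 0157:H7" and "2001: A Space Odyssey" come
back whole because the text before the colon is not a recognized
prefix.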

It is not possible to have an article whose bare name includes
a namespace or interwiki prefix.

An initial colon in a title listed in wiki text may, however,
suppress special handling for interlanguage links, image links,
and category links. It is also used to indicate the main
namespace in template inclusions.

Once prefixes have been stripped, the rest of the title is processed
this way:

* Spaces and underscores are treated as equivalent and each
  is converted to the other in the appropriate context (underscore in
  URL and database keys, spaces in plain text).
* Multiple consecutive spaces are converted to a single space.
* Leading or trailing space is removed.
* If $wgCapitalLinks is enabled (the default), the first letter is
  capitalised, using the capitalisation function of the content language
  object.
* The Unicode characters LRM (U+200E) and RLM (U+200F) are silently
  stripped.
* Invalid UTF-8 sequences or instances of the replacement character
  (U+FFFD) are considered illegal.
* A percent sign followed by two hexadecimal characters is illegal.
* Anything that looks like an XML/HTML character reference is illegal.
* Any character not matched by the $wgLegalTitleChars regex is illegal.
* Zero-length titles (after whitespace stripping) are illegal.
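
The processing steps listed above can be sketched as one function.
This is an illustrative Python model under stated assumptions, not
the MediaWiki implementation: normalize_remainder() and to_db_key()
are hypothetical names, ILLEGAL_CHARS is a rough stand-in for the
configurable $wgLegalTitleChars check, and Python's str.upper()
stands in for the content language's capitalisation function.

```python
import re

# Assumed stand-in for the $wgLegalTitleChars check: reject a handful of
# ASCII characters and control codes. The real setting is configurable.
ILLEGAL_CHARS = re.compile(r"[#<>\[\]|{}\x00-\x1f\x7f]")
PERCENT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")
CHAR_REFERENCE = re.compile(r"&[A-Za-z0-9]+;|&#[0-9]+;|&#[xX][0-9A-Fa-f]+;")


def normalize_remainder(raw, capital_links=True):
    """Return the text form of a title remainder, or None if it is illegal.

    `raw` is a bytes object so that invalid UTF-8 can be detected.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None  # invalid UTF-8 sequence
    if "\ufffd" in text:
        return None  # replacement character
    # LRM and RLM are silently stripped.
    text = text.replace("\u200e", "").replace("\u200f", "")
    # Spaces and underscores are equivalent; collapse runs, trim the ends.
    text = re.sub(r"[ _]+", " ", text).strip(" ")
    if not text:
        return None  # zero-length after whitespace stripping
    if PERCENT_ESCAPE.search(text) or CHAR_REFERENCE.search(text):
        return None
    if ILLEGAL_CHARS.search(text):
        return None
    if capital_links:
        # Stand-in for the content language's capitalisation function.
        text = text[0].upper() + text[1:]
    return text


def to_db_key(text):
    """Convert the plain-text form to the underscore form used in keys."""
    return text.replace(" ", "_")
```

For instance, normalize_remainder(b"  foo__bar  baz ") yields
"Foo bar baz", and to_db_key() turns that into "Foo_bar_baz".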

All titles except special pages must be at most 255 bytes when
encoded with UTF-8, because that is the size of the database field.
Special page titles may be up to 512 bytes.
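
Because the limit is counted in bytes rather than characters, titles
in non-Latin scripts hit it sooner. A small Python sketch of the
check follows; fits_length_limit() is a hypothetical helper, and
treating both limits as inclusive upper bounds is an assumption
based on the stated field size.

```python
def fits_length_limit(db_key, is_special_page=False):
    """Check a title's UTF-8 byte length against the limits described above."""
    limit = 512 if is_special_page else 255  # assumed inclusive bounds
    return len(db_key.encode("utf-8")) <= limit
```

A title of 128 "é" characters is only 128 characters long but
occupies 256 bytes in UTF-8, so this check rejects it for an
ordinary page.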

Note that Unicode Normal Form C (NFC) is enforced by MediaWiki's user
interface input functions, and so titles will typically be in this
form.
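
To illustrate what NFC does to a title, here is a short Python
sketch using the standard unicodedata module. It only demonstrates
the normal form itself; to_nfc() is a hypothetical helper and says
nothing about how MediaWiki's input functions are written.

```python
import unicodedata


def to_nfc(text):
    """Return text in Unicode Normal Form C (canonical composition)."""
    return unicodedata.normalize("NFC", text)


# "e" followed by a combining acute accent (two code points) ...
decomposed = "Cafe\u0301"
# ... composes to the single precomposed character U+00E9.
composed = to_nfc(decomposed)
```

The two strings render identically, but only the composed one is in
NFC, which is why two visually equal inputs end up as the same
title.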

getArticleID() needs some explanation: for "internal" articles,
it returns the "page_id" field if the article exists, and 0
otherwise. For all external articles it returns 0. All of
the IDs for all instances of Title created during a request are
cached, so they can be looked up quickly while rendering wiki
text with lots of internal links. See linkcache.txt.
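
The caching behaviour described for getArticleID() can be pictured
with a small per-request cache. This Python sketch is purely
illustrative: LinkCacheSketch and its lookup callable are
hypothetical and do not mirror the interface documented in
linkcache.txt.

```python
class LinkCacheSketch:
    """Per-request cache of page IDs keyed by database key (illustrative)."""

    def __init__(self, lookup):
        # `lookup` is a hypothetical callable: db_key -> page_id or None.
        self._lookup = lookup
        self._ids = {}
        self.lookups = 0  # counts how often the backing store was consulted

    def get_article_id(self, db_key, is_external=False):
        if is_external:
            return 0  # external (interwiki) articles always report 0
        if db_key not in self._ids:
            self.lookups += 1
            # A missing page is cached as 0 so it is not looked up again.
            self._ids[db_key] = self._lookup(db_key) or 0
        return self._ids[db_key]


# Usage: a dict stands in for the page table.
pages = {"Main_Page": 1}
cache = LinkCacheSketch(pages.get)
first = cache.get_article_id("Main_Page")
second = cache.get_article_id("Main_Page")  # served from the cache
```

Repeated links to the same page cost one lookup, which is the point
of caching IDs while rendering text with many internal links.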