Parser

A generalized parser for web pages (not wikitext) that works in both the window and worker contexts. Parsing here means "extracting meaningful parts from the page", such as comments, sections, etc. Functions related to wikitext parsing go in module:wikitext.

Constructor

new Parser(context)

Source:

Create a page parser in the provided context.

Parameters:
Name Type Description
context Context

Collection of classes, functions, and other properties that perform the tasks we need in the current context (window or worker).
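
The idea of a context can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the accessor names `getChildren` and `getTagName`, the `MiniParser` class, and the two node shapes are hypothetical and are not the actual properties of Context. It only shows how the same traversal logic can run over two different element models when the context supplies the model-specific operations.

```javascript
// Hypothetical context shapes; the accessor names are illustrative only.
const windowLikeContext = {
  getChildren: (el) => Array.from(el.children),
  getTagName: (el) => el.tagName.toLowerCase(),
};
const workerLikeContext = {
  getChildren: (el) => el.childNodes.filter((n) => n.type === 'tag'),
  getTagName: (el) => el.name,
};

// Hypothetical stand-in for a parser that delegates to its context.
class MiniParser {
  constructor(context) {
    this.context = context;
  }

  // Collect heading elements (h1 to h6) under root, depth-first.
  findHeadings(root) {
    const found = [];
    const visit = (el) => {
      if (/^h[1-6]$/.test(this.context.getTagName(el))) found.push(el);
      this.context.getChildren(el).forEach(visit);
    };
    visit(root);
    return found;
  }
}

// Plain objects imitating the two element models.
const windowTree = {
  tagName: 'DIV',
  children: [
    { tagName: 'H2', children: [] },
    { tagName: 'P', children: [] },
  ],
};
const workerTree = {
  type: 'tag',
  name: 'div',
  childNodes: [
    { type: 'tag', name: 'h2', childNodes: [] },
    { type: 'text', data: 'Some text' },
  ],
};

const windowHeadings = new MiniParser(windowLikeContext).findHeadings(windowTree);
const workerHeadings = new MiniParser(workerLikeContext).findHeadings(workerTree);
```

Both calls find the single heading in their tree, although the trees expose children and tag names through different properties.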

Methods

(static) init()

Source:

Initialize the class.

Source:

For internal use. Get a user name from a link, along with some other data about a page name.

Parameters:
Name Type Description
element Element | external:Element
Returns:
Type
Parser~ProcessLinkReturn

createComment(signature, targets) → {*}

Source:

Create a comment instance.

Parameters:
Name Type Description
signature Element | external:Element
targets Array.<object>
Returns:
Type
*

createSection(heading, targets, subscriptions) → {*}

Source:

Create a section instance.

Parameters:
Name Type Description
heading object
targets Array.<object>
subscriptions Subscriptions
Returns:
Type
*

findHeadings() → {Array.<object>}

Source:

For internal use. Get all headings on the page.

Returns:
Type
Array.<object>

findSignatures() → {Array.<object>}

Source:

For internal use. Find signatures under the root element.

Characters before the author link, like "—", aren't considered a part of the signature.

Returns:
Type
Array.<object>

findUnsigneds() → {Array.<object>}

Source:

Find outputs of unsigned templates.

Returns:
Type
Array.<object>

getNestingLevel(element) → {number}

Source:

Get the nesting level of an element relative to the root element.

Parameters:
Name Type Description
element Element
Returns:
Type
number
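
One way to compute such a level is to count ancestors up to the root. The sketch below assumes exactly that; the real method may count only certain ancestors (for example, list elements), and it takes the root from the parser rather than as an argument. The `makeNode` helper, the `root` parameter, and the `-1` result for elements outside the root are hypothetical.

```javascript
// Hypothetical minimal node model: { tagName, parentNode }.
function makeNode(tagName, parentNode = null) {
  return { tagName, parentNode };
}

// Count how many ancestors separate `element` from `root`.
// Returns 0 for the root itself and -1 if `element` is not inside `root`.
function getNestingLevel(element, root) {
  let level = 0;
  for (let node = element; node; node = node.parentNode) {
    if (node === root) return level;
    level++;
  }
  return -1;
}

const root = makeNode('div');
const list = makeNode('dl', root);
const item = makeNode('dd', list);
const detached = makeNode('p');

const rootLevel = getNestingLevel(root, root);
const itemLevel = getNestingLevel(item, root);
const detachedLevel = getNestingLevel(detached, root);
```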

getTopElementsWithText(element, onlyChildrenWithoutCommentLevelopt) → {object}

Source:

With code like this:

* Smth. [signature]
:: Smth. [signature]

one comment (preceded by :: in this case) creates its own list tree, not a subtree, even though it's a reply to a reply. So we dive as deep to the bottom of the hierarchy of nested lists as we can to get the top nodes with comment content (and therefore draw comment layers more accurately). One of the most complex tree structures is this:

 * Smth. [signature]
 :* Smth.
 :: Smth. [signature]

(seen here: https://ru.wikipedia.org/w/index.php?title=Википедия:Форум/Общий&oldid=103760740#201912010211_Mikhail_Ryazanov) It has a branchy structure that requires a tricky algorithm to be parsed correctly.

Parameters:
Name Type Attributes Default Description
element Element | external:Element
onlyChildrenWithoutCommentLevel boolean <optional> false
Returns:
Type
object

processAndRemoveDtMarkup(bootProcessopt)

Source:

For internal use. Remove some of the elements added by the DiscussionTools extension (even if it is disabled in user preferences) or move them away if the topic subscriptions feature of DT is enabled (to avoid errors being thrown in DT). Prior to that, extract data from them.

CD already parses comment links from notifications (which seems to be this markup's purpose for disabled DT) in BootProcess#processTargets(). Unless the elements prove useful to CD or other scripts, it's better to get rid of them rather than deal with them one by one while parsing.

Parameters:
Name Type Attributes Description
bootProcess BootProcess <optional>

splitParentAfterNode(node) → {Array.<(Element|external:Element)>}

Source:

Turn a structure like this

<dd>
  <div>Comment. [signature]</div>
  <ul>...</ul>
</dd>

into a structure like this

<dd>
  <div>Comment. [signature]</div>
</dd>
<dd>
  <ul>...</ul>
</dd>

by splitting the parent node of the given node and moving all the following nodes into the second node resulting from the split. If there are no following nodes, the split is not performed.

Parameters:
Name Type Description
node Element | external:Element

Reference node.

Returns:

The parent nodes resulting from the split (at least one).

Type
Array.<(Element|external:Element)>
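
The split described for splitParentAfterNode can be sketched on a plain-object tree. This is a minimal illustration under stated assumptions, not the actual implementation: the `makeNode` helper and the `{ tagName, parent, children }` node model are hypothetical stand-ins for Element, and the second parent is assumed to be a shallow copy of the first that is inserted right after it.

```javascript
// Hypothetical minimal node model standing in for Element.
function makeNode(tagName, children = []) {
  const node = { tagName, parent: null, children };
  children.forEach((child) => {
    child.parent = node;
  });
  return node;
}

// Split the parent of `node` after `node` and return the resulting parent(s).
function splitParentAfterNode(node) {
  const parent = node.parent;
  const index = parent.children.indexOf(node);

  // Detach every sibling that follows `node`.
  const following = parent.children.splice(index + 1);
  if (following.length === 0) return [parent];

  // Shallow copy of the parent that receives the following siblings.
  const clone = makeNode(parent.tagName, following);
  const grandparent = parent.parent;
  if (grandparent) {
    clone.parent = grandparent;
    const parentIndex = grandparent.children.indexOf(parent);
    grandparent.children.splice(parentIndex + 1, 0, clone);
  }
  return [parent, clone];
}

// <dl><dd><div>Comment. [signature]</div><ul>...</ul></dd></dl>
const div = makeNode('div');
const ul = makeNode('ul');
const dd = makeNode('dd', [div, ul]);
const dl = makeNode('dl', [dd]);

const parts = splitParentAfterNode(div);
// Nothing follows <ul> in its new parent, so no further split happens.
const noSplit = splitParentAfterNode(ul);
```

After the first call, `dl` holds two `dd` nodes: the original one with only the `div`, and a new one with the `ul`. The second call returns a single-element array because there is nothing to move.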

Type Definitions

ProcessLinkReturn

Source:
Properties:
Name Type Attributes Description
userName string

User name.

linkType string <nullable>

Link type (user, userTalk, contribs, userSubpage, userTalkSubpage, or any of these with Foreign at the end).

Type:
  • Array.<string>

Timestamp

Source:
Properties:
Name Type Attributes Description
element Element | external:Element
date Date
match object <optional>
Type:
  • object