CssToAttributeConverter
extends AbstractHtmlProcessor
This HtmlProcessor converts CSS declarations in style HTML attributes to the corresponding visual HTML attributes, e.g. it converts style="width: 100px" to width="100".
It only adds attributes; the style attribute itself is left untouched.
To trigger the conversion, call the convertCssToVisualAttributes method.
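The idea can be illustrated with PHP's DOM extension alone. The following is a minimal sketch, not this class's implementation: the function name `addWidthAttributes` and the regular expression are assumptions made for illustration, and only pixel widths are handled.

```php
<?php

/**
 * Hypothetical helper: copies a pixel width from each style attribute
 * to a width attribute and leaves the style attribute as it is.
 */
function addWidthAttributes(string $html): string
{
    $document = new DOMDocument();
    $document->loadHTML($html);

    $xPath = new DOMXPath($document);
    foreach ($xPath->query('//*[@style]') as $node) {
        if (!$node instanceof DOMElement) {
            continue;
        }
        // Only match a "width" declaration, not e.g. "min-width".
        $pattern = '/(?:^|;)\s*width\s*:\s*(\d+)px\s*(?:;|$)/i';
        if (preg_match($pattern, $node->getAttribute('style'), $matches) === 1) {
            $node->setAttribute('width', $matches[1]);
        }
    }

    return (string) $document->saveHTML();
}
```

Calling `addWidthAttributes('<html><body><table style="width: 100px"><tr><td>x</td></tr></table></body></html>')` should yield a table that carries both the original style attribute and an added width="100" attribute.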
Table of Contents
- CONTENT_TYPE_META_TAG = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
- DEFAULT_DOCUMENT_TYPE = '<!DOCTYPE html>'
- HTML_COMMENT_PATTERN = '/<!--[^-]*+(?:-(?!->)[^-]*+)*+(?:-->|$)/'
- regular expression pattern to match an HTML comment, including delimiters and modifiers
- HTML_TEMPLATE_ELEMENT_PATTERN = '%<template[\s>][^<]*+(?:<(?!/template>)[^<]*+)*+(?:</template>|$)%i'
- regular expression pattern to match an HTML `<template>` element, including delimiters and modifiers
- PHP_UNRECOGNIZED_VOID_TAGNAME_MATCHER = '(?:command|embed|keygen|source|track|wbr)'
- TAGNAME_ALLOWED_BEFORE_BODY_MATCHER = '(?:html|head|base|command|link|meta|noscript|script|style|template|title)'
- Regular expression part to match tag names that may appear before the start of the `<body>` element. A start tag for any other element would implicitly start the `<body>` element due to tag omission rules.
- $domDocument : DOMDocument|null
- $cssToHtmlMap : array<string, array{attribute: string, nodes?: array, values?: array}>
- This multi-level array contains simple mappings of CSS properties to HTML attributes. If a mapping only applies to certain HTML nodes or only for certain values, the mapping is an object with an allowlist of nodes and values.
- $xPath : DOMXPath|null
- convertCssToVisualAttributes() : $this
- Maps the CSS from the style nodes to visual HTML attributes.
- fromDomDocument() : static
- Builds a new instance from the given DOM document.
- fromHtml() : static
- Builds a new instance from the given HTML.
- getDomDocument() : DOMDocument
- Provides access to the internal DOMDocument representation of the HTML in its current state.
- render() : string
- Renders the normalized and processed HTML.
- renderBodyContent() : string
- Renders the content of the BODY element of the normalized and processed HTML.
- getHtmlElement() : DOMElement
- Returns the HTML element.
- getXPath() : DOMXPath
- __construct() : mixed
- The constructor.
- addContentTypeMetaTag() : string
- Adds a Content-Type meta tag for the charset.
- createRawDomDocument() : void
- Creates a DOMDocument instance from the given HTML and stores it in $this->domDocument.
- createUnifiedDomDocument() : void
- Creates a DOM document from the given HTML and stores it in $this->domDocument.
- ensureDocumentType() : string
- Makes sure that the passed HTML has a document type, with lowercase "html".
- ensureExistenceOfBodyElement() : void
- Checks that $this->domDocument has a BODY element and adds it if it is missing.
- ensurePhpUnrecognizedSelfClosingTagsAreXml() : string
- Makes sure that any self-closing tags not recognized as such by PHP's DOMDocument implementation have a self-closing slash.
- getAllNodesWithStyleAttribute() : DOMNodeList
- Returns a list with all DOM nodes that have a style attribute.
- getBodyElement() : DOMElement
- Returns the BODY element.
- hasContentTypeMetaTagInHead() : bool
- Tests whether the given HTML has a valid `Content-Type` metadata element within the `<head>` element. Due to tag omission rules, HTML parsers are expected to end the `<head>` element and start the `<body>` element upon encountering a start tag for any element which is permitted only within the `<body>`.
- hasEndOfHeadElement() : bool
- Tests whether the `<head>` element ends within the given HTML. Due to tag omission rules, HTML parsers are expected to end the `<head>` element and start the `<body>` element upon encountering a start tag for any element which is permitted only within the `<body>`.
- isTableOrImageNode() : bool
- mapBackgroundProperty() : void
- mapBorderProperty() : void
- mapComplexCssProperty() : void
- Maps CSS properties that need special transformation to an HTML attribute.
- mapCssToHtmlAttribute() : void
- Tries to apply the CSS style to $node as an attribute.
- mapCssToHtmlAttributes() : void
- Applies $styles to $node.
- mapMarginProperty() : void
- mapSimpleCssProperty() : bool
- Looks up the CSS property in the mapping table and maps it if it matches the conditions.
- mapWidthOrHeightProperty() : void
- normalizeDocumentType() : string
- Makes sure the document type in the passed HTML has lowercase "html".
- parseCssShorthandValue() : array<string, string>
- Parses a shorthand CSS value and splits it into individual values. For example, in `padding: 0 auto;` the value `0 auto` is split into top: 0, right: auto, bottom: 0, left: auto.
- prepareHtmlForDomConversion() : string
- Returns the HTML with a document type, a Content-Type meta tag, and self-closing slashes added where needed, so that a DOM document can be created from it.
- removeHtmlComments() : string
- Removes comments from the given HTML, including any which are unterminated, for which the remainder of the string is removed.
- removeHtmlTemplateElements() : string
- Removes `<template>` elements from the given HTML, including any without an end tag, for which the remainder of the string is removed.
- removeSelfClosingTagsClosingTags() : string
- Eliminates any invalid closing tags for void elements from the given HTML.
- setDomDocument() : void
- setHtml() : void
- Sets the HTML to process.
Constants
CONTENT_TYPE_META_TAG
protected
string
CONTENT_TYPE_META_TAG
= '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
DEFAULT_DOCUMENT_TYPE
protected
string
DEFAULT_DOCUMENT_TYPE
= '<!DOCTYPE html>'
HTML_COMMENT_PATTERN
regular expression pattern to match an HTML comment, including delimiters and modifiers
protected
string
HTML_COMMENT_PATTERN
= '/<!--[^-]*+(?:-(?!->)[^-]*+)*+(?:-->|$)/'
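How this pattern behaves can be checked with `preg_replace`. The sketch below copies the pattern from the constant above; the names `COMMENT_PATTERN` and `stripComments` are hypothetical and not part of the class.

```php
<?php

// Copied from the HTML_COMMENT_PATTERN constant above.
const COMMENT_PATTERN = '/<!--[^-]*+(?:-(?!->)[^-]*+)*+(?:-->|$)/';

/**
 * Hypothetical helper: removes all HTML comments matched by the pattern.
 */
function stripComments(string $html): string
{
    $result = preg_replace(COMMENT_PATTERN, '', $html);
    if ($result === null) {
        throw new RuntimeException('The regular expression could not be applied.');
    }

    return $result;
}
```

With this sketch, `stripComments('<p>a</p><!-- note --><p>b</p>')` returns `<p>a</p><p>b</p>`, and an unterminated comment as in `stripComments('<p>a</p><!-- dangling')` is removed up to the end of the string, leaving `<p>a</p>`.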
HTML_TEMPLATE_ELEMENT_PATTERN
regular expression pattern to match an HTML `<template>` element, including delimiters and modifiers
protected
string
HTML_TEMPLATE_ELEMENT_PATTERN
= '%<template[\s>][^<]*+(?:<(?!/template>)[^<]*+)*+(?:</template>|$)%i'
PHP_UNRECOGNIZED_VOID_TAGNAME_MATCHER
protected
string
PHP_UNRECOGNIZED_VOID_TAGNAME_MATCHER
= '(?:command|embed|keygen|source|track|wbr)'
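A regular expression part like this can be embedded into a larger pattern that adds the missing self-closing slash. The following sketch is an assumption about how such a pattern might look, not the class's actual code; `VOID_TAGNAME_MATCHER` and `addSelfClosingSlashes` are hypothetical names.

```php
<?php

// Copied from the PHP_UNRECOGNIZED_VOID_TAGNAME_MATCHER constant above.
const VOID_TAGNAME_MATCHER = '(?:command|embed|keygen|source|track|wbr)';

/**
 * Hypothetical helper: appends "/" to start tags of the listed void elements
 * unless the tag already ends with a slash.
 */
function addSelfClosingSlashes(string $html): string
{
    // Match the tag up to, but not including, the closing ">".
    $pattern = '%<' . VOID_TAGNAME_MATCHER . '\b[^>]*+(?<!/)(?=>)%i';
    $result = preg_replace($pattern, '$0/', $html);
    if ($result === null) {
        throw new RuntimeException('The regular expression could not be applied.');
    }

    return $result;
}
```

For example, `addSelfClosingSlashes('a<wbr>b')` returns `a<wbr/>b`, while `a<wbr/>b` and tags such as `<embedded>` are left unchanged.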
TAGNAME_ALLOWED_BEFORE_BODY_MATCHER
Regular expression part to match tag names that may appear before the start of the `<body>` element. A start tag for any other element would implicitly start the `<body>` element due to tag omission rules.
protected
string
TAGNAME_ALLOWED_BEFORE_BODY_MATCHER
= '(?:html|head|base|command|link|meta|noscript|script|style|template|title)'
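The tag omission rule can be tested with a negative lookahead built from this regular expression part. This is a sketch under assumptions: the names `ALLOWED_BEFORE_BODY` and `containsBodyOnlyStartTag` are hypothetical, and the input is assumed to have had comments and `<template>` elements removed already.

```php
<?php

// Copied from the TAGNAME_ALLOWED_BEFORE_BODY_MATCHER constant above.
const ALLOWED_BEFORE_BODY = '(?:html|head|base|command|link|meta|noscript|script|style|template|title)';

/**
 * Hypothetical helper: tests whether the HTML contains a start tag
 * that would implicitly start the <body> element.
 */
function containsBodyOnlyStartTag(string $html): bool
{
    // "<" followed by a tag name that is not one of the allowed names.
    // End tags ("</...") and "<!..." constructs do not match because of "\w".
    $pattern = '%<(?!' . ALLOWED_BEFORE_BODY . '[\s/>])\w%i';

    return preg_match($pattern, $html) === 1;
}
```

With this sketch, `<head><title>x</title></head>` yields false, whereas `<meta charset="utf-8"><p>Hi</p>` yields true because `<p>` is only permitted within the body. Note that `<header>` also yields true, as the lookahead requires the allowed name to be followed by whitespace, `/` or `>`.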
Properties
$domDocument
protected
DOMDocument|null
$domDocument
= null
$cssToHtmlMap
This multi-level array contains simple mappings of CSS properties to HTML attributes. If a mapping only applies to certain HTML nodes or only for certain values, the mapping is an object with an allowlist of nodes and values.
private
array<string, array{attribute: string, nodes?: array, values?: array}>
$cssToHtmlMap
= ['background-color' => ['attribute' => 'bgcolor'], 'text-align' => ['attribute' => 'align', 'nodes' => ['p', 'div', 'td', 'th'], 'values' => ['left', 'right', 'center', 'justify']], 'float' => ['attribute' => 'align', 'nodes' => ['table', 'img'], 'values' => ['left', 'right']], 'border-spacing' => ['attribute' => 'cellspacing', 'nodes' => ['table']]]
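How the optional allowlists might be evaluated is sketched below. The map is copied from the default value above; the function `lookUpAttribute` and its return shape are assumptions for illustration and do not reflect the signature of the private methods of this class.

```php
<?php

// Copied from the default value of $cssToHtmlMap above.
const CSS_TO_HTML_MAP = [
    'background-color' => ['attribute' => 'bgcolor'],
    'text-align' => [
        'attribute' => 'align',
        'nodes' => ['p', 'div', 'td', 'th'],
        'values' => ['left', 'right', 'center', 'justify'],
    ],
    'float' => ['attribute' => 'align', 'nodes' => ['table', 'img'], 'values' => ['left', 'right']],
    'border-spacing' => ['attribute' => 'cellspacing', 'nodes' => ['table']],
];

/**
 * Hypothetical helper: returns [attribute name, attribute value],
 * or null if no simple mapping applies to this property, value and node.
 */
function lookUpAttribute(string $property, string $value, string $nodeName): ?array
{
    $mapping = CSS_TO_HTML_MAP[$property] ?? null;
    if ($mapping === null) {
        return null;
    }
    // A missing allowlist means "no restriction".
    if (isset($mapping['nodes']) && !in_array($nodeName, $mapping['nodes'], true)) {
        return null;
    }
    if (isset($mapping['values']) && !in_array($value, $mapping['values'], true)) {
        return null;
    }

    return [$mapping['attribute'], $value];
}
```

For example, `lookUpAttribute('text-align', 'center', 'td')` returns `['align', 'center']`, while `lookUpAttribute('text-align', 'center', 'span')` and `lookUpAttribute('float', 'none', 'img')` both return null because the node or the value is not on the allowlist.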
$xPath
private
DOMXPath|null
$xPath
= null
Methods
convertCssToVisualAttributes()
Maps the CSS from the style nodes to visual HTML attributes.
public
convertCssToVisualAttributes() : $this
Return values
$this
fromDomDocument()
Builds a new instance from the given DOM document.
public
static fromDomDocument(DOMDocument $document) : static
Parameters
- $document : DOMDocument
- a DOM document returned by getDomDocument() of another instance
Return values
static
fromHtml()
Builds a new instance from the given HTML.
public
static fromHtml(string $unprocessedHtml) : static
Parameters
- $unprocessedHtml : string
- raw HTML, must be UTF-8-encoded, must not be empty
Return values
static
getDomDocument()
Provides access to the internal DOMDocument representation of the HTML in its current state.
public
getDomDocument() : DOMDocument
Return values
DOMDocument
render()
Renders the normalized and processed HTML.
public
render() : string
Return values
string
renderBodyContent()
Renders the content of the BODY element of the normalized and processed HTML.
public
renderBodyContent() : string
Return values
string
getHtmlElement()
Returns the HTML element.
protected
getHtmlElement() : DOMElement
This method assumes that there always is an HTML element, throwing an exception otherwise.
Return values
DOMElement
getXPath()
protected
getXPath() : DOMXPath
Return values
DOMXPath
__construct()
The constructor.
private
__construct() : mixed
Please use ::fromHtml or ::fromDomDocument instead.
Return values
mixed
addContentTypeMetaTag()
Adds a Content-Type meta tag for the charset.
private
addContentTypeMetaTag(string $html) : string
This method also ensures that there is a HEAD element.
Parameters
- $html : string
Return values
string: the HTML with the meta tag added
createRawDomDocument()
Creates a DOMDocument instance from the given HTML and stores it in $this->domDocument.
private
createRawDomDocument(string $html) : void
Parameters
- $html : string
Return values
void
createUnifiedDomDocument()
Creates a DOM document from the given HTML and stores it in $this->domDocument.
private
createUnifiedDomDocument(string $html) : void
The DOM document will always have a BODY element and a document type.
Parameters
- $html : string
Return values
void
ensureDocumentType()
Makes sure that the passed HTML has a document type, with lowercase "html".
private
ensureDocumentType(string $html) : string
Parameters
- $html : string
Return values
string: HTML with document type
ensureExistenceOfBodyElement()
Checks that $this->domDocument has a BODY element and adds it if it is missing.
private
ensureExistenceOfBodyElement() : void
Return values
void
ensurePhpUnrecognizedSelfClosingTagsAreXml()
Makes sure that any self-closing tags not recognized as such by PHP's DOMDocument implementation have a self-closing slash.
private
ensurePhpUnrecognizedSelfClosingTagsAreXml(string $html) : string
Parameters
- $html : string
Return values
string: HTML with problematic tags converted
getAllNodesWithStyleAttribute()
Returns a list with all DOM nodes that have a style attribute.
private
getAllNodesWithStyleAttribute() : DOMNodeList
Return values
DOMNodeList
getBodyElement()
Returns the BODY element.
private
getBodyElement() : DOMElement
This method assumes that there always is a BODY element.
Return values
DOMElement
hasContentTypeMetaTagInHead()
Tests whether the given HTML has a valid `Content-Type` metadata element within the `<head>` element. Due to tag omission rules, HTML parsers are expected to end the `<head>` element and start the `<body>` element upon encountering a start tag for any element which is permitted only within the `<body>`.
private
hasContentTypeMetaTagInHead(string $html) : bool
Parameters
- $html : string
Return values
bool
hasEndOfHeadElement()
Tests whether the `<head>` element ends within the given HTML. Due to tag omission rules, HTML parsers are expected to end the `<head>` element and start the `<body>` element upon encountering a start tag for any element which is permitted only within the `<body>`.
private
hasEndOfHeadElement(string $html) : bool
Parameters
- $html : string
Return values
bool
isTableOrImageNode()
private
isTableOrImageNode(DOMElement $node) : bool
Parameters
- $node : DOMElement
Return values
bool
mapBackgroundProperty()
private
mapBackgroundProperty(DOMElement $node, string $value) : void
Parameters
- $node : DOMElement
- node to apply styles to
- $value : string
- the value of the style rule to map
Return values
void
mapBorderProperty()
private
mapBorderProperty(DOMElement $node, string $value) : void
Parameters
- $node : DOMElement
- node to apply styles to
- $value : string
- the value of the style rule to map
Return values
void
mapComplexCssProperty()
Maps CSS properties that need special transformation to an HTML attribute.
private
mapComplexCssProperty(string $property, string $value, DOMElement $node) : void
Parameters
- $property : string
- the name of the CSS property to map
- $value : string
- the value of the style rule to map
- $node : DOMElement
- node to apply styles to
Return values
void
mapCssToHtmlAttribute()
Tries to apply the CSS style to $node as an attribute.
private
mapCssToHtmlAttribute(string $property, string $value, DOMElement $node) : void
This method maps a CSS rule to HTML attributes and adds those to the node.
Parameters
- $property : string
- the name of the CSS property to map
- $value : string
- the value of the style rule to map
- $node : DOMElement
- node to apply styles to
Return values
void
mapCssToHtmlAttributes()
Applies $styles to $node.
private
mapCssToHtmlAttributes(array<string, string> $styles, DOMElement $node) : void
This method maps CSS styles to HTML attributes and adds those to the node.
Parameters
- $styles : array<string, string>
- the new CSS styles taken from the global styles to be applied to this node
- $node : DOMElement
- node to apply styles to
Return values
void
mapMarginProperty()
private
mapMarginProperty(DOMElement $node, string $value) : void
Parameters
- $node : DOMElement
- node to apply styles to
- $value : string
- the value of the style rule to map
Return values
void
mapSimpleCssProperty()
Looks up the CSS property in the mapping table and maps it if it matches the conditions.
private
mapSimpleCssProperty(string $property, string $value, DOMElement $node) : bool
Parameters
- $property : string
- the name of the CSS property to map
- $value : string
- the value of the style rule to map
- $node : DOMElement
- node to apply styles to
Return values
bool: true if the property can be mapped using the simple mapping table
mapWidthOrHeightProperty()
private
mapWidthOrHeightProperty(DOMElement $node, string $value, string $property) : void
Parameters
- $node : DOMElement
- node to apply styles to
- $value : string
- the value of the style rule to map
- $property : string
- the name of the CSS property to map
Return values
void
normalizeDocumentType()
Makes sure the document type in the passed HTML has lowercase "html".
private
normalizeDocumentType(string $html) : string
Parameters
- $html : string
Return values
string: HTML with normalized document type
parseCssShorthandValue()
Parses a shorthand CSS value and splits it into individual values. For example, in `padding: 0 auto;` the value `0 auto` is split into top: 0, right: auto, bottom: 0, left: auto.
private
parseCssShorthandValue(string $value) : array<string, string>
Parameters
- $value : string
- a CSS property value with 1, 2, 3 or 4 sizes
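The splitting described for this method follows the usual CSS shorthand rules for one to four values. A minimal sketch, assuming whitespace-separated sizes without nested functions; the function name `expandShorthand` is hypothetical and this is not the class's own code:

```php
<?php

/**
 * Hypothetical helper: expands a CSS shorthand value with 1 to 4 sizes.
 *
 * @return array<string, string> values keyed by top, right, bottom and left
 */
function expandShorthand(string $value): array
{
    $parts = preg_split('/\s+/', trim($value));
    if ($parts === false) {
        throw new RuntimeException('The value could not be split.');
    }

    $top = $parts[0];
    // Missing values fall back as defined for CSS shorthands:
    // right defaults to top, bottom to top, and left to right.
    $right = $parts[1] ?? $top;
    $bottom = $parts[2] ?? $top;
    $left = $parts[3] ?? $right;

    return ['top' => $top, 'right' => $right, 'bottom' => $bottom, 'left' => $left];
}
```

With this sketch, `expandShorthand('0 auto')` returns `['top' => '0', 'right' => 'auto', 'bottom' => '0', 'left' => 'auto']`, and `expandShorthand('1px 2px 3px')` assigns `2px` to both right and left.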
Return values
array<string, string>: an array of values for top, right, bottom and left (using these as associative array keys)
prepareHtmlForDomConversion()
Returns the HTML with a document type, a Content-Type meta tag, and self-closing slashes added where needed, so that a DOM document can be created from it.
private
prepareHtmlForDomConversion(string $html) : string
Parameters
- $html : string
Return values
string: the unified HTML
removeHtmlComments()
Removes comments from the given HTML, including any which are unterminated, for which the remainder of the string is removed.
private
removeHtmlComments(string $html) : string
Parameters
- $html : string
Return values
string
removeHtmlTemplateElements()
Removes `<template>` elements from the given HTML, including any without an end tag, for which the remainder of the string is removed.
private
removeHtmlTemplateElements(string $html) : string
Parameters
- $html : string
Return values
string
removeSelfClosingTagsClosingTags()
Eliminates any invalid closing tags for void elements from the given HTML.
private
removeSelfClosingTagsClosingTags(string $html) : string
Parameters
- $html : string
Return values
string
setDomDocument()
private
setDomDocument(DOMDocument $domDocument) : void
Parameters
- $domDocument : DOMDocument
Return values
void
setHtml()
Sets the HTML to process.
private
setHtml(string $html) : void
Parameters
- $html : string
- the HTML to process, must be UTF-8-encoded
