value-class-pattern: Difference between revisions

From Microformats Wiki
(flattened hierarchy a bit re: value-title, add stub section for date and time separation)
(Edit edit edit. Restructure, updated various sections, small expansion to date-time-separation stub (needs: Tantek input), added small FAQ to preempt controversy.)
Line 1: Line 1:
<entry-title>Value Class Pattern</entry-title>
<entry-title>Value Class Pattern</entry-title>
''The value class pattern is derived from [[hCard#Value_excerpting|value-excerpting]] in hCard. As such, it is already somewhat supported in parsers. '''However''', the precise parsing behavior is not yet finalized, and the documentation is a work in progress. The pattern should be used with some caution, and with the awareness that future parsing rules could impact your pages.''
''The value class pattern is derived from [[hCard#Value_excerpting|value-excerpting]] in hCard. As such, it is already somewhat supported in parsers. '''However''', the precise parsing behavior is not quite finalized, and the documentation is a work in progress. The pattern should be used with some caution.''


__TOC__
__TOC__
Line 75: Line 75:
In this example, <code>description</code> has a child ‘<code>value</code>’, and that child has a ''grandchild'' ‘<code>value</code>’. However, the parsing of <code>value</code> classes stops at the first level, so the data for <code>description</code> is: <samp><code>&lt;bar class="value">Puppies Rule!&lt;/bar>&lt;strong>But kittens are better!&lt;/strong></code></samp>.
In this example, <code>description</code> has a child ‘<code>value</code>’, and that child has a ''grandchild'' ‘<code>value</code>’. However, the parsing of <code>value</code> classes stops at the first level, so the data for <code>description</code> is: <samp><code>&lt;bar class="value">Puppies Rule!&lt;/bar>&lt;strong>But kittens are better!&lt;/strong></code></samp>.


== value-title ==


The <code>value-title</code> class name allows the publisher to indicate the data value for a parent property is contained in the <code>title</code> attribute of an element.
==Concatenation of Dates and Times==


This can be used to provide a synonym within content, or used to quietly publish alternate, machine forms of information for microformats parsing, without affecting the consumption of your content.
<code>value</code> elements allow the date and time pieces of a ISO timestamp to be broken up within flowing text. When parsing date-time properties, some additional parsing rules apply to produce a valid ISO date.
 
Example:
 
<source lang=html4strict>
<p>The weekly dinner will be on
    <span class="dtstart">
        <abbr class="value" title="2008-06-24">Tuesday</abbr>
    at <span class="value">18:30</span>
    </span>
</p>
</source>
 
Produces:
 
<source lang=text>
DTSTART
    2008-06-24T18:30:00+????
</source>
 
TODO: Define rules for implied timezone.
 
This section is a stub, to be filled in with the feature description, parsing instructions etc. from [[value-excerption-pattern-brainstorming#date_and_time_separation]] and [[value-excerption-dt-separation-test]]
 
==Parsing value from a <code>title</code> attribute==
 
The <code>value-title</code> class name allows the publisher to indicate the data value for a parent property is contained in the <code>title</code> attribute of an element, rather than the inner-text.
 
This can be used to provide a synonym within content, or used to quietly publish alternate forms of information for microformats parsing, without affecting the consumption of content.


For example, you can use casual localization with dates:
For example, you can use casual localization with dates:
Line 87: Line 114:
</source>
</source>


Parsing rules for <code>value-title</code> are the same as for <code>value</code> above, with the following additions:
Parsing rules for <code>value-title</code> are the same as for <code>value</code> above, with the following additional restrictions:


* Where a microformats property has a child <code>value-title</code>, the content of the <code>title</code> attribute of that element must be parsed, ''instead of'' the inner-text of the element.
* Where a microformats property has a child <code>value-title</code>, the content of the <code>title</code> attribute of that element must be parsed, ''instead of'' the inner-text of the element.
Line 122: Line 149:
The microformats parser will read the ISO format date <samp>2009-03-14T16:28-0600</samp>, but users will only see <samp>March 14th 2009, around half-past four</samp>. The ISO-form date above does not get exposed to any user at all.
The microformats parser will read the ISO format date <samp>2009-03-14T16:28-0600</samp>, but users will only see <samp>March 14th 2009, around half-past four</samp>. The ISO-form date above does not get exposed to any user at all.


===How does this work?===
===Parsing machine-data <code>value-title</code>===


Browsers collapse the <code>value-title</code> span down to a width of <code>0</code>, effectively providing no visual rendering, whilst keeping the element in the DOM. With no physical dimensions, there is no ‘hover’ state, so no tooltip is revealed. Furthermore, the empty element is not passed to assistive technology layers such as VoiceOver.
Browsers collapse the <code>value-title</code> span down to a width of <code>0</code>, effectively providing no visual rendering, whilst keeping the element in the DOM. With no physical dimensions, there is no ‘hover’ state, so no tooltip is revealed. Furthermore, the empty element is not passed to assistive technology layers such as VoiceOver.
Line 128: Line 155:
We conducted [[value-excerption-value-title-test|thorough testing]] of these parsing behaviors to ensure accessibility.
We conducted [[value-excerption-value-title-test|thorough testing]] of these parsing behaviors to ensure accessibility.


''Note: Whilst the <code>value-title</code> element is more gracefully written without whitespace inner-text (or as self-closing <code>&lt;span /></code> element in XHTML), current tools such as WYSIWYG editors and HTML-Tidy will erroneously discard them, meaning that the parsable data could be thrown away by some publishing tools, or even parsers themselves. As such, <code>&lt;span class='value-title'> &lt;/span></code>, including a single whitespace character as content, is the recommended pattern for this time.''
''Note: Whilst the <code>value-title</code> element is more gracefully written without whitespace inner-text (or as self-closing <code>&lt;foo /></code> element in XHTML), current tools such as WYSIWYG editors and HTML-Tidy will erroneously discard such elements, resulting in parsable data being thrown away by some tools. As such, <code>&lt;span class='value-title'> &lt;/span></code>, including a single whitespace character between the opening and closing tag, is the recommended pattern at this time.''


Parsing this final <code>value-title</code> extension imposes some stricter restrictions on usage. These restrictions exist to reduce the impact of <abbr title="Don't Repeat Yourself">DRY</abbr> violations, reducing the risk of sites spoofing data, and encouraging the best scenario for maintaining both forms of data accurately.
Parsing this final <code>value-title</code> extension imposes some stricter restrictions on usage. These restrictions exist to reduce the impact of <abbr title="Don't Repeat Yourself">DRY</abbr> violations, reduce the opportunity for sites to spoof data, and encourage best practice for maintaining both forms of data accurately.


So, where an element with class <code>value-title</code> is to be parsed as the data for a property, and that element also contains no non-whitespace content (ergo, it is ‘empty’), the following rules apply:
So, where an element with class <code>value-title</code> is to be parsed as data for a property, and that element also contains no non-whitespace content (hereafter referred to as ‘empty’), the following rules apply:


* The ‘empty’ value-title element must be the '''first, non-whitespace child''' of the property element. That is, it should follow immediately after the property is declared, before the human-readable form, and without any additional nesting.
* The ‘empty’ value-title element must be the '''first, non-whitespace child''' of the property element. That is, it should follow immediately after the property is declared, before the human-readable form, and without any additional nesting.
* The ‘empty’ value-title element can only be used for specific properties. Microformat specifications must explicitly state which of their properties can be used with this extension of value-exception.
* The ‘empty’ value-title element can only be used for specific properties. Microformat specifications must explicitly state which properties may be used with this extension of the value-class-pattern.
* Where an ‘empty’ value-title element is to be used as the property value, it must be the _only_ such <code>value</code> content. That is, it overrides all other <code>value</code> and <code>value-title</code> siblings and/or cousins.
* Where an ‘empty’ value-title element is to be used as the single property value, it must be the _only_ such <code>value</code> content. That is, the first instance of a conforming <code>value-title</code> element overrides all other <code>value</code> and <code>value-title</code> siblings and/or cousins.
* Tools written to perform Conformance Testing and/or Validation of microformats ''should'' attempt to compare the machine-data and human legible forms of the property data, and advise authors if the forms do not match.
* Tools written to perform Conformance Testing and/or Validation of microformats ''should'' attempt to compare the machine-data and human legible forms of the property data, and advise authors if the forms do not match.


''At time of publication, this document post-dates other microformat specifications, such that they may not yet indicate which properties are to be compatible with this pattern. In the interim, the properties documented on the [[machine-data]] page are to be considered normative.''
''This document post-dates other microformat specifications, such that they may not yet indicate which properties are to be compatible with this pattern. In the interim, the properties documented on the [[machine-data]] page are to be considered normative.''


We require a thorough test-suite for this pattern. In the interim, here is an incomplete test suite:
There are some simple reference examples and tests for this pattern on [[value-class-pattern-tests]].


===Conforming Tests===
===Pre-emptive <abbr title='Frequently Asked Questions'>FAQ</abbr>===


====One====
<div class='discussion'>
 
* '''Why use an 'empty' element? Why not embed data in the class attribute?'''
<source lang=html4strict>
** The class attribute is inappropriate for embedded data values, as per the HTML4 specification, which states <code>class</class> is for ‘general purposing processing’, which is defined as ‘e.g. for identifying fields when extracting data from HTML pages into a database, translating HTML documents into other formats, etc.’. ‘General purpose processing’ does not extend to data itself. Furthermore, this method avoids inventing a new string pattern for embedding data.
<p class='tel'>My
* '''Why use an 'empty' element? Why not make up a new attribute, like ‘data’?'''
    <span class='type'>
** Microformats exist and function in valid HTML4 and XHTML1. Those are the current standards for web development, and microformats exist for use ''now''. In the future, perhaps future revisions of HTML will offer up another solution. For now, this method has been tested against browsers, and creates a consistant document structure (where machine-form and human-form data are siblings).
        <span class='value-title' title='cell'> </span>mobile
* '''The <code>title</code> attribute should only be used for content!'''
    </span> phone number is <span class='value'>+44 1245 333 333</span>
** The <code>title</code> attribute _is_ used for content and is read by microformats parsers. This exists for cases where data cannot be parsed with sufficient precision from just the commonly published, visible information. This pattern allows both forms of content to be included, whilst keeping it invisible to human consumers.
</p>
</div>
</source>
 
Result


<source lang=text>
You can also refer to the general [[faq Microformats FAQ]] and [[microformats#the_microformats_principles principals]].
TEL
    TYPE = cell
    VALUE = +44 1245 333 333
</source>
 
====Two====
 
<source lang=html4strict>
<p class='vevent'>
    My <span class='summary'>Birthday Party</span> will be held
    <span class='dtstart'>
        <span class='value-title' title='2009-04-01'>tomorrow</span>
    </span>
    and last until
    <span class='dtend'>
        <span class='value-title' title='2009-04-05'> </span>
        the following Tuesday (April 5th)
    </span>.
</p>       
</source>
 
Result
 
<source lang=text>
VEVENT
    SUMMARY = Birthday Party
    DTSTART = 2009-04-01
    DTEND = 2009-04-05
</source>
 
===Tests of Non-Conforming Code===
 
====One====
 
In this case, the human text appears before the <code>value-title</code> element, so the machine-data value has a weaker association with the property declaration. The likelihood of the data not being maintained correctly — the data value being ignored by an editor — is increased.
 
<source lang=html4strict>
<p class='tel'>My
    <span class='type'>
        mobile
        <span class='value-title' title='cell'> </span>
    </span> phone number is <span class='value'>+44 1245 333 333</span>
</p>
</source>
 
Result
 
<source lang=text>
TEL
    TYPE = none/default/unknown ('mobile' is unknown in hCard)
    VALUE = +44 1245 333 333
</source>
 
====Two====
 
In this case, the <code>value-title</code> element is used for a property that is not valid for use with this pattern.
 
<source lang=html4strict>
<p class='vevent'>You are invited to
    <span class='summary'>
        <span class='value-title' title='FooCamp'> </span>
        BarCamp
    </span>
</source>
 
Result
 
<source lang=text>
VEVENT
    SUMMARY = BarCamp
</source>
 
== date and time separation ==
 
This section is a stub, to be filled in with the feature description, parsing instructions etc. from [[value-excerption-pattern-brainstorming#date_and_time_separation]] and [[value-excerption-dt-separation-test]]


==Related Pages==
==Related Pages==

Revision as of 06:32, 17 April 2009

<entry-title>Value Class Pattern</entry-title> The value class pattern is derived from value-excerpting in hCard. As such, it is already somewhat supported in parsers. However, the precise parsing behavior is not quite finalized, and the documentation is a work in progress. The pattern should be used with some caution.

Editor
Ben Ward

Sometimes, only a part of an element's content is to be used as the value of a microformat property. This may occur when a property has optional sibling properties, such as tel: type and tel: value in hCard. Other times, the most appropriate structure for a property may include other content.

For these purposes, the special class name value is used to mark-up the relevant data excerpt from larger element content.

Simple Examples

Here is an hCard fragment for marking up a home phone number:

vCard:

TEL;TYPE=HOME:+1.415.555.1212

hCard:

 <span class="tel">
   <span class="type">Home</span>:
   <span class="value">+1.415.555.1212</span>
 </span>

In this case, the value of tel is +1.415.555.1212, not Home: +1.415.555.1212.

Another example, this time using a localized (British) telephone number:

 <span class="tel">
   <span class="type">Home</span>:
   <span class="value">+44</span> (0) <span class="value">1223 123 123</span>
 </span>

In this case, the valid data for the telephone number is +441223123123, but the way in which a phone number is presented in Britain includes the (0), for local dialling. That is, from anywhere in the world you may dial +441223123123, or from within Britain you may dial 01223123123. Common local publishing interferes with the data, since +4401223123123 is an invalid number.

In the mark-up, two value classes target the parts of the telephone number string that make up a valid international number, whilst allowing conventional presentation.

Another example, using dtstart in hCalendar:

 <span class="dtstart">
    Friday 25th May, 6pm
    <span class="value">2008-05-25T18:00:00+0100</span>
 </span>

Whilst the entire string ‘Friday 25th May […]’ is date information, it's only the ISO 8601 encoded form which must be consumed by a microformats parser, so the value class isolates it.

Basic Parsing

  • Where an element with a microformat property class name has a descendant with class name value, parsers should read the inner-text of the value element only, ignoring other text node descendants.
  • Where there are multiple descendants of a property with class name of value, they should be concatenated without inserting additional characters or white-space.
  • Descendants with class of value must not be parsed deeper than one level. That is, where an element foo with class value has a descendant bar with class value, the content of foo is taken as the value. Nesting additional elements with class of value cannot be used to further isolate a property's value.

e.g.

 <p class="description">
  <foo class="value">
    <bar class="value">Puppies Rule!</bar>
    <strong>But kittens are better!</strong>
 </foo>
</p>

In this example, description has a child ‘value’, and that child has a grandchild ‘value’. However, the parsing of value classes stops at the first level, so the data for description is: <bar class="value">Puppies Rule!</bar><strong>But kittens are better!</strong>.
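The basic parsing rules above can be sketched in a few lines of Python. This is only an illustration, not a reference parser: it assumes the fragment is well-formed enough for the standard library's `xml.etree.ElementTree`, it reads inner-text only (mark-up inside a value element is dropped), and the helper names `has_class`, `collect_values` and `property_value` are invented for this example. The fallback to the whole inner-text when no value element exists is likewise an assumption.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """Return True if the element's class attribute contains the given name."""
    return name in element.get("class", "").split()


def collect_values(element):
    """Return first-level 'value' descendants in document order.

    Descent stops at each 'value' element found, so a 'value' nested
    inside another 'value' is never returned on its own.
    """
    found = []
    for child in element:
        if has_class(child, "value"):
            found.append(child)
        else:
            found.extend(collect_values(child))
    return found


def property_value(property_element):
    """Concatenate the inner-text of the value elements, inserting nothing between them."""
    values = collect_values(property_element)
    if not values:
        # Assumption: with no value excerpt, use the property's whole inner-text.
        return "".join(property_element.itertext()).strip()
    return "".join("".join(v.itertext()) for v in values)


tel = ET.fromstring(
    '<span class="tel">'
    '<span class="type">Home</span>: '
    '<span class="value">+44</span> (0) <span class="value">1223 123 123</span>'
    '</span>'
)
print(property_value(tel))  # +441223 123 123
```

The spaces inside the second value element survive, because the rules only forbid inserting characters between excerpts; a consumer that needs a dialable string would still have to normalise white-space itself.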


Concatenation of Dates and Times

value elements allow the date and time pieces of an ISO timestamp to be broken up within flowing text. When parsing date-time properties, some additional parsing rules apply to produce a valid ISO date.

Example:

<p>The weekly dinner will be on 
    <span class="dtstart">
        <abbr class="value" title="2008-06-24">Tuesday</abbr> 
     at <span class="value">18:30</span>
    </span>
</p>

Produces:

DTSTART
    2008-06-24T18:30:00+????

TODO: Define rules for implied timezone.

This section is a stub, to be filled in with the feature description, parsing instructions etc. from value-excerption-pattern-brainstorming#date_and_time_separation and value-excerption-dt-separation-test
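Until the rules are written up, the following Python sketch shows one way the pieces in the example could be joined. It is a guess at the eventual behaviour, not a statement of it: the function name `combine_date_time` and both regular expressions are assumptions made for this example, the pieces are assumed to have been extracted already (for instance from the title of the abbr and the inner-text of the span), and no timezone is appended because the implied-timezone rules are still undefined.

```python
import re

# Assumed shapes for the two kinds of piece; real rules may accept more forms.
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
TIME_RE = re.compile(r"^(\d{2}):(\d{2})(?::(\d{2}))?$")


def combine_date_time(pieces):
    """Join separately published date and time pieces into one timestamp.

    The first date and the first time found are used; pieces matching
    neither pattern are ignored. Returns None when no date is present.
    """
    date = None
    time = None
    for piece in pieces:
        text = piece.strip()
        time_match = TIME_RE.match(text)
        if DATE_RE.match(text) and date is None:
            date = text
        elif time_match and time is None:
            hours, minutes, seconds = time_match.groups()
            # Pad missing seconds so "18:30" becomes "18:30:00".
            time = f"{hours}:{minutes}:{seconds or '00'}"
    if date is None:
        return None
    return date if time is None else f"{date}T{time}"


print(combine_date_time(["2008-06-24", "18:30"]))  # 2008-06-24T18:30:00
```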

Parsing value from a title attribute

The value-title class name allows the publisher to indicate the data value for a parent property is contained in the title attribute of an element, rather than the inner-text.

This can be used to provide a synonym within content, or used to quietly publish alternate forms of information for microformats parsing, without affecting the consumption of content.

For example, you can use casual localization with dates:

<p>It was <span class='dtstart'><span class='value-title' title='2008'>last year</span></span> that I realised my addiction to cashew nuts would cost this country so dear.</p>

Parsing rules for value-title are the same as for value above, with the following additional restrictions:

  • Where a microformats property has a child value-title, the content of the title attribute of that element must be parsed, instead of the inner-text of the element.
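As a rough illustration of that rule, the Python sketch below reads the title attribute for value-title children and the inner-text for value children. The names `classes`, `excerpt` and `child_excerpts` are invented for this example, only direct children of the property are inspected, and the fragment is assumed to be well-formed enough for `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def classes(element):
    """Return the element's class names as a list."""
    return element.get("class", "").split()


def excerpt(element):
    """Return the parsable text of one value or value-title element."""
    if "value-title" in classes(element):
        # value-title: the title attribute is parsed instead of the inner-text.
        return element.get("title", "")
    return "".join(element.itertext())


def child_excerpts(property_element):
    """Concatenate the excerpts of value and value-title children, in order.

    Returns an empty string when the property has no such children.
    """
    pieces = []
    for child in property_element:
        names = classes(child)
        if "value" in names or "value-title" in names:
            pieces.append(excerpt(child))
    return "".join(pieces)


dtstart = ET.fromstring(
    "<span class='dtstart'>"
    "<span class='value-title' title='2008'>last year</span>"
    "</span>"
)
print(child_excerpts(dtstart))  # 2008
```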

Using value-title to publish machine-data

The initial use of value-title is to publish alternate, parsable forms of property values in a visible context. However, some cases have developed in microformats where it is necessary to include a data form to ensure accurate parsing, which publishers do not want visible in their page.

For example, full ISO8601 dates can be confusing to readers of the page (both as a tooltip and when read aloud to users of screen reader technology), and enumerations such as type in hCard's tel use US-English terms, which are not part of pages in any other language.

Since both of those scenarios are unacceptable, for these cases, and these alone, there exists a further extension of value-excerption, allowing the parsable form to be published ‘silently’ in parallel with the local content. This pattern is used as follows:

<p class='tel' lang='en-gb'>
    <span class='type'>
        <span class='value-title' title='cell'> </span>
        mobile
    </span>
    <span class='value'>+44 7773 000 000</span>
</p>

The cell property value is parsed, but mobile is displayed on the page.

In the case of dates:

<p class='dtstart'>
    <span class='value-title' title='2009-03-14T16:28-0600'> </span>
    March 14th 2009, around half-past four
</p>

The microformats parser will read the ISO format date 2009-03-14T16:28-0600, but users will only see March 14th 2009, around half-past four. The ISO-form date above does not get exposed to any user at all.

Parsing machine-data value-title

Browsers collapse the value-title span down to a width of 0, effectively providing no visual rendering, whilst keeping the element in the DOM. With no physical dimensions, there is no ‘hover’ state, so no tooltip is revealed. Furthermore, the empty element is not passed to assistive technology layers such as VoiceOver.

We conducted thorough testing of these parsing behaviors to ensure accessibility.

Note: Whilst the value-title element is more gracefully written without whitespace inner-text (or as a self-closing <foo /> element in XHTML), current tools such as WYSIWYG editors and HTML-Tidy will erroneously discard such elements, resulting in parsable data being thrown away by some tools. As such, <span class='value-title'> </span>, including a single whitespace character between the opening and closing tag, is the recommended pattern at this time.

Parsing this final value-title extension imposes some stricter restrictions on usage. These restrictions exist to reduce the impact of DRY violations, reduce the opportunity for sites to spoof data, and encourage best practice for maintaining both forms of data accurately.

So, where an element with class value-title is to be parsed as data for a property, and that element also contains no non-whitespace content (hereafter referred to as ‘empty’), the following rules apply:

  • The ‘empty’ value-title element must be the first, non-whitespace child of the property element. That is, it should follow immediately after the property is declared, before the human-readable form, and without any additional nesting.
  • The ‘empty’ value-title element can only be used for specific properties. Microformat specifications must explicitly state which properties may be used with this extension of the value-class-pattern.
  • Where an ‘empty’ value-title element is to be used as the single property value, it must be the only such value content. That is, the first instance of a conforming value-title element overrides all other value and value-title siblings and/or cousins.
  • Tools written to perform Conformance Testing and/or Validation of microformats should attempt to compare the machine-data and human legible forms of the property data, and advise authors if the forms do not match.
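The position and emptiness checks in the first and third rules might look like this in Python. This is a sketch under assumptions: `is_blank` and `leading_machine_value` are names made up for the example, the input is assumed to be well-formed for `xml.etree.ElementTree`, and the second rule (which properties may use the pattern at all) is not checked here.

```python
import xml.etree.ElementTree as ET


def is_blank(text):
    """Return True for missing or whitespace-only text."""
    return text is None or text.strip() == ""


def leading_machine_value(property_element):
    """Return the title of a conforming 'empty' value-title, else None.

    Conforming here means: no non-whitespace text precedes it, it is the
    first child element of the property, and it has no content of its own.
    """
    if not is_blank(property_element.text) or len(property_element) == 0:
        return None
    first = property_element[0]
    if "value-title" not in first.get("class", "").split():
        return None
    if len(first) > 0 or not is_blank(first.text):
        return None
    return first.get("title")


conforming = ET.fromstring(
    "<span class='type'>"
    "<span class='value-title' title='cell'> </span>mobile"
    "</span>"
)
misplaced = ET.fromstring(
    "<span class='type'>"
    "mobile <span class='value-title' title='cell'> </span>"
    "</span>"
)
print(leading_machine_value(conforming))  # cell
print(leading_machine_value(misplaced))   # None
```

When the function returns a value, a parser following the third rule would use it as the whole property value and ignore any other value or value-title elements inside the property.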

This document post-dates other microformat specifications, such that they may not yet indicate which properties are to be compatible with this pattern. In the interim, the properties documented on the machine-data page are to be considered normative.

There are some simple reference examples and tests for this pattern on value-class-pattern-tests.

Pre-emptive FAQ

  • Why use an 'empty' element? Why not embed data in the class attribute?
    • The class attribute is inappropriate for embedded data values, as per the HTML4 specification, which states class is for ‘general purpose processing’, which is defined as ‘e.g. for identifying fields when extracting data from HTML pages into a database, translating HTML documents into other formats, etc.’. ‘General purpose processing’ does not extend to data itself. Furthermore, this method avoids inventing a new string pattern for embedding data.
  • Why use an 'empty' element? Why not make up a new attribute, like ‘data’?
    • Microformats exist and function in valid HTML4 and XHTML1. Those are the current standards for web development, and microformats exist for use now. Perhaps future revisions of HTML will offer up another solution. For now, this method has been tested against browsers, and creates a consistent document structure (where machine-form and human-form data are siblings).
  • The title attribute should only be used for content!
    • The title attribute is used for content and is read by microformats parsers. This exists for cases where data cannot be parsed with sufficient precision from just the commonly published, visible information. This pattern allows both forms of content to be included, whilst keeping the machine form invisible to human consumers.

You can also refer to the general Microformats FAQ and the microformats principles.

Related Pages