On July 27th, 2010 we made the first official release of Unicorn. We are elated with the response from the community. Within two days of the announcement we received 7 additional translations.
There are already a couple of new checkers in the works, a few being discussed, and a number of interesting suggestions for more, such as creating accessibility and jslint checkers. Check out the code, write a checker if you are inclined, provide feedback or donate. Stay tuned and thank you.
From the very first release of the cheatsheet, I’ve received requests to include the various new elements and attributes of the HTML5 specification in the cheatsheet. As a reminder, the cheatsheet is a mobile-friendly Web application that provides a compilation of useful knowledge extracted from W3C specifications.
At long last, I’ve managed to integrate these new elements in the latest release of the cheatsheet, where you will now find all the new, changed, obsolete and removed elements and attributes in HTML5 highlighted.


All the data are extracted from HTML: The Markup Language Reference, the specification maintained by Mike Smith that describes the markup aspects of HTML5.
As always, this comes with a number of bug fixes and UI improvements (thanks to Sorin Stefan), and this release is available both in the Web version and in the Android application.

QR Code for cheatsheet on the Android market
Obviously, I’m expecting to now get requests for another spec “du jour” (CSS3 anyone?), which clearly is in my roadmap — but as always, I’m very much interested in getting others to contribute to the work.
Feedback, comments and suggestions are very much appreciated!
July 9, 2010
I reported in The Mission of W3C that a major focus of W3C is to Strengthen our core mission. This blog entry elaborates.
Since the Web is central to everything (see also The Expanding Web Platform), it is not too surprising that we get involved in standardizing numerous aspects of Web infrastructure. Today we have over 70 groups that are providing standards in diverse areas. Evidently, some of these groups have greater impact on the utility of the Web than others. Also, some of them are more advanced/complex/difficult than others. If we apply an equal amount of effort to each working group, we will sub-optimize the Web by not applying sufficient resources to those that are most crucial. So we prioritize.
Every time we make an incremental change in our effort – such as when we start a new working group or expand the scope of requirements – we are making an incremental prioritization decision. In our “core mission task force” we are taking a step back and looking at the ensemble of our activity. Do we know which small number of the 70 groups are the most important? Are we applying sufficient resources and attention to them to guarantee the greatest possible success?
As our task force characterizes projects into core and non-core, one might ask what difference it makes.
There are many reasons that a non-core standard might have migrated to W3C. Some of them are:
The standard is a natural adjacency of a core standard
W3C houses a team who are knowledgeable technologists who can best shepherd the standard
The standard is a candidate to be core in the future
… among other reasons. These are actually sensible reasons. So we will absolutely continue to serve the industry and convene those standards groups.
But for core standards we must have a higher target of quality. It is not sufficient to work with the industry to create the standard. We must assure quality of the standard, develop it on schedule, help make sure that it has the right feature set, work on testing strategies, provide training, and help people appreciate the value. Our core mission task force might not change the list of working groups. But it will change emphasis. For items in core, we will provide supreme effort to ensure quality, the right feature set, timeliness, and market support.
Last month I spent a week in Silicon Valley. The importance of giving top attention to core was reinforced.
Our largest working group is the HTML5 working group. Literally hundreds of people participate in the W3C HTML5 working group.
Many of them work in Silicon Valley. So I met with many of them. They come from a cross-section of companies: browser vendors, web publishers, tools providers, application vendors, security firms.
Since HTML5 is core, it is important that we as a community get it “right”. Since it is central to the next generation web, it evokes strong opinions about what it means to “get it right”. And as I listened to the opinions, I found them all well reasoned and based on knowledgeable and sensible views of how the future web will evolve.
While they are all reasoned, knowledgeable, and sensible – that does not mean that everyone agrees. Not at all! Nor should we expect agreement for something so central – given that different organizations have different priorities.
Therein lies W3C's imperative to provide the right amount of attention to strengthen our core. We bring diverse stakeholders to the table and create the environment for the industry to agree on this critical standard. There is still more work to be done. There are technical issues, testing issues, and disagreements that need to be resolved. It is our commitment to work with all stakeholders to drive this forward in a professional way, and ensure that we have the right feature set, quality, timeliness, and market support.
The current maintenance update to XHTML Modularization is in response to the inevitable bug reports and clarifications that come from actual use. Since there have recently been some misconceptions expressed about the purpose of the spec, I thought I'd take the opportunity to try and clear them up.
XHTML Modularization is a tool for people who design markup languages. It has been used by the people designing the format for Jabber (xmpp), for the open eBook standard (epub), for the microformats specification for outlines (xoxo), and the Resource Directory Description Language (RDDL), among many others, as well as those at W3C such as XHTML 1.1, and RDFa.
Although Rick Jelliffe asserted that XHTML Modularization "...may be one of the most important new technologies of 2001," most people will not be familiar with it. That is because XHTML Modularization is not for designing Web pages, nor is it implemented in browsers: a lot of people create Web pages; not many create new markup languages.
XHTML Modularization helps people design and manage markup language schemas and DTDs; it tells you how to write schemas that will plug together. Modules can be reused and recombined across different languages, which helps keep related languages in sync.
The modularization approach in the spec applies to XML as well. We could have called it "XML Modularization" but the main reason that XHTML appears in the title is that the spec also contains modules for XHTML using the methodology. It is with these modules that XHTML 1.1, XHTML Print, and XHTML Basic (and the others mentioned above) are defined.
Modularization is in some ways an unusual specification for W3C, because you don't have to write any software for it. In a sense, the 'processor' for Modularization is a human who is writing a schema. "Write it following these rules, and it will plug in seamlessly with other modules written in the same way." You could compare it to accessibility guidelines, which just tell you how to construct web pages that are accessible; Modularization just tells you how to write schemas that will plug together. Because it is not a specification to be implemented, it doesn't require the testing that normally ensures the implementability of W3C specifications.
After 15 years working with all of you all around the world on Web technologies and standards, I'm taking a position as a Biomedical Informatics Software Engineer in the department of biostatistics at the University of Kansas Medical Center.
The new job starts in just another week or two; I'll update the contact information and such on my home page before I'm done here. While my new position is likely to keep me particularly busy for a few months, I hope to surface in Mad Mode from time to time; it's a blog where I'm consolidating writing on free software, semantic web research, and other things I'm mad-passionate about.
Thanks to all of you who contribute to the work at W3C; I'm proud of a lot of things that we built together. And thanks to all my mentors and collaborators who taught me, helped me, and challenged me.
The Web is an incredibly important part of so many parts of life these days, and W3C plays an important role in ensuring that it will work for everyone over the long haul. Although it's hard to leave an organization with a mission I support, I am excited to get into bioinformatics, and I look forward to what W3C and the Web community come up with next as well.

A visit with staff at Keio University
I continue to meet key stakeholders around the world as part of my introduction to W3C. The last two weeks have been focused on Asia.
I visited India last week partly to help launch the new W3C office at a conference. My experience in the conference and in meetings illustrated opportunities that W3C has in India: better communication of our work, greater participation in this work by engineers from India, and expanding our technical scope.
The conference itself demonstrated the strong support for W3C within India. Keynote speakers included key government ministers as well as icons of India's high-tech industry. The conference attracted 600 people from industry, government, and academia. The technical program brought together leading researchers and spanned an impressive set of topics.
In India, W3C's office is hosted by TDIL – Technology Development for Indic Languages. This partnership will strengthen our internationalization work, part of ensuring that the Web is available for all people. With 22 official languages in India, in addition to a larger number of languages and dialects – making the Web available irrespective of language and literacy level is a key issue for India and consonant with our values. Many of the conference presentations related to that topic.
Outside the conference was equally rewarding. India has important vendor groups such as Nasscom and Mait who co-sponsored the conference. And TDIL is part of a government ministry. My visit gave me the opportunity to meet with several executives of these organizations and reinforce my belief that there is a strong commitment to W3C in India.
In China there was a similar enthusiasm and set of highlight events for W3C.
The visit got off to a good start even before it began – with news that China Unicom became the first large IT vendor in China to join as a W3C Member.
There is great interest in W3C in China. I gave several presentations about the Expanding Web Platform to hundreds of engineers from companies and universities at places such as Beihang University in Beijing. The deep questions illustrated the interest and technical savvy of these engineers in topics such as HTML5, accessibility, and the Semantic Web.
Also of importance were meetings with key decision makers and larger public meetings. At a dinner for W3C member laboratories in China – the local leaders of global firms participated in discussions about how to create a stronger W3C community in China. And at a conference sponsored by CESI – the China national electronics standards institute - related to ISO/IEC JTC1 SC 38, I had an opportunity to deliver remarks about how W3C standards related to the new standardization effort underway in SC 38.
Japan has always been a strong point for W3C in Asia, but additional opportunities exist.
As background, the Japan team – also noting strong ties with our Korean office – is in the process of putting together a workshop in September on Web TV.
Meeting with technical visionaries and government officials, it is clear that Asian industry is poised to play a strong role in this area. After all, many of the innovations in television manufacturing already take place in Asia. As the convergence continues between the Web and all access devices (including television), what better place than Japan to have our Web TV workshop?
Two and a half years ago, Dan Connolly wrote about When will HTML 5 support <video>? Sooner if you help. Where are we with HTML5 Video nowadays?
ISSUE-6: Pros and cons of keeping video and audio in the scope of the HTML working group.
This issue has been closed. Video and audio are part of HTML5.
ISSUE-7: codec support and the <video> element
This issue has been closed for lack of a change proposal, so we still don't have a baseline video codec for HTML5. Dan mentioned Ogg/Theora, Dirac, H.264, and VC-1 back in 2007. With the recent announcement from the MS IE team to support H.264, we're down to two. Dirac hasn't come up as a strong candidate so far. Well, at least, that's the status for the moment. Several individuals are wondering if the Google I/O conference this week will reveal some intent from Google regarding the VP8 codec.
We, at W3C, believe that having a video codec which is compatible with our Royalty-Free policy would be a great step forward, but we remain skeptical about the likelihood of such a thing happening. In the meantime, the HTML5 specification provides a nice fallback mechanism.
So, if there is new information on this subject, we could reopen ISSUE 7.
ISSUE-9: how accessibility works for <video> is unclear
That issue is still ongoing and it's still unclear how accessibility works for the video element. ISSUE-9 covers a wide scope, including providing more controls over multi-track media, access to media cues, captioning/video description support, etc. The HTML accessibility task force is still struggling to come up with a definite set of requirements (mainly due to lack of resources). The editor of the specification is working on a proposal. Somehow, we'll need to match them up together. We should get some good results within the upcoming weeks and extend or change the video element as necessary.
ISSUE-10: how similar should SMIL and <video> attribute names be?
This issue has been closed for lack of a change proposal. There are certainly disagreements on the use cases in this area.
Implementations are moving along in the meantime. After Firefox, Google Chrome, and Safari, Opera recently announced support for video in their products. The IE team is working on theirs as well. If you're wondering how well your browser supports HTML5 Video, check out HTML5 Video, Media Events and Media Properties. It is far from a thorough test of the implementations, but it gives you an idea of the support.
The very recent announcements from Microsoft (“The Future of the Web is HTML5”), Apple (“We are betting big on HTML5”) and Google (“New open standards created in the mobile era, such as HTML5, will win on mobile devices (and PCs too)”) prove that, while HTML5 is still a work in progress, W3C is increasing implementation experience and building community support.
W3C organized an HTML5 camp in the W3C track @ WWW2010. We started with a contextual presentation about HTML5 and a whole range of other W3C technologies which contribute to the ever-expanding Web platform (including CSS3, SVG, and Canvas). Cool slides demonstrated what can already be achieved in most browsers these days. In particular, do not miss the "Rough View of the Future", the funny "Memory Game", the "XHTML5/SVG Video Player", and "Beyond HTML5".
To reinforce how simple and easy it is to use Web standards, Philippe Le Hégaret did a live coding session of an SVG/HTML5 video player. Philippe pointed out that, while a lot of work remains on the HTML5 specification inside the HTML Working Group, things are moving forward.
Doug Schepers also drew "wow!" and "ahhh!" with his presentation on The Graphical Web, where he demonstrated the features and differences of SVG and Canvas. These graphics technologies complement each other well, and both technologies are enjoying broad support across browsers.
HTML5 is getting a lot of coverage nowadays and it's certainly an exciting time to work in Web standards. We look forward to providing even cooler demos in the months and years ahead!

For those of you who will be in Paris, France on April 7, Daniel Glazman, Dominique Hazaël-Massieux and I will be giving presentations about the next Open Web platform and Web applications (incl. widgets). So, if you weren't able to attend our meetup in Boston, you'll get another chance in Paris.
This event is public and free so everyone is welcome to come, but please register on the event registration page. We'll start at 7pm and stop when the coffee and other beverages of your choice run out.
Note: C'est complet! (It's full!) Attendance reached the capacity of the room within 24 hours.
We're looking forward to seeing you there and, again, have fun with Web technologies.
Today at the W3C Advisory Committee meeting, we discussed the document license for HTML 5. We discussed use cases from the HTML Working Group that call for a more open license than the current W3C Document License.
The result of discussion among the Membership is that there is strong support for:
In short, there is strong support in the Membership (but not unanimity) for all of the use cases cited by the HTML Working Group except forking the specification. Several W3C Members do feel strongly that the document license should allow forking, however.
People at the meeting agreed that, in any case, copyright is not likely to prevent fragmentation. Several points were made:
We have work to do to find the right license to meet the stated goals: to make it easy for people to reuse W3C specifications in almost all of the scenarios people have expressed are important to them.
We plan to work with the community on the details as we move forward. More information can be found in my slides from the meeting. We welcome your feedback.
For those of you who will be in Cambridge, MA on March 27, a few of us will be giving several presentations around HTML 5, CSS 3, and SVG in the morning. We'll have a hands-on session in the afternoon.
This event is free and organized as part of the Boston Web Design Meetup group. Please don't forget to register as soon as you can.
We're looking forward to seeing you there. Have fun with Web technologies.
At this year's 19th International World Wide Web Conference (WWW2010 - Raleigh, NC, USA), W3C will organize two "camps": the "HTML 5 camp" and the "Linked Open Data (LOD) camp" (29 and 30 April 2010). The "camp" format of the W3C Track, first adopted in Madrid in 2009, received positive feedback and so we are continuing, and improving, that format.
Each camp is one day. The morning sessions will feature talks from experts and surprise guests. In the afternoon of each session, the participants themselves will choose the topics they wish to discuss during the breakouts.
Wikis are available to submit topics of discussion in advance. There is one wiki per camp: LODCampW3CTrack and HTML5campW3CTrack. Apart from technical topics, we are also looking for lightning talks (very short presentations) proposals: anything from announcements, forward thinking ideas, controversial statements, observations, short demos, etc. Note too that the breakout session outcomes will be recorded in the same wikis.
So, if you're planning to attend the WWW conference, please register through the WWW2010 online registration system and let's W3Ccamp there!
Yesterday, as part of the W3C Technical Plenary day, I got the opportunity to introduce a new tool that I had been working on over the past few weeks, the W3C Cheatsheet for Web developers.
This cheatsheet aims to provide, in a very compact and mobile-friendly format, a compilation of useful knowledge extracted from W3C specifications (at this time, CSS, HTML, SVG and XPath), complemented by summaries of guidelines developed at W3C, in particular the WCAG2 accessibility guidelines, the Mobile Web Best Practices, and a number of internationalization tips.
Its main feature is a lookup search box, where one can start typing a keyword and get a list of matching properties/elements/attributes/functions in the above-mentioned specifications, and further details on those when selecting the one of interest.
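As a rough illustration of this kind of type-ahead lookup, here is a minimal Python sketch. It is not the cheatsheet's actual code: the sample entries, the `lookup` function name, and the choice of a sorted list with binary search are all assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical sample data; the real cheatsheet extracts its entries
# from the W3C specifications themselves.
ENTRIES = sorted([
    ("background-color", "CSS property"),
    ("border", "CSS property"),
    ("blockquote", "HTML element"),
    ("body", "HTML element"),
    ("circle", "SVG element"),
    ("count", "XPath function"),
])


def lookup(prefix, entries=ENTRIES, limit=10):
    """Return up to `limit` (keyword, kind) pairs whose keyword starts with `prefix`."""
    prefix = prefix.lower()
    keys = [keyword for keyword, _ in entries]
    # Binary search finds the first keyword that is >= the typed prefix.
    start = bisect_left(keys, prefix)
    matches = []
    for keyword, kind in entries[start:]:
        if not keyword.startswith(prefix):
            break  # entries are sorted, so no later keyword can match
        matches.append((keyword, kind))
        if len(matches) == limit:
            break
    return matches


if __name__ == "__main__":
    for keyword, kind in lookup("bo"):
        print(keyword, "-", kind)
```

Keeping the entries sorted means each keystroke only needs a binary search plus a short scan, which suits a small, mostly static data set like this one.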
The early feedback received both from TPAC participants after the demo and from the microblogging community has been really positive and makes me optimistic that this tool is filling a useful role.
This is very much a first release, and there are many aspects that will likely need improvements over time, in particular:
The code behind the cheatsheet is already publicly available, and I’m hoping others will be interested in joining me in developing this tool. I’m fully aware that the first thing needed to get others involved is some documentation on the architecture and data formats used in the cheatsheet, and I’m thus hoping to work on that in the upcoming few weeks.
In the meantime, I very much welcome bug reports and suggestions for improvements, either by private email to me (dom@w3.org) or preferably to the publicly archived mailing list public-qa-dev@w3.org.
The idea started with the fact that we have a number of Working Groups who are trying to review the way they do testing, and also to increase the number of tests they run.
The CSS Working Group was foremost in mind when it comes to testing. The Group has several documents in Candidate Recommendation stage that are awaiting tests and testing. The HTML Working Group is starting to look into testing as well, and testing is a key component of ensuring the success of HTML 5. The specification is quite big to say the least and, when it comes to testing, it's going to require a lot of work. We also have more and more APIs within the Web Apps group, Device API, Geolocation, etc. The SVG Working Group has a test suite for 1.2, but they're looking at different ways of testing as well. The MWI Test Suites framework allows two methods. One requires a human to look at the test and select pass/fail. The other one is more suitable for script tests, i.e. API testing.
A bunch of us, namely Mike Smith, Fantasai, Jonathan Watt, Doug Schepers, and myself, decided to get together to discuss this and figure out how to improve the situation. We focused on three axes: test submissions, test reviews and how to run a test.
First, ideally we'd like every single Web author to be able to submit tests, so that when they run into a browser bug based on a specification, it is easy for them to submit a test to W3C. It should also allow browser vendors to submit thousands of tests at once. There is the question of how much metadata to require when submitting a test. For example, we do need to know at some point which feature/part of a spec is being tested. We should also allow as many formats as possible for tests: reftests, mochitests, DOM-only tests, human tests, etc. The important aspect here is to be able to run those tests on as many platforms/browsers as possible. A test format that can only be run on one browser is of no use to us.
Once a test has been submitted, it needs to be reviewed. The basic idea behind improving test reviews is to allow more individuals to contribute. The resources inside W3C aren't enough to review tens of thousands of tests. We need to involve the community at large by doing crowd reviews. It will allow the working groups to focus only on the controversial tests.
Once the tests have been reviewed, we need to run them on the browsers, as many as possible. Human tests, for example, are easy to run on all of them, but they do require a lot of humans. Automatic layout tests are a lot trickier, especially on mobiles. We focused on one method during our gathering: the screenshot-based approach. The basic idea here is that a screenshot of the page is compared to a reference. Mozilla developed a technology called ref-tests that compares Web pages themselves. You write two pages differently that are supposed to have the exact same rendering and compare their screenshots. It avoids a lot of the cross-platform issues one can run into. The way Mozilla is doing that is via the mozPaint API in debug mode. That works well, but only works in Mozilla. You can guess that other browser vendors have a similar way to automatically take screenshots as well. We wanted to find a way to do this with all browsers without forcing them or us to write significant amounts of code. We found a Web site called browsertests.org, got in touch with Sylvain Pasche and, with his help, started to make some improvements to his application. It works well on desktops at least. Once again, we don't think W3C is big enough to replicate all types of browser environments, so we should make it easy for people to run the tests in their browser and report the results back to us. Plenty of testing frameworks have been built already and we should try to leverage them as much as possible.
We started to set up a database for receiving the tests and their results. We'd like to continue the efforts on the server/database side, as well as continuing to improve Sylvain's application, allowing more test methods and formats. Testing the CSS or HTML5 parser should be allowed, for example.
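The screenshot comparison at the heart of a ref-test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: screenshots are modeled as rows of (r, g, b) tuples, and `render` is a hypothetical callable standing in for whatever mechanism a given browser offers to capture a page. It is not Mozilla's implementation nor the browsertests.org code.

```python
def screenshots_match(test_pixels, ref_pixels, tolerance=0):
    """Compare two screenshots given as rows of (r, g, b) tuples.

    Returns True when both images have the same dimensions and every
    color channel differs by at most `tolerance`.
    """
    if len(test_pixels) != len(ref_pixels):
        return False
    for test_row, ref_row in zip(test_pixels, ref_pixels):
        if len(test_row) != len(ref_row):
            return False
        for test_px, ref_px in zip(test_row, ref_row):
            if any(abs(a - b) > tolerance for a, b in zip(test_px, ref_px)):
                return False
    return True


def run_reftest(render, test_url, ref_url, tolerance=0):
    """Render the test page and its reference page, then compare them.

    `render` is a hypothetical callable (URL -> pixel rows) supplied by
    the caller; each browser would need its own way to provide it.
    """
    same = screenshots_match(render(test_url), render(ref_url), tolerance)
    return "pass" if same else "fail"
```

Because the test page and the reference page are rendered by the same browser on the same platform, differences such as fonts or anti-aliasing affect both screenshots equally, which is why the comparison can be exact by default.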
You'll find more information at our unstable server but keep in mind that:
The server also contains links to more resources on the Web related to various testing efforts, as well as a more complete description of what we wish the testing framework to accomplish.
To conclude, I'd like to thank Mike Smith and Doug Schepers, and especially Jonathan Watt and Fantasai from the Mozilla Foundation. They all agreed to argue and code for 8 days around the simple idea of improving the state of testing at W3C. I hope we're going to be able to get this project off the ground in the near future. If you're interested in contributing and have ideas and time, don't hesitate to contact me.
The general principle of platform design is that platforms consist of a set of standard interfaces. Standard interfaces allow substitution of components across the interface boundary, while independence of interfaces allows evolution of the interfaces themselves. In a PC, for example, the disk bus interface allows many different disk vendors to offer disk products independent of the model of display or keyboard, and the orthogonality of interfaces allows each interface to evolve on its own. If the display interface were linked to the disk interface too tightly, it wouldn't be possible to evolve ISA to SATA without updating VGA.
In the web platform, the three important interfaces are transport, format and reference, and the current definitions of those interfaces are HTTP, HTML and URI. The interfaces are standard, allowing many different implementations: the HTTP standard lets you use HTTP servers from many vendors, the HTML standard lets you use many different HTML authoring tools or template systems, and the URI specification allows identification of many different components.
While HTTP is the current "common denominator" protocol that all web agents are expected to speak, the web should continue to work if web content is delivered by other protocols -- FTP, shared file systems, email, instant messaging, and so forth. HTTP as it has evolved has severe difficulties, and designing a Web that only works with HTTP as it is currently implemented and deployed would be unfortunate. We should work harder to reduce the dependencies and isolate them.
HTML is the 'lingua franca', the common language that all agents are currently expected to be able to produce, process, read and interpret (or at least a well-defined subset of it). Having a common language is important for interoperability, but the web should also work for other formats -- extensions to HTML including scripting, DOM APIs, but also other formats and application environments such as XHTML, Java, PDF, Flash, Silverlight, XForms, 3D objects, SVG, other XML languages and so forth. Certainly HTML as it has evolved is overly complex for the purposes for which it is designed.
The URI is the fundamental element of reference, but the URI itself is evolving to deal with internationalization, reference to session state, IRIs, LEIRIs, HREFs and so forth. Many applications use URIs and IRIs, not just the formats described above but other protocols and locations, including databases, directories, messaging, archiving, peer-to-peer sharing and so forth.
The web is just one of many communication applications on the global Internet; for web browsing to integrate well with the rest of distributed networking, web components should be independent of the application, and work well with messaging, instant messaging, news feeds, and so on.
A sign of a breakdown of this architectural principle would be for a specification of a format (say HTML) to attempt to redefine, for its purposes, the protocol (say HTTP) or the method of reference (URI). The specifications should be independent, or at least their dependencies should be isolated, minimized, reduced. If those other elements of the web architecture are incorrect, need to evolve to meet current practice or have flaws in their definitions, they need to evolve independently, so that orthogonality of the specifications and reusability of the components are promoted.
There may well be reasons to link some features of HTML to the fact that it is delivered over an interactive protocol, but linking HTML directly to HTTP in a way that features would work only for HTTP and not for any other protocol with similar features – that would be unfortunate. It might not matter in the short-term (that’s all we have right now) but it is harmful to the long-term evolution of the web.
(Should go without saying, but just in case: this is a personal post, not reviewed by the TAG)
Watching the Google I/O first day keynote, I'm pleased to see the level of support and interest from Google about HTML5. Sure enough, I wish SVG had been mentioned there, as the Canvas API was, since I believe both technologies have relevant use cases. As an example, I made a demo of the HTML5 video element using SVG for the player interface. But overall, we do indeed need to tell the world that HTML is evolving to become the platform for a rich array of Web applications. New Web browser features aren't just limited to new user chrome or extensions.
I did notice, however, several mentions of the "HTML5 standard" that led me to write this post to remind the community of the current status of the specification, both in practice and on the standards track. HTML5 isn't a W3C standard. We certainly look forward to the day when it is, but it isn't yet. In fact, the specification, co-authored by Ian Hickson from Google, is still very much a work in progress. We still don't have a required video codec to be supported by all browsers. Lively discussion is still happening in the HTML Working Group about the level of consensus around the spec. Sam Ruby of IBM and Chris Wilson of Microsoft are trying to move the Group forward. At the moment, HTML5 is only a working draft and Ian hopes to get it ready for Last Call review in the October/November 2009 timeframe. Some of the work is also happening in the Geolocation, CSS and Web Applications Working Groups, so not all of it is under "HTML5".
So, while it is great to see support for and implementation of HTML 5, the community has not yet reached agreement enough to call it a standard, and it has not been implemented consistently across multiple browsers. Building a test suite will help a lot and we don't have one yet. This is an area that we intend to explore and to seek community support.
Structured data on the web got a boost this week, with Google's announcement of Rich Snippets and Rich Snippets in Custom Search. Structured data at such a large scale raises at least three issues:
Google's documentation shows support for both microformats and RDFa. It follows the hReview microformat syntax with small vocabulary changes (name vs fn). Support for RDFa syntax, in theory, means support for vocabularies that anyone makes; but in practice, Google is starting with a clean slate: data-vocabulary.org. That's a place to start, though it doesn't provide synergy with anyone who uses FOAF or Dublin Core or the like to share their data.
The policy questions are perhaps the most difficult. Structured data is a pointy instrument; if anyone can say anything about anything, surely the system will be gamed and defrauded. Google's rollout is one step at a time, starting with some trusted sites and an application process to get your site added. The O'Reilly interview with Guha and Hansson is an interesting look at where they hope to go after this first step; if you're curious about how this fits in to HTML standards, see Sam Ruby's microdata.
While issues remain--there are syntactic i's to dot and t's to cross and even larger policy issues to work out--between Google's rollout and Yahoo's searchmonkey and the UK Central Office of Information rollout, it seems that the industry is ready to take on the challenges of using structured data in search engines.
I had a pretty small data interchange problem the other day: I just wanted to archive some play lists that I had compiled using various music player daemon (mpd) clients. The mpd server stores playlists as simple m3u files, i.e. line-oriented files with a path to the media file on each line. But that's too fragile for archive and interchange purposes. I had a similar problem a while back with iTunes playlists. In that episode, I chose hAudio, an HTML dialect in progress in the microformats community, as my target.
Unfortunately, hAudio changed out from under me between when I started and when I finished. So this time, a simple search found the music ontology and I tried it with RDFa, which lets you use any RDF vocabulary in HTML*. I'm mostly pleased with the results:
- from A Song's Best Friend_ The Very Best Of John Denver [Disc 1] by John Denver: Poems, Prayers And Promises
- from WOW Worship (orange) by Compilations: Did you Feel the Mountains Tremble
- from Family Music Party by Trout Fishing In America: Back When I Could Fly
The album names come before the track names because I didn't read enough of the RDFa primer when I was coding; RDFa includes @rev as well as @rel for reversing subject/object order. See an advogato episode on m3uin.py for details about the code.
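As a rough illustration of the idea (this is not the actual m3uin.py code), here is a minimal Python sketch that turns m3u lines into an HTML list with RDFa attributes, putting the track name first and using @rev to point back at the album. It assumes each path is laid out as Artist/Album/Track.ext, which is purely a hypothetical convention. The choice of Music Ontology terms (mo:Track, mo:Record, mo:track, mo:MusicArtist), the namespace URI, and the function names are likewise assumptions made for the example.

```python
from html import escape
from pathlib import PurePosixPath

# Prefix declarations; the namespace URIs chosen here are assumptions.
PREFIXES = (
    'xmlns:mo="http://purl.org/ontology/mo/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/" '
    'xmlns:foaf="http://xmlns.com/foaf/0.1/"'
)


def parse_m3u(text):
    """Yield (artist, album, title) for each media path in an m3u playlist."""
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and #EXTM3U-style directives.
        if not line or line.startswith("#"):
            continue
        parts = PurePosixPath(line).parts
        if len(parts) < 3:
            continue  # does not fit the assumed Artist/Album/Track layout
        artist, album = parts[-3], parts[-2]
        title = PurePosixPath(parts[-1]).stem  # drop the file extension
        yield artist, album, title


def to_rdfa(text):
    """Render the playlist as an ordered list carrying RDFa attributes."""
    items = []
    for artist, album, title in parse_m3u(text):
        items.append(
            '<li typeof="mo:Track">'
            f'<span property="dc:title">{escape(title)}</span> from '
            # @rev reverses the direction: the record has this track.
            '<span rev="mo:track"><span typeof="mo:Record">'
            f'<span property="dc:title">{escape(album)}</span></span></span> by '
            '<span rel="foaf:maker"><span typeof="mo:MusicArtist">'
            f'<span property="foaf:name">{escape(artist)}</span></span></span>'
            '</li>'
        )
    return f'<ol {PREFIXES}>\n' + "\n".join(items) + "\n</ol>"
```

Calling to_rdfa() on the text of a playlist file returns the markup as a string. Paths that do not match the assumed three-level layout are silently skipped, which a real archiving tool would probably want to report instead.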
The Music Ontology was developed by a handful of people who staked out a claim in URI space (http://musicontology.org/...) and happily took comments from as big a review community as they could manage, but they had no obligation to get a really global consensus. The microformats process is intended to reach a global consensus so that staking out a claim in URI space is superfluous; it works well given certain initial conditions about how common the problem is and availability of pre-web designs to draw from. Perhaps playlists (and media syndication, as hAudio seems to be expanding in scope to hMedia) will eventually reach these conditions, but the music ontology already meets my needs, since I'm the sort who doesn't mind declaring my data vocabulary with URIs.
My view of Web architecture is shaped by episodes such as this one. While giga-scale deployment is always impressive and definitely something we should design for, small scale deployment is just as important. The Web spread, initially, not because of global phenomena such as Wikipedia and Facebook but because you didn't need your manager's permission to try it out; you didn't even need a domain name; you could just run it on your LAN or even on just one machine with no server at all.
In an Oct 2008 tech plenary session on web architecture, Henri Sivonen said:
I see the Web as the public Web that people can access. The resources you can navigate publicly. I define Web as the information space accessible to the public via a browser.
If a mobile operator operates behind walls, this is not part of the Web.
I can't say that I agree with that perspective. I'm no great fan of walled gardens either, but freedom means freedom to do things we don't like as well as freedom to do things we do like. And architecture and policy should have a sort of church-and-state separation between them.
Plus, data interchange happens not just at planetary scale, but also within mobile devices, across devices, and across communities and enterprises of all shapes and sizes.
I've gone a little outside the scope of current standards; RDFa has only been specified for use in modular XHTML, with the application/xhtml+xml media type, so far.
See also:
As part of my introduction domain presentation to the Advisory Committee, I wanted to show what it means to work on several user interface technologies. So, I stuffed one slide with many technologies: HTML, CSS, SVG, MathML, Scripting, DFXP, Ruby, and RDFa. It's using well established technologies (like HTML buttons) and some very advanced ones (like CSS transforms or DFXP). I did go crazy on the CSS transforms and might win the award of the ugliest demo of the year as a result though.
Sam asked me to look at the results of running the demonstration through the HTML5 validator. It doesn't pass it and that's intentional. I'm not sure why HTML5 excludes the complex constructions of Ruby or why I can't use unit lengths in width or height. Boolean attributes in HTML5 can't use the values "true" or "false" and I can't get myself to accept that fact. There are probably ugly stories around those. I needed a way to link to external captions. And there is RDFa.
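For readers wondering what the boolean attribute rule amounts to, here is a minimal Python sketch of it as I understand the HTML5 draft: when a boolean attribute is present, its value must be empty or match the attribute's own name, ignoring case. The function name is hypothetical and this is not how any validator is actually implemented.

```python
def is_valid_boolean_attribute(name, value):
    """Check a boolean attribute value against the HTML5 rule.

    `value` is None when the attribute is written without a value,
    as in <input disabled>.
    """
    if value is None or value == "":
        return True
    # Otherwise the value must match the attribute name, ignoring case.
    return value.lower() == name.lower()
```

So disabled, disabled="" and disabled="disabled" all pass, while disabled="true" and disabled="false" do not.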
The main point of the demonstration is to see those technologies working and interacting together. It has been a hard road to get where we are today and there is still so much work to do, but let's not forget that it's fun to see those things working.
I got pretty excited about the iPhone, and even more about the openness of Android and the G1, and then I learn that the Palm Pre developer platform is basically just the open web platform: HTML, CSS, and JavaScript.
Just after the mobile buzz at Web Directions North and the TAG declared victory on how to build The Self-Describing Web with URI-based Extensibility, I get some details on how Palm is building on the open web platform:
A widget is declared within your HTML as an empty div with an x-mojo-element attribute.
<div x-mojo-element="ToggleButton" id="my-toggle"></div>
Oh great; x- tokens... aren't those passe by now?
The suggestion in the HTML 5 draft is data-* attributes. The ARIA draft suggests @role. The Palm design looks like new information for issue-41, Decentralized-extensibility, in the HTML WG.
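To make the comparison concrete, here is a small Python sketch that finds widget declarations of the kind shown above and shows what the same markup might look like with a data-* attribute. The name data-mojo-element is made up for illustration; nothing suggests Palm uses or plans it. The rewrite is a naive text substitution, not a proper HTML transformation.

```python
from html.parser import HTMLParser


class WidgetFinder(HTMLParser):
    """Collect (tag, id, widget type) for elements with x-mojo-element."""

    def __init__(self):
        super().__init__()
        self.widgets = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "x-mojo-element" in attributes:
            self.widgets.append(
                (tag, attributes.get("id"), attributes["x-mojo-element"])
            )


def find_widgets(markup):
    finder = WidgetFinder()
    finder.feed(markup)
    finder.close()
    return finder.widgets


def as_data_attribute(markup):
    # Hypothetical data-* spelling of the same declaration.
    return markup.replace("x-mojo-element=", "data-mojo-element=")
```

Feeding the div above to find_widgets() yields one entry for the ToggleButton widget with id my-toggle, and as_data_attribute() returns the same div with the attribute renamed.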
Anybody know how frozen the Palm design is? Or if they looked at ARIA, data-* or URI-based namespaces?
I learned HTML at a time when some people were still building several versions of their site. I'm not talking about the web, mobile and iphone versions – more like the netscape and IE3 versions. That was a time when writing “standard” HTML was still a fairly novel idea, but a powerful one. It made sense: the alternative was “write standard code or risk having browsers crash miserably on your web page”.
That was more than a decade ago. Browsers, meanwhile, have made incredible progress at gracefully rendering even the most broken web page. And that is a good thing.
Does this make validation and quality checking of Web pages moot? Of course not. There are many more incentives to build great standard-compliant websites: ease of maintenance, show of professionalism, or, in the words of Zeldman, “Client who saves $5,000 buying cut-rate non-semantic HTML will later spend $25,000 on SEO consultant to compensate.”
It makes me curious, however, to know what the real-life arguments in favor of valid, standard code are today. Do you have an untold story of validation ridding you of an awful rendering glitch? Real-life accounts of a search engine bump achieved by fixing the syntax of your HTML <head>? A typo in a CSS stylesheet that hours of glancing at code didn't show, but the validator did? A forgotten alt that would have lowered your search rank for an important keyword, or cost a big fee for non-accessibility?
Use the comments below to share and discuss your experience - we'll update our outdated “Why Validate?” doc with the best examples.
Sorry. You must have JavaScript enabled to view this page. Click the BACK button below or enable JavaScript in your browser preferences and click TRY AGAIN.
Let's turn that around, shall we? Sorry, if you're a network provider and you want my business, read up on unobtrusive JavaScript (aka the rule of least power), go BACK to work on your web site design and TRY AGAIN.
Yesterday, W3C launched a new donation and sponsorship program offering Web-people and Organizations a chance to show their support for Web Standards and Open source: the Validator Donation Program.
The Validators have been around for almost 15 years. From day one, they have been free, open source and… operating on a shoestring. This has been a beautiful adventure: these tools are used by millions every day, a lot of people feel very strongly about validation, and we are lucky to have a great community of developers, translators and “power users” surrounding and helping the project.
Because it makes a lot of sense. We are a large community using and loving these tools, and this program gives everyone a chance to give a little bit of thanks to a project we care about. Of course, all the validator projects are open source and there are many other ways one can contribute: help others, translate, find bugs, help document, and of course, code. But not everyone has the right time or skills to do all that, and if someone wants to contribute a bit of money for a project they love, why not?
Another reason is that we really can use that money. Projects like the validators cost a lot to run and develop. The validators are all available for download for those who want to use them on their own network, but the free services at w3.org are obviously the place where most people go, resulting in millions of validations a day and fairly massive operating costs such as servers and bandwidth.
The main cost, however, is elsewhere: staffing. We often subscribe to the myth that open source software is developed for free by armies of benevolent coders. That's quite false, especially for the validators:
Here is a short story. In the few years I worked on validators, there was a much dreaded regular episode. Every few months, Tim Berners-Lee (W3C's creator and visionary-in-chief) would go on, either at conferences or in staff meetings, about what, in his mind, our validators should be. They should be really smart. They should be really flexible. They should be incredibly useful to use. They should look great and make it really easy to fix the web. All these years, every few months, I would cringe and reply “that's a grand plan, Tim, but how do we do it without a real budget?”.
For these past years the budget given to validators would be, roughly, the equivalent of one full-time staff member, maybe one and a half. That certainly is enough to keep a service running, but it will take much more effort to bring the family of tools to maturity and push them to a new level of functionality and usability.
The validators certainly are one of the public faces of the W3C, but they are only part of all the work done to fulfill the mission of the Consortium: build specifications (standards) to provide the web with a robust, effective, flexible and powerful architecture. Creating those specifications involves an incredible amount of work put into the extremely important task of building consensus among all the actors with a stake in those new or updated Web Standards.
That consensus building and specification writing is, I think, the core work of W3C. Test suites, tools and tutorials are another important part of the W3C work, obviously. But with the W3C's limited budget, allocating more money for validators is not that easy: by funding the validator work through donations and sponsorships, and by using that money exclusively for the validators and related open source tools, we help the Web community “put the money where its mouth is”.
All the above is why this new donation program is exciting. It's not just about paying for bandwidth, or keeping the services running…
The real question is: Do we want a more flexible, usable, friendly HTML Validator, or do we want to keep the one that we have as is? Do we want to support more types of document? Do we want to provide better support for XML? Do we want to build a real validator for SVG? Do we want to support the development of new technologies such as HTML5, or merely follow once said technologies have reached standard status? Do we want to keep a CSS validator that mostly does CSS 2.1, or do we want anyone following the advances of CSS3 to check their code? Do we want the CSS validator to check only for syntax errors, or also give information as to which style constructs are widely – or not – supported in browsers?
With your donations and sponsorships, we can finally do all that. We can do great things. We can:
… and many more things we haven't thought about yet, and which we as a community will dream and decide.
We launched the donation program 24 hours ago. What a ride! The buzz has been very exciting so far, with blogs such as Daniel's, Molly's, John's and many more carrying the news to the Web community, and showing that the community cares.
I can't resist sharing with you a few comments attached to a few of the first donations we've received:
I use it every day and I love it! thx for the work
Keep up the great work!
You've supported us, now we support you. Thanks for all your great work, we need you!
Many thanks to those who have donated so far, and thanks for the kind words! For the future of the Validators and the future of the Web, we really need to make this campaign a great success, together. So Donate, tell the world, tell your blog, ask your company to become a sponsor!