Office 2007 no longer measures up to OXML standard, says consultant
By Scott M. Fulton, III | Published April 22, 2008, 2:36 PM
With the myriad changes that had to be made to DIS 29500 before it could be approved by three-fourths of the ISO subcommittee's voters, there was a very high chance that by the time Microsoft saw its offspring once again, it wouldn't recognize it.
As a consultant for conformance testing agency Griffin Brown confirmed last Thursday, indeed, Office 2007 may require an upgrade before it can say it faithfully adheres to an international standard.
Griffin Brown is a strategic consultant and implementer of XML-based systems -- which means, among other things, it helps clients plan for information efficiency through conversion to, or implementation of, XML. But it also evaluates how closely an XML-based implementation adheres to the standard that describes it, and no such standard is currently more important than ISO/IEC 29500, which specifies the Open XML formats.
To become a standard, Open XML had to embrace more existing standards than Microsoft had originally intended for it to. Among the more obvious examples are how it represents dates and times, and how it enumerates colors. To win support from skeptics, Microsoft had to promise that these features, among others, were capable of being changed. In so doing, they enabled alterations to the standard as recognized by the Ecma organization, resulting in a specification of a format to which, for now, Office 2007 doesn't adhere.
There are two models of the current ISO standard: a strict interpretation, and a transitional interpretation that more closely resembles the Ecma standard. The latter is just what its name implies: a way for implementers to adhere to the basics of the standard while moving toward the strict interpretation.
The firm's Alex Brown -- who incidentally was a convener of the ISO's Ballot Resolution Meeting on the matter -- reported that he received the technical specifications for both models from a colleague, expressed in RELAX NG, an OASIS standard XML schema language. He then used a Java-based validator called Jing to determine how many non-conformities there were between Microsoft's Open XML format, specified the same way, and the ISO models.
"The expectation is that existing Office 2007 documents might be some distance away from being valid according to the strict schemas," Brown wrote. "Sure enough, jing emitted 17 MB (around 122,000) of invalidity messages when validating in this scenario."
But most of those messages, Brown found, were about the same thing -- to his surprise. And as he expected, adherence to the transitional model was much closer: 84 messages rather than 122,000-plus, with most of those stemming from the fact that ISO prefers the terms "TRUE" and "FALSE" to denote binary states, rather than Microsoft's occasional "ON" and "OFF."
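To illustrate how a large count of validator messages can collapse into a handful of distinct problems, here is a minimal Python sketch. The message format and the sample messages are invented for illustration and are not Jing's actual output; the function name summarize is likewise hypothetical.

```python
import re
from collections import Counter

def summarize(messages):
    """Group validator messages by kind, ignoring locations and quoted names."""
    kinds = Counter()
    for msg in messages:
        # Drop a leading "file:line:col:" location (assumed, hypothetical format).
        text = re.sub(r'^\S+:\d+:\d+:\s*', '', msg)
        # Mask quoted values so messages that differ only in a name or value
        # are counted as the same kind of problem.
        text = re.sub(r'"[^"]*"', '"?"', text)
        kinds[text] += 1
    return kinds

# Invented sample messages, standing in for real validator output.
sample = [
    'doc.xml:10:5: error: bad value "on" for attribute "val"',
    'doc.xml:42:9: error: bad value "off" for attribute "val"',
    'doc.xml:77:3: error: element "foo" not allowed here',
]

summary = summarize(sample)
for kind, count in summary.most_common():
    print(count, kind)
```

Here three messages reduce to two kinds of problem. The same grouping applied to a real log would show how many distinct issues sit behind a raw message total.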
In a blog post yesterday, Microsoft developer Doug Mahugh saw this latter set of results as good news. "To put that second number in perspective, there were 84 total errors in a document of 60,299,969 characters, which works out to about one error in every 700,000 characters or so," Mahugh wrote.
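Mahugh's ratio can be checked with a line or two of arithmetic. The sketch below uses only the two figures he quotes; the variable names are illustrative.

```python
# Figures as quoted by Mahugh.
total_chars = 60_299_969
errors = 84

# Average number of characters per reported error.
chars_per_error = total_chars / errors
print(f"about one error per {chars_per_error:,.0f} characters")
```

The result is roughly 718,000 characters per error, consistent with his "about one error in every 700,000 characters or so."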
"Alex's research is an interesting first step in understanding conformance for IS29500," he continued. "Another interesting step may eventually appear in the form of a test suite, a suggestion from Italy and other countries. The existence of such a test would be useful as more implementations become available."
Brown promised to conduct a similar Jing test with ODF, the first ISO/IEC standard for interchangeable documents, and teased the reader a little bit as to whether it conforms as well to its RELAX NG specification. In a recent conversation shared on his ConsortiumInfo.org blog, Linux Foundation board member and attorney Andrew Updegrove challenged Brown to admit that ODF would undoubtedly be cleaner than OXML, at the very least.
"I'd go with that," Brown conceded. "I think ISO/IEC 26300 (ODF 1.0) can be compared to a neat house built on good foundations which is not finished; 29500 (OOXML) is a baroque cliff-side castle replete with toppling towers, secret passages and ghosts: it is all too finished."
Just do a statistical analysis of how BN shuffles the news to keep anti-MS news on the front page. News about problems with Apple, Open Source, Linux, or Google quickly disappears into the netherworld of BN.
BN constantly brings back old news like this on their front page because they know who really pays their bills. And I dare BN and their creepy writers to release their revenue and the sources it comes from.
Finally, BN doesn't even always present the news about certain companies that might affect their revenue stream. This is not OS bashing but BN bashing.
Score: 0
|"This article is about how MSOOXML is different from OXML after OXML was changed for certification's sake to be more standard..."
The key is changed after certification. MS meets the criteria and then the standard changes so they'll fix this minor problem, end of story. Except for BN.
Score: 0
|With all due respect to the flamethrowers, I have to agree with PC. One flaw can create multiple errors. For example, in security software, you might have one flaw that flags over and over again, inflating the supposed number of errors when in fact there's really one.
My bigger gripe is having to see that creepy little bug called Scott Fulton's face on every article, and that BN shuffles news stories based on what seems to be MS bashing even if the articles are not entirely true.
Now as to fixing the problem, MS will, but not even Apple reacts that quickly. Why the standard is "True/False" instead of "On/Off" is a minor point and not worthy of even a sentence on BN unless BN has an agenda.
And using M$ is just plain childish.
Score: 0
|First off.. yes the title was overly inflammatory. Second, from story to story in the comments, I see people calling Scott either Ballmer's bed buddy or a Microsoft basher. He's a reporter relaying the facts as given to him; did you guys miss the quotation marks in the story above? I think the story showed that Microsoft, after throwing its weight around, compromised on the standard and now they have to fix things to fill the gap created by those compromises. It seems like this story turned out better than when Microsoft has "used standards" in the past (DNS, Kerberos, LDAP...).
Score: 0
|So, like i have been saying, no one implements M$OOXML, not even M$. There was a thread the other day where i said this and PC_Troll and the other M$ employees on here said i did not know what i was talking about. LOL. Can't wait to hear the spin on this.
"But the M$ employee who led the bribing, i mean committee stacking, says 'most' of the errors are the same" lol. I can hear it now.
Score: 0
|Wrong, wrong, wrong...
Office 2k7 IS MSOOXML.
The standard is OXML.
As such, they're not the same thing. This article is about how MSOOXML is different from OXML after OXML was changed for certification's sake to be more standard...
But then again, you troll merrily along, don't you? :)
Score: 0
|"PC_Troll and the other M$ employees on here said i did not know what i was talking about."
...and what? you just posted here to let everyone know we were right?
Did you even *read* the article? Do you even *know* the difference between OXML and MSOOXML?
Of course not, you're a dumba** troll.
Score: 0
|An implementation doesn't conform to a standard that didn't exist when it was created. How surprising! Especially when Microsoft has said they'll update Office to conform to that specification.
Score: 0
|What if someone spent millions getting a proprietary product specification approved as a standard and no one used it. Is it still a standard?
*I guess hieroglyphics was once a standard, too.* Heckuva job, Microsoft!
Score: 0
|To clarify one item.
Of the "84 errors" that were found in the document of 60,299,969 characters, all were of the same exact type. They arose because, after the spec was initially released a year and a half ago, the name of the toggle for one field was changed from "on/off" to "true/false."
There is really no story here.
However, the usual cast of characters is misrepresenting this for all it is worth. At least betanews used a factually correct title for its story. The others are screaming "Office 2007 doesn't conform to OOXML spec!!!!"
Score: 0
|Scott is just repeating what The Register said
http://www.theregister.c...ce_2007_oxml_fails_test/
and the comments that followed
http://forums.fark.com/c...ments.pl?IDLink=3555956
Fark, if you've never been on, is a more relaxed version of Slashdot in the tech section.
Less m$ on there, but it still has the same people not reading the article and knowing nothing about the format as it is now.
Score: 0
|lmao...
But most of those messages, Brown found, were mainly about the same thing -- to his surprise. And also as he expected, adherence with the transitional model was much closer: 84 rather than 122,000 +, with most of those having to do with the fact that ISO prefers the terms "TRUE" and "FALSE" to imply binary states, rather than Microsoft's occasional "ON" and "OFF."
Frankly, Scott, considering the title, I'm amazed you put those little tidbits in there, although kudos on the "no longer" bit.
OXML has evolved since the release of Office 2007. Of course the Office 2007 released prior to submission and resultant modifications will be non-conforming.
Office 2007 SP2 will once again match the ISO spec upon release, if its near spot-on conformity to the transitional spec is any indication.
...not that this will at all hinder Zaine, sjc001, and El Dingo from making absurd accusations and proclaiming this invalidates the entire standard (ignoring the fact that ODF does not have a conforming implementation either).
Score: 0
|Oh come on... It's still funny after all the OXML circus :-)
Score: 0
|or from BN inflaming the situation with one of their patented (I think they are trying for a patent) misleading headlines.
Score: 0
|The 84 true/false errors are something that should have been patched over a weekend.
The 122 thousand errors that need to be fixed for a strict implementation are what is significant here.
It's good that there is a compliance test. Historically Microsoft does not worry much about standards compliance unless someone else is keeping score.
Score: 0
|Errors!=bugs. I'm sure you knew that, right?
They will get an error each time a conflict is found, no matter how many times it runs into the same conflict.
Technically, you could get 122,000 error flags from one conflict, so that number is actually pretty meaningless. The *actual* number of conflicts would have had much more meaning...but been far less controversial.
"The 84 true/false errors are something that should have been patched over a weekend."
Yeah, best get to it now instead of waiting for something a bit more final than "transitional"...
/sarcasm
Score: 0