Look Who’s Using Wikipedia…

March 1, 2007 – 9:17 am | by Beni | 119 views

The Wikipedia home page

Poor Wikipedia. Professional golfer Fuzzy Zoeller is suing one of its contributors for a defamatory cyber-attack. And last year, television host and comedian Stephen Colbert urged his audience to vandalize a Wikipedia entry about elephants to prove the point that in a model where any user can edit encyclopedia entries, those entries are only as good as their source. Take the case of retired journalist John Seigenthaler, a former assistant to Attorney General Robert Kennedy, who in 2005 was wrongfully accused by an anonymous Wikipedia contributor of involvement in the assassinations of Robert and John Kennedy. Given the controversy swirling around Wikipedia, the history department at Middlebury College has banned its use as a research source. When did the online form of the dust-covered encyclopedia become such a magnet for drama?

Academics are split on the usefulness of Wikipedia, which bills itself as “the free encyclopedia that anyone can edit.” The sheer volume of content (Wikipedia claims over 5.3 million entries, 1.6 million of them in English) is partly responsible for the site’s dominance as an online reference. Compared with the top 3,200 educational reference sites in the U.S., Wikipedia is #1, capturing 24.3% of all visits to the category, according to Hitwise data. But as the recent drama illustrates, a body of online knowledge built by an army of 75,000 anonymous volunteer contributors and editors is prone to anything from simple benign errors to outright information vandalism.

Search and Internet behavior data provide alarming insight into this powerful but volatile resource. It is alarming because one of the core groups of Wikipedia users is schoolchildren.

Determining the extent to which students rely on Wikipedia requires some data detective work. The search terms that users enter to navigate to the site are the most revealing. Along with searches for various anime cartoons, sex topics and information on the most recently shorn, exposed or departed celebrities, the majority of top terms bear a close resemblance to elementary school homework and research projects. During the month of February, which is also Black History Month, three of the top 20 terms sending traffic to Wikipedia were for prominent black historical figures, while two other searches were likely motivated by Presidents’ Day. In fact, changing the time frame to any other month during the school year reveals a similar result. (Source: Hitwise)

Along with the impressive growth in visits to the site, 680% in two years, charting those visits over time confirms student activity. Over the last three years of growth, traffic dipped during the summer months and the weeks of spring break and winter vacation.

One of the reasons for Wikipedia’s stellar growth rate in visits is all the traffic it receives from search engines, over 64% last week. In fact, due to Google’s algorithm for displaying search results and the abundance of links in any given entry, Wikipedia has become the #1 external site visited after Google’s search page.

As students begin their online research, they could view the prevalence of Wikipedia references in Google as proof of the accuracy and reliability of the source. Given the search exposure and sheer volume of data available on the site, they might fall into the trap of relying on a single source for their education. Hopefully their research projects won’t involve elephants or professional golfers.

By BILL TANCER
Bill Tancer is general manager of global research at Hitwise.


Improving the CSS 2.1 strict parser for IE 7

October 9, 2005 – 1:18 pm | by Beni | 1,196 views

We’ve already started talking about a few of the CSS changes that are going to be available in IE 7 when we release, but there are a few hanging points that we haven’t talked about yet or haven’t covered completely. There are 3 specific items I’d like to talk about:

  • Using the root node wild card selector for IE only rules (* HTML) [strict mode only fix]
  • Multi-class selectors as defined by CSS 2.1 (.floral.pastel) [strict mode only fix]
  • Pseudo-element parsing sometimes flags rules as invalid (P:first-letter{ color: red; }) [strict/quirks mode fix]

Root node selection:
To be very clear, the root node selector was a bug. It was introduced by Chris Wilson back in IE 4, which is why we don’t let him work with the code anymore.

The root node selector has long been used to create rules that only work in IE. The general pattern would be to write rules that would match in all browsers first. Then you would use the child selector logic ( > ) to create more specific rules that would match only in browsers that supported these selectors. Prior to IE 7 we didn’t support these selectors, so they were transparent to us. Finally, you would apply rules that would only match in IE by using the * HTML pattern to match the root node. Since no other browser supports or contains a node in the DOM located above the HTML node, these patterns wouldn’t match in those browsers.
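The three layers described above can be sketched as follows. This is a minimal illustration rather than a rule set from a real site: the #sidebar id and the width values are hypothetical.

```css
/* 1. Baseline rule: matched by every browser. */
#sidebar { width: 200px; }

/* 2. Child selector rule: skipped by IE 6 and earlier, which do not
      support the > combinator; matched by browsers that do. */
html > body #sidebar { width: 220px; }

/* 3. IE-only rule: "* html" asks for an element above the root html
      element, so it only matches where the root node bug exists. */
* html #sidebar { width: 180px; }
```

In IE 6 and earlier, rules 1 and 3 match and rule 3 wins on specificity (it adds a type selector to the id). In browsers that support child selectors but have no node above html, rules 1 and 2 match and rule 2 wins.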

So what happens when we start to match child selectors in IE 7? Well, it makes a big mess, because we now match more rules that were never meant to be used from within IE, and those are then merged with the IE only rules. Because of the merge, styles that were never meant to be combined end up changing the layout of the page away from what the author originally intended when they designed their cross browser matrix and testing patterns.
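A small sketch of how such a merge can go wrong, assuming a browser that matches both the child selector and the * html pattern. The .nav class and the offsets are hypothetical and chosen only to make the effect visible.

```css
/* Written for browsers with child selector support:
   indent the navigation using a margin. */
html > body .nav { margin-left: 20px; }

/* Written for IE only: achieve the same indent using padding. */
* html .nav { padding-left: 20px; }
```

Each rule sets a different property, so neither overrides the other in the cascade. A browser that matches both applies the margin and the padding together, and the navigation ends up indented by 40px instead of the 20px the author intended in either layer.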

The best fix here was to disable the root tag matching logic, because it wasn’t supposed to work according to the standards. We only do this in strict mode, since the new selectors only match in strict mode. As a result, we now use the same sets of rules that other CSS 2.1 compliant browsers would use and ignore the IE only rules, which results in a page that lays out as the author designed it. There are a few issues when our interpretation of the spec differs from other browser implementations, but most of the time these are minor and have easy workarounds.

An article on peachpit by Molly Holzschlag contains further examples detailing how this works.

Multi-class selectors:
When the class selector support was originally written it was drawn up based on the CSS 1 spec which only supported a single class selector in each simple selector. We wound up keeping this behavior even after implementing portions of later versions of CSS. The end result is that we always threw out the extra classes in the selector and only kept the last one in the list and matched based on that.

Well, some sites use multi-class selectors, so when we looked at doing CSS 2.1 selectors work it was pretty easy to upgrade our class selectors to allow more than one class to be applied. When in strict mode we now obey all specified class selectors per simple selector. While you won’t often use the feature, it can be used for some interesting applications. One of my original test cases would use combinations of red, yellow, and blue classes to paint elements based on their combined color. A selector such as .red.yellow would paint an element orange if it contained both of these classes in a space separated list within the class attribute. Any element that doesn’t carry at least both of these classes won’t match, so you can apply hierarchical styles more accurately.
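The color mixing test case might look something like the sketch below. The original test file is not reproduced here, so the choice of background-color and the extra green and purple combinations are assumptions made for illustration.

```css
/* Single-class rules: the primary colors. */
.red    { background-color: red; }
.yellow { background-color: yellow; }
.blue   { background-color: blue; }

/* Multi-class rules: match only when the element carries BOTH classes. */
.red.yellow  { background-color: orange; }
.yellow.blue { background-color: green; }
.red.blue    { background-color: purple; }
```

An element with class="red yellow" matches .red, .yellow and .red.yellow; the two-class selector is the most specific, so the element is painted orange. Under the old behavior, where only the last class in the selector was kept, .red.yellow would have been treated as .yellow and would have painted every yellow element orange.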

DHTML Kitchen has a great example of multi-class selectors and the current compatibility problems.

Invalid Pseudo-Element selectors:
We applied a very, very strict interpretation of pseudo-elements in our parser, and this would cause certain constructs to get thrown out. Basically we asserted that any pseudo-element had to be the very last thing in the selector. The spec only really mentions that there can be only one pseudo-element per selector and that it must appear in the last simple selector within the selector. Because of our strict interpretation, if we saw any non-whitespace character or token after we had just processed a pseudo-element, we’d throw an error flag into the rule. This gave us the following behavior:

  • Fail - P:first-letter{ color: blue; }
  • Fail - P:first-letter:hover { color: blue; }
  • Succeed - P:first-letter { color: blue; } /* note the space */
  • Succeed - P:hover:first-letter { color: blue; } /* note the ordering */

The parser is much more intelligent about when and how it applies the error flag in IE 7, and the two failures you see above will now succeed. Truly invalid rules will still fail, so you have to be careful not to apply multiple pseudo-elements or to apply pseudo-elements in simple selectors that appear at the beginning of a complex selector.
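As a rough guide to the distinction drawn above, the sketch below pairs rules that should parse in IE 7 with rules that remain invalid. The last two rules are illustrative examples constructed from the description, not cases taken from the original test suite.

```css
/* Parses in IE 7: no space before the brace is no longer an error. */
p:first-letter{ color: blue; }

/* Parses: pseudo-class first, then a single pseudo-element. */
p:hover:first-letter { color: blue; }

/* Still invalid: more than one pseudo-element in the selector. */
p:first-line:first-letter { color: blue; }

/* Still invalid: the pseudo-element is not in the last simple selector. */
p:first-line em { color: blue; }
```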

For a clear example of this issue you can take a look at an article on MaxGeek.com.

Moving Forward:
These small issues that made writing web pages according to spec a trial-and-error situation have been fixed for IE7. By improving the parsing logic it becomes more obvious how your selectors should be written and existing W3C documentation can be used to quickly come up to speed. It should be easy to introduce interesting layouts and formatting in your web pages without having to specify custom rules for each browser and hopefully IE 7 comes one step closer to making that a reality.