Is This a Security Issue?

Sunday, March 18th, 2007

More interesting results from yesterday’s experiments with dumping some markup in the title of a post and seeing what breaks. I noticed the markup made its way into the WordPress Admin section. Is that just because the markup I used (strong and span tags) was relatively innocuous or is there a potentially deeper problem? Let’s find out.
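
To make the worry concrete, here is a minimal sketch in Python (not WordPress's actual PHP, and the function name `render_title_cell` is made up for illustration). It assumes an admin page that drops a post title into an HTML table cell. If the title is written out raw, any tag in it becomes live markup, including a script element. Escaping at output time turns it into inert text.

```python
from html import escape


def render_title_cell(title):
    """Return a table cell with the title shown as text, not as markup.

    Hypothetical helper: it stands in for whatever code writes the
    post title into an admin page.
    """
    # escape() replaces &, <, > and quotes with character references,
    # so a tag inside the title cannot be parsed as an element.
    return "<td>" + escape(title) + "</td>"


harmless = render_title_cell("A <strong>Strong</strong> Test")
hostile = render_title_cell("<script>alert(1)</script>")
```

Whether a given WordPress screen escapes titles this way is exactly the open question. The sketch only shows what the safe behavior would look like.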
(more…)

A Strong Test for Markup In Titles & Summaries

Saturday, March 17th, 2007

I’ve been hacking on Benjamin Smedberg’s Atom 1.0 plug-in for WordPress. I’ve added a preference panel for choosing between full text and summary feeds. Now I’ve fixed the double escaping of content in titles and summaries. (Escaped HTML is evil and should never have been allowed into Atom.)

However, I’m not sure how my hack will react when posts contain markup in titles and summaries, so I’m playing with that now. Hence this post. I may delete it once I’m convinced I’ve covered the various special cases well enough.

Things may look a little funny in the feed until I’m done since I’ll be deliberately breaking things to see how WordPress behaves.
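
For anyone who hasn't run into double escaping, here is a small illustration in Python (the plug-in itself is PHP, and this is not its code). The sample title is an assumption: this post's title with strong tags around one word. A feed consumer undoes one level of escaping, so content that was escaped twice arrives with the entity references still visible.

```python
from html import escape, unescape

# Assumed sample title containing markup and an ampersand.
title = "A <strong>Strong</strong> Test for Markup In Titles & Summaries"

# Escaping once is what an escaped-HTML text construct expects.
escaped_once = escape(title, quote=False)

# Escaping the already-escaped string is the bug: & becomes &amp; again.
escaped_twice = escape(escaped_once, quote=False)

# A consumer unescapes exactly once.
seen_when_escaped_once = unescape(escaped_once)
seen_when_escaped_twice = unescape(escaped_twice)
```

With a single escape the consumer recovers the original title. With a double escape it recovers the once-escaped string, so the reader sees literal `&lt;strong&gt;` in the title.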
(more…)

Speeding Up This Site

Friday, March 16th, 2007

I know this site is more than a little slow on occasion. I also know that the static site www.xom.nu which is hosted on the exact same server runs like a bat out of hell, so it’s likely not the server hardware (Mac Mini) or network connection (Speakeasy DSL) that’s at fault. The remaining candidates are:

  • PHP (Very likely)
  • WordPress (Maybe, but unlikely except in so far as it’s written in PHP)
  • Traffic volume (especially comment spammers)
  • MySQL (Possible, but I tend to doubt it.)

I’ve got a lot of suggestions for improving performance, and I plan to start trying some of them. I don’t, however, have any good measurements of where this server is spending its time. I’d appreciate it if anyone could share knowledge and experience as to how to determine where the server is spending its time, and how to find out what’s making it so slow. Thanks.
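
One crude starting point, sketched here in Python under stated assumptions, is to time the same kind of request against the static site and against a WordPress page on the same box and compare medians. This only measures end-to-end latency from the client. It does not say whether PHP, MySQL, or WordPress is responsible. The helper names `fetch_url` and `median_seconds` are made up for this sketch.

```python
import statistics
import time
import urllib.request


def fetch_url(url):
    """Download a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def median_seconds(fetch, runs=5):
    """Call fetch() several times and return the median wall-clock time.

    `fetch` is any zero-argument callable, so the timing loop can be
    exercised without touching the network.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Example use (not run here; the second URL is a placeholder for a
# WordPress page on the same server):
#   static = median_seconds(lambda: fetch_url("http://www.xom.nu/"))
#   dynamic = median_seconds(lambda: fetch_url("http://example.com/blog/"))
```

A large gap between the two medians would point at the dynamic stack rather than the hardware or the DSL line, which matches the suspicion above but still leaves the split between PHP, WordPress, and MySQL unmeasured.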
(more…)

Best Tools for Checking Web Accessibility

Wednesday, March 14th, 2007

I’m now working on the accessibility chapter of Refactoring HTML. I’d like to mention some automated tools for checking accessibility. The W3C lists a couple of dozen. Which are the best? If you had to pick just two or three, which would you choose?
(more…)

Another Reason Java is Faster than C (maybe)

Monday, March 12th, 2007

Paul S. R. Chisholm points out a new reason virtual machine based languages such as Java may sometimes outperform statically optimized languages such as C:

Portability depends on architecture (for example, x86 vs. PowerPC), but high performance depends on microarchitecture (for example, Pentium M vs. Athlon 64 X2). Today’s Core 2 chips have many high performance features missing from the 1993 original Pentiums. A good compiler like gcc can take advantage of those additional features. This is bad news if you’re using a binary Linux distribution, compiled to a lowest common denominator. It’s good news if you’re building and installing Linux from source, with something like Linux From Scratch or Gentoo/Portage. It’s also good news for just-in-time compilers (think Java, .NET, and Mono); they’re compiling on the “target” machine, so they can generate code tailored for the machine’s exact microarchitecture.

This sounds plausible in theory. What I don’t know is whether Java takes advantage of this in practice. Has anyone looked at the JIT source code lately? Can anyone say whether it makes any microarchitecture-specific optimizations?

Defining Block Level Elements

Wednesday, March 7th, 2007

I know what an HTML block level element is, but I’m damned if I can say it in a concise, correct, obvious way (which it so happens I need to do in Chapter 4 of Refactoring HTML). In HTML, block level elements include p, blockquote, div, table, ul, ol, dl, h1–h6, and a few others. Generally speaking a block element has a line break before and after it, but that’s really only true in a particular visual representation. The notion of line breaks doesn’t make a lot of sense in a screen reader, for example.

The HTML 4.01 specification defines block elements thusly:

Certain HTML elements that may appear in BODY are said to be “block-level” while others are “inline” (also known as “text level”). The distinction is founded on several notions:

  • Content model: Generally, block-level elements may contain inline elements and other block-level elements. Generally, inline elements may contain only data and other inline elements. Inherent in this structural distinction is the idea that block elements create “larger” structures than inline elements.
  • Formatting: By default, block-level elements are formatted differently than inline elements. Generally, block-level elements begin on new lines, inline elements do not. For information about white space, line breaks, and block formatting, please consult the section on text.
  • Directionality: For technical reasons involving the [UNICODE] bidirectional text algorithm, block-level and inline elements differ in how they inherit directionality information. For details, see the section on inheritance of text direction.

That’s not a great definition though. These seem to be consequences of being a block level element rather than defining characteristics.

Can anyone offer a more precise definition of block element that does not presume a particular rendering? Just what is a block anyway?
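
One rendering-independent fallback is definition by enumeration plus the content model. The sketch below, in Python, assumes the block-level set is the `%block;` entity from the HTML 4.01 Strict DTD and uses a deliberately partial list of inline elements whose content model allows only inline content. The names `BLOCK_LEVEL`, `INLINE_ONLY`, `NestingChecker`, and `find_violations` are my own, and the parser expects markup with explicit end tags.

```python
from html.parser import HTMLParser

# Assumed to match %block; in the HTML 4.01 Strict DTD.
BLOCK_LEVEL = frozenset({
    "p", "h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "pre", "dl",
    "div", "noscript", "blockquote", "form", "hr", "table", "fieldset",
    "address",
})

# Partial list of inline elements that may contain only inline content.
INLINE_ONLY = frozenset({
    "em", "strong", "span", "a", "code", "q", "sub", "sup", "b", "i",
    "tt", "big", "small", "cite", "abbr", "acronym", "dfn", "kbd",
    "samp", "var", "bdo", "label",
})

# Elements with no end tag; they are never pushed on the stack.
VOID = frozenset({"hr", "br", "img", "input", "meta", "link", "area",
                  "base", "col", "param"})


class NestingChecker(HTMLParser):
    """Record each block-level element opened directly inside an inline one."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.violations = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_LEVEL and self.stack and self.stack[-1] in INLINE_ONLY:
            self.violations.append((self.stack[-1], tag))
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, if there is one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def find_violations(markup):
    """Return (inline_parent, block_child) pairs found in the markup."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.violations
```

For example, `find_violations("<span><div>oops</div></span>")` reports the pair `("span", "div")`, while a paragraph containing an em element reports nothing. This says which elements are blocks and how they nest without mentioning line breaks, though it is still a list rather than the concise definition I’m after.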