
Saturday, October 1, 2011

Bad Android Typography: A Tale of Text Justification


"From all these expe­ri­ences the most impor­tant thing I have learned is that leg­i­bil­ity and beauty stand close together and that type design, in its restraint, should be only felt but not per­ceived by the reader." —Adrian Frutiger


When speaking of typography, one must consider a multitude of variables that go into the art of arranging type to display language. There is the selection of the typeface, the point size, the line length, the line spacing, the spacing between groups of letters and the space between pairs of letters. The end goal is of course legibility; however, a good designer will seek a visually aesthetic expression of the language through typography. There is an emotive response that follows the sight of visually pleasing typography: it encourages the consumption of the content. Magazines, newspapers and books all have a rich history of beautiful typography.


Figure 1: Bembo text

The Web, being the new canvas for textual content, has been slowly following the lead of its dead tree predecessors. The implementation of the W3C's Web Open Font Format (WOFF) among all modern desktop browsers gives designers limitless possibility for beautiful text design on the web. Where the Web trails behind is in text justification. Full text justification on the Web is known to be flawed (See Batchelder: Bad Web Typography). Text that is fully justified on the Web may lead to terribly inconsistent word spacing that is very noticeable to the reader.

      I stand here today humbled by the task before us, grateful for the trust you have bestowed, mindful of the sacrifices borne by our ancestors. I thank President Bush for his service to our nation, as well as the generosity and cooperation he has shown throughout this transition.
       Forty-four Americans have now taken the presidential oath. The words have been spoken during rising tides of prosperity and the still waters of peace. Yet, every so often the oath is taken amidst gathering clouds and raging storms. At these moments, America has carried on not simply because of the skill or vision of those in high office, but because We the People have remained faithful to the ideals of our forbearers, and true to our founding documents.

The text above is fully justified with left and right margins of 200px. Observe the spacing of words on each line and notice the inconsistency. Text justification aligns each line to both the left and the right margin by stretching the spaces between words, ideally as uniformly as possible. When the words are too "loose", the effect is visually displeasing. Hyphenation is used in typography to reduce the looseness of lines. However, browsers don't hyphenate fully justified text. There have been efforts to hyphenate text using JavaScript. A script called Hyphenator.js was developed to automatically hyphenate text and is available on Google Code.
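To make the "loose line" effect concrete, here is a minimal sketch of naive full justification on monospaced text. It is only an illustration, not how any browser actually lays out text: the function name justify_line is hypothetical, and it assumes every character has the same width.

```python
def justify_line(words, width):
    """Pad the gaps between words so the line is exactly `width` characters.

    Assumption: monospaced text, so widths are counted in characters.
    """
    if len(words) == 1:
        # A single word cannot be stretched; leave it left aligned.
        return words[0].ljust(width)

    gaps = len(words) - 1
    total_spaces = width - sum(len(word) for word in words)
    if total_spaces < gaps:
        raise ValueError("words do not fit in the requested width")

    # Share the spaces as evenly as possible between the gaps.
    base, extra = divmod(total_spaces, gaps)
    pieces = []
    for i, word in enumerate(words[:-1]):
        pieces.append(word)
        # The first `extra` gaps absorb one additional space each.
        pieces.append(" " * (base + (1 if i < extra else 0)))
    pieces.append(words[-1])
    return "".join(pieces)


# Four short words spread evenly over 24 columns.
print(repr(justify_line("I stand here today".split(), 24)))
# Two words over the same 24 columns: one very wide gap, i.e. a loose line.
print(repr(justify_line("humbled by".split(), 24)))
```

The second call shows why hyphenation helps: when few words fit on a line, all of the leftover space ends up in very few gaps.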

On Android, using the standard TextView widget, the problem is worse. The TextView, which is a View used to display text, does not support full justification at all. This means if you wanted to make a reader application similar to the Amazon Kindle app using a TextView, you'd get text that looked like this:
Figure 2: Android TextView left aligned text

The jagged right side of the text is quite unpleasant and there is much wasted space where more text could fit in the viewport. There is a fairly simple way to get full text justification on Android; it requires using a WebView. Using a WebView is essentially like embedding a chrome-less browser into your Android application. This means you get all the benefits of laying out text on the Web, as well as the aforementioned deficiency.


Figure 3: WebView with fully justified text
Figure 4: WebView with fully justified text and hyphenation

The image on the left shows how mobile WebKit (the rendering engine which powers the WebView) lays out the fully justified text. The second image shows the layout hyphenated by the Hyphenator.js script. The use of two hyphens greatly improved the spacing for lines like "seen grasshoppers and" and "ladybugs. My uncle Bob". The hyphenated WebView text, while superior to the default TextView and the standard fully justified Web text, still has its flaws. The Amazon Kindle Android app does a better job of laying out the excerpt of Roger Ebert's "Life Itself: A Memoir". 

Figure 5: Amazon Kindle Android Application

The Kindle app does use hyphenation although it is not shown above. It clearly has a much more uniform spacing of the words, though some lines are still suspect. For example: "me. Hal Holmes has a red" stands out. The difference between the WebView layout and the Kindle layout may be attributed to the difference in font type and size, margin size, line spacing, etc. The WebView examples use Helvetica and I am not sure which font is being used by the Kindle app. It is also unclear whether the Amazon Android app is using its own layout algorithm or a WebView with a JavaScript hyphenation script.

So it seems that the problem with full justification on the Web plagues Android because developers must use a WebView to achieve it unless they are brave enough to roll their own text justification algorithm. In case you are interested, try solving this problem from Kleinberg's Algorithm Design concerning "pretty printing" of text: 



But unless you are passionate about Markov Chains or Dynamic Programming, you'll probably stick to a WebView with text-align: justify.
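For the curious, here is a minimal sketch of the dynamic programming route. The exact problem statement from the book is not reproduced above, so this assumes a common formulation: break the words into lines of at most a given width so that the sum of the squared leftover space ("slack") over all lines is as small as possible. The name pretty_print and the character-based width are assumptions for illustration only.

```python
def pretty_print(words, width):
    """Split `words` into lines of at most `width` characters, minimising
    the sum of squared trailing slack over all lines (assumed cost)."""
    n = len(words)
    infinity = float("inf")
    # best[i] is the minimal cost of laying out words[i:]; best[n] is 0.
    best = [infinity] * n + [0]
    # next_break[i] is the index of the first word on the following line.
    next_break = [n] * (n + 1)

    for i in range(n - 1, -1, -1):
        line_len = -1  # cancels the space counted before the first word
        for j in range(i, n):
            line_len += len(words[j]) + 1
            if line_len > width:
                break
            cost = (width - line_len) ** 2 + best[j + 1]
            if cost < best[i]:
                best[i] = cost
                next_break[i] = j + 1

    if best[0] == infinity:
        raise ValueError("at least one word is wider than the line")

    # Follow the recorded break points to rebuild the lines.
    lines, i = [], 0
    while i < n:
        j = next_break[i]
        lines.append(" ".join(words[i:j]))
        i = j
    return lines


for line in pretty_print("aaa bb cc ddddd".split(), 6):
    print(repr(line))
```

On this tiny input a greedy layout would fill the first line with "aaa bb" and leave "cc" alone on a very loose second line; the dynamic program instead accepts a shorter first line to even things out.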

[Update] I've just discovered that the new CSS3 Text W3C working draft describes support for hyphenation within the browser: http://www.w3.org/TR/css3-text/#hyphenation. In fact, it's been rolled out in the latest WebKit browsers and in Firefox. I haven't tested it out yet but the usage is fairly standard:
p {
    -webkit-hyphens: auto;
    -moz-hyphens: auto;
    hyphens: auto;
}
It's great to see typography on the Web moving forward.

Thursday, September 15, 2011

Measuring FPS On The Web

Hello world. I imagine you've all forgotten about me in my absence. No, I was neither away nor ill; rather, I've simply been busy with life, the universe and everything. I have written a blog post that was published on CodeAurora Forum* entitled "Measuring FPS On The Web". This post was made as a Qualcomm Innovation Center employee rather than as a rogue technologist. For those of you who have not visited my LinkedIn profile, I've been working at Qualcomm for a couple of years as a Graphics Engineer focusing on the Android Browser. In this post, I speak of the inherent problem with measuring frame rates in JavaScript and why today's web benchmarks for the HTML5 canvas are flawed.

CodeAurora Forum: Measuring FPS On The Web


*The Code Aurora Forum is a mutually beneficial non-profit corporation promoting open source software that enables mobile and wireless ecosystems that rely on wireless internet and cellular technology.

Monday, April 4, 2011

Google I/O Sunspider demo Round 2: iPad vs Nexus One vs XOOM FIGHT!

Google showed off its speedier Froyo V8 JS engine at Google I/O last year, in a demo that put the iPad up against an N1 running Froyo:



Times have changed. Apple has added the Nitro JS engine to iOS 4.3, which has significantly increased performance.

We ran the swimming Android demo on an iPad running iOS 4.3, a Nexus One running Gingerbread and a XOOM running Honeycomb.

Who wins the gold? Check the video to find out!




N.B. There was a rendering artifact with the animation on the XOOM where the screen would flicker when the swimming android reached a certain area of the page.

Sunday, February 13, 2011

Making Software: Evidence to Defuse Programming Holy Wars

I'd like to review a new software development book that I've found particularly interesting: Making Software: What Really Works, and Why We Believe It, edited by Andy Oram and Greg Wilson. After thoroughly enjoying Oram and Wilson's Beautiful Code, I was very much anticipating this next amalgamation of the thoughts of some of the industry's leading voices. Greg was a professor of Computer Science at U of T, my alma mater, and has been a great advisor to me in my career. He currently spends his time working on Software Carpentry, an effort to teach programming practices to scientists.
  • Does TDD work?
  • Is Python better than Java?
  • Are good programmers really 10 times more productive?
  • How do you measure programming performance?
  • Is open source software better than proprietary software?
  • Do design patterns work in practice?
It is enough to whisper one of these questions around a group of programmers to begin an impassioned debate. Can anyone actually be right? How can we answer these seemingly subjective questions? Making Software attempts to find credible qualitative and quantitative evidence to answer such questions. It is no longer adequate to present arguments without showing the facts. It is time we apply the scientific method to these questions, gather some solid evidence and impartially evaluate the implications.

In 2009, Thoughtworks' Martin Fowler gave a talk entitled "Three Years of Real-World Ruby" in which he presented the results of the 41 Ruby projects his company had worked on during that period. He surveyed programmers to see how they felt about working with the language. He argued for the adoption of Ruby by showing evidence of its success within his organization. This was a fascinating real-world study and is exactly what the authors of Making Software would like to see more of.

Internally many software development companies are gathering evidence of their failures and successes in hopes of finding the magical formula for developing quality software fast. Few companies are willing to release such information to the public. This is part of why we don't have an abundance of empirical studies on software development. However, in recent years, more such studies have been appearing. High quality studies are out there waiting to be referenced.

My favourite chapter of the book was Steve McConnell's "What Does 10x Mean? Measuring Variations in Programmer Productivity". McConnell is, of course, famous for the highly successful Code Complete and other popular works such as Rapid Development and Software Estimation: Demystifying the Black Art. In this essay, McConnell provides substantial evidence that the "order of magnitude" difference in programming productivity is not merely anecdotal but a provable hypothesis. I've definitely seen this difference in productivity during my time at Electronic Arts; there were numerous experienced game developers who were clearly getting things done much faster than I could. However, I would imagine that familiarity with the code base played a major factor in that particular example. More interesting are the studies where programming teams are given new projects to work on and they are more or less on an even playing field. I find this a very hot topic, as such research may reveal the productivity secrets of the elite programmers. Now that's information we mere mortals are dying to hear.

What makes this book interesting is that it attempts to treat issues in software development in the same manner that we would treat anthropological issues. The authors take on controversial topics that programmers love to argue about and give us meaningful evidence to further the debates. Making Software is a great read for all programmers, regardless of whether you are 10x more productive or not.

Saturday, January 22, 2011

SMS timestamp issues on myTouch 4G

My recent switch from the Nexus One to the myTouch 4G brought me an unexpected bug. I used SMS Backup & Restore on Android to transfer my SMS history to my new phone. I've done this before; it works great. Now all my SMS conversations were on the myTouch 4G device. Along with T-Mobile's Android skin comes a new messaging app that is far from an improvement on the stock one. In fact, it seems to come with a weird message timestamp bug that has annoyed me enough to make me fix it myself.

Here's the bug: the messaging app shows incoming SMS's as being received 5 hours earlier than the current time. So, in an SMS conversation, all incoming messages will be grouped together and all my outgoing messages follow. Makes the conversation view pretty useless.


When I first powered on the myTouch 4G, there was a notification asking me to change my timezone; I did this. My timezone is EST, a UTC time offset of -5 hrs. So it would seem that incoming SMS timestamps are stored as UTC and interpreted as EST and outgoing are stored as EST. Interestingly enough, when I installed an alternate messaging app, Handcent, it shows the conversations in the correct order, but the times reflect the same issue.


This probably indicates that Handcent is not sorting by the SMS timestamp but by the database ordering of the messages. Okay, so now we have a reasonable hypothesis: incoming and outgoing SMS timestamps are interpreted with different timezones. The issue has corrupted the messages in the database since my switch from the Nexus One.

There are different approaches to fixing this. One is to modify the database in the mmssms.db file on the Android filesystem. I'll pass on the SQL. Another is to use the SMS Backup & Restore app to export the SMS's as an XML file and modify the timestamps in the file and reload them into the phone --at this point you should be picturing an intense programming and blogging scene à la "The Social Network". But unlike Zuck, I'm a real programmer so I'll use Python instead of PHP. Burn. But seriously folks, I could have just as easily done this in PHP. I just prefer Python.

The XML file produced by the backup looks like this:

<smses>
<sms protocol="0" address="+XXXXXXXXXX" date="1295532538000" type="1" subject="null" body="Still need tix for winter blaze?" toa="145" sc_toa="0" service_center="+1XXXXXXXXXX" read="1" status="-1" locked="0" />
<sms protocol="0" address="+XXXXXXXXXX" date="1295535724000" type="1" subject="null" body="Ok are u both going on the bus?" toa="145" sc_toa="0" service_center="+1XXXXXXXXXX" read="1" status="-1" locked="0" />
<sms protocol="0" address="+XXXXXXXXXX" date="1295550555219" type="2" subject="null" body="Yeah man." toa="0" sc_toa="0" service_center="null" read="1" status="-1" locked="0" />
<sms protocol="0" address="+XXXXXXXXXX" date="1295558651937" type="2" subject="null" body="Yeah" toa="0" sc_toa="0" service_center="null" read="1" status="-1" locked="0" />
</smses>

According to the SMS Backup & Restore webpage, the date is stored in milliseconds since the epoch (January 1st 1970). The type attribute is either 1 or 2; received or sent. Let's look at the timestamps for the 4 messages from the screenshots: 1295532538000, 1295535724000, 1295550555219 and 1295558651937.
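As a quick illustration of that format, here is a minimal sketch that pulls the direction, timestamp and body out of a backup. It assumes only the attributes visible in the excerpt above; the sample string is trimmed to two messages and fewer attributes, and the names SAMPLE, DIRECTION and summarize are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample modelled on the excerpt above (attributes reduced).
SAMPLE = """<smses>
<sms address="+XXXXXXXXXX" date="1295532538000" type="1" body="Still need tix for winter blaze?" />
<sms address="+XXXXXXXXXX" date="1295550555219" type="2" body="Yeah man." />
</smses>"""

# type="1" means received, type="2" means sent.
DIRECTION = {"1": "received", "2": "sent"}


def summarize(xml_text):
    """Return (direction, timestamp_ms, body) for every <sms> element."""
    root = ET.fromstring(xml_text)
    return [
        (DIRECTION[sms.get("type")], int(sms.get("date")), sms.get("body"))
        for sms in root.iter("sms")
    ]


for row in summarize(SAMPLE):
    print(row)
```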

We'll make use of the Python time module to inspect these timestamps. The time.gmtime function converts a time expressed in seconds since the epoch to a struct_time in UTC. The time.strftime function then converts that into a human readable format. Our timestamps are stored in milliseconds so we'll need to divide them by 1000. The interpreter sessions below assume the time module has already been imported.

The two incoming timestamps:
>>> time.strftime("%a, %d %b %Y %H:%M:%S", time.gmtime(1295532538000 / 1000))
'Thu, 20 Jan 2011 14:08:58'
>>> time.strftime("%a, %d %b %Y %H:%M:%S", time.gmtime(1295535724000 / 1000))
'Thu, 20 Jan 2011 15:02:04'

The two outgoing timestamps:
>>> time.strftime("%a, %d %b %Y %H:%M:%S", time.gmtime(1295550555219 / 1000))
'Thu, 20 Jan 2011 19:09:15'
>>> time.strftime("%a, %d %b %Y %H:%M:%S", time.gmtime(1295558651937 / 1000))
'Thu, 20 Jan 2011 21:24:11'

Looking at older messages from my Nexus One days, I can see that the times are actually stored in the database as UTC and the -5 is applied by the messaging app. Makes sense. The outgoing messages are correctly saved with a UTC timestamp, but the incoming messages are stored with the EST time. The T-Mobile and Handcent messaging apps correctly apply the -5 to the timestamps when displaying the formatted time. The bug is actually that incoming messages are stored in the database with the EST timestamp instead of the UTC one. This makes it so the messaging apps show incoming messages as 5 hrs earlier than the outgoing ones. To resolve this for all the messages I've received since I switched to the myTouch 4G, I simply have to add 5 hrs to the timestamps of all incoming messages starting at the switch date: January 6th 2011, 1294290000000 in milliseconds since the epoch.
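As a sanity check on that arithmetic, here is a small sketch that shifts the first incoming timestamp by 5 hours and compares it with the first outgoing reply. The helper names to_utc_string and shift_est_to_utc are hypothetical, and the sketch assumes a fixed -5 offset with no daylight saving.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed EST offset of 5 hours (no daylight saving in January).
EST_OFFSET_MS = int(timedelta(hours=5).total_seconds() * 1000)


def to_utc_string(timestamp_ms):
    """Format a millisecond timestamp as UTC text."""
    moment = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return moment.strftime("%a, %d %b %Y %H:%M:%S")


def shift_est_to_utc(timestamp_ms):
    """Add the 5 hours that were lost when EST was stored instead of UTC."""
    return timestamp_ms + EST_OFFSET_MS


incoming = 1295532538000   # "Still need tix for winter blaze?"
outgoing = 1295550555219   # "Yeah man."

fixed = shift_est_to_utc(incoming)
print(to_utc_string(fixed))
# After the shift, the question sorts before the reply, as it should.
print(fixed < outgoing)

# The cut-off constant is midnight EST on January 6th 2011.
est = timezone(timedelta(hours=-5))
print(int(datetime(2011, 1, 6, tzinfo=est).timestamp() * 1000))
```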

Code... code... code...
#!/usr/bin/env python3

from xml.dom.minidom import parse

import os
import sys
import time

# Midnight EST on the day of the switch, in milliseconds since the epoch.
START_TIMESTAMP_MS = 1294290000000
HOUR_TO_MILLISECS = 60 * 60 * 1000
RECEIVED, SENT = (1, 2)

def format_time(timestamp_in_ms):
    """Debugging helper: render a millisecond timestamp as UTC text."""
    return time.strftime("%a, %d %b %Y %H:%M:%S",
                         time.gmtime(timestamp_in_ms / 1000))

def convert_est_to_utc(timestamp_in_ms):
    return timestamp_in_ms + (5 * HOUR_TO_MILLISECS)

def replace_timestamps(document):
    """Shift every message received after the switch from EST to UTC."""
    smses = document.getElementsByTagName('sms')
    for sms in smses:
        received = int(sms.getAttribute('type')) == RECEIVED
        date = int(sms.getAttribute('date'))

        if received and date > START_TIMESTAMP_MS:
            sms.setAttribute('date', str(convert_est_to_utc(date)))

def main():
    if len(sys.argv) < 2:
        sys.exit('Usage: %s sms_file' % os.path.basename(__file__))

    sms_file = sys.argv[1]

    document = parse(sms_file)
    replace_timestamps(document)
    # toprettyxml(encoding=...) returns bytes, so write them out unmodified.
    sys.stdout.buffer.write(document.toprettyxml(encoding='utf-8'))


if __name__ == '__main__':
    main()

See http://pastebin.com/SmgkKcsR.

This code takes in the backup SMS XML file and prints out an updated XML document. Using the SMS Backup & Restore app, I then restore from the output XML. Et voilà!


It would be nice to know exactly where this bug originated. I have not seen this issue on the Nexus One. It must be a bug in the T-Mobile software layer. I haven't solved the problem of new incoming SMS's having the incorrect timestamps; I've only fixed the timestamps of all the messages I've received so far. For this and a bunch of other reasons, I'm going to switch back to the Nexus One until I get CyanogenMod on this myTouch 4G.

This scene will be an awesome addition to the movie they make about my life after I become a billionaire entrepreneur. It might win a few Golden Globes.