My recent switch from the Nexus One to the myTouch 4G brought me an unexpected bug. I used SMS Backup & Restore on Android to transfer my SMS history to my new phone. I've done this before; it works great. Now all my SMS conversations were on the myTouch 4G. Along with T-Mobile's Android skin comes a new messaging app that is far from an improvement on the stock one. In fact, it seems to come with a weird message timestamp bug that has annoyed me enough to make me fix it myself.
Here's the bug: the messaging app shows incoming SMSes as being received 5 hours earlier than they actually were. So, in an SMS conversation, all incoming messages are grouped together and all my outgoing messages follow. Makes the conversation view pretty useless.
When I first powered on the myTouch 4G, there was a notification asking me to change my timezone; I did this. My timezone is EST, a UTC time offset of -5 hrs. So it would seem that incoming SMS timestamps are stored as UTC and interpreted as EST and outgoing are stored as EST. Interestingly enough, when I installed an alternate messaging app, Handcent, it shows the conversations in the correct order, but the times reflect the same issue.
This probably indicates that Handcent is not sorting by the SMS timestamp but by the database ordering of the messages. Okay, so now we have a reasonable hypothesis: incoming and outgoing SMS timestamps are interpreted with different timezones. The issue has corrupted the messages in the database since my switch from the Nexus One.
There are different approaches to fixing this. One is to modify the database in the mmssms.db file on the Android filesystem. I'll pass on the SQL. Another is to use the SMS Backup & Restore app to export the SMSes as an XML file, modify the timestamps in the file and reload them into the phone. At this point you should be picturing an intense programming and blogging scene à la "The Social Network". But unlike Zuck, I'm a real programmer so I'll use Python instead of PHP. Burn. But seriously folks, I could have just as easily done this in PHP. I just prefer Python.
The XML file produced by the backup looks like this:
<smses>
<sms protocol="0" address="+XXXXXXXXXX" date="1295532538000" type="1" subject="null" body="Still need tix for winter blaze?" toa="145" sc_toa="0" service_center="+1XXXXXXXXXX" read="1" status="-1" locked="0" />
<sms protocol="0" address="+XXXXXXXXXX" date="1295535724000" type="1" subject="null" body="Ok are u both going on the bus?" toa="145" sc_toa="0" service_center="+1XXXXXXXXXX" read="1" status="-1" locked="0" />
<sms protocol="0" address="+XXXXXXXXXX" date="1295550555219" type="2" subject="null" body="Yeah man." toa="0" sc_toa="0" service_center="null" read="1" status="-1" locked="0" />
<sms protocol="0" address="+XXXXXXXXXX" date="1295558651937" type="2" subject="null" body="Yeah" toa="0" sc_toa="0" service_center="null" read="1" status="-1" locked="0" />
</smses>
According to the SMS Backup & Restore webpage, the date is stored in milliseconds since the epoch (January 1st 1970). The type attribute is either 1 or 2; received or sent. Let's look at the timestamps for the 4 messages from the screenshots: 1295532538000, 1295535724000, 1295550555219 and 1295558651937.
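To make the format concrete, here is a minimal sketch of reading those two attributes out of a backup. It assumes only the date, type and body attributes shown above; the SAMPLE string, the DIRECTION mapping and the list_messages helper are names made up for this illustration, not part of the backup app.

```python
from xml.dom.minidom import parseString

# Trimmed-down sample in the same shape as the backup file (illustrative only).
SAMPLE = """<smses>
<sms address="+XXXXXXXXXX" date="1295532538000" type="1" body="Still need tix for winter blaze?" />
<sms address="+XXXXXXXXXX" date="1295550555219" type="2" body="Yeah man." />
</smses>"""

# type="1" is a received message, type="2" is a sent one.
DIRECTION = {1: "received", 2: "sent"}


def list_messages(xml_text):
    """Return (date_in_ms, direction, body) for every <sms> element."""
    document = parseString(xml_text)
    rows = []
    for sms in document.getElementsByTagName("sms"):
        date_ms = int(sms.getAttribute("date"))
        direction = DIRECTION[int(sms.getAttribute("type"))]
        rows.append((date_ms, direction, sms.getAttribute("body")))
    return rows


for row in list_messages(SAMPLE):
    print(row)
```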
We'll make use of the Python time module to inspect these timestamps. The time.gmtime function converts a time expressed in seconds since the epoch to a struct_time in UTC, and time.strftime converts that into a human-readable format. Our timestamps are stored in milliseconds, so we'll need to divide them by 1000.
The two incoming timestamps:
>>> import time
>>> time.strftime("%a, %d %b %Y %H:%M:%S", time.gmtime(1295532538000 / 1000))
'Thu, 20 Jan 2011 14:08:58'
>>> time.strftime("%a, %d %b %Y %H:%M:%S", time.gmtime(1295535724000 / 1000))
'Thu, 20 Jan 2011 15:02:04'
The two outgoing timestamps:
>>> time.strftime("%a, %d %b %Y %H:%M:%S", time.gmtime(1295550555219 / 1000))
'Thu, 20 Jan 2011 19:09:15'
>>> time.strftime("%a, %d %b %Y %H:%M:%S", time.gmtime(1295558651937 / 1000))
'Thu, 20 Jan 2011 21:24:11'
Looking at older messages from my Nexus One days, I can see that timestamps are normally stored in the database as UTC, and the messaging app applies the -5 hour offset when displaying them. Makes sense. On the myTouch 4G, outgoing messages are correctly saved with a UTC timestamp, but incoming messages are stored with the EST time. The T-Mobile and Handcent messaging apps then correctly apply the -5 to every timestamp when displaying the formatted time, so the real bug is that incoming messages are stored in the database with the EST timestamp instead of the UTC one. As a result, the messaging apps show incoming messages as 5 hours earlier than the outgoing ones. To resolve this for all the messages I've received since I switched to the myTouch 4G, I simply have to add 5 hours to the timestamps of all incoming messages starting at the switch date: January 6th 2011, 1294290000000 in milliseconds since the epoch.
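As a quick sanity check of that diagnosis, here is a small sketch that applies the 5 hour shift to the four sample messages and re-sorts them. The names FIVE_HOURS_MS and fmt are just helpers for this illustration; the timestamps and bodies come from the XML above.

```python
import time

FIVE_HOURS_MS = 5 * 60 * 60 * 1000
RECEIVED, SENT = 1, 2

# (date in ms, type, body) for the four messages from the backup excerpt.
messages = [
    (1295532538000, RECEIVED, "Still need tix for winter blaze?"),
    (1295535724000, RECEIVED, "Ok are u both going on the bus?"),
    (1295550555219, SENT, "Yeah man."),
    (1295558651937, SENT, "Yeah"),
]


def fmt(timestamp_in_ms):
    """Format a millisecond timestamp as a UTC time of day."""
    return time.strftime("%H:%M:%S", time.gmtime(timestamp_in_ms // 1000))


# Shift only the received messages, then sort by the corrected timestamp.
fixed = sorted(
    (date + FIVE_HOURS_MS if sms_type == RECEIVED else date, sms_type, body)
    for date, sms_type, body in messages
)

for date, sms_type, body in fixed:
    print(fmt(date), "in " if sms_type == RECEIVED else "out", body)
```

With the shift applied, the incoming and outgoing messages interleave (19:08:58, 19:09:15, 20:02:04, 21:24:11 UTC) and the conversation reads as question, answer, question, answer.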
Code... code... code...
#!/usr/bin/env python3
from xml.dom.minidom import parse
import os
import sys
import time

# Cutoff for the switch date, in milliseconds since the epoch.
START_TIMESTAMP_MS = 1294290000000
HOUR_TO_MILLISECS = 60 * 60 * 1000
RECEIVED, SENT = (1, 2)


def format_time(timestamp_in_ms):
    # Debugging helper: render a millisecond timestamp as UTC.
    return time.strftime("%a, %d %b %Y %H:%M:%S",
                         time.gmtime(timestamp_in_ms / 1000))


def convert_est_to_utc(timestamp_in_ms):
    return timestamp_in_ms + (5 * HOUR_TO_MILLISECS)


def replace_timestamps(document):
    smses = document.getElementsByTagName('sms')
    for sms in smses:
        received = int(sms.getAttribute('type')) == RECEIVED
        date = int(sms.getAttribute('date'))
        # Only incoming messages after the switch were stored as EST.
        if received and date > START_TIMESTAMP_MS:
            sms.setAttribute('date', str(convert_est_to_utc(date)))


def main():
    if len(sys.argv) < 2:
        sys.exit('Usage: %s sms_file' % os.path.basename(__file__))
    sms_file = sys.argv[1]
    document = parse(sms_file)
    replace_timestamps(document)
    # toprettyxml(encoding=...) returns bytes, so write them out as-is.
    sys.stdout.buffer.write(document.toprettyxml(encoding='utf-8'))


if __name__ == '__main__':
    main()
See http://pastebin.com/SmgkKcsR.
This code takes in the backup SMS XML file and prints out an updated XML document. Using the SMS Backup & Restore app, I restore from the output XML. Et voilà!
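Before restoring, it doesn't hurt to check that only the intended timestamps moved. The following is a rough sketch of such a check, assuming the output keeps the messages in the same order as the input; date_shifts, BEFORE and AFTER are hypothetical names and sample strings for illustration.

```python
from xml.dom.minidom import parseString

FIVE_HOURS_MS = 5 * 60 * 60 * 1000


def date_shifts(before_xml, after_xml):
    """Return (type, shift_in_ms) per message, pairing elements by position."""
    before = parseString(before_xml).getElementsByTagName("sms")
    after = parseString(after_xml).getElementsByTagName("sms")
    if len(before) != len(after):
        raise ValueError("message count changed: %d vs %d" % (len(before), len(after)))
    return [
        (int(old.getAttribute("type")),
         int(new.getAttribute("date")) - int(old.getAttribute("date")))
        for old, new in zip(before, after)
    ]


# One received and one sent message, before and after the fix (sample data).
BEFORE = '<smses><sms date="1295532538000" type="1" /><sms date="1295550555219" type="2" /></smses>'
AFTER = '<smses><sms date="1295550538000" type="1" /><sms date="1295550555219" type="2" /></smses>'

for sms_type, shift in date_shifts(BEFORE, AFTER):
    # Received messages (type 1) should move by five hours, sent ones by zero.
    print(sms_type, shift, shift in (0, FIVE_HOURS_MS))
```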
It would be nice to know exactly where this bug originated. I have not seen this issue on the Nexus One, so my guess is that it's a bug in the T-Mobile software layer. I haven't solved the problem of new incoming SMSes getting the incorrect timestamps; I've only fixed the timestamps of all the messages I've received so far. For this and a bunch of other reasons, I'm going to switch back to the Nexus One until I get CyanogenMod on this myTouch 4G.
This scene will be an awesome addition to the movie they make about my life after I become a billionaire entrepreneur. It might win a few Golden Globes.