Palm Desktop, I stab at thee!
Firstly, I’m starting P90X tomorrow. Should be interesting. Secondly, I miss you guys :/ I’m living with somebody who asked me what the Dead Sea Scrolls are this morning, since it was on the news that they’re coming to the Science Museum.
By the way, ever planning on touching your blogs again (Sewpbox and Rattributes not included)?
So I’m migrating Heather’s Palm Desktop crap to Google Calendar (I have no idea why no tool exists to do this). Google Calendar doesn’t really like the CSV I massaged out of it (only importing about half the records), and I’m starting to see why. Half the records are fucking duplicates in every way but one. I wrote a Python script to do it for me anyway.
The long and short of it amounts to this:
If you want the easy way, export the Palm data to a .mda, import it into Yahoo Calendar, then into Google Calendar from there. Otherwise, export it to a CSV, and hit it with this script:
#!/usr/bin/ruby # require 'csv' input = "export.csv" output = "gcal.csv" csvfile = File.open(input) {|f| f.read} puts "Parsing..." csv = CSV::parse(csvfile) fields = csv.shift puts "Writing..." File.open(output, "w") do |f| f.print "Subject, Start Date, Start Time, End Date, End Time\n" csv.each do |line| startdate, starttime = Time.at(line[6].to_i).strftime("%m/%d/%Y,%I:%M:%S %p").split(',') enddate, endtime = Time.at(line[7].to_i).strftime("%m/%d/%Y,%I:%M:%S %p").split(',') f.print "\"#{line[11]}\",#{startdate},#{starttime},#{enddate},#{endtime}\n" end end puts "Done."
If you don’t feel like exporting, and are running on Windows:
#!/usr/bin/ruby # # require 'win32ole' require 'dbi' class Access attr_accessor :mdb, :conn, :data, :fields def initialize(mdb=nil) @mdb = mdb @conn = nil @data = nil @fields = nil end def open connstring = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=#{@mdb}" @conn = WIN32OLE.new('ADODB.Connection') @conn.Open(connstring) end def query(sql) set = WIN32OLE.new('ADODB.Recordset') set.Open(sql, @conn) @fields = [] set.Fields.each do |field| @fields << field.Name end @data = set.GetRows.transpose set.Close end def close @conn.Close end end output = "gcal.csv" rows = Array.new db = Access.new('c:\path\to\mdb') db.open db.query("SELECT * FROM Main;") names = db.fields rows = db.data #Alternatively DBI.connect("DBI:ODBC:driver=Microsoft Access Driver (*.mdb);"+"dbq=c:/path/to/mdb") do |dbh| dbh.select_all('select * from Main') {|row| rows << row} end puts "Writing..." File.open(output, "w") do |f| f.print "Subject, Start Date, Start Time, End Date, End Time\n" rows.each do |line| startdate, starttime = Time.at(line[6].to_i).strftime("%m/%d/%Y,%I:%M:%S %p").split(',') enddate, endtime = Time.at(line[7].to_i).strftime("%m/%d/%Y,%I:%M:%S %p").split(',') f.print "\"#{line[11]}\",#{startdate},#{starttime},#{enddate},#{endtime}\n" end end puts "Done."
If you want the details…
Essentially, Palm’s Datebook dumps everything into an Access database. No keys or relations (granted, only 3 tables, but still), and no idea what most of the columns do. Tools for working with Jet on Linux are minimal, and I didn’t feel like going through win32ole just to get to Jet, plus this sort of thing is nicer to do in downtime at work. So, I exported it via ODBC to a Postgres database on my Solaris box. Not pretty.
access=# \d main TABLE "public.main" COLUMN | Type | Modifiers ----------------+------------------------+----------- record_id | bigint | NOT NULL STATUS | integer | placement | bigint | private | smallint | category | character varying(20) | start_time | bigint | end_time | bigint | untimed | smallint | time_zone | character varying(40) | location | character varying(255) | summary | text | alarm_advance | character varying(10) | alarm_unit | character varying(10) | repeated_event | character varying(255) | alarm | smallint | note | character varying(100) | access=#
Ok, so record_id seems to be some sort of key, and Heather doesn’t bother with notes or alarms, so this doesn’t seem like it’d be so bad. To figure why Google is only taking some of the records, though:
access=$ SELECT count(*) FROM main; count ------- 5094 (1 row) access=$ SELECT count(DISTINCT record_id) FROM main; count ------- 5074 (1 row) access=$ SELECT count(DISTINCT start_time) FROM main; count ------- 2488 (1 row) access=$ SELECT count(DISTINCT end_time) FROM main; count ------- 2490 (1 row) access=$ SELECT count(DISTINCT summary) FROM main; count ------- 2264 (1 row) access=$ SELECT record_id, start_time, end_time, summary FROM main WHERE record_id IN (SELECT record_id FROM main GROUP BY record_id HAVING count(*)>1); record_id | start_time | end_time | summary -----------+------------+------------+------------------------------------------------------------------------- 0 | 1231437600 | 1231441200 | tammy 0 | 1231869600 | 1231873200 | nb chanber lunch 0 | 1229642100 | 1229645700 | tammy AND joe photos st claire broiler 0 | 1231959600 | 1231963200 | dr hunt 0 | 1230505200 | 1230508800 | tilsen photos 0 | 1230568200 | 1230571800 | meet gary at studio 0 | 1230571800 | 1230584400 | bri AND kids 0 | 1230744600 | 1230748200 | tilsen, AND sandy ORDER y membership mail 0 | 1230681600 | 1230681600 | Dan, missy AND the kids. 0 | 1231610400 | 1231614000 | james j hill houseOngoing Daily 11/15/08 - 2/22/09 m-sat 10-4 sun 1-4 0 | 1230663600 | 1230667200 | tammys house glasses shopping 0 | 1229727600 | 1229731200 | ryan help at studio 0 | 1231889400 | 1231893000 | 0 | 1231889400 | 1231903800 | EMS 0 | 1237161600 | 1237161600 | spring break 0 | 1229983200 | 1229986800 | msp WITH the girls 0 | 1241049600 | 1241049600 | DISH 0 | 1232233200 | 1232244000 | jordan senior photos excel AND studio 0 | 1230055200 | 1230058800 | paige studio 0 | 1230314400 | 1230318000 | amanda tg 0 | 1229968800 | 1229972400 | sara AND nolan (21 rows) access=$ SELECT record_id, start_time, end_time, summary FROM main ORDER BY start_time ASC LIMIT 10; record_id | start_time | end_time | summary -----------+------------+----------+--------- 7128069 | 31449600 | 31449600 | c 7128068 | 31449600 | 31449600 | a 7123605 | 31449600 | 31449600 | a 7128070 | 31449600 | 31449600 | 3 7124866 | 31449600 | 31449600 | c 7124107 | 31449600 | 31449600 | 3 7124145 | 31449600 | 31449600 | o 7124141 | 31449600 | 31449600 | ; 7128072 | 31449600 | 31449600 | ; 7128071 | 31449600 | 31449600 | o (10 rows) access=$ SELECT record_id, start_time, end_time, summary FROM main ORDER BY start_time DESC LIMIT 10; record_id | start_time | end_time | summary -----------+------------+------------+----------------------------- 7127485 | 1256774400 | 1256774400 | lawerance wedding 7125815 | 1256774400 | 1256774400 | lawerance wedding 7128114 | 1244167200 | 1244170800 | NB senior ALL night party 7125941 | 1242489600 | 1242493200 | nyquist edding 7125827 | 1242489600 | 1242493200 | nyquist edding 0 | 1241049600 | 1241049600 | DISH 7128073 | 1238079600 | 1238083200 | books IN the woods 7125623 | 1238079600 | 1238083200 | books IN the woods 7125697 | 1238025600 | 1238025600 | gunflint books IN the woods 7126175 | 1238025600 | 1238025600 | gunflint books IN the woods (10 rows) access=$
Oh, yeah! What I’ve gathered:
- There are duplicate record_ids (which I’d hoped would have been unique).
- There are events set to start and end at duplicate times
- Palm, at some point, duplicated a lot of the other records, except for the record_id.
- Times are stored in epoch seconds (oddly, Unix epoch seconds, not Windows)
- Some of the times correlate to 1970? WTF
A working solution:
access=$ SELECT DISTINCT a.start_time, a.end_time, a.summary INTO holdkey FROM main a WHERE EXISTS ( SELECT 'x' FROM main b WHERE a.start_time = b.start_time AND a.end_time = b.end_time AND a.summary = b.summary) ORDER BY a.start_time DESC; SELECT access=$ SELECT count(*) FROM holdkey; count ------- 2597 (1 row) access=$ DELETE FROM main USING holdkey WHERE main.start_time = holdkey.start_time AND main.end_time = holdkey.end_time AND main.summary = holdkey.summary; DELETE 5085 access=$ SELECT record_id, start_time, end_time, summary FROM main; record_id | start_time | end_time | summary -----------+------------+------------+--------- 5280360 | 31536000 | 31536000 | 5280298 | 31536000 | 31536000 | 5280429 | 31536000 | 31536000 | 7125497 | 1193437800 | 1193437800 | 7128378 | 31536000 | 31536000 | 7128376 | 31536000 | 31536000 | 7128374 | 31536000 | 31536000 | 7127620 | 1193437800 | 1193437800 | 0 | 1231889400 | 1231893000 | (9 rows) access=$ DROP TABLE main; DROP TABLE access=$ SELECT * INTO main FROM holdkey; SELECT
That works. Of course there’s the quick and dirty way which doesn’t involve munging about with temp tables:
access=$ DELETE FROM main t1 USING main WHERE EXISTS (SELECT * FROM main t2 WHERE t1.start_time = t2.start_time AND t1.end_time = t2.end_time AND t1.summary = t2.summary AND t1.record_id < t2.record_id); DELETE 2488 access=$ SELECT count(*) FROM test; count ------- 2606 (1 row)
It gives a slightly different result, but operates under the assumption that Palm’s record_id means something (it may not, for all I know). On the upside, it preserves all the columns in case they’re useful for something (doubtful). I could order by start_time and select into another table, add an index, and do the same thing, but it’s easier the quick and dirty way. There’s probably a trivial way to do this with joins, but I couldn’t think of one, and it leaves 9 records with a record_id of 0..
Here’s the code which it turns out I didn’t need, but it might be useful to somebody:
#Rips data from Palm Desktop. Uploads it to Google Calendar #Written with Python 2.5 (though imports should work anyway) # #Currently, the Access MDB Palm Datebook uses has been exported to a #PostgreSQL server via ODBC, so I'll be connecting to that # #There's code in here for getting through Access also, but I haven't tested it. #Use at your own risk (kinda like Access). # #This is mostly due to the Postgres ODBC driver, and the fact that I didn't #want to bother with quoting all the queries for Postgres to allow spaces try: from xml.etree import ElementTree #Python 2.5, probably 2.6/3.0 also except ImportError: from elementtree import ElementTree #Python <2.4 import gdata.calendar.service import gdata.service import atom.service import gdata.calendar import atom import getopt import sys import string import time import psycopg2 #Talk to Postgres class Struct: def __init__(self, *args, **kwargs): for k,v in kwargs.items(): setattr(self, k, v) class GCalMigrate: def __init__(self): self.conn = None self.cur = None self.calendar = None self.records = [] def connect(self): try: self.conn = psycopg2.connect("dbname='whatever' user='yournamehere' host='server'") except: print "Can't connect to the database!\n" sys.exit() self.cur = conn.cursor() query() def accessconnect(self,mdbpath): import odbc self.conn = odbc.odbc("driver=Microsoft Access Driver (*.mdb);DBQ=%s") % mdbpath self.cur = conn.cursor() queryaccess() def queryaccess(self): rows = [] self.cur.execute("SELECT Main.[Start Time], Main.[End Time], Main.[Summary] FROM Main") rows = cur.fetchall() conn.close() parserows(rows) def query(self): rows = [] try: self.cur.execute("SELECT start_time, end_time, summary FROM main") rows = cur.fetchall() except: print "Couldn't query the database.\n" conn.close() parserows(rows) def parserows(self, rows): for row in rows: starttime = time.strftime("%Y-%m-%dT%H:%M:%S.000Z", time.gmtime(row[0])) endtime = time.strftime("%Y-%m-%dT%H:%M:%S.000Z", time.gmtime(row[1])) title = row[2] record = Struct(start_time=starttime, end_time=enddtime, title=title) self.records.append(record) login() def login(self, username, password): self.calendar = gdata.calendar.service.CalendarService() self.calendar.email = username self.calendar.password = password self.calendar.source = "Palm_Desktop_Migrator" self.calendar.ProgrammaticLogin() batchsubmit() def batchsubmit(self): feed = gdata.calendar.CalendarEventFeed() for record in records: insertme = gdata.calendar.CalendarEventEntry() insertme.title = atom.Title(record.title) insertme.content = atom.Content("") insertme.when.append(gdata.calendar.When(start_time=record.start_time, end_time=record.end_time)) insertme.batch_id = gdata.BatchId(text='Palm_Migration') feed.Add_Insert(entry=insertme) response = self.calendar.ExecuteBatch(feed, gdata.calendar.service.DEFAULT_BATCH_URL) return response if __name__ == "__main__": runner = GCalMigrate() responses = runner.connect() for entry in responses.entry: print "Batch ID: %s" % entry.batch_id.text print "Status: %s" % entry.batch_status.code print "Reason: %s" % entry.batch_status.reason