- I exported from Roller into a MovableType format.
- I used the http://code.google.com/p/google-blog-converters-appengine/ project to convert the MovableType file into a file suitable for Blogger import.
- However, all the dates were broken; every post showed up with today's date. I tracked this down to the converter assuming that the times in the archive are in 12-hour format (with am/pm at the end), and mine were in 24-hour format. I fixed this by replacing the following line in google-blog-converters-r89/src/movabletype2blogger/mt2b.py:
return time.strptime(mt_time, "%m/%d/%Y %I:%M:%S %p")
with this one:
return time.strptime(mt_time, "%m/%d/%Y %H:%M:%S")
- The biggest challenge was dealing with the images; the blog converted okay but still referenced all my image resources on blogs.sun.com. To fix this, I first uploaded all my images to picasaweb and made them publicly visible. Then I opened the picasaweb page which shows thumbnails for all the images in the album, and saved it. The saved file contains a link for every image. Unfortunately, the images end up with many different url prefixes, so it's not as simple as replacing the old image prefix with a new one. I wrote a short Java program to extract all the image links and create a map from file base name to the full url.
- I then wrote a simple Java program to rewrite the Blogger import file so that every url of the form http://blogs.sun.com/... is replaced with the corresponding new Picasaweb url from the above map. I also took the image urls from the thumbnails and removed the /s128/ portion of the url, which gets you the original image rather than the thumbnail. (Also, it turns out picasaweb insists on converting all .png and .gif images to .jpg, so the URLs have to be adjusted for those cases.)
- I also did a little bit of post processing on the results; for example, I collapsed some repeated br tags that are no longer necessary, and I inserted a "This blog entry was imported and urls might be wrong"-warning at the top of each imported post.
- One final tip: I discovered that a number of my images were missing. It turns out that Picasa by default hides small images (of which I had many) - so these were not uploaded! There are settings both for including small images and for adding .gif to the list of included file extensions, so handling this is easy once you're aware of the problem.
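If you'd rather not hard-code one time format in the converter, a more tolerant variant of the date fix above could look roughly like this. This is only a sketch: `parse_mt_time` and `MT_TIME_FORMATS` are names made up for illustration, not functions from mt2b.py, and it assumes an English locale for the am/pm marker.

```python
import time

# Hypothetical helper, not part of mt2b.py: try the 12-hour layout the
# converter expects first, then fall back to the 24-hour layout.
MT_TIME_FORMATS = (
    "%m/%d/%Y %I:%M:%S %p",  # e.g. 03/15/2009 02:30:00 PM
    "%m/%d/%Y %H:%M:%S",     # e.g. 03/15/2009 14:30:00
)

def parse_mt_time(mt_time):
    """Return a time.struct_time for either supported layout."""
    for fmt in MT_TIME_FORMATS:
        try:
            return time.strptime(mt_time.strip(), fmt)
        except ValueError:
            continue  # this layout did not match, try the next one
    raise ValueError("unrecognized MovableType date: %r" % mt_time)
```

Raising an error for dates that match neither layout makes a bad export fail loudly, so broken dates don't slip through unnoticed.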
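The two image steps above can be sketched as follows, here in Python rather than Java. Treat it as an illustration under assumptions, not the program I actually ran: the regular expressions, the function names and the example hosts are all made up, and the real album page may need a different pattern. Looking images up by base name without the extension also takes care of the .png/.gif to .jpg conversion.

```python
import posixpath
import re

# Assumed pattern for image links in the saved album page; adjust to your HTML.
IMG_SRC = re.compile(r'src="(https?://[^"]+\.(?:jpe?g|png|gif))"', re.IGNORECASE)

# Old image urls in the Blogger import file (only urls ending in an image extension).
OLD_URL = re.compile(r'http://blogs\.sun\.com/[^"\'\s<>]+?\.(?:jpe?g|png|gif)\b',
                     re.IGNORECASE)

def stem_of(url):
    """Lowercased file base name without extension, e.g. .../Foo.png -> foo."""
    return posixpath.splitext(posixpath.basename(url))[0].lower()

def build_url_map(album_html):
    """Map each image's base name to its full-size url on the new host."""
    url_map = {}
    for thumb_url in IMG_SRC.findall(album_html):
        # Dropping the /s128/ segment turns the thumbnail url into the original.
        full_url = thumb_url.replace("/s128/", "/", 1)
        url_map[stem_of(full_url)] = full_url
    return url_map

def rewrite_urls(import_text, url_map):
    """Replace old image urls that have a known counterpart; leave the rest alone."""
    def swap(match):
        old_url = match.group(0)
        return url_map.get(stem_of(old_url), old_url)
    return OLD_URL.sub(swap, import_text)
```

Any old url without a match in the map is left untouched, which makes it easy to search the output for remaining blogs.sun.com references afterwards (that is how missing uploads show up).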
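The post-processing step could look something like the sketch below, applied to the HTML body of a single post. The function name and the markup around the warning are assumptions for illustration; only the warning text itself is what I used.

```python
import re

# Two or more consecutive <br>, <br/> or <br /> tags, with optional whitespace between.
REPEATED_BR = re.compile(r'(?:<br\s*/?>\s*){2,}', re.IGNORECASE)

# The <p><i> wrapper is an assumed choice of markup.
WARNING = "<p><i>This blog entry was imported and urls might be wrong.</i></p>"

def clean_post(body_html):
    """Collapse runs of br tags into one and put the warning at the top."""
    collapsed = REPEATED_BR.sub("<br />", body_html)
    return WARNING + collapsed
```

Note that this collapses every run of br tags, so check a few posts by hand if some of yours rely on blank lines for spacing.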
Hopefully this helps anyone else wanting to make a similar migration. I have the Java code for the image url manipulation if anyone wants it (but it's not generic so you'll need to do a little massaging for your own needs.)