Gmail to Google Apps Email Migration
I came up with a method for migrating the emails in my personal Gmail (user@gmail.com) email account to my Google Apps (user@thamtech.com) email account. I had a few simple requirements:
- Every email in the @gmail.com account must be migrated into the @thamtech.com account with all attachments intact.
- The read/unread status of each email must be maintained.
- The labels applied to each email must be maintained, whether they were applied by a filter or manually.
  - Certain Google-endorsed migration solutions are only able to maintain message labels that were applied automatically by a filter.
- The starred/non-starred status of each email must be maintained.
- The date on migrated emails must be the original date, NOT the date of migration.
  - Certain migrations involving Entourage have had this unfortunate result.
- The Recipient column when viewing the list of migrated Sent Mail must show the recipients of the emails, NOT my name or “me”.
  - Certain migrations involving Entourage or Outlook have had this unfortunate result.
Also, Gmail normally replaces my name with “me” when displaying the sender/receiver of emails. I want migrated emails to display “me” in exactly the same way, rather than “user@gmail.com”. Is this too much to ask? No!
I found a solution using imapsync and Amazon EC2 (I suppose any old computer would do, but this gave me a much higher bandwidth connection to Google’s servers than I would have had otherwise). Here’s a brief overview of my procedure:
- Launch an Amazon EC2 instance from the “Fedora Core 4: Developer” image, ami-26b6534f
- SSH into my new instance
- Install imapsync and required Perl packages
- Build a script called “run-imapsync”:
where “user@gmail.com” is your Gmail account and “user@domain.com” is your Google Apps account.
- Make the script executable (for example, with chmod +x run-imapsync)
- Create the password files named “passfile1” and “passfile2” that contain the password for the source and destination imap accounts, respectively.
- Execute the script
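The script itself is not reproduced in this post, so here is a minimal sketch of what a run-imapsync file could look like. It is not the original script: the host name, ports, and the choice of --syncinternaldates (to keep the original message dates) are assumptions, and the IMAPSYNC variable is a hypothetical hook for previewing the command.

```shell
#!/bin/sh
# run-imapsync: a hypothetical sketch, NOT the original script from this post.
# Host names, ports, and option choices are assumptions.

SRC_USER="user@gmail.com"     # source Gmail account
DST_USER="user@domain.com"    # destination Google Apps account

run_imapsync() {
    # Setting IMAPSYNC=echo prints the command line instead of running it.
    ${IMAPSYNC:-imapsync} \
        --host1 imap.gmail.com --port1 993 --ssl1 \
        --user1 "$SRC_USER" --passfile1 passfile1 \
        --host2 imap.gmail.com --port2 993 --ssl2 \
        --user2 "$DST_USER" --passfile2 passfile2 \
        --syncinternaldates \
        "$@"
}

# Only start the transfer when imapsync is actually installed.
if command -v imapsync >/dev/null 2>&1; then
    run_imapsync "$@"
else
    echo "imapsync is not installed; nothing was transferred" >&2
fi
```

The password files can be created with something like `printf '%s\n' 'your-password' > passfile1` followed by `chmod 600 passfile1` (and the same for passfile2), so that the passwords are not readable by other users on the machine.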
imapsync command
My imapsync command calls for a little explanation.
The --regexmess parameters are regular expressions to apply to each message before it is uploaded to the destination server. The first two change the header email addresses from my old address to the new email address. This makes Google label them as “me” instead of “Tyler” in the web interface of my destination account.
I was getting errors from the script when it tried to upload messages that had no subject (it also had errors uploading emails with subject “Re: “, where there was no real subject other than the prefix). To fix this, I added the next two regular expressions to replace blank subjects with “(no–subject)”. It STILL had problems, so I tried “(no–subject)” and it worked. It seems strange, but it worked and I didn’t investigate further.
I don’t think I had any emails with subject “Fw: “ or “FWD: “. If you do, and you are getting errors when the script tries to upload them, try adding a couple more regular expressions to the command to fix “Fw: “ subject lines like it does for “Re: “.
You can append an additional argument to the imapsync command, “--folder X”, where X is an IMAP folder to transfer. You could use this if you only want to transfer “[Gmail]/All Mail” or “[Gmail]/Sent Mail”, for example. I like to append ‘--folder "$1"’ to my imapsync command in the run-imapsync file, and then execute run-imapsync with a single parameter, like “[Gmail]/All Mail” (including the quotes, so that the shell treats that string containing a space as one unit, rather than as two separate parameters).
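To see why the quotes matter, here is a small sketch. show_args is a hypothetical helper that prints each argument it receives on its own line, wrapped in angle brackets.

```shell
# Hypothetical helper: print every argument on its own line.
show_args() {
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

# With quotes, the folder name arrives as one argument:
#   show_args --folder "[Gmail]/All Mail"
# prints
#   <--folder>
#   <[Gmail]/All Mail>
# Without quotes, "Mail" would arrive as a separate third argument.
```

The same reasoning applies inside the script: writing "$1" with double quotes passes the folder name through to imapsync unsplit.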
Lockdown in Sector 4
I became impatient and decided I could transfer multiple labels simultaneously by issuing the run-imapsync command multiple times in parallel with different folder parameters. After a while of doing this, I got the dreaded “Lockdown in sector 4” message from Gmail. It did not lock me out of my web interfaces, but it did prevent me from transferring emails through IMAP for a few hours. Once I got back in, I limited myself to running one instance of imapsync at a time.
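A sketch of the one-at-a-time approach, assuming run-imapsync takes the folder name as its only parameter as described earlier. sync_folders_in_turn and the RUN_IMAPSYNC variable are hypothetical names used for illustration.

```shell
# Hypothetical helper: transfer folders strictly one after another,
# so only a single IMAP session is open at any time.
sync_folders_in_turn() {
    for folder in "$@"; do
        # RUN_IMAPSYNC=echo gives a harmless preview of what would run.
        ${RUN_IMAPSYNC:-./run-imapsync} "$folder" ||
            echo "sync of $folder stopped early; run it again later" >&2
    done
}

# Example:
#   sync_folders_in_turn "INBOX" "[Gmail]/Sent Mail" "[Gmail]/All Mail"
```

Each folder starts only after the previous run has exited, which is the opposite of launching several copies in the background.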
Multiple Executions
You will almost certainly have to run the imapsync command multiple times before all of your mail is transferred, unless you have just a few emails to begin with. I had to run it probably 20-50 times to get everything transferred (about 450MB). Imapsync exits every once in a while for whatever reason - maybe the IMAP servers kick it off when they get tired of it.
I did have to run it many more times than 20-50 in the course of figuring out the procedure described in this post, but 20-50 seems about what it took once I started fresh using the command described above.
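The repeated runs can be automated with a small wrapper. This is a sketch, not something I used at the time: retry_until_done and RETRY_DELAY are hypothetical names, and it assumes the command exits with a non-zero status whenever it stops before finishing.

```shell
# Hypothetical helper: rerun a command until it succeeds or the run limit is hit.
# Usage: retry_until_done MAX_RUNS command [args...]
retry_until_done() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            echo "finished after $attempt run(s)"
            return 0
        fi
        attempt=$((attempt + 1))
        # Pause between runs so the IMAP servers are not hit back to back.
        sleep "${RETRY_DELAY:-60}"
    done
    echo "gave up after $max run(s)" >&2
    return 1
}

# Example:
#   retry_until_done 50 ./run-imapsync "[Gmail]/All Mail"
```

Because imapsync skips messages that already exist on the destination, each rerun should pick up roughly where the previous one stopped.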
Conclusion
Overall, the migration from my [user]@gmail.com account to [user]@thamtech.com was a complete success, meeting all of the requirements I mentioned at the beginning of this post. If I find some time, I’ll work up more detailed instructions or maybe set up an Amazon EC2 AMI that’s ready to go.