Thursday, July 23, 2009

From VMware to Amazon EC2

I have a VMware image running CentOS 5.3 (32-bit). Here are the steps it took to convert it to an Amazon EC2 AMI. Note that since I did this from within the VM (not from outside, on the vmdk files), the same method would apply to non-VM installations of CentOS.
  • Make sure ruby and java 1.6 are installed.
  • Download and install AMI tools from http://developer.amazonwebservices.com/connect/entry.jspa?externalID=368 (requires ruby).
    wget http://s3.amazonaws.com/ec2-downloads/ec2-ami-tools.noarch.rpm
    rpm -Uvh ec2-ami-tools.noarch.rpm

  • Download and install API tools from http://developer.amazonwebservices.com/connect/entry.jspa?externalID=351 (requires java).
    cd /root
    wget http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
    mkdir -p /usr/local/ec2
    cd /usr/local/ec2
    unzip /root/ec2-api-tools.zip
    ln -s ec2-api-tools-* apitools

  • Clean up the VM disks. I had to clean up some logs and backups to free up space for the AMI image. Note that AMI images are saved in "sparse" format, so they occupy less disk space than their apparent size (the size shown by ls -l).
  • Register for EC2 at http://aws.amazon.com/ec2/
  • Once you are registered, Amazon gives you the AWS Access ID and the AWS Access Secret key.
  • Create the X.509 key pair (private key and certificate) and download both files into the VM image under /root/.ec2. Also keep a copy locally on your desktop.
  • Install the ElasticFox Firefox extension from http://s3.amazonaws.com/ec2-downloads/elasticfox.xpi
  • Add the following into /root/.bash_profile
    export JAVA_HOME=/usr/lib/jvm/java 
    export EC2_HOME=/usr/local/ec2/apitools
    export EC2_PRIVATE_KEY=$HOME/.ec2/pk-YOUR_X509_KEY.pem
    export EC2_CERT=$HOME/.ec2/cert-YOUR_X509_CERT.pem 
    export PATH=$PATH:$EC2_HOME/bin
  • Change /etc/sysconfig/network-scripts/ifcfg-eth0 to contain
    DEVICE=eth0
    BOOTPROTO=dhcp
    ONBOOT=yes
    TYPE=Ethernet
    USERCTL=yes
    PEERDNS=yes
    IPV6INIT=no
  • Download and install the Xen kernel modules (Important: the module below is for the 32-bit version; if you are converting a 64-bit OS, use ?)
    cd /root
    wget http://alestic-downloads.s3.amazonaws.com/ec2-kernel-modules-2.6.16-xenU.tgz
    cd /lib/modules
    tar -xvzf /root/ec2-kernel-modules-2.6.16-xenU.tgz
  • EC2 boots into runlevel 4. Disable unneeded services:
    for i in cpuspeed pcscd messagebus restorecond mcstrans \
    lvm2-monitor iptables ip6tables isdn \
    netfs apmd acpid cups sendmail gpm anacron \
    atd yum-updatesd avahi-daemon smartd httpd mysqld; do
    chkconfig --level 4 $i off; 
    done

    Note that I disabled iptables to avoid any possible problems when logging in for the first time. Once the image is running properly, I will re-enable iptables. Similarly, httpd and mysqld are disabled because I don't have the mounts for /var/www and /var/lib/mysql set up yet. Once that is done on the running instance, I will enable the httpd and mysqld services.
  • Make sure to enable some important services:
    for i in network syslog sshd; do 
    chkconfig --level 4 $i on; 
    done

  • Make sure to disable SELinux. Either use the system-config-securitylevel-tui tool or simply edit /etc/selinux/config and set the line
    SELINUX=disabled

  • Create the AMI
    mkdir /image
    ec2-bundle-vol -c $EC2_CERT -k $EC2_PRIVATE_KEY --user AWS_ACCOUNT_NUMBER -d /image -e /image,/home,/var/www,/var/lib/mysql -r i386 -s 4096 --no-inherit --generate-fstab

    In my case I excluded the /home, /var/www and /var/lib/mysql directories because I'm planning to mount them from an EBS volume. Note that the tool is smart enough to exclude some system directories on its own (e.g. /dev, /proc), so you don't need to add these manually to the exclusion list.

    This command takes quite a long time and creates an AMI of size 4GB (the -s option is given in MB). Your /image directory will contain a file called image, many files called image.part.XX and the manifest file called image.manifest.xml. You can examine what has been put into your AMI filesystem by mounting the image file: mount -o loop /image/image /mnt
  • Upload your AMI to Amazon S3
    ec2-upload-bundle -b YOUR_BUCKET_NAME -m /image/image.manifest.xml -a YOUR_AWS_ACCESS_ID -s YOUR_AWS_ACCESS_SECRET_KEY

    YOUR_BUCKET_NAME can be any name (e.g. my-cool-ami). A bucket in S3 is similar to a directory on a filesystem. If it does not exist, it will be created. If it exists and belongs to you, your image files will be uploaded into it, possibly overwriting existing files with the same names, so be careful. You can manage your S3 buckets and files using the S3Fox Firefox extension from http://www.s3fox.net/release/latest/s3fox.xpi.
  • Register the AMI
    ec2-register YOUR_BUCKET_NAME/image.manifest.xml

    The output of this command will show your AMI ID. You can also see it by running
    ec2-describe-images

  • Create a key pair for ssh
    ??????? $HOME/.ec2/YOUR_KEY_PAIR.pem

  • We're ready to start up our instance. Note that -k takes the name of the key pair as registered with EC2, not the path to the .pem file:
    ec2-run-instances AMI_ID -k YOUR_KEY_PAIR

    To see the status, the instance id and the host name that was assigned to your instance, run:
    ec2-describe-instances

  • If everything went ok, you can try to ssh to the new instance:
    ssh -i $HOME/.ec2/YOUR_KEY_PAIR.pem root@YOUR_INSTANCE_HOST_NAME
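
Several of the commands above depend on the variables exported from /root/.bash_profile, and a missing one tends to show up only as a confusing error from the EC2 tools. The following is a minimal sketch of a pre-flight check, not part of the original procedure. The function name check_ec2_env is hypothetical; the variable names are the ones from the .bash_profile step.

```shell
#!/bin/sh
# Hypothetical helper (not part of the EC2 tools): report problems with the
# variables exported in /root/.bash_profile above.
# The body runs in a subshell, so its temporary variables do not leak.
check_ec2_env() (
    status=0
    for var in JAVA_HOME EC2_HOME EC2_PRIVATE_KEY EC2_CERT; do
        eval "value=\${$var:-}"      # value of the variable named by $var
        if [ -z "$value" ]; then
            echo "unset: $var"
            status=1
        fi
    done
    # The key and certificate variables must also point at readable files
    for var in EC2_PRIVATE_KEY EC2_CERT; do
        eval "value=\${$var:-}"
        if [ -n "$value" ] && [ ! -r "$value" ]; then
            echo "not readable: $var ($value)"
            status=1
        fi
    done
    [ "$status" -eq 0 ]
)
```

Running check_ec2_env in a fresh login shell prints nothing and returns 0 when all four variables are set and both .pem files are readable.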


EBS Volumes


I decided to use LVM in spanning mode with EBS disks, so that I have an easy way to expand my storage when needed. Initially I will create two EBS disks of 20GiB each, so a total of 40GiB will be available in my logical volume.
  • First get the zone of your running instance. It's important to create the volumes in the same availability zone as the instance you will be attaching them to (otherwise the attach process will fail).
    ec2-describe-instances

    The zone for your instance appears in the second-to-last column of the output. Also make a note of the 2nd column in the output: that's the instance id, which you will need when attaching the volumes.
    In my case the zone is us-east-1c.
  • Now create the two 20GiB volumes
    ec2-create-volume -s 20 -z us-east-1c
    ec2-create-volume -s 20 -z us-east-1c

  • Check the status and the volume ids
    ec2-describe-volumes

    If the status is "available", we can attach them:
    ec2-attach-volume VOLUME_ID1 -i INSTANCE_ID -d /dev/sdf
    ec2-attach-volume VOLUME_ID2 -i INSTANCE_ID -d /dev/sdg

At this point, you can either just create file systems on these volumes, mount them and start using them, or you can opt for LVM, which I will describe below.
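
The lookup of the instance id and zone described above can be scripted. The sketch below assumes the column layout mentioned in the text (instance id in the 2nd column, zone in the second-to-last column of the INSTANCE line); that layout may differ between versions of the API tools, so check it against your own output first. The function name instance_and_zone is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ec2-describe-instances output on stdin and print
# "INSTANCE_ID ZONE" for each INSTANCE line.
# Assumption (taken from the text above): the id is field 2 and the zone is
# the second-to-last field. Verify this against your tool version.
instance_and_zone() {
    awk '$1 == "INSTANCE" { print $2, $(NF - 1) }'
}
```

It would be used as: ec2-describe-instances | instance_and_zone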

LVM with EBS

  • First create the physical volumes
    pvcreate /dev/sdf
    pvcreate /dev/sdg

  • Create the volume group called vg1
    vgcreate vg1 -s 256m /dev/sdf /dev/sdg

    Note that we use a physical extent size of 256MB (instead of the default 4MB). This allows logical volumes of up to ~64k*256MB = 16TB, which seems enough for now. It also means that the size of a logical volume must be a multiple of 256MB, which is fine for us.
  • Show the volume group information
    vgdisplay vg1

    My output looks like this:
    --- Volume group ---
    VG Name               vg1
    System ID
    Format                lvm2
    Metadata Areas        2
    Metadata Sequence No  1
    VG Access             read/write
    VG Status             resizable
    MAX LV                0
    Cur LV                0
    Open LV               0
    Max PV                0
    Cur PV                2
    Act PV                2
    VG Size               39.50 GB
    PE Size               256.00 MB
    Total PE              158
    Alloc PE / Size       0 / 0
    Free  PE / Size       158 / 39.50 GB
    VG UUID               0SsrGe-X6u2-VB84-fh3z-GiuZ-ypC2-uJA8v9
    

    The "Total PE" line tells us that we have 158 physical extents.
  • Create the logical volume called lv1, with all the physical extents available
    lvcreate -l158 -n lv1 vg1

  • Show the logical volume information
    lvdisplay /dev/vg1/lv1

    My output looks like this:
    --- Logical volume ---
      LV Name                /dev/vg1/lv1
      VG Name                vg1
      LV UUID                BchvjQ-pRxO-byO2-3DJG-FJ01-Uowj-eM5qW5
      LV Write Access        read/write
      LV Status              available
      # open                 1
      LV Size                39.50 GB
      Current LE             158
      Segments               2
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           253:0
    
  • Let's format it and mount it
    mkfs.ext3 /dev/vg1/lv1
    mkdir /vol
    mount /dev/vg1/lv1 /vol
    
  • As I mentioned above, I'd like to keep a few directories on the EBS volume, because I need the data in these directories to persist even if my EC2 instance is terminated and restarted for whatever reason. Currently I have the following directories on the EBS volume:
    /home (users' data)
    /etc/httpd (apache configs)
    /etc/pki (SSL secure keys, etc)
    /var/www (all the websites)
    /var/trac (trac-ed projects)
    /var/lib/mysql (mysql data)
    
    To achieve this, I ran:
    mkdir -p /vol/home /vol/etc/httpd /vol/etc/pki /vol/var/www /vol/var/trac /vol/var/lib/mysql
    
    then add the following to /etc/fstab
    /dev/vg1/lv1          /vol              ext3    defaults,noatime        0 0
    /vol/home             /home             none    bind                    0 0
    /vol/etc/httpd        /etc/httpd        none    bind                    0 0
    /vol/etc/pki          /etc/pki          none    bind                    0 0
    /vol/var/www          /var/www          none    bind                    0 0
    /vol/var/trac         /var/trac         none    bind                    0 0
    /vol/var/lib/mysql    /var/lib/mysql    none    bind                    0 0
    
    then mount them all
    mount -a
  • Now you can import your data into these directories.
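
The extent arithmetic from the vgcreate step can be checked with a few lines of shell. This is only a worked example using the numbers quoted above; it treats the "~64k" extents figure as exactly 65536, which is an assumption.

```shell
#!/bin/sh
# Extent arithmetic using the numbers quoted in the LVM steps above.
pe_size_mb=256      # physical extent size passed to vgcreate -s
max_extents=65536   # the "~64k" figure from the text, assumed exact here
total_pe=158        # "Total PE" from the vgdisplay output

# Largest logical volume under that limit, in TB (1 TB = 1024 * 1024 MB)
max_lv_tb=$(( max_extents * pe_size_mb / 1024 / 1024 ))
# Size of the volume group, in MB
vg_size_mb=$(( total_pe * pe_size_mb ))

echo "max LV size: ${max_lv_tb} TB"
echo "VG size: ${vg_size_mb} MB"
```

This prints 16 TB and 40448 MB; 40448 / 1024 = 39.5, which matches the "VG Size 39.50 GB" line in the vgdisplay output.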

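The bind-mount entries in /etc/fstab all follow one pattern, so they can be generated from the list of directories rather than typed by hand. A minimal sketch, assuming the /vol mount point used above; the function name bind_fstab_lines is hypothetical, and the output is meant to be reviewed before it is appended to /etc/fstab.

```shell
#!/bin/sh
# Hypothetical helper: for each directory given, print the /etc/fstab line
# that bind-mounts /vol/DIR onto DIR, using the column layout shown above.
bind_fstab_lines() {
    for dir in "$@"; do
        printf '%-21s %-17s none    bind                    0 0\n' "/vol$dir" "$dir"
    done
}

# The directories kept on the EBS volume in this post
bind_fstab_lines /home /etc/httpd /etc/pki /var/www /var/trac /var/lib/mysql
```

The same directory list can be reused for the mkdir -p step, which keeps the two in sync when a directory is added later.
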
Scripts

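  • To re-bundle the currently running instance and upload it to S3:
    #!/bin/bash
    # Usage: SCRIPT IMAGE_NAME
    # EC2_ACCOUNT_NUMBER, EC2_ACCESS_KEY and EC2_SECRET_KEY must each hold the
    # path of a file containing the respective value; S3_BUCKET is the bucket name.
    
    # Print a comma-separated exclusion list: everything inside each bind-mounted
    # directory from /etc/fstab (the mount points themselves stay in the image),
    # followed by any extra paths passed as arguments.
    function ec2_exclude_dirs {
        local x="" dirs i
        dirs=$(grep bind /etc/fstab | awk '{print $2;}')
        for i in $dirs; do
            x="$x$(find "$i" -maxdepth 1 -mindepth 1 | tr '\n' ',')"
        done
        echo "$x$*"
    }
    
    # Print an error message and stop with a non-zero exit status
    function die {
        echo "$@" >&2
        exit 1
    }
    
    image=$1
    [ -n "$image" ] || die "Usage: $0 IMAGE_NAME"
    account=$(cat "$EC2_ACCOUNT_NUMBER")
    access_key=$(cat "$EC2_ACCESS_KEY")
    secret_key=$(cat "$EC2_SECRET_KEY")
    bucket=$S3_BUCKET
    
    mkdir "/mnt/$image" || die "could not make directory /mnt/$image"
    echo "Created directory /mnt/$image"
    
    # Build the exclusion list once; /mnt is excluded because the bundle is written there
    exclude=$(ec2_exclude_dirs /mnt)
    
    echo "Running ec2-bundle-vol -c $EC2_CERT -k $EC2_PRIVATE_KEY -u $account -d /mnt/$image -p $image -e $exclude -r i386 -s 4096"
    ec2-bundle-vol -c "$EC2_CERT" -k "$EC2_PRIVATE_KEY" -u "$account" -d "/mnt/$image" -p "$image" -e "$exclude" -r i386 -s 4096
    
    echo "Running ec2-upload-bundle -b $bucket -m /mnt/$image/$image.manifest.xml -a $access_key -s $secret_key"
    ec2-upload-bundle -b "$bucket" -m "/mnt/$image/$image.manifest.xml" -a "$access_key" -s "$secret_key"
    
    echo "Running ec2-register -n $image $bucket/$image.manifest.xml"
    ec2-register -n "$image" "$bucket/$image.manifest.xml"
    
    echo "Running ec2-describe-images"
    ec2-describe-images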
    
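  • To detach the EBS volumes:
    #!/bin/bash
    
    # Stop the services that use the bind-mounted directories
    /etc/init.d/httpd stop
    /etc/init.d/mysqld stop
    
    # Unmount the bind mounts in reverse fstab order, then the volume itself
    grep bind /etc/fstab | awk '{print $2;}' | tac | while read m; do
        echo "umount $m"
        umount "$m"
    done
    umount /vol
    
    # Deactivate all LVM volume groups
    /sbin/vgchange -a n
    
    # Detach every volume reported as attached. Note that this covers all
    # attached volumes in the account, not only those of this instance.
    ec2-describe-volumes | grep attached | awk '{print $2;}' | while read vol; do
        echo "ec2-detach-volume $vol"
        ec2-detach-volume "$vol"
    done
    
    # Give the detach requests some time, then show the result
    sleep 10
    ec2-describe-volumes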
    
  • To attach the EBS volumes, activate LVM and mount everything:
    #!/bin/bash
    
    # Print an error message and stop with a non-zero exit status
    function die {
        echo "$@" >&2
        exit 1
    }
    
    INSTANCE_ID=$1
    VOLUMES_FILE=${2:-volumes.txt}
    
    [ -n "$INSTANCE_ID" ] || die "Usage: $0 INSTANCE_ID [VOLUMES_FILE(=volumes.txt)]"
    [ -f "$VOLUMES_FILE" ] || die "No such file $VOLUMES_FILE. Usage: $0 INSTANCE_ID [VOLUMES_FILE(=volumes.txt)]"
    
    /etc/init.d/mysqld stop
    /etc/init.d/httpd stop
    
    # Attach each volume listed in the file to its device
    while read vol dev; do
        echo "ec2-attach-volume $vol -i $INSTANCE_ID -d $dev"
        ec2-attach-volume "$vol" -i "$INSTANCE_ID" -d "$dev"
    done < "$VOLUMES_FILE"
    
    # Give the attach requests some time to complete
    sleep 20
    
    # Load device-mapper, rescan for volume groups and activate them
    /sbin/modprobe dm-mod
    /sbin/vgscan --mknodes
    /sbin/vgchange -a y
    
    mkdir -p /vol
    mount -a
    /etc/init.d/mysqld start
    /etc/init.d/httpd start
    
    The file volumes.txt describes which volume should be attached as which device. Each line is in the form
    EBS_ID DEVICE
    for example
    vol-xxxxxxxx /dev/sdf
    vol-yyyyyyyy /dev/sdg
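
The attach script reads volumes.txt without validating it, so a malformed line only shows up as a failed ec2-attach-volume call. The following is a minimal sketch of a pre-check, assuming the two-column format described above. The function name check_volumes_file is hypothetical and not part of the original scripts.

```shell
#!/bin/sh
# Hypothetical pre-check for the volumes file used by the attach script.
# Prints every line that is not of the form "vol-ID /dev/DEVICE" and
# returns non-zero if any such line was found. Blank lines are ignored.
check_volumes_file() {
    awk '
        NF == 0 { next }
        NF != 2 || $1 !~ /^vol-/ || $2 !~ /^\/dev\// {
            printf "line %d is malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

One possible use is to call it at the top of the attach script, for example: check_volumes_file "$VOLUMES_FILE" || die "fix $VOLUMES_FILE first"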