by Diego Elio Pettenò — Flameeyes
An AMI (Amazon Machine Image) is usually referred to by an ID of eight hexadecimal digits prefixed with ami-; it is the bundle of the operating system. For our purposes an AMI consists of a volume snapshot, a kernel image (which is actually a GRUB image), and a name.
An instance is a running virtual machine started from an AMI. For our purposes, the instance by default receives a new EBS volume as its root filesystem (/dev/sda), created from the snapshot recorded in the AMI, and a local instance-store scratch space (/dev/sdb).
An EBS volume is a virtual volume where you can build a filesystem; volumes are attached as device partitions to the instance, such as /dev/sdh1.
Instance-store: since operations on EBS volumes are slow and are actually billed per million I/O operations, instances are usually given one or more instance-store devices, which come preformatted, are faster, and are not metered.
Official Amazon Howto: http://ec2-downloads.s3.amazonaws.com/user_specified_kernels.pdf
For a few months now, Amazon has allowed using a custom kernel configuration in place of the previous fixed selection of (mostly old) kernels. This works fine but has a few limitations:
To prepare an EBS-based AMI, as we have to do in this case, you have to create within the EC2 system a volume snapshot that can be registered as an AMI. When you then start that AMI, a temporary EBS volume is created; you can modify it as you need, but it will be deleted when you terminate the instance. If you do want to make lasting changes to the operating system (configuration, added packages, ...), you have to take a new snapshot of the running EBS volume and create a new AMI from that.
This basically means that you shouldn’t be “touching” the root filesystem, and especially that you should not rely on changes to it surviving a restart and other operations. Instead, you should use other, persistent volumes to store that data. For instance, if you want to have a PostgreSQL database running, you should create a new volume for it, get it attached, and mount it as /var/lib/postgresql. You might need to play with symlinks and non-default configuration files for software like Apache. A good alternative “big” approach is to make the whole of /var/lib an external, persistent volume.
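The snapshot-and-register cycle described above can be sketched in Python. This is only a minimal sketch, not the procedure used for this setup: it assumes a boto3-style EC2 client is passed in as `ec2` (boto3 postdates this post), and the function name `snapshot_to_ami`, its parameters, and the default architecture are hypothetical choices made for illustration.

```python
def snapshot_to_ami(ec2, volume_id, name, kernel_id,
                    root_device="/dev/sda", architecture="x86_64"):
    """Snapshot a root EBS volume and register a new AMI from the snapshot.

    `ec2` is assumed to behave like a boto3 EC2 client; the default
    architecture is an assumption, not something stated in the post.
    """
    # Take a snapshot of the (temporary) root EBS volume.
    snapshot_id = ec2.create_snapshot(
        VolumeId=volume_id,
        Description="root snapshot for " + name,
    )["SnapshotId"]

    # Wait until the snapshot is complete before registering an AMI on it.
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # An AMI here is a volume snapshot, a kernel (GRUB) image, and a name.
    image = ec2.register_image(
        Name=name,
        Architecture=architecture,
        KernelId=kernel_id,
        RootDeviceName=root_device,
        BlockDeviceMappings=[
            {"DeviceName": root_device, "Ebs": {"SnapshotId": snapshot_id}},
        ],
    )
    return image["ImageId"]
```

With boto3 installed, this would be called with something like `snapshot_to_ami(boto3.client("ec2"), volume_id, name, kernel_id)`, where the three identifiers come from your own account. Passing the client in as a parameter keeps the credentials on the machine that runs the script rather than on the image.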
As I said, the root volume should be treated as “tweak and go”: performance on it, especially write performance, is not really important, and inode availability also plays a very minimal role. What you care about is the persistent storage: /var/lib. Since /boot needs to be on ext*, the easiest solution is to use ext* for the whole of / and mount an external /var/lib separately.
Here comes the problem with this way of handling EBS: by default there is no way to tell an instance to always attach a given persistent EBS volume. To solve this you can use Rudy to start up the instance; in that case you have to use the Rudyfile I’m sending you and a basic patched amazon-ec2 (in my overlay), and then call it with ruby -r gentoo startup. This will attach the volumes, mount them appropriately, and finally start up the services that need them. Do note: if you have PostgreSQL on a separate permanent EBS volume, it cannot be started as part of the operating system boot; it has to be started after the volume is attached.
You could technically run the attach-and-mount within the instance itself, but the problems with that are: a) we don’t have a standardised way to do so; b) it requires you to store your EC2 credentials on the image (which is bad); and c) you still have to start up the instance somehow anyway, so it is easier to simply do the whole routine with Rudy.
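The ordering constraint in that startup routine (attach, then mount, then start the services) can be sketched as a small Python function. This is not Rudy's API or the contents of the Rudyfile; the function name `startup_steps`, the dictionary keys, and the `/etc/init.d/<service>` script path are assumptions made for illustration, and the real init script name may differ on your system.

```python
def startup_steps(instance_id, volumes, services):
    """Return ordered (action, detail) steps for bringing an instance up.

    Every volume is attached first, then every volume is mounted, and only
    then are the services that depend on those volumes started.
    """
    steps = []
    # 1. Attach each persistent EBS volume (done from outside the instance).
    for vol in volumes:
        detail = f"{vol['volume_id']} -> {instance_id} as {vol['device']}"
        steps.append(("attach", detail))
    # 2. Mount each attached volume at its mount point (run on the instance).
    for vol in volumes:
        steps.append(("run", f"mount {vol['device']} {vol['mountpoint']}"))
    # 3. Start the services last; the init script path is an assumption.
    for svc in services:
        steps.append(("run", f"/etc/init.d/{svc} start"))
    return steps


if __name__ == "__main__":
    plan = startup_steps(
        "i-00000000",  # placeholder instance ID
        [{"volume_id": "vol-00000000",  # placeholder volume ID
          "device": "/dev/sdh1",
          "mountpoint": "/var/lib/postgresql"}],
        ["postgresql"],
    )
    for action, detail in plan:
        print(action, detail)
```

Keeping the plan as plain data makes the dependency explicit: PostgreSQL appears only after its volume has been attached and mounted, which is why it cannot be part of the normal boot sequence.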