What data is stored in ephemeral storage of an Amazon EC2 instance?
Anything that is not stored on an EBS volume mounted to the instance will be lost. For instance, if you mount your EBS volume at /mystuff, then anything not in /mystuff will be lost. If you don’t mount an EBS volume and save your data on it, then I believe everything will be lost.
You can create an AMI from your current machine state, which will contain everything in your ephemeral storage. Then, when you launch a new instance based on that AMI it will contain everything as it is now.
Meaning of “Warning: Please note that any data on the ephemeral storage of your instance will be lost when it is stopped”
There is a difference between “stop” and “terminate”. If you “stop” an instance that is backed by EBS then the information on the root volume will still be in the same state when you “start” the machine again.
Basically, the root volume (your entire virtual system disk) is ephemeral, but only if you choose to create an AMI backed by the Amazon EC2 instance store.
If you choose to create an AMI backed by EBS, then your root volume is on EBS and everything on it is kept across reboots and across a stop followed by a start.
If you are not sure what type of volume you have, look under EC2->Elastic Block Store->Volumes in your AWS console; if your root volume is listed there, you are safe. Alternatively, go to EC2->Instances and check the “Root Device” column for your instance: if it says “ebs”, you don’t have to worry about data on your root device.
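The same check can be scripted. The following is a minimal sketch, assuming you already have a response in the shape returned by boto3's `describe_instances` call, where each instance carries a `RootDeviceType` field of either `ebs` or `instance-store`. The helper names and the sample instance IDs are made up for illustration.

```python
from typing import Any, Dict


def root_device_types(response: Dict[str, Any]) -> Dict[str, str]:
    """Map each instance ID to its root device type.

    `response` is expected to look like the dict returned by
    boto3.client("ec2").describe_instances().
    """
    result: Dict[str, str] = {}
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            result[instance["InstanceId"]] = instance.get("RootDeviceType", "unknown")
    return result


def root_survives_stop(root_device_type: str) -> bool:
    """Only an EBS-backed root volume keeps its data across stop/start."""
    return root_device_type == "ebs"


# Hand-written sample data (hypothetical IDs), not real API output.
sample_response = {
    "Reservations": [
        {
            "Instances": [
                {"InstanceId": "i-0123456789abcdef0", "RootDeviceType": "ebs"},
                {"InstanceId": "i-0fedcba9876543210", "RootDeviceType": "instance-store"},
            ]
        }
    ]
}

for instance_id, device_type in root_device_types(sample_response).items():
    verdict = "kept on stop" if root_survives_stop(device_type) else "LOST on stop"
    print(f"{instance_id}: root device is {device_type} ({verdict})")
```

Against a real account you would pass in the live response instead of `sample_response`; that part is left out here because it needs credentials.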
Cost to store 1 GB in US-East-1 (updated 2016.dec.20):
- Glacier: $0.004/Month (Note: Major price cut in 2016)
- S3: $0.023/Month
- S3-IA (announced in 2015.09): $0.0125/Month (+$0.01/GB retrieval charge)
- EBS: $0.045-0.1/Month (depends on speed – SSD or not) + IOPS costs
- EFS: $0.3/Month
Further storage options, which may be used for temporarily storing data while or before processing it:
- Kinesis stream
- DynamoDB, SimpleDB
The costs above are just samples. They can differ by region and can change at any point. There are also extra costs for data transfer (out to the internet). However, they show the ratio between the prices of the services.
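To make the ratio concrete, here is a small sketch that uses only the sample prices listed above (US-East-1, late 2016). The two EBS entries are the low and high ends of the quoted range, and IOPS, retrieval and transfer charges are ignored. The function names are my own.

```python
# Sample per-GB monthly prices from the list above (not current prices).
PRICE_PER_GB_MONTH = {
    "Glacier": 0.004,
    "S3-IA": 0.0125,
    "S3": 0.023,
    "EBS (low end)": 0.045,
    "EBS (high end)": 0.10,
    "EFS": 0.30,
}


def monthly_cost(service: str, gigabytes: float) -> float:
    """Storage-only cost in USD for one month."""
    return PRICE_PER_GB_MONTH[service] * gigabytes


def ratio_to(baseline: str) -> dict:
    """Price of every service expressed as a multiple of the baseline's price."""
    base = PRICE_PER_GB_MONTH[baseline]
    return {name: price / base for name, price in PRICE_PER_GB_MONTH.items()}


if __name__ == "__main__":
    # Cheapest first, each shown relative to S3.
    for name, ratio in sorted(ratio_to("S3").items(), key=lambda item: item[1]):
        print(f"{name:15s} {ratio:6.2f}x the S3 price")
    print(f"500 GB on S3 for a month: ${monthly_cost('S3', 500):.2f}")
```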
There are a lot more differences between these services:
EFS:
- Generally available (out of preview), but may not yet be available in your region
- Network filesystem (that means it may have higher latency, but it can be shared across several instances; even between regions)
- It is expensive compared to EBS (roughly 3x to 7x more at the sample prices above) but it gives extra features.
- It’s a highly available service.
- It’s a managed service
- You can attach the EFS storage to an EC2 Instance
- Can be accessed by multiple EC2 instances simultaneously
- Since 2016.dec.20 it’s possible to attach your EFS storage directly to on-premises servers via Direct Connect.
EBS:
- Block storage (so you need to format it). This means you are able to choose which type of file system you want.
- As it’s block storage, you can use RAID 1 (or 0 or 10) across multiple volumes
- It is really fast
- It is relatively cheap
- With the new announcements from Amazon, you can store up to 16 TB of data per volume on SSDs.
- You can snapshot an EBS volume (while it’s still in use) for backup purposes
- But it only exists in a particular Availability Zone of one region. Although you can migrate it to another region, you cannot just access it across regions (only if you share it via the EC2 instance; but that means you have a file server)
- You need an EC2 instance to attach it to
- New feature (2017.Feb.15): You can now increase volume size, adjust performance, or change the volume type while the volume is in use. You can continue to use your application while the change takes effect.
S3:
- An object store (not a file system).
- You can store files and “folders” but can’t have locks, permissions etc like you would with a traditional file system
- This means, by default you can’t just mount S3 and use it as your webserver
- But it’s perfect for storing your images and videos for your website
- Great for short term archiving (e.g. a few weeks). It’s good for long term archiving too, but Glacier is more cost efficient.
- Great for storing logs
- You can access the data from every region (extra costs may apply)
- Highly available and redundant. Data loss is extremely unlikely (99.999999999% durability, 99.9% uptime SLA)
- Much cheaper than EBS.
- You can serve the content directly to the internet; you can even have a full (static) website working directly from S3, without an EC2 instance
Glacier:
- Long term archive storage
- Extremely cheap to store
- Potentially very expensive to retrieve
- Takes up to 4 hours to “read back” your data (so only store items you know you won’t need to retrieve for a long time)
There are several interesting aspects in terms of pricing. For example, Glacier, S3 and EFS allocate the storage for you based on your usage, while with EBS you need to predefine the allocated storage, which means you need to overestimate. (Although it’s easy to add more storage to your EBS volumes, it requires some engineering, so in practice you always “overpay” for your EBS storage, which makes it even more expensive.)
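A rough way to see the “overpay” effect is to compute what you pay per GB you actually use. This is only a sketch: the 100 GB provisioned / 40 GB used figures are invented for illustration, the $0.10 and $0.023 prices are the sample values quoted earlier, and IOPS and request costs are ignored.

```python
def effective_price_per_used_gb(
    price_per_provisioned_gb: float, provisioned_gb: float, used_gb: float
) -> float:
    """Monthly price per GB actually used when you pay for provisioned capacity."""
    if used_gb <= 0 or used_gb > provisioned_gb:
        raise ValueError("used_gb must be positive and no larger than provisioned_gb")
    return price_per_provisioned_gb * provisioned_gb / used_gb


# Hypothetical volume: 100 GB provisioned on EBS at $0.10/GB, only 40 GB in use.
ebs_effective = effective_price_per_used_gb(0.10, provisioned_gb=100, used_gb=40)

# Usage-billed services (S3, EFS, Glacier) charge only for what is stored,
# so their effective price equals the list price.
s3_price = 0.023

print(f"EBS effective price: ${ebs_effective:.3f} per used GB-month")
print(f"S3 price:            ${s3_price:.3f} per used GB-month")
```

The fuller the volume, the closer the effective price gets to the list price, so the gap depends on how much headroom you decide to keep.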