Amazon Simple Storage Service (Amazon S3) is an object storage service designed as storage for the internet. In this article, we'll talk through the strategies you can use to reduce Amazon S3 costs.

First, let's review the factors that affect Amazon S3 monthly costs. You will pay for:

The size of data stored each month (GB).
The number of access operations completed (e.g. PUT, COPY, POST, LIST, GET, SELECT, or other request types).
The number of transitions between different storage classes.
The size of data retrieved and the number of retrieval requests.
Data transfer fees (bandwidth out from Amazon S3).

One of the most important cost factors is the storage class. Make sure you understand the different classes available and their use cases. Let's quickly review and compare them.

Amazon S3 offers several storage classes. This article covers six of them:

S3 Standard
S3 Intelligent-Tiering
S3 Standard-IA
S3 One Zone-IA
S3 Glacier
S3 Glacier Deep Archive

Keep in mind that every S3 object can be assigned a specific storage class. Thus a bucket might hold objects with different classes simultaneously.

The S3 Standard class is typically used for frequently accessed data. Although the cost per GB is high, you don't pay a per-GB retrieval fee. Therefore this storage class is best suited for objects read or written several times each month.

The third class in the list is S3 Standard-IA. It has a lower storage price, but the access cost is higher. According to AWS, it should be used for long-lived but infrequently accessed data that needs instant access.

As a rule of thumb, S3 Standard-IA should be used if the object is accessed on average less than once a month. Why one month? Because that's the access frequency at which S3 Standard and S3 Standard-IA have roughly the same overall cost. It's also the minimum recommended amount of time to keep objects in the S3 Standard-IA class. If an object is kept for less than 30 days, the rest of that 30-day minimum is still charged.

Usually, it's difficult to know how often an object is accessed.
AWS created S3 Intelligent-Tiering to address this issue. This class automatically moves objects between a frequent-access tier and an infrequent-access tier, which are priced like S3 Standard and S3 Standard-IA. That minimizes the S3 cost for the object. If you are keeping an object for more than 30 days and don't know its access pattern, the S3 Intelligent-Tiering class will often be cheaper than choosing S3 Standard or S3 Standard-IA yourself. It should be your first option in those cases.

The S3 One Zone-IA class is similar to S3 Standard-IA. But, instead of storing data in 3 (or more) Availability Zones (AZs), data is stored in only one AZ. For this reason, data could be unavailable if the AZ fails. You should use S3 One Zone-IA only if you can tolerate this risk.

The last 2 classes are S3 Glacier and S3 Glacier Deep Archive. They have the lowest cost per GB, but the access cost is high. Therefore they are used for archiving purposes. They replace the tape libraries used on-premises.

You should keep in mind that Glacier objects aren't immediately available. If you want to access the contents of an object in any Glacier class, you will have to wait until retrieval ends. For S3 Glacier in Bulk retrieval mode, this time is between 5 and 12 hours. Other retrieval modes are faster but more expensive. For this reason, Glacier should only be used for objects that are rarely accessed. For example, Glacier is ideal for backups, archiving, and any long-term infrequently accessed data.

The difference between S3 Glacier and S3 Glacier Deep Archive is that the latter is for even less frequently accessed objects. For example, it's recommended for objects accessed every 6 (or more) months. S3 Glacier Deep Archive storage costs are lower, but the object needs to be stored for at least 180 days in that class. Otherwise, that minimum period will still be charged.

We have just described the main characteristics of each S3 class and the suggested use cases, so now we can start optimizing. Below we talk through the main strategies to reduce AWS S3 costs.

1. Set the Right S3 Class For New Objects Before Creation

Your first step is to analyze the access patterns for your data. Start thinking about the intended usage for each new object to be created in S3. Each object in S3 should have a specific access pattern, and therefore there is an S3 class that works best for it.

The right class should be applied to all new objects in Amazon S3. It's not possible to define a default class per bucket in S3, but you can assign it per object.

Start by defining the best class for each new object in S3, and set this class in the operation that uploads the object to Amazon S3. This can be done using the AWS CLI, AWS Console, or AWS SDK. As a consequence, each new object will have the right class.

This could be the best money-saving strategy in the long term, and probably the most time-efficient one.

2. Adjust the S3 Class For Existing Objects

Now that you have set the right class for new (to be created) objects, you can focus on the already-created objects.

The process is similar to the one described in the previous point. Start by analyzing data access patterns for every existing object in your S3 account. Then decide the best class for each one. Finally, assign that class in the object configuration. This will allow you to optimize every S3 bucket, and thus reduce your S3 costs.

How to check if this worked? You can use AWS Cost Explorer to check your daily S3 cost. You will also notice the cost reduction in next month's bill. AWS bills show the consumption for each service, including Amazon S3.

Consider that it can be time-consuming to update every object's class after it's created. That's why it's very important to set classes before objects are created (as previously described).

Note also that this process consists of an object-by-object (or bucket-by-bucket) revision. Depending on the number of objects that you have, it could take a considerable amount of time.
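To make the per-object class assignment from strategies 1 and 2 concrete, here is a minimal sketch in Python. It only builds the request parameters. The helper names are hypothetical, and the sketch assumes the parameters are later passed to a boto3 S3 client (`put_object` for new objects, `copy_object` onto the same key for existing ones), which is not created here.

```python
# Hypothetical helpers: they only build request parameters.
# Assumption: the result is passed to a boto3 S3 client, e.g.
#   client.put_object(**upload_args(...))
#   client.copy_object(**reclass_args(...))

# The six classes discussed in this article, as S3 API identifiers.
STORAGE_CLASSES = {
    "STANDARD",
    "INTELLIGENT_TIERING",
    "STANDARD_IA",
    "ONEZONE_IA",
    "GLACIER",
    "DEEP_ARCHIVE",
}


def _check_class(storage_class):
    """Reject storage class names that this sketch does not know about."""
    if storage_class not in STORAGE_CLASSES:
        raise ValueError(f"unknown storage class: {storage_class}")


def upload_args(bucket, key, body, storage_class):
    """Parameters for uploading a NEW object with an explicit class."""
    _check_class(storage_class)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": storage_class,
    }


def reclass_args(bucket, key, storage_class):
    """Parameters for copying an EXISTING object onto itself with a new class."""
    _check_class(storage_class)
    return {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},
        "StorageClass": storage_class,
        "MetadataDirective": "COPY",  # keep the existing metadata
    }
```

Keeping the parameter building separate from the API call makes it easy to review which class each object will get before any request is sent.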
It's probably better to focus on big (or very frequently accessed) objects first, and update their storage classes before the rest.

You might also use S3 Storage Class Analysis. This is a tool to analyze S3 objects' access patterns. It monitors the objects within a bucket, and it shows the amount of data stored in the bucket, the amount of data retrieved, and how frequently data is accessed (based on object age). Note that there is a small charge for this tool, but it allows you to understand whether the objects are accessed often. After you understand the access pattern, you can update the S3 storage class accordingly.

For example, if you find out that most objects in a bucket are accessed only once per year (and you don't need immediate access), then you should adjust their storage class to S3 Glacier Deep Archive.

3. Remove Unused S3 Objects

You have probably noticed that you pay for the amount of data stored on S3. So if you remove unused objects, you will also reduce S3 costs.

How to check the contents of your S3 buckets? There are several ways. For example, you can list the objects in each bucket. This will show object names (or keys) without downloading the objects' contents. This can be done using the AWS Console, AWS CLI, or SDK.

Another option to check S3 buckets' contents is using CloudWatch Metrics. Use the BucketSizeBytes metric to get the complete size of the bucket, or use the NumberOfObjects metric to get the number of objects stored in it. These are per-bucket metrics, and they will show you how big the buckets are. Then you can start removing any unused objects in the biggest buckets.

You can also activate S3 Inventory in a bucket. This tool prepares a CSV (or Apache ORC) file, which lists all objects in a bucket. The file is delivered to another S3 bucket on a daily or weekly basis. This is a good approach when you have thousands of objects in a bucket, and you want to quickly find some of their properties (like size, encryption status, or last modified time).
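Once you have a listing of a bucket, a short script can point out removal candidates. The sketch below is only an illustration. It assumes each entry is a dictionary with `Key`, `Size`, and `LastModified` fields, similar to the entries that boto3's `list_objects_v2` returns under `Contents`, and the function names are hypothetical. Note that `LastModified` is the last write time, not the last read time, so treat the result as a list to review rather than a list to delete.

```python
from datetime import datetime, timedelta, timezone


def summarize(objects):
    """Return (object count, total size in bytes) for a listing."""
    return len(objects), sum(obj["Size"] for obj in objects)


def stale_objects(objects, now, max_age_days):
    """Return objects not modified within max_age_days, biggest first."""
    cutoff = now - timedelta(days=max_age_days)
    old = [obj for obj in objects if obj["LastModified"] < cutoff]
    return sorted(old, key=lambda obj: obj["Size"], reverse=True)


if __name__ == "__main__":
    # Hypothetical listing; real entries would come from the S3 API
    # or from an inventory report.
    now = datetime.now(timezone.utc)
    listing = [
        {"Key": "reports/2019.csv", "Size": 4_000_000,
         "LastModified": now - timedelta(days=900)},
        {"Key": "reports/current.csv", "Size": 1_000_000,
         "LastModified": now - timedelta(days=3)},
    ]
    count, total = summarize(listing)
    print(f"{count} objects, {total} bytes")
    for obj in stale_objects(listing, now, max_age_days=365):
        print("review:", obj["Key"], obj["Size"])
```

Sorting the candidates by size puts the objects with the biggest potential saving at the top of the review list.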
Note that S3 Inventory has a small cost when active.

4. Use S3 Lifecycle Management

Amazon S3 offers a tool to automatically change the storage class of any object. For example, you can transition an object from the S3 Standard class to S3 Glacier some days after its creation. This way you can move each object to the most suitable storage class over time, which translates into a cost reduction.

How does S3 Lifecycle Management work? You set rules for each bucket. Each rule has a transition period, which counts the number of days since the object was created (or, for versioned objects, since it became non-current). The rule also sets the storage class to transition into after this period.

Note that you can always transition objects to a longer-term storage class, but you can't transition them to a shorter-term storage class.

You can also set a lifecycle rule for a whole bucket, or based on a prefix. So you don't need to transition your objects one by one.

S3 Lifecycle Management is one of the most useful tools to save costs on S3, and you should always consider using it.

5. Expire S3 Objects

This is another strategy to remove unused objects. Amazon S3 Lifecycle Management can also set an expiration policy. This allows you to expire an object some days after its creation. Every expired object will be automatically removed by AWS.

If you keep log files (or any other temporary data) as S3 objects, you should set an expiration for them. For example, you can set log objects to expire 30 days after creation, and they will be removed automatically.

6. Expire Incomplete Multipart Uploads

Amazon S3 uploads big objects using multipart upload. A big file is divided into smaller fragments, and each one is uploaded independently to S3. Then AWS joins the uploaded parts into the final object. AWS recommends using multipart uploads for objects larger than 100 MB, and it's required for objects over 5 GB.

It can take some time to upload big objects, and this upload process might be interrupted.
As a consequence, the S3 bucket will keep some unused fragments. To remove them, you can set a new lifecycle policy. Policies have a setting ("Clean up incomplete multipart uploads") to expire these partial objects. Removing them frees space on S3 and therefore reduces costs.

7. Compress S3 Objects

You can also compress any object before uploading it to S3. You just create a compressed file (e.g. ZIP, GZIP, or equivalent), which will be smaller than the original, and then upload the compressed file to S3. The amount of data stored in S3 will be lower, so Amazon S3 costs will be reduced.

Note that, to get the original file, you will have to download it and also decompress it. But you could save a lot of space in S3 (for example when storing text files).

8. Pack S3 Objects

Remember that you also pay for the number of operations done in Amazon S3. If you have to download many S3 objects simultaneously, it might be a good idea to pack them into one big object (e.g. TAR, ZIP, or equivalent).

Some storage classes have minimum capacity charges for objects. For example, the minimum capacity charge per object for S3 Standard-IA and S3 One Zone-IA is 128 KB. For the S3 Glacier and S3 Glacier Deep Archive classes, each object is billed with 40 KB of additional overhead. For this reason, a small 1 KB object in the S3 Standard-IA class will be charged as 128 KB.

Packing multiple files together makes better use of these minimum capacity charges. If you pack small objects together, you will also reduce your S3 costs.

9. Limit Object Versions

S3 Object Versioning is a very useful tool. Every time you change the contents of an object, AWS keeps the previous version of it. But if you have a 1 MB object with 100 versions, then you will be paying for 100 MB of storage.

You can use lifecycle policies to automatically delete previous versions after some time. For example, you can set a policy to delete object versions 30 days after they become non-current.
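As a rough illustration, the lifecycle settings discussed in strategies 4, 5, 6, and 9 (transition, expiration, multipart cleanup, and non-current version expiration) can be combined into a single rule. The sketch below only builds the configuration dictionary. It assumes the structure accepted by boto3's `put_bucket_lifecycle_configuration`; the function name, rule ID, prefix, and day counts are hypothetical.

```python
def build_lifecycle_config(prefix, transition_days, transition_class,
                           expire_days, noncurrent_days, abort_upload_days):
    """Build one lifecycle rule for all objects under `prefix`."""
    if expire_days <= transition_days:
        # Expiring before (or at) the transition would make it pointless.
        raise ValueError("expire_days must be greater than transition_days")
    rule = {
        "ID": "cost-optimization-sketch",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        # Strategy 4: move objects to a longer-term class after some days.
        "Transitions": [
            {"Days": transition_days, "StorageClass": transition_class}
        ],
        # Strategy 5: remove objects some days after creation.
        "Expiration": {"Days": expire_days},
        # Strategy 9: remove old versions after they become non-current.
        "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
        # Strategy 6: remove fragments of interrupted multipart uploads.
        "AbortIncompleteMultipartUpload": {
            "DaysAfterInitiation": abort_upload_days
        },
    }
    return {"Rules": [rule]}


if __name__ == "__main__":
    config = build_lifecycle_config(
        prefix="logs/",
        transition_days=30,
        transition_class="GLACIER",
        expire_days=365,
        noncurrent_days=30,
        abort_upload_days=7,
    )
    # Assumption: `client` would be a boto3 S3 client, for example
    #   client.put_bucket_lifecycle_configuration(
    #       Bucket="my-bucket", LifecycleConfiguration=config)
    print(config)
```

Building the rule as plain data first lets you print and review it before applying it to a bucket.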
Such a policy limits the number of versions stored and lowers the storage used. This is another approach to increase your savings.

10. Use Bulk Retrieval Mode for S3 Glacier

If you can wait some hours to retrieve the objects, you can save money. So try to use Bulk retrieval mode if possible. You can choose the retrieval mode when you request the retrieval.

11. Use Query-in-Place Functionality

Some applications store tables as Amazon S3 objects. These tables have a specific format (like JSON, CSV, or Apache Parquet). To query the data, you have to download the whole file, and then you have to query the whole table to find the desired data inside it.

But there are more efficient ways to get the contents. AWS offers tools like Amazon Athena, S3 Select, or Amazon Redshift Spectrum. These tools allow you to perform queries directly in the cloud. They process the data using SQL commands, and then they send you only the data you need.

These tools offer many advantages. As the queries are completed in the cloud, you will need less processing power locally. Another benefit is that you download less data from Amazon S3. This makes the process faster and cheaper.

Remember that you are charged by the amount of data downloaded from S3. If less data is requested, then you will save money on bandwidth.

Note that S3 queries have a small additional cost. You should evaluate whether they result in a net saving for your workload.

12. Change Region

Some regions are much more expensive than others, and this applies to Amazon S3 prices also. So it's worth considering moving your S3 bucket to a region with lower prices.

Another factor to consider is data transfer costs between AWS regions. Data sent from a bucket to a VPC in the same region is free, but sending data to a VPC in another region has a cost per GB. It's a good idea to keep a bucket in the region where its data is sent.

Summary

In this article, you learned the common strategies to reduce Amazon S3 costs.
Now it’s time to take action. Pick the strategies that work best for your workload in AWS and let me know how it goes!


12 Strategies to Reduce Amazon S3 Costs
