The Side Effects of Using Disk Space

Freakymittal
3 min read · Feb 10, 2021

This post isn’t here to tell you that you shouldn’t use disk space as a storage solution in your applications. It’s just to make you aware that if you do use it, there are several long-term implications which you’ll need to take care of.

Saving Assets

Imagine you’re making (or learning to make) a new Instagram rival. At the bare minimum you’ll need:

  • An API endpoint (or a similar alternative) through which someone can upload a pic, a self-portrait, or a new Monet (at least the user thinks so), and which responds with a public link, so that they can share it on their local community’s 21st-century modern art portfolio Facebook page
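To make that concrete, here’s a minimal sketch of what the core of such an endpoint might look like when it stores everything on the server’s own disk. The names `UPLOAD_DIR`, `PUBLIC_BASE_URL` and `save_upload` are made up for illustration, and the HTTP layer around it is left out.

```python
import tempfile
import uuid
from pathlib import Path

# Hypothetical values for illustration; not from any real deployment.
UPLOAD_DIR = Path(tempfile.gettempdir()) / "uploads"
PUBLIC_BASE_URL = "https://example.com/media"


def save_upload(data: bytes, extension: str = "jpg",
                upload_dir: Path = UPLOAD_DIR) -> str:
    """Write the uploaded bytes to local disk and return a public link."""
    upload_dir.mkdir(parents=True, exist_ok=True)
    # A random file name avoids collisions between uploads.
    filename = f"{uuid.uuid4().hex}.{extension}"
    (upload_dir / filename).write_bytes(data)
    return f"{PUBLIC_BASE_URL}/{filename}"


if __name__ == "__main__":
    print(save_upload(b"not really a Monet"))
```

Notice that every upload lands on the same disk your application runs on, and nothing here ever checks how much of that disk is left.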

Congratulations, you deployed your application for the world to use and Instagram noticed the traction you’re getting.

Now one of 2 things can happen:

  • Instagram tries to acquire you, because you just made a new feature which they badly need
  • They see you as just another rival and try to get you out of business (I have no knowledge of how they do business, it’s just a hypothetical, so please don’t sue me)

In either case, they’ll try to assess the maturity level of the developers who are getting their attention. In case of an acquisition, they’ll audit your codebase (which you pass with flying colours) and load test your infra (which you fail, I’ll get into the details in just a minute). In case of rivalry, they’ll simply upload too many pictures (and as you’re a free portal, they’re able to do so without any hurdle). After a while your API starts to throw errors, because your server’s disk is full. (That’s assuming they don’t kill your server with the sheer number of requests first. If they don’t, the sheer number of files on it definitely will.)

Now you figure out your mistake: you stored the images (or the pieces of art) on the server’s own disk, and that disk is full. On top of that, to reduce the cost of the infra you picked a cheap HDD/SSD, which is now throttling your I/O. Now you start to see the cascading effect:

  1. Disk space is getting full
  2. I/O is getting throttled
  3. As I/O is getting throttled, every upload takes longer to write to disk, so the in-flight uploads get held up in your RAM
  4. After a while, the CPU starts to spike, because it simply has too many things to process
  5. And then BOOM, you can’t even SSH into your server, so you can’t even see the BOOM
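If you do stay on local disk, the first step of that cascade is at least something you can see coming. Here’s a rough sketch, assuming you’re willing to reject uploads before the disk actually fills up. The helper name `has_room_for` and the 10% threshold are arbitrary choices for illustration, not a recommendation.

```python
import shutil


def has_room_for(path: str, incoming_bytes: int,
                 min_free_ratio: float = 0.10) -> bool:
    """Return True if writing incoming_bytes still leaves at least
    min_free_ratio of the disk free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    free_after = usage.free - incoming_bytes
    return free_after >= usage.total * min_free_ratio


if __name__ == "__main__":
    # Would a 5 MB upload still leave 10% of this disk free?
    print(has_room_for(".", 5 * 1024 * 1024))
```

This only buys you a clean error instead of a dead server. It does nothing about throttled I/O, and the disk still fills up eventually.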

Coming to the crux of the point: this whole hypothetical mess could’ve been avoided if you had simply outsourced the storage.

Solutions

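  • Use AWS S3 (love the AWS ecosystem, and why not). It’s cheap, it’s designed for 11 9’s of durability (the availability target is lower, 99.99% for the Standard storage class), it’s got virtually unlimited storage, you can add triggers on various events, and you can make the files publicly available so that the download traffic won’t even reach your server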
  • In a case where you have to use FTP/SFTP (I can’t imagine why, in this hypothetical), use AWS Transfer for SFTP (now part of AWS Transfer Family), which is basically an S3-backed SFTP server, so you get all the benefits of S3 along with all the legacy things which FTP provides
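As a rough sketch of the S3 route, here’s one way the upload itself can skip your server entirely: your API hands the client a short-lived presigned URL, and the client PUTs the image straight to S3. This assumes the boto3 library; the function name `create_upload_link`, the bucket name and the key are all hypothetical.

```python
def create_upload_link(s3_client, bucket: str, key: str,
                       expires_in: int = 300) -> str:
    """Return a short-lived URL that a client can PUT the image to directly.

    s3_client is expected to be a boto3 S3 client, i.e. boto3.client("s3").
    """
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds until the link stops working
    )


# Usage sketch (needs `pip install boto3` and configured AWS credentials;
# "my-art-bucket" and the key below are made-up names):
#
#   import boto3
#   s3 = boto3.client("s3")
#   url = create_upload_link(s3, "my-art-bucket", "uploads/monet.jpg")
```

The client is passed in rather than created inside the function, so the same code can be exercised with a stub in tests. Your server only signs the URL; the image bytes never touch its disk, RAM, or I/O.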

I’m not affiliated with Amazon or AWS (at least not yet ;)), but I’m more than happy to plug them (Hey AWS marketing team, reach out to me for sponsor deals ;))
