May 17, 2019

SKALE Decentralized Storage | SKALE

Dmytro Nazarenko

Developer

If you’ve been keeping up with SKALE, you’re aware of our high throughput SKALE Chains that free developers from the constraints of the Ethereum mainnet. While computation and throughput are cool, today we’re very excited to be releasing SKALE Storage to the community! In this article we’re going to be covering what SKALE FileStorage is and how it works.

The State of Things, Today

Storing data on Ethereum is EXPENSIVE - in fact, storing just 1GB of data on Ethereum would run you about 1250 ETH (assuming a gas price of 2GWei) and take ~8000 full blocks! Given this constraint, most developers and teams store as little on Ethereum as possible, turning to decentralized storage solutions (chiefly, IPFS) to meet their storage needs. The pattern is as follows:

Store some rich object in IPFS.
Get the IPFS hash of the object.
Store that IPFS hash on Ethereum.

And voila, the problem of costly storage on Ethereum has successfully been avoided! Except this isn’t actually the case - because IPFS has not yet integrated its incentivization mechanism for storing data (Filecoin), your data may be deleted at any time. So, businesses often run their own IPFS nodes and ‘Pin’ data to them to ensure that it won’t go away or trust a third party ‘Pinning Service’ - but that isn’t really decentralized, is it?

This all said, IPFS isn’t the only solution - there are plenty of others that businesses / developers are integrating to address their storage needs. But why have a separate system for storage, altogether? It was with this mindset that SKALE embarked on making a cost-effective storage layer within Ethereum capable of handling files up to 100MB.

What’s Hard About Storage?

First, let’s recap SKALE Chains. SKALE Chains are each comprised of ‘virtualized subnodes’ which run on ‘nodes’ in the network. Virtualized subnodes are partitions of these nodes (run by validators) in the network and participate in the network on an SKALE Chain basis. In this way, each validator is able to run multiple SKALE Chains with their node, simultaneously. With this partitioning of node resources into virtualized subnodes and random appointment of these virtualized subnodes to SKALE Chains, chances of collusion amongst nodes are greatly mitigated throughout the network.

With that established, what’s so hard about decentralized storage? Well, with second layer scaling solutions, the common thread amongst them is the outsourcing of computation and storage off-chain; this is true for things like Plasma, State Channels, as well as SKALE Chains. And with each solution’s unique approach, there come tradeoffs - with each being subject to the ‘Data Availability Problem’. This problem lies at the center of why decentralized storage is difficult.

For those not familiar, data availability is one of the most difficult things to deal with with respect to storage and requires some sort of redundancy between parties (backup, erasure coding, etc…). Given the modularity of these solutions, they often do not ship with their own redundancy layer but instead offer users the ability to choose their own method. Naturally, users should be wary of leveraging a centralized system for this (as it would be antithetical to idea of decentralized storage).

One of the main differentiators for SKALE Chains is the fact that they come with a built-in data availability protocol (users can still integrate their own, as well) to ensure that data is stored on at least ⅔ of virtualized subnodes in each SKALE Chain. We cover this in more detail and how we address it, here. And because of this protocol, we have been able to create and ship a FileStorage system which runs on SKALE and a modified EVM to provide this feature to existing Ethereum developers.

The following section, will give you a good idea of how everything works together - for those wanting to dive right in to the code, take a look at FileStorage.js and the FileStorage demo repo.

How it Works

SKALE FileStorage consists of a few layers:

FileStorage.js is a simple npm package allows users to integrate SKALE FileStorage into their decentralized applications with just a few lines of code. This package is an interface to Filestorage.sol which calls a precompiled smart contract (Precompiled.cpp) which has access to the node’s native filesystem. Each of these layers serves an integral part to the process of uploading, downloading, and deleting of files with the EVM.

Uploading

To upload a file, we call the uploadFile method with the following parameters:

{string} address - Your Ethereum address.
{string} fileName - The name of the file being uploaded.
{number} fileSize - The size of the file being uploaded.
{ArrayBuffer} fileBuffer - The data of your file converted into hex.
{boolean} [logs=false] - Logging flag.
{string} [privateKey] - Your private key (required for signing transactions).

After ensuring that your fileBuffer is in the proper format, a transaction is sent to FileStorage.sol to ensure that the specified account does not already have a file of that name. If this check passes, Precompiled.cpp is called to verify that the file is less than 100MB and that there is enough space on the node’s filesystem to free a contiguous block of memory in which to store the file.

In the case that there isn’t enough space available, the caller will be prompted to upgrade their SKALE-Chain or delete some files that are currently stored on their nodes. Otherwise, the space will be freed, the file will be updated to a status of UPLOADING, a FileInfo struct will be created in FileStorage.sol containing the name, size, and a boolean array indicating which chunks of a file have been uploaded.

Once all of these operations have completed, the user will be charged the cost of that storage (~.01ETH/MB) and the file will begin uploading. To begin the uploading process, the file is broken into chunks of 1MB or less which are uploaded as separate transactions to FileStorage.sol. Because anyone can call the uploadChunk function, it is necessary that we check that the status of the file is UPLOADING, that the file belongs to the transaction sender, and that each chunk is less than 1MB and has not already been uploaded before proceeding. If all of these checks pass, the chunk will be uploaded to the node’s filesystem with Precompiled.cpp. If chunk upload fails, the transaction will be retried before notifying the user that their data may be corrupt. In the event that the chunk is uploaded, the index for that chunk in the boolean array will be set to true.

Note: Modifications to the SKALE EVM (including blockSize) allow for the uploading for 1MB chunks of data.

Lastly, the finishUpload function will be called which verifies that the file is in an UPLOADING status and that all chunks have been uploaded before updating the status of the file to UPLOADED. FileStorage.js will then return the path of the uploaded file [ACCOUNT]/[FILENAME] as a string to the user.

Downloading

When downloading a file, we can select to download the file into the browser by calling downloadFileIntoBrowser or into a buffer by calling downloadFileIntoBuffer. The only difference between these two functions is that one creates a writeStream and writes the file to your local computer while the other returns a Buffer. Both take the following parameters:

{string} storagePath - The path of the file in Filestorage.
{boolean} [logs=false] - Logging flag.

When downloading a file, the first thing which occurs is a lookup of the file’s size based upon its path using the getFileSize function. Once the size of the file has been read, the file is iterated over in 1MB chunks using readChunk which verifies that the corresponding file as well as the current chunk have been successfully stored on each node before reading it from storage and loading it into a Buffer which will either be returned to the user (in the case of downloading to a buffer) or written to a stream (in the case of downloading to the browser).

Note: Unless the size of the file is a perfect multiple of 1MB, the last chunk will be less than 1MB.

Deleting

To delete a file, we call the deleteFile method with the following parameters:

{string} address - Your Ethereum address.
{string} fileName - The name of uploaded file.
{boolean} [logs=false] - Logging flag.
{string} [privateKey] - Your private key (required for signing transactions).

This will execute the deleteFile method in FileStorage.sol which will verify that the file exists and is either in the UPLOADING or UPLOADED state. The file is then deleted from a node’s storage with the C++ remove function in Precompiled.cpp.

If the file has been successfully removed from each node’s filesystem, the file’s status will be updated to EMPTY and its respective FileInfo struct will be deleted from the smart contract.

Good News!

Now that you know how it all works, you just need to add a few lines of code to your project to integrate SKALE Storage! Take a look at FileStorage.js to get started and the FileStorage demo repo as an example for how to integrate this into your project! Also, feel free to also check out SKALE’s Technical Overview and Consensus Overview for a deeper dive into how SKALE works and is able to provide 20,000 TPS.

SKALE’s mission is to make it quick and easy to set up a cost-effective, high-performance SKALE Chain that runs full-state smart contracts. We aim to deliver a performant experience to developers that offer speed and functionality without giving up security or decentralization. Follow us here on Telegram, Twitter, and Discord, and sign up for updates via this form on the SKALE website.

Build on SKALE

SKALE Grant Programs offer developer funding, engineering and QA support, marketing and PR, and access to investor networks.

Apply to SKALE Grant Programs

SKALE Decentralized Storage | SKALE

The State of Things, Today

What’s Hard About Storage?