The Question Of Control Over Data in Shared Databases

Andrey Zhulin (@andrey-zhulin)

Founder and CEO at Insolar. Founder and Board Member at SberMarket.

Data is one of the most important tools of our time. Its usage is now pervasive throughout all aspects of society: from air transport to banking, construction to dentistry, from education to farming and beyond.
More and more, we are combining data to better understand how the world works and to improve upon it. Merging different data sources in this way creates value that no single source could offer alone.
It has become common knowledge that data sharing can be used to improve the world and the lives of the people within it. As such, database sharing has become a key goal across a wide range of sectors.
The new oil
The world’s biggest businesses are built on data. For example, Amazon rose to its preeminent position as a tech giant in retail precisely through collecting and merging data sets. It is now known that Amazon’s strategy was to first partner with rivals, study their sector’s value chain through the data they provided, and then expand directly into their business.
“They learned a tonne on our dime, and we didn’t learn much,” former chief strategy officer of Target, Carl Casey, later remarked about Amazon.
So the value of data is clear to see. And the example of how Amazon was able to exploit others’ data to its own advantage gives corporate executives ample reason to exercise caution when entering into data-sharing agreements. However, by sharing data, industry collaborators can reduce transactional friction in their operations.
The question is: could there be a way to bring data from various sources into one system for greater transactional efficiency, while assuring the enterprises that participate that their precious data will not be surrendered for others to take advantage of?
Distributed ledgers as the answer
Distributed ledger technology (DLT) holds this promise, but incumbent tech is yet to achieve sufficient traction in business networks, so the technology as a whole remains unproven.
The reluctance to adopt can be put down to several factors: scalability, ease of use, the interoperability of different network types (public and private), regulatory compliance, and on-chain storage of large data sets.
Even so, the main advantage of DLT for business networks remains: once these bottlenecks are overcome, they can benefit from combining their data without losing control over it.
The need for the right tech
The problem with current shared databases is that trust in how they are governed and used has to be placed in a centralized authority. Advanced distributed ledger technology instead lets businesses operate in shared databases while retaining ownership of their intellectual property.
The network isn’t owned or controlled by any single participant, which ensures greater trust in the network as a whole.
Concerns around control
The most well-known DLT networks are completely open and public. They incorporate a blockchain, which chains data together to create a historical record of all changes made.
However, a completely open network with full public access is hardly one that businesses will be ready to use; their data holds value to them when kept secret.
One of the advantages of blockchain is that data placed on-chain cannot be arbitrarily deleted by any of the network participants. In an enterprise environment, however, parties that contribute data may later want to withdraw from the network, or may have to comply with data erasure regulations such as those stipulated in the GDPR. In those cases, strict immutability becomes unworkable.
This adds to the hesitation and caution that company executives show when deliberating the adoption of DLT.
This calls for a specific type of DLT that is able to link data, yet comply with erasure. Some blockchain platforms offer ways around this by storing such data off-network and placing only a reference to it, in the form of a fingerprint, on the network.
However, this in itself becomes impractical, as it limits how far operations that use the data can be automated.
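The off-network approach can be sketched in a few lines. This is only an illustration under stated assumptions: the names off_chain_store, ledger, publish, erase and verify are hypothetical, SHA-256 is assumed as the fingerprint function, and no particular blockchain platform's API is being reproduced.

```python
import hashlib

# Hypothetical off-network store (in practice a private database or file system).
off_chain_store = {}
# Hypothetical shared ledger: it only ever holds fingerprints, never the data.
ledger = []


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def publish(data: bytes) -> str:
    """Keep the data off-network and put only its fingerprint on the ledger."""
    digest = fingerprint(data)
    off_chain_store[digest] = data
    ledger.append(digest)
    return digest


def erase(digest: str) -> None:
    """Delete the off-network data; the fingerprint stays on the ledger."""
    off_chain_store.pop(digest, None)


def verify(digest: str) -> bool:
    """Check that the stored data still exists and matches its fingerprint."""
    data = off_chain_store.get(digest)
    return data is not None and fingerprint(data) == digest
```

Because the ledger sees only the fingerprint, any logic running on the network cannot read or act on the data itself, which is the automation limit described above.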
There needs to be a way to scramble data previously placed on the network, without breaking the historical chain.
What is the solution?
One way around this could be to use a system of different hashes that are stored together with the data. The blockchain is constructed from those hashes. Should the data be deleted, the hashes remain, so the chain’s integrity is not compromised.
Just as with any other ledger entry, the alteration of the data object’s state is recorded, but in this case the valuable data itself is deleted. The data can therefore no longer be retrieved or validated against its hash.
Moreover, such a DLT records the reason(s) why the data was removed, meaning network participants can easily see why they no longer have access to it.
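To make the idea concrete, here is a minimal sketch of a chain that links entries through hashes only, so that a payload can be erased without breaking the links. It is an illustration under assumptions, not a description of any specific platform: the Entry and Ledger classes, their field names, and the choice of SHA-256 and JSON are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Entry:
    prev_hash: str        # hash of the previous entry (the chain link)
    data_hash: str        # hash of the payload, kept even after erasure
    data: Optional[str]   # the payload itself; None once erased
    kind: str             # "create" or "erase"

    def entry_hash(self) -> str:
        # The link covers only hashes and the kind, never the raw payload.
        return sha256_hex(json.dumps([self.prev_hash, self.data_hash, self.kind]))


class Ledger:
    def __init__(self) -> None:
        self.entries = []

    def append(self, data: str, kind: str = "create") -> int:
        prev = self.entries[-1].entry_hash() if self.entries else GENESIS
        self.entries.append(Entry(prev, sha256_hex(data), data, kind))
        return len(self.entries) - 1

    def erase(self, index: int, reason: str) -> None:
        # Drop the payload but keep its hash, then record the erasure
        # and its reason as a new ledger entry.
        self.entries[index].data = None
        self.append(json.dumps({"erased": index, "reason": reason}), kind="erase")

    def chain_is_intact(self) -> bool:
        prev = GENESIS
        for entry in self.entries:
            if entry.prev_hash != prev:
                return False
            # A surviving payload must still match its recorded hash.
            if entry.data is not None and sha256_hex(entry.data) != entry.data_hash:
                return False
            prev = entry.entry_hash()
        return True
```

In this sketch, calling erase(0, "GDPR request") on a ledger leaves chain_is_intact() returning True, because every link depends on data_hash rather than on the payload. One caveat worth noting: a plain hash of a short or guessable payload can be brute-forced, so a real design would likely need a salted or keyed hash; that is outside what this sketch covers.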
Capturing all the value of data 
Enterprise networks can reduce transactional friction in their data operations between one another in order to capture all the value that data has to offer. This will not only boost the efficiency of current operations but also give rise to completely new business models and create even more value.
Instead of storing data in silos, through participation in a single database and sharing within it, collaborators in enterprise networks can transact more efficiently, while reducing issues that currently exist in the recording and transmitting of data. Furthermore, participants in such networks can easily determine provenance and spot trends/patterns.
Enterprise leaders will need to take the leap of faith with well-thought-through technologies that protect their interests in terms of their control over their data, while offering all the benefits that data collaboration has to offer.
