The goal of the project is to develop the relevant theory, algorithms, and a prototype implementation of an Internet-scale, peer-to-peer storage system suitable for a range of applications, including distributed online social networks.
Nebulostore is funded by the Foundation for Polish Science with a Homing Plus grant (May 2011–April 2013).
lack of privacy in centralized systems
The proliferation of Internet access and the increased popularity of blogs and social networking websites are changing the way people work, communicate and live. However, all popular social websites are organizationally centralized: each is controlled by a single stakeholder. The largest websites store information about the interests, contacts and history of hundreds of millions of users. This huge amount of data creates a powerful incentive for the providers of such services (and for hackers alike) to profile individual users. Profiling enables not only very precise advertising, but also a range of activities from the delinquent to the criminal: peeping, blackmail, scams, and identity theft. Users are starting to become aware of the consequences of this loss of privacy. However, at present the only alternative is not to use social websites at all, and thus to lose their many useful functions.
distributed online social networks
Perhaps the safest solution to the loss of privacy is not to store all the data in a repository controlled by a single entity, but instead to distribute it among many independent entities (users or smaller-scale storage providers) and to apply careful access control rules. The principle of Distributed Online Social Networks (DOSN), such as PeerSoN, is to deliver all the functions of an Online Social Network (OSN) over a peer-to-peer (p2p) infrastructure that (1) distributes OSN functions among the users (which guarantees that no single entity stores too much data or controls too much functionality); and (2) protects the data by allowing each user to specify fine-grained access control rules.
The key component of a DOSN is a distributed storage system that ensures both privacy and high availability of the data, because the owner of the data is not always online.
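To make the availability requirement concrete, here is a minimal sketch of how replication could compensate for peers that are often offline. It rests on simplifying assumptions that are not part of the project description: each replica holder is online independently with the same probability, and an object counts as available when at least one replica holder is online. The function names are hypothetical.

```python
def replica_availability(p_online: float, replicas: int) -> float:
    """Probability that at least one of `replicas` holders is online.

    Assumes (for illustration only) that holders are online independently,
    each with probability `p_online`.
    """
    if not 0.0 <= p_online <= 1.0:
        raise ValueError("p_online must be in [0, 1]")
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    # The object is unavailable only if every holder is offline at once.
    return 1.0 - (1.0 - p_online) ** replicas


def replicas_needed(p_online: float, target: float, max_replicas: int = 1000) -> int:
    """Smallest replica count whose availability reaches `target`."""
    for k in range(1, max_replicas + 1):
        if replica_availability(p_online, k) >= target:
            return k
    raise ValueError("target availability not reachable within max_replicas")


if __name__ == "__main__":
    # Peers that are online half of the time, target availability of 99%.
    print(replica_availability(0.5, 3))
    print(replicas_needed(0.5, 0.99))
```

Under these assumptions, even peers that are online only half of the time can reach high availability with a handful of replicas. Correlated downtime (for example, peers in the same time zone going offline at night) would weaken this estimate.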
a universal internet-scale storage
A universal storage system can also change the way other Internet-based applications work (such as online document editing, e.g., Google Docs). Currently, storage in these applications is tightly coupled with the core functionality, be it editing or sending messages. Such coupling makes it cumbersome for users to change the application provider: even if there is a better application (or one that provides specific functions absent from solutions aimed at the general public), it is difficult to use because there is no effortless way to migrate the data. The current provider simply has no incentive to make competitors' products easier to use. A standard storage mechanism liberates the data. In principle, data storage is independent of the functions or services offered on the data (assuming, of course, that access is efficient). Such a decoupling would make it easier for users to switch between applications, and thus create a powerful incentive for developers to create innovative Internet applications.
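The decoupling described above can be illustrated with a small sketch in which two unrelated applications operate on the same data through one storage interface. Everything here is hypothetical and chosen only for illustration: the `Storage` interface, the in-memory stand-in for a real distributed store, and the two toy applications are not taken from the project.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    """Hypothetical application-independent storage interface."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryStorage(Storage):
    """Stand-in for a real distributed store; keeps objects in a dict."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


class PlainEditor:
    """Toy application that writes documents to whatever storage it is given."""

    def __init__(self, storage: Storage) -> None:
        self.storage = storage

    def save(self, name: str, text: str) -> None:
        self.storage.put(name, text.encode("utf-8"))


class WordCounter:
    """A second, unrelated application reading the same data."""

    def __init__(self, storage: Storage) -> None:
        self.storage = storage

    def count(self, name: str) -> int:
        return len(self.storage.get(name).decode("utf-8").split())


store = InMemoryStorage()
PlainEditor(store).save("note", "data outlives the application")
words = WordCounter(store).count("note")
```

Because neither application owns the data, the user can replace the editor with a competing one without any migration step: the new application is simply pointed at the same store.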
p2p vs the cloud
For all these applications, cloud computing does not offer a sufficient level of control over the data. Commercial clouds, such as Amazon S3, offer inexpensive, medium- to large-scale data storage that can be easily adapted to the client organization's needs. However, economies of scale make the Internet prone to monopolies (or oligopolies at best); the cloud computing market would probably follow the same trend as, e.g., content delivery networks (dominated by Akamai) or Internet advertising (dominated by Google AdWords). With cloud storage, although the data are "liberated" from the application providers, they are "imprisoned" again in the data center of the cloud provider.