The Black Star File System

A file system has two roles: it has to specify how files are written to a medium, and it has to define how a user can access them. Most file systems focus on the first role and adopt the standard directory tree approach for the second. Solving the challenges of medium access is of course necessary, but we should not neglect the user’s perspective. As a user, I mostly care about how conveniently I can organize my data and how quickly I can access relevant information. The hierarchical approach is rather restrictive in this regard: you can only organize files in a directory tree [1], and search tasks often require third-party tools like find or locate.

Tagging file systems propose an alternative file organization model. Instead of placing files in directories, they assign one or more (user-defined) tags to each file. This is more flexible than a hierarchical data model, because you can group any combination of files, and each file can be part of several groups. Semantic file systems push this idea one step further by trying to understand the data they’re dealing with. For example, files can be grouped by their data type (documents), file format (odt), author (yourself), topic (information management), etc. The benefit for the user is that they can browse their files by association rather than by location, similar to how we navigate the Web.

Clearly, the hierarchical approach is insufficient to organize this variety of information. Instead, we need a network in which files can be connected to each other, to their properties, or to auxiliary nodes (such as tags or collections) under a given relationship. We call this the file graph. With the Black Star File System (BSFS), you can store, manage, and query such a file graph.
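
To make the idea of a file graph more concrete, here is a minimal in-memory sketch in Python. It is not the BSFS API: the `FileGraph` class, the relationship names (`tag`, `format`), and the example URIs are all hypothetical and chosen only for illustration. Each edge connects a node to another node or to a property value under a named relationship.

```python
from collections import defaultdict


class FileGraph:
    """Toy file graph: a collection of (subject, relationship, object) edges.

    Hypothetical illustration only; this is not the BSFS API.
    """

    def __init__(self):
        # relationship -> subject -> set of objects
        self._edges = defaultdict(lambda: defaultdict(set))

    def link(self, subject, relationship, obj):
        """Connect a node to another node or to a property value."""
        self._edges[relationship][subject].add(obj)

    def objects(self, subject, relationship):
        """Everything `subject` points to under `relationship`."""
        return set(self._edges[relationship].get(subject, set()))

    def subjects(self, relationship, obj):
        """Every node that points to `obj` under `relationship`."""
        return {s for s, objs in self._edges[relationship].items() if obj in objs}


# Assumed example data: two files that share one tag.
graph = FileGraph()
graph.link("file:///home/me/thesis.odt", "tag", "information management")
graph.link("file:///home/me/thesis.odt", "format", "odt")
graph.link("file:///home/me/notes.txt", "tag", "information management")
graph.link("file:///home/me/notes.txt", "tag", "todo")

# One file can belong to several groups, and one group can hold any files.
print(sorted(graph.subjects("tag", "information management")))
print(sorted(graph.objects("file:///home/me/notes.txt", "tag")))
```

Unlike a directory tree, nothing here forces a file into a single place: `notes.txt` sits in two tag groups at once, and the same structure can hold properties such as the file format.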

The Black Star File System is designed with three query patterns in mind: navigation, search, and browsing.

The navigation pattern describes the case where the user knows exactly what they want and already has an address or ID of the target file. BSFS identifies each file with a unique URI; alternatively, you can quickly navigate to a file via its name or other file properties.

A search occurs when the user lacks a specific address or identifier for the target file but has relatively clear and narrow search criteria. With BSFS, you can search by file properties (name, size), content (keywords, features), or associations to other files and auxiliary nodes (tags, collections).

Browsing takes place when the user has only vague query criteria but wants to quickly scan and compare many files. In BSFS, you can browse along file associations and rank results by a variety of similarity metrics.
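
The difference between the three patterns can be sketched on a toy data set. The code below is plain Python and does not use the BSFS API; the `FILES` dictionary, the function names, and the choice of Jaccard similarity over tag sets are assumptions made for illustration (the text above only says that BSFS supports a variety of similarity metrics).

```python
# Hypothetical toy data: URI -> properties. Not the BSFS data model.
FILES = {
    "file:///docs/report.odt": {"name": "report.odt", "size": 48_000, "tags": {"work", "2023", "draft"}},
    "file:///docs/budget.ods": {"name": "budget.ods", "size": 12_000, "tags": {"work", "2023"}},
    "file:///pics/beach.jpg": {"name": "beach.jpg", "size": 2_400_000, "tags": {"holiday", "2023"}},
}


def navigate(uri):
    """Navigation: the address is known, so this is a direct lookup."""
    return FILES[uri]


def search(tag, max_size):
    """Search: narrow criteria on an association (tag) and a property (size)."""
    return sorted(
        uri for uri, props in FILES.items()
        if tag in props["tags"] and props["size"] <= max_size
    )


def jaccard(a, b):
    """Size of the intersection divided by size of the union (1.0 = identical sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def browse(uri):
    """Browsing: rank all other files by tag similarity to a starting file."""
    start = FILES[uri]["tags"]
    others = (u for u in FILES if u != uri)
    return sorted(others, key=lambda u: jaccard(start, FILES[u]["tags"]), reverse=True)


print(navigate("file:///docs/budget.ods")["name"])
print(search("work", max_size=20_000))
print(browse("file:///docs/report.odt"))
```

Navigation returns exactly one file, search returns the files that satisfy all criteria, and browsing returns every other file in ranked order so that the user can scan and compare them.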