Project Ideas

These are ideas that are not yet being actively developed but are, for the most part, of particular interest to the OrangeFS community. These may serve as a basis for graduate student research, summer intern projects and future efforts of the OrangeFS development team. We invite discussion on any of these projects.

Searchable Metadata

Currently, most users encode metadata about their files in a hierarchy of directories. This lets users organize data in a meaningful, albeit limited, way. Metadata that cannot be encoded in a directory path is either lost or hidden in extended attributes, which can only be found via a brute-force search. This limitation prevents more widespread use of extended attributes as a mechanism for storing and accessing metadata about files.

OrangeFS currently provides standard as well as extended attributes. As with other filesystems, those attributes can be looked up only when a path is provided, either through the standard getxattr and setxattr calls or through the PVFS2 library.
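To make the cost of the path-based approach concrete, the following is a minimal sketch of a brute-force search over extended attributes. It assumes a Linux host, where Python's standard library exposes os.getxattr; the function name find_by_xattr, the read_xattr parameter and the attribute name user.project are hypothetical and are not part of OrangeFS or the PVFS2 library.

```python
import os
from typing import Callable, Iterator, Optional


def find_by_xattr(
    root: str,
    name: str,
    value: bytes,
    read_xattr: Optional[Callable[[str, str], bytes]] = None,
) -> Iterator[str]:
    """Yield every file under `root` whose extended attribute `name` equals `value`.

    `read_xattr` is a hypothetical hook that lets a caller substitute another
    attribute reader; by default the Linux-only os.getxattr is used.
    """
    if read_xattr is None:
        read_xattr = os.getxattr  # available on Linux only
    # Every file must be visited, because attributes are reachable only by path.
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                if read_xattr(path, name) == value:
                    yield path
            except OSError:
                # Attribute missing, unsupported, or unreadable: skip the file.
                continue
```

A call such as list(find_by_xattr("/mnt/orangefs", "user.project", b"genomics")) touches every file under the mount point, so the work grows with the size of the namespace rather than with the number of matches.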

This project aims to provide an interface for searching attributes, both standard and extended, that is more efficient than a brute-force search and offers additional query capabilities. Related work in this area has identified two main problems: 1) the cost of maintaining an index of metadata, and 2) the security challenges of providing an interface that searches filesystem metadata without an explicit path and its associated permissions. We believe the distributed nature of OrangeFS's metadata makes it an excellent architecture for providing the desired metadata search functionality while keeping resource costs at a reasonable level.
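The two problems named above can be illustrated with a small in-memory sketch. It is not an OrangeFS design: the class AttributeIndex, its methods and the can_read callback are all hypothetical. The index must be updated on every attribute change (the maintenance cost), and each hit must be filtered by a permission check because the query carries no path (the security challenge).

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple


class AttributeIndex:
    """Toy inverted index from (attribute name, value) to file paths."""

    def __init__(self) -> None:
        self._by_pair: Dict[Tuple[str, bytes], Set[str]] = defaultdict(set)
        self._by_path: Dict[str, Dict[str, bytes]] = defaultdict(dict)

    def set_attr(self, path: str, name: str, value: bytes) -> None:
        # Maintenance cost: every attribute write also updates the index.
        old = self._by_path[path].get(name)
        if old is not None:
            self._by_pair[(name, old)].discard(path)
        self._by_path[path][name] = value
        self._by_pair[(name, value)].add(path)

    def remove_path(self, path: str) -> None:
        # Deleting a file must drop all of its index entries.
        for name, value in self._by_path.pop(path, {}).items():
            self._by_pair[(name, value)].discard(path)

    def search(
        self, name: str, value: bytes, can_read: Callable[[str], bool]
    ) -> List[str]:
        # Security: results are filtered per caller, since no path was supplied
        # whose permissions could be checked up front.
        hits = self._by_pair.get((name, value), set())
        return sorted(p for p in hits if can_read(p))
```

In this sketch a lookup costs time proportional to the number of matching files rather than the number of files in the namespace. In a distributed setting, each metadata server could hold such an index for the objects it owns, though how queries would be fanned out and merged is left open here.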

Small IO and Small File Performance

OrangeFS is developing optional, configurable, middleware-driven caching on the client side, including configurable semantics that offer a trade-off between performance and consistency management. There have also been design discussions about storing small files in the distributed metadata, so that the file data for a small file can be returned together with its metadata.
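The small-file idea can be sketched as a toy model. Everything here is an assumption made for illustration: the names MetadataRecord and ToyMetadataServer, the 4096-byte threshold and the round-trip counting do not describe the actual OrangeFS design.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

INLINE_LIMIT = 4096  # hypothetical threshold in bytes


@dataclass
class MetadataRecord:
    size: int
    inline_data: Optional[bytes] = None  # set only for small files
    data_handle: Optional[str] = None    # hypothetical key into a separate data store


class ToyMetadataServer:
    """Toy model: small files live inside their metadata record."""

    def __init__(self, inline_limit: int = INLINE_LIMIT) -> None:
        self.inline_limit = inline_limit
        self.records: Dict[str, MetadataRecord] = {}
        self.data_store: Dict[str, bytes] = {}  # stands in for the data servers

    def create(self, path: str, data: bytes) -> None:
        if len(data) <= self.inline_limit:
            # Small file: keep the bytes alongside the metadata.
            self.records[path] = MetadataRecord(size=len(data), inline_data=data)
        else:
            # Large file: store the bytes separately and keep only a handle.
            handle = f"obj-{len(self.data_store)}"
            self.data_store[handle] = data
            self.records[path] = MetadataRecord(size=len(data), data_handle=handle)

    def read(self, path: str) -> Tuple[bytes, int]:
        """Return (data, number of simulated round trips)."""
        record = self.records[path]  # round trip 1: metadata lookup
        if record.inline_data is not None:
            return record.inline_data, 1
        return self.data_store[record.data_handle], 2  # round trip 2: data fetch
```

In this model a read of a small file completes with the metadata lookup alone, which is the saving the design discussions are aiming for; the cost is larger metadata records.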