The Importance of User Filesystems
This article has very little to do with the following:
- Hey, user filesystems exist!
And everything to do with the following:
- What is the potential of giving all developers an easy way to create user filesystems?
- What are some of the things we would see should this happen?
This article discusses User Filesystems as one future of making life easier for software developers and users. I’ll start out by discussing the difference between traditional filesystems and user filesystems. I will then discuss a variety of real examples that can be implemented as user filesystems, some of which are crazy, some not. Finally I want to discuss the limitations preventing developers from doing this and how we can get around those limitations.
Traditional Filesystems
A traditional filesystem is a formatting specification that lets you record and retrieve bytes as logical files and folders, typically stored on hard drives, USB flash drives, or other addressable media. There are many different filesystems out there, such as NTFS, JFS, FAT32, WinFS?, and many more. Most of these filesystems were designed to work well with a disk that spins to seek and transfer data between memory and a non-volatile storage medium.
User Filesystems
User filesystems can also be described in terms of files and folders offset from some path derived from the file manager of your computer; historically, that is how we think of filesystems. User filesystems are not typically provided by the OS by default (though most OSes have some kind of pluggable system to get a user/kernel filesystem installed). Another difference is that you can’t describe them in terms of the storage medium on which the data is placed. In a user filesystem you can’t even tell where the files or folders exist (unless you wrote the filesystem).
The idea here is that user filesystems are based on software that logically defines all of the same operations a real filesystem has.
This software doesn’t actually have to read or write to storage. It could make it all up or base it on anything it wants. Why would we want that?
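As a minimal sketch of what "making it all up" could look like, here is a tiny Python class that answers the usual filesystem questions entirely from code. The names `MadeUpFilesystem`, `list_dir`, and `read_file` are hypothetical and not taken from any real driver kit; a real user filesystem would still need a driver to route OS calls to something like this.

```python
from datetime import datetime, timezone


class MadeUpFilesystem:
    """Hypothetical user filesystem whose files exist only as code."""

    def __init__(self):
        # Map each virtual path to a function that produces its bytes on demand.
        self._files = {
            "/hello.txt": lambda: b"Hello from nowhere\n",
            "/now.txt": lambda: datetime.now(timezone.utc).isoformat().encode() + b"\n",
        }

    def list_dir(self, path):
        """Return the names directly under `path` (only the root exists here)."""
        if path != "/":
            raise FileNotFoundError(path)
        return sorted(name.lstrip("/") for name in self._files)

    def read_file(self, path):
        """Produce the content of a virtual file at the moment it is read."""
        try:
            producer = self._files[path]
        except KeyError:
            raise FileNotFoundError(path) from None
        return producer()


fs = MadeUpFilesystem()
print(fs.list_dir("/"))
print(fs.read_file("/hello.txt"))
```

Nothing here touches a disk: `/now.txt` has different content every time it is read, which is exactly the kind of freedom a storage-backed filesystem does not have.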
Subst Drives (Not really a User Filesystem)
Ok fine, so the subst command isn’t a user filesystem, but it does teach us one reason why we might want user filesystems. A subst drive is basically just a new drive letter in Windows based on any folder from the existing real filesystem. When my computer boots up, a batch file creates two subst drives: U:\ and T:\. U:\ represents all of the examples I have from my C++ class and T:\ represents the examples from my Java class. Each semester I modify the batch file to point both drives to the new location of my lectures and examples. All I have to do is change the path in a line like the one below:
subst T: "...\My Documents\Teaching\2006 Spring\comp118\examples"
By referring to the root of T:\, I have access to everything in the directory above. This is nothing new to many developers, and I am sure others use subst even today. As I have already said, this is not a user filesystem, but it does illustrate one important fact: when I access my examples folder I only need to go to T:\ and not to that extremely long folder. A user filesystem in most operating systems benefits from this simplicity: a resource is reached through a short URL (i.e. a drive letter in Windows).
XDrive (A Real User Filesystem)
One very interesting application is called XDrive. XDrive is an implementation of a user filesystem in Windows and gets at the idea this article wants to embody. XDrive gives you a drive letter that logically represents an FTP (or SFTP) site.
To me this is very powerful: A web developer or content manager can simply open their content in any editor they want from their X:\ (XDrive defaults to X:\). XDrive takes local folder operations such as listing a directory, reading a file, and writing a file, and proxies them to an FTP site. For example, with XDrive, opening a file will result in an FTP GET and saving a file will result in a PUT command.
What we learn from this is that implementing a user filesystem puts the normal operations we think of (reading files, writing files, and listing directories) completely under the control of software. The software can map each of those operations to something logically quite different.
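The mapping can be sketched in a few lines of Python using the standard `ftplib` module. This is not how XDrive is implemented (its internals are not described here); the class name `FtpBackedFilesystem`, its method names, and the `connect` helper are assumptions made for illustration.

```python
import io
from ftplib import FTP


class FtpBackedFilesystem:
    """Hypothetical adapter: filesystem-style calls in, FTP commands out."""

    def __init__(self, ftp):
        # `ftp` is a connected, logged-in ftplib.FTP, or any object that
        # offers the same nlst/retrbinary/storbinary methods.
        self._ftp = ftp

    def list_dir(self, path):
        # A directory listing becomes an FTP NLST.
        return self._ftp.nlst(path)

    def read_file(self, path):
        # Opening a file becomes an FTP RETR (what FTP clients call "get").
        buffer = io.BytesIO()
        self._ftp.retrbinary("RETR " + path, buffer.write)
        return buffer.getvalue()

    def write_file(self, path, data):
        # Saving a file becomes an FTP STOR (what FTP clients call "put").
        self._ftp.storbinary("STOR " + path, io.BytesIO(data))


def connect(host, user, password):
    """Log in to an FTP server and wrap the session (arguments are placeholders)."""
    return FtpBackedFilesystem(FTP(host, user, password))
```

Passing the FTP session in, instead of creating it inside the class, keeps the mapping itself independent of any particular server and makes it easy to try out against a stand-in object.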
Logical Operations
An even more powerful idea results from thinking about how XDrive works: map a drive letter to anything that makes sense as files and folders, no matter the medium. XDrive stands out because it is 1) popular and 2) built around a carefully written kernel filesystem driver that talks to a user-space driver, which does the actual FTP implementation. This separation of tasks allowed its authors to create a stable piece of software that provides a very important virtual filesystem. But why stop there?
Email Browser User Filesystem
Think about your inbox as a real folder in drive M:\ (M: for mail). I can picture M:\inbox\ as a read-only folder that “somehow” maps to your Outlook Inbox (or whatever email client you like). Within this folder could be a variety of different views (perhaps even configurable while the user filesystem is running) such as:
- Sub folders representing certain weeks (like groups in Outlook)
- Individual files that represent emails as text files, HTML files, or simply .msg files (Outlook)
- A folder that holds even more special views of your email.
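One of these views, emails as text files grouped into weekly sub folders, can be sketched with Python's standard `email` package. The function name `build_inbox_view`, the folder naming scheme, and the file naming scheme are all invented for this example, and fetching the messages from a real mail client is left out.

```python
from email.message import EmailMessage
from email.utils import parsedate_to_datetime


def build_inbox_view(messages):
    """Return {week folder name: {file name: text}} for a list of messages."""
    view = {}
    for number, message in enumerate(messages, start=1):
        sent = parsedate_to_datetime(str(message["Date"]))
        year, week, _ = sent.isocalendar()
        folder = f"{year}-week-{week:02d}"
        # Keep file names simple: a running number plus a sanitized subject.
        subject = "".join(c if c.isalnum() else "_" for c in str(message["Subject"]))
        body = message.get_content()
        text = f"From: {message['From']}\nSubject: {message['Subject']}\n\n{body}"
        view.setdefault(folder, {})[f"{number:04d}_{subject}.txt"] = text
    return view


# Example message standing in for one item fetched from a mail client.
msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["Subject"] = "Lunch?"
msg["Date"] = "Tue, 30 May 2006 12:00:00 +0000"
msg.set_content("Noon at the usual place.")

for folder, files in build_inbox_view([msg]).items():
    print(folder, list(files))
```

A user filesystem would answer a directory listing of M:\inbox\ from the outer keys and a file read from the inner values; because the folder is read-only, no write path is needed.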
Web Site Editor User Filesystem
How about a filesystem that maps to a Wiki, a Wordpress blog, Blogger Blog, etc… which holds folders and files representing:
- XML files representing the content on the associated site.
- Image files
- Maybe even Word Document Files
- Or anything compatible with the System that represents your website.
Now don’t think of this in terms of an FTP drive like XDrive. Instead think of the act of loading and saving as going to and from a database on your website. Think of the act of saving a doc file on say S:\content\ as instantly being converted to HTML, loaded to a remote database, and automatically linked to from another page.
For website administrators perhaps users on the site could be in a folder called S:\users\ and be represented by XML files. The user filesystem knows that these files are actually database entries but presents them to you as an XML file.
Social Websites
On such sites a download could be made available, such as a JAR file, that connects to a standard user filesystem and creates a drive. The drive could have pictures of other users connected to you, their resumes, or anything else that is supposed to be available to you. Perhaps even a calendar file that Outlook (or insert calendar program here) always references. When the file is read by an application (such as Outlook), the user filesystem maps the read to a request for that user’s publicly specified events, or simply their free/not free information.
Perhaps a combinational user filesystem provided by “someone” could make a drive that has Flickr photos in one folder, a bookmarks.htm file from delicious in another, a folder that is SSL-connected to a user specified cookie jar for your browser, a folder or file set representing blog entries as text files, general purpose file storage, distributed file storage over many peers… the list can go on.
Can you imagine a global and secure peer-to-peer filesystem that gives you a key for every file you own on that system, and only lets you see the files you have a key for? Every computer in such a network would see only its own view, even though it is probably allocating a certain number of GB to securely (and redundantly) stored files that only one particular user (or a few) can see.
Such a peer-to-peer network could make files simply “show up” in a user filesystem when a key is shared with someone else. The contents of a file, if very large, would probably need to be copied to a local real drive or folder before it is usable.
Picture CSV files on your user filesystem that represent current real-time results of stock data, horse/dog races, and what have you. You could write simple programs that open these files and do some batch processing on them daily. The batch program would be very simple, since it just opens and reads a file.
A User Filesystem Connected to a Database Representing a Normal Filesystem
Imagine a Database schema that attempts to represent a filesystem:
Table: Folders {id, user_id, parent (default=root), name, timestamp fields…}
Table: Files {id, parent_folder_id, name, data (blob), timestamps…}
Obviously you could create much more complicated schemas, but the idea is that a filesystem can be logically mapped to a database. You could get even wilder by adding a history-stamping table that lets you roll back any changes made to the filesystem. No more spyware: simply roll back all changes to the system since 3pm (assuming you built in a snapshot-like system).
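The two tables above can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the column types, the use of NULL for the root folder, the uniqueness constraint, and the helper names `write_file` and `read_file` are assumptions added here, and the history table is left out.

```python
import sqlite3

SCHEMA = """
CREATE TABLE folders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    parent INTEGER REFERENCES folders(id),  -- NULL stands for the root
    name TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE files (
    id INTEGER PRIMARY KEY,
    parent_folder_id INTEGER NOT NULL REFERENCES folders(id),
    name TEXT NOT NULL,
    data BLOB NOT NULL,
    modified_at TEXT DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (parent_folder_id, name)
);
"""


def write_file(db, folder_id, name, data):
    """Saving a file becomes inserting (or replacing) one row."""
    db.execute(
        "INSERT OR REPLACE INTO files (parent_folder_id, name, data) "
        "VALUES (?, ?, ?)",
        (folder_id, name, data),
    )


def read_file(db, folder_id, name):
    """Opening a file becomes selecting one row's blob."""
    row = db.execute(
        "SELECT data FROM files WHERE parent_folder_id = ? AND name = ?",
        (folder_id, name),
    ).fetchone()
    if row is None:
        raise FileNotFoundError(name)
    return row[0]


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
root_id = db.execute(
    "INSERT INTO folders (user_id, parent, name) VALUES (1, NULL, '')"
).lastrowid
write_file(db, root_id, "notes.txt", b"first draft")
write_file(db, root_id, "notes.txt", b"second draft")
print(read_file(db, root_id, "notes.txt"))
```

Saving twice under the same name overwrites the earlier row, which matches what a normal filesystem does; a rollback feature would instead keep the old row in a history table.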
Source Control Systems User Filesystem
Imagine how easy it would be if saving a file on drive G:\ (backed by CVS, Subversion, or whatever source control system you use) popped up a dialog, when the file is not checked out, asking you to check it out and then save.
Imagine all the files that aren’t yet in CVS being weakly stored until the user commits them manually. Weakly stored files could be mapped to a local database (per machine) rather than the actual CVS repository. This could be good for temporary build files that typically aren’t added to CVS.
Quick Aspect User Filesystems
Imagine a drive letter where opening and reading a particular JPG, PNG, or BMP (insert favorite here) gives you the current screen snapshot, a particular window, or even a remote computer’s screen. Imagine writing multiple audio files to a folder to start a speech-to-text program, which creates a logical file in the same directory (for reading) that holds the text of the audio. The same works in reverse with text-to-speech output. All I would have to do is drag and drop files from one folder into the special drive/folder combination to run this program.
Special Purpose User Filesystems
Suppose you’re at a company that has a very old system that uses a language only capable of reading and writing files. Suppose you want that language to go online. You could create a user filesystem that maps the reads and writes to socket-like operations. The format of the file being written to the logical user filesystem could reserve a place for an IP address, port, etc… This could easily launch old software systems into the new realm of internet computing.
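Here is one way the write-to-socket mapping might look in Python. The file format (first line `host:port`, the rest sent verbatim) and the function names `parse_request` and `handle_write` are invented for this sketch; a real system would pick whatever format the legacy language can produce most easily.

```python
import socket


def parse_request(file_bytes):
    """Split a written file into (host, port, payload).

    Assumed format (invented for this sketch): the first line is
    "host:port" and everything after it is sent verbatim.
    """
    header, _, payload = file_bytes.partition(b"\n")
    host, _, port = header.decode("ascii").strip().rpartition(":")
    if not host or not port.isdigit():
        raise ValueError("first line must look like host:port")
    return host, int(port), payload


def handle_write(file_bytes, connect=socket.create_connection):
    """Called when the legacy program closes the file it wrote.

    Sends the payload to the named host and returns the reply bytes, which
    the filesystem could expose as a second, readable file.
    """
    host, port, payload = parse_request(file_bytes)
    with connect((host, port)) as conn:
        conn.sendall(payload)
        conn.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while chunk := conn.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks)
```

The old program never sees a socket: it writes one file and later reads another, and the user filesystem does the networking in between.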
So why hasn’t it happened yet?
I have looked endlessly through NTFS filesystem filter driver examples and get endlessly confused due to my lack of experience in kernel driver writing. I know that if the interface were simplified and unified across operating systems, I would be able to write many user filesystems with a variety of applications. I might not even have to: so many open source developers would jump to write such things as well.
So what do we need for these user filesystems to be developed by everyone?
There are many answers to this question, here is mine:
- Analyse the low-level functions required for each target operating system and create an interface representing common operations. Create a kernel driver for each target OS that connects to a pluggable completion driver written in user space.
- Tier the user-space driver and release cross-platform development kits providing a slew of different interesting aspects to where you would like to start writing your filesystem.
- The granularity of the different development kits lets the programmer decide the lowest level they wish to go to, with the dev kit filling in a default operation for any unimplemented functionality.
- Port the user space driver to a variety of managed languages. Build very well designed Java and .NET classes that represent entities such as FileSystem, File, Folder, Permission Handlers, Reader and Writer Handlers. Make sub-classes to these representing example filesystems. Thread safety should be discussed in the documentation and made automatic or manual.
- Allow developers to leverage the user-space driver in more interesting scenarios, like: kernel driver to user driver to secondary user driver. For example: a listening socket is opened by the user-space driver (the one connected directly to the kernel), which awaits a remote connection. The remote connection, if it follows the correct protocols (published by the writer of this driver), can act as a third-tier implementation of a user filesystem. When a connection is accepted (and authenticated), a drive could appear automatically based on user preferences.
- Give the programmer lots of sample code and ideas behind the sample code.
- Errors must be handled elegantly. Ensure the kernel driver is constantly checked for problems and connected to a variety of user-space connectors.
- Recommend that users use managed code environments to implement user space driver connectors.
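The "default operation for unimplemented functionality" idea from the list above can be sketched as a base class, shown in Python here. The class names `FileSystem` and `DictFileSystem` and the chosen defaults (empty folders, missing files, read-only writes) are assumptions for illustration, not an existing development kit.

```python
import errno


class FileSystem:
    """Hypothetical top-tier base class: every operation has a default."""

    def list_dir(self, path):
        return []  # default: every folder is empty

    def read_file(self, path):
        # default: no file exists
        raise FileNotFoundError(errno.ENOENT, "no such file", path)

    def write_file(self, path, data):
        # default: the filesystem is read-only
        raise PermissionError(errno.EROFS, "read-only filesystem", path)


class DictFileSystem(FileSystem):
    """Example filesystem that overrides only listing and reading."""

    def __init__(self, files):
        self._files = dict(files)

    def list_dir(self, path):
        # Lists files directly inside `path`; sub folders are not reported
        # in this simplified example.
        prefix = path.rstrip("/") + "/"
        return sorted(
            name[len(prefix):]
            for name in self._files
            if name.startswith(prefix) and "/" not in name[len(prefix):]
        )

    def read_file(self, path):
        if path not in self._files:
            return super().read_file(path)  # fall back to the default error
        return self._files[path]
```

A programmer who only cares about reading gets a working read-only filesystem from two overridden methods, and can drop to a lower tier later if writes or permissions need custom handling.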
So when will this happen?
Again, pluggable user filesystems have been an option in Linux for a long time. Yet even Linux could use a standardized cross-platform implementation. Here is the current direction:
http://sourceforge.net/search/?words=user+filesystem&type_of_search=soft
Nothing on the above link is unified or cross-platform. None of the projects provides an easy-to-use code base on which to build anything from a bare-bones implementation to a fully featured user filesystem. Hopefully some people reading this will get some ideas on how to implement this and bring on the next wave of software development.
One Last Problem: Security
This doesn’t change much from the situation we are already in. As in most situations in which we want more functionality, there are consequences in the security realm. The major drawback of a standard user-space connector is the people who intend to implement it in non-managed languages, or those who work for spyware companies. This falls into nearly the same danger category as how spyware gets on your system in the first place. The addition of kernel access makes things worse.
Alex James said,
May 30, 2006 @ 9:13 pm
http://www.base4.net/Blog.aspx?ID=25 (Is a post about your comments).