Friday, September 28, 2012

YANFS - Yet Another Network File System

In the last few blog posts we've been exploring different kinds of how you might want to utilize libcouchbase from your app. To me it has always been really hard to come up with good example apps, so I always end up creating pretty boring applications.

Those of you who attended CouchConf in San Francisco really missed out if you didn't watch Couchbase Labs demo sessions, where they showed off CBFS. I'm not going to go into any details about what CBFS is (hopefully Dustin Sallings will create a blog post about that), but you can think cbfs as a large object store analogous to S3 (but meant to run in your own environment).

CBFS is implemented in the go language (if you haven't looked at that yet, put that on your todo list!), and it's shipped with it's own client to perform operations (upload files, list files etc). This gave me a great idea for todays blog post! FUSE is a project that allows you to create a filesystem implementation in user space, so the topic for today is an example where we utilize libcouchbase in our implementation for our FUSE driver.

As usual I won't explain the entire code, but I'll try to comment on the important bits. You'll find the source code for the entire project at http://github.com/trondn/mount_cbfs

The readme in the project should tell you how to install fuse and build the stuff, so in this blog post I'm only going to focus on how it works.

Creating a full featured filesystem driver is a lot of work, so I figured it would be better to start with a minimal implementation with a lot of limitations we could fix up later on (or I would never get around building it ;))


  • It is fully single threaded. This should be pretty easy to fix up later on by using a pool of libcouchbase instances, no shared data etc..
  • Read Only. This limits the number of entry points we need to implement.
So let's start looking at the code and how it all fits together. In main we register the struct containing all of the function pointers to the functions our filesystem support:

static struct fuse_operations cbfs_oper = {
    .getattr = cbfs_getattr,
    .open = cbfs_open,
    .read = cbfs_read,
    .readdir = cbfs_readdir,
};

This means that whenever someone tries to open a file in our filesystem, cbfs_open is called etc. So let's walk through an example here: What is the flow when you're typing "ls -l" in your shell.

The first things that happens is that cbfs_getattr is called. Its responsibility is to populate a stats struct with information about the file. Let's skip the details for now, and just assume that we return that this is a directory. Now cbfs_readdir is called with the directory to receive all of the entries in the directory, before cbfs_gettattr is called for every file to get the details of the files.

So how does our cbfs_readdir work like (after all this blog is about libcouchbase, not fuse ;)). CBFS use a http interface so we can request the list of files in a directory through the following URL:

http://cbfsserver:8484/.cbfs/list/path

It returns a JSON document looking something like:

{"files":{"file1":{}},"dirs":{},"path":"/foo"}

All we need to do is to decode the JSON and populate the information to FUSE. So how do we do this through libcouchbase? Through the http interface:

static lcb_error_t uri_execute_get(const char *uri, struct SizedBuffer *sb) {
    lcb_http_cmd_t cmd = {
        .version = 1,
        .v.v1 = {
            .path = uri,
            .npath = strlen(uri),
            .body = NULL,
            .nbody = 0,
            .method = LCB_HTTP_METHOD_GET,
            .chunked = 0,
            .content_type = "application/x-www-form-urlencoded",
            .host = cfg->cbfs_host,
            .username = cfg->cbfs_username,
            .password = cfg->cbfs_password
        }
    };

    return lcb_make_http_request(instance, sb, LCB_HTTP_TYPE_RAW, &cmd, NULL);
}

Since I'm using the synchronous interface to libcouchbase that call will block until we've received the response from the server. In my response handler I just copy whatever data the server returned to me:

static void complete_http_callback(lcb_http_request_t req, lcb_t instance,
                                   const void *cookie, lcb_error_t error,
                                   const lcb_http_resp_t *resp)
{
    struct SizedBuffer *sb = (void*)cookie;

    if (error == LCB_SUCCESS) {
        /* Allocate one byte extra for a zero term */
        sb->data = malloc(resp->v.v0.nbytes + 1);
        sb->size = resp->v.v0.nbytes;
        memcpy(sb->data, resp->v.v0.bytes, resp->v.v0.nbytes);
        sb->data[resp->v.v0.nbytes] = '\0';
    }
}

Both cbfs_read and cbfs_getattr utilize the same function from libcouchbase, but they're only hitting different URL's to get their data.

Adding supports for write operations should be no worse than finding the correct URL and do a PUT etc.

Happy Hacking :)



No comments:

Post a Comment