Files: the Ultimate API

From Jonathan Gardner's Tech Wiki

Introduction

Someone once lamented that we write all these apps and none of them are truly interoperable. For instance, we can't use GIMP's brush in Inkscape or Blender.[1]

My proposal is simply this: Let's stick with the old Unix philosophy: Everything is a file!

Everything REALLY IS a File!

For starters, everything REALLY is a file! I say this because everything can be stored in memory somehow, or it doesn't exist at all. And what is a file? It's really a block of memory. QED.

What is a File?

A file is an object: an abstract concept. It provides, at best, the following APIs:

  • Serial read, start to finish, provided it is in read mode.
  • Serial write or append, start to finish, provided it is in write mode.
  • Random access, provided you can store the contents of the file someplace local. Thanks to the size of hard drives and memory, there really isn't much we have to treat serially anymore.
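The three access modes above can be sketched in a few lines of Python. This is only an illustration: the temporary file, its name, and its contents are made up for the example.

```python
import os
import tempfile


def demo_file_access():
    """Exercise serial write, serial read, and random access on one file."""
    # The file name and contents are invented for this example.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")

        # Serial write: bytes go out in order, start to finish.
        with open(path, "wb") as f:
            f.write(b"hello, ")
            f.write(b"world")

        # Serial read: bytes come back in the same order.
        with open(path, "rb") as f:
            whole = f.read()

        # Random access: jump to byte offset 7 and read 5 bytes from there.
        with open(path, "rb") as f:
            f.seek(7)
            part = f.read(5)

    return whole, part


whole, part = demo_file_access()
print(whole, part)
```

Seeking only works here because the file lives on local storage. A stream coming off a pipe or the network would have to be saved somewhere first, which is the caveat in the last bullet.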

Files are referred to by their file handles. The OS provides a mechanism to translate words (paths and filenames) into the actual file handle.
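As a rough sketch of that translation, Python's os module exposes the underlying calls fairly directly: os.open turns a path into a small integer descriptor, and every later operation uses the integer rather than the name. The helper name open_read_close and the file note.txt are hypothetical, chosen only for this example.

```python
import os
import tempfile


def open_read_close(path, size=1024):
    """Translate a path into a descriptor, read through it, then release it."""
    fd = os.open(path, os.O_RDONLY)  # the OS maps the name to an integer handle
    try:
        return fd, os.read(fd, size)  # from here on, only the handle is used
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")
    with open(path, "wb") as f:
        f.write(b"everything is a file")
    fd, data = open_read_close(path)

print(type(fd).__name__, data)
```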

Each file can be shared among processes, or even sent across the network. In fact, there are numerous protocols to not only share files across the network, but do so efficiently. HTTP, FTP, and even NFS do exactly this. There are also applications like Rsync and BitTorrent to assist.

Operating systems provide shared memory, sockets, and similar mechanisms to help present these wonderful concepts as files.
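To make that concrete, here is a minimal sketch of a socket being treated as a file. It assumes a connected pair of local sockets from socket.socketpair(), and wraps each end with makefile() so the code reads and writes exactly as it would with a file on disk. The function name send_over_socket is invented for the example.

```python
import socket


def send_over_socket(payload):
    """Write bytes into one end of a socket pair and read them from the other,
    using ordinary file objects on both ends."""
    left, right = socket.socketpair()
    try:
        # Wrap the sending end in a file object and write to it like any file.
        with left.makefile("wb") as writer:
            writer.write(payload)
        left.shutdown(socket.SHUT_WR)  # tells the reader that end-of-file is reached

        # Wrap the receiving end the same way and read until end-of-file.
        with right.makefile("rb") as reader:
            return reader.read()
    finally:
        left.close()
        right.close()


print(send_over_socket(b"ping\n"))
```

This sketch only suits small payloads, since nothing reads from the other end while the write is in progress.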

Why is this important?

When a program like GIMP interacts with its elements, it is really working with files. Apply a filter to an image, and it takes the original image "file" and modifies it. (Really, it has a bunch of memory it works with.)

If we can separate GIMP into a series of operations, and then standardize the images it works on as files, then we are on the right path. Now, each operation is an independent thing. The layers and images in your workspace are independent as well. You don't have to open an image in GIMP to use a GIMP filter anymore, provided that the filter is given the right kind of file to work with.
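Here is a sketch of what one such independent operation might look like. It is not a real GIMP filter: as a stand-in for a standardized image format it assumes the plain-text PGM format (magic number P2) with no comment lines, and the function name invert_pgm is made up. The point is only that the filter needs a readable file and a writable file, and nothing else.

```python
import io


def invert_pgm(src, dst):
    """Read a plain-text PGM (P2) image from src and write its negative to dst.

    Assumes the image contains no '#' comment lines.
    """
    tokens = src.read().split()
    if not tokens or tokens[0] != "P2":
        raise ValueError("expected a plain PGM (P2) image")
    width, height, maxval = (int(t) for t in tokens[1:4])
    pixels = [int(t) for t in tokens[4:4 + width * height]]

    dst.write(f"P2\n{width} {height}\n{maxval}\n")
    for row in range(height):
        line = pixels[row * width:(row + 1) * width]
        # Inverting a pixel means subtracting it from the maximum value.
        dst.write(" ".join(str(maxval - p) for p in line) + "\n")


# Any file-like object will do; in-memory files keep the example self-contained.
src = io.StringIO("P2\n2 2\n255\n0 64\n128 255\n")
dst = io.StringIO()
invert_pgm(src, dst)
print(dst.getvalue())
```

Because the function only sees file objects, the same code would work on a file on disk, a pipe, or a socket wrapped as a file.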

You should now see GIMP as something that fits in the Unix paradigm. Some files are data. Some are programs. (But programs are really data.) Others represent really weird and strange things. In the end, you take one file (the program), pipe in the contents of another file, and get two files out. This is simply STDIN, STDOUT, and STDERR.
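The one-file-in, two-files-out shape can be sketched with Python's subprocess module. The child program below is a made-up example (it upper-cases its input and reports a character count), and it is run with the same Python interpreter via sys.executable so the sketch does not depend on any particular shell tool being installed.

```python
import subprocess
import sys

# A tiny, hypothetical "program" file: data in on STDIN, result out on STDOUT,
# diagnostics out on STDERR.
CHILD_PROGRAM = """
import sys
data = sys.stdin.read()
sys.stdout.write(data.upper())
print("read", len(data), "characters", file=sys.stderr)
"""


def run_filter(text):
    """Pipe text into the child program and collect both of its output files."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_PROGRAM],
        input=text,            # becomes the child's STDIN
        capture_output=True,   # collect STDOUT and STDERR separately
        text=True,
        check=True,
    )
    return result.stdout, result.stderr


out, err = run_filter("hello")
print(out)
print(err)
```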

If GIMP were to standardize its native image format, then any program could work with this format. GIMP could then break all of its commands down to their most fundamental operations. All that's left is the GIMP UI, which is really how these components are bound together.

If our OS can manage the GIMP UI bits by providing a GIMP tray and a context menu with GIMP commands, then we can add our own commands in our own tray with our own context menus.

The Future of Computing

In a way, the shell's pipe and output-redirection operators aren't going away. They are going to come back with a vengeance as we simplify and expand the capability of our computers.

There will be, eventually, only a few standard data types that are globally acceptable. Images will be stored in some common format. (We won't have to compress images because bandwidth will be cheap.) Audio, video, text, etc., will all be in a common format.

There is only so much data people will ever want to collect, and only so many things they can do with it. Eventually, rather than having monolithic apps like GIMP determine which formats and which operations are acceptable, each operation will be its own independent thing, the same way "cat" and "tail" are different operations.
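As a closing illustration, here is a minimal sketch of two operations in the spirit of "cat" and "tail", written as independent Python functions that know nothing about each other and are combined only by passing one's output to the other. These are simplified stand-ins, not reimplementations of the real tools.

```python
import io
from collections import deque


def cat(*streams):
    """Yield the lines of each stream in turn, one stream after another."""
    for stream in streams:
        yield from stream


def tail(lines, count=10):
    """Return the last `count` lines of any iterable of lines."""
    return list(deque(lines, maxlen=count))


# In-memory files stand in for real ones to keep the example self-contained.
first = io.StringIO("a\nb\n")
second = io.StringIO("c\nd\n")
last_three = tail(cat(first, second), count=3)
print(last_three)
```

Neither function cares where its lines come from, so a new operation can be dropped into the chain without touching the others.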