A dive into Golang filesystem interfaces

One of the first things people come to love in Go is the power of the io.Reader and io.Writer interfaces, the handler abstractions in the http package, and the other well-designed abstractions that the standard library embraces.

In this article we explore the filesystem interfaces, another powerful abstraction that can make your programs less coupled and more testable. We will also look at the interface extension pattern, which is useful in other contexts as well.

Motivation

The main motivation for the filesystem interfaces is the go:embed feature released with Go 1.16 (spec/proposal).

We will dig into go:embed later, but here is a short introduction: it lets you embed file assets into the compiled executable. Go is already liked for its ability to produce a single executable. go:embed extends that further, making it possible, for example, to ship a Single Page Application inside the executable. This pattern is used by many services that implement their logic in Go but offer an embedded web interface for controlling it.

Walking the list of embedded files and reading them in multiple contexts, for example exposing them to a web server, requires some abstraction.

These abstractions are valuable beyond the embed use case: they make programs less reliant on low-level file access and more testable, and in the long term they will hopefully provide what third-party packages like afero provide today.

Interface overview

Apart from being read-only for now, the core interfaces should not surprise anyone:

type FS interface {
	Open(name string) (File, error)
}

type File interface {
	Stat() (FileInfo, error)
	Read([]byte) (int, error)
	Close() error
}

Those interfaces are complemented by an API exposed in the fs package:

func Glob(fsys FS, pattern string) (matches []string, err error)
func ReadFile(fsys FS, name string) ([]byte, error)
func ValidPath(name string) bool
func WalkDir(fsys FS, root string, fn WalkDirFunc) error
func ReadDir(fsys FS, name string) ([]DirEntry, error)
func Stat(fsys FS, name string) (FileInfo, error)
// ...

Accessing files through this API means you can port file-reading code to these interfaces and supply the filesystem returned by os.DirFS as the default implementation.
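As a minimal sketch of such a port, the example below assumes a hypothetical helper, countLines, that used to read from disk directly and now depends only on fs.FS. The helper name, the file name notes.txt and the temporary directory are illustrative assumptions, not part of the original code base.

```go
package main

import (
	"bytes"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// countLines is a hypothetical helper. It depends only on fs.FS, so the
// caller decides where the file actually comes from.
func countLines(fsys fs.FS, name string) (int, error) {
	data, err := fs.ReadFile(fsys, name)
	if err != nil {
		return 0, err
	}
	// Count newline characters; good enough for an illustration.
	return bytes.Count(data, []byte("\n")), nil
}

// demoDirFS writes a file into a temporary directory and reads it back
// through os.DirFS, the default on-disk implementation.
func demoDirFS() (int, error) {
	dir, err := os.MkdirTemp("", "fsdemo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "notes.txt")
	if err := os.WriteFile(path, []byte("one\ntwo\nthree\n"), 0o644); err != nil {
		return 0, err
	}
	return countLines(os.DirFS(dir), "notes.txt")
}

func main() {
	n, err := demoDirFS()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("lines:", n)
}
```

Note that names passed to an fs.FS are slash-separated and relative to the root of the filesystem, so the helper receives "notes.txt" rather than an absolute path.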

The go:embed filesystem

As mentioned, the go:embed functionality was a motivator for introducing these interfaces. This feature lets you embed resources and assets in an executable.

import "embed"

//go:embed hello.txt
var f embed.FS

The files and directories named in the directive are embedded into the resulting executable at compile time and made available at runtime through the embed.FS variable declared below the directive (which is not to be confused with a plain comment).

This variable satisfies the FS interface, which means you can use the full FS API to read the files at runtime.
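To illustrate what "the full FS API" means in practice, here is a small sketch of a function that lists every file in a filesystem using fs.WalkDir. The function listFiles and the file names are assumptions made for the example. Because go:embed needs the files to exist at compile time, the sketch uses an in-memory fstest.MapFS as a stand-in; an embed.FS variable such as f above could be passed to listFiles in exactly the same way.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// listFiles is a hypothetical helper that returns the paths of all
// non-directory entries in fsys, in the order fs.WalkDir visits them.
func listFiles(fsys fs.FS) ([]string, error) {
	var files []string
	err := fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			files = append(files, path)
		}
		return nil
	})
	return files, err
}

// demoList runs listFiles against an in-memory stand-in for embed.FS.
func demoList() string {
	fsys := fstest.MapFS{
		"hello.txt":     {Data: []byte("hello")},
		"static/app.js": {Data: []byte("// app")},
	}
	files, err := listFiles(fsys)
	if err != nil {
		return "error: " + err.Error()
	}
	return strings.Join(files, ",")
}

func main() {
	fmt.Println(demoList())
}
```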

Serving a filesystem via HTTP

The http.FileSystem interface makes it possible to serve an abstract filesystem via HTTP by converting it to a handler with the http.FileServer function. Because http.FileSystem predates fs.FS, the http package includes a helper, http.FS, that converts an fs.FS to the http.FileSystem type currently used in the http package.
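The conversion chain described above (fs.FS to http.FileSystem to http.Handler) can be sketched as follows. The helper newAssetHandler and the file hello.txt are illustrative assumptions. An in-memory fstest.MapFS and an httptest recorder stand in for embedded assets and a real server, so the example runs without touching the disk or the network.

```go
package main

import (
	"fmt"
	"io/fs"
	"net/http"
	"net/http/httptest"
	"testing/fstest"
)

// newAssetHandler is a hypothetical helper: it adapts any fs.FS to
// http.FileSystem with http.FS and wraps it in a file-serving handler.
func newAssetHandler(fsys fs.FS) http.Handler {
	return http.FileServer(http.FS(fsys))
}

// demoServe issues one in-process GET request against the handler and
// returns the status code and response body.
func demoServe() (int, string) {
	fsys := fstest.MapFS{
		"hello.txt": {Data: []byte("hello from fs.FS\n")},
	}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/hello.txt", nil)
	newAssetHandler(fsys).ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := demoServe()
	fmt.Printf("%d %q\n", code, body)
}
```

In a real service the handler would typically be registered with something like http.Handle("/", newAssetHandler(f)), where f is the embed.FS variable.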

Interface extensions

One detail that may be surprising at first is the interface extension pattern. Besides the core interfaces, the package defines interfaces with extra methods. For example, the ReadDirFile interface adds the ReadDir method to the File interface:

// A ReadDirFile is a File that implements the ReadDir method for directory reading.
type ReadDirFile interface {
	File
	ReadDir(n int) ([]DirEntry, error)
}

Another example:

type StatFS interface {
	FS
	Stat(name string) (FileInfo, error)
}

When reviewing these interfaces for the first time, one may wonder how they are used. It would be ugly to receive a generic interface and have to type-assert it to something more specific to access extra functionality.

These extensions are all exposed through the top-level API in the fs package:

func Glob(fsys FS, pattern string) (matches []string, err error)
func ReadFile(fsys FS, name string) ([]byte, error)
func ValidPath(name string) bool
func WalkDir(fsys FS, root string, fn WalkDirFunc) error
func ReadDir(fsys FS, name string) ([]DirEntry, error)
func Stat(fsys FS, name string) (FileInfo, error)
// ...

For example, the Stat function works with any FS. The implementation first checks whether the FS implements StatFS, in which case it can call the potentially more efficient Stat method. If not, it falls back to opening the file and calling Stat on the opened file:

func Stat(fsys FS, name string) (FileInfo, error) {
	if fsys, ok := fsys.(StatFS); ok {
		return fsys.Stat(name)
	}

	file, err := fsys.Open(name)
	if err != nil {
		return nil, err
	}
	defer file.Close()
	return file.Stat()
}

So there is indeed a type assertion, but it is hidden behind the API.

This pattern is already used in other places. The io.WriteString function checks whether the writer implements io.StringWriter before falling back to converting the string to a byte slice and calling Write.

Thanks to this pattern, an FS implementation, for example a remote filesystem, works with the provided APIs as is, and can still take advantage of optimizations by implementing the specific interface extensions.
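The sketch below shows the pattern from the implementer's side. remoteFS is a hypothetical filesystem, backed here by an in-memory fstest.MapFS purely for illustration, that pretends Open is expensive and therefore also implements Stat. The counters and the openOnly wrapper exist only to make visible which path fs.Stat takes.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// remoteFS is a hypothetical filesystem where opening a file is costly.
type remoteFS struct {
	backing fstest.MapFS // stand-in for a remote backend
	opens   int          // number of Open calls
	stats   int          // number of Stat calls
}

func (r *remoteFS) Open(name string) (fs.File, error) {
	r.opens++
	return r.backing.Open(name)
}

// Stat is the optional extension: it answers without calling r.Open.
func (r *remoteFS) Stat(name string) (fs.FileInfo, error) {
	r.stats++
	return r.backing.Stat(name)
}

// Compile-time check that remoteFS satisfies the extension interface.
var _ fs.StatFS = (*remoteFS)(nil)

// openOnly hides every method except Open, forcing the fallback path.
type openOnly struct{ inner fs.FS }

func (o openOnly) Open(name string) (fs.File, error) { return o.inner.Open(name) }

// demoStat calls fs.Stat twice and reports the counters after each call.
func demoStat() string {
	r := &remoteFS{backing: fstest.MapFS{"data.bin": {Data: []byte("12345")}}}

	if _, err := fs.Stat(r, "data.bin"); err != nil {
		return "error: " + err.Error()
	}
	direct := fmt.Sprintf("opens=%d stats=%d", r.opens, r.stats)

	if _, err := fs.Stat(openOnly{r}, "data.bin"); err != nil {
		return "error: " + err.Error()
	}
	fallback := fmt.Sprintf("opens=%d stats=%d", r.opens, r.stats)

	return direct + "; " + fallback
}

func main() {
	fmt.Println(demoStat())
}
```

The first call goes through the Stat method and never opens the file. The second call, made through the wrapper that only exposes Open, has to open the file to obtain the same information.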

Testing

The testing/fstest package provides:

  • A MapFS implementation, which you can use to mock a simple filesystem that returns given data when a given path is accessed. You can then pass this filesystem to functions that depend only on FS in order to test them.

For example:

	fsys := fstest.MapFS{
		"js/main.js": {
			Data: []byte("..."),
		},
		"css/style.css": {
			Data: []byte("..."),
		},
	}
  • A TestFS function that tests the correct behavior of a filesystem implementation. You would use this when writing custom FS implementations (e.g. a remote filesystem).
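Both helpers can be combined in a short sketch. The function totalSize is a hypothetical piece of code under test, and the file contents are made up for the example. In a real project the checks would live in a _test.go file and report through *testing.T; here they are wrapped in a plain function so the example runs on its own.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// totalSize is a hypothetical function under test: it sums the sizes
// of all non-directory entries in fsys.
func totalSize(fsys fs.FS) (int64, error) {
	var total int64
	err := fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		total += info.Size()
		return nil
	})
	return total, err
}

// demoTesting first validates the filesystem with fstest.TestFS, then
// runs the function under test against it.
func demoTesting() (int64, error) {
	fsys := fstest.MapFS{
		"js/main.js":    {Data: []byte("console.log(1)")}, // 14 bytes
		"css/style.css": {Data: []byte("body{}")},         // 6 bytes
	}
	// TestFS checks that fsys behaves like a well-formed fs.FS and
	// that the listed files are present.
	if err := fstest.TestFS(fsys, "js/main.js", "css/style.css"); err != nil {
		return 0, err
	}
	return totalSize(fsys)
}

func main() {
	total, err := demoTesting()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("total bytes:", total)
}
```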

FS implementations

The availability of the interfaces in the standard library means 3rd party projects can provide concrete implementations.

The gopherfs project provides several implementations:

  • additional interfaces to allow writeable filesystems and filesystem utility functions
  • cache based filesystems
  • a filesystem implementation based on Azure’s Blob storage
  • a filesystem that merges two filesystems in one
  • a filesystem on top of Redis
  • cascade filesystems: try from different cache levels, finally pull from a cloud storage

Conclusion

The interfaces let you decouple code from a particular way of accessing files, help with testability, and provide an extension point for future and third-party filesystem abstractions.

The go:embed filesystem leverages these interfaces to keep our services free of on-disk dependencies.

Finally, we analyzed the interface extension pattern, used here and in other places in the Go standard library, which provides a model for letting certain implementations take advantage of optimizations without exposing them in the core interface.

Written by

Duncan Mac-Vicar

February 25, 2022

Duncan is a development lead at refurbed.
