This post describes common approaches to organizing Go code into packages and what I feel is the ideal directory structure for Go repositories. For context, it may be useful to read https://blog.golang.org/package-names first.
A setup with package proliferation
Let’s take an example application called gatekeeper. It may have a public client that other programs use to connect to it, plus authentication and payment code that gatekeeper uses internally. One possible structure is the following.
/gatekeeper/main.go
/gatekeeper/client/client.go
/gatekeeper/auth/auth.go
/gatekeeper/payment/payment.go
The starting issue I have with this setup is how the inner objects are referenced. Here is an example of someone using your client.
import "gatekeeper/client"
// ... lots of lines later
func process() {
	x := client.NewClient()
	// ...
}
In this simple example, it’s easy to see while reading function process() that the client is a gatekeeper client, but this is less obvious if the code is large. There could be many clients. Which client is client.NewClient() when the reader is 500 lines down the page?
Auth and payment are internal packages to gatekeeper, so it’s possible that references to them in gatekeeper code could be less ambiguous. However, if they’re already packages it’s possible they may move out of gatekeeper, starting the same problem. Another problem is that if other packages follow the same pattern, gatekeeper may import another package called auth and confuse it with its own.
Additionally, while auth and payment are private for gatekeeper and probably contain code that users of gatekeeper shouldn’t bother sifting through, client is generally important for users of gatekeeper and something I want them to find quickly. By mixing and deprioritizing packages for consumers, I’ve made it more difficult for people to find the parts of gatekeeper I want them to use.
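The name collision described above already exists in the standard library: both crypto/rand and math/rand declare package rand, so a file that needs both has to alias at least one import. The following is a minimal sketch of that situation, not code from gatekeeper; the aliases crand and mrand and the helpers randomToken and pickIndex are names I made up for illustration.

```go
package main

import (
	crand "crypto/rand" // declared package name is "rand"
	"encoding/hex"
	"fmt"
	mrand "math/rand" // declared package name is also "rand"
)

// randomToken returns n cryptographically random bytes, hex-encoded.
// (Hypothetical helper, for illustration only.)
func randomToken(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := crand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// pickIndex returns a pseudo-random index in [0, n).
// (Hypothetical helper, for illustration only.)
func pickIndex(n int) int {
	return mrand.Intn(n)
}

func main() {
	token, err := randomToken(8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(token, pickIndex(10))
}
```

Without the aliases, a reader who sees rand.Read deep in the file has to scroll back to the import block to learn which rand is meant. A gatekeeper package named auth or client would put its importers in the same position.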
Setup with a single package
An alternative setup is to place the files in a single package.
/gatekeeper/main.go
/gatekeeper/client.go
/gatekeeper/auth.go
/gatekeeper/payment.go
In this setup, it becomes very clear which type of auth or client is referenced by importers of gatekeeper.
import "gatekeeper"
func process() {
	x := gatekeeper.NewClient()
	// ...
}
Here it is clear that variable x is a gatekeeper client. However, you may want to structure the code so that auth and payment are more private than the client.
Segregate library code
Client is generally a library you expect many other people to use, while auth or payment may be gatekeeper specific. To reflect that difference, an alternative structure would be the following.
cmd/gatekeeper/main.go
cmd/gatekeeper/payment.go
cmd/gatekeeper/auth.go
gkclient/client.go
In this example, I abbreviate gatekeeper to “gk” for the public client. Another goal of naming and package placement is to avoid stuttering. Above, when I wrote client.NewClient(), the word client appears twice and is redundant to the caller. Given that we now have gkclient broken out into its own package, users would create a client like the following.
import "gkclient"
func process() {
	c := gkclient.New()
	// ...
}
Notice that with a specific package name, the function name New() alone gives users of your client enough information to know intuitively what the function does.
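For the library side of that call, a minimal sketch of what gkclient/client.go might contain is below. The Client type, its Addr field, and the default address are assumptions of mine; the post only establishes that the constructor is called New.

```go
// In the layout above this file would be gkclient/client.go and would begin
// with "package gkclient". It is declared as package main here only so the
// sketch runs as a single file.
package main

import "fmt"

// Client is a hypothetical gatekeeper client. The Addr field is an
// assumption made for illustration.
type Client struct {
	Addr string
}

// New returns a Client with a placeholder default address.
// Callers outside the package would write gkclient.New(), so the word
// "client" appears exactly once at the call site.
func New() *Client {
	return &Client{Addr: "localhost:8080"}
}

func main() {
	c := New()
	fmt.Println(c.Addr)
}
```

Naming the constructor NewClient here would bring the stutter back as gkclient.NewClient().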
General advice
All Binaries in cmd directory
When exploring a Go codebase, it is often not obvious which directories contain executables and which contain library code. More so, in larger organizations it may not be obvious which repositories are libraries and which are executables. Placing all executables in the (usually root) directory /cmd/ makes it intuitive to code explorers where the primary executables are and creates a clear separation from client code, libraries, and your core service.
Note this advice is stricter than “Only binaries in cmd directory”.
Only external libraries in non-root /cmd directories
If you have a git repository for your application, it may not be obvious which parts of your code you want others to import and use and which parts are intended as packages for your service only. One solution is the following.
cmd/gatekeeper/auth.go
cmd/gatekeeper/payment/../
gkclient/gkclient.go
gkexplorer/..
In this setup, it is obvious that cmd/gatekeeper/payment isn’t intended for people to import or use, but gkclient and gkexplorer are open for business.
Alternatively, this can be enforced with /internal directories. I am personally a big fan of using internal directories to convey intent, but they have not gained traction in the broader open source Go community.
The primary purpose of this rule is to quickly convey intent with as little information as possible. Go repositories often intermix code that is intended for public use with code that is generally only useful for the core application, making it difficult for novice code explorers to get to the meat of what they want.
An exception to this rule is a repository that is intended as entirely private with no client code at all. In this case, a cmd directory structure just adds a root directory with no purpose.
Consider package names, not package sub-directories
The directory structure of a package is organizational and doesn’t change how people reference your functions inside code. In the first example, even though the client was in the package “gatekeeper/client”, it is used as client.NewClient(). This usage deep inside code makes it ambiguous which client is created. Think of directory structure as important only when the import statement is written and package name as important when the library is used. For example, putting auth in the cmd directory is a signal when the import statement is written and signals if the package is worthwhile to import in the first place.
Think of naming from the user’s perspective
In my experience, it’s best to name libraries and packages from the perspective of who is using them. For example, gkclient.New() naturally means to create a new gatekeeper client when used.
The package name is part of the function name
Don’t think of the name of a function as isolated from the name of the package. Users will reference your function prefixed with the package name, so when choosing a function’s name or deciding which package to place it in, consider both together.
Don’t stutter
Stuttering is generally allowed in struct names. For example, the package config can have a struct named Config. However, redundancy in function names is discouraged.
The following are examples of stuttering during naming:
- client.NewClient()
- gatekeeper.NewGatekeeperAuth()
- gatekeeperauth.NewAuth()
Large service exceptions
Sometimes a service is so large or complex that there is a need to break off something like Auth into its own package. Usually, this is the exception rather than the rule. However, in large codebases it may make sense to break off submodules and give them names like auth. Ideally, auth as a package would be so big or complex that references to auth would be canonical in the repository.
However, this exception does not include code that is explicitly intended as public or for others to use. In that case, I still feel that packages like “client” are better described as “gkclient” to make differentiation easier for the consumer.
Separate repository alternative for gkclient
An alternative for client code is to place it in a separate repository so that imports of it can take the gatekeeper name. In this setup, client users will type “gatekeeper.NewClient()”. When doing this, however, it’s important that gatekeeper the service and gatekeeper the client library are in different directories (or different repositories), even if they share the same directory name.
Avoid splitting struct member functions across files
Go allows you to split a struct’s member functions across files. An example would be like the following.
// auth.go
type Auth struct{}
func (a *Auth) Start() {
	// ...
}
// creation.go
func (a *Auth) Create() {
	// ...
}
This is very rarely done, and it makes it confusing for readers to discover all of Auth’s API. Generally, if a struct is complex it may belong in its own file. Breaking Auth up like this would only make sense if Auth were very large, and in that case you have another red flag that Auth is actually two or more structs used together.
An obvious exception to this is needing to split a struct into multiple files due to build tags (a different implementation on Windows vs Linux).
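As a rough sketch of that build-tag exception, one platform's file might look like the following. The file name auth_unix.go, the method CredentialsPath, and the path it returns are all hypothetical. A sibling file guarded by "//go:build windows" would define the same method with a Windows-specific body.

```go
//go:build !windows

// Hypothetical file name: auth_unix.go. Only one of this file and its
// Windows sibling is compiled into any given build.
// Declared as package main only so the sketch runs as a single file.
package main

import "fmt"

// Auth mirrors the struct from the example above. In a real package the
// type would live in a shared, untagged file such as auth.go; it is repeated
// here so the sketch compiles on its own.
type Auth struct{}

// CredentialsPath is a hypothetical platform-specific method. The Windows
// file would declare a method with the same signature and a different body.
func (a *Auth) CredentialsPath() string {
	return "/etc/gatekeeper/credentials"
}

func main() {
	a := &Auth{}
	fmt.Println(a.CredentialsPath())
}
```

Here the split across files is forced by the toolchain rather than chosen for organization, which is why it does not carry the same red flag.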