2022-05-08

How to structure/layer your Golang Project (or whatever language you are using)

I've been maintaining a lot of other people's projects, and I see that most people blindly follow a framework's structure without purpose. So I'm writing this to convince people how to build a good directory structure or good layering for an application, especially when creating a service.

It would be better to split your project into exactly 3 layers:
  1. presentation (handles only serialization/deserialization and transport)
  2. business-logic (handles pure business logic, DTO goes here)
  3. persistence (handles records and their persistence, DAO goes here)
It could be one package with a bunch of sub-packages, or split per domain, or per user role. For example:

# monolith example

bin/ # should be added to .gitignore
presenter/ # de/serialization, transport, formatter goes here
  grpc.go
  restapi.go
  cli.go
  ...
businesslogic/ # pure business logic and INPUT/OUTPUT=DTO struct
  role1|role2|etc.go
  domain1|domain2|etc.go
  ...
models/ # all DAO/data access model goes here
  users.go
  schedules.go
  posts.go
  ...
deploy/ 
  Dockerfile
  docker-compose.yml
  deploy.sh
pkg/ # all 3rd party helpers/lib/wrapper goes here
  mysql/
  redpanda/
  aerospike/
  minio/

# Vertical-slice example

bin/
domain1/
  presenter1.go
  business1.go
  models1.go
domain2/
  presenter2.go
  business2.go
  models2.go
...
deploy/
pkg/

Also, it's better to inject on a per-function basis instead of a whole struct/interface, something like this:

type LoginIn struct {
  Email string
  Password string
  // normally I embed CommonRequest object
}
type LoginOut struct {
  // normally I embed CommonResponse object with properties:
  SessionToken string
  Error string
  Success bool
}
type GuestDeps struct {
  GetUserByEmailPass func(string, string) (*User, error)
  // other injected dependencies, eg. S3 uploader, 3rd party libs
}
func (g *GuestDeps) Login(in *LoginIn) (out LoginOut) {
  // do validation
  // do retrieve from database
  // return proper object
  return out
}
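
To make the skeleton above concrete, here is a minimal runnable sketch of the same idea. The User fields, the NewSessionToken dependency, the validation rules, and the error messages are all assumptions made up for illustration; only the LoginIn/LoginOut/GuestDeps shapes come from the snippet above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// User is a hypothetical persistence-layer record.
type User struct {
	ID    int64
	Email string
}

type LoginIn struct {
	Email    string
	Password string
}

type LoginOut struct {
	SessionToken string
	Error        string
	Success      bool
}

// GuestDeps holds only the functions that the guest use cases need.
type GuestDeps struct {
	GetUserByEmailPass func(email, pass string) (*User, error)
	NewSessionToken    func(userID int64) string // hypothetical dependency
}

func (g *GuestDeps) Login(in *LoginIn) (out LoginOut) {
	// validation lives inside the business logic, not in the transport layer
	if !strings.Contains(in.Email, "@") || in.Password == "" {
		out.Error = "invalid email or password format"
		return out
	}
	// retrieve from the database through the injected function
	user, err := g.GetUserByEmailPass(in.Email, in.Password)
	if err != nil || user == nil {
		out.Error = "login failed"
		return out
	}
	out.SessionToken = g.NewSessionToken(user.ID)
	out.Success = true
	return out
}

func main() {
	// fakes standing in for the real database and token generator
	deps := GuestDeps{
		GetUserByEmailPass: func(email, pass string) (*User, error) {
			if email == "a@b.c" && pass == "secret" {
				return &User{ID: 1, Email: email}, nil
			}
			return nil, errors.New("not found")
		},
		NewSessionToken: func(userID int64) string {
			return fmt.Sprintf("token-%d", userID)
		},
	}
	fmt.Printf("%+v\n", deps.Login(&LoginIn{Email: "a@b.c", Password: "secret"}))
	fmt.Printf("%+v\n", deps.Login(&LoginIn{Email: "a@b.c", Password: "wrong"}))
}
```

Because Login only sees two plain functions, swapping the database or the token generator does not touch the business logic at all.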

So when you need to do testing, all you need is to create a fake, either with counterfeiter (if you inject an interface instead of a function) or manually, then check with autogold:

func TestLogin(t *testing.T) {
  t.Run(`fail_case`, func(t *testing.T) {
    in := &LoginIn{}
    deps := GuestDeps{
      // fake that always fails, so no real database is needed
      GetUserByEmailPass: func(string, string) (*User, error) {
        return nil, errors.New("failed to retrieve from db")
      },
    }
    out := deps.Login(in)
    want := autogold.Want("fail_case", nil)
    want.Equal(t, out) // ^ update with go test -update
  })
}

Then in main (the real server implementation), you can plug in the real dependencies, something like this:

rUser := userRepo.NewPgConn(conf.PgConnStr)
srv := httpRest.New( ... )
guest := GuestDeps{
  GetUserByEmailPass: rUser.GetUserByEmailPass,
  DoFileUpload: s3client.DoFileUpload,
  ...
}
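// Go methods cannot have type parameters, so Handle is written here as a package-level generic function
httpRest.Handle[LoginIn, LoginOut](srv, LoginPath, guest.Login)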
srv.Listen(conf.HostPort)

Why are we doing it like this? Because the frameworks I have used are usually either insanely overlayered or have no clear separation between controller and business logic (the controller still handles transport, serialization, etc). Validation happens only on the outermost layer, or sometimes half outside and half inside the business logic (which can make it vulnerable). So when writing unit tests, the programmer is tempted to test the whole http/grpc layer instead of the pure business logic.

This way we can also use another kind of serialization/transport layer without having to modify the business logic side. Imagine you use 1 controller to handle the business logic: how much hassle is it if you have to switch frameworks for some reason (framework no longer maintained, performance bottleneck on the framework side, the framework doesn't provide proper middleware for some fragile telemetry, need to add another serialization format or protocol, etc)? But with this kind of layering, if we want to add grpc or json-rpc or a command line, or switch frameworks, or anything else, it's easy: just add a layer with the proper serialization/transport, then call the original business logic.
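
As a rough sketch of that claim, the snippet below puts two presenters (REST over net/http and a command line) in front of one business-logic function. JSONHandler, LoginCLI, callREST, callCLI, and the placeholder token are hypothetical names invented for this example, not part of any framework.

```go
package main

import (
	"encoding/json"
	"flag"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

type LoginIn struct {
	Email    string
	Password string
}

type LoginOut struct {
	SessionToken string
	Error        string
	Success      bool
}

// Login stands in for the business logic; it knows nothing about HTTP or CLI.
func Login(in *LoginIn) (out LoginOut) {
	if in.Email == "" || in.Password == "" {
		out.Error = "email and password are required"
		return out
	}
	out.SessionToken = "token-for-" + in.Email // placeholder, not a real session
	out.Success = true
	return out
}

// JSONHandler is a REST presenter: decode JSON, call the logic, encode JSON.
func JSONHandler[I, O any](logic func(*I) O) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var in I
		if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
			http.Error(w, "malformed JSON body", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(logic(&in))
	}
}

// LoginCLI is a command-line presenter for the very same logic.
func LoginCLI(args []string, w io.Writer) error {
	fs := flag.NewFlagSet("login", flag.ContinueOnError)
	var in LoginIn
	fs.StringVar(&in.Email, "email", "", "user email")
	fs.StringVar(&in.Password, "password", "", "user password")
	if err := fs.Parse(args); err != nil {
		return err
	}
	_, err := fmt.Fprintf(w, "%+v\n", Login(&in))
	return err
}

// callREST drives the REST presenter in-process, without opening a port.
func callREST(body string) string {
	req := httptest.NewRequest(http.MethodPost, "/login", strings.NewReader(body))
	rec := httptest.NewRecorder()
	JSONHandler(Login)(rec, req)
	return rec.Body.String()
}

// callCLI drives the CLI presenter and captures what it would print.
func callCLI(args ...string) string {
	var sb strings.Builder
	if err := LoginCLI(args, &sb); err != nil {
		return "error: " + err.Error()
	}
	return sb.String()
}

func main() {
	fmt.Print(callREST(`{"Email":"a@b.c","Password":"x"}`))
	fmt.Print(callCLI("-email", "a@b.c", "-password", "x"))
}
```

Adding grpc or json-rpc would mean writing one more adapter of the same size, while Login stays untouched.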
 
 
mermaid link (numbers 2-4 are under our control)

Talk is cheap, show me the code example! gingorm1 or echogorm1 is the minimal example (you can always change the framework to fiber, the default net/http, or any other framework, and the ORM to sqlc, sqlx, or jet). But if you are working alone and don't want to inject the database functions (which goes against clean architecture, but is the most sensible way) and want to test directly against the database, you can check the fiber1 or sveltefiber (without database) examples. Note that those are just examples: I would not inject the database as a function (see fiber1 for example). I would depend on it directly and use dockertest for managed dependencies, and only use function injection for unmanaged dependencies (3rd party). A more complex example can be found here: street.

This is only from a code maintainer's point of view. These are the WORST practices that I have found when taking over other people's projects:
  • table-based entity microservices: every table or set of tables has its own microservice. This is overly granular, since some should be coupled instead (eg. user, group, roles, permission -- this should be one service instead of 2-3 services)
  • MVC microservices: one microservice for each layer of the MVC, and worse, each in a different repository, eg. API service for X (V), API service for Y (V), webhook for X (C), adapter for third party Z (C/M), service for persistence (M), etc. They should be separated by domain/capability instead of by MVC layer. Why? Because implementing one feature in a monolith normally means changing only 1 repository, but if the microservices are separated by MVC layer, we have to modify 2-4 microservices to implement 1 feature and start 2-4 services just to debug something, which doesn't make sense. It might make a bit of sense if you are using a monorepo, but without one it's more pain than benefit.
  • pub-sub channel-goroutine without persistence: this one is ok only if the request is discardable (all or nothing), but if it's very important (money, for example) you should always persist every state, and it would be better to have a worker that progresses every state into the next state so we don't have to fix things manually.
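
The "persist every state, let a worker progress it" advice in the last point can be sketched roughly like this. The Transfer type, the state names, and the in-memory Store (standing in for a real database table) are all assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

type State string

const (
	StatePending  State = "pending"
	StateCharged  State = "charged"
	StateNotified State = "notified"
	StateDone     State = "done"
)

// Transfer is a hypothetical money-related job; its state is always persisted.
type Transfer struct {
	ID    int
	State State
}

// Store is an in-memory stand-in for a database table.
type Store struct{ rows []Transfer }

// Save inserts the transfer or overwrites the row with the same ID.
func (s *Store) Save(t Transfer) {
	for i := range s.rows {
		if s.rows[i].ID == t.ID {
			s.rows[i] = t
			return
		}
	}
	s.rows = append(s.rows, t)
}

// Unfinished returns a copy of every row that has not reached StateDone.
func (s *Store) Unfinished() []Transfer {
	var out []Transfer
	for _, t := range s.rows {
		if t.State != StateDone {
			out = append(out, t)
		}
	}
	return out
}

// Step does the work for one state and reports the next state.
type Step func(t Transfer) (next State, err error)

var ErrTemporary = errors.New("temporary failure")

// AdvanceAll is one worker pass: it tries to move every unfinished row forward.
// A failed step leaves the row in its persisted state, so the next pass retries it.
func AdvanceAll(s *Store, steps map[State]Step) (advanced int) {
	for _, t := range s.Unfinished() {
		step, ok := steps[t.State]
		if !ok {
			continue
		}
		next, err := step(t)
		if err != nil {
			continue
		}
		t.State = next
		s.Save(t)
		advanced++
	}
	return advanced
}

func main() {
	store := &Store{}
	store.Save(Transfer{ID: 1, State: StatePending})
	notifyAttempts := 0
	steps := map[State]Step{
		StatePending: func(Transfer) (State, error) { return StateCharged, nil },
		StateCharged: func(Transfer) (State, error) {
			notifyAttempts++
			if notifyAttempts == 1 {
				return "", ErrTemporary // simulate a flaky 3rd party
			}
			return StateNotified, nil
		},
		StateNotified: func(Transfer) (State, error) { return StateDone, nil },
	}
	for pass := 1; len(store.Unfinished()) > 0; pass++ {
		n := AdvanceAll(store, steps)
		fmt.Printf("pass %d: advanced=%d rows=%v\n", pass, n, store.rows)
	}
}
```

If the process crashes between passes, nothing is lost: the next run simply picks up from whatever state was last saved.
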
There are also some GOOD things I found:
  • Start with auth, metrics, logs (show the request id to the user, with a useful response), and proper dependency injection. Don't leave these until it's too late and you have to fix everything later
  • Standard go test instead of a custom test framework, because go test has proper tooling in the IDEs
  • Use workers for slow things, which makes sense: don't let the user wait, unless the product guy wants it all sync. Also send failures to slack or telegram or something
  • CRON pattern: runs code at specific times, which is good when you need a scheduled task (eg. billing, reminder, etc)
  • Query-job publish task: this pattern separates CRON from the time dependency (query from db, get the list of items to be processed, publish to MQ), so the publish task can be triggered independently (eg. if there's a bug in only 1 item) and regardless of time, and the workers will pick up any work that is late.
  • Separating proxy (transport/serialization) and processor (business-logic/persistence): this has a really good benefit in terms of scalability for small requests (not for uploading/downloading big files). We put up a generic reverse proxy, push the request to a two-way pub-sub, then return the response. For example, we create a rest proxy, grpc proxy, and json-rpc proxy; all 3 push to NATS, then the worker/processor processes the request and returns the proper response. This works like lambda: all the generic auth, logging, request id, and metrics can be handled by the proxy, so the programmer only needs to focus on building the worker/processor part (the business logic) instead of generic serialization/transport.
    the chart is something like this:

    generic proxy <--> NATS <--> worker/processor <--> databases/3rd party

    this way we can also scale independently with either a monolith or microservices. Service mesh? no! service bus/star is the way :3
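
The proxy/processor split can be sketched without a running broker. In the snippet below, Bus is an in-memory stand-in for a request-reply bus such as NATS (it is not the NATS client API), and Proxy, Worker, Sum, and the "math.sum" subject are hypothetical names chosen for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"
	"time"
)

var (
	ErrNoWorker = errors.New("no worker for subject")
	ErrTimeout  = errors.New("request timed out")
)

// Bus is an in-memory stand-in for a two-way pub-sub (request-reply) bus.
type Bus struct {
	mu       sync.RWMutex
	handlers map[string]func([]byte) []byte
}

func NewBus() *Bus { return &Bus{handlers: map[string]func([]byte) []byte{}} }

// Subscribe registers the worker that answers requests on a subject.
func (b *Bus) Subscribe(subject string, h func([]byte) []byte) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.handlers[subject] = h
}

// Request sends a payload and waits for the reply or for the timeout.
func (b *Bus) Request(subject string, payload []byte, timeout time.Duration) ([]byte, error) {
	b.mu.RLock()
	h, ok := b.handlers[subject]
	b.mu.RUnlock()
	if !ok {
		return nil, ErrNoWorker
	}
	reply := make(chan []byte, 1) // buffered so a late worker never blocks
	go func() { reply <- h(payload) }()
	select {
	case out := <-reply:
		return out, nil
	case <-time.After(timeout):
		return nil, ErrTimeout
	}
}

// Proxy is the generic side: it forwards raw bytes and owns cross-cutting
// concerns (here only the request ID attached to error responses).
func Proxy(bus *Bus, subject, requestID string, body []byte) string {
	out, err := bus.Request(subject, body, time.Second)
	if err != nil {
		msg, _ := json.Marshal(map[string]string{"error": err.Error(), "requestId": requestID})
		return string(msg)
	}
	return string(out)
}

// Worker adapts a pure business-logic function to the bus.
func Worker[I, O any](logic func(I) O) func([]byte) []byte {
	return func(payload []byte) []byte {
		var in I
		if err := json.Unmarshal(payload, &in); err != nil {
			out, _ := json.Marshal(map[string]string{"error": "malformed payload"})
			return out
		}
		out, _ := json.Marshal(logic(in))
		return out
	}
}

type SumIn struct{ A, B int }
type SumOut struct{ Total int }

// Sum is the only part a feature programmer would have to write.
func Sum(in SumIn) SumOut { return SumOut{Total: in.A + in.B} }

func main() {
	bus := NewBus()
	bus.Subscribe("math.sum", Worker(Sum))
	fmt.Println(Proxy(bus, "math.sum", "req-1", []byte(`{"A":2,"B":3}`)))
	fmt.Println(Proxy(bus, "unknown", "req-2", nil))
}
```

With a real bus, the proxy and the workers would run as separate processes and scale independently; the shape of Worker and Sum would stay the same.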