Talk: Introduction into Go Profiling: Tools

This is a written version of a talk I gave on profiling in Go — what tools exist, how to use them, and how to read the output.

What profiling is (and isn’t)

If you’re already doing observability (logs, metrics, traces), profiling fills in the gap: it tells you where your application spends CPU time, allocates memory, or blocks on locks. It’s the difference between knowing your service is slow and knowing why it’s slow.

Go supports several profile types:

- cpu: where the program spends CPU time
- heap: memory allocation sites, both live and cumulative
- goroutine: stack traces of all current goroutines
- block: where goroutines block on synchronization primitives
- mutex: holders of contended locks
- threadcreate: stack traces that led to new OS threads

Three ways to collect profiles

1. Benchmark tests

The testing package can write profiles during benchmarks:

func BenchmarkGenerateRandomString(b *testing.B) {
    for i := 0; i < b.N; i++ {
        GenerateRandomString(10)
    }
}
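
The benchmark exercises GenerateRandomString, which isn't shown in the talk; here's a minimal stand-in implementation (an assumption, not the original) so the example compiles and runs:

```go
package main

import (
	"fmt"
	"math/rand"
)

const letters = "abcdefghijklmnopqrstuvwxyz"

// GenerateRandomString returns a random lowercase string of length n.
// Hypothetical implementation standing in for the one from the talk.
func GenerateRandomString(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	fmt.Println(GenerateRandomString(10))
}
```
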
go test -bench=. -cpuprofile=cpu.out
go test -bench=. -memprofile=mem.out
go test -bench=. -blockprofile=block.out
go test -bench=. -mutexprofile=mutex.out

Then analyze:

go tool pprof cpu.out

2. runtime/pprof

For standalone programs where you want to control exactly when profiling starts and stops:

import (
    "log"
    "os"
    "runtime"
    "runtime/pprof"
)

func startCPUProfile() (*os.File, error) {
    f, err := os.Create("cpu.prof")
    if err != nil {
        return nil, err
    }
    if err := pprof.StartCPUProfile(f); err != nil {
        f.Close()
        return nil, err
    }
    return f, nil
}

func main() {
    f, err := startCPUProfile()
    if err != nil {
        log.Fatal(err)
    }
    // Defers run last in, first out: stop the profile before closing its file.
    defer f.Close()
    defer pprof.StopCPUProfile()

    // your application logic

    writeHeapProfile()
}

func writeHeapProfile() {
    f, err := os.Create("heap.prof")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    runtime.GC() // get up-to-date allocation statistics into the profile
    if err := pprof.WriteHeapProfile(f); err != nil {
        log.Fatal(err)
    }
}

3. net/http/pprof

For long-running services, expose profiles over HTTP:

import (
    "net/http"
    _ "net/http/pprof"
    "log"
)

func main() {
    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // your application logic
}

Then pull profiles on demand:

# browser
open http://localhost:6060/debug/pprof/

# CPU (30 second sample)
go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30

# heap
go tool pprof http://localhost:6060/debug/pprof/heap

# goroutines
go tool pprof http://localhost:6060/debug/pprof/goroutine

# blocking
go tool pprof http://localhost:6060/debug/pprof/block

# mutex contention
go tool pprof http://localhost:6060/debug/pprof/mutex

Reading the output

Interactive mode

go tool pprof cpu.prof

The commands I use most:

- top: the functions consuming the most resources
- top -cum: the same list, sorted by cumulative cost
- list <function>: source for one function, annotated line by line
- web: render the call graph in a browser (requires Graphviz)
- peek <function>: callers and callees of a single function

Visualization

# flame graph in the browser
go tool pprof -http=:8080 cpu.prof

# PNG call graph
go tool pprof -png cpu.prof > cpu.png

# diff two profiles
go tool pprof -base=old.prof new.prof

What top output means

      flat  flat%   sum%        cum   cum%
     1.5s 50.00% 50.00%      2.0s 66.67%  main.processData
     0.5s 16.67% 66.67%      0.5s 16.67%  runtime.mallocgc
     0.3s 10.00% 76.67%      0.8s 26.67%  main.parseInput
     0.2s  6.67% 83.33%      0.3s 10.00%  encoding/json.Unmarshal

flat = time spent in the function itself. cum = time in the function plus everything it calls. flat% and cum% express those as a share of the whole profile; sum% is the running total of flat% as you read down the list.
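
A hypothetical shape that produces numbers like the table above: processData does some work itself (its flat time) and also calls parseInput, whose time rolls up into processData's cum but not its flat.

```go
package main

import (
	"fmt"
	"strings"
)

// parseInput's time is attributed to parseInput as flat time, and to
// processData as cum time, because processData calls it.
func parseInput(s string) []string {
	return strings.Fields(s)
}

// processData's flat time covers only the loop below; its cum time
// additionally includes everything spent inside parseInput.
func processData(s string) int {
	words := parseInput(s)
	total := 0
	for _, w := range words {
		total += len(w)
	}
	return total
}

func main() {
	fmt.Println(processData("go tool pprof")) // 2+4+5 = 11
}
```
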

Heap profiles

A heap profile records allocation sites. By default pprof shows memory that is still in use; you can also view everything allocated since the program started:

curl http://localhost:6060/debug/pprof/heap > heap.prof

# current memory
go tool pprof -inuse_space heap.prof

# total allocations over time
go tool pprof -alloc_space heap.prof

Beyond the basics

Continuous profiling

For production, there are tools that collect profiles continuously with low overhead (<1%): Google Cloud Profiler, Datadog, Pyroscope, Parca.

eBPF profiling

eBPF lets you profile without code changes, but you lose Go-specific context — it can’t attribute samples to Go source as accurately. Useful for system-level stuff. Tools: perf, bpftrace, Parca in eBPF mode.

Profile-guided optimization

Go 1.20+ can use CPU profiles to guide compiler optimizations, such as more aggressive inlining of hot functions. Since Go 1.21, go build automatically picks up a default.pgo file in the main package's directory; you can also name the profile explicitly:

go build -pgo=default.pgo

Common scenarios

Memory leaks

go tool pprof http://localhost:6060/debug/pprof/heap
(pprof) top -cum
(pprof) list <suspicious_function>
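
What such a leak often looks like in code; a hypothetical sketch of an unbounded cache, whose allocation site an inuse_space heap profile will surface:

```go
package main

import "fmt"

// cache grows without bound: entries are stored forever and never
// evicted, so an in-use heap profile points at this allocation site.
var cache = map[int][]byte{}

func handle(id int) {
	cache[id] = make([]byte, 1024) // 1 KB retained per id, never freed
}

func main() {
	for i := 0; i < 1000; i++ {
		handle(i)
	}
	fmt.Println(len(cache)) // roughly 1 MB now held live
}
```
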

CPU hotspots

go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30
(pprof) top
(pprof) web

Goroutine leaks

curl http://localhost:6060/debug/pprof/goroutine?debug=1
go tool pprof http://localhost:6060/debug/pprof/goroutine

Lock contention

runtime.SetMutexProfileFraction(5) // in code; mutex profiling is off by default
go tool pprof http://localhost:6060/debug/pprof/mutex
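
A self-contained sketch that generates contention worth profiling; the function name contend is mine, everything else is standard library:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// contend hammers one mutex from several goroutines, producing the
// contention events that the mutex profile samples.
func contend() int {
	// Off by default: record roughly 1 in 5 contention events.
	runtime.SetMutexProfileFraction(5)

	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		counter int
	)
	for g := 0; g < 8; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				mu.Lock()
				counter++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(contend()) // 8 goroutines x 1000 increments = 8000
}
```
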

Things to watch out for

- CPU profiling adds overhead while it runs; don't leave runtime/pprof CPU profiling enabled permanently in production.
- Block and mutex profiles are empty by default: nothing is recorded until you call runtime.SetBlockProfileRate or runtime.SetMutexProfileFraction.
- The blank net/http/pprof import registers its handlers on http.DefaultServeMux; bind that server to localhost or an internal interface, never a public one.
- Heap profiles are sampled (one sample per ~512 KB allocated by default), so many small allocations can be underrepresented.

Resources

- The Go blog: "Profiling Go Programs"
- The official diagnostics guide at go.dev/doc/diagnostics
- github.com/google/pprof: documentation for the pprof tool itself
- github.com/DataDog/go-profiler-notes: in-depth notes on every Go profile type

#Go #Profiling #Observability