Talk: Introduction to Go Profiling: Tools
Profiling in Go provides a way to optimize applications by understanding their resource usage patterns. This guide covers the essentials, from how profiling relates to observability to the practical tools for gathering and analyzing profiling data.
Profiling Basics
Profiling helps you understand how an application uses resources like CPU and memory. It goes beyond basic observability by providing granular details on resource consumption, which is useful for optimization, cost reduction, and improving application performance.
Observability covers logs, metrics, and traces. Profiling complements these by offering detailed insights into resource utilization.
Go supports several types of profiling:
- CPU Profiling: Shows where the application spends execution time, highlights hotspots for optimization
- Memory (Heap) Profiling: Shows where and how memory is allocated, helps identify leaks and inefficient usage
- Goroutine Profiling: Shows all current goroutines and their stack traces
- Block Profiling: Shows where goroutines block waiting on synchronization primitives (off by default; see the sketch after this list)
- Mutex Profiling: Shows contention on mutexes (also off by default)
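Block and mutex profiles stay empty until the corresponding sampling rates are set. A minimal sketch of enabling both at startup (the rate values here are illustrative, not recommendations):

import "runtime"

func init() {
    // Record every blocking event (pass 0 to disable).
    runtime.SetBlockProfileRate(1)
    // Report, on average, 1 in 5 mutex contention events.
    runtime.SetMutexProfileFraction(5)
}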
Profiling Tools in Go
Go’s standard library includes built-in profiling tools that integrate with the development workflow.
Using the testing Package
The testing package supports profiling through benchmark tests:
// Example benchmark test for profiling
func BenchmarkGenerateRandomString(b *testing.B) {
    for i := 0; i < b.N; i++ {
        GenerateRandomString(10)
    }
}
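The benchmark assumes a GenerateRandomString helper. For a self-contained run, a hypothetical implementation might look like this (the letter set and approach are purely illustrative):

import "math/rand"

// GenerateRandomString returns a random string of n lowercase letters.
// Hypothetical helper matching the benchmark above.
func GenerateRandomString(n int) string {
    const letters = "abcdefghijklmnopqrstuvwxyz"
    b := make([]byte, n)
    for i := range b {
        b[i] = letters[rand.Intn(len(letters))]
    }
    return string(b)
}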
Run the benchmark and collect profiles:
# CPU profile
go test -bench=. -cpuprofile=cpu.out
# Memory profile
go test -bench=. -memprofile=mem.out
# Block profile
go test -bench=. -blockprofile=block.out
# Mutex profile
go test -bench=. -mutexprofile=mutex.out
Analyze the results:
go tool pprof cpu.out
Using runtime/pprof
For more control over profiling data collection in standalone applications, runtime/pprof provides a programmatic interface:
import (
    "log"
    "os"
    "runtime/pprof"
)

func startCPUProfile() (*os.File, error) {
    f, err := os.Create("cpu.prof")
    if err != nil {
        return nil, err
    }
    if err := pprof.StartCPUProfile(f); err != nil {
        f.Close()
        return nil, err
    }
    return f, nil
}

func main() {
    // Start CPU profiling
    f, err := startCPUProfile()
    if err != nil {
        log.Fatal(err)
    }
    // Defers run last-in, first-out: the profile is stopped and
    // flushed before the file is closed.
    defer f.Close()
    defer pprof.StopCPUProfile()

    // Your application logic here

    // For heap profiling at specific points
    writeHeapProfile()
}

func writeHeapProfile() {
    f, err := os.Create("heap.prof")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    if err := pprof.WriteHeapProfile(f); err != nil {
        log.Fatal(err)
    }
}
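A subtle point worth calling out: because deferred calls run last-in, first-out, f.Close must be deferred before pprof.StopCPUProfile. Reversing the two would close the file first and leave StopCPUProfile flushing samples into a closed file.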
Using net/http/pprof
The net/http/pprof package exposes profiling data via HTTP, which makes it particularly valuable for profiling live, long-running services:
import (
    "log"
    "net/http"
    _ "net/http/pprof"
)

func main() {
    // Start pprof server
    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // Your application logic here
}
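The blank import registers the handlers on http.DefaultServeMux. If your application uses its own mux, you can wire them up explicitly with the package's exported handlers; a minimal sketch (the function name is illustrative):

import (
    "net/http"
    "net/http/pprof"
)

func newDebugMux() *http.ServeMux {
    mux := http.NewServeMux()
    // pprof.Index also serves the named profiles (heap, goroutine, ...)
    // under /debug/pprof/.
    mux.HandleFunc("/debug/pprof/", pprof.Index)
    mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
    mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
    mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
    mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
    return mux
}

Serving this mux, e.g. via http.ListenAndServe("localhost:6060", newDebugMux()), exposes the same endpoints as the blank import.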
Access profiles via browser or command line:
# View in browser
open http://localhost:6060/debug/pprof/
# Analyze CPU profile (30 second sample)
go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=30"
# Analyze heap profile
go tool pprof http://localhost:6060/debug/pprof/heap
# Analyze goroutines
go tool pprof http://localhost:6060/debug/pprof/goroutine
# Analyze blocking operations
go tool pprof http://localhost:6060/debug/pprof/block
# Analyze mutex contention
go tool pprof http://localhost:6060/debug/pprof/mutex
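Note that the block and mutex endpoints return empty profiles unless the corresponding rates were set in code, as shown earlier with runtime.SetBlockProfileRate and runtime.SetMutexProfileFraction.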
Analyzing Profiling Data
Once you’ve collected profiling data, go tool pprof provides several ways to analyze it:
Interactive Mode
go tool pprof cpu.prof
Useful commands in interactive mode:
- top: Shows the functions consuming the most resources
- top -cum: Sorts functions by cumulative resource usage
- list <function>: Shows annotated source code for a specific function
- web: Opens a graph visualization in your browser (requires Graphviz)
- pdf: Generates a PDF call graph
- traces: Shows sample traces
Visualization Options
# Open the interactive web UI (includes a flame graph view)
go tool pprof -http=:8080 cpu.prof
# Generate call graph
go tool pprof -png cpu.prof > cpu.png
# Compare two profiles
go tool pprof -base=old.prof new.prof
Understanding the Output
When you run top, you’ll see output like:
Showing nodes accounting for 2.5s, 83.33% of 3s total
      flat  flat%   sum%        cum   cum%
      1.5s 50.00% 50.00%       2.0s 66.67%  main.processData
      0.5s 16.67% 66.67%       0.5s 16.67%  runtime.mallocgc
      0.3s 10.00% 76.67%       0.8s 26.67%  main.parseInput
      0.2s  6.67% 83.33%       0.3s 10.00%  encoding/json.Unmarshal
- flat: Time spent in the function itself
- cum: Cumulative time (the function plus its callees)
- sum%: Running total of flat% down the list
Memory Profile Specifics
Heap profiles are sampled at allocation sites; go tool pprof can present them either as currently live (in-use) memory or as total allocations:
# Get current heap snapshot
curl http://localhost:6060/debug/pprof/heap > heap.prof
# Analyze with inuse_space (current memory usage)
go tool pprof -inuse_space heap.prof
# Analyze with alloc_space (total allocations)
go tool pprof -alloc_space heap.prof
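When writing heap profiles programmatically, forcing a garbage collection first makes the in-use numbers reflect live objects rather than garbage awaiting collection. A minimal sketch (the helper name is illustrative):

import (
    "os"
    "runtime"
    "runtime/pprof"
)

func writeHeapSnapshot(path string) error {
    f, err := os.Create(path)
    if err != nil {
        return err
    }
    defer f.Close()
    // Force a GC so the snapshot reflects live memory only.
    runtime.GC()
    return pprof.WriteHeapProfile(f)
}

The HTTP endpoint offers the same behavior via /debug/pprof/heap?gc=1.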
Advanced Profiling Techniques
Continuous Profiling
For production environments, consider continuous profiling solutions:
- Google Cloud Profiler
- Datadog Continuous Profiler
- Pyroscope
- Parca
These tools collect profiles continuously with minimal overhead (typically <1%).
eBPF-based Profiling
eBPF allows profiling without code changes, but has limitations:
- No access to Go-specific runtime information
- Can’t attribute samples to Go source code as accurately
- Useful for system-level profiling
Tools: perf, bpftrace, Parca (eBPF mode)
Profile-Guided Optimization (PGO)
Go 1.20 introduced PGO for compiler optimizations; since Go 1.21 it is enabled by default via -pgo=auto, which picks up a default.pgo file in the main package directory:
# Collect CPU profile from production
# Use it to build optimized binary
go build -pgo=default.pgo
Common Profiling Scenarios
Finding Memory Leaks
# Take heap snapshots at different times (and compare them with -base)
go tool pprof http://localhost:6060/debug/pprof/heap
# In pprof interactive mode
(pprof) top -cum
(pprof) list <suspicious_function>
Analyzing CPU Hotspots
# Collect 30-second CPU profile
go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=30"
# Interactive analysis
(pprof) top
(pprof) web # visualize call graph
Debugging Goroutine Leaks
# Check goroutine count
curl http://localhost:6060/debug/pprof/goroutine?debug=1
# Analyze goroutine profile
go tool pprof http://localhost:6060/debug/pprof/goroutine
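For illustration, the classic pattern a goroutine profile surfaces is a goroutine blocked forever on a channel operation. A self-contained (and deliberately buggy) example:

package main

import (
    "fmt"
    "runtime"
    "time"
)

// leak starts a goroutine that blocks forever: nothing ever
// receives from the unbuffered channel ch.
func leak() {
    ch := make(chan int)
    go func() {
        ch <- 42 // blocks forever
    }()
}

func main() {
    for i := 0; i < 100; i++ {
        leak()
    }
    time.Sleep(100 * time.Millisecond)
    fmt.Println("goroutines:", runtime.NumGoroutine()) // grows with each leak
}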
Finding Lock Contention
First enable mutex profiling in the application code:

runtime.SetMutexProfileFraction(5)

Then collect the profile:

# Collect mutex profile
go tool pprof http://localhost:6060/debug/pprof/mutex
Best Practices
Before Profiling
- Establish a performance baseline to measure improvements
- Profile in production-like environments when possible
- Use realistic workloads and data
During Profiling
- Focus on areas where the application spends the most time or consumes the most resources
- Profile for long enough to get representative samples (30-60 seconds for CPU)
- Don’t optimize based on one-off measurements
After Profiling
- Optimization is iterative - make changes based on profiling data, then re-profile to measure impact
- Document your findings and changes
- Keep profiles from before and after for comparison
Production Profiling
- net/http/pprof is safe for production (profiling is on-demand)
- Collecting profiles has overhead, so don't profile continuously without proper tooling
- Consider using continuous profiling platforms for always-on profiling with minimal overhead
Common Pitfalls
- Profiling in debug mode: Always profile optimized builds (a plain go build, without optimization-disabling -gcflags such as all=-N -l)
- Too short samples: CPU profiles need sufficient time to be representative (30+ seconds)
- Ignoring inlining: The compiler inlines small functions, which affects profile attribution
- Over-optimizing: Focus on significant bottlenecks, not micro-optimizations
Tips
- Use -benchmem with benchmarks to see memory allocations: go test -bench=. -benchmem
- Compare profiles using -base: go tool pprof -base=old.prof new.prof
- Enable all profile types in development; enable them selectively in production
- Use GODEBUG=gctrace=1 to see GC stats
- Use runtime.ReadMemStats for programmatic memory monitoring (see the sketch below)
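A minimal sketch of that last tip, polling memory statistics at a fixed interval (field selection and interval are illustrative):

import (
    "log"
    "runtime"
    "time"
)

// logMemStats periodically logs a few key memory statistics.
// Note that ReadMemStats briefly stops the world.
func logMemStats(interval time.Duration) {
    var m runtime.MemStats
    for range time.Tick(interval) {
        runtime.ReadMemStats(&m)
        log.Printf("heap_alloc=%d MiB heap_objects=%d num_gc=%d",
            m.HeapAlloc/1024/1024, m.HeapObjects, m.NumGC)
    }
}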
Wrap-up
Profiling in Go provides deep insights into how an application uses system resources. By using Go’s built-in tools and understanding profiling data, you can significantly improve application efficiency and reliability.
The key is to profile regularly, focus on meaningful optimizations, and validate improvements with data rather than assumptions.
Additional resources:
- Go’s official pprof blog post
- The google/pprof GitHub repository
- Profiling Go Programs (official tutorial)
- Efficient Go by Bartłomiej Płotka
- Go Performance Workshop by Dave Cheney
Profiling takes practice, but with the right tools it becomes an essential part of performance optimization.