gRPC Architecture
Vanilla gRPC over HTTP/2 — backend-to-backend services. Pair with protobuf-architect for schema design. Go-specific implementation skeletons in RECIPES.md; pinned deps in STACK.md. Other languages follow the same protocol-level conventions with idiomatic substitutions.
1. Service definition
One service per file, named after the resource. Methods are verb-noun; request and response are always typed messages, never raw primitives or `google.protobuf.Empty` as input. Example in RECIPES.md.
- `<Verb><Noun>Request` / `<Verb><Noun>Response` naming for every method's I/O message. Even when the response is a single resource, prefer `CreateUserResponse { User user = 1; }` over returning `User` directly; this leaves room to add fields without bumping the major version.
- Pagination via cursor, mirroring rest-api-architect §5. `ListUsersRequest { string cursor = 1; int32 limit = 2; }` → `ListUsersResponse { repeated User users = 1; string next_cursor = 2; }`.
- `google.protobuf.Empty` only as a response type for "fire and forget" actions with no useful return. Never as input.
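The cursor contract above can be sketched in plain Go. This is a minimal, dependency-free illustration, not the RECIPES.md skeleton: the structs are hand-written stand-ins for generated code, and the base64 cursor encoding, `defaultLimit`, `maxLimit`, and `errBadCursor` are assumptions made for the example.

```go
package main

import (
	"encoding/base64"
	"errors"
	"sort"
)

// User, ListUsersRequest and ListUsersResponse are hand-written stand-ins
// for the structs protoc-gen-go would generate from the messages above.
type User struct{ ID string }

type ListUsersRequest struct {
	Cursor string
	Limit  int32
}

type ListUsersResponse struct {
	Users      []User
	NextCursor string
}

const (
	defaultLimit = 50  // assumed default when the client sends 0
	maxLimit     = 200 // assumed server-side cap
)

// errBadCursor is hypothetical; a real service would map it to INVALID_ARGUMENT.
var errBadCursor = errors.New("malformed cursor")

// encodeCursor hides the last-seen ID behind an opaque token so clients
// cannot depend on its internal shape.
func encodeCursor(lastID string) string {
	return base64.RawURLEncoding.EncodeToString([]byte(lastID))
}

// decodeCursor reverses encodeCursor; an empty cursor means "first page".
func decodeCursor(c string) (string, error) {
	if c == "" {
		return "", nil
	}
	b, err := base64.RawURLEncoding.DecodeString(c)
	if err != nil {
		return "", errBadCursor
	}
	return string(b), nil
}

// listUsers returns one page from users, which must be sorted by ID ascending.
func listUsers(users []User, req ListUsersRequest) (ListUsersResponse, error) {
	after, err := decodeCursor(req.Cursor)
	if err != nil {
		return ListUsersResponse{}, err
	}
	limit := int(req.Limit)
	if limit <= 0 {
		limit = defaultLimit
	}
	if limit > maxLimit {
		limit = maxLimit
	}
	// First index whose ID is strictly greater than the cursor position.
	start := 0
	if after != "" {
		start = sort.Search(len(users), func(i int) bool { return users[i].ID > after })
	}
	end := start + limit
	if end > len(users) {
		end = len(users)
	}
	resp := ListUsersResponse{Users: users[start:end]}
	if end < len(users) {
		// More rows remain: hand back a cursor pointing at the last row sent.
		resp.NextCursor = encodeCursor(users[end-1].ID)
	}
	return resp, nil
}
```

An empty `NextCursor` signals the last page, so clients loop until it comes back empty rather than counting rows.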
2. Error handling — status.Error with codes
Use the standard gRPC codes and return errors via status.Error(code, msg). Map domain errors to codes in one central place (skeleton in RECIPES.md).
Standard codes — the ones architects actually use
| Code | Use for |
|---|---|
| `OK` | success |
| `INVALID_ARGUMENT` | request fails schema or business validation |
| `FAILED_PRECONDITION` | request valid but system state forbids it (e.g. delete a non-empty resource) |
| `OUT_OF_RANGE` | numeric / range-specific violation distinct from `INVALID_ARGUMENT` |
| `UNAUTHENTICATED` | missing or invalid credentials |
| `PERMISSION_DENIED` | authenticated but not authorized |
| `NOT_FOUND` | resource doesn't exist |
| `ALREADY_EXISTS` | unique-constraint or idempotency-key collision (with a different body) |
| `ABORTED` | concurrency conflict (ETag-equivalent); client should re-fetch and retry |
| `RESOURCE_EXHAUSTED` | rate limit; per-tenant quota |
| `DEADLINE_EXCEEDED` | the request didn't complete in time; set by the runtime |
| `UNAVAILABLE` | transient (load-balancer drained, restart in progress); client retries |
| `INTERNAL` | unexpected server-side failure: bug or external dependency error |
| `UNIMPLEMENTED` | method exists in proto but server doesn't handle it (use during rollout) |
Don't reach for INTERNAL as a default. Map every known domain error to a specific code.
- `status.WithDetails` attaches structured detail (`google.rpc.ErrorInfo`, `google.rpc.BadRequest`) when clients need machine-readable error context, equivalent to REST's RFC 7807 (see rest-api-architect §7). Always include a correlation id.
- Never leak stack traces or DB errors to clients. Log server-side; return a generic `INTERNAL` with the correlation id.
3. Request / response shape
- Always typed messages. Don't define a method as `rpc Ping(StringValue) returns (StringValue)`; wrap in `PingRequest` / `PingResponse`.
- Validation at the boundary via `protovalidate` (see protobuf-architect §5). Enforced server-side via an interceptor (§4).
- No business logic in generated handler files. Generated handlers are thin shims that call into service-layer code (same discipline as REST routers per fastapi-architect / gin-architect).
4. Interceptors — mandatory chain
Interceptors are gRPC's middleware. Order matters — outermost first: recovery → request-id → log → auth → validation → metrics. Full chain in RECIPES.md.
- Recovery first: catches panics anywhere downstream and converts to `INTERNAL` with correlation id (never a stack trace).
- Auth before validation: no point validating an unauthenticated request's body. Per-method authorization (scopes/roles) happens inside the handler or via a small `WithAuthFunc` interceptor.
- Validation is centralized via `protovalidate`; don't hand-write validation in every handler. The interceptor calls `validator.Validate(req)` and returns `INVALID_ARGUMENT` with `google.rpc.BadRequest` details on failure.
- Same interceptor chain for streaming RPCs via `ChainStreamInterceptor`. Streaming validation requires handling per-message in client/bidi streams.
5. Streaming patterns
gRPC supports four call types. Pick the simplest one that meets the requirement.
| Pattern | Use for | Pitfalls |
|---|---|---|
| Unary | Default — request/response | None — start here |
| Server-stream | Server emits N responses to one request (event feeds, paginated downloads that don't fit one response, log tail) | Connection state outlives the request; resume tokens needed for restarts |
| Client-stream | Client uploads N messages, server returns one summary (large uploads, batch ingest) | Backpressure from server requires careful flow control |
| Bidi-stream | Genuinely interactive (chat, collaborative editing, control protocols) | Connection lifecycle complexity; reconnect / resume logic; deadlines |
- Don't reach for streaming "to save round-trips" — unary with proper pagination is usually fine and orders of magnitude simpler.
- Server-streams need resume tokens. Pass `start_after_id` or a cursor in the request so a disconnected client can resume from a known point.
- Bidi-streams need a clear protocol: define exactly which side sends what and when. Sketch the message flow in the `.proto` comments; future-you will thank you.
- Set generous deadlines on streams, but always set them. An unbounded stream is a leak.
6. Deadlines & context propagation
Every gRPC call should carry a deadline; gRPC does not set one by default. Clients set; servers respect; downstream calls inherit the remaining time. Client skeleton in RECIPES.md.
- Server respects: check `ctx.Err()` periodically in long-running handlers; abort work the moment the deadline fires.
- Propagate context to all downstream calls: DB queries, HTTP calls, other gRPC calls. Deadlines and cancellation flow automatically.
- Default deadlines per call type: unary 5–30s; server-stream often much longer (minutes / hours) but always bounded.
- Server-side deadline guard: wrap the entire handler in `context.WithTimeout` slightly less than the client deadline to leave headroom for response serialization.
7. Metadata vs message fields
| Use metadata for | Use message fields for |
|---|---|
| Auth tokens (`authorization: Bearer ...`) | Business data |
| Request IDs / correlation IDs (read by middleware) | Anything the handler reads as part of business logic |
| Tracing context (`traceparent`) | |
| Rate-limit hints (`x-tenant-id` for routing) | |
- Metadata is HTTP/2 headers under the hood. Don't send large payloads here.
- Keys are case-insensitive ASCII; values are strings (binary metadata uses the `-bin` suffix).
- Standardize one correlation-id header (e.g. `x-request-id`): interceptor reads it on entry, injects into context, logs against it.
8. Reflection
Reflection enables grpcurl and IDE plugins to introspect the service without the .proto file — on in dev, off in production. It leaks the entire service surface. Skeleton in RECIPES.md. The health service (grpc.health.v1) is always on — load balancers and orchestrators need it.
9. Testing — bufconn for in-process
The google.golang.org/grpc/test/bufconn package gives an in-memory listener — full server + client without a real socket. Faster than net.Pipe, simpler than s