- Add gRPC health check support to Consul registry
- Services are gRPC-only, not HTTP
- Consul was trying HTTP health checks which failed
- Now uses gRPC health checks via grpc.health.v1.Health service
- Update HealthCheckConfig to support both HTTP and gRPC
- Add GRPC field for gRPC service name
- Add UseGRPC flag to choose health check type
- Default to gRPC for services (use_grpc: true in config)
- Fix service address registration in Docker
- Services now register with Docker service name (e.g., auth-service)
- Allows Consul to reach services via Docker network DNS
- Falls back to localhost for local development
- Update default.yaml to enable gRPC health checks
- Set use_grpc: true
- Set grpc: grpc.health.v1.Health
This fixes services being deregistered from Consul due to failed
HTTP health checks. Services will now pass gRPC health checks.
- Add SetEnvKeyReplacer to convert underscores to dots
- Explicitly bind DATABASE_DSN, REGISTRY_CONSUL_ADDRESS, REGISTRY_TYPE
- Fixes database connection issues in Docker where services couldn't
read DATABASE_DSN environment variable
- Services in Docker can now connect to postgres:5432 instead of localhost
- Change from 'go run ./cmd/{service}/main.go' to 'go run ./cmd/{service}/*.go'
- go run with single file doesn't include other files in the package
- Service implementations are in separate _fx.go files
- Using wildcard includes all .go files in the package
- Update README.md and SUMMARY.md with correct commands
- Fixes 'undefined: provideXService' errors when running services
- Create Dockerfile for API Gateway
- Multi-stage build using golang:1.25-alpine
- Minimal runtime image using alpine:latest
- Exposes port 8080
- Add API Gateway service to docker-compose.yml
- Depends on Consul and all core services
- Environment variables for gateway configuration
- Port 8080 exposed
- Update SUMMARY.md
- Add API Gateway to service list
- Add API Gateway to Docker build instructions
- Update file structure to include API Gateway Dockerfile
- Change from fx.Provide to fx.Invoke for health registry registration
- CoreModule() already provides *health.Registry
- Services should register their database checkers with the existing registry
- Use fx.Invoke to register database health checkers instead of providing new registry
- Fixes duplicate provider error for *health.Registry
- All services now build and should start successfully
- Remove duplicate CoreModule() calls from all service main.go files
- NewContainer() already includes CoreModule() automatically
- This was causing duplicate ConfigProvider provider errors
- Update all _fx.go files to use *database.Client instead of *ent.Client
- database.Client embeds *ent.Client, so it can be used directly
- This fixes type mismatches between providers and consumers
- Keep ent import for constants like ent.Desc
- All services now build and should start successfully
- Add Core Services section highlighting Epic 2 completion
- Update directory structure to include all service entry points
- Add comprehensive Quick Start guide with:
- Prerequisites including NixOS support
- Installation steps with code generation
- Two deployment options (development vs full Docker)
- Service endpoints and ports
- Testing examples with grpcurl
- Update Architecture section with Core Services details
- Add Implementation Status section showing completed epics
- Update Configuration section with service-specific settings
- Add links to Epic 2 documentation
- Create Dockerfiles for all four services (auth, identity, authz, audit)
- Multi-stage builds using golang:1.25-alpine
- Minimal runtime images using alpine:latest
- Copy config files to runtime image
- Create docker-compose.dev.yml for development
- Only PostgreSQL and Consul
- Use when running services locally with 'go run'
- Update docker-compose.yml for full deployment
- All services + infrastructure
- Services build from Dockerfiles
- Health checks and dependencies configured
- Environment variables for service configuration
- Add .dockerignore to optimize build context
- Excludes docs, tests, IDE files, build artifacts
- Update SUMMARY.md
- Document both docker-compose files
- Add Docker deployment section
- Update file structure to include Dockerfiles
- Add Consul service to docker-compose.yml
- Running in dev mode on port 8500
- Health checks configured
- Persistent volume for data
- Web UI available at http://localhost:8500/ui
- Update SUMMARY.md
- Document Consul setup in docker-compose
- Add Consul verification steps
- Update prerequisites to include Docker Compose
- Add note about Consul Web UI
- Remove obsolete version field from docker-compose.yml
- Implement Audit Service (2.5)
- gRPC server with Record and Query operations
- Database persistence with audit schema
- Service registry integration
- Entry point: cmd/audit-service
- Implement Identity Service (2.2)
- User CRUD operations
- Password hashing with argon2id
- Email verification and password reset flows
- Entry point: cmd/identity-service
- Fix package naming conflicts in user_service.go
- Implement Auth Service (2.1)
- JWT token generation and validation
- Login, RefreshToken, ValidateToken, Logout RPCs
- Integration with Identity Service
- Entry point: cmd/auth-service
- Note: RefreshToken entity needs Ent generation
- Implement Authz Service (2.3, 2.4)
- Permission checking and authorization
- User roles and permissions retrieval
- RBAC-based authorization
- Entry point: cmd/authz-service
- Implement gRPC clients for all services
- Auth, Identity, Authz, and Audit clients
- Service discovery integration
- Full gRPC communication
- Add service configurations to config/default.yaml
- Create SUMMARY.md with implementation details and testing instructions
- Fix compilation errors in Identity Service (password package conflicts)
- All services build successfully and tests pass
- Fix errcheck: explicitly ignore tx.Rollback() error in defer
- When transaction commits successfully, Rollback() returns an error (expected)
- Use defer func() with explicit error assignment to satisfy linter
- Remove unused connectToService function
- Function is not currently used (proto files not yet generated)
- Commented out with TODO for future implementation
- Prevents unused function lint error
- Change version from number to string: version: "2"
- Remove deprecated exclude-use-default option
- Change exclude-rules to exclude (new format in v2.6)
- Remove deprecated output section (print-issued-lines, print-linter-name)
- Remove linters-settings (not allowed in v2.6 schema validation)
Fixes CI validation errors with golangci-lint v2.6.1:
- version type validation error
- exclude-use-default and exclude-rules not allowed
- output options not allowed
- linters-settings not allowed at root level
- Fix race condition in gateway tests by using TestMain to set Gin mode once
- Remove duplicate gin.SetMode(gin.TestMode) calls from individual tests
- Add TestMain function to initialize test environment before all tests
- Prevents race conditions when tests run in parallel with -race flag
- Update golangci-lint-action from v6 to v7
- v6 doesn't support golangci-lint v2.x versions
- v7 supports golangci-lint v2.x and automatically selects compatible version
- Change version from v2.6.0 to latest for automatic compatibility
All tests now pass with race detector enabled.
- Replace manual golangci-lint v2.1.6 installation (built with Go 1.24)
- Use official golangci/golangci-lint-action@v6 GitHub Action
- Set version to v2.6.0 which supports Go 1.25+
- Action automatically handles Go version compatibility
Fixes CI error: 'the Go language version (go1.24) used to build
golangci-lint is lower than the targeted Go version (1.25.3)'
- Add unit tests for gateway service (services/gateway/gateway_test.go)
- Test gateway creation, route setup, service discovery, and error handling
- Achieve 67.9% code coverage for gateway service
- Test all HTTP methods are properly handled
- Test route matching and 404 handling
- Add tests for API Gateway main entry point (cmd/api-gateway/main_test.go)
- Test DI container setup and structure
- Test service instance creation logic
- Test lifecycle hooks registration
- Add testify dependency for assertions (go.mod)
All tests pass successfully. Proxy forwarding tests are noted for integration
test suite as they require real HTTP connections (per ADR-0028 testing strategy).
- Create ADR-0034 documenting Go version upgrade to 1.25.3
- Mark ADR-0002 as superseded by ADR-0034
- Update CI workflow to use Go 1.25.3
- Update Makefile to build both platform and api-gateway binaries
- Update CI workflow to build and upload both binaries
- Update documentation (ADR README and mkdocs.yml) to include ADR-0034
Refactor core kernel and infrastructure to support true microservices
architecture where services are independently deployable.
Phase 1: Core Kernel Cleanup
- Remove database provider from CoreModule (services create their own)
- Update ProvideHealthRegistry to not depend on database
- Add schema support to database client (NewClientWithSchema)
- Update main entry point to remove database dependency
- Core kernel now provides only: config, logger, error bus, health, metrics, tracer, service registry
Phase 2: Service Registry Implementation
- Create ServiceRegistry interface (pkg/registry/registry.go)
- Implement Consul registry (internal/registry/consul/consul.go)
- Add Consul dependency (github.com/hashicorp/consul/api)
- Add registry configuration to config/default.yaml
- Add ProvideServiceRegistry() to DI container
Phase 3: Service Client Interfaces
- Create service client interfaces:
- pkg/services/auth.go - AuthServiceClient
- pkg/services/identity.go - IdentityServiceClient
- pkg/services/authz.go - AuthzServiceClient
- pkg/services/audit.go - AuditServiceClient
- Create ServiceClientFactory (internal/client/factory.go)
- Create stub gRPC client implementations (internal/client/grpc/)
- Add ProvideServiceClientFactory() to DI container
Phase 4: gRPC Service Definitions
- Create proto files for all core services:
- api/proto/auth.proto
- api/proto/identity.proto
- api/proto/authz.proto
- api/proto/audit.proto
- Add generate-proto target to Makefile
Phase 5: API Gateway Implementation
- Create API Gateway service entry point (cmd/api-gateway/main.go)
- Create Gateway implementation (services/gateway/gateway.go)
- Add gateway configuration to config/default.yaml
- Gateway registers with Consul and routes requests to backend services
All code compiles successfully. Core services (Auth, Identity, Authz, Audit)
will be implemented in Epic 2 using these foundations.
Transform all documentation from modular monolith to true microservices
architecture where core services are independently deployable.
Key Changes:
- Core Kernel: Infrastructure only (no business logic)
- Core Services: Auth, Identity, Authz, Audit as separate microservices
- Each service has own entry point (cmd/{service}/)
- Each service has own gRPC server and database schema
- Services register with Consul for service discovery
- API Gateway: Moved from Epic 8 to Epic 1 as core infrastructure
- Single entry point for all external traffic
- Handles routing, JWT validation, rate limiting, CORS
- Service Discovery: Consul as primary mechanism (ADR-0033)
- Database Pattern: Per-service connections with schema isolation
Documentation Updates:
- Updated all 9 architecture documents
- Updated 4 ADRs and created 2 new ADRs (API Gateway, Service Discovery)
- Rewrote Epic 1: Core Kernel & Infrastructure (infrastructure only)
- Rewrote Epic 2: Core Services (Auth, Identity, Authz, Audit as services)
- Updated Epic 3-8 stories for service architecture
- Updated plan.md, playbook.md, requirements.md, index.md
- Updated all epic READMEs and story files
New ADRs:
- ADR-0032: API Gateway Strategy
- ADR-0033: Service Discovery Implementation (Consul)
New Stories:
- Epic 1.7: Service Client Interfaces
- Epic 1.8: API Gateway Implementation
Mermaid graphs require node labels with special characters like forward
slashes to be quoted. Changed /healthz and /metrics from square bracket
format to quoted string format to fix the lexical error.
Mermaid sequence diagrams don't support YAML-style lists with dashes
in message content. Changed the multi-line permission list to a single
comma-separated line to fix the parse error.
The Gin framework uses a global mode setting (gin.SetMode()) which is not
thread-safe when tests run in parallel. Removing t.Parallel() from metrics
tests that use gin.SetMode() prevents data races when running tests with
the race detector enabled.
All tests now pass with 'make test' which includes -race flag.
- Use typed context key instead of string in errorbus test to avoid collisions
- Remove unused imports (health.HealthChecker, trace.TracerProvider) from test files
- Simplify interface verification checks (removed unnecessary type assertions)
All linting errors resolved. make lint now passes.
The mockLogger's errors slice was being accessed concurrently from
multiple goroutines (the error bus consumer and the test goroutine),
causing race conditions when running tests with the race detector.
Added sync.Mutex to protect the errors slice and proper locking when
accessing it in test assertions.
The Gin framework uses a global mode setting (gin.SetMode()) which is not
thread-safe when tests run in parallel. Removing t.Parallel() from all
server tests that use gin.SetMode() prevents data races when running
tests with the race detector enabled.
All tests now pass with 'make test' which includes -race flag.
Story 1.2: Database Layer
- Test database client creation, connection, ping, and close
- Test connection pooling configuration
- Tests skip if database is not available (short mode)
Story 1.3: Health Monitoring and Metrics
- Test health registry registration and checking
- Test database health checker
- Test liveness and readiness checks
- Test metrics creation, middleware, and handler
- Test Prometheus metrics endpoint
Story 1.4: Error Handling and Error Bus
- Test channel-based error bus creation
- Test error publishing with context
- Test nil error handling
- Test channel full scenario
- Test graceful shutdown
- Fix Close() method to handle multiple calls safely
Story 1.5: HTTP Server and Middleware
- Test server creation with all middleware
- Test request ID middleware
- Test logging middleware
- Test panic recovery middleware
- Test CORS middleware
- Test timeout middleware
- Test health and metrics endpoints
- Test server shutdown
Story 1.6: OpenTelemetry Tracing
- Test tracer initialization (enabled/disabled)
- Test development and production modes
- Test OTLP exporter configuration
- Test graceful shutdown
- Test no-op tracer provider
All tests follow Go testing best practices:
- Table-driven tests where appropriate
- Parallel execution
- Proper mocking of interfaces
- Skip tests requiring external dependencies in short mode
- Remove emoji numbers from section headers (1-13)
- Remove rocket emoji from final congratulations message
- All sections now use plain numbers instead of emoji numbers
- Fix error return value checks (errcheck)
- Fix unused parameters by using underscore prefix
- Add missing package comments to all packages
- Fix context key type issue in middleware (use typed contextKey)
- Replace deprecated trace.NewNoopTracerProvider with noop.NewTracerProvider
- Fix embedded field selector in database client
- Remove trailing whitespace
- Remove revive linter (as requested) to avoid stuttering warnings for public API interfaces
All linting and formatting checks now pass.
- Verified all acceptance criteria for Stories 1.1-1.6
- Updated Status fields from Pending to Completed
- Marked all acceptance criteria checkboxes as completed
- All stories in Epic 1 are now fully implemented and verified
- Add fx.Invoke in main.go to force database and HTTP server creation
- This ensures all providers execute and their lifecycle hooks are registered
- Clean up debug logging statements
- Database migrations and HTTP server now start correctly on application startup
Fixes issue where database migrations and HTTP server were not starting
because FX providers were not being executed (lazy evaluation).
- Added logging when HTTP server OnStart hook is called
- Added error logging for database migration failures
- This will help identify if hooks are being called and where failures occur
- Added better error detection for HTTP server startup
- Added connectivity check to verify server is actually listening
- Increased wait time to 500ms for better error detection
- Added warning log if server connectivity check fails (may still be starting)
- Improved logging messages for server startup
This should help diagnose why the HTTP server isn't starting and provide better visibility into the startup process.
Fixes:
- Added database connection logging with masked DSN
- Added migration progress logging
- Added HTTP server startup logging with address
- Fixed database provider to accept logger parameter
- Improved error visibility throughout initialization
Documentation:
- Moved Story 1.7 (Service Client Interfaces) to Epic 2 as Story 2.7
- Updated Epic 1 and Epic 2 READMEs
- Updated COMPLETE_TASK_LIST.md
- Updated story metadata (ID, Epic, Dependencies)
These changes will help diagnose startup issues and provide better visibility into what the application is doing.
Story 1.6: OpenTelemetry Distributed Tracing
- Implemented tracer initialization with stdout (dev) and OTLP (prod) exporters
- Added HTTP request instrumentation via Gin middleware
- Integrated trace ID correlation in structured logs
- Added tracing configuration to config files
- Registered tracer provider in DI container
Documentation and Setup:
- Created Docker Compose setup for PostgreSQL database
- Added comprehensive Epic 1 summary with verification instructions
- Added Epic 0 summary with verification instructions
- Linked summaries in documentation index and epic READMEs
- Included detailed database testing instructions
- Added Docker Compose commands and troubleshooting guide
All Epic 1 stories (1.1-1.6) are now complete. Story 1.7 depends on Epic 2.