Writing applications for the cloud feels different. It’s not just about moving a traditional Java program into a virtual machine. The environment itself is dynamic and sometimes unpredictable. Servers come and go. Network connections fail and heal. Your code must be ready for this. It needs to be portable, resilient, and easy for automation tools to manage. Over time, I’ve learned that certain ways of structuring code make this much easier. Here are some practical patterns that help your Java applications thrive in places like Kubernetes.
Let’s start with the most fundamental idea. Your application must be able to run anywhere without change. This means you cannot bake configuration details into your code. Hardcoded database URLs, API keys, or feature flags create a version of your software that’s tied to one place. The solution is to externalize everything that can change.
// The simplest way, perfect for containers, is to use environment variables.
String databaseConnectionString = System.getenv("DB_CONNECTION_STRING");
String portValue = System.getenv("SERVER_PORT"); // Guard against an unset variable before parsing.
int serverPort = (portValue != null) ? Integer.parseInt(portValue) : 8080; // 8080 is an illustrative default.
// For more structured configuration, frameworks can help.
// Here's a Spring Boot example that binds properties.
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;

@Configuration
@ConfigurationProperties(prefix = "payment")
public class PaymentServiceConfig {

    private String gatewayUrl;
    private double defaultCurrencyConversionRate;
    private int timeoutMillis;

    // Standard getters and setters are required for every field; one pair shown here.
    public String getGatewayUrl() { return gatewayUrl; }
    public void setGatewayUrl(String gatewayUrl) { this.gatewayUrl = gatewayUrl; }
}
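For reference, here is what the matching settings might look like in an application.yml. The values are purely illustrative; Spring's relaxed binding maps gateway-url to the gatewayUrl field, and so on.

# application.yml (illustrative values)
payment:
  gateway-url: https://gateway.example.com/charge
  default-currency-conversion-rate: 1.0
  timeout-millis: 5000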
In Kubernetes, you would store these settings in objects called ConfigMaps and Secrets. They can be attached to your running container as environment variables or as files in a volume. When you need to change a setting, you update the ConfigMap and Kubernetes makes the new values available without a new image build: values mounted as files are refreshed in running pods, while values injected as environment variables are picked up on the next pod restart. It separates the act of building the software from the act of configuring it for a specific environment.
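As a sketch, here is what that might look like; the names and values are illustrative. The ConfigMap holds the settings, and the container spec pulls them in as environment variables.

apiVersion: v1
kind: ConfigMap
metadata:
  name: payment-config
data:
  DB_CONNECTION_STRING: "jdbc:postgresql://db.internal:5432/payments"
  SERVER_PORT: "8443"

# And in the Deployment's container spec, inject every key as an environment variable:
envFrom:
- configMapRef:
    name: payment-config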
The cloud is not a static data center. An orchestrator like Kubernetes is constantly working to keep the system healthy and balanced. It might decide to move your application to a different physical machine to free up resources. When it does this, it sends a polite request to your process to shut down. If you ignore it, your process will be killed after a grace period. You must listen for this signal.
Your application needs a chance to finish what it’s doing. Maybe it’s in the middle of processing a customer’s order or sending a critical notification. A graceful shutdown pattern gives you that chance.
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.ConfigurableApplicationContext;
import javax.annotation.PreDestroy;
@SpringBootApplication
public class OrderServiceApplication {

    public static void main(String[] args) {
        // We manage the shutdown hook ourselves for more control.
        SpringApplication app = new SpringApplication(OrderServiceApplication.class);
        app.setRegisterShutdownHook(false);
        ConfigurableApplicationContext context = app.run(args);

        // This hook catches the SIGTERM signal from Kubernetes.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.out.println("Shutdown signal received.");
            // 1. Stop accepting new requests. Spring Boot's embedded server does this when the context closes.
            // 2. Finish processing current requests.
            context.close(); // This triggers @PreDestroy methods.
            // 3. Close any remaining resources: database pools, HTTP clients, thread pools.
            closeCustomResources();
            System.out.println("Shutdown complete.");
        }));
    }

    @PreDestroy
    public void onDestroy() {
        // Clean up application-specific resources here.
        System.out.println("Cleaning up before bean destruction.");
    }

    private static void closeCustomResources() {
        // Example: Close a shared HTTP client.
        // httpClient.close();
    }
}
You should design your shutdown logic to complete within a set time, say 30 seconds. You configure this in your Kubernetes deployment manifest as terminationGracePeriodSeconds. If your cleanup takes longer, the process will be forcibly terminated. The goal is to be a good citizen in the ecosystem.
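Here is a minimal sketch of where that setting lives in the manifest:

# Pod template inside the Deployment; 30 seconds is also the Kubernetes default.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 30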
Imagine you have five copies of your service running behind a load balancer. A user’s first request goes to Server A, and their next request goes to Server B. If you store that user’s session data in the memory of Server A, Server B will have no idea who they are. The user experience breaks. For cloud applications, servers must be stateless and interchangeable.
Any data that needs to persist between requests must live outside the application instance.
// Instead of using HttpSession, which is server-local, use a shared cache.
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestHeader;
import org.springframework.web.bind.annotation.RestController;
import java.time.Duration;

@Service
public class UserSessionService {

    private final RedisTemplate<String, UserProfile> redisTemplate;

    public UserSessionService(RedisTemplate<String, UserProfile> redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    public void saveUserProfile(String sessionToken, UserProfile profile) {
        // Store with an expiration time.
        redisTemplate.opsForValue().set("session:" + sessionToken, profile, Duration.ofHours(2));
    }

    public UserProfile getUserProfile(String sessionToken) {
        // Read back with the same key prefix used on write.
        return redisTemplate.opsForValue().get("session:" + sessionToken);
    }
}

// In your controller, the session token might come from a cookie or a JWT in the Authorization header.
@RestController
public class CartController {

    private final CartService cartService;

    public CartController(CartService cartService) {
        this.cartService = cartService;
    }

    @GetMapping("/cart")
    public Cart getCart(@RequestHeader("X-Session-Token") String sessionToken) {
        // Any instance can handle this request because the state is in Redis.
        return cartService.getCartForSession(sessionToken);
    }
}
This approach enables true horizontal scaling. You can add or remove instances seamlessly. It also makes software updates painless. Kubernetes can terminate an old pod and start a new one with updated code, and no user data is lost because it was never inside the pod.
How does the platform know if your application is healthy? You have to tell it. Kubernetes uses probes to check on your container. There are two main types: liveness and readiness. It’s important to understand the difference.
A liveness probe answers the question, “Is this process alive and not stuck?” If it fails, Kubernetes kills the container and starts a new one. A readiness probe answers, “Is this instance ready to receive traffic?” If it fails, Kubernetes stops sending new requests to that pod, but leaves it running.
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import javax.sql.DataSource;
import java.sql.Connection;

@RestController
public class HealthCheckController {

    private final DataSource dataSource; // Injected database connection pool

    public HealthCheckController(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @GetMapping("/health/live")
    public ResponseEntity<String> livenessCheck() {
        // A simple check. Is the application thread responsive?
        // Avoid heavy checks here. This endpoint is called frequently.
        return ResponseEntity.ok("Alive");
    }

    @GetMapping("/health/ready")
    public ResponseEntity<String> readinessCheck() {
        // Check all critical external dependencies.
        if (!isDatabaseConnected()) {
            return ResponseEntity.status(503).body("Database not available");
        }
        if (!isMessageBrokerReachable()) {
            return ResponseEntity.status(503).body("Message broker not available");
        }
        return ResponseEntity.ok("Ready for traffic");
    }

    private boolean isDatabaseConnected() {
        try (Connection conn = dataSource.getConnection()) {
            return conn.isValid(2); // 2-second timeout
        } catch (Exception e) {
            return false;
        }
    }

    private boolean isMessageBrokerReachable() {
        // Placeholder: ping your broker here (e.g., a lightweight admin call).
        return true;
    }
}
In your Kubernetes deployment YAML, you configure these probes.
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: my-app
        livenessProbe:
          httpGet:
            path: /health/live
            port: 8080
          initialDelaySeconds: 60 # Give the app time to start
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 5
This setup creates a robust system. If your database goes down, the readiness probes will fail, taking all pods out of the service rotation. This prevents users from seeing errors. When the database comes back, the probes will succeed, and traffic will flow again. If your application deadlocks, the liveness probe fails, and Kubernetes automatically restarts it.
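If you're on Spring Boot 2.3 or later, the Actuator can expose equivalent probe endpoints for you instead of the hand-rolled controller above. A sketch of the configuration:

# application.yml -- exposes /actuator/health/liveness and /actuator/health/readiness
management:
  endpoint:
    health:
      probes:
        enabled: true
  endpoints:
    web:
      exposure:
        include: health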
In a system of many services, your application will call others. What happens when that other service is slow or down? Without protection, your threads will wait, eventually exhausting your own resources. Your service becomes a victim of someone else’s problem. This is a cascading failure.
A circuit breaker is like an electrical fuse. When failures reach a threshold, it “opens” the circuit. Further calls fail immediately without even trying the remote operation, giving the struggling service time to recover. After a while, it lets a few test requests through to see if things are better.
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import io.vavr.control.Try;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;
import java.time.Duration;
import java.util.function.Supplier;

@Service
public class ProductServiceClient {

    private final CircuitBreaker circuitBreaker;
    private final RestTemplate restTemplate;

    public ProductServiceClient() {
        // Configure the circuit breaker.
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
            .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
            .slidingWindowSize(10) // Last 10 calls are analyzed
            .failureRateThreshold(60.0f) // Open if 60% of calls fail
            .waitDurationInOpenState(Duration.ofSeconds(45))
            .permittedNumberOfCallsInHalfOpenState(2)
            .build();
        CircuitBreakerRegistry registry = CircuitBreakerRegistry.of(config);
        this.circuitBreaker = registry.circuitBreaker("productService");
        this.restTemplate = new RestTemplate();
    }

    public Product getProduct(String id) {
        // Decorate the risky call with the circuit breaker.
        Supplier<Product> decoratedSupplier = CircuitBreaker
            .decorateSupplier(circuitBreaker, () -> fetchFromRemoteService(id));
        return Try.ofSupplier(decoratedSupplier)
            .recover(throwable -> {
                // Fallback: return cached data or a default/stub object.
                System.err.println("Call failed, using fallback. Reason: " + throwable.getMessage());
                return getCachedProduct(id);
            })
            .get();
    }

    private Product fetchFromRemoteService(String id) {
        // This is the call that might fail.
        String url = "http://product-service:8080/products/" + id;
        return restTemplate.getForObject(url, Product.class);
    }

    private Product getCachedProduct(String id) {
        // Placeholder: look up a previously cached product here.
        return new Product(id);
    }
}
When the circuit is open, the fetchFromRemoteService method is not even called. The recover method provides a fallback value, which could be a default, a cached version, or an empty result. This lets your application remain partially functional.
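It also pays to know when the breaker changes state. Resilience4j publishes events you can log; a minimal sketch, which you might add at the end of the constructor above:

// Log every state transition (CLOSED -> OPEN -> HALF_OPEN ...) for observability.
circuitBreaker.getEventPublisher().onStateTransition(event ->
    System.err.println("Circuit 'productService': " + event.getStateTransition()));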
In the old world, you might have a configuration file with the IP address of a database server. In the cloud, especially with containers, IP addresses are ephemeral. A service could have multiple instances, and they could be on any machine. Client-side service discovery solves this.
Your service doesn’t need to know the physical location of its dependencies. It just needs to know their logical names. It asks a registry, “Where can I find ‘user-service’ right now?” and gets a list of healthy endpoints.
import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Configuration
public class AppConfig {

    @Bean
    @LoadBalanced // This annotation integrates with a discovery client (Eureka, Consul, or Spring Cloud Kubernetes).
    public RestTemplate loadBalancedRestTemplate() {
        return new RestTemplate();
    }
}

@Service
public class UserInfoService {

    private final RestTemplate restTemplate;

    public UserInfoService(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    public User getUserDetails(String userId) {
        // Notice the URL uses the service name, not an IP or hostname.
        // The load balancer client resolves "user-service" to an actual address.
        String url = "http://user-service/api/v1/users/" + userId;
        return restTemplate.getForObject(url, User.class);
    }
}
In a Kubernetes cluster, you don’t even need a separate registry. Kubernetes has its own internal DNS. A service named user-service in the same namespace is reachable at the hostname user-service. If it’s in a different namespace, it’s user-service.other-namespace.svc.cluster.local. The load balancer client handles picking one of the available pods behind that service.
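You can watch this resolution happen from inside a pod. A quick sketch, assuming it runs in-cluster where a Service named user-service exists:

import java.net.InetAddress;

public class DnsProbe {
    public static void main(String[] args) throws Exception {
        // Resolves to the Service's stable ClusterIP; kube-proxy then
        // balances connections across the healthy pods behind it.
        InetAddress address = InetAddress.getByName("user-service");
        System.out.println("user-service resolves to: " + address.getHostAddress());
    }
}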
When you have ten microservices handling a single user request, traditional log files are useless. You need to be able to follow that request’s journey through the entire system. This requires two things: sending all logs to a central place and marking each log entry with a unique request identifier.
The standard practice is to write logs to the standard output stream. Your container runtime captures this. Then, an agent (like Fluentd) forwards them to a central store like Elasticsearch or Grafana Loki.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.UUID;

@Component
public class CorrelationIdFilter extends OncePerRequestFilter {

    private static final String CORRELATION_ID_HEADER = "X-Correlation-ID";
    private static final Logger log = LoggerFactory.getLogger(CorrelationIdFilter.class);

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response,
                                    FilterChain filterChain) throws ServletException, IOException {
        String correlationId = request.getHeader(CORRELATION_ID_HEADER);
        if (correlationId == null || correlationId.isEmpty()) {
            correlationId = UUID.randomUUID().toString();
        }
        // Store it in a thread-local context so any code in this request can access it.
        MDC.put("correlationId", correlationId);
        response.setHeader(CORRELATION_ID_HEADER, correlationId);
        log.info("Starting request for path: {}", request.getRequestURI());
        try {
            filterChain.doFilter(request, response);
        } finally {
            log.info("Completed request with status: {}", response.getStatus());
            MDC.clear(); // Clean up the thread-local context.
        }
    }
}
To make the logs machine-readable, configure your logging framework (like Logback) to output JSON.
<!-- logback-spring.xml -->
<configuration>
  <appender name="CONSOLE_JSON" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="net.logstash.logback.encoder.LogstashEncoder">
      <includeContext>false</includeContext>
      <includeMdcKeyName>correlationId</includeMdcKeyName>
      <fieldNames>
        <timestamp>timestamp</timestamp>
        <message>message</message>
        <level>level</level>
        <logger>logger</logger>
        <thread>thread</thread>
        <mdc>mdc</mdc>
      </fieldNames>
    </encoder>
  </appender>
  <root level="info">
    <appender-ref ref="CONSOLE_JSON" />
  </root>
</configuration>
Now each log entry is a JSON object with a correlationId field. In your logging dashboard, you can search for that ID and see every log line from every service related to that one request.
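The ID is only useful if it travels with the request to downstream services. One way to do that, sketched here as a RestTemplate interceptor, is to copy it from the MDC onto every outgoing call:

import org.slf4j.MDC;
import org.springframework.http.HttpRequest;
import org.springframework.http.client.ClientHttpRequestExecution;
import org.springframework.http.client.ClientHttpRequestInterceptor;
import org.springframework.http.client.ClientHttpResponse;
import java.io.IOException;

public class CorrelationIdPropagator implements ClientHttpRequestInterceptor {

    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body,
            ClientHttpRequestExecution execution) throws IOException {
        // Forward the current request's correlation ID to the next service.
        String correlationId = MDC.get("correlationId");
        if (correlationId != null) {
            request.getHeaders().add("X-Correlation-ID", correlationId);
        }
        return execution.execute(request, body);
    }
}

// Register it once: restTemplate.getInterceptors().add(new CorrelationIdPropagator());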
Your Docker image is the final package of your application. A large image wastes storage, takes longer to transfer, and starts slower. It also has a larger attack surface. Multi-stage Docker builds are the solution. You use one image with the full JDK and Maven to compile your code, then copy only the necessary artifacts into a second, much smaller image for runtime.
# First stage: The Builder
FROM maven:3.8.6-eclipse-temurin-17 AS builder
WORKDIR /workspace
# Copy the pom.xml first. This allows Docker to cache the dependency layer.
COPY pom.xml .
# Download dependencies. This layer is cached unless pom.xml changes.
RUN mvn dependency:go-offline -B
# Copy the source code and build the application.
COPY src ./src
RUN mvn clean package -DskipTests
# Second stage: The Runtime
FROM eclipse-temurin:17-jre-alpine
# Alpine Linux is very small and security-focused.
WORKDIR /app
# Copy the JAR file from the builder stage.
COPY --from=builder /workspace/target/my-application-*.jar app.jar
# Create a non-root user to run the application (security best practice).
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
This produces a slim image containing just the JRE, your JAR file, and the minimal Alpine Linux userland. You should also define resource requests and limits in your Kubernetes manifest so the scheduler knows how much CPU and memory your application needs.
resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "1Gi"
    cpu: "500m"
This prevents a single misbehaving application from consuming all the resources on a node.
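One Java-specific caveat: modern JVMs (10 and later) are container-aware, so let the heap size follow the container's memory limit instead of hardcoding -Xmx. A sketch of the adjusted Dockerfile entrypoint:

# Let the JVM size its heap from the container's memory limit.
ENTRYPOINT ["java", "-XX:MaxRAMPercentage=75.0", "-jar", "app.jar"]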
Secrets like database passwords, API tokens, and encryption keys are the keys to your kingdom. They must never appear in your source code, Docker image layers, or environment variable values that might be logged. Cloud platforms provide secure secret management.
In Kubernetes, you create a Secret object. Your application accesses it as a mounted file or an environment variable.
// Accessing a secret mounted as a file (common for certificates or large configs).
import java.nio.file.Files;
import java.nio.file.Paths;

public class ApiSecurity {

    public String getEncryptionKey() {
        try {
            // The secret is mounted at this path by Kubernetes.
            return Files.readString(Paths.get("/etc/app-secrets/encryption-key")).trim();
        } catch (Exception e) {
            throw new RuntimeException("Failed to read secret", e);
        }
    }
}

// Accessing a secret injected as an environment variable.
public class DatabaseConfig {

    public String getPassword() {
        // The value is set in the container's environment from the Secret.
        return System.getenv("DB_ADMIN_PASSWORD");
    }
}
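For context, here is a sketch of the Secret those two snippets rely on, with illustrative names, and how it reaches them:

apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
stringData:
  encryption-key: "<key material>"
  db-admin-password: "<password>"

# In the container spec: the key becomes the file /etc/app-secrets/encryption-key,
# and the password becomes the DB_ADMIN_PASSWORD environment variable.
volumeMounts:
- name: secrets
  mountPath: /etc/app-secrets
  readOnly: true
env:
- name: DB_ADMIN_PASSWORD
  valueFrom:
    secretKeyRef:
      name: app-secrets
      key: db-admin-password
volumes:
- name: secrets
  secret:
    secretName: app-secrets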
The Kubernetes Secret lives in the cluster's etcd database; note that it is only base64-encoded by default, so enable encryption at rest for etcd in production. Access to read the secret is controlled by RBAC (Role-Based Access Control). This is far more secure than a password in a properties file checked into Git.
Networks between cloud services are reliable, but not perfectly so. A call might fail because of a momentary glitch. For idempotent operations, it makes sense to try again. However, you must be intelligent about it. Retrying immediately in a tight loop can make things worse by overloading the struggling service.
The standard pattern is to use an exponential backoff with jitter. You wait longer between each retry, and add a little randomness to prevent all clients from retrying at the same instant.
import io.github.resilience4j.core.IntervalFunction;
import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryConfig;
import java.io.IOException;
import java.time.Duration;

public class ExternalApiCaller {

    private final Retry retry;

    public ExternalApiCaller() {
        RetryConfig config = RetryConfig.custom()
            .maxAttempts(4) // Initial call + 3 retries
            // Exponential backoff starting at 0.5s (0.5s, 1s, 2s...),
            // with +/- 20% random jitter applied to each wait.
            .intervalFunction(IntervalFunction.ofExponentialRandomBackoff(
                Duration.ofMillis(500), 2.0, 0.2))
            .retryOnException(e -> e instanceof IOException) // Only retry on network errors
            .failAfterMaxAttempts(true)
            .build();
        this.retry = Retry.of("externalApi", config);
    }

    public String fetchData(String param) {
        try {
            // executeCallable accepts a call that throws checked exceptions
            // and applies the retry policy around it.
            return retry.executeCallable(() -> callUnreliableApi(param));
        } catch (Exception e) {
            // All retries exhausted. Handle the permanent failure.
            return "Default fallback data";
        }
    }

    private String callUnreliableApi(String param) throws IOException {
        // Simulate a call that might fail with a network error.
        // In reality, this would be an HTTP client call.
        if (Math.random() < 0.3) { // 30% chance of failure for demo
            throw new IOException("Simulated network timeout");
        }
        return "Data for " + param;
    }
}
This pattern makes your application tolerant to temporary blips. It’s crucial for maintaining overall system stability. You should only apply it to operations that are safe to repeat, like a simple data fetch.
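One last composition note: Resilience4j decorators nest, so the retry and circuit breaker patterns combine naturally around the same call. A sketch, reusing the retry and circuitBreaker instances from the earlier examples and a hypothetical callRemoteService method:

// The breaker records every attempt; once it opens, remaining retries fail fast.
Supplier<String> guarded = Retry.decorateSupplier(retry,
    CircuitBreaker.decorateSupplier(circuitBreaker, () -> callRemoteService(param)));
String result = Try.ofSupplier(guarded).recover(t -> "fallback").get();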
These patterns form a toolkit. They are not theoretical concepts, but practical responses to the realities of running software on modern platforms. When you build with these ideas from the start, your Java application stops being a fragile piece of software that needs careful nursing. It becomes a resilient component that the automation system can manage, scale, and heal. The complexity of the cloud is still there, but your code is now prepared to handle it. You spend less time fighting the environment and more time delivering features. That’s the real goal.