AWS SDK for Java 2.x - Amazon Bedrock

Overview

Invokes foundation models through AWS SDK for Java 2.x. Configures clients, builds model-specific JSON payloads, handles streaming responses with error recovery, creates embeddings for RAG, integrates generative AI into Spring Boot applications, and implements exponential backoff for resilience.

When to Use

Invoke Claude, Llama, Titan, or Stable Diffusion for text/image generation
Configure BedrockClient and BedrockRuntimeClient instances
Build and parse model-specific payloads (Claude, Titan, Llama formats)
Stream real-time AI responses with async handlers and error recovery
Create embeddings for retrieval-augmented generation
Integrate generative AI into Spring Boot microservices
Handle throttling with exponential backoff retry logic

Quick Start

Dependencies

<!-- Bedrock (model management) -->
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>bedrock</artifactId>
</dependency>

<!-- Bedrock Runtime (model invocation) -->
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>bedrockruntime</artifactId>
</dependency>

<!-- For JSON processing -->
<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20231013</version>
</dependency>

Client Setup

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.bedrock.BedrockClient;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;

// Model management client
BedrockClient bedrockClient = BedrockClient.builder()
    .region(Region.US_EAST_1)
    .build();

// Model invocation client
BedrockRuntimeClient bedrockRuntimeClient = BedrockRuntimeClient.builder()
    .region(Region.US_EAST_1)
    .build();

Instructions

Follow these steps for production-ready Bedrock integration:

Configure AWS Credentials - Set up IAM roles with Bedrock permissions (avoid access keys)
Enable Model Access - Request access to specific foundation models in AWS Console
Initialize Clients - Create reusable BedrockClient and BedrockRuntimeClient instances
Validate Model Availability - Test with a simple invocation before production use
Build Payloads - Create model-specific JSON payloads with proper format
Handle Responses - Parse response structure and extract content
Implement Streaming - Use response stream handlers for real-time generation
Add Error Handling - Implement retry logic with exponential backoff

Validation Checkpoint: Always test with a simple prompt (e.g., "Hello") before production use to verify model access and response parsing.

Examples

Text Generation with Claude

public String generateWithClaude(BedrockRuntimeClient client, String prompt) {
    JSONObject payload = new JSONObject()
        .put("anthropic_version", "bedrock-2023-05-31")
        .put("max_tokens", 1000)
        .put("messages", new JSONObject[]{
            new JSONObject().put("role", "user").put("content", prompt)
        });

    InvokeModelResponse response = client.invokeModel(InvokeModelRequest.builder()
        .modelId("anthropic.claude-sonnet-4-5-20250929-v1:0")
        .body(SdkBytes.fromUtf8String(payload.toString()))
        .build());

    JSONObject responseBody = new JSONObject(response.body().asUtf8String());
    return responseBody.getJSONArray("content")
        .getJSONObject(0)
        .getString("text");
}

Model Discovery

import software.amazon.awssdk.services.bedrock.model.*;

public List<FoundationModelSummary> listFoundationModels(BedrockClient bedrockClient) {
    return bedrockClient.listFoundationModels().modelSummaries();
}

Multi-Model Invocation

public String invokeModel(BedrockRuntimeClient client, String modelId, String prompt) {
    JSONObject payload = createPayload(modelId, prompt);

    InvokeModelResponse response = client.invokeModel(request -> request
        .modelId(modelId)
        .body(SdkBytes.fromUtf8String(payload.toString())));

    return extractTextFromResponse(modelId, response.body().asUtf8String());
}

private JSONObject createPayload(String modelId, String prompt) {
    if (modelId.startsWith("anthropic.claude")) {
        return new JSONObject()
            .put("anthropic_version", "bedrock-2023-05-31")
            .put("max_tokens", 1000)
            .put("messages", new JSONObject[]{
                new JSONObject().put("role", "user").put("content", prompt)
            });
    } else if (modelId.startsWith("amazon.titan")) {
        return new JSONObject()
            .put("inputText", prompt)
            .put("textGenerationConfig", new JSONObject()
                .put("maxTokenCount", 512)
                .put("temperature", 0.7));
    } else if (modelId.startsWith("meta.llama")) {
        return new JSONObject()
            .put("prompt", "[INST] " + prompt + " [/INST]")
            .put("max_gen_len", 512)
            .put("temperature", 0.7);
    }
    throw new IllegalArgumentException("Unsupported model: " + modelId);
}

Streaming Response with Error Handling

public String streamResponseWithRetry(BedrockRuntimeClient client, String modelId, String prompt, int maxRetries) {
    int attempt = 0;
    while (attempt < maxRetries) {
        try {
            JSONObject payload = createPayload(modelId, prompt);
            StringBuilder fullResponse = new StringBuilder();

            InvokeModelWithResponseStreamRequest request = InvokeModelWithResponseStreamRequest.builder()
                .modelId(modelId)
                .body(SdkBytes.fromUtf8String(payload.toString()))
                .build();

            client.invokeModelWithResponseStream(request,
                InvokeModelWithResponseStreamResponseHandler.builder()
                    .onEventStream(stream -> stream.forEach(event -> {
                        if (event instanceof PayloadPart) {
                            String chunk = ((PayloadPart) event).bytes().asUtf8String();
                            fullResponse.append(chunk);
                        }
                    }))
                    .onError(e -> System.err.println("Stream error: " + e.getMessage()))
                    .build());

            return fullResponse.toString();
        } catch (Exception e) {
            attempt++;
            if (attempt >= maxRetries) {
                throw new RuntimeException("Stream failed after " + maxRetries + " attempts", e);
            }
            try {
                Thread.sleep((long) Math.pow(2, attempt) * 1000); // Exponential backoff
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new RuntimeException("Interrupted during retry", ie);
            }
        }
    }
    throw new RuntimeException("Unexpected error in streaming");
}

Exponential Backoff for Throttling

import software.amazon.awssdk.awscore.exception.AwsServiceException;

public <T> T invokeWithRetry(Supplier<T> invocation, int maxRetries) {
    int attempt = 0;
    while (attempt < maxRetries) {
        try {
            return invocation.get();
        } catch (AwsServiceException e) {
            if (e.statusCode() == 429 || e.statusCode() >= 500) {
                attempt++;
                if (attempt >= maxRetries) throw e;
                long delayMs = Math.min(1000 * (1L << attempt) + (long) (Math.random() * 1000), 30000);
                Thread.sleep(delayMs);
            } else {
                throw e;
            }
        }
    }
    throw new IllegalStateException("Should not reach here");
}

Text Embeddings

public double[] createEmbeddings(BedrockRuntimeClient client, String text) {
    String modelId = "amazon.titan-embed-text-v1";

    JSONObject payload = new JSONObject().put("inputText", text);

    InvokeModelResponse response = client.

aws-sdk-java-v2-bedrock

How to add

Drop this on your repo README

Related skills

MoneyPrinterTurbo

weather-svg-creator

azure-keyvault-secrets-rust

azure-monitor-ingestion-py

Get new Automação skills every Monday