/PRO-CODE AGENTS

Python agents on SAP BTP, production-grade from day one.

We build pro-code AI agents using open frameworks, deployed on SAP infrastructure — with memory, observability, and human-in-the-loop built in from the start.

Agent: LangGraph · LangChain-LiteLLM · A2A SDK (Google)
LLM / AI: SAP AI Core · GPT-4.1 · LiteLLM router
SAP BTP: HANA Agent Memory · Destination Service · XSUAA / OAuth2 · Cloud Foundry
Ops: OpenTelemetry · OTLP / Protobuf · Docker · Starlette + Uvicorn
/ORCHESTRATION

LangGraph StateGraph: deterministic, testable, inspectable.

app/agent.py — _build_graph()
from typing import Literal

from langgraph.graph import StateGraph, MessagesState, START
from langgraph.prebuilt import ToolNode

def _build_graph(self):
    tool_node = ToolNode(TOOLS)

    async def call_model(state: MessagesState):
        # The LLM either answers or emits tool_calls for the next hop.
        response = await self.llm.ainvoke(state["messages"])
        return {"messages": [response]}

    def should_continue(state) -> Literal["tools", "__end__"]:
        # Route to the tool node while the last message requests tools.
        last = state["messages"][-1]
        if hasattr(last, "tool_calls") and last.tool_calls:
            return "tools"
        return "__end__"

    builder = StateGraph(MessagesState)
    builder.add_node("model", call_model)
    builder.add_node("tools", tool_node)
    builder.add_edge(START, "model")
    builder.add_conditional_edges("model", should_continue)
    builder.add_edge("tools", "model")
    return builder.compile()

The agent runs a tight loop: the LLM decides whether to call a tool or return a final answer. Conditional routing keeps control flow explicit — no magic, no callbacks.
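Invoking the compiled graph is a single awaited call (a minimal sketch; the agent.graph attribute and the run_once wrapper are illustrative, not the production API):

from langchain_core.messages import HumanMessage

async def run_once(agent, text: str) -> str:
    # One pass through the loop: model → (tools → model)* → END.
    result = await agent.graph.ainvoke({"messages": [HumanMessage(content=text)]})
    return result["messages"][-1].content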

START → MODEL (call_model(state))
MODEL → TOOLS (ToolNode(TOOLS)) when the last message has tool_calls
TOOLS → MODEL (tool results feed back into the loop)
MODEL → END when no tool_calls remain

6 tools registered: analysis · grounding · email · jira
Async streaming: ainvoke with MessagesState
Compiled graph: builder.compile() → reusable
/TOOLS & INTEGRATIONS

Six tools. Every action confirmed before it executes.

Each tool is a typed Python function. The agent decides when to call it — humans decide when to act on it.
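A sketch of what one such typed tool could look like (assuming LangChain's @tool decorator; the body is a stub, not the production implementation):

from langchain_core.tools import tool

@tool
def run_analysis_tool(period: str = "last 24h") -> str:
    """Summarize AIF errors for the given period via /AIFErrorSummary."""
    # Production code queries the SAP AIF OData service here.
    return f"(stub) AIF error summary for {period}"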

📊

run_analysis_tool

SAP AIF OData — /AIFErrorSummary

📄

run_doc_error_catalog_tool

SharePoint · SAP AI Core vector search

✉️

draft_email_tool

Drafts email from Markdown report

📤

send_email_tool

Sends via Email Service — only after confirmation

🎫

draft_jira_tool

Drafts Jira Story from report

create_jira_tool

Creates issue via Jira REST API v2
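For reference, a minimal sketch of the request create_jira_tool could issue (Jira REST API v2 issue creation; the base URL, auth, and project key are illustrative):

import requests

def create_jira_issue(base_url: str, auth: tuple, summary: str, description: str) -> str:
    # Standard Jira REST API v2 payload for a new Story.
    payload = {
        "fields": {
            "project": {"key": "AIF"},  # hypothetical project key
            "summary": summary,
            "description": description,
            "issuetype": {"name": "Story"},
        }
    }
    resp = requests.post(f"{base_url}/rest/api/2/issue", json=payload, auth=auth, timeout=30)
    resp.raise_for_status()
    return resp.json()["key"]  # e.g. the new issue key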

Human-in-the-Loop Pattern

##AWAITING_CONFIRMATION##
→ agent pauses and surfaces the draft
→ user replies "yes" → action executes
→ send_email_tool / create_jira_tool fires

1. Agent drafts: draft_email_tool or draft_jira_tool
2. Human reviews: the ##AWAITING_CONFIRMATION## marker surfaces the draft
3. User confirms: reply "yes" to trigger the action
4. Action fires: send or create, never before (see the sketch below)
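A minimal sketch of the confirmation gate (the helper names and turn-handling shape are assumptions; only the marker itself comes from the agent contract above):

AWAITING = "##AWAITING_CONFIRMATION##"

def next_step(agent_reply: str, user_reply: str | None) -> str:
    # Drafts carry the marker; nothing executes until the user says "yes".
    if AWAITING not in agent_reply:
        return "done"     # final answer, no action pending
    if user_reply and user_reply.strip().lower() == "yes":
        return "execute"  # fire send_email_tool / create_jira_tool
    return "hold"         # keep the draft, wait for confirmation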
/MEMORY

Agents that remember. Context that compounds.

app/agent.py — _search_relevant_memories()
def _search_relevant_memories(
    memory_client, context_id: str, query: str
) -> str:
    """Retrieve semantically relevant past memories."""
    results = memory_client.search_memories(
        agent_id=AGENT_ID,
        invoker_id=context_id,
        query=query,
        threshold=0.65,  # minimum semantic similarity
        limit=3,         # top-3 memories per prompt
    )
    lines = [f"- {r.content[:800]}" for r in results]  # 800-char cap per memory
    return "Relevant context:\n" + "\n".join(lines)

Every completed analysis is persisted to SAP HANA Agent Memory. On the next conversation, the three most semantically relevant past reports are injected into the system prompt.
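The write path could mirror the retrieval call above (a sketch; persist_turn appears in the flow below, but add_memory and its parameters are assumptions about the memory client's API):

def persist_turn(memory_client, context_id: str, report_md: str) -> None:
    # Hypothetical write path: store the finished analysis for later semantic search.
    memory_client.add_memory(
        agent_id=AGENT_ID,
        invoker_id=context_id,
        content=report_md,
    )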

Conversation 1: "AIF errors today?" → persist_turn() → HANA
Conversation 2: "Any new issues?" → search_memories() with threshold=0.65
Conversation 3: "Same as last week?" → top-3 memories injected into prompt

0.65: similarity threshold
top-3: memories per prompt
800: chars per memory chunk
HANA: SAP vector store

/OBSERVABILITY

Every token, every tool call, every millisecond traced.

Codemine Telemetry Dashboard — OTLP trace
SPAN · DURATION
invoke_agent_span: aif-analysis-agent · 450ms
├── LiteLLM: sap/gpt-4.1 (call_model) · gen_ai.request.model=sap/gpt-4.1 · 180ms
├── run_analysis_tool · aif.period=last 24h · aif.record_count=42 · 95ms
├── LiteLLM: sap/gpt-4.1 (call_model) · tool_calls=run_doc_error_catalog_tool · 210ms
├── run_doc_error_catalog_tool · grounding.query=FI/001 Amount mismatch · 88ms
└── LiteLLM: sap/gpt-4.1 (final) · gen_ai.usage.total_tokens=2847 · 175ms

450ms total · 2,847 tokens · 6 spans

One call to auto_instrument() wraps LiteLLM, LangChain, and httpx with OpenTelemetry spans. Every LLM call, tool invocation, and HTTP request is traced end-to-end.

app/bootstrap.py — configure_telemetry()
def configure_telemetry() -> None:
    """OpenTelemetry — LiteLLM, LangChain, httpx."""

    # OTEL_TRACES_EXPORTER=console  → stdout (local dev)
    # OTEL_EXPORTER_OTLP_ENDPOINT   → OTLP collector (prod)
    # OTEL_EXPORTER_OTLP_PROTOCOL   → http/protobuf | grpc
    # OTEL_SERVICE_NAME             → tag on all spans

    # Codemine Telemetry Dashboard (CF):
    # OTEL_EXPORTER_OTLP_ENDPOINT=
    #   https://codemine-telemetry…hana.ondemand.com/otel

    auto_instrument()
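Auto-instrumentation covers the framework calls; an extra manual span uses the standard OpenTelemetry API (a sketch, with attribute names mirroring the trace above; the traced_step wrapper is illustrative):

from opentelemetry import trace

tracer = trace.get_tracer("aif-analysis-agent")

def traced_step(period: str) -> None:
    # Manual span nested under the auto-instrumented parent.
    with tracer.start_as_current_span("run_analysis_tool") as span:
        span.set_attribute("aif.period", period)
        # ... run the tool here ...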
📡

OTLP export

http/protobuf to Codemine Telemetry Dashboard

🔢

Token tracking

input · output · total per LLM call

🔗

Trace propagation

W3C traceparent forwarded from A2A caller (see the sketch below)

👤

User attribution

user.id propagated to all child spans via XSUAA
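How the trace-propagation and user-attribution cards fit together in code (a sketch using the standard OpenTelemetry propagation API; the header dict and user lookup are illustrative):

from opentelemetry import trace
from opentelemetry.propagate import extract

tracer = trace.get_tracer("aif-analysis-agent")

def handle_a2a_request(headers: dict, user_id: str) -> None:
    # Continue the caller's W3C trace instead of starting a fresh one.
    ctx = extract(headers)
    with tracer.start_as_current_span("invoke_agent_span", context=ctx) as span:
        # user.id (from the XSUAA token) rides on the span and its children.
        span.set_attribute("user.id", user_id)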

READY TO BUILD

Ship your first pro-code agent on SAP BTP.

We provide the architecture, the code patterns, and the SAP integration know-how. You own the outcome.