/PRO-CODE AGENTS

Python agents on SAP BTP, production-grade from day one.

We build pro-code AI agents using open frameworks, deployed on SAP infrastructure — with memory, observability, and human-in-the-loop built in from the start.

Agent: LangGraph · LangChain-LiteLLM · A2A SDK (Google)
LLM / AI: SAP AI Core · GPT-4.1 · LiteLLM router
SAP BTP: HANA Agent Memory · Destination Service · XSUAA / OAuth2 · Cloud Foundry
Ops: OpenTelemetry · OTLP / Protobuf · Docker · Starlette + Uvicorn
/ORCHESTRATION

LangGraph StateGraph: deterministic, testable, inspectable.

app/agent.py — _build_graph()
from typing import Literal

from langgraph.graph import StateGraph, MessagesState, START
from langgraph.prebuilt import ToolNode

def _build_graph(self):
    tool_node = ToolNode(TOOLS)

    async def call_model(state: MessagesState):
        # The LLM either answers or emits tool_calls for the next hop.
        response = await self.llm.ainvoke(state["messages"])
        return {"messages": [response]}

    def should_continue(state) -> Literal["tools", "__end__"]:
        # Route to the tool node while the last message requests tools.
        last = state["messages"][-1]
        if hasattr(last, "tool_calls") and last.tool_calls:
            return "tools"
        return "__end__"

    builder = StateGraph(MessagesState)
    builder.add_node("model", call_model)
    builder.add_node("tools", tool_node)
    builder.add_edge(START, "model")
    builder.add_conditional_edges("model", should_continue)
    builder.add_edge("tools", "model")
    return builder.compile()

The agent runs a tight loop: the LLM decides whether to call a tool or return a final answer. Conditional routing keeps control flow explicit — no magic, no callbacks.
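Invoking the compiled graph is a single awaited call (a minimal sketch; the agent.graph attribute and the run_once wrapper are illustrative, not the production API):

from langchain_core.messages import HumanMessage

async def run_once(agent, text: str) -> str:
    # One pass through the loop: model → (tools → model)* → END.
    result = await agent.graph.ainvoke({"messages": [HumanMessage(content=text)]})
    return result["messages"][-1].content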

START → MODEL (call_model(state))
MODEL → TOOLS (ToolNode(TOOLS)) when the last message has tool_calls
TOOLS → MODEL (tool results feed back into the loop)
MODEL → END when no tool_calls remain

6 tools registered: analysis · grounding · email · jira
Async streaming: ainvoke with MessagesState
Compiled graph: builder.compile() → reusable
/TOOLS & INTEGRATIONS

Six tools. Every action confirmed before it executes.

Each tool is a typed Python function. The agent decides when to call it — humans decide when to act on it.
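A sketch of what one such typed tool could look like (assuming LangChain's @tool decorator; the body is a stub, not the production implementation):

from langchain_core.tools import tool

@tool
def run_analysis_tool(period: str = "last 24h") -> str:
    """Summarize AIF errors for the given period via /AIFErrorSummary."""
    # Production code queries the SAP AIF OData service here.
    return f"(stub) AIF error summary for {period}"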

📊

run_analysis_tool

SAP AIF OData — /AIFErrorSummary

📄

run_doc_error_catalog_tool

SharePoint · SAP AI Core vector search

✉️

draft_email_tool

Drafts email from Markdown report

📤

send_email_tool

Sends via Email Service — only after confirmation

🎫

draft_jira_tool

Drafts Jira Story from report

create_jira_tool

Creates issue via Jira REST API v2
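For reference, a minimal sketch of the request create_jira_tool could issue (Jira REST API v2 issue creation; the base URL, auth, and project key are illustrative):

import requests

def create_jira_issue(base_url: str, auth: tuple, summary: str, description: str) -> str:
    # Standard Jira REST API v2 payload for a new Story.
    payload = {
        "fields": {
            "project": {"key": "AIF"},  # hypothetical project key
            "summary": summary,
            "description": description,
            "issuetype": {"name": "Story"},
        }
    }
    resp = requests.post(f"{base_url}/rest/api/2/issue", json=payload, auth=auth, timeout=30)
    resp.raise_for_status()
    return resp.json()["key"]  # e.g. the new issue key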

Human-in-the-Loop Pattern

##AWAITING_CONFIRMATION##
→ agent pauses and surfaces the draft
→ user replies "yes" → action executes
→ send_email_tool / create_jira_tool fires

1. Agent drafts: draft_email_tool or draft_jira_tool
2. Human reviews: the ##AWAITING_CONFIRMATION## marker surfaces the draft
3. User confirms: reply "yes" to trigger the action
4. Action fires: send or create, never before (see the sketch below)
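A minimal sketch of the confirmation gate (the helper names and turn-handling shape are assumptions; only the marker itself comes from the agent contract above):

AWAITING = "##AWAITING_CONFIRMATION##"

def next_step(agent_reply: str, user_reply: str | None) -> str:
    # Drafts carry the marker; nothing executes until the user says "yes".
    if AWAITING not in agent_reply:
        return "done"     # final answer, no action pending
    if user_reply and user_reply.strip().lower() == "yes":
        return "execute"  # fire send_email_tool / create_jira_tool
    return "hold"         # keep the draft, wait for confirmation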
/MEMORY

Agents that remember. Context that compounds.

app/agent.py — _search_relevant_memories()
def _search_relevant_memories(
    memory_client, context_id: str, query: str
) -> str:
    """Retrieve semantically relevant past memories."""
    results = memory_client.search_memories(
        agent_id=AGENT_ID,
        invoker_id=context_id,
        query=query,
        threshold=0.65,  # minimum semantic similarity
        limit=3,         # top-3 memories per prompt
    )
    lines = [f"- {r.content[:800]}" for r in results]  # 800-char cap per memory
    return "Relevant context:\n" + "\n".join(lines)

Every completed analysis is persisted to SAP HANA Agent Memory. On the next conversation, the three most semantically relevant past reports are injected into the system prompt.
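The write path could mirror the retrieval call above (a sketch; persist_turn appears in the flow below, but add_memory and its parameters are assumptions about the memory client's API):

def persist_turn(memory_client, context_id: str, report_md: str) -> None:
    # Hypothetical write path: store the finished analysis for later semantic search.
    memory_client.add_memory(
        agent_id=AGENT_ID,
        invoker_id=context_id,
        content=report_md,
    )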

Conversation 1: "AIF errors today?" → persist_turn() → HANA
Conversation 2: "Any new issues?" → search_memories() with threshold=0.65
Conversation 3: "Same as last week?" → top-3 memories injected into prompt

0.65: similarity threshold
top-3: memories per prompt
800: chars per memory chunk
HANA: SAP vector store

/OBSERVABILITY

Every token, every tool call, every millisecond traced.

Codemine Telemetry Dashboard — OTLP trace
SPAN · DURATION
invoke_agent_span: aif-analysis-agent · 450ms
├── LiteLLM: sap/gpt-4.1 (call_model) · gen_ai.request.model=sap/gpt-4.1 · 180ms
├── run_analysis_tool · aif.period=last 24h · aif.record_count=42 · 95ms
├── LiteLLM: sap/gpt-4.1 (call_model) · tool_calls=run_doc_error_catalog_tool · 210ms
├── run_doc_error_catalog_tool · grounding.query=FI/001 Amount mismatch · 88ms
└── LiteLLM: sap/gpt-4.1 (final) · gen_ai.usage.total_tokens=2847 · 175ms

450ms total · 2,847 tokens · 6 spans

One call to auto_instrument() wraps LiteLLM, LangChain, and httpx with OpenTelemetry spans. Every LLM call, tool invocation, and HTTP request is traced end-to-end.

app/bootstrap.py — configure_telemetry()
def configure_telemetry() -> None:
    """OpenTelemetry — LiteLLM, LangChain, httpx."""

    # OTEL_TRACES_EXPORTER=console  → stdout (local dev)
    # OTEL_EXPORTER_OTLP_ENDPOINT   → OTLP collector (prod)
    # OTEL_EXPORTER_OTLP_PROTOCOL   → http/protobuf | grpc
    # OTEL_SERVICE_NAME             → tag on all spans

    # Codemine Telemetry Dashboard (CF):
    # OTEL_EXPORTER_OTLP_ENDPOINT=
    #   https://codemine-telemetry…hana.ondemand.com/otel

    auto_instrument()
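Auto-instrumentation covers the framework calls; an extra manual span uses the standard OpenTelemetry API (a sketch, with attribute names mirroring the trace above; the traced_step wrapper is illustrative):

from opentelemetry import trace

tracer = trace.get_tracer("aif-analysis-agent")

def traced_step(period: str) -> None:
    # Manual span nested under the auto-instrumented parent.
    with tracer.start_as_current_span("run_analysis_tool") as span:
        span.set_attribute("aif.period", period)
        # ... run the tool here ...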
📡

OTLP export

http/protobuf to Codemine Telemetry Dashboard

🔢

Token tracking

input · output · total per LLM call

🔗

Trace propagation

W3C traceparent forwarded from A2A caller (see the sketch below)

👤

User attribution

user.id propagated to all child spans via XSUAA
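How the trace-propagation and user-attribution cards fit together in code (a sketch using the standard OpenTelemetry propagation API; the header dict and user lookup are illustrative):

from opentelemetry import trace
from opentelemetry.propagate import extract

tracer = trace.get_tracer("aif-analysis-agent")

def handle_a2a_request(headers: dict, user_id: str) -> None:
    # Continue the caller's W3C trace instead of starting a fresh one.
    ctx = extract(headers)
    with tracer.start_as_current_span("invoke_agent_span", context=ctx) as span:
        # user.id (from the XSUAA token) rides on the span and its children.
        span.set_attribute("user.id", user_id)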

READY TO BUILD

Ship your first pro-code agent on SAP BTP.

We provide the architecture, the code patterns, and the SAP integration know-how. You own the outcome.