Test Plan: apcore-a2a¶
| Field | Value |
|---|---|
| Title | apcore-a2a: Test Plan and Test Cases |
| Document | Test Plan (TP) |
| Document ID | TP-APCORE-A2A-001 |
| Version | 1.0 |
| Date | 2026-03-03 |
| Author | aipartnerup Engineering Team |
| Status | Draft |
| SRS Ref | docs/apcore-a2a/srs.md v1.0 (SRS-APCORE-A2A-001) |
| Tech Design | docs/apcore-a2a/tech-design.md v1.0 (TDD-APCORE-A2A-001) |
| Standard | IEEE 829-2008, ISTQB Foundation Level |
Revision History¶
| Version | Date | Author | Description |
|---|---|---|---|
| 1.0 | 2026-03-03 | aipartnerup Engineering Team | Initial draft |
Table of Contents¶
- Test Plan Overview
- Test Strategy
- Test Environment
- Test Schedule
- Unit Test Cases — Adapters
- Unit Test Cases — Server
- Unit Test Cases — Client & Auth & Storage
- Unit Test Cases — CLI, Explorer, Ops
- Integration Tests
- End-to-End Tests
- Performance Tests
- Security Tests
- Test Data and Fixtures
- Traceability Matrix
- Quality Gates
- Risk Assessment
1. Test Plan Overview¶
1.1 Purpose¶
This Test Plan defines the complete testing strategy, test cases, and acceptance criteria for apcore-a2a v1.0.0. It covers all 19 PRD features formalized into 63 functional requirements and 30 non-functional requirements in the upstream SRS. Every test case is traceable to SRS requirements and the technical design.
1.2 Scope¶
In scope: - All functional requirements (FR-SRV, FR-AGC, FR-SKL, FR-MSG, FR-TSK, FR-EXE, FR-ERR, FR-CLI, FR-CTX, FR-PSH, FR-AUT, FR-STR, FR-CMD, FR-EXP, FR-OPS) - All non-functional requirements (NFR-PERF, NFR-SEC, NFR-REL, NFR-CMP, NFR-MNT, NFR-SCA) - A2A Client module (A2AClient, AgentCardFetcher) - All API endpoints: JSON-RPC methods, REST endpoints, SSE streams, webhooks
Out of scope: - gRPC binding (not implemented in v1) - Agent Card cryptographic signing (deferred) - Persistent storage backends beyond InMemoryTaskStore - Browser-based manual Explorer UI testing (covered by automated endpoint tests)
1.3 Quality Objectives¶
- Line coverage ≥ 90% (matching apcore-mcp-python standard)
- Branch coverage ≥ 85%
- Zero P0 test cases failing at release
- All SRS functional requirements have ≥ 1 test case
- All NFR requirements have ≥ 1 measurable test
2. Test Strategy¶
2.1 Test Levels¶
| Level | Tool | Scope | Mock Boundary |
|---|---|---|---|
| Unit | pytest + pytest-asyncio | Individual components in isolation | Registry, Executor (stub) |
| Integration | pytest + pytest-asyncio + httpx | Cross-component flows | External A2A clients |
| E2E | pytest + httpx + real server | Full system via HTTP | None |
| Performance | pytest + time/asyncio | Latency and throughput | None |
| Security | pytest + httpx | Auth, sanitization | None |
2.2 Test Framework Configuration¶
# pyproject.toml (test section)
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
filterwarnings = ["error"]
[tool.coverage.run]
source = ["src/apcore_a2a"]
branch = true
[tool.coverage.report]
fail_under = 90
2.3 Stub Pattern¶
All unit tests use lightweight stub fixtures — NOT the real apcore-python SDK:
# conftest.py pattern
from dataclasses import dataclass, field
@dataclass
class StubAnnotations:
readonly: bool = False
destructive: bool = False
idempotent: bool = False
requires_approval: bool = False
open_world: bool = True
@dataclass
class StubExample:
title: str
inputs: dict = field(default_factory=dict)
@dataclass
class StubDescriptor:
module_id: str
description: str = ""
tags: list[str] = field(default_factory=list)
examples: list[StubExample] = field(default_factory=list)
input_schema: dict = field(default_factory=dict)
output_schema: dict = field(default_factory=dict)
annotations: StubAnnotations = field(default_factory=StubAnnotations)
class StubRegistry:
def __init__(self, modules: dict[str, StubDescriptor]):
self._modules = modules
def list(self, tags=None, prefix=None) -> list[str]:
return list(self._modules.keys())
def get_definition(self, module_id: str) -> StubDescriptor | None:
return self._modules.get(module_id)
class StubExecutor:
def __init__(self, results: dict[str, any] = None, errors: dict[str, Exception] = None):
self._results = results or {}
self._errors = errors or {}
async def call_async(self, module_id: str, inputs: dict) -> any:
if module_id in self._errors:
raise self._errors[module_id]
return self._results.get(module_id, {"result": "ok"})
async def stream(self, module_id: str, inputs: dict):
for chunk in self._results.get(module_id, [{"chunk": 1}]):
yield chunk
def validate(self, module_id: str, inputs: dict) -> None:
pass
2.4 Automation¶
All 100% automated via pytest. No manual test cases. CI runs on every PR via GitHub Actions.
3. Test Environment¶
3.1 Software Requirements¶
| Component | Version |
|---|---|
| Python | 3.11+ |
| pytest | ≥ 8.0 |
| pytest-asyncio | ≥ 0.23 |
| pytest-cov | ≥ 4.1 |
| httpx | ≥ 0.27 (async HTTP client for integration/E2E) |
| PyJWT | ≥ 2.8 |
| ruff | ≥ 0.3 |
| mypy | ≥ 1.9 |
3.2 Test Directory Structure¶
tests/
├── conftest.py # Shared stubs and fixtures
├── adapters/
│ ├── test_agent_card_builder.py
│ ├── test_skill_mapper.py
│ ├── test_schema_converter.py
│ ├── test_part_converter.py
│ └── test_error_mapper.py
├── server/
│ ├── test_server_factory.py
│ ├── test_execution_router.py
│ ├── test_task_manager.py
│ └── test_streaming_handler.py
├── push/
│ └── test_push_notification_manager.py
├── client/
│ ├── test_a2a_client.py
│ └── test_agent_card_fetcher.py
├── auth/
│ ├── test_jwt_authenticator.py
│ └── test_auth_middleware.py
├── storage/
│ └── test_in_memory_task_store.py
├── cli/
│ └── test_cli.py
├── explorer/
│ └── test_explorer.py
├── ops/
│ └── test_operations.py
├── integration/
│ └── test_integration.py
├── e2e/
│ └── test_e2e.py
├── performance/
│ └── test_performance.py
└── security/
└── test_security.py
3.3 CI/CD Integration¶
# .github/workflows/test.yml
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with: { python-version: "3.11" }
- run: pip install -e ".[dev]"
- run: ruff check src/ tests/
- run: mypy src/
- run: pytest --cov=apcore_a2a --cov-report=xml --cov-fail-under=90
4. Test Schedule¶
| Phase | Test Types | Duration | Entry Criteria |
|---|---|---|---|
| Phase 1 — Adapters | Unit (TC-AGC, TC-SKL, TC-SCH, TC-PRT, TC-ERR) | Week 1-2 | Components implemented |
| Phase 2 — Server Core | Unit (TC-SRV, TC-RTR, TC-TSK, TC-STR) | Week 3-4 | Adapter tests passing |
| Phase 3 — Push/Client/Auth | Unit (TC-PSH, TC-CLI, TC-AUT, TC-STO) | Week 5-6 | Server core tests passing |
| Phase 4 — Integration | TC-INT | Week 7 | All unit tests passing |
| Phase 5 — E2E/Perf/Security | TC-E2E, TC-PERF, TC-SEC | Week 8-9 | Integration tests passing |
5. Unit Test Cases — Adapters¶
5.1 TC-AGC: Agent Card Builder¶
TC-AGC-001: Generate Agent Card with all Registry metadata fields¶
| Field | Value |
|---|---|
| ID | TC-AGC-001 |
| Priority | P0 |
| SRS Trace | FR-AGC-001 |
Preconditions: Registry with project config {name: "my-agent", description: "Test agent", version: "1.2.3"} and 2 modules: math.add (description: "Adds two numbers") and math.multiply (description: "Multiplies two numbers").
Test Steps:
1. Create StubRegistry with the above 2 modules.
2. Instantiate AgentCardBuilder(registry, url="http://localhost:8000").
3. Call builder.build().
4. Assert card.name == "my-agent".
5. Assert card.description == "Test agent".
6. Assert card.version == "1.2.3".
7. Assert card.url == "http://localhost:8000".
8. Assert len(card.skills) == 2.
9. Assert "application/json" in card.defaultInputModes.
10. Assert "application/json" in card.defaultOutputModes.
Expected Result: Agent Card built successfully with all fields matching Registry config.
TC-AGC-002: Fallback defaults when Registry has no project config¶
| Field | Value |
|---|---|
| ID | TC-AGC-002 |
| Priority | P0 |
| SRS Trace | FR-AGC-001 |
Preconditions: Registry with no project config, 3 modules.
Test Steps:
1. Create StubRegistry with no project config and 3 modules.
2. Call AgentCardBuilder(registry, url="http://localhost:9000").build().
3. Assert card.name == "apcore-agent".
4. Assert card.version == "0.0.0".
5. Assert "3" in card.description (auto-generated includes module count).
Expected Result: Fallbacks applied; description mentions "3 skills".
TC-AGC-003: Capabilities set to True when streaming module present¶
| Field | Value |
|---|---|
| ID | TC-AGC-003 |
| Priority | P0 |
| SRS Trace | FR-AGC-002 |
Preconditions: Registry with one streaming-capable module (executor has stream() method).
Test Steps:
1. Create StubExecutor that has a working stream() coroutine.
2. Build Agent Card with push_notifications=True.
3. Assert card.capabilities.streaming == True.
4. Assert card.capabilities.pushNotifications == True.
Expected Result: Both capabilities are True.
TC-AGC-004: capabilities.streaming is False when no streaming modules¶
| Field | Value |
|---|---|
| ID | TC-AGC-004 |
| Priority | P0 |
| SRS Trace | FR-AGC-002 |
Preconditions: Registry with only non-streaming modules, push_notifications=False.
Test Steps:
1. Create StubExecutor without stream() method.
2. Build Agent Card.
3. Assert card.capabilities.streaming == False.
4. Assert card.capabilities.pushNotifications == False.
Expected Result: Both capabilities are False.
TC-AGC-005: GET /.well-known/agent.json returns 200 with valid JSON¶
| Field | Value |
|---|---|
| ID | TC-AGC-005 |
| Priority | P0 |
| SRS Trace | FR-AGC-003 |
Preconditions: A2A server running with 1 module data.echo.
Test Steps:
1. Start test ASGI client using httpx.AsyncClient(app=asgi_app).
2. Send GET /.well-known/agent.json.
3. Assert status code 200.
4. Assert Content-Type: application/json.
5. Assert Cache-Control header contains max-age=300.
6. Parse body as JSON — assert card["name"] is a non-empty string.
7. Assert card["skills"] is a list with 1 item.
Expected Result: HTTP 200, valid Agent Card JSON with Cache-Control: max-age=300.
TC-AGC-006: /.well-known/agent.json requires NO authentication¶
| Field | Value |
|---|---|
| ID | TC-AGC-006 |
| Priority | P0 |
| SRS Trace | FR-AGC-003 |
Preconditions: Server configured with JWT auth (require_auth=True).
Test Steps:
1. Start server with JWTAuthenticator(secret="test-secret") and require_auth=True.
2. Send GET /.well-known/agent.json with NO Authorization header.
3. Assert status code 200.
Expected Result: 200 (public endpoint, auth-exempt).
TC-AGC-007: Extended card at /agent/authenticatedExtendedCard requires auth¶
| Field | Value |
|---|---|
| ID | TC-AGC-007 |
| Priority | P1 |
| SRS Trace | FR-AGC-004 |
Preconditions: Server with JWTAuthenticator(secret="secret-key-123").
Test Steps:
1. Send GET /agent/authenticatedExtendedCard without auth header.
2. Assert status code 401.
3. Send again with Authorization: Bearer {valid_jwt}.
4. Assert status code 200.
5. Assert response body is a valid Agent Card JSON.
Expected Result: 401 without auth, 200 with valid JWT.
TC-AGC-008: Agent Card updated within 1s after module registration¶
| Field | Value |
|---|---|
| ID | TC-AGC-008 |
| Priority | P2 |
| SRS Trace | FR-AGC-005 |
Preconditions: Running server, registry initially has 1 module.
Test Steps:
1. Fetch Agent Card — assert 1 skill.
2. Call registry.register(new_module).
3. await asyncio.sleep(1.1).
4. Fetch Agent Card again.
5. Assert 2 skills in updated card.
Expected Result: Card updated within 1 second.
5.2 TC-SKL: Skill Mapper¶
TC-SKL-001: Module ID maps exactly to Skill.id¶
| Field | Value |
|---|---|
| ID | TC-SKL-001 |
| Priority | P0 |
| SRS Trace | FR-SKL-001 |
Preconditions: Module with module_id = "image.resize".
Test Steps:
1. Call SkillMapper.to_skill(descriptor).
2. Assert skill.id == "image.resize".
Expected Result: Skill.id == "image.resize" (no transformation).
TC-SKL-002: Module description maps verbatim to Skill.description¶
| Field | Value |
|---|---|
| ID | TC-SKL-002 |
| Priority | P0 |
| SRS Trace | FR-SKL-001 |
Preconditions: Module with description = "Resizes an image to specified dimensions".
Test Steps:
1. Map descriptor.
2. Assert skill.description == "Resizes an image to specified dimensions".
Expected Result: Description preserved verbatim.
TC-SKL-003: Module with empty description excluded from skills¶
| Field | Value |
|---|---|
| ID | TC-SKL-003 |
| Priority | P0 |
| SRS Trace | FR-SKL-001 |
Preconditions: Module with description = "" (empty string).
Test Steps:
1. Call SkillMapper.to_skill(descriptor).
2. Assert result is None.
3. Assert warning was logged containing module_id.
Expected Result: Returns None, warning logged.
TC-SKL-004: Module with None description excluded from skills¶
| Field | Value |
|---|---|
| ID | TC-SKL-004 |
| Priority | P0 |
| SRS Trace | FR-SKL-001 |
Preconditions: Module with description = None.
Test Steps:
1. Call SkillMapper.to_skill(descriptor).
2. Assert result is None.
Expected Result: Returns None.
TC-SKL-005: Module tags map to Skill.tags¶
| Field | Value |
|---|---|
| ID | TC-SKL-005 |
| Priority | P0 |
| SRS Trace | FR-SKL-001 |
Preconditions: Module with tags = ["image", "resize", "utility"].
Test Steps:
1. Map descriptor.
2. Assert skill.tags == ["image", "resize", "utility"].
Expected Result: Tags list identical.
TC-SKL-006: Examples mapped correctly, max 10 included¶
| Field | Value |
|---|---|
| ID | TC-SKL-006 |
| Priority | P0 |
| SRS Trace | FR-SKL-002 |
Preconditions: Module with 12 examples, first has title = "Resize to 512x512", inputs = {"width": 512, "height": 512}.
Test Steps:
1. Map descriptor.
2. Assert len(skill.examples) == 10.
3. Assert skill.examples[0].name == "Resize to 512x512".
4. Assert first example input contains JSON serialization of {"width": 512, "height": 512}.
Expected Result: Only first 10 examples included, name and input correct.
TC-SKL-007: Module with input_schema declares application/json inputMode¶
| Field | Value |
|---|---|
| ID | TC-SKL-007 |
| Priority | P0 |
| SRS Trace | FR-SKL-003 |
Preconditions: Module with input_schema = {"type": "object", "properties": {"n": {"type": "integer"}}}.
Test Steps:
1. Map descriptor.
2. Assert "application/json" in skill.inputModes.
Expected Result: inputModes contains "application/json".
TC-SKL-008: Module with no input_schema declares text/plain inputMode¶
| Field | Value |
|---|---|
| ID | TC-SKL-008 |
| Priority | P0 |
| SRS Trace | FR-SKL-003 |
Preconditions: Module with input_schema = None or {}.
Test Steps:
1. Map descriptor.
2. Assert "text/plain" in skill.inputModes.
3. Assert "application/json" NOT in skill.inputModes.
Expected Result: Only text/plain in inputModes.
TC-SKL-009: Annotations included as Skill extensions¶
| Field | Value |
|---|---|
| ID | TC-SKL-009 |
| Priority | P0 |
| SRS Trace | FR-SKL-004 |
Preconditions: Module with annotations = StubAnnotations(readonly=True, destructive=False).
Test Steps:
1. Map descriptor.
2. Assert skill.extensions["apcore"]["annotations"]["readonly"] == True.
3. Assert skill.extensions["apcore"]["annotations"]["destructive"] == False.
Expected Result: Annotations in extensions with correct boolean values.
TC-SKL-010: Modules with None annotations omit extensions key¶
| Field | Value |
|---|---|
| ID | TC-SKL-010 |
| Priority | P0 |
| SRS Trace | FR-SKL-004 |
Preconditions: Module with annotations = None.
Test Steps:
1. Map descriptor.
2. Assert "apcore" NOT in skill.extensions (or extensions is empty/None).
Expected Result: No apcore extension key.
5.3 TC-SCH: Schema Converter¶
TC-SCH-001: Empty schema converted to empty object schema¶
| Field | Value |
|---|---|
| ID | TC-SCH-001 |
| Priority | P0 |
| SRS Trace | FR-EXE-003 |
Preconditions: Input schema {}.
Test Steps:
1. Call converter.convert_input_schema(ModuleDescriptor(input_schema={})).
2. Assert result == {"type": "object", "properties": {}}.
Expected Result: {"type": "object", "properties": {}}.
TC-SCH-002: None schema converted to empty object schema¶
| Field | Value |
|---|---|
| ID | TC-SCH-002 |
| Priority | P0 |
| SRS Trace | FR-EXE-003 |
Test Steps:
1. Call converter.convert_input_schema(ModuleDescriptor(input_schema=None)).
2. Assert result == {"type": "object", "properties": {}}.
Expected Result: {"type": "object", "properties": {}}.
TC-SCH-003: $ref resolved and $defs removed¶
| Field | Value |
|---|---|
| ID | TC-SCH-003 |
| Priority | P0 |
| SRS Trace | FR-EXE-003 |
Preconditions:
schema = {
"type": "object",
"$defs": {"Step": {"type": "object", "properties": {"name": {"type": "string"}}}},
"properties": {"steps": {"type": "array", "items": {"$ref": "#/$defs/Step"}}}
}
Test Steps:
1. Call converter.convert_input_schema(ModuleDescriptor(input_schema=schema)).
2. Assert "$defs" NOT in result.
3. Assert result["properties"]["steps"]["items"] == {"type": "object", "properties": {"name": {"type": "string"}}}.
Expected Result: $ref inlined, $defs removed.
TC-SCH-004: Circular $ref raises ValueError¶
| Field | Value |
|---|---|
| ID | TC-SCH-004 |
| Priority | P0 |
| SRS Trace | FR-EXE-003 |
Preconditions:
schema = {
"$defs": {"A": {"$ref": "#/$defs/B"}, "B": {"$ref": "#/$defs/A"}},
"properties": {"x": {"$ref": "#/$defs/A"}}
}
Test Steps:
1. Call converter.convert_input_schema(ModuleDescriptor(input_schema=schema)).
2. Assert ValueError is raised.
3. Assert error message contains "circular" or "recursion".
Expected Result: ValueError raised with descriptive message.
TC-SCH-005: Input schema deep-copied (original not mutated)¶
| Field | Value |
|---|---|
| ID | TC-SCH-005 |
| Priority | P0 |
| SRS Trace | FR-EXE-003 |
Preconditions: Schema with $defs and nested properties.
Test Steps:
1. Store original schema.
2. Call converter.convert_input_schema(ModuleDescriptor(input_schema=schema)).
3. Assert original schema still has $defs key.
Expected Result: Original schema unchanged.
5.4 TC-PRT: Part Converter¶
TC-PRT-001: Dict output converted to DataPart with application/json¶
| Field | Value |
|---|---|
| ID | TC-PRT-001 |
| Priority | P0 |
| SRS Trace | FR-MSG-004 |
Preconditions: Module output {"result": 42, "status": "ok"}.
Test Steps:
1. Call PartConverter.output_to_parts({"result": 42, "status": "ok"}).
2. Assert artifact has one DataPart.
3. Assert part.mediaType == "application/json".
4. Assert part.data == {"result": 42, "status": "ok"}.
Expected Result: DataPart with application/json media type.
TC-PRT-002: String output converted to TextPart¶
| Field | Value |
|---|---|
| ID | TC-PRT-002 |
| Priority | P0 |
| SRS Trace | FR-MSG-004 |
Preconditions: Module output "Hello, world!".
Test Steps:
1. Call PartConverter.output_to_parts("Hello, world!").
2. Assert artifact has one TextPart.
3. Assert part.text == "Hello, world!".
Expected Result: TextPart with correct text.
TC-PRT-003: DataPart with application/json parsed to dict input¶
| Field | Value |
|---|---|
| ID | TC-PRT-003 |
| Priority | P0 |
| SRS Trace | FR-MSG-003 |
Preconditions: A2A message with DataPart(mediaType="application/json", data={"width": 512, "height": 512}).
Test Steps:
1. Call PartConverter.parts_to_input(parts, input_schema).
2. Assert result == {"width": 512, "height": 512}.
Expected Result: Dict input extracted from DataPart.
TC-PRT-004: TextPart with valid JSON auto-parsed when module expects object¶
| Field | Value |
|---|---|
| ID | TC-PRT-004 |
| Priority | P0 |
| SRS Trace | FR-MSG-003 |
Preconditions: TextPart with text = '{"width": 256, "height": 256}', input_schema root type "object".
Test Steps:
1. Call PartConverter.parts_to_input(parts, input_schema={"type": "object"}).
2. Assert result == {"width": 256, "height": 256}.
Expected Result: JSON auto-parsed from TextPart.
TC-PRT-005: TextPart with invalid JSON raises JSON-RPC -32602¶
| Field | Value |
|---|---|
| ID | TC-PRT-005 |
| Priority | P0 |
| SRS Trace | FR-MSG-003 |
Preconditions: TextPart with text = "not json {", module expects object input.
Test Steps:
1. Call PartConverter.parts_to_input(parts, input_schema={"type": "object"}).
2. Assert A2AError raised with code -32602.
3. Assert message contains "Invalid JSON in TextPart".
Expected Result: Error -32602 with "Invalid JSON in TextPart".
TC-PRT-006: Empty parts list raises JSON-RPC -32602¶
| Field | Value |
|---|---|
| ID | TC-PRT-006 |
| Priority | P0 |
| SRS Trace | FR-MSG-003 |
Test Steps:
1. Call PartConverter.parts_to_input([], input_schema={}).
2. Assert A2AError raised with code -32602.
3. Assert message contains "Message must contain at least one Part".
Expected Result: Error -32602 with required message text.
5.5 TC-ERR: Error Mapper¶
TC-ERR-001: ModuleNotFoundError maps to JSON-RPC -32601¶
| Field | Value |
|---|---|
| ID | TC-ERR-001 |
| Priority | P0 |
| SRS Trace | FR-ERR-001 |
Test Steps:
1. Call ErrorMapper.to_jsonrpc_error(ModuleNotFoundError("unknown.module")).
2. Assert result.code == -32601.
Expected Result: Code -32601.
TC-ERR-002: SchemaValidationError maps to JSON-RPC -32602¶
| Field | Value |
|---|---|
| ID | TC-ERR-002 |
| Priority | P0 |
| SRS Trace | FR-ERR-001 |
Test Steps:
1. Call ErrorMapper.to_jsonrpc_error(SchemaValidationError({"width": "must be integer"})).
2. Assert result.code == -32602.
3. Assert "width" in result.message.
Expected Result: Code -32602 with field name in message.
TC-ERR-003: ACLDeniedError maps to -32001 with sanitized message¶
| Field | Value |
|---|---|
| ID | TC-ERR-003 |
| Priority | P0 |
| SRS Trace | FR-ERR-001, NFR-SEC-003 |
Test Steps:
1. Call ErrorMapper.to_jsonrpc_error(ACLDeniedError(caller="admin", module="secret.module")).
2. Assert result.code == -32001.
3. Assert "admin" NOT in result.message (caller info sanitized).
4. Assert "secret.module" NOT in result.message (module name sanitized).
Expected Result: Code -32001, no internal details in message.
TC-ERR-004: ModuleExecuteError maps to -32603 with sanitized message¶
| Field | Value |
|---|---|
| ID | TC-ERR-004 |
| Priority | P0 |
| SRS Trace | FR-ERR-001 |
Test Steps:
1. Call ErrorMapper.to_jsonrpc_error(ModuleExecuteError("Database connection failed at 192.168.1.1")).
2. Assert result.code == -32603.
3. Assert "192.168.1.1" NOT in result.message.
Expected Result: Code -32603, internal IP not leaked.
TC-ERR-005: Unknown exception type maps to -32603¶
| Field | Value |
|---|---|
| ID | TC-ERR-005 |
| Priority | P0 |
| SRS Trace | FR-ERR-001 |
Test Steps:
1. Call ErrorMapper.to_jsonrpc_error(RuntimeError("unexpected crash")).
2. Assert result.code == -32603.
3. Assert result.message is a generic error message (not the raw exception message).
Expected Result: Code -32603, generic message.
6. Unit Test Cases — Server¶
6.1 TC-SRV: Server Factory¶
TC-SRV-001: Server factory creates ASGI app with all required routes¶
| Field | Value |
|---|---|
| ID | TC-SRV-001 |
| Priority | P0 |
| SRS Trace | FR-SRV-001, FR-SRV-004 |
Test Steps:
1. Create A2AServerFactory().create(registry=stub_registry, executor=stub_executor).
2. Assert ASGI app has route GET /.well-known/agent.json.
3. Assert ASGI app has route POST / for JSON-RPC.
4. Assert ASGI app has route GET /agent/authenticatedExtendedCard (even if 404 without auth config).
Expected Result: All required routes present.
TC-SRV-002: serve() raises ValueError when registry has zero modules¶
| Field | Value |
|---|---|
| ID | TC-SRV-002 |
| Priority | P0 |
| SRS Trace | FR-SRV-001 |
Test Steps:
1. Create StubRegistry({}) (no modules).
2. Call serve(registry).
3. Assert ValueError raised with message mentioning "zero modules" or "no modules".
Expected Result: ValueError raised.
TC-SRV-003: async_serve() returns ASGI app without starting server¶
| Field | Value |
|---|---|
| ID | TC-SRV-003 |
| Priority | P0 |
| SRS Trace | FR-SRV-003 |
Test Steps:
1. Call app = await async_serve(registry).
2. Assert app is an ASGI-callable (has __call__ with scope, receive, send params).
3. Assert no server socket is bound on port 8080.
Expected Result: ASGI app returned, no port bound.
TC-SRV-004: Malformed JSON returns -32700¶
| Field | Value |
|---|---|
| ID | TC-SRV-004 |
| Priority | P0 |
| SRS Trace | FR-SRV-004 |
Test Steps:
1. POST / with body "not valid json {".
2. Assert HTTP status 200 (JSON-RPC errors are returned as HTTP 200).
3. Parse response — assert error.code == -32700.
Expected Result: JSON-RPC error -32700.
TC-SRV-005: Missing jsonrpc field returns -32600¶
| Field | Value |
|---|---|
| ID | TC-SRV-005 |
| Priority | P0 |
| SRS Trace | FR-SRV-004 |
Test Steps:
1. POST / with {"method": "message/send", "id": 1, "params": {}} (no jsonrpc field).
2. Assert error.code == -32600.
Expected Result: JSON-RPC error -32600.
TC-SRV-006: Unknown method returns -32601¶
| Field | Value |
|---|---|
| ID | TC-SRV-006 |
| Priority | P0 |
| SRS Trace | FR-SRV-004 |
Test Steps:
1. POST / with valid JSON-RPC but method = "nonexistent/method".
2. Assert error.code == -32601.
Expected Result: JSON-RPC error -32601.
TC-SRV-007: Request body exceeding 10MB returns HTTP 413¶
| Field | Value |
|---|---|
| ID | TC-SRV-007 |
| Priority | P0 |
| SRS Trace | FR-SRV-004 |
Test Steps:
1. POST / with body of 11MB (padding in params).
2. Assert HTTP status code 413.
Expected Result: HTTP 413 (Payload Too Large).
6.2 TC-RTR: Execution Router¶
TC-RTR-001: Sync message routed to executor.call_async()¶
| Field | Value |
|---|---|
| ID | TC-RTR-001 |
| Priority | P0 |
| SRS Trace | FR-EXE-001 |
Preconditions: StubExecutor with results = {"math.add": {"sum": 7}}.
Test Steps:
1. Create ExecutionRouter(executor=stub_executor).
2. Call result = await router.handle_message_send(skill_id="math.add", inputs={"a": 3, "b": 4}, context=Context()).
3. Assert result == {"sum": 7}.
4. Assert stub_executor.call_async was called with ("math.add", {"a": 3, "b": 4}).
Expected Result: Result {"sum": 7} returned.
TC-RTR-002: Streaming skill routed to executor.stream()¶
| Field | Value |
|---|---|
| ID | TC-RTR-002 |
| Priority | P0 |
| SRS Trace | FR-EXE-001 |
Preconditions: StubExecutor with streaming output [{"chunk": 1}, {"chunk": 2}, {"final": True}].
Test Steps:
1. Call router.handle_message_stream(skill_id="data.stream", inputs={}, context=Context()).
2. Collect all chunks from async generator.
3. Assert chunks == [{"chunk": 1}, {"chunk": 2}, {"final": True}].
Expected Result: All 3 chunks yielded in order.
TC-RTR-003: Executor error propagated as correct exception type¶
| Field | Value |
|---|---|
| ID | TC-RTR-003 |
| Priority | P0 |
| SRS Trace | FR-EXE-001, FR-ERR-001 |
Preconditions: StubExecutor raises ModuleExecuteError("boom") for "fail.module".
Test Steps:
1. Call await router.handle_message_send(skill_id="fail.module", inputs={}, context=Context()).
2. Assert ModuleExecuteError (or wrapped A2AError) is raised.
Expected Result: Exception propagated correctly.
TC-RTR-004: ACLDeniedError sanitized before propagation¶
| Field | Value |
|---|---|
| ID | TC-RTR-004 |
| Priority | P0 |
| SRS Trace | FR-ERR-001, NFR-SEC-003 |
Preconditions: StubExecutor raises ACLDeniedError(caller="user:bob", module="admin.delete").
Test Steps:
1. Call await router.handle_message_send(skill_id="admin.delete", inputs={}, context=Context(identity=...)).
2. Catch A2AError or equivalent.
3. Assert "bob" NOT in error message.
4. Assert "admin.delete" NOT in error message.
Expected Result: Sanitized error, no caller/module info leaked.
TC-RTR-005: Identity injected into apcore Context from AuthMiddleware¶
| Field | Value |
|---|---|
| ID | TC-RTR-005 |
| Priority | P1 |
| SRS Trace | FR-AUT-004 |
Preconditions: Auth middleware sets Identity(id="user-123", type="user", roles=["viewer"]).
Test Steps:
1. Route message with pre-set auth identity.
2. Capture context passed to stub_executor.call_async.
3. Assert context.identity.id == "user-123".
4. Assert context.identity.roles == ["viewer"].
Expected Result: Identity present in Context.
6.3 TC-TSK: Task Manager¶
TC-TSK-001: Task created with submitted state and UUID taskId¶
| Field | Value |
|---|---|
| ID | TC-TSK-001 |
| Priority | P0 |
| SRS Trace | FR-TSK-001 |
Test Steps:
1. Call task = await task_manager.create_task(context_id="ctx-abc", skill_id="math.add").
2. Assert task.status.state == "submitted".
3. Assert task.id matches UUID v4 pattern r'^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$'.
4. Assert task.context_id == "ctx-abc".
Expected Result: Task with submitted state and valid UUID.
TC-TSK-002: Valid state transitions succeed¶
| Field | Value |
|---|---|
| ID | TC-TSK-002 |
| Priority | P0 |
| SRS Trace | FR-TSK-002 |
Preconditions: Task in submitted state.
Test Steps:
1. Transition submitted → working: await task_manager.transition(task_id, "working"). Assert succeeds.
2. Transition working → completed: assert succeeds.
3. Assert task.status.state == "completed".
Expected Result: Both transitions succeed.
TC-TSK-003: Invalid state transition raises StateTransitionError¶
| Field | Value |
|---|---|
| ID | TC-TSK-003 |
| Priority | P0 |
| SRS Trace | FR-TSK-003 |
Preconditions: Task in completed state.
Test Steps:
1. Attempt await task_manager.transition(task_id, "working").
2. Assert StateTransitionError (or equivalent) is raised.
Expected Result: Exception raised on invalid transition.
TC-TSK-004: Completed task cannot transition to any state¶
| Field | Value |
|---|---|
| ID | TC-TSK-004 |
| Priority | P0 |
| SRS Trace | FR-TSK-003 |
Preconditions: Task in completed state.
Test Steps:
1. For each state in ["submitted", "working", "failed", "canceled", "input_required"]:
- Attempt transition.
- Assert exception raised.
Expected Result: All 5 transitions raise exceptions.
TC-TSK-005: tasks/get returns correct task with history¶
| Field | Value |
|---|---|
| ID | TC-TSK-005 |
| Priority | P0 |
| SRS Trace | FR-TSK-005 |
Preconditions: Task task-456 exists in completed state with 2 messages in history.
Test Steps:
1. POST JSON-RPC tasks/get with params = {"id": "task-456", "historyLength": 5}.
2. Assert response result.id == "task-456".
3. Assert result.status.state == "completed".
4. Assert len(result.history) == 2.
Expected Result: Task returned with correct state and history.
TC-TSK-006: tasks/get for unknown task returns -32001¶
| Field | Value |
|---|---|
| ID | TC-TSK-006 |
| Priority | P0 |
| SRS Trace | FR-TSK-005 |
Test Steps:
1. POST JSON-RPC tasks/get with params = {"id": "nonexistent-uuid"}.
2. Assert error.code == -32001.
Expected Result: Error -32001.
TC-TSK-007: tasks/cancel transitions working task to canceled¶
| Field | Value |
|---|---|
| ID | TC-TSK-007 |
| Priority | P0 |
| SRS Trace | FR-TSK-006 |
Preconditions: Task in working state.
Test Steps:
1. POST JSON-RPC tasks/cancel with params = {"id": task_id}.
2. Assert response task status.state == "canceled".
Expected Result: Task canceled.
TC-TSK-008: tasks/cancel on completed task returns -32002¶
| Field | Value |
|---|---|
| ID | TC-TSK-008 |
| Priority | P0 |
| SRS Trace | FR-TSK-006 |
Preconditions: Task in completed state.
Test Steps:
1. POST tasks/cancel.
2. Assert error.code == -32002.
Expected Result: Error -32002 (TaskNotCancelable).
TC-TSK-009: Concurrent task creation generates unique IDs¶
| Field | Value |
|---|---|
| ID | TC-TSK-009 |
| Priority | P0 |
| SRS Trace | FR-TSK-001, NFR-SCA-001 |
Test Steps:
1. Create 100 tasks concurrently using asyncio.gather().
2. Assert all 100 task.id values are unique.
Expected Result: 100 unique task IDs.
6.4 TC-STR: Streaming Handler¶
TC-STR-001: message/stream returns text/event-stream content type¶
| Field | Value |
|---|---|
| ID | TC-STR-001 |
| Priority | P0 |
| SRS Trace | FR-MSG-002 |
Test Steps:
1. POST JSON-RPC message/stream for skill "data.stream".
2. Assert response Content-Type == "text/event-stream".
Expected Result: SSE content type header.
TC-STR-002: First SSE event is TaskStatusUpdateEvent with submitted state¶
| Field | Value |
|---|---|
| ID | TC-STR-002 |
| Priority | P0 |
| SRS Trace | FR-MSG-002 |
Test Steps:
1. Start message/stream for a module.
2. Read first SSE event.
3. Parse event data as JSON-RPC response.
4. Assert result.kind == "status-update".
5. Assert result.status.state == "submitted".
Expected Result: First event has submitted state.
TC-STR-003: TaskArtifactUpdateEvent emitted for each stream chunk¶
| Field | Value |
|---|---|
| ID | TC-STR-003 |
| Priority | P0 |
| SRS Trace | FR-MSG-002 |
Preconditions: Module produces 3 stream chunks: {"part": 1}, {"part": 2}, {"part": 3}.
Test Steps:
1. Start message/stream.
2. Collect all artifact-update events.
3. Assert 3 artifact-update events received.
4. Assert second and third events have append: true.
5. Assert last event has lastChunk: true.
Expected Result: 3 artifact events, correct append/lastChunk flags.
TC-STR-004: Final SSE event is TaskStatusUpdateEvent with completed state¶
| Field | Value |
|---|---|
| ID | TC-STR-004 |
| Priority | P0 |
| SRS Trace | FR-MSG-002 |
Test Steps:
1. Collect all SSE events.
2. Assert last event result.kind == "status-update".
3. Assert result.status.state == "completed".
4. Assert result.final == True.
Expected Result: Last event is completed status update with final: true.
TC-STR-005: tasks/resubscribe reconnects to existing task stream¶
| Field | Value |
|---|---|
| ID | TC-STR-005 |
| Priority | P0 |
| SRS Trace | FR-TSK-007 |
Preconditions: Task task-789 in working state.
Test Steps:
1. POST JSON-RPC tasks/resubscribe with params = {"id": "task-789"}.
2. Assert response Content-Type == "text/event-stream".
3. Assert events received continue from current task state.
Expected Result: SSE stream re-established.
TC-STR-006: First SSE event emitted within 50ms¶
| Field | Value |
|---|---|
| ID | TC-STR-006 |
| Priority | P0 |
| SRS Trace | FR-MSG-002, NFR-PERF-003 |
Test Steps:
1. Record t0 = time.monotonic() before sending message/stream.
2. Read first SSE event.
3. Record t1 = time.monotonic().
4. Assert (t1 - t0) * 1000 < 50.
Expected Result: First event within 50ms.
7. Unit Test Cases — Client & Auth & Storage¶
7.1 TC-CLI: A2A Client¶
TC-CLI-001: Client discovers agent card from well-known URL¶
| Field | Value |
|---|---|
| ID | TC-CLI-001 |
| Priority | P0 |
| SRS Trace | FR-CLI-001 |
Preconditions: Mock HTTP server at http://test-agent.local serving valid Agent Card JSON.
Test Steps:
1. Create A2AClient(base_url="http://test-agent.local").
2. Call card = await client.discover().
3. Assert card.name is non-empty string.
4. Assert len(card.skills) >= 0.
Expected Result: Agent Card returned.
TC-CLI-002: Client caches discovered Agent Card¶
| Field | Value |
|---|---|
| ID | TC-CLI-002 |
| Priority | P0 |
| SRS Trace | FR-CLI-001 |
Test Steps:
1. Call client.discover() twice.
2. Assert mock HTTP server received only 1 GET request (cached on second call).
Expected Result: Second call served from cache.
TC-CLI-003: send_message returns Task on success¶
| Field | Value |
|---|---|
| ID | TC-CLI-003 |
| Priority | P0 |
| SRS Trace | FR-CLI-002 |
Preconditions: Mock A2A server returns Task with status.state == "completed".
Test Steps:
1. Call task = await client.send_message(skill_id="echo.text", inputs={"text": "hello"}).
2. Assert task.status.state == "completed".
3. Assert task.id is a valid UUID string.
Expected Result: Completed Task returned.
TC-CLI-004: stream_message yields SSE events as async generator¶
| Field | Value |
|---|---|
| ID | TC-CLI-004 |
| Priority | P0 |
| SRS Trace | FR-CLI-003 |
Preconditions: Mock server returns 3 SSE events: submitted, artifact-update, completed.
Test Steps:
1. Call events = client.stream_message(skill_id="data.process", inputs={}).
2. Collect all events with async for event in events.
3. Assert 3 events received.
4. Assert events[0].kind == "status-update" and events[0].status.state == "submitted".
5. Assert events[2].status.state == "completed".
Expected Result: 3 events received in correct order.
TC-CLI-005: get_task retrieves task by ID¶
| Field | Value |
|---|---|
| ID | TC-CLI-005 |
| Priority | P0 |
| SRS Trace | FR-CLI-004 |
Test Steps:
1. Call task = await client.get_task("task-uuid-001").
2. Assert task.id == "task-uuid-001".
Expected Result: Task with matching ID.
TC-CLI-006: Connection refused raises ClientConnectionError¶
| Field | Value |
|---|---|
| ID | TC-CLI-006 |
| Priority | P0 |
| SRS Trace | FR-CLI-002 |
Test Steps:
1. Create A2AClient(base_url="http://localhost:1") (no server at port 1).
2. Call await client.send_message(skill_id="any", inputs={}).
3. Assert A2AClientError or httpx.ConnectError raised.
Expected Result: Connection error raised (no hang).
7.2 TC-AUT: Authentication¶
TC-AUT-001: Valid JWT accepted, identity populated¶
| Field | Value |
|---|---|
| ID | TC-AUT-001 |
| Priority | P1 |
| SRS Trace | FR-AUT-001 |
Preconditions: JWT signed with secret = "test-secret-key-abc123", payload {"sub": "user-42", "type": "user", "roles": ["admin"]}.
Test Steps:
1. Generate JWT with above payload.
2. Call auth.authenticate({"authorization": f"Bearer {token}"}).
3. Assert identity.id == "user-42".
4. Assert identity.type == "user".
5. Assert identity.roles == ["admin"].
Expected Result: Identity with sub, type, and roles.
TC-AUT-002: Expired JWT returns None identity¶
| Field | Value |
|---|---|
| ID | TC-AUT-002 |
| Priority | P1 |
| SRS Trace | FR-AUT-001 |
Preconditions: JWT with exp = (now - 3600) (expired 1 hour ago).
Test Steps:
1. Call auth.authenticate({"authorization": f"Bearer {expired_token}"}).
2. Assert result is None.
Expected Result: None returned (no exception).
TC-AUT-003: Invalid JWT signature returns None¶
| Field | Value |
|---|---|
| ID | TC-AUT-003 |
| Priority | P1 |
| SRS Trace | FR-AUT-001 |
Preconditions: JWT signed with secret = "wrong-secret".
Test Steps:
1. Configure JWTAuthenticator(secret="correct-secret").
2. Call auth.authenticate({"authorization": f"Bearer {bad_token}"}).
3. Assert result is None.
Expected Result: None returned.
TC-AUT-004: Missing Authorization header returns None¶
| Field | Value |
|---|---|
| ID | TC-AUT-004 |
| Priority | P1 |
| SRS Trace | FR-AUT-001 |
Test Steps:
1. Call auth.authenticate({}) (empty headers).
2. Assert result is None.
Expected Result: None returned.
TC-AUT-005: AuthMiddleware with require_auth=True rejects unauthenticated requests¶
| Field | Value |
|---|---|
| ID | TC-AUT-005 |
| Priority | P1 |
| SRS Trace | FR-AUT-002 |
Test Steps:
1. Configure server with JWTAuthenticator and require_auth=True.
2. POST / JSON-RPC request without Authorization header.
3. Assert HTTP 401 response.
Expected Result: HTTP 401.
TC-AUT-006: Well-known endpoint exempt from auth even with require_auth=True¶
| Field | Value |
|---|---|
| ID | TC-AUT-006 |
| Priority | P1 |
| SRS Trace | FR-AUT-002, FR-AGC-003 |
Test Steps:
1. Configure server with require_auth=True.
2. GET /.well-known/agent.json without Authorization.
3. Assert HTTP 200.
Expected Result: HTTP 200 (exempt from auth).
7.3 TC-STO: Storage¶
TC-STO-001: InMemoryTaskStore CRUD operations¶
| Field | Value |
|---|---|
| ID | TC-STO-001 |
| Priority | P0 |
| SRS Trace | FR-STR-001 |
Test Steps:
1. store = InMemoryTaskStore().
2. Create task: await store.save(task) — assert no error.
3. Get task: result = await store.get(task.id) — assert result.id == task.id.
4. Update state: modify task.status, await store.save(task).
5. Get again — assert updated state.
6. List: tasks = await store.list(context_id=None) — assert len(tasks) == 1.
Expected Result: All CRUD operations succeed.
TC-STO-002: Get non-existent task returns None¶
| Field | Value |
|---|---|
| ID | TC-STO-002 |
| Priority | P0 |
| SRS Trace | FR-STR-001 |
Test Steps:
1. result = await store.get("does-not-exist").
2. Assert result is None.
Expected Result: None returned.
TC-STO-003: Concurrent task updates are thread-safe¶
| Field | Value |
|---|---|
| ID | TC-STO-003 |
| Priority | P0 |
| SRS Trace | FR-STR-002, NFR-REL-003 |
Test Steps:
1. Create task in store.
2. Launch 50 concurrent coroutines each updating task.status.state to different values.
3. Assert no exception raised.
4. Assert final store.get(task.id) returns a valid task (not corrupted).
Expected Result: No race condition, no corruption.
TC-STO-004: TaskStore protocol compliance check¶
| Field | Value |
|---|---|
| ID | TC-STO-004 |
| Priority | P0 |
| SRS Trace | FR-STR-001 |
Test Steps:
1. Assert InMemoryTaskStore implements all methods of TaskStore protocol: save(), get(), list(), delete().
2. Assert each method signature matches protocol definition (using inspect.signature).
Expected Result: Protocol compliance confirmed.
7.4 TC-PSH: Push Notification Manager¶
TC-PSH-001: pushNotificationConfig/set stores webhook config¶
| Field | Value |
|---|---|
| ID | TC-PSH-001 |
| Priority | P1 |
| SRS Trace | FR-PSH-001 |
Test Steps:
1. POST JSON-RPC tasks/pushNotificationConfig/set with:
{"params": {"id": "task-001", "pushNotificationConfig": {"url": "https://webhook.example.com/notify", "token": "abc-token-123"}}}
PushNotificationConfig.
3. Assert config.url == "https://webhook.example.com/notify".
4. Assert config.token == "abc-token-123".
Expected Result: Config stored and returned.
TC-PSH-002: Push notification delivered via HTTP POST to webhook URL¶
| Field | Value |
|---|---|
| ID | TC-PSH-002 |
| Priority | P1 |
| SRS Trace | FR-PSH-002 |
Preconditions: Mock webhook server at http://localhost:19999/hook. Task with push config registered.
Test Steps:
1. Register push config for task with url = "http://localhost:19999/hook".
2. Transition task to completed.
3. Assert mock webhook server received 1 POST request.
4. Assert request body contains TaskStatusUpdateEvent with state == "completed".
Expected Result: Webhook POST received with correct event.
TC-PSH-003: Push notification includes X-A2A-Notification-Token header¶
| Field | Value |
|---|---|
| ID | TC-PSH-003 |
| Priority | P1 |
| SRS Trace | FR-PSH-002 |
Preconditions: Push config with token = "verify-me-456".
Test Steps:
1. Trigger task state change.
2. Assert webhook request contains header X-A2A-Notification-Token: verify-me-456.
Expected Result: Token header present.
TC-PSH-004: Failed webhook delivery retried with exponential backoff¶
| Field | Value |
|---|---|
| ID | TC-PSH-004 |
| Priority | P1 |
| SRS Trace | FR-PSH-003 |
Preconditions: Mock webhook returns HTTP 500 for first 2 calls, HTTP 200 on 3rd.
Test Steps: 1. Trigger push notification. 2. Assert mock webhook received 3 POST requests total. 3. Assert time between attempts grows (backoff).
Expected Result: 3 attempts with increasing delay.
TC-PSH-005: push_notifications=False returns -32003 when client requests config¶
| Field | Value |
|---|---|
| ID | TC-PSH-005 |
| Priority | P1 |
| SRS Trace | FR-PSH-001 |
Preconditions: Server started with push_notifications=False.
Test Steps:
1. POST tasks/pushNotificationConfig/set.
2. Assert error.code == -32003.
Expected Result: Error -32003 (PushNotificationNotSupported).
8. Unit Test Cases — CLI, Explorer, Ops¶
8.1 TC-CMD: CLI¶
TC-CMD-001: CLI starts server with --extensions-dir argument¶
| Field | Value |
|---|---|
| ID | TC-CMD-001 |
| Priority | P1 |
| SRS Trace | FR-CMD-001 |
Test Steps:
1. Use subprocess or click.testing.CliRunner.
2. Run apcore-a2a --extensions-dir /tmp/test-extensions --port 18001 --dry-run.
3. Assert exit code 0.
4. Assert output contains server configuration summary.
Expected Result: CLI starts without error.
TC-CMD-002: CLI rejects non-existent extensions-dir¶
| Field | Value |
|---|---|
| ID | TC-CMD-002 |
| Priority | P1 |
| SRS Trace | FR-CMD-001 |
Test Steps:
1. Run apcore-a2a --extensions-dir /nonexistent/path.
2. Assert exit code 1.
3. Assert stderr contains path error message.
Expected Result: Exit code 1, error in stderr.
TC-CMD-003: CLI rejects invalid port number¶
| Field | Value |
|---|---|
| ID | TC-CMD-003 |
| Priority | P1 |
| SRS Trace | FR-CMD-001 |
Test Steps:
1. Run apcore-a2a --extensions-dir /valid/path --port 99999.
2. Assert exit code 1.
3. Assert error mentions invalid port.
Expected Result: Exit code 1, port validation error.
8.2 TC-EXP: Explorer¶
TC-EXP-001: Explorer HTML page returns 200¶
| Field | Value |
|---|---|
| ID | TC-EXP-001 |
| Priority | P2 |
| SRS Trace | FR-EXP-001 |
Preconditions: Server with explorer=True.
Test Steps:
1. GET /explorer/.
2. Assert HTTP 200.
3. Assert Content-Type: text/html.
4. Assert body contains <html.
Expected Result: HTML page served.
TC-EXP-002: Explorer skills list returns all skills¶
| Field | Value |
|---|---|
| ID | TC-EXP-002 |
| Priority | P2 |
| SRS Trace | FR-EXP-001 |
Preconditions: Server with 3 modules and explorer=True.
Test Steps:
1. GET /explorer/skills.
2. Assert HTTP 200.
3. Parse JSON — assert list of 3 skills.
4. Each skill has name, description, tags fields.
Expected Result: JSON list of 3 skills.
TC-EXP-003: Explorer is disabled when explorer=False¶
| Field | Value |
|---|---|
| ID | TC-EXP-003 |
| Priority | P2 |
| SRS Trace | FR-EXP-001 |
Preconditions: Server with explorer=False (default).
Test Steps:
1. GET /explorer/.
2. Assert HTTP 404.
Expected Result: HTTP 404.
8.3 TC-OPS: Operations¶
TC-OPS-001: Health endpoint returns status ok¶
| Field | Value |
|---|---|
| ID | TC-OPS-001 |
| Priority | P2 |
| SRS Trace | FR-OPS-001 |
Test Steps:
1. GET /health.
2. Assert HTTP 200.
3. Parse JSON — assert body["status"] == "ok".
4. Assert body["module_count"] is an integer >= 1.
Expected Result: {"status": "ok", "module_count": N}.
TC-OPS-002: Metrics endpoint returns Prometheus format¶
| Field | Value |
|---|---|
| ID | TC-OPS-002 |
| Priority | P2 |
| SRS Trace | FR-OPS-002 |
Test Steps:
1. GET /metrics.
2. Assert HTTP 200.
3. Assert Content-Type: text/plain.
4. Assert body contains apcore_a2a_tasks_total.
Expected Result: Prometheus text format with apcore_a2a_tasks_total metric.
9. Integration Tests¶
TC-INT-001: Full message/send flow — client to server to executor to response¶
| Field | Value |
|---|---|
| ID | TC-INT-001 |
| Priority | P0 |
| SRS Trace | FR-MSG-001, FR-EXE-001, FR-TSK-002 |
Preconditions: ASGI test server with 1 module math.add (executor returns {"sum": inputs["a"] + inputs["b"]}).
Test Steps:
1. POST JSON-RPC message/send:
{
"jsonrpc": "2.0", "id": "req-1", "method": "message/send",
"params": {
"message": {
"role": "user", "messageId": "msg-uuid-001",
"parts": [{"kind": "data", "data": {"a": 3, "b": 4}}],
"metadata": {"skillId": "math.add"}
}
}
}
result.kind == "task".
4. Assert result.status.state == "completed".
5. Assert result.artifacts[0].parts[0].data == {"sum": 7}.
Expected Result: Completed task with {"sum": 7} artifact.
TC-INT-002: Full streaming flow — SSE events received in correct order¶
| Field | Value |
|---|---|
| ID | TC-INT-002 |
| Priority | P0 |
| SRS Trace | FR-MSG-002, FR-TSK-002 |
Preconditions: Module data.count streams 3 chunks: {"n": 1}, {"n": 2}, {"n": 3}.
Test Steps:
1. POST message/stream for data.count.
2. Collect all SSE events.
3. Assert first event: kind == "status-update", state == "submitted".
4. Assert 3 artifact-update events follow.
5. Assert last event: kind == "status-update", state == "completed", final == True.
Expected Result: Events in order: submitted → 3×artifact → completed.
TC-INT-003: Multi-turn conversation — contextId preserved across tasks¶
| Field | Value |
|---|---|
| ID | TC-INT-003 |
| Priority | P0 |
| SRS Trace | FR-CTX-001, FR-CTX-002 |
Test Steps:
1. Send first message with no contextId — record returned task.contextId as ctx-abc.
2. Send second message with contextId = "ctx-abc".
3. Assert second task contextId == "ctx-abc".
4. Call tasks/list filtering by contextId == "ctx-abc".
5. Assert 2 tasks returned.
Expected Result: Both tasks share same contextId; list returns 2 tasks.
TC-INT-004: Authenticated message flow — JWT identity flows to ACL¶
| Field | Value |
|---|---|
| ID | TC-INT-004 |
| Priority | P1 |
| SRS Trace | FR-AUT-003, FR-AUT-004 |
Preconditions: Server with JWT auth; executor checks identity and raises ACLDeniedError for non-admin users.
Test Steps:
1. POST message/send with admin JWT (roles: ["admin"]). Assert task completed.
2. POST message/send with viewer JWT (roles: ["viewer"]). Assert error.code == -32001.
Expected Result: Admin succeeds, viewer gets ACL error.
TC-INT-005: Error propagation — executor error maps to A2A error¶
| Field | Value |
|---|---|
| ID | TC-INT-005 |
| Priority | P0 |
| SRS Trace | FR-ERR-001, FR-TSK-002 |
Preconditions: Executor raises ModuleExecuteError("internal failure") for fail.module.
Test Steps:
1. POST message/send for fail.module.
2. Assert result.status.state == "failed".
3. Assert result.status.message is non-empty but does NOT contain "internal failure" (sanitized).
Expected Result: Failed task with sanitized error message.
TC-INT-006: A2A Client calls server — discover and send¶
| Field | Value |
|---|---|
| ID | TC-INT-006 |
| Priority | P0 |
| SRS Trace | FR-CLI-001, FR-CLI-002 |
Preconditions: Real ASGI test server running at http://testserver.
Test Steps:
1. Create A2AClient(base_url="http://testserver", http_client=test_client).
2. card = await client.discover() — assert card has name and skills.
3. task = await client.send_message(skill_id="math.add", inputs={"a": 10, "b": 5}).
4. Assert task.status.state == "completed".
Expected Result: Discovery and task execution both succeed.
TC-INT-007: input_required state — requires_approval flow¶
| Field | Value |
|---|---|
| ID | TC-INT-007 |
| Priority | P1 |
| SRS Trace | FR-CTX-003 |
Preconditions: Module with requires_approval = True — raises ApprovalPendingError on first call.
Test Steps:
1. Send message/send for approval-required module.
2. Assert task state is input_required.
3. Send follow-up message/send with same contextId and approval confirmation.
4. Assert task transitions to completed.
Expected Result: Task pauses at input_required, resumes after approval.
TC-INT-008: tasks/list with context_id filter returns correct tasks¶
| Field | Value |
|---|---|
| ID | TC-INT-008 |
| Priority | P0 |
| SRS Trace | FR-TSK-005 |
Test Steps:
1. Create 3 tasks with context_id = "ctx-X".
2. Create 2 tasks with context_id = "ctx-Y".
3. POST tasks/list with params = {"contextId": "ctx-X"}.
4. Assert 3 tasks returned.
5. POST tasks/list with params = {"contextId": "ctx-Y"}.
6. Assert 2 tasks returned.
Expected Result: Filtering by contextId works correctly.
10. End-to-End Tests¶
TC-E2E-001: Full server start, discover, send, verify response¶
| Field | Value |
|---|---|
| ID | TC-E2E-001 |
| Priority | P0 |
| SRS Trace | FR-SRV-001, FR-AGC-003, FR-MSG-001 |
Test Steps:
1. Start real uvicorn server on port 19001 in subprocess.
2. GET http://localhost:19001/.well-known/agent.json — assert 200.
3. POST message/send to http://localhost:19001/ — assert completed task.
4. Stop server.
Expected Result: Full lifecycle from server start to task completion.
TC-E2E-002: Full streaming E2E — receive all SSE events¶
| Field | Value |
|---|---|
| ID | TC-E2E-002 |
| Priority | P0 |
| SRS Trace | FR-MSG-002 |
Test Steps:
1. Start real server with streaming module.
2. Open SSE connection via httpx streaming GET.
3. Collect all events until stream closes.
4. Assert events contain: submitted → working → ≥1 artifact → completed.
5. Assert stream closed after completed event.
Expected Result: SSE stream with all expected events, cleanly closed.
TC-E2E-003: Two-agent interaction — server A calls server B¶
| Field | Value |
|---|---|
| ID | TC-E2E-003 |
| Priority | P0 |
| SRS Trace | FR-CLI-001, FR-CLI-002, FR-SRV-001 |
Test Steps:
1. Start server A at port 19002 with module echo.text.
2. Start server B at port 19003 — B has a module that internally uses A2AClient to call A.
3. Send message to server B triggering A→B chain.
4. Assert final result contains echo from server A.
Expected Result: Cross-agent task delegation works end-to-end.
TC-E2E-004: Authenticated E2E — JWT required, invalid rejected¶
| Field | Value |
|---|---|
| ID | TC-E2E-004 |
| Priority | P1 |
| SRS Trace | FR-AUT-002 |
Test Steps:
1. Start server with JWTAuthenticator(secret="e2e-secret-xyz") and require_auth=True.
2. POST message/send without Authorization. Assert HTTP 401.
3. POST with Authorization: Bearer {valid_jwt}. Assert HTTP 200, task completed.
4. POST with Authorization: Bearer {expired_jwt}. Assert HTTP 401.
Expected Result: Valid JWT succeeds, missing/expired rejected.
11. Performance Tests¶
TC-PERF-001: Agent Card generation < 50ms for 100 modules¶
| Field | Value |
|---|---|
| ID | TC-PERF-001 |
| Priority | P0 |
| SRS Trace | NFR-PERF-001 |
Test Steps:
1. Create registry with 100 stub modules.
2. Time AgentCardBuilder(registry, url="http://localhost").build() 10 times.
3. Assert p99 < 50ms.
Expected Result: P99 < 50ms.
TC-PERF-002: message/send latency (excluding executor) < 10ms overhead¶
| Field | Value |
|---|---|
| ID | TC-PERF-002 |
| Priority | P0 |
| SRS Trace | NFR-PERF-002 |
Preconditions: Stub executor that returns immediately (0ms).
Test Steps:
1. Send 100 message/send requests sequentially.
2. Record latency for each.
3. Assert p95 < 10ms (overhead from routing, not executor).
Expected Result: P95 overhead < 10ms.
TC-PERF-003: 100 concurrent tasks without errors¶
| Field | Value |
|---|---|
| ID | TC-PERF-003 |
| Priority | P0 |
| SRS Trace | NFR-SCA-001 |
Test Steps:
1. Launch 100 concurrent message/send requests using asyncio.gather().
2. Assert all 100 complete without error.
3. Assert all 100 tasks reach completed state.
4. Total time < 5 seconds.
Expected Result: 100 concurrent tasks all completed.
TC-PERF-004: SSE streaming throughput — 10 chunks per second sustained¶
| Field | Value |
|---|---|
| ID | TC-PERF-004 |
| Priority | P1 |
| SRS Trace | NFR-PERF-003 |
Preconditions: Module produces 50 chunks with 100ms delay between each.
Test Steps:
1. Start streaming and measure event receipt timestamps.
2. Assert all 50 events received within 50 * 100ms + 2000ms tolerance.
3. Assert no event lost.
Expected Result: All chunks delivered, no loss.
TC-PERF-005: InMemoryTaskStore handles 1000 concurrent operations¶
| Field | Value |
|---|---|
| ID | TC-PERF-005 |
| Priority | P0 |
| SRS Trace | NFR-SCA-002 |
Test Steps:
1. Create InMemoryTaskStore().
2. Concurrently execute: 400 save, 400 get, 200 list operations.
3. Assert all operations complete without exception.
4. Total time < 2 seconds.
Expected Result: No errors, all operations complete.
12. Security Tests¶
TC-SEC-001: Unauthenticated request rejected when require_auth=True¶
| Field | Value |
|---|---|
| ID | TC-SEC-001 |
| Priority | P0 |
| SRS Trace | NFR-SEC-001 |
Test Steps:
1. Server with require_auth=True.
2. POST message/send with no Authorization header.
3. Assert HTTP 401.
4. Assert response body does NOT contain any module names or internal paths.
Expected Result: HTTP 401, no internal info leaked.
TC-SEC-002: Expired JWT token rejected with 401¶
| Field | Value |
|---|---|
| ID | TC-SEC-002 |
| Priority | P0 |
| SRS Trace | NFR-SEC-001, FR-AUT-001 |
Test Steps:
1. Generate JWT expired 1 hour ago.
2. POST with Authorization: Bearer {expired}.
3. Assert HTTP 401.
Expected Result: HTTP 401.
TC-SEC-003: ACL error message does not leak caller or module info¶
| Field | Value |
|---|---|
| ID | TC-SEC-003 |
| Priority | P0 |
| SRS Trace | NFR-SEC-003, FR-ERR-001 |
Preconditions: Executor raises ACLDeniedError(caller="service-account-db-writer", module="admin.users.delete").
Test Steps:
1. POST message/send for admin.users.delete with insufficient permissions.
2. Parse response — assert result.status.state == "failed" or error.code == -32001.
3. Assert "service-account-db-writer" NOT in response body.
4. Assert "admin.users.delete" NOT in response body.
Expected Result: Error returned with no sensitive info.
TC-SEC-004: Internal error details not exposed to client¶
| Field | Value |
|---|---|
| ID | TC-SEC-004 |
| Priority | P0 |
| SRS Trace | NFR-SEC-003 |
Preconditions: Executor raises RuntimeError("Database password: P@ssw0rd123").
Test Steps:
1. POST message/send triggering the error.
2. Assert "P@ssw0rd123" NOT in any field of the response.
3. Assert response has generic error message.
Expected Result: Password not in response.
TC-SEC-005: Push notification token validated on delivery¶
| Field | Value |
|---|---|
| ID | TC-SEC-005 |
| Priority | P1 |
| SRS Trace | NFR-SEC-004 |
Test Steps:
1. Register push config with token = "my-verify-token-abc".
2. Receive webhook POST at mock server.
3. Assert X-A2A-Notification-Token: my-verify-token-abc header present.
4. Simulate client-side: reject webhook if token does not match. Assert security hold enforced.
Expected Result: Token delivered for client-side validation.
TC-SEC-006: Request body > 10MB returns 413 (DoS protection)¶
| Field | Value |
|---|---|
| ID | TC-SEC-006 |
| Priority | P0 |
| SRS Trace | NFR-SEC-005 |
Test Steps:
1. POST / with body of 11 * 1024 * 1024 bytes.
2. Assert HTTP 413.
Expected Result: HTTP 413.
13. Test Data and Fixtures¶
13.1 Stub Module Descriptors¶
# Simple module
SIMPLE_DESCRIPTOR = StubDescriptor(
module_id="math.add",
description="Adds two integers and returns their sum",
tags=["math", "arithmetic"],
input_schema={"type": "object", "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}}, "required": ["a", "b"]},
output_schema={"type": "object", "properties": {"sum": {"type": "integer"}}},
examples=[StubExample(title="Add 3 and 4", inputs={"a": 3, "b": 4})],
annotations=StubAnnotations(readonly=True, idempotent=True)
)
# Streaming module
STREAMING_DESCRIPTOR = StubDescriptor(
module_id="data.stream",
description="Streams a sequence of numbers",
tags=["data", "streaming"],
input_schema={"type": "object", "properties": {"count": {"type": "integer"}}},
output_schema={"type": "object", "properties": {"n": {"type": "integer"}}}
)
# Destructive module
DESTRUCTIVE_DESCRIPTOR = StubDescriptor(
module_id="file.delete",
description="Deletes a file at the given path",
tags=["file", "delete"],
input_schema={"type": "object", "properties": {"path": {"type": "string"}}},
annotations=StubAnnotations(destructive=True, requires_approval=True)
)
# Module with nested schema ($defs)
NESTED_SCHEMA_DESCRIPTOR = StubDescriptor(
module_id="image.resize",
description="Resizes an image to specified dimensions",
tags=["image", "resize"],
input_schema={
"type": "object",
"$defs": {"Dimensions": {"type": "object", "properties": {"width": {"type": "integer"}, "height": {"type": "integer"}}}},
"properties": {"size": {"$ref": "#/$defs/Dimensions"}}
}
)
# Module with empty description (should be excluded)
NO_DESCRIPTION_DESCRIPTOR = StubDescriptor(
module_id="internal.debug",
description="",
tags=[]
)
13.2 JWT Tokens¶
import jwt
import time
SECRET = "test-secret-key-abc123"
# Valid admin token
VALID_ADMIN_JWT = jwt.encode(
{"sub": "admin-user-001", "type": "user", "roles": ["admin"], "exp": int(time.time()) + 3600},
SECRET, algorithm="HS256"
)
# Valid viewer token
VALID_VIEWER_JWT = jwt.encode(
{"sub": "viewer-user-002", "type": "user", "roles": ["viewer"], "exp": int(time.time()) + 3600},
SECRET, algorithm="HS256"
)
# Expired token
EXPIRED_JWT = jwt.encode(
{"sub": "expired-user-003", "exp": int(time.time()) - 3600},
SECRET, algorithm="HS256"
)
# Wrong signature
WRONG_SIG_JWT = jwt.encode(
{"sub": "bad-user-004", "exp": int(time.time()) + 3600},
"wrong-secret", algorithm="HS256"
)
13.3 Sample JSON-RPC Requests¶
# message/send request
MESSAGE_SEND_REQUEST = {
"jsonrpc": "2.0",
"id": "req-001",
"method": "message/send",
"params": {
"message": {
"role": "user",
"messageId": "msg-uuid-001",
"parts": [{"kind": "data", "data": {"a": 3, "b": 4}}],
"metadata": {"skillId": "math.add"}
}
}
}
# tasks/get request
TASKS_GET_REQUEST = lambda task_id: {
"jsonrpc": "2.0",
"id": "req-002",
"method": "tasks/get",
"params": {"id": task_id, "historyLength": 10}
}
# tasks/cancel request
TASKS_CANCEL_REQUEST = lambda task_id: {
"jsonrpc": "2.0",
"id": "req-003",
"method": "tasks/cancel",
"params": {"id": task_id}
}
13.4 Sample Agent Card¶
{
"name": "Test Agent",
"description": "A test apcore agent with 2 skills",
"version": "1.0.0",
"url": "http://localhost:8000",
"defaultInputModes": ["application/json", "text/plain"],
"defaultOutputModes": ["application/json"],
"capabilities": {
"streaming": true,
"pushNotifications": false,
"stateTransitionHistory": false
},
"skills": [
{
"id": "math.add",
"description": "Adds two integers and returns their sum",
"tags": ["math", "arithmetic"],
"examples": [{"name": "Add 3 and 4", "input": "{\"a\": 3, \"b\": 4}"}],
"inputModes": ["application/json"],
"outputModes": ["application/json"],
"extensions": {"apcore": {"annotations": {"readonly": true, "destructive": false, "idempotent": true, "requires_approval": false, "open_world": true}}}
}
]
}
14. Traceability Matrix¶
14.1 Functional Requirements → Test Cases¶
| SRS Requirement | Description | Test Cases |
|---|---|---|
| FR-SRV-001 | serve() launches A2A server | TC-SRV-001, TC-SRV-002, TC-E2E-001 |
| FR-SRV-002 | serve() keyword arguments | TC-SRV-001, TC-SRV-003 |
| FR-SRV-003 | async_serve() returns ASGI app | TC-SRV-003 |
| FR-SRV-004 | JSON-RPC 2.0 dispatch | TC-SRV-004, TC-SRV-005, TC-SRV-006, TC-SRV-007 |
| FR-SRV-005 | Graceful shutdown | TC-SRV-008 (future) |
| FR-AGC-001 | Agent Card from Registry metadata | TC-AGC-001, TC-AGC-002 |
| FR-AGC-002 | Capabilities computed | TC-AGC-003, TC-AGC-004 |
| FR-AGC-003 | Serve Agent Card at /.well-known/agent.json | TC-AGC-005, TC-AGC-006 |
| FR-AGC-004 | Extended Agent Card | TC-AGC-007 |
| FR-AGC-005 | Regenerate Agent Card on changes | TC-AGC-008 |
| FR-SKL-001 | Module → Skill mapping | TC-SKL-001, TC-SKL-002, TC-SKL-003, TC-SKL-004, TC-SKL-005 |
| FR-SKL-002 | Examples mapping | TC-SKL-006 |
| FR-SKL-003 | Input/output modes | TC-SKL-007, TC-SKL-008 |
| FR-SKL-004 | Annotations as extensions | TC-SKL-009, TC-SKL-010 |
| FR-MSG-001 | message/send synchronous | TC-INT-001, TC-INT-005 |
| FR-MSG-002 | message/stream SSE | TC-STR-001–006, TC-INT-002, TC-E2E-002 |
| FR-MSG-003 | Parse Parts to module input | TC-PRT-003–006 |
| FR-MSG-004 | Convert output to Artifacts | TC-PRT-001, TC-PRT-002 |
| FR-TSK-001 | Task created with submitted state | TC-TSK-001, TC-TSK-009 |
| FR-TSK-002 | Valid state transitions | TC-TSK-002, TC-INT-001 |
| FR-TSK-003 | Invalid transitions rejected | TC-TSK-003, TC-TSK-004 |
| FR-TSK-005 | tasks/get | TC-TSK-005, TC-TSK-006 |
| FR-TSK-006 | tasks/cancel | TC-TSK-007, TC-TSK-008 |
| FR-TSK-007 | tasks/resubscribe | TC-STR-005 |
| FR-EXE-001 | Execution routing | TC-RTR-001–005 |
| FR-EXE-003 | Schema converter | TC-SCH-001–005 |
| FR-ERR-001 | Error mapping | TC-ERR-001–005 |
| FR-CLI-001 | Client agent discovery | TC-CLI-001, TC-CLI-002, TC-INT-006 |
| FR-CLI-002 | Client send_message | TC-CLI-003, TC-CLI-006, TC-INT-006 |
| FR-CLI-003 | Client stream_message | TC-CLI-004 |
| FR-CLI-004 | Client get_task | TC-CLI-005 |
| FR-CTX-001 | contextId grouping | TC-INT-003, TC-INT-008 |
| FR-CTX-003 | input_required state | TC-INT-007 |
| FR-PSH-001 | Push notification config CRUD | TC-PSH-001, TC-PSH-005 |
| FR-PSH-002 | Push notification delivery | TC-PSH-002, TC-PSH-003 |
| FR-PSH-003 | Push notification retry | TC-PSH-004 |
| FR-AUT-001 | JWT validation | TC-AUT-001–004 |
| FR-AUT-002 | AuthMiddleware | TC-AUT-005, TC-AUT-006 |
| FR-AUT-004 | Identity bridging | TC-RTR-005, TC-INT-004 |
| FR-STR-001 | TaskStore CRUD | TC-STO-001, TC-STO-002, TC-STO-004 |
| FR-STR-002 | Concurrent access | TC-STO-003 |
| FR-CMD-001 | CLI entry point | TC-CMD-001–003 |
| FR-EXP-001 | Explorer UI | TC-EXP-001–003 |
| FR-OPS-001 | Health endpoint | TC-OPS-001 |
| FR-OPS-002 | Metrics endpoint | TC-OPS-002 |
14.2 Non-Functional Requirements → Test Cases¶
| SRS Requirement | Description | Test Cases |
|---|---|---|
| NFR-PERF-001 | Agent Card generation < 50ms | TC-PERF-001 |
| NFR-PERF-002 | message/send latency overhead < 10ms | TC-PERF-002 |
| NFR-PERF-003 | SSE streaming throughput | TC-PERF-004 |
| NFR-SEC-001 | Auth enforcement | TC-SEC-001, TC-SEC-002, TC-E2E-004 |
| NFR-SEC-003 | Error sanitization | TC-ERR-003, TC-ERR-004, TC-SEC-003, TC-SEC-004 |
| NFR-SEC-004 | Push notification token | TC-PSH-003, TC-SEC-005 |
| NFR-SEC-005 | Request size limit | TC-SRV-007, TC-SEC-006 |
| NFR-REL-003 | Concurrent access safety | TC-STO-003, TC-TSK-009 |
| NFR-SCA-001 | 100 concurrent tasks | TC-PERF-003 |
| NFR-SCA-002 | Storage under load | TC-PERF-005 |
| NFR-MNT-001 | 90% code coverage | Enforced via pytest-cov |
15. Quality Gates¶
15.1 Entry Criteria (Tests can begin)¶
- Component implementation matches tech design interfaces
- All stub fixtures defined in
tests/conftest.py ruff check src/ tests/passes with zero violationsmypy src/passes with zero errors
15.2 Exit Criteria (Release can proceed)¶
| Gate | Requirement | Measurement |
|---|---|---|
| Coverage | Line coverage ≥ 90%, branch ≥ 85% | pytest --cov --cov-fail-under=90 |
| P0 tests | All P0 test cases passing | 0 failures in P0 suite |
| P1 tests | ≤ 5% P1 failures allowed | < 3 P1 failures |
| Performance | TC-PERF-001 and TC-PERF-003 pass | Latency and concurrency targets met |
| Security | All TC-SEC-* pass | 0 security test failures |
| Linting | ruff + mypy clean | 0 violations |
| E2E | All TC-E2E-* pass | 0 E2E failures |
15.3 Test Count Summary¶
| Category | Count |
|---|---|
| TC-AGC (Agent Card) | 8 |
| TC-SKL (Skill Mapper) | 10 |
| TC-SCH (Schema Converter) | 5 |
| TC-PRT (Part Converter) | 6 |
| TC-ERR (Error Mapper) | 5 |
| TC-SRV (Server Factory) | 7 |
| TC-RTR (Execution Router) | 5 |
| TC-TSK (Task Manager) | 9 |
| TC-STR (Streaming Handler) | 6 |
| TC-PSH (Push Notification) | 5 |
| TC-CLI (A2A Client) | 6 |
| TC-AUT (Authentication) | 6 |
| TC-STO (Storage) | 4 |
| TC-CMD (CLI) | 3 |
| TC-EXP (Explorer) | 3 |
| TC-OPS (Operations) | 2 |
| TC-INT (Integration) | 8 |
| TC-E2E (End-to-End) | 4 |
| TC-PERF (Performance) | 5 |
| TC-SEC (Security) | 6 |
| Total | 118 |
16. Risk Assessment¶
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| a2a-sdk API changes during development | Medium | High | Pin SDK version; add integration tests against SDK boundary |
| SSE streaming race conditions | Medium | High | Use asyncio locks; TC-STR tests cover concurrent scenarios |
| InMemoryTaskStore corruption under high concurrency | Low | High | TC-STO-003 stress test; use asyncio.Lock per task |
| Push notification delivery failures in tests | Medium | Medium | Use mock HTTP server; TC-PSH-004 covers retry logic |
| apcore-python API changes breaking adapter | Low | High | Pin version in dev; stub all apcore calls in unit tests |
| Test isolation failures (shared state between tests) | Low | Medium | Use fresh fixtures per test; pytest-asyncio in auto mode |
| CI flakiness in performance tests | Medium | Low | Run perf tests separately; use conservative thresholds |