Domain Instrumentation: Keeping Use Cases Expressive

Instrumentation is essential in any application: logs for debugging, metrics for monitoring, and traces to understand execution flow. But when this instrumentation is mixed directly into use cases, the code becomes difficult to read, maintain, and especially, to test.

In this post, I’ll explore a pattern I regularly apply: abstracting instrumentation behind domain-specific interfaces, keeping use cases expressive and focused on business logic. Additionally, it helps me postpone infrastructure decisions and dramatically improves test quality.

The Problem: Contaminated Use Cases

Consider a use case for canceling orders in an e-commerce system:

import time
import logging
from metrics import Counter, Summary

logger = logging.getLogger(__name__)

orders_cancelled_total = Counter(
    'orders_cancelled_total',
    'Total orders cancelled',
    ['status', 'result']
)

order_cancellation_duration = Summary(
    'order_cancellation_duration_seconds',
    'Time spent cancelling orders',
    ['result']
)

class CancelOrderUseCase:
    def __init__(self, order_repository: OrderRepository):
        self._order_repository = order_repository
    
    def execute(self, order_id: OrderId) -> None:
        logger.info(f"Starting order cancellation for order {order_id}")
        start_time = time.time()
        
        order = self._order_repository.get(order_id)
        if order is None:
            logger.warning(f"Order {order_id} not found")
            duration = time.time() - start_time
            orders_cancelled_total.labels(status='unknown', result='not_found').inc()
            order_cancellation_duration.labels(result='not_found').observe(duration)
            return
        
        logger.info(f"Order {order_id} found, status: {order.status}")
        
        order.cancel()
        self._order_repository.save(order)
        
        duration = time.time() - start_time
        logger.info(f"Order {order_id} cancelled successfully in {duration:.2f}s")
        orders_cancelled_total.labels(
            status=order.status.value,
            result='success'
        ).inc()
        order_cancellation_duration.labels(result='success').observe(duration)

This code works, but has several problems:

Mixed responsibilities: Business logic with logging, metrics, and timing
Hard to read: It’s difficult to distinguish what’s essential from what’s instrumentation
Timing logic repetition: Duration calculation is repeated in multiple places
Complicated testing

The corresponding tests show the complexity:

from unittest.mock import Mock, patch, call
import pytest

class TestCancelOrderUseCase:
    def test_cancel_order_successfully(self):
        order_id = OrderId("123")
        order = Mock(spec=Order)
        order.id = order_id
        order.status = OrderStatus.PENDING
        
        order_repository = Mock(spec=OrderRepository)
        order_repository.get.return_value = order
        
        use_case = CancelOrderUseCase(order_repository)
        
        orders_cancelled_total.clear()
        order_cancellation_duration.clear()
        
        with patch('time.time', side_effect=[100.0, 100.5]):
            with patch('your_module.logger') as mock_logger:
                use_case.execute(order_id)
        
        order.cancel.assert_called_once()
        order_repository.save.assert_called_once_with(order)
        
        mock_logger.info.assert_has_calls([
            call("Starting order cancellation for order 123"),
            call("Order 123 found, status: PENDING"),
            call("Order 123 cancelled successfully in 0.50s"),
        ])
        
        counter_value = orders_cancelled_total.labels(
            status='PENDING',
            result='success'
        )._value.get()
        assert counter_value == 1.0
        
        summary_value = order_cancellation_duration.labels(
            result='success'
        )._sum.get()
        assert summary_value == 0.5

    def test_cancel_order_not_found(self):
        order_id = OrderId("999")
        
        order_repository = Mock(spec=OrderRepository)
        order_repository.get.return_value = None
        
        use_case = CancelOrderUseCase(order_repository)
        
        orders_cancelled_total.clear()
        order_cancellation_duration.clear()
        
        with patch('time.time', side_effect=[100.0, 100.3]):
            with patch('your_module.logger') as mock_logger:
                use_case.execute(order_id)
        
        mock_logger.warning.assert_called_once_with("Order 999 not found")
        
        counter_value = orders_cancelled_total.labels(
            status='unknown',
            result='not_found'
        )._value.get()
        assert counter_value == 1.0
        
        summary_value = order_cancellation_duration.labels(
            result='not_found'
        )._sum.get()
        assert summary_value == 0.3

The problem is evident: tests are verbose, fragile, and obscure what’s really being tested: the business logic. We’re verifying log strings, accessing metrics internals, mocking time.time(), and clearing global metrics.

The Solution: Domain-Specific Interface

The key change is using the ubiquitous language of the domain in the instrumentation interface:

from abc import ABC, abstractmethod

class OrderInstrumentation(ABC):
    @abstractmethod
    def canceling_order(self, order_id: OrderId) -> None: ...
    
    @abstractmethod
    def order_found(self, order: Order) -> None: ...
    
    @abstractmethod
    def order_not_found(self, order_id: OrderId) -> None: ...
    
    @abstractmethod
    def order_cancelled(self, order: Order) -> None: ...

class CancelOrderUseCase:
    def __init__(
        self,
        order_repository: OrderRepository,
        instrumentation: OrderInstrumentation,
    ):
        self._order_repository = order_repository
        self._instrumentation = instrumentation
    
    def execute(self, order_id: OrderId) -> None:
        self._instrumentation.canceling_order(order_id)
        
        order = self._order_repository.get(order_id)
        if order is None:
            self._instrumentation.order_not_found(order_id)
            raise OrderNotFound(order_id)
        
        self._instrumentation.order_found(order)
        
        order.cancel()
        self._order_repository.save(order)
        
        self._instrumentation.order_cancelled(order)

The change is significant:

Timing disappeared from the use case: Instrumentation handles it internally
No magic strings: Everything is expressed in domain terms
No global metrics: Instrumentation is injected
Code is more readable: It reads as a narrative of what’s happening

And most importantly: tests become expressive:

class TestCancelOrderUseCase:
    def test_cancel_order_successfully(self):
        order_id = OrderId("123")
        order = Order(id=order_id, status=OrderStatus.PENDING)
        
        order_repository = Mock(spec=OrderRepository)
        order_repository.get.return_value = order
        
        instrumentation = Mock(spec=OrderInstrumentation)
        
        use_case = CancelOrderUseCase(order_repository, instrumentation)
        
        use_case.execute(order_id)
        
        order_repository.save.assert_called_once()
        assert order.status == OrderStatus.CANCELLED
        
        instrumentation.canceling_order.assert_called_once_with(order_id)
        instrumentation.order_found.assert_called_once_with(order)
        instrumentation.order_cancelled.assert_called_once_with(order)

    def test_cancel_order_not_found(self):
        order_id = OrderId("999")
        
        order_repository = Mock(spec=OrderRepository)
        order_repository.get.return_value = None
        
        instrumentation = Mock(spec=OrderInstrumentation)
        
        use_case = CancelOrderUseCase(order_repository, instrumentation)
        
        with pytest.raises(OrderNotFound):
            use_case.execute(order_id)
        
        instrumentation.canceling_order.assert_called_once_with(order_id)
        instrumentation.order_not_found.assert_called_once_with(order_id)
        instrumentation.order_cancelled.assert_not_called()

The quality leap is clear:

No more time.time() mocks
No more string verifications
No more accessing metrics internals
No more clearing global metrics
Tests speak the domain language
They’re shorter, clearer, and more expressive

Implementation: From Abstraction to Infrastructure

The OrderInstrumentation implementation uses the real infrastructure pieces (logging and metrics):

import logging
from abc import ABC, abstractmethod
from metrics import Counter, Summary

logger = logging.getLogger(__name__)

class Clock(ABC):
    @abstractmethod
    def now(self) -> float: ...

class SystemClock(Clock):
    def now(self) -> float:
        return time.time()

class LoggingOrderInstrumentation(OrderInstrumentation):
    def __init__(
        self,
        counter: Counter,
        summary: Summary,
        clock: Clock,
    ):
        self._counter = counter
        self._summary = summary
        self._clock = clock
        self._operation_start_times: dict[str, float] = {}
    
    def canceling_order(self, order_id: OrderId) -> None:
        logger.info(f"Starting order cancellation for order {order_id}")
        self._operation_start_times[str(order_id)] = self._clock.now()
    
    def order_found(self, order: Order) -> None:
        logger.info(f"Order {order.id} found, status: {order.status}")
    
    def order_not_found(self, order_id: OrderId) -> None:
        duration = self._calculate_duration(order_id)
        
        logger.warning(f"Order {order_id} not found")
        self._counter.labels(status='unknown', result='not_found').inc()
        self._summary.labels(result='not_found').observe(duration)
        
        self._cleanup_timing(order_id)
    
    def order_cancelled(self, order: Order) -> None:
        duration = self._calculate_duration(order.id)
        
        logger.info(f"Order {order.id} cancelled successfully in {duration:.2f}s")
        self._counter.labels(
            status=order.status.value,
            result='success'
        ).inc()
        self._summary.labels(result='success').observe(duration)
        
        self._cleanup_timing(order.id)
    
    def _calculate_duration(self, order_id: OrderId) -> float:
        start_time = self._operation_start_times.get(str(order_id))
        if start_time is None:
            return 0.0
        return self._clock.now() - start_time
    
    def _cleanup_timing(self, order_id: OrderId) -> None:
        self._operation_start_times.pop(str(order_id), None)

Separation of responsibilities in testing:

The implementation has its own unit tests that verify the instrumentation produces the correct telemetry:

class TestLoggingOrderInstrumentation:
    def test_order_instrumentation_records_success_metrics(self): ...
    def test_order_instrumentation_records_not_found_metrics(self): ...
    def test_order_instrumentation_logs_messages(self): ...

These tests do access metrics internals and mock the Clock, but they’re isolated in the infrastructure layer. Use case tests remain clean and expressive.

Benefits and Principles

This pattern brings multiple benefits:

1. Expressive and focused use cases Code speaks in domain terms, without infrastructure distractions.

2. Dramatically better tests

Shorter and more readable
Verify behavior, not implementation
No infrastructure mocks (logger, time, metrics)
Domain language in assertions

3. Separation of Concerns Instrumentation is a separate responsibility, with its own logic and tests.

4. Dependency Inversion Use cases depend on abstractions (OrderInstrumentation), not on technical details (Counter, Summary, Logger).

5. Ubiquitous Language even in instrumentation Methods like order_cancelled(), canceling_order() are part of the domain language, making code more expressive.

6. Timing without contaminating the domain Duration measurement is encapsulated in the implementation; the use case doesn’t need to know about it.

7. Injected metrics, not globals Facilitates testing and avoids shared global state.

Conclusion

The before/after comparison shows the improvement:

Before:

logger.info(f"Starting order cancellation for order {order_id}")
start_time = time.time()
# ... logic
duration = time.time() - start_time
logger.info(f"Order cancelled in {duration:.2f}s")
orders_cancelled_total.labels(status=order.status.value, result='success').inc()
order_cancellation_duration.labels(result='success').observe(duration)

After:

self._instrumentation.canceling_order(order_id)
# ... logic
self._instrumentation.order_cancelled(order)

Instrumentation is essential, but it shouldn’t contaminate use cases. By abstracting it behind domain-specific interfaces, we achieve cleaner, more expressive code that’s especially much easier to test.

Testing is a powerful driver of good design. When tests are complicated, verbose, or fragile, they’re indicating that something in the design needs improvement. In this case, they lead us to discover a pattern that separates concerns, keeps code expressive, and aligned with the domain’s ubiquitous language.

The Problem: Contaminated Use Cases#

The Solution: Domain-Specific Interface#

Implementation: From Abstraction to Infrastructure#

Benefits and Principles#

Conclusion#

The Problem: Contaminated Use Cases

The Solution: Domain-Specific Interface

Implementation: From Abstraction to Infrastructure

Benefits and Principles

Conclusion