Observability is one of the most important aspects of a distributed system. It's the ability to understand the system's state at any given time. You can achieve this in various ways, including logging, metrics, and distributed tracing.
Logging
Orleans uses Microsoft.Extensions.Logging for all silo and client logs. You can use any logging provider compatible with Microsoft.Extensions.Logging. Your app code relies on dependency injection to get an instance of ILogger<TCategoryName> and uses it to log messages. For more information, see Logging in .NET.
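As a minimal sketch of that pattern, the grain below receives an ILogger<TCategoryName> through its constructor and writes one structured message per call. The IHelloGrain interface, the HelloGrain class, and the message template are hypothetical names chosen for illustration; they aren't part of Orleans or of any sample.

```csharp
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
using Orleans;

// Hypothetical grain interface, used only for this illustration.
public interface IHelloGrain : IGrainWithStringKey
{
    Task<string> SayHello(string name);
}

public sealed class HelloGrain : Grain, IHelloGrain
{
    private readonly ILogger<HelloGrain> _logger;

    // The silo's dependency injection container supplies the logger.
    public HelloGrain(ILogger<HelloGrain> logger) => _logger = logger;

    public Task<string> SayHello(string name)
    {
        // Named placeholders keep the values available as structured data
        // for logging providers that support it.
        _logger.LogInformation(
            "Grain {GrainKey} received a greeting request from {Name}",
            this.GetPrimaryKeyString(),
            name);

        return Task.FromResult($"Hello, {name}!");
    }
}
```

Because the logger category is the grain type, providers that filter by category can raise or lower the log level for this grain alone.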
Metrics
Metrics are numerical measurements reported over time. You most often use them to monitor an application's health and generate alerts. For more information, see Metrics in .NET. Orleans uses the System.Diagnostics.Metrics APIs to collect metrics. These metrics are exposed to the OpenTelemetry project, which exports them to various monitoring systems.
To monitor your app without making any code changes, use the dotnet counters .NET diagnostic tool. To monitor the Orleans meter counters for a specific <ProcessName>, use the dotnet counters monitor command as shown:
dotnet counters monitor -n <ProcessName> --counters Microsoft.Orleans
Imagine you're running the Orleans GPS Tracker sample app and monitoring it in a separate terminal with the dotnet counters monitor command. The following output is typical:
Press p to pause, r to resume, q to quit.
Status: Running
[Microsoft.Orleans]
orleans-app-requests-latency-bucket (Count / 1 sec) 0
duration=10000ms 0
duration=1000ms 0
duration=100ms 0
duration=10ms 0
duration=15000ms 0
duration=1500ms 0
duration=1ms 2,530
duration=2000ms 0
duration=200ms 0
duration=2ms 0
duration=400ms 0
duration=4ms 0
duration=5000ms 0
duration=50ms 0
duration=6ms 0
duration=800ms 0
duration=8ms 0
duration=9223372036854775807ms 0
orleans-app-requests-latency-count (Count / 1 sec) 2,530
orleans-app-requests-latency-sum (Count / 1 sec) 0
orleans-catalog-activation-working-set 36
orleans-catalog-activations 38
orleans-consistent-ring-range-percentage-average 100
orleans-consistent-ring-range-percentage-local 100
orleans-consistent-ring-size 1
orleans-directory-cache-size 27
orleans-directory-partition-size 26
orleans-directory-ring-local-portion-average-percentage 100
orleans-directory-ring-local-portion-distance 0
orleans-directory-ring-local-portion-percentage 0
orleans-directory-ring-size 1,295
orleans-gateway-received (Count / 1 sec) 1,291
orleans-gateway-sent (Count / 1 sec) 2,582
orleans-messaging-processing-activation-data 0
orleans-messaging-processing-dispatcher-forwarded (Count / 1 0
orleans-messaging-processing-dispatcher-processed (Count / 1 2,543
Direction=Request,Status=Ok 2,582
orleans-messaging-processing-dispatcher-received (Count / 1 1,271
Context=Grain,Direction=Request 1,291
Context=None,Direction=Request 1,291
orleans-messaging-processing-ima-enqueued (Count / 1 sec) 5,113
For more information, see Investigate performance counters (dotnet-counters).
Orleans meters
Orleans uses the System.Diagnostics.Metrics APIs to collect metrics. Orleans categorizes each meter into domain-centric concerns, such as networking, messaging, gateway, etc. The following subsections describe the meters Orleans uses.
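Because these meters are ordinary System.Diagnostics.Metrics instruments, they can also be observed in-process with a MeterListener. The following is a minimal sketch, not something Orleans ships: the MeterSnapshot class and its members are hypothetical names, and it keeps only the most recent measurement per instrument name, so measurements that differ only by tags overwrite each other.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics.Metrics;

// Hypothetical helper: remembers the most recent measurement of each
// instrument published by one meter.
public sealed class MeterSnapshot : IDisposable
{
    private readonly MeterListener _listener = new();

    // Keyed by instrument name. For Counter<T>, the value is the last
    // increment passed to Add, not a running total.
    public ConcurrentDictionary<string, double> LastMeasurement { get; } = new();

    public MeterSnapshot(string meterName = "Microsoft.Orleans")
    {
        // Subscribe only to instruments that belong to the named meter.
        _listener.InstrumentPublished = (instrument, listener) =>
        {
            if (instrument.Meter.Name == meterName)
            {
                listener.EnableMeasurementEvents(instrument);
            }
        };

        // Instruments are generic, so register one callback per numeric type.
        _listener.SetMeasurementEventCallback<int>(
            (instrument, measurement, tags, state) => LastMeasurement[instrument.Name] = measurement);
        _listener.SetMeasurementEventCallback<long>(
            (instrument, measurement, tags, state) => LastMeasurement[instrument.Name] = measurement);
        _listener.SetMeasurementEventCallback<double>(
            (instrument, measurement, tags, state) => LastMeasurement[instrument.Name] = measurement);

        _listener.Start();
    }

    // Observable instruments report only when polled.
    public void Poll() => _listener.RecordObservableInstruments();

    public void Dispose() => _listener.Dispose();
}
```

Counters and histograms push measurements as they're recorded, while observable counters and gauges report only when Poll is called, which is why the helper exposes both paths.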
Networking
The following table shows a collection of networking meters used to monitor the Orleans networking layer.
Meter name | Type | Description |
---|---|---|
orleans-networking-sockets-closed | Counter<T> | A count of sockets that have closed. |
orleans-networking-sockets-opened | Counter<T> | A count of sockets that have opened. |
Messaging
The following table shows a collection of messaging meters used to monitor the Orleans messaging layer.
Meter name | Type | Description |
---|---|---|
orleans-messaging-sent-messages-size | Histogram<T> | A histogram representing the size of messages in bytes that have been sent. |
orleans-messaging-received-messages-size | Histogram<T> | A histogram representing the size of messages in bytes that have been received. |
orleans-messaging-sent-header-size | ObservableCounter<T> | An observable counter representing the number of header bytes sent. |
orleans-messaging-received-header-size | ObservableCounter<T> | An observable counter representing the number of header bytes received. |
orleans-messaging-sent-failed | Counter<T> | A count of failed sent messages. |
orleans-messaging-sent-dropped | Counter<T> | A count of dropped sent messages. |
orleans-messaging-processing-dispatcher-received | ObservableCounter<T> | An observable counter representing the number of messages received by the dispatcher. |
orleans-messaging-processing-dispatcher-processed | ObservableCounter<T> | An observable counter representing the number of messages processed by the dispatcher. |
orleans-messaging-processing-dispatcher-forwarded | ObservableCounter<T> | An observable counter representing the number of messages forwarded by the dispatcher. |
orleans-messaging-processing-ima-received | ObservableCounter<T> | An observable counter representing the number of incoming messages received. |
orleans-messaging-processing-ima-enqueued | ObservableCounter<T> | An observable counter representing the number of incoming messages enqueued. |
orleans-messaging-processing-activation-data | ObservableGauge<T> | An observable gauge representing all of the processing activation data. |
orleans-messaging-pings-sent | Counter<T> | A count of pings sent. |
orleans-messaging-pings-received | Counter<T> | A count of pings received. |
orleans-messaging-pings-reply-received | Counter<T> | A count of ping replies received. |
orleans-messaging-pings-reply-missed | Counter<T> | A count of ping replies missed. |
orleans-messaging-expired | Counter<T> | A count of messages that have expired. |
orleans-messaging-rejected | Counter<T> | A count of messages that have been rejected. |
orleans-messaging-rerouted | Counter<T> | A count of messages that have been rerouted. |
orleans-messaging-sent-local | ObservableCounter<T> | An observable counter representing the number of local messages sent. |
Gateway
The following table shows a collection of gateway meters used to monitor the Orleans gateway layer.
Meter name | Type | Description |
---|---|---|
orleans-gateway-connected-clients | UpDownCounter<T> | An up/down counter representing the number of connected clients. |
orleans-gateway-sent | Counter<T> | A count of gateway messages sent. |
orleans-gateway-received | Counter<T> | A count of gateway messages received. |
orleans-gateway-load-shedding | Counter<T> | A count of gateway (load shedding) messages that have been rejected due to the gateway being overloaded. |
Runtime
The following table shows a collection of runtime meters used to monitor the Orleans runtime layer.
Meter name | Type | Description |
---|---|---|
orleans-scheduler-long-running-turns | Counter<T> | A count of long running turns within the scheduler. |
orleans-runtime-total-physical-memory | ObservableCounter<T> | An observable counter representing the total physical memory (in MB) of the Orleans runtime. |
orleans-runtime-available-memory | ObservableCounter<T> | An observable counter representing the available memory (in MB) for the Orleans runtime. |
Catalog
The following table shows a collection of catalog meters used to monitor the Orleans catalog layer.
Meter name | Type | Description |
---|---|---|
orleans-catalog-activations | ObservableGauge<T> | An observable gauge representing the number of catalog activations. |
orleans-catalog-activation-working-set | ObservableGauge<T> | An observable gauge representing the number of activations within the working set. |
orleans-catalog-activation-created | Counter<T> | A count of created activations. |
orleans-catalog-activation-destroyed | Counter<T> | A count of destroyed activations. |
orleans-catalog-activation-failed-to-activate | Counter<T> | A count of activations that failed to activate. |
orleans-catalog-activation-collections | Counter<T> | A count of idle activation collections. |
orleans-catalog-activation-shutdown | Counter<T> | A count of shutdown activations. |
orleans-catalog-activation-non-existent | Counter<T> | A count of non-existent activations. |
orleans-catalog-activation-concurrent-registration-attempts | Counter<T> | A count of concurrent activation registration attempts. |
Directory
The following table shows a collection of directory meters used to monitor the Orleans directory layer.
Meter name | Type | Description |
---|---|---|
orleans-directory-lookups-local-issued | Counter<T> | A count of local lookups issued. |
orleans-directory-lookups-local-successes | Counter<T> | A count of local successful lookups. |
orleans-directory-lookups-full-issued | Counter<T> | A count of full directory lookups issued. |
orleans-directory-lookups-remote-sent | Counter<T> | A count of remote directory lookups sent. |
orleans-directory-lookups-remote-received | Counter<T> | A count of remote directory lookups received. |
orleans-directory-lookups-local-directory-issued | Counter<T> | A count of local directory lookups issued. |
orleans-directory-lookups-local-directory-successes | Counter<T> | A count of local directory successful lookups. |
orleans-directory-lookups-cache-issued | Counter<T> | A count of cached lookups issued. |
orleans-directory-lookups-cache-successes | Counter<T> | A count of cached successful lookups. |
orleans-directory-validations-cache-sent | Counter<T> | A count of directory cache validations sent. |
orleans-directory-validations-cache-received | Counter<T> | A count of directory cache validations received. |
orleans-directory-partition-size | ObservableGauge<T> | An observable gauge representing the directory partition size. |
orleans-directory-cache-size | ObservableGauge<T> | An observable gauge representing the directory cache size. |
orleans-directory-ring-size | ObservableGauge<T> | An observable gauge representing the directory ring size. |
orleans-directory-ring-local-portion-distance | ObservableGauge<T> | An observable gauge representing the ring range owned by the local directory partition. |
orleans-directory-ring-local-portion-percentage | ObservableGauge<T> | An observable gauge representing the ring range owned by the local directory, represented as a percentage of the total range. |
orleans-directory-ring-local-portion-average-percentage | ObservableGauge<T> | An observable gauge representing the average percentage of the directory ring range owned by each silo, giving a representation of how balanced directory ownership is. |
orleans-directory-registrations-single-act-issued | Counter<T> | A count of directory single activation registrations issued. |
orleans-directory-registrations-single-act-local | Counter<T> | A count of directory single activation registrations handled by the local directory partition. |
orleans-directory-registrations-single-act-remote-sent | Counter<T> | A count of directory single activation registrations sent to a remote directory partition. |
orleans-directory-registrations-single-act-remote-received | Counter<T> | A count of directory single activation registrations received from remote hosts. |
orleans-directory-unregistrations-issued | Counter<T> | A count of directory deregistrations issued. |
orleans-directory-unregistrations-local | Counter<T> | A count of directory deregistrations handled by the local directory partition. |
orleans-directory-unregistrations-remote-sent | Counter<T> | A count of directory deregistrations sent to remote directory partitions. |
orleans-directory-unregistrations-remote-received | Counter<T> | A count of directory deregistrations received from remote hosts. |
orleans-directory-unregistrations-many-issued | Counter<T> | A count of directory multi-activation deregistrations issued. |
orleans-directory-unregistrations-many-remote-sent | Counter<T> | A count of directory multi-activation deregistrations sent to remote directory partitions. |
orleans-directory-unregistrations-many-remote-received | Counter<T> | A count of directory multi-activation deregistrations received from remote hosts. |
Consistent ring
The following table shows a collection of consistent ring meters used to monitor the Orleans consistent ring layer.
Meter name | Type | Description |
---|---|---|
orleans-consistent-ring-size | ObservableGauge<T> | An observable gauge representing the consistent ring size. |
orleans-consistent-ring-range-percentage-local | ObservableGauge<T> | An observable gauge representing the consistent ring local percentage. |
orleans-consistent-ring-range-percentage-average | ObservableGauge<T> | An observable gauge representing the consistent ring average percentage. |
Watchdog
The following table shows a collection of watchdog meters used to monitor the Orleans watchdog layer.
Meter name | Type | Description |
---|---|---|
orleans-watchdog-health-checks | Counter<T> | A count of watchdog health checks. |
orleans-watchdog-health-checks-failed | Counter<T> | A count of failed watchdog health checks. |
Client
The following table shows a collection of client meters used to monitor the Orleans client layer.
Meter name | Type | Description |
---|---|---|
orleans-client-connected-gateways | ObservableGauge<T> | An observable gauge representing the number of connected gateway clients. |
Miscellaneous
The following table shows a collection of miscellaneous meters used to monitor various layers.
Meter name | Type | Description |
---|---|---|
orleans-grains | Counter<T> | A count representing the number of grains. |
orleans-system-targets | Counter<T> | A count representing the number of system targets. |
App requests
The following table shows a collection of app request meters used to monitor the Orleans app request layer.
Meter name | Type | Description |
---|---|---|
orleans-app-requests-latency | ObservableCounter<T> | An observable counter representing app request latency. |
orleans-app-requests-timedout | ObservableCounter<T> | An observable counter representing app requests that have timed out. |
Reminders
The following table shows a collection of reminder meters used to monitor the Orleans reminder layer.
Meter name | Type | Description |
---|---|---|
orleans-reminders-tardiness | Histogram<T> | A histogram representing the number of seconds a reminder is tardy. |
orleans-reminders-active | ObservableGauge<T> | An observable gauge representing the number of active reminders. |
orleans-reminders-ticks-delivered | Counter<T> | A count representing the number of reminder ticks that have been delivered. |
Storage
The following table shows a collection of storage meters used to monitor the Orleans storage layer.
Meter name | Type | Description |
---|---|---|
orleans-storage-read-errors | Counter<T> | A count representing the number of storage read errors. |
orleans-storage-write-errors | Counter<T> | A count representing the number of storage write errors. |
orleans-storage-clear-errors | Counter<T> | A count representing the number of storage clear errors. |
orleans-storage-read-latency | Histogram<T> | A histogram representing the storage read latency in milliseconds. |
orleans-storage-write-latency | Histogram<T> | A histogram representing the storage write latency in milliseconds. |
orleans-storage-clear-latency | Histogram<T> | A histogram representing the storage clear latency in milliseconds. |
Streams
The following table shows a collection of stream meters used to monitor the Orleans stream layer.
Meter name | Type | Description |
---|---|---|
orleans-streams-pubsub-producers-added | Counter<T> | A count of streaming pubsub producers added. |
orleans-streams-pubsub-producers-removed | Counter<T> | A count of streaming pubsub producers removed. |
orleans-streams-pubsub-producers | Counter<T> | A count of streaming pubsub producers. |
orleans-streams-pubsub-consumers-added | Counter<T> | A count of streaming pubsub consumers added. |
orleans-streams-pubsub-consumers-removed | Counter<T> | A count of streaming pubsub consumers removed. |
orleans-streams-pubsub-consumers | Counter<T> | A count of streaming pubsub consumers. |
orleans-streams-persistent-stream-pulling-agents | ObservableGauge<T> | An observable gauge representing the number of persistent stream pulling agents. |
orleans-streams-persistent-stream-messages-read | Counter<T> | A count of persistent stream messages read. |
orleans-streams-persistent-stream-messages-sent | Counter<T> | A count of persistent stream messages sent. |
orleans-streams-persistent-stream-pubsub-cache-size | ObservableGauge<T> | An observable gauge representing the persistent stream pubsub cache size. |
orleans-streams-queue-initialization-failures | Counter<T> | A count of stream queue initialization failures. |
orleans-streams-queue-initialization-duration | Counter<T> | A count of stream queue initialization occurrences. |
orleans-streams-queue-initialization-exceptions | Counter<T> | A count of stream queue initialization exceptions. |
orleans-streams-queue-read-failures | Counter<T> | A count of stream queue read failures. |
orleans-streams-queue-read-duration | Counter<T> | A count of stream queue read occurrences. |
orleans-streams-queue-read-exceptions | Counter<T> | A count of stream queue read exceptions. |
orleans-streams-queue-shutdown-failures | Counter<T> | A count of stream queue shutdown failures. |
orleans-streams-queue-shutdown-duration | Counter<T> | A count of stream queue shutdown occurrences. |
orleans-streams-queue-shutdown-exceptions | Counter<T> | A count of stream queue shutdown exceptions. |
orleans-streams-queue-messages-received | ObservableCounter<T> | An observable counter representing the number of stream queue messages received. |
orleans-streams-queue-oldest-message-enqueue-age | ObservableGauge<T> | An observable gauge representing the age of the oldest enqueued message. |
orleans-streams-queue-newest-message-enqueue-age | ObservableGauge<T> | An observable gauge representing the age of the newest enqueued message. |
orleans-streams-block-pool-total-memory | ObservableCounter<T> | An observable counter representing the stream block pool total memory in bytes. |
orleans-streams-block-pool-available-memory | ObservableCounter<T> | An observable counter representing the stream block pool available memory in bytes. |
orleans-streams-block-pool-claimed-memory | ObservableCounter<T> | An observable counter representing the stream block pool claimed memory in bytes. |
orleans-streams-block-pool-released-memory | ObservableCounter<T> | An observable counter representing the stream block pool released memory in bytes. |
orleans-streams-block-pool-allocated-memory | ObservableCounter<T> | An observable counter representing the stream block pool allocated memory in bytes. |
orleans-streams-queue-cache-size | ObservableCounter<T> | An observable counter representing the stream queue cache size in bytes. |
orleans-streams-queue-cache-length | ObservableCounter<T> | An observable counter representing the stream queue length. |
orleans-streams-queue-cache-messages-added | ObservableCounter<T> | An observable counter representing the stream queue messages added. |
orleans-streams-queue-cache-messages-purged | ObservableCounter<T> | An observable counter representing the stream queue messages purged. |
orleans-streams-queue-cache-memory-allocated | ObservableCounter<T> | An observable counter representing the stream queue memory allocated. |
orleans-streams-queue-cache-memory-released | ObservableCounter<T> | An observable counter representing the stream queue memory released. |
orleans-streams-queue-cache-oldest-to-newest-duration | ObservableGauge<T> | An observable gauge representing the duration from the oldest to the newest message in the stream queue cache. |
orleans-streams-queue-cache-oldest-age | ObservableGauge<T> | An observable gauge representing the age of the oldest cached message. |
orleans-streams-queue-cache-pressure | ObservableGauge<T> | An observable gauge representing the pressure on the stream queue cache. |
orleans-streams-queue-cache-under-pressure | ObservableGauge<T> | An observable gauge representing whether the stream queue cache is under pressure. |
orleans-streams-queue-cache-pressure-contribution-count | ObservableCounter<T> | An observable counter representing the stream queue cache pressure contributions. |
Transactions
The following table shows a collection of transaction meters used to monitor the Orleans transaction layer.
Meter name | Type | Description |
---|---|---|
orleans-transactions-started | ObservableCounter<T> | An observable counter representing the number of started transactions. |
orleans-transactions-successful | ObservableCounter<T> | An observable counter representing the number of successful transactions. |
orleans-transactions-failed | ObservableCounter<T> | An observable counter representing the number of failed transactions. |
orleans-transactions-throttled | ObservableCounter<T> | An observable counter representing the number of throttled transactions. |
Prometheus
Various third-party metrics providers are available for use with Orleans. One popular example is Prometheus, which you can use to collect metrics from your app with OpenTelemetry.
To use OpenTelemetry and Prometheus with Orleans, call the following IServiceCollection extension method:
builder.Services.AddOpenTelemetry()
    .WithMetrics(metrics =>
    {
        metrics
            .AddPrometheusExporter()
            .AddMeter("Microsoft.Orleans");
    });
Important
Both the OpenTelemetry.Exporter.Prometheus and OpenTelemetry.Exporter.Prometheus.AspNetCore NuGet packages are currently in preview as release candidates. They're not recommended for production use.
The AddPrometheusExporter method ensures the PrometheusExporter is added to the builder. Orleans uses a Meter named "Microsoft.Orleans" to create Counter<T> instances for many Orleans-specific metrics. Use the AddMeter method to specify the name of the meter to subscribe to, in this case, "Microsoft.Orleans".
After configuring the exporter and building your app, call MapPrometheusScrapingEndpoint on the IEndpointRouteBuilder (the app instance) to expose the metrics to Prometheus. For example:
WebApplication app = builder.Build();
app.MapPrometheusScrapingEndpoint();
app.Run();
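Taken together, the two snippets above could sit in a single Program.cs like the following. This is a sketch under stated assumptions rather than a recommended setup: the single-silo UseLocalhostClustering call is an assumption for local development, and the using directives assume the OpenTelemetry.Extensions.Hosting and OpenTelemetry.Exporter.Prometheus.AspNetCore packages are referenced alongside the Orleans server package.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using OpenTelemetry;
using OpenTelemetry.Metrics;
using Orleans.Hosting;

WebApplicationBuilder builder = WebApplication.CreateBuilder(args);

// Assumption: a single local silo, suitable for development only.
builder.Host.UseOrleans(siloBuilder => siloBuilder.UseLocalhostClustering());

// Subscribe to the Orleans meter and register the Prometheus exporter.
builder.Services.AddOpenTelemetry()
    .WithMetrics(metrics =>
    {
        metrics
            .AddPrometheusExporter()
            .AddMeter("Microsoft.Orleans");
    });

WebApplication app = builder.Build();

// Exposes the scraping endpoint at the exporter's default path, /metrics.
app.MapPrometheusScrapingEndpoint();

app.Run();
```

With the app running, a Prometheus scrape job pointed at the app's /metrics path should start collecting the orleans-* series listed in the tables earlier.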
Distributed tracing
Distributed tracing is a set of tools and practices for monitoring and troubleshooting distributed applications. It's a key component of observability and a critical tool for understanding your app's behavior. Orleans supports distributed tracing with OpenTelemetry.
Regardless of the distributed tracing exporter you choose, call:
- AddActivityPropagation(ISiloBuilder), which enables distributed tracing for the silo.
- AddActivityPropagation(IClientBuilder), which enables distributed tracing for the client.
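The two calls might look like the following sketch. The TracingHosts class and its method names are hypothetical, and UseLocalhostClustering is an assumption for local development; in practice the silo and the client usually run in separate processes, each with its own host.

```csharp
using Microsoft.Extensions.Hosting;
using Orleans;
using Orleans.Hosting;

// Hypothetical helper that builds one host per role.
public static class TracingHosts
{
    // Silo host: propagates the current Activity across grain calls.
    public static IHost BuildSilo(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .UseOrleans(siloBuilder =>
            {
                siloBuilder.UseLocalhostClustering(); // assumption: local development
                siloBuilder.AddActivityPropagation();
            })
            .Build();

    // Client host: propagates the current Activity on outgoing grain calls.
    public static IHost BuildClient(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .UseOrleansClient(clientBuilder =>
            {
                clientBuilder.UseLocalhostClustering(); // assumption: local development
                clientBuilder.AddActivityPropagation();
            })
            .Build();
}
```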
Referring back to the Orleans GPS Tracker sample app, you can use the Zipkin distributed tracing system to monitor the app by updating Program.cs. To use OpenTelemetry and Zipkin with Orleans, call the following IServiceCollection extension method:
builder.Services.AddOpenTelemetry()
    .WithTracing(tracing =>
    {
        // Set a service name
        tracing.SetResourceBuilder(
            ResourceBuilder.CreateDefault()
                .AddService(serviceName: "GPSTracker", serviceVersion: "1.0"));

        tracing.AddSource("Microsoft.Orleans.Runtime");
        tracing.AddSource("Microsoft.Orleans.Application");

        tracing.AddZipkinExporter(zipkin =>
        {
            zipkin.Endpoint = new Uri("http://localhost:9411/api/v2/spans");
        });
    });
Important
The OpenTelemetry.Exporter.Zipkin NuGet package is currently in preview as a release candidate. It is not recommended for production use.
You can view the resulting Zipkin trace in the Jaeger UI (an alternative to Zipkin that uses the same data format).
For more information, see Distributed tracing.
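Application code can contribute its own spans by creating activities from an ActivitySource and registering that source name with AddSource, next to the Orleans sources. The sketch below is illustrative only: the "GPSTracker.Devices" source name, the DeviceTelemetry class, the RecordUpdate method, and the tag names are all hypothetical and don't appear in the sample app.

```csharp
using System.Diagnostics;

// Hypothetical application-level tracing helper.
public static class DeviceTelemetry
{
    // Hypothetical source name. It would need a matching
    // tracing.AddSource("GPSTracker.Devices") call to be exported.
    public static readonly ActivitySource Source = new("GPSTracker.Devices");

    public static void RecordUpdate(string deviceId, double latitude, double longitude)
    {
        // StartActivity returns null when no listener samples this source,
        // so every use of the activity is null-conditional.
        using Activity? activity = Source.StartActivity("ProcessDeviceUpdate");

        activity?.SetTag("device.id", deviceId);
        activity?.SetTag("device.latitude", latitude);
        activity?.SetTag("device.longitude", longitude);

        // Application work would run here, inside the span.
    }
}
```

When activity propagation is enabled, a span started this way inside a grain method is expected to appear as a child of the span for the incoming grain call.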
Orleans outputs its runtime statistics and metrics through the ITelemetryConsumer interface. Your application can register one or more telemetry consumers for its silos and clients to receive the statistics and metrics that the Orleans runtime periodically publishes. These can be consumers for popular telemetry analytics solutions or custom ones for any other destination and purpose. Three telemetry consumers are currently included in the Orleans codebase. They're released as separate NuGet packages:
- Microsoft.Orleans.OrleansTelemetryConsumers.AI: For publishing to Azure Application Insights.
- Microsoft.Orleans.OrleansTelemetryConsumers.Counters: For publishing to Windows performance counters. The Orleans runtime continually updates many of them. The CounterControl.exe tool, included in the Microsoft.Orleans.CounterControl NuGet package, helps register necessary performance counter categories. It must run with elevated privileges. Monitor the performance counters using any standard monitoring tools.
- Microsoft.Orleans.OrleansTelemetryConsumers.NewRelic: For publishing to New Relic.
To configure your silo and client to use telemetry consumers, the silo configuration code looks like this:
var siloHostBuilder = new HostBuilder()
    .UseOrleans(c =>
    {
        c.AddApplicationInsightsTelemetryConsumer("INSTRUMENTATION_KEY");
    });
The client configuration code looks like this:
var clientBuilder = new ClientBuilder();
clientBuilder.AddApplicationInsightsTelemetryConsumer("INSTRUMENTATION_KEY");
To use a custom-defined TelemetryConfiguration (which might have TelemetryProcessors, TelemetrySinks, etc.), the silo configuration code looks like this:
var telemetryConfiguration = TelemetryConfiguration.CreateDefault();
var siloHostBuilder = new HostBuilder()
    .UseOrleans(c =>
    {
        c.AddApplicationInsightsTelemetryConsumer(telemetryConfiguration);
    });
The client configuration code looks like this:
var clientBuilder = new ClientBuilder();
var telemetryConfiguration = TelemetryConfiguration.CreateDefault();
clientBuilder.AddApplicationInsightsTelemetryConsumer(telemetryConfiguration);