Monitoring CoreCLR


#1

Hello,

Currently, most CLR monitoring is available through numerous performance counters. As Linux does not have performance counters, will there be a new facility, similar to JMX, with which to monitor the runtime?


#2

Hi @ProTip,

It turns out that CoreCLR does not support performance counters, as they require integration with the OS. In general, we’re moving away from performance counters and are using an eventing model instead. This gives us the ability to get detailed information if we need it, or just stick to lightweight information through the right set of events. On Windows we use ETW, and we’re currently looking to use another event-based mechanism on Linux. No decision on what that mechanism is yet, but feel free to contribute suggestions if you have experience in this area.

Thanks.
-Brian


#3

Hello @brianrob,

Have there been any candidates? I’m not aware of anything close to ETW on Linux, and certainly nothing as mainstream and ubiquitous. Some may compare it to dtrace (which is still a bit obscure on Linux); however, whereas ETW is cooperative, dtrace actually injects hooks into running programs.

I feel it would be strange indeed for the CLR to rely on a particular eventing system on Linux. To me, the more “Unix” way would be to expose this information through a socket with a lightweight communication protocol for controlling the events exposed and their rates (?). There is even a precedent for this style of communication with kernel modules via netlink: http://man7.org/linux/man-pages/man7/netlink.7.html . If I had a wish, or suggestion, it might be to place the eventing system behind a pluggable adapter that could then expose the functionality to ETW, a Unix socket, or even an HTTP/websocket endpoint. This would allow the community to suss out the details of integrating into other systems, and of course I believe an initial sockets implementation would be tops :smile:

-Cheers
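
To make the "pluggable adapter" idea above concrete, here is a minimal, language-neutral sketch in Python. It is not CoreCLR code: `EventHub`, `socket_sink`, the provider names, and the newline-delimited JSON wire format are all assumptions made up for illustration. It only shows the shape of the idea: the runtime emits events to a hub, and each backend (a socket, an in-process list, and so on) is a sink plugged into it.

```python
import json
import socket
from typing import Callable, Dict, List, Set

# A sink is any callable that accepts one event dictionary.
Sink = Callable[[Dict], None]


class EventHub:
    """Hypothetical dispatcher that fans events out to pluggable sinks."""

    def __init__(self) -> None:
        self._sinks: List[Sink] = []
        self._enabled: Set[str] = set()

    def add_sink(self, sink: Sink) -> None:
        self._sinks.append(sink)

    def enable(self, provider: str) -> None:
        # Only events from enabled providers are forwarded.
        self._enabled.add(provider)

    def emit(self, provider: str, name: str, **payload) -> None:
        if provider not in self._enabled:
            return
        event = {"provider": provider, "name": name, "payload": payload}
        for sink in self._sinks:
            sink(event)


def socket_sink(conn: socket.socket) -> Sink:
    """Build a sink that writes each event as one JSON line to a socket."""
    def send(event: Dict) -> None:
        conn.sendall((json.dumps(event) + "\n").encode("utf-8"))
    return send


def demo():
    # socketpair() stands in for a real Unix socket between runtime and monitor.
    runtime_end, monitor_end = socket.socketpair()
    hub = EventHub()
    captured: List[Dict] = []
    hub.add_sink(socket_sink(runtime_end))  # out-of-process style backend
    hub.add_sink(captured.append)           # in-process backend
    hub.enable("gc")
    hub.emit("gc", "collection_start", generation=0)
    hub.emit("jit", "method_compiled", method="Main")  # dropped: not enabled
    runtime_end.close()  # signals end-of-stream to the reader
    with monitor_end, monitor_end.makefile("r", encoding="utf-8") as reader:
        received = [json.loads(line) for line in reader.read().splitlines()]
    return received, captured
```

Calling `demo()` returns the events read from the monitor side of the socket alongside the ones captured in process. A real design would also need a control channel for enabling providers and setting rates from the monitoring side, which this sketch leaves out.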


#4

Please, do not dumb down the CLR just because it’s now running on toy operating systems.

Live eventing is all about what’s happening now.

Performance counters are about what happened and how much of it. That’s why there are Avg and % performance counters.

I’m currently working on system/service/application monitoring systems, and Windows and properly developed Windows applications are very easy to monitor by taking samples of performance counters and event logs.

I’m not saying how it should be implemented, but an “eventing system behind a pluggable adapter” seems to be a good option. Just don’t forget to supply the ETW, EventLog and Performance providers Windows DevOps have been relying on for a long time.
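
The distinction above between live events and sampled counters can be sketched as an aggregator that consumes events and exposes "Avg" and "%" style values on demand. This is a rough Python illustration only; `RequestCounters`, its fields, and the request/cache example are hypothetical and do not correspond to any actual Windows or CoreCLR counter.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RequestCounters:
    """Hypothetical counter set derived from a stream of request events."""
    total: int = 0
    cache_hits: int = 0
    total_ms: float = 0.0

    def on_request(self, duration_ms: float, from_cache: bool) -> None:
        # Called once per event: "what is happening now".
        self.total += 1
        self.total_ms += duration_ms
        if from_cache:
            self.cache_hits += 1

    def sample(self) -> Dict[str, float]:
        # Called by a monitor at its own pace: "what happened and how much".
        if self.total == 0:
            return {"requests": 0, "avg_ms": 0.0, "cache_hit_pct": 0.0}
        return {
            "requests": self.total,
            "avg_ms": self.total_ms / self.total,
            "cache_hit_pct": 100.0 * self.cache_hits / self.total,
        }
```

A monitoring tool would call `sample()` periodically, so it never has to see the individual events that the counters were built from.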


#5

Of course not! But as Brian said: we were simply not happy with PerfCounters on Windows itself, which is why we moved to Event Tracing for Windows (ETW). Even if the other operating systems wouldn’t support eventing, we’d still support it on Windows. In the end, we have to compile the runtime per operating system anyway, so using an OS-specific technology isn’t a problem.

However, our general goal is to align the concepts across platforms, so that we can provide a consistent experience when it makes sense.


#6

LTTng is a high-performance Linux equivalent of ETW. It would be awesome if CoreCLR could support that.


#7

Just don’t forget that instrumentation is not only about troubleshooting but also about performance management.


#8

Also, keep in mind that EventSource is already pluggable to the extent that it supports EventListeners. Anyone can create a listener that receives the traced data and can do anything with it, so even if EventSource isn’t implicitly hooked up to any backend or to the desired one, it can still be connected to it via an EventListener.


#9

No doubt about that, Stephen.

But, today, I can go to Performance Monitor and see, say, the number of output cacheable requests served from the output cache. I don’t have to do anything to get it.

I never meant that the CoreCLR code should be sprinkled with performance-counter-specific code. But lots of people and tools out there are counting on those performance counters, no matter how they receive their data, or even whether they are today’s performance counters.


#10

I’ve been investigating instrumentation techniques for CoreCLR (hoping that EventSource / EventListener would be the adopted technology) and came across this thread. It’s pretty old; has there been any progress with CoreCLR monitoring?

Given the prevalence of SOA / MicroServices and CoreCLR’s unique position of running across multiple platforms / containers, it seems a strong monitoring / instrumentation story for CoreCLR is a “must have” feature.


#11

I’d suggest looking into System.Diagnostics.DiagnosticSource. Lots of code inside the framework logs statistics there (see https://github.com/aspnet/Mvc/blob/dev/src/Microsoft.AspNetCore.Mvc.Core/Internal/MvcRouteHandler.cs for an example).


#12

Hi @brianrob,

Are there any updates or details on this? Now that we’ve moved the majority of our code base to core/docker, the thing that I feel is missing is an out-of-process trace solution. So, needless to say, I’m definitely interested in whatever solutions are being worked on.

Thanks,
Mike


#13

It would be good to remember there are more than two kernels out there. The starting point would be to call “linux” what it is: GNU/Linux.

+1 for dtrace.


#14

Would seem pretty stupid to use ETW, since it never will be multi-platform, forcing 2 solutions. …

That said, I would love to get a provider that feeds ETW into Logging so you can move it to other machines, another major feature ETW is missing. RDP, yeah, great when you have a 100-machine cluster and you’re trying to find what went wrong.


#15

Hi guys. Any update?


#16

Here’s an article from today on the topic: http://blogs.microsoft.co.il/sasha/2017/03/30/tracing-runtime-events-in-net-core-on-linux/
Found on Twitter:
https://twitter.com/goldshtn/status/847426411820269574

It is similar to perf tooling: https://github.com/dotnet/coreclr/blob/master/Documentation/project-docs/linux-performance-tracing.md


#17

Thanks Sasha.

I am desperate on this, and I hoped that after more than 2 years we would have had an answer. Any plans to implement Sasha’s suggestion in dotnetcore?


#18

Which of Sasha’s suggestions do you expect to be implemented in dotnetcore?

We are tracking CoreCLR documentation improvements here: https://github.com/dotnet/coreclr/issues/10600
BTW: I just added there some hints about ongoing development in the space.

Please also take into account that the Linux tracing mechanism is vastly different from Windows (and more challenging), that .NET Core 1.0 RTM was released “just” in 2016/7 (<1y), and that our biggest focus right now is .NET Core 2.0, addressing the largest gap and adoption blocker of the platform - the missing APIs for easier migration from Desktop.

