Skip to content

What is going on in Memory snapshot

Nikolay Mitikov edited this page Aug 20, 2017 · 4 revisions

Intro

Once memory snapshot is loaded into debugger we can start analysis itself. We will check following in the demo:

  • Last recorded CPU usage %
  • Threads we have, and call stack for each managed thread.
  • If we find any thread working with database, we will try to see what SQL command is being processed and for how long.

We will use the commands from sosex extension mentioned in prev. posts.

Checking CPU usage

The !theadpool command will show us last CPU usage, number of active threads, and callbacks ( if any ) :

threadpool

Finding system threads and CPU time taken by each thread

!Threads -special command will show us all threads, their statuses, and if any exceptions or locks they have.

threads_special

The -special will output the thread role:

threads_special_names

Executing the !runaway command will show how much CPU time taken by each thread ( useful for high CPU usage analysis ):

runaway_time

We can find candidates that are not system threads ( like GC ) that took tons of CPU time, and check their call stacks.

NOTE The time taken is aggregated by all the thread activity performed, thus to have an accurate measure it is better to compare the !runaway output from 2 consecutive memory snapshots collected in ~2 minutes interval, and find maximum growth. This is mentioned in my prev post.

Getting threads callstacks

!eestack -ee command will loop through all the threads printing their callstacks ( what are they doing when a snapshot was taken) :

eestack

And in tons of thread callstacks, we have found SQL-related activity:

sqlrelatedactivity

Switching to target thread and examine internals

Let's switch to 57s thread via ~57s from previous picture, and examine its stack via !mdso command:

mdso

We can see 2 main entities:

  • SqlCommand processed now, the SQL query text will be in _commandText object field ( !mdt 0000006041932830 )
  • TdsParser - (!mdt 000000633eaab3d0) has _connHandler field which holds internal connection information (!mdt 000000633eaab128). It has _createTime property that would show when connection was created, and many other useful information.

Results

By combining an output of SQL Command and TdsInternalConnection we get an understanding which command & for how long is being executed by the time snapshot was generated.

We can optimize our code (if possible) to avoid calling DB thanks to callstack seen.

Same approach is valid for different database engines, however actual objects to inspect should be picked up via analysis of database provider code internals via reverse-engineering tools.

Clone this wiki locally