Skip to content

Latest commit

 

History

History
161 lines (158 loc) · 5.89 KB

Spark_TaskMetrics.md

File metadata and controls

161 lines (158 loc) · 5.89 KB

Apache Spark Task Metrics

This pages describes Spark Executor Task Metrics

Spark executor task metrics provide instrumentation for workload measurements. They are exposed by the Spark WebUI, Spark History server, Spark EventLog file and from the ListenerBus infrastructure. The metrics are provided by each tasks and can be aggregated at higher level )stage level, job level, etc). A short description of the metrics can be found in the Spark Core Executor source code. I sum up here relevant details. See also the work on sparkMeasure for further details.
Note: following work on SPARK-25170 this is now available in the Spark doc: Spark Task Metrics

Spark Executor Task Metric name Short description
executorRunTime Time the executor spent running this task. This includes time fetching shuffle data. The value is expressed in milliseconds.
executorCpuTime CPU Time the executor spent running this task. This includes time fetching shuffle data. The value is expressed in nanoseconds.
executorDeserializeTime Time taken on the executor to deserialize this task. The value is expressed in milliseconds.
executorDeserializeCpuTime CPU Time taken on the executor to deserialize this task. The value is expressed in nanoseconds.
resultSize The number of bytes this task transmitted back to the driver as the TaskResult.
jvmGCTime Amount of time the JVM spent in garbage collection while executing this task. The value is expressed in milliseconds.
resultSerializationTime Amount of time spent serializing the task result. The value is expressed in milliseconds.
memoryBytesSpilled The number of in-memory bytes spilled by this task.
diskBytesSpilled The number of on-disk bytes spilled by this task.
peakExecutionMemory Peak memory used by internal data structures created during shuffles, aggregations and joins. The value of this accumulator should be approximately the sum of the peak sizes across all such data structures created in this task. For SQL jobs, this only tracks all unsafe operators and ExternalSort.
inputMetrics.* Metrics related to reading data from [[org.apache.spark.rdd.HadoopRDD]] or from persisted data.
    .bytesRead Total number of bytes read.
    .recordsRead Total number of records read.
outputMetrics.* Metrics related to writing data externally (e.g. to a distributed filesystem), defined only in tasks with output.
    .bytesWritten Total number of bytes written
    .recordsWritten Total number of records written
shuffleReadMetrics.* Metrics related to shuffle read operations.
    .recordsRead Number of records read in shuffle operations
    .remoteBlocksFetched Number of remote blocks fetched in shuffle operations
    .localBlocksFetched Number of local (as opposed to read from a remote executor) blocks fetched in shuffle operations
    .totalBlocksFetched Number of blocks fetched in shuffle operations (both local and remote)
    .remoteBytesRead Number of remote bytes read in shuffle operations
    .localBytesRead Number of bytes read in shuffle operations from local disk (as opposed to read from a remote executor)
    .totalBytesRead Number of bytes read in shuffle operations (both local and remote)
    .remoteBytesReadToDisk Number of remote bytes read to disk in shuffle operations. Large blocks are fetched to disk in shuffle read operations, as opposed to being read into memory, which is the default behavior.
    .fetchWaitTime Time the task spent waiting for remote shuffle blocks. This only includes the time blocking on shuffle input data. For instance if block B is being fetched while the task is still not finished processing block A, it is not considered to be blocking on block B. The value is expressed in milliseconds.
shuffleWriteMetrics.* Metrics related to operations writing shuffle data.
    .bytesWritten Number of bytes written in shuffle operations
    .recordsWritten Number of records written in shuffle operations
    .writeTime Time spent blocking on writes to disk or buffer cache. The value is expressed in nanoseconds.