Snap 2919 : Implementation of Structured Streaming UI Tab #184

Merged: 17 commits, Nov 27, 2019


@snappy-sachin snappy-sachin commented Nov 15, 2019

What changes were proposed in this pull request?

Implementation of the Structured Streaming UI tab, which lets users monitor the statistics and progress of structured streaming queries/applications.
The Structured Streaming tab is available both in a SnappyData embedded cluster and in a smart connector application (using the Snappy Spark distribution).

The Structured Streaming tab has the following capabilities:

  • Lists all structured streaming queries/applications submitted to the SnappyData cluster using the submit-job command. Similarly, in a smart connector application, this tab lists the streaming queries executed in the cluster.
  • Lets the user select a query from the left-hand navigation panel to view its details in the main query details panel on the right.
  • The query details panel displays the selected query's details and statistics, as listed below:
    -- Query Name if provided, Query Id otherwise
    -- Start Date & Time
    -- Up time
    -- Trigger Interval
    -- Batches Processed
    -- Status
    -- Total Input Records
    -- Current Input Rate
    -- Current Processing Rate
    -- Total Batch Processing Time
    -- Avg. Batch Processing Time
  • The query details panel also lists the sources of the streaming query, along with details of each source such as type, description, input records, input rate, and processing rate.
  • The query details panel also displays the sink details of the streaming query.
  • The query details panel depicts the behavioural trends of the structured streaming query using the following charts:
    -- Input Records on every batch
    -- Input Rate vs Processing Rate
    -- Processing Time
    -- Aggregation State, if available
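
The headline statistics in the list above can plausibly be derived from per-batch progress reports. The Python sketch below is illustrative only and is not the code from this PR: `BatchProgress`, `QuerySummary`, and `summarize` are hypothetical names, and the fields are loosely modelled on what Spark's `StreamingQueryProgress` reports. It assumes that the "current" rates are those of the latest batch and that the average is the total processing time divided by the batch count.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BatchProgress:
    """One micro-batch report (hypothetical shape, not the PR's data model)."""
    num_input_rows: int
    input_rows_per_second: float
    processed_rows_per_second: float
    processing_time_ms: int


@dataclass
class QuerySummary:
    """Headline figures shown in the query details panel (hypothetical shape)."""
    batches_processed: int
    total_input_records: int
    current_input_rate: Optional[float]
    current_processing_rate: Optional[float]
    total_processing_time_ms: int
    avg_processing_time_ms: Optional[float]


def summarize(batches: List[BatchProgress]) -> QuerySummary:
    """Fold per-batch reports into summary statistics."""
    if not batches:
        # No batch has completed yet: rates and average are undefined.
        return QuerySummary(0, 0, None, None, 0, None)
    latest = batches[-1]
    total_time = sum(b.processing_time_ms for b in batches)
    return QuerySummary(
        batches_processed=len(batches),
        total_input_records=sum(b.num_input_rows for b in batches),
        # Assumption: "current" means the most recent batch.
        current_input_rate=latest.input_rows_per_second,
        current_processing_rate=latest.processed_rows_per_second,
        total_processing_time_ms=total_time,
        avg_processing_time_ms=total_time / len(batches),
    )
```

A processing rate that stays below the input rate across several batches would suggest the query is falling behind, which is the comparison the "Input Rate vs Processing Rate" trend is meant to make visible.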

Please check JIRA item SNAP-2919 for UI screenshots (https://jira.snappydata.io/browse/SNAP-2919)

How was this patch tested?

  • Tested manually


 - Adding streaming page CSS and JavaScript files.
 - Auto-refresh feature.
 - Query stats populating on UI.
 - CSS changes.
 - CSS changes
 - Display Status in different colors
 - Rounding of float values on UI
 - Display numbers with thousands separator mark.
 - Display time unit as ms for milliseconds
 - Display sources table with basic stats.
 - Display sink description
 - Display Aggregation state (state operator) charts.
 - Adding utility function to convert a duration into human-readable form.
 - JavaScript changes for displaying each query's latest input rate and processing rate.
 - Display aggregation states chart only when applicable
 - Removed updated records/rows trend line from aggregation states chart
 - Fixing a few other issues
 - Moving below classes/files from SnappyData Core to Spark
   SnappyStreamingQueryListener.scala,
   StreamingRepository.scala,
   SnappyStreamingApiRootResource.scala,
   StreamsInfoResource.scala,
   streamapi.scala,
   SnappyStreamingTab.scala,
   SnappyStructuredStreamingPage.scala
 - Moving Code for updating UI for Structured Streaming tab from SnappySession to SparkSession
 - In spark code, updating QueryStartedEvent signature to include triggerInterval
 - Changes for displaying Processing Threshold line in Stream Processing Time chart on UI
 - Adding Trigger Interval on UI
  - Feedback question: "Is there a case where onQueryProgress is called without onQueryStarted?"
    Ans: As per discussion, yes, it can happen in rare situations, but there is no need to handle such a case.
      So removing the counter code implementation (on queryProgressEvent) that handled a missed queryStartEvent.
      Instead, a warning message is now logged for such cases.
  - Adding two configurable parameters in Spark's SQLConf.scala
      1) spark.sql.streaming.uiRunningQueriesDisplayLimit :
           Configures how many queries are displayed on the Structured Streaming UI.
      2) spark.sql.streaming.uiTrendsMaxSampleSize :
           Configures how many historical data points are plotted on the trend charts on the Structured Streaming UI.
  - Handling of query removal when the uiRunningQueriesDisplayLimit limit is reached.
      Inactive queries are removed if there is no space for a newly added running query.
      If all existing queries are active and the uiRunningQueriesDisplayLimit limit is reached, then
      the newly added query won't be displayed on the UI.
  - Fixed issue - the query details panel kept displaying the details of an old inactive query if that query was
      selected before it was removed from the query navigation panel.
  - Code refactoring
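
The query-removal rule described in the list above (evict inactive queries when the display limit is reached, otherwise do not display the new query) can be sketched as follows. This is a minimal Python illustration, not the Scala code moved into Spark by this PR: `QueryRegistry` and its methods are hypothetical names, and evicting a single inactive query, oldest first, is an assumption, since the description does not say which or how many inactive queries are removed.

```python
from collections import OrderedDict
from typing import List


class QueryRegistry:
    """Tracks the queries shown on the UI, capped at display_limit entries."""

    def __init__(self, display_limit: int) -> None:
        self.display_limit = display_limit
        # Maps query id -> is_active, in insertion order.
        self._queries: "OrderedDict[str, bool]" = OrderedDict()

    def add_running(self, query_id: str) -> bool:
        """Register a running query; return False if it cannot be displayed."""
        if query_id in self._queries:
            self._queries[query_id] = True
            return True
        if len(self._queries) >= self.display_limit:
            # Assumption: evict the oldest inactive query to make room.
            victim = next(
                (qid for qid, active in self._queries.items() if not active),
                None,
            )
            if victim is None:
                # Every displayed query is active: the new one is not shown.
                return False
            del self._queries[victim]
        self._queries[query_id] = True
        return True

    def mark_inactive(self, query_id: str) -> None:
        """Record that a displayed query has stopped."""
        if query_id in self._queries:
            self._queries[query_id] = False

    def displayed(self) -> List[str]:
        """Return the ids of the queries currently shown, oldest first."""
        return list(self._queries)
```

With a limit of 2, a third running query is rejected while both existing queries are active, and is accepted once one of them has been marked inactive.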
@snappy-sachin

All feedback has been addressed, hence merging.

@snappy-sachin snappy-sachin merged commit 1cdbfb7 into snappy/branch-2.1 Nov 27, 2019
@snappy-sachin snappy-sachin deleted the SNAP-2919 branch November 27, 2019 12:50
sumwale pushed a commit to sumwale/spark that referenced this pull request Nov 5, 2020
…are#184)

Implementation of the Structured Streaming UI tab, which lets users monitor the statistics and progress of structured streaming queries/applications.
The Structured Streaming tab is available both in a TIBCO ComputeDB/SnappyData embedded cluster and in a smart connector application (using the Snappy Spark distribution).

The Structured Streaming tab has the following capabilities:

- Lists all structured streaming queries/applications submitted to the SnappyData cluster using the submit-job command. Similarly, in a smart connector application, this tab lists the streaming queries executed in the cluster.
- Lets the user select a query from the left-hand navigation panel to view its details in the main query details panel on the right.
- The query details panel displays the selected query's details and statistics, as listed below:
  -- Query Name if provided, Query Id otherwise
  -- Start Date & Time
  -- Up time
  -- Trigger Interval
  -- Batches Processed
  -- Status
  -- Total Input Records
  -- Current Input Rate
  -- Current Processing Rate
  -- Total Batch Processing Time
  -- Avg. Batch Processing Time
- The query details panel also lists the sources of the streaming query, along with details of each source such as type, description, input records, input rate, and processing rate.
- The query details panel also displays the sink details of the streaming query.
- The query details panel depicts the behavioural trends of the structured streaming query using the following charts:
  -- Input Records on every batch
  -- Input Rate vs Processing Rate
  -- Processing Time
  -- Aggregation State, if available
- All statistics displayed on the UI are updated automatically and periodically

- Adding two configurable parameters in Spark's SQLConf.scala
      1) spark.sql.streaming.uiRunningQueriesDisplayLimit :
           Configures how many queries are displayed on the Structured Streaming UI.
      2) spark.sql.streaming.uiTrendsMaxSampleSize :
           Configures how many historical data points are plotted on the trend charts on the Structured Streaming UI.
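
A cap such as `spark.sql.streaming.uiTrendsMaxSampleSize` is commonly implemented as a fixed-size buffer that discards the oldest sample when a new one arrives. The Python sketch below illustrates that idea only; `TrendSeries` is a hypothetical name, and whether the PR's Scala code uses the same structure is an assumption.

```python
from collections import deque
from typing import Deque, List, Tuple


class TrendSeries:
    """Keeps at most max_sample_size (batch_id, value) points for a chart."""

    def __init__(self, max_sample_size: int) -> None:
        if max_sample_size <= 0:
            raise ValueError("max_sample_size must be positive")
        # A deque with maxlen drops the oldest item on overflow.
        self._points: Deque[Tuple[int, float]] = deque(maxlen=max_sample_size)

    def record(self, batch_id: int, value: float) -> None:
        """Append the latest data point, evicting the oldest if full."""
        self._points.append((batch_id, value))

    def points(self) -> List[Tuple[int, float]]:
        """Return the retained points, oldest first, ready for plotting."""
        return list(self._points)
```

Bounding the series this way keeps the memory used per query and the size of each chart refresh constant, regardless of how long the query has been running.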
sumwale pushed a commit that referenced this pull request Jul 11, 2021