Simplify multiple-of-element size access to arrays #8627

tautschnig · 2025-04-14T15:26:51Z

Array operations may fall back to byte_extract or byte_update expressions in parts of the code base. Simplify these to index or WITH expressions, respectively, when the offset is known to be a multiple of the array-element size.
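To illustrate the equivalence this change relies on, here is a minimal sketch outside of CBMC. The function names `byte_extract`, `byte_update` and `offset_to_index` are hypothetical toy stand-ins operating on a plain `std::array`, not the CBMC expression classes. The sketch shows that a byte-level read or write at an offset that is a multiple of the element size touches exactly one element, so it can be expressed as an index (read) or a functional update (write).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

// Toy stand-in for a byte-level read: copy sizeof(T) bytes starting at
// offset_bytes out of the array's object representation.
template <typename T, std::size_t N>
T byte_extract(const std::array<T, N> &arr, std::size_t offset_bytes)
{
  assert(offset_bytes + sizeof(T) <= sizeof(T) * N);
  T result;
  std::memcpy(
    &result,
    reinterpret_cast<const unsigned char *>(arr.data()) + offset_bytes,
    sizeof(T));
  return result;
}

// Toy stand-in for a byte-level write: returns an updated copy, which
// mirrors the functional style of a WITH expression.
template <typename T, std::size_t N>
std::array<T, N>
byte_update(std::array<T, N> arr, std::size_t offset_bytes, T value)
{
  assert(offset_bytes + sizeof(T) <= sizeof(T) * N);
  std::memcpy(
    reinterpret_cast<unsigned char *>(arr.data()) + offset_bytes,
    &value,
    sizeof(T));
  return arr;
}

// If the offset is a multiple of the element size, return the element
// index it corresponds to; otherwise report that no index exists.
template <typename T>
std::optional<std::size_t> offset_to_index(std::size_t offset_bytes)
{
  if(offset_bytes % sizeof(T) != 0)
    return std::nullopt;
  return offset_bytes / sizeof(T);
}
```

For a four-element array of `std::uint32_t`, `offset_to_index<std::uint32_t>(8)` yields index 2, and `byte_extract(a, 8)` returns the same value as `a[2]`. An offset of 6 yields no index, which is the case where the byte-level operation has to stay.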

- Each commit message has a non-empty body, explaining why the change was made.
- n/a Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
- n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
- Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
- n/a My commit message includes data points confirming performance improvements (if claimed).
- My PR is restricted to a single feature or bugfix.
- n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.


codecov · 2025-04-14T15:54:39Z

Codecov Report

Attention: Patch coverage is 99.09910% with 1 line in your changes missing coverage. Please review.

Project coverage is 80.37%. Comparing base (5dc709d) to head (959a535).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/util/pointer_offset_size.cpp | 96.96% | 1 Missing ⚠️ |

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #8627   +/-   ##
========================================
  Coverage    80.37%   80.37%           
========================================
  Files         1686     1686           
  Lines       206872   206968   +96     
  Branches        73       73           
========================================
+ Hits        166265   166360   +95     
- Misses       40607    40608    +1


remi-delmas-3000 · 2025-04-14T16:28:31Z

src/util/pointer_offset_size.cpp

@@ -687,7 +687,58 @@ std::optional<exprt> get_subexpression_at_offset(
  const auto offset_bytes = numeric_cast<mp_integer>(offset);

  if(!offset_bytes.has_value())


Does has_value() here mean that the offset is a literal (constant) value?
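As background for this question, a small sketch of the `std::optional` pattern involved. The types `toy_exprt` and `toy_numeric_cast` are hypothetical and only mimic the shape of the call in the diff. The assumption, not confirmed in this thread, is that the real `numeric_cast<mp_integer>` likewise returns an empty optional when the expression is not a constant.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical toy expression: either an integer constant or a symbol.
struct toy_exprt
{
  bool is_constant;
  long long value;  // meaningful only when is_constant is true
  std::string name; // meaningful only when is_constant is false
};

// Toy analogue of a numeric cast: succeeds only for constants.
std::optional<long long> toy_numeric_cast(const toy_exprt &expr)
{
  if(!expr.is_constant)
    return std::nullopt; // symbolic offset: no literal value available
  return expr.value;
}
```

Under that assumption, `!offset_bytes.has_value()` selects the symbolic-offset case, which previously returned early and is where the new pattern matching is added.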

remi-delmas-3000 · 2025-04-14T16:30:30Z

src/util/pointer_offset_size.cpp

+      if(
+        !target_size_bits.has_value() || !elem_size_bits.has_value() ||
+        *elem_size_bits <= 0 ||
+        *elem_size_bits % config.ansi_c.char_width != 0 ||


Does this still allow the optimization to apply to arrays of type T where alignof(T) > 8?

remi-delmas-3000 · 2025-04-14T16:31:22Z

src/util/pointer_offset_size.cpp

+        return {};
+      }
+
+      if(


Bail if the expression is not of the form K * i.
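A rough sketch of the kind of check described here, using a hypothetical toy offset type rather than CBMC's `exprt`. The names `toy_offsett` and `index_for_offset` are invented for illustration, and the index is returned as a string purely to keep the example short. The function bails (returns an empty optional) unless the offset is a constant or a product K * i where K is a positive multiple of the element size.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical toy offset: a constant c, a bare symbol, or a product K * i.
struct toy_offsett
{
  enum class kindt { constant, symbol, mult } kind;
  std::int64_t constant; // c for kind constant, K for kind mult
  std::string symbol;    // name for kind symbol, i for kind mult
};

// Return the array index equivalent to a byte offset, or nothing when the
// offset is not provably a multiple of the element size.
std::optional<std::string>
index_for_offset(const toy_offsett &offset, std::int64_t elem_size_bytes)
{
  assert(elem_size_bytes > 0);

  if(offset.kind == toy_offsett::kindt::constant)
  {
    if(offset.constant < 0 || offset.constant % elem_size_bytes != 0)
      return std::nullopt;
    return std::to_string(offset.constant / elem_size_bytes);
  }

  if(offset.kind == toy_offsett::kindt::mult)
  {
    // K must be a positive multiple of the element size.
    if(offset.constant <= 0 || offset.constant % elem_size_bytes != 0)
      return std::nullopt;
    const std::int64_t factor = offset.constant / elem_size_bytes;
    if(factor == 1)
      return offset.symbol;
    return std::to_string(factor) + " * " + offset.symbol;
  }

  // Any other shape (here: a bare symbol): bail.
  return std::nullopt;
}
```

With 4-byte elements, an offset of `8 * i` maps to index `2 * i`, an offset of `4 * i` maps to `i`, and `6 * i` is rejected because 6 is not a multiple of 4.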

remi-delmas-3000 · 2025-04-14T17:17:34Z

src/util/simplify_expr.cpp

@@ -1870,7 +1872,7 @@ simplify_exprt::simplify_byte_extract(const byte_extract_exprt &expr)
    expr2bits(expr.op(), expr.id() == ID_byte_extract_little_endian, ns);

  if(
-    bits.has_value() &&
+    offset.has_value() && bits.has_value() &&


What exactly is this doing? It seems to restrict the optimization to a subset of cases.

remi-delmas-3000 · 2025-04-14T17:18:19Z

src/util/simplify_expr.cpp

@@ -1986,7 +1988,9 @@ simplify_exprt::simplify_byte_extract(const byte_extract_exprt &expr)
    const array_typet &array_type = to_array_type(expr.op().type());
    const auto &element_bit_width =
      pointer_offset_bits(array_type.element_type(), ns);
-    if(element_bit_width.has_value() && *element_bit_width > 0)
+    if(


Again, it seems the optimization would apply less often?

remi-delmas-3000 · 2025-04-14T17:20:11Z

src/util/simplify_expr.cpp

+        return changed(simplify_rec(result_expr));
+      }
+      else if(
+        offset.id() == ID_mult && offset.operands().size() == 2 &&


So this optimizes the encoding of an array update, and the very similar code above optimizes a read?

remi-delmas-3000 · 2025-04-14T17:20:37Z

src/util/pointer_offset_size.cpp

@@ -687,7 +687,58 @@ std::optional<exprt> get_subexpression_at_offset(
  const auto offset_bytes = numeric_cast<mp_integer>(offset);

  if(!offset_bytes.has_value())
-    return {};
+  {


Is this the array-read encoding?

remi-delmas-3000 · 2025-04-14T17:21:27Z

Is there a way to share some of the pattern-matching conditions between the array-read and array-write cases?

remi-delmas-3000 · 2025-04-15T13:16:35Z

It seems we should also be able to pattern-match an access to a member of a struct in an array of structs, such as a[i].m: the offset would be of the form i*sizeof(T) + offset(T,m), and the access could be turned into a functional update as well.
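The offset shape suggested above can be sketched in plain C++. The struct `point_t` and the helper `split_offset` are hypothetical and only illustrate the arithmetic: because offsetof(T, m) is always smaller than sizeof(T), dividing the byte offset by the element size recovers the index i, and the remainder recovers the member offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical element type for an array of structs.
struct point_t
{
  std::int32_t x;
  std::int32_t y;
  std::int64_t weight;
};

// Result of splitting a byte offset into "which element" and "where inside".
struct element_accesst
{
  std::size_t index;
  std::size_t member_offset;
};

// Decompose offset = i * elem_size + member_offset, assuming that
// member_offset < elem_size (true for any offsetof within the element).
element_accesst split_offset(std::size_t offset_bytes, std::size_t elem_size)
{
  assert(elem_size > 0);
  return {offset_bytes / elem_size, offset_bytes % elem_size};
}
```

For example, splitting `3 * sizeof(point_t) + offsetof(point_t, weight)` with an element size of `sizeof(point_t)` gives index 3 and a member offset equal to `offsetof(point_t, weight)`. With a symbolic i the simplifier cannot divide numerically, so it would have to recognise the sum structurally, which is the pattern match proposed in the comment.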

Simplify multiple-of-element size access to arrays

959a535


tautschnig requested review from kroening and peterschrammel as code owners April 14, 2025 15:26
