Simplify multiple-of-element size access to arrays #8627


Open: wants to merge 1 commit into base: develop

Conversation

tautschnig
Collaborator

Array operations may fall back to byte_extract or byte_update expressions in parts of the code base. Simplify this to index or WITH expressions, respectively, when the offset is known to be a multiple of the array-element size.
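As a rough illustration of the condition described above, the sketch below decides whether a byte-level access can be expressed as an element index. It is not the PR's code and uses no CBMC types; `element_index_for_access` and its parameters are hypothetical names, and it assumes sizes are given in whole bytes and that the access covers exactly one element.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper, not part of CBMC: map a byte offset into an array to
// an element index, but only when the access is element-aligned and covers
// exactly one element.
constexpr std::optional<std::uint64_t> element_index_for_access(
  std::uint64_t offset_bytes,
  std::uint64_t access_size_bytes,
  std::uint64_t element_size_bytes)
{
  if(element_size_bytes == 0)
    return std::nullopt; // zero-sized elements: no meaningful index
  if(access_size_bytes != element_size_bytes)
    return std::nullopt; // partial or multi-element access: keep the byte-level form
  if(offset_bytes % element_size_bytes != 0)
    return std::nullopt; // offset is not a multiple of the element size
  return offset_bytes / element_size_bytes;
}
```

For a 4-byte element, an access at byte offset 8 maps to index 2, while an access at byte offset 6 yields no index and would stay in byte-level form.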

  • Each commit message has a non-empty body, explaining why the change was made.
  • n/a Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • n/a My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.


codecov bot commented Apr 14, 2025

Codecov Report

Attention: Patch coverage is 99.09910% with 1 line in your changes missing coverage. Please review.

Project coverage is 80.37%. Comparing base (5dc709d) to head (959a535).

Files with missing lines Patch % Lines
src/util/pointer_offset_size.cpp 96.96% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #8627   +/-   ##
========================================
  Coverage    80.37%   80.37%           
========================================
  Files         1686     1686           
  Lines       206872   206968   +96     
  Branches        73       73           
========================================
+ Hits        166265   166360   +95     
- Misses       40607    40608    +1     


@@ -687,7 +687,58 @@ std::optional<exprt> get_subexpression_at_offset(
const auto offset_bytes = numeric_cast<mp_integer>(offset);

if(!offset_bytes.has_value())
Collaborator

Does has_value here mean that the offset is a "literal value"?

if(
!target_size_bits.has_value() || !elem_size_bits.has_value() ||
*elem_size_bits <= 0 ||
*elem_size_bits % config.ansi_c.char_width != 0 ||
Collaborator

Does this still allow the optimization to apply to arrays of type T where alignof(T) > 8?

return {};
}

if(
Collaborator

Bail if the expression is not of the form K * i.
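The K * i check mentioned in this comment can be pictured on a toy expression tree. The sketch below is only an illustration under stated assumptions: `Expr`, `ScaledIndex`, and `match_scaled_index` are hypothetical names unrelated to CBMC's `exprt`, and the matcher accepts any positive constant K that is a multiple of the element size, with the constant on either side of the product.

```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Toy expression tree; all names are hypothetical and unrelated to CBMC.
struct Expr
{
  enum class Kind { Constant, Symbol, Mult };
  Kind kind;
  std::int64_t value = 0;         // used when kind == Constant
  std::string name;               // used when kind == Symbol
  std::shared_ptr<Expr> lhs, rhs; // used when kind == Mult
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr constant(std::int64_t v)
{
  return std::make_shared<Expr>(Expr{Expr::Kind::Constant, v, "", nullptr, nullptr});
}
ExprPtr symbol(const std::string &n)
{
  return std::make_shared<Expr>(Expr{Expr::Kind::Symbol, 0, n, nullptr, nullptr});
}
ExprPtr mult(ExprPtr a, ExprPtr b)
{
  return std::make_shared<Expr>(Expr{Expr::Kind::Mult, 0, "", std::move(a), std::move(b)});
}

// Result of a successful match: offset == (scale * element size) * index.
struct ScaledIndex
{
  std::int64_t scale;
  ExprPtr index;
};

// Bail (return nullopt) unless offset has the form K * i or i * K, where K
// is a positive constant that is a multiple of the element size.
std::optional<ScaledIndex>
match_scaled_index(const ExprPtr &offset, std::int64_t element_size_bytes)
{
  if(!offset || element_size_bytes <= 0 || offset->kind != Expr::Kind::Mult)
    return std::nullopt;
  ExprPtr k = offset->lhs, i = offset->rhs;
  if(k && k->kind != Expr::Kind::Constant)
    std::swap(k, i); // accept the constant on the right-hand side too
  if(!k || !i || k->kind != Expr::Kind::Constant)
    return std::nullopt;
  if(k->value <= 0 || k->value % element_size_bytes != 0)
    return std::nullopt;
  return ScaledIndex{k->value / element_size_bytes, i};
}
```

With 4-byte elements, `4 * i` matches with scale 1, `i * 8` matches with scale 2, and `6 * i` or a bare `i` does not match.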

@@ -1870,7 +1872,7 @@ simplify_exprt::simplify_byte_extract(const byte_extract_exprt &expr)
expr2bits(expr.op(), expr.id() == ID_byte_extract_little_endian, ns);

if(
- bits.has_value() &&
+ offset.has_value() && bits.has_value() &&
Collaborator

What is this doing exactly? It seems to restrict the optimization to a subset of cases?

@@ -1986,7 +1988,9 @@ simplify_exprt::simplify_byte_extract(const byte_extract_exprt &expr)
const array_typet &array_type = to_array_type(expr.op().type());
const auto &element_bit_width =
pointer_offset_bits(array_type.element_type(), ns);
- if(element_bit_width.has_value() && *element_bit_width > 0)
+ if(
Collaborator

Again, it seems like the optimization would apply less often?

return changed(simplify_rec(result_expr));
}
else if(
offset.id() == ID_mult && offset.operands().size() == 2 &&
Collaborator

This optimizes the encoding of an array update, and the very similar code above optimizes a read?

@@ -687,7 +687,58 @@ std::optional<exprt> get_subexpression_at_offset(
const auto offset_bytes = numeric_cast<mp_integer>(offset);

if(!offset_bytes.has_value())
- return {};
+ {
Collaborator

Is this the array read encoding?

@remi-delmas-3000
Collaborator

Is there a way to share some of the pattern-matching conditions between the array-read and array-write cases?

@remi-delmas-3000
Collaborator

It seems we should also be able to pattern-match an access to a member of a struct in an array of structs, such as a[i].m. The offset would be of the form i*sizeof(T) + offset(T,m), and the access could be turned into a functional update as well?
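The offset shape described in this comment can be checked in plain C++. The sketch below is only an illustration of the arithmetic, not a proposal for the simplifier: `Elem`, `offset_of_m`, and `decompose` are hypothetical names, and the struct layout is whatever the host compiler chooses.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical element type standing in for T in the comment above.
struct Elem
{
  std::int32_t a;
  std::int64_t m;
};

// Byte offset of arr[i].m from the start of an array of Elem:
// i * sizeof(T) + offset(T, m).
constexpr std::size_t offset_of_m(std::size_t i)
{
  return i * sizeof(Elem) + offsetof(Elem, m);
}

// Split a byte offset back into an element index and an offset within
// that element.
struct Decomposed
{
  std::size_t index;
  std::size_t within_element;
};

constexpr Decomposed decompose(std::size_t offset_bytes)
{
  return Decomposed{offset_bytes / sizeof(Elem), offset_bytes % sizeof(Elem)};
}
```

Decomposing `offset_of_m(3)` gives index 3 and a within-element offset equal to `offsetof(Elem, m)`, which is the information a matcher would need to rewrite the access as an index followed by a member selection.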
