Move QuantumCircuit.assign_parameters to Rust #12794

Merged
merged 9 commits into from
Aug 1, 2024

jakelishman
Member

@jakelishman jakelishman commented Jul 19, 2024

Summary

This is, as far as I could tell, the last really major performance regression in our asv suite compared to 1.1.0, so with this commit, we should be at _not worse_ for important utility-scale benchmarks.

This largely rewrites ParamTable (renamed back to ParameterTable because I kept getting confused with Param) to have more Rust-friendly interfaces available, so that assign_parameters can then use them.

This represents a 2-3x speedup in assign_parameters performance over 1.1.0, when binding simple Parameter instances. Approximately 75% of the time is now spent in Python-space Parameter.assign and ParameterExpression.numeric calls; almost all of this could be removed were we to move Parameter and ParameterExpression to have their data exposed directly to Rust space. The percentage of time spent in Python space only increases if the expressions to be bound are actual ParameterExpressions and not just Parameter.
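As a rough bound on what that follow-up could buy, here is a back-of-the-envelope Amdahl's-law estimate. It assumes the approximate 75% figure above and the idealisation that the Python-space portion is removed entirely:

```latex
S_{\max} = \frac{1}{1 - 0.75} = 4
```

So, under those assumptions, exposing `Parameter` and `ParameterExpression` data to Rust could give at most roughly a further 4x on this path, on top of the 2-3x reported here.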

Most changes in the test suite are because of the use of internal-only methods that changed with the new ParameterTable. The only discrepancy is a bug in test_pauli_feature_map, which was trying to assign using a set.
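To illustrate why assigning from a set is a bug rather than a style issue: positional binding needs a well-defined order, and a Python `set` has none. The sketch below is plain Python and uses a hypothetical helper, `bind_positional`, which is not part of Qiskit; it only shows the kind of check that exposes such a mistake.

```python
from collections.abc import Sequence, Set


def bind_positional(names, values):
    """Pair parameter names with values by position (hypothetical helper)."""
    # A set (or frozenset) has no defined order, so positional pairing is ill-defined.
    if isinstance(values, Set):
        raise TypeError("values must be ordered; got a set")
    # Materialise other iterables so the length can be checked.
    if not isinstance(values, Sequence):
        values = list(values)
    if len(values) != len(names):
        raise ValueError("number of values does not match number of parameters")
    return dict(zip(names, values))


names = ["a", "b", "c"]
binding = bind_positional(names, [0.1, 0.2, 0.3])

try:
    bind_positional(names, {0.1, 0.2, 0.3})
except TypeError as exc:
    set_error = str(exc)
else:
    set_error = None
```

Passing a list or tuple keeps the pairing deterministic; an explicit mapping from parameter to value avoids the ordering question altogether.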

Details and comments

Built on #12730, so will need rebasing over it.

I think this first commit might accidentally have introduced a small (10%) regression to parametric-circuit construction time over its parent. That's a mistake if so - I should be able to fix that later. edit: on retiming, I couldn't reproduce a problem - if anything, this commit is a minor improvement.

Timings for parametric circuit benchmarks compared to 1.1.0 (the different SHA1 is because I hadn't written the commit message when I took the benchmark):

Benchmarks that have improved:

| Change   | Before [7d29dc1b] <1.1.0^0>   | After [f391e6d4]    | Ratio   | Benchmark (Parameter)                                                                                           |
|----------|-------------------------------|---------------------|---------|-----------------------------------------------------------------------------------------------------------------|
| -        | 1.50±0.05ms                   | 589±40μs            | 0.39    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 128, 128)                               |
| -        | 1.19±0.02ms                   | 427±10μs            | 0.36    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 128, 8)                                 |
| -        | 1.16±0.04s                    | 358±4ms             | 0.31    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 128)                            |
| -        | 1.52±0.02s                    | 634±8ms             | 0.42    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 131072)                         |
| -        | 1.07±0.02s                    | 350±5ms             | 0.33    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 2048)                           |
| -        | 1.13±0.01s                    | 410±3ms             | 0.36    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 32768)                          |
| -        | 1.12±0.01s                    | 370±2ms             | 0.33    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 8)                              |
| -        | 1.06±0.01s                    | 362±2ms             | 0.34    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 8192)                           |
| -        | 15.6±0.2ms                    | 5.37±0.2ms          | 0.34    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 2048, 128)                              |
| -        | 22.4±1ms                      | 7.90±0.2ms          | 0.35    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 2048, 2048)                             |
| -        | 15.2±0.6ms                    | 5.26±0.3ms          | 0.35    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 2048, 8)                                |
| -        | 270±6ms                       | 85.5±2ms            | 0.32    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 32768, 128)                             |
| -        | 264±10ms                      | 86.4±1ms            | 0.33    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 32768, 2048)                            |
| -        | 370±10ms                      | 147±3ms             | 0.40    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 32768, 32768)                           |
| -        | 271±10ms                      | 90.9±2ms            | 0.34    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 32768, 8)                               |
| -        | 274±3ms                       | 98.1±0.8ms          | 0.36    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 32768, 8192)                            |
| -        | 347±5μs                       | 144±8μs             | 0.42    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 8, 8)                                   |
| -        | 63.5±2ms                      | 21.6±1ms            | 0.34    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 8192, 128)                              |
| -        | 67.1±2ms                      | 23.8±0.6ms          | 0.35    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 8192, 2048)                             |
| -        | 61.6±2ms                      | 21.4±0.6ms          | 0.35    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 8192, 8)                                |
| -        | 87.3±2ms                      | 34.3±0.6ms          | 0.39    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 8192, 8192)                             |
| -        | 3.84±0.02ms                   | 2.02±0.1ms          | 0.53    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 128, 128)       |
| -        | 3.53±0.4ms                    | 1.38±0.1ms          | 0.39    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 128, 8)         |
| -        | 2.57±0.01s                    | 995±10ms            | 0.39    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 128)    |
| -        | 3.52±0.06s                    | 1.85±0.02s          | 0.53    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 131072) |
| -        | 2.64±0.06s                    | 1.03±0.02s          | 0.39    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 2048)   |
| -        | 2.84±0.02s                    | 1.23±0.01s          | 0.43    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 32768)  |
| -        | 2.62±0.05s                    | 998±10ms            | 0.38    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 8)      |
| -        | 2.63±0.03s                    | 1.09±0.02s          | 0.41    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 8192)   |
| -        | 45.1±1ms                      | 16.5±1ms            | 0.37    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 2048, 128)      |
| -        | 50.9±0.7ms                    | 29.4±0.7ms          | 0.58    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 2048, 2048)     |
| -        | 38.7±0.3ms                    | 15.4±0.6ms          | 0.40    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 2048, 8)        |
| -        | 640±20ms                      | 247±2ms             | 0.39    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 32768, 128)     |
| -        | 650±7ms                       | 273±2ms             | 0.42    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 32768, 2048)    |
| -        | 867±20ms                      | 468±9ms             | 0.54    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 32768, 32768)   |
| -        | 672±30ms                      | 248±2ms             | 0.37    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 32768, 8)       |
| -        | 703±10ms                      | 313±6ms             | 0.45    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 32768, 8192)    |
| -        | 873±10μs                      | 501±40μs            | 0.57    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 8, 8)           |
| -        | 159±5ms                       | 66.4±1ms            | 0.42    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 8192, 128)      |
| -        | 174±2ms                       | 77.1±0.5ms          | 0.44    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 8192, 2048)     |
| -        | 156±4ms                       | 64.1±1ms            | 0.41    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 8192, 8)        |
| -        | 210±2ms                       | 113±1ms             | 0.54    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 8192, 8192)     |

@jakelishman jakelishman added the performance, Changelog: None, Rust, and mod: circuit labels Jul 19, 2024
@qiskit-bot
Collaborator

One or more of the following people are relevant to this code:

  • @Cryoris
  • @Qiskit/terra-core
  • @ajavadia
  • @kevinhartman
  • @mtreinish

@coveralls

coveralls commented Jul 19, 2024

Pull Request Test Coverage Report for Build 10188851464

Details

  • 485 of 513 (94.54%) changed or added relevant lines in 5 files are covered.
  • 24 unchanged lines in 5 files lost coverage.
  • Overall coverage increased (+0.02%) to 89.724%

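| Changes Missing Coverage              | Covered Lines | Changed/Added Lines | %      |
|---------------------------------------|---------------|---------------------|--------|
| qiskit/circuit/quantumcircuit.py      | 15            | 16                  | 93.75% |
| qiskit/qasm3/exporter.py              | 2             | 3                   | 66.67% |
| crates/circuit/src/parameter_table.rs | 186           | 198                 | 93.94% |
| crates/circuit/src/circuit_data.rs    | 255           | 269                 | 94.8%  |

| Files with Coverage Reduction                | New Missed Lines | %      |
|----------------------------------------------|------------------|--------|
| crates/accelerate/src/two_qubit_decompose.rs | 1                | 90.61% |
| qiskit/circuit/library/standard_gates/u.py   | 3                | 93.07% |
| qiskit/circuit/quantumcircuit.py             | 4                | 93.54% |
| crates/qasm2/src/lex.rs                      | 4                | 91.73% |
| crates/qasm2/src/parse.rs                    | 12               | 96.69% |

| Totals                                  |       |
|-----------------------------------------|-------|
| Change from base Build 10186441336      | 0.02% |
| Covered Lines                           | 67261 |
| Relevant Lines                          | 74964 |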

💛 - Coveralls

@mtreinish mtreinish added this to the 1.2.0 milestone Jul 19, 2024
@mtreinish
Member

I've tagged this as high priority because there are some bugs in parameter binding that @woodsp-ibm identified on main that this branch fixes. So we shouldn't tag 1.2.0rc1 without this included.

@jakelishman
Member Author

jakelishman commented Jul 26, 2024

Oh, what bugs?

edit: the point being: let's add unit tests.

@woodsp-ibm
Member

I linked my algorithms PR, which addresses issues that arose when testing against Qiskit main and which shows the bug being referred to. In that PR, one issue was due to the use of some internals that are no longer there; the other, where I had added a .data copy as a workaround, was the bug being referred to. I have some standalone code as a sample if it's of help - I showed it to Matthew when discussing the issue, so he has it too.

@mtreinish
Member

The example Steve shared was:

from qiskit.circuit import QuantumCircuit, Parameter, ClassicalRegister, QuantumRegister

from qiskit.transpiler.passes import TranslateParameterizedGates

SUPPORTED_GATES = [
    "rx",
    "ry",
    "rz",
    "rzx",
    "rzz",
    "ryy",
    "rxx",
    "cx",
    "cy",
    "cz",
    "ccx",
    "swap",
    "iswap",
    "h",
    "t",
    "s",
    "sdg",
    "x",
    "y",
    "z",
]

a = Parameter("a")
qc = QuantumCircuit(1)
qc.h(0)
qc.p(a, 0)
qc.h(0)

print(qc)
print(qc.parameters)

translator = TranslateParameterizedGates(SUPPORTED_GATES)
qc = translator(qc)

print(qc)
print(qc.parameters)


qc1 = qc.copy()
qr_aux = QuantumRegister(1, "qr_aux")
cr_aux = ClassicalRegister(1, "cr_aux")
qc1.add_register(qr_aux)
qc1.add_register(cr_aux)
qc1.h(qr_aux)
qc1.data.insert(0, qc1.data.pop()) # This line seems to mess things up

print(qc1)
print(qc1.parameters)

cct = qc1.assign_parameters({a: 0.44})

print(cct)
print(cct.parameters)

On main, it looked like the final circuit printed was not binding a at the correct index in CircuitData in the Rust code. So Parameter('a') was still in the params field on the first PackedInstruction/CircuitInstruction, but the parameter table no longer had it and reported it as bound. I assume there was an edge case around the pop() or insert() in the tracking/reindexing of the parameter table on main. But this PR fixes it as part of your larger refactoring. We should probably add a test that does something like this to make sure we don't regress.

@jakelishman
Member Author

Yeah, I'm pretty sure I know exactly where that bug had come in, and I remember moving a line of code that I imagine is what fixed it. In the insert code, I think the new data was written in before we called untrack on it, so it untracked the wrong thing.
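The ordering hazard described here can be sketched with a toy model in plain Python. Everything below is hypothetical (`ToyCircuitData` and its methods are not Qiskit's internals), and it models replacing an element rather than the real insert path, so it only illustrates the general "untrack before overwrite" rule, not the actual cause of the bug.

```python
from collections import defaultdict


class ToyCircuitData:
    """Toy model: instructions are (name, params); symbolic params are strings."""

    def __init__(self):
        self.data = []
        self.table = defaultdict(set)  # parameter name -> instruction indices

    def _track(self, index):
        for param in self.data[index][1]:
            if isinstance(param, str):  # floats need no tracking
                self.table[param].add(index)

    def _untrack(self, index):
        for param in self.data[index][1]:
            if isinstance(param, str):
                self.table[param].discard(index)
                if not self.table[param]:
                    del self.table[param]

    def append(self, inst):
        self.data.append(inst)
        self._track(len(self.data) - 1)

    def replace_buggy(self, index, inst):
        self.data[index] = inst  # overwrite first...
        self._untrack(index)     # ...so this untracks the NEW instruction
        self._track(index)

    def replace_fixed(self, index, inst):
        self._untrack(index)     # untrack the OLD instruction while it is still stored
        self.data[index] = inst
        self._track(index)


def build():
    circuit = ToyCircuitData()
    circuit.append(("h", ()))
    circuit.append(("rz", ("a",)))
    return circuit


buggy = build()
buggy.replace_buggy(1, ("rz", (0.44,)))
fixed = build()
fixed.replace_fixed(1, ("rz", (0.44,)))
```

In the buggy variant the table still claims that `"a"` is used at index 1 even though the stored instruction is fully numeric; in the fixed variant the table is empty, which matches the data.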

This catches a bug that was present in the parent commit, but this PR
fixes.
@jakelishman
Member Author

I added a test that catches Steve's case in 1cd8d08. It turned out to be a bit more complex than I'd originally thought - it's a particular interaction with the parametric global-phase tracking in addition to the pop/insert retracking required, and somewhere along the line, that was causing the mess-up.

Contributor

@Cryoris Cryoris left a comment

One comment about function naming and otherwise only minuscule comments and questions 🙂

}

/// Backup entry point for appending an instruction from Python space, in the unusual case that
/// one of the instruction parameters contains a cyclical reference to the circuit itself.
Contributor

To make sure I understand: you mean something like this with cyclical?

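from qiskit.circuit import Parameter, QuantumCircuit

circuit = QuantumCircuit(1)  # some circuit with free parameters
circuit.rx(Parameter("a"), 0)
circuit.append(circuit.to_gate(), circuit.qubits)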

Member Author

This is actually mostly copied from the original source - I just rearranged the functions a bit. Matt originally wrote it, and I think the original problem was that somehow we had a test with a control-flow operation that put the same circuit as one of its blocks?? I don't entirely remember.

Member

Yeah, I don't remember exactly where it was happening, but there was a case where a control flow op was causing an error in the runtime borrow checking because the same CircuitData object was being passed in here.

@@ -160,172 +164,73 @@ impl CircuitData {
#[cfg(feature = "cache_pygates")]
py_op: RefCell::new(None),
});
res.track_instruction_parameters(py, res.data.len() - 1)?;
Member

This is technically an oversight because I guess it's possible for someone in rust space to create a parameter object and use that here. But, in practice I didn't think anyone would. Do you think it's worth the overhead of doing this for the rare case?

Member Author

As long as we're accepting general Param instances, I think it's a mistake not to. At the moment, the way we use it makes it very unlikely for the Params to be anything other than floats, but I don't think it'd be too hard for someone to call this on a circuit they got from Python space after applying some filter / join of the iterables or something.

The ParameterTable isn't a public field (and imo absolutely shouldn't be), so unless the typing of this function is changed so that it only accepts f64 as the params, I think we need to do this.

The overhead shouldn't be much, since we can determine that it's a no-op without any Python-space calls.


@jakelishman
Member Author

Ok, comments should be addressed, and the AnnotatedOperation problems should be fixed by 83da2f4.

Fwiw they've been pre-existing in some form or another ever since AnnotatedOperation was first introduced.

@mtreinish mtreinish added the stable backport potential label Aug 1, 2024
Member

@mtreinish mtreinish left a comment

This lgtm. I had a few questions/musings inline, but nothing worth blocking over. Thanks for doing this, it's a really nice improvement both in the organization of the Rust-space data structures for dealing with parameters and in runtime performance.

op.getattr(params_attr)?.set_item(parameter, new_param)?;
let mut new_op = op.extract::<OperationFromPython>()?;
previous.op = new_op.operation;
previous.params_mut().swap_with_slice(&mut new_op.params);
Member

I didn't know about this method, I like it. Its uses are a bit niche, but when you need it, like here, it is handy.

Member Author

Yeah, I learned about it when writing this as well. It felt like a method that would exist in Rust, so I had a quick look.

Comment on lines +87 to +93
impl<'py> FromPyObject<'py> for ParameterUuid {
fn extract_bound(ob: &Bound<'py, PyAny>) -> PyResult<Self> {
if ob.is_exact_instance(UUID.get_bound(ob.py())) {
ob.getattr(intern!(ob.py(), "int"))?.extract().map(Self)
} else {
Err(PyTypeError::new_err("not a UUID"))
}
Member

We might want to make a note for the future that pyo3 has an open PR adding uuid conversion support for:

https://docs.rs/uuid/latest/uuid/

that might be something we want to leverage in the future here.

Comment on lines +256 to +257
self.order.reserve(self.by_uuid.len());
self.order.extend(self.by_uuid.keys());
Member

extend() doesn't do the allocation all at once for us?

Member Author

I think it's possible that this is left over from some point when the internal data structures were different, and I was getting the keys from something that didn't necessarily have an accurate size_hint. I don't entirely remember.

self.uuid_map.insert(uuid, parameter);
self.order.reserve(self.by_uuid.len());
self.order.extend(self.by_uuid.keys());
self.order.sort_unstable_by_key(|uuid| {
Member

I wonder if there is a threshold where rayon's par_sort_unstable_by_key() would be useful here. We can look at that in the future.

Member Author

Quite possibly yes, unless it interferes with the fastest paths for data that's already sorted. The cases we need this to have the highest performance correspond to IBM backends' fast parametric updates, where all the parameters are likely to be in the circuit in order to begin with.

# During normalisation, be sure to reference 'parameters' and related things from 'self' not
# 'target' so we can take advantage of any caching we might be doing.
- if isinstance(parameters, dict):
+ if isinstance(parameters, collections.abc.Mapping):
Member

Are there cases where people are passing dict-likes to assign_parameters? Also, I assume you're not worried about the extra runtime overhead of the registration hooks of the collections types here.

Member Author

No, and in fact it would have been impossible before because of the type hint. I'd loosened this because a) that matches the documentation and b) I think I was passing the ParameterBindsDict custom object recursively into subcalls at some point (though I think I changed the logic subsequently).
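For readers unfamiliar with the distinction, here is a small stdlib-only sketch of what loosening the check from `dict` to `collections.abc.Mapping` changes. The `ParameterBinds` class is a hypothetical stand-in for a custom mapping object, not the `ParameterBindsDict` mentioned above.

```python
import collections.abc
import types


class ParameterBinds(collections.abc.Mapping):
    """Hypothetical read-only mapping wrapper (not a Qiskit class)."""

    def __init__(self, binds):
        self._binds = dict(binds)

    def __getitem__(self, key):
        return self._binds[key]

    def __iter__(self):
        return iter(self._binds)

    def __len__(self):
        return len(self._binds)


candidates = {
    "dict": {"a": 0.1},
    "mappingproxy": types.MappingProxyType({"a": 0.1}),
    "custom": ParameterBinds({"a": 0.1}),
    "list": [0.1],
}

# For each candidate: (passes the old dict check, passes the new Mapping check)
results = {
    name: (isinstance(obj, dict), isinstance(obj, collections.abc.Mapping))
    for name, obj in candidates.items()
}
```

Only the plain `dict` passes the old check, while all three mapping-like objects pass the new one; the list fails both, so it would still be handled as a sequence of values.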

@mtreinish mtreinish added this pull request to the merge queue Aug 1, 2024
Merged via the queue into Qiskit:main with commit a68de4f Aug 1, 2024
15 checks passed
mergify bot pushed a commit that referenced this pull request Aug 1, 2024
* Move `QuantumCircuit.assign_parameters` to Rust

* Add unit test of parameter insertion

This catches a bug that was present in the parent commit, but this PR
fixes.

* Update crates/circuit/src/imports.rs

* Fix assignment to `AnnotatedOperation`

* Rename `CircuitData::num_params` to match normal terminology

* Fix typos and 🇺🇸

* Fix lint

(cherry picked from commit a68de4f)
@jakelishman jakelishman deleted the rust-assign-parameters branch August 1, 2024 11:49
github-merge-queue bot pushed a commit that referenced this pull request Aug 1, 2024
Co-authored-by: Jake Lishman <jake.lishman@ibm.com>
Procatv pushed a commit to Procatv/qiskit-terra-catherines that referenced this pull request Aug 1, 2024
jakelishman added a commit to jakelishman/qiskit-terra that referenced this pull request Oct 16, 2024
When calling `assign_parameters` on a heavily parametric circuit with
the unusual access pattern of binding off a single parameter at a time,
Qiskit 1.2 had a severe performance regression compared to Qiskit 1.1
stemming from gh-12794.  The calls to `unsorted_parameters` on each
iteration were creating a new `set`, which could be huge if the number
of parameters in the circuit was large.  In Qiskit 1.1 and before, that
object was a direct view onto the underlying `ParameterTable` (assuming
the input circuit did not have a parametric global phase), so was free
to construct.
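The cost difference described in this commit message can be sketched with a toy model in plain Python. The class and method names below are hypothetical and do not mirror Qiskit's internals; the sketch only counts how many keys get copied when a fresh `set` is built on every single-parameter bind, compared with handing out a view.

```python
class ToyParameterTable:
    """Toy stand-in for a parameter table; all names are hypothetical."""

    def __init__(self, names):
        self._uses = {name: set() for name in names}

    def view(self):
        # A keys view is O(1) to create and always reflects the dict.
        return self._uses.keys()

    def snapshot(self):
        # Building a new set copies every key: O(n) per call.
        return set(self._uses)

    def bind(self, name):
        del self._uses[name]


def bind_one_at_a_time(num_params, use_snapshot):
    """Bind parameters singly; return how many keys were copied in total."""
    table = ToyParameterTable(f"p{i}" for i in range(num_params))
    copied = 0
    for i in range(num_params):
        name = f"p{i}"
        if use_snapshot:
            params = table.snapshot()
            copied += len(params)
        else:
            params = table.view()
        if name not in params:
            raise KeyError(name)
        table.bind(name)
    return copied


copied_with_snapshot = bind_one_at_a_time(1000, use_snapshot=True)
copied_with_view = bind_one_at_a_time(1000, use_snapshot=False)
```

With 1000 parameters the snapshot variant copies n(n+1)/2 = 500500 keys over the loop, while the view variant copies none, which is the quadratic-versus-linear behaviour the message describes.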
github-merge-queue bot pushed a commit that referenced this pull request Nov 1, 2024
…s` (#13337)

* Fix performance regression in looped `QuantumCircuit.assign_parameters`

* Improve documentation