Skip to content
This repository has been archived by the owner on Jan 4, 2023. It is now read-only.

Refactoring of getters/setters/offset methods of types #197

Merged
merged 28 commits into from
Oct 25, 2022

Conversation

xgreenx
Copy link
Contributor

@xgreenx xgreenx commented Oct 5, 2022

Problem

In the fuel-tx and fuel-vm we have cases where we do the same logic for different variants of Transaction because the fields are the same and processing rules are the same. We have many places requiring a different set of common fields.

During work on #193, I faced a problem with getters and setters of Transaction.

Previously, almost all fields of Transaction::Script were available in the Transaction::Create. It is why Transaction had getters/setters/offsets for all fields. It was simpler to write the code like transaction.inputs() or transaction.inputs_offset() and write common logic . Only several fields like script and bytecode returned the Option type because if the type is Transaction::Script, it doesn't contain the bytecode field and vice versa, so we need to write specific logic for those fields.

With the introduction of Transaction::Mint, almost all fields except outputs are optional.

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub enum Transaction {
    Script {
        gas_price: Word,
        gas_limit: Word,
        maturity: Word,
        receipts_root: Bytes32,
        script: Vec<u8>,
        script_data: Vec<u8>,
        inputs: Vec<Input>,
        outputs: Vec<Output>,
        witnesses: Vec<Witness>,
        metadata: Option<Metadata>,
    },

    Create {
        gas_price: Word,
        gas_limit: Word,
        maturity: Word,
        bytecode_length: Word,
        bytecode_witness_index: u8,
        salt: Salt,
        storage_slots: Vec<StorageSlot>,
        inputs: Vec<Input>,
        outputs: Vec<Output>,
        witnesses: Vec<Witness>,
        metadata: Option<Metadata>,
    },
    
    Mint {
        outputs: Vec<Output>,
        metadata: Option<Metadata>,
    },
}

fuel-vm problem example

#196 is implementing the change to return Result(but it may be better to return Option) for each field in the Transaction. It already added some unwrap complexity into tests, but I tried to minimize it.

After, I started applying this change to fuel-vm, creating a lot of unwrap there.
The code doesn't look good=)

image

validate_without_signature_internal problem example

We can't write a generic code if methods are implemented only for Transaction or inner structures without traits. As an example validate_without_signature_internal method. The function checks that all common fields of Create and Script are valid(according to the rules in the spec). We must pass each field as an argument if we can't write a generic code.

The signature of the validate_without_signature_internal:
image

And is how to use this function with the current code:
image

But if methods are implemented for the inner struct, we will have the same problem. So it is better to move the implementation of getters/setters/offsets into traits.

Proposed solution

Move the anonymous enum variants into typed structures.

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub enum Transaction {
    Script(Script),
    Create(Create),
    Mint(Mint),
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct Script {
    gas_price: Word,
    gas_limit: Word,
    maturity: Word,
    receipts_root: Bytes32,
    script: Vec<u8>,
    script_data: Vec<u8>,
    inputs: Vec<Input>,
    outputs: Vec<Output>,
    witnesses: Vec<Witness>,
    metadata: Option<Metadata>,
}

// The same for `Create` and `Mint`
...

Introduce new traits for each field, something like:

pub trait InputsField {
    fn inputs(&self) -> &Vec<Input>;
    fn inputs_mut(&self) -> &mut Vec<Input>;
    fn inputs_offset(&self) -> usize;
}

pub trait OutputsField {
    fn outputs(&self) -> &Vec< Output >;
    fn outputs_mut(&self) -> &mut Vec<Output>;
    fn outputs_offset(&self) -> usize;
}

// ... Traits for other fields

Implement those traits for each new struct Create, Script, Mint.
Because we have the same problem with the Input and Output types, we can apply the same strategy as for Transaction to them.

Pros

Separate types allow as avoid getters/setters/offsets with Option/Result return type and cast to exact type.

Traits allow writing a generic code or code that returns a dyn object. And in the code, you can have generics T: ExecutableTransaction or cast Transaction into &dyn ExecutableTransaction and work with fields without Option.

Separate trait for each field allows composing traits you need based on your logic. For example, you can create your trait with super traits like:

pub trait ExecutableTransaction: InputsField + OutputField + ... + {}

The example with validate_without_signature_internal will look like:
image

image

The same can be applied for Input type, and we can have a trait CoinFields and MessageFields and have one function to_coin(&self) -> Option<&dyn CointFields>.

Also, it can simplify tests because TransactionBuilder can be TransactionBuilder<T> where T may be Script, Create, or Mint. So the builder will return a concrete type than the Transaction enum.

Cons

  • It is a significant breaking change for API, so it will cause some adaption in other repositories.
  • Feature definition and implementation will look cumbersome(it will use macro_rule but it still is a lot of code).

@xgreenx xgreenx self-assigned this Oct 5, 2022
@vlopes11
Copy link
Contributor

vlopes11 commented Oct 6, 2022

The change is definitely positive. I discussed several times about converting from enum into concrete structs, but never really had the chance. Consider also applying the same strategy for inputs/outputs (ie transform the variants into concrete structs and have a single enum Input that will hold the variants.

I would refrain however from creating a trait for InputsField and OutputsField because this logic is too simple to justify the increased type complexity/requirements. All the rest is a clear improvement!

@freesig
Copy link

freesig commented Oct 7, 2022

Would it be better to just impl the methods on the internal structs instead of using a trait?
Also is it really necessary to have everything in one enum? Instead we could de-serialize directly into the concrete inner types depending on which variant it is.
Another way to ask this question is are there any places we actually need to store all of these types within a single collection? Because if not then the enum is really not needed.

The issue with using an enum is that it's always a runtime value and though the above approach does make things cleaner it also just moves the potential panic inside the trait impl.

@xgreenx
Copy link
Contributor Author

xgreenx commented Oct 7, 2022

I would refrain however from creating a trait for InputsField and OutputsField because this logic is too simple to justify the increased type complexity/requirements. All the rest is a clear improvement!

I meant not traits only for inputs and outputs fields. I meant a separate trait for each field. So if the struct has six fields, we will create six traits. The trait has ref and mut ref getters and `offset functions.

We need traits to be able to write a generic code. I agree that it will increase the complexity of the definition. As an alternative, we can create traits only for common fields. Something like:

pub trait CreateAndScriptFields {
    fn gas_price(&self) -> &Word;
    fn gas_price_mut(&self) -> &mut Word;
    fn gas_price_offset(&self) -> usize;
    
    fn gas_limit(&self) -> &Word;
    fn gas_limit_mut(&self) -> &mut Word;
    fn gas_limit_offset(&self) -> usize;
    
    fn maturity(&self) -> &Word;
    fn maturity_mut(&self) -> &mutWord;
    fn maturity_offset(&self) -> usize;
    
    fn inputs(&self) -> &Vec<Input>;
    fn inputs_mut(&self) -> &mut Vec<Input>;
    fn inputs_offset(&self) -> usize;
    
    fn outputs(&self) -> &Vec<Output>;
    fn outputs_mut(&self) -> &mut Vec<Output>;
    fn outputs_offset(&self) -> usize;
    
    fn witnesses(&self) -> &Vec<Witness>;
    fn witnesses_mut(&self) -> &mut Vec<Witness>;
    fn witnesses_offset(&self) -> usize;
}

But it means that fuel-tx decides how to combine fields in the traits, and other crates should follow fuel-tx's rules. Also, it can cause the creation of different duplicating traits with different combinations of fields.

Moving each field to a separate trait allows us to provide only implementation, and how to use it should be defined by the crate that uses it.

@xgreenx
Copy link
Contributor Author

xgreenx commented Oct 7, 2022

Would it be better to just impl the methods on the internal structs instead of using a trait?
Also is it really necessary to have everything in one enum? Instead we could de-serialize directly into the concrete inner types depending on which variant it is.

I'm proposing to implement those methods directly only for the inner structs(the same), not for Transaction as it was before=)
Cast the type into the inner struct and pass it into a generic code that will use different combinations of fields' traits(Check the example with validate_without_signature_internal).

Another way to ask this question is are there any places we actually need to store all of these types within a single collection? Because if not then the enum is really not needed.

Input and Output are used with vector. Transaction too, the block has a vector of Transaction. + It is our representation of the transaction defined in the specs,

The issue with using an enum is that it's always a runtime value and though the above approach does make things cleaner it also just moves the potential panic inside the trait impl.

No, we don't have any panics inside because we work directly with the type that implements fields' traits.

@freesig
Copy link

freesig commented Oct 7, 2022

After a chat with @xgreenx I get the reason this is needed and think it's probably the best idea.

Feature definition and implementation will look cumbersome(it will use macro_rule but it still is a lot of code).

Macros really break the IDE's ability to do things like go to definition. I'd personally be in favor of avoiding macros and just implementing manually.

@bvrooman
Copy link
Contributor

bvrooman commented Oct 7, 2022

My 2 cents:

Breaking down the enum into constituent structs is nice! We separate the definition of Transaction from the definitions of its parts (Script, Create, etc). We can modify the pieces more easily and with higher independence.

It sounds like the idea of implementing each member function as a trait is so that we can compose the polymorphic type to functions that are designed to operate solely on interfaces, rather than types. I am a big fan of programming against interfaces, rather than types! However, my questions would be:

  • Do we need this level of modularity?
  • Are we exposing data via these interfaces solely to satisfy non-member function inputs? Maybe we need to re-examine the existing interfaces and see if they should be consolidated. For example, we expose many fields to support validate_without_signature_internal that takes in a generic tx of type T. Maybe validate_without_signature_internal should simply be a method on T?

I would also like to avoid the use of macros where possible. As already mentioned by others in this PR, macros do not play nicely with the IDEs or fmt.

@xgreenx
Copy link
Contributor Author

xgreenx commented Oct 7, 2022

It sounds like the idea of implementing each member function as a trait is so that we can compose the polymorphic type to functions that are designed to operate solely on interfaces, rather than types. I am a big fan of programming against interfaces, rather than types!

Yep!=)

Do we need this level of modularity?

It will save us and crates that depend on fuel-tx in the future in cases like with Transaction::Mint or if we decide to remove/add some fields. Also, fuel-tx doesn't need to cover all cases of all upstream crates, it provides only basic implementation, and the upstream crates can compose it as they wish. They can return Option type or define traits like ExecutableTransaction implemented only for Script and Create.

In the fuel-vm ExecutableTransaction traits feets very well. Also, traits for Coins and Messages with common fields will make the code cleaner.

It adds more flexibility overall.

Are we exposing data via these interfaces solely to satisfy non-member function inputs? Maybe we need to re-examine the existing interfaces and see if they should be consolidated. For example, we expose many fields to support validate_without_signature_internal that takes in a generic tx of type T. Maybe validate_without_signature_internal should simply be a method on T?

We can create a trait like Internal for Transaction with validate_without_signature and provide a generic implementation of this trait.

pub trait Internal {
    fn validate_without_signature(&self) -> Result<(), ValidationError>;
}

impl<T: InputsField + OutputField + ... > Internal for T {
    fn validate_without_signature(&self) -> Result<(), ValidationError> {
        ...
    }
}

In this case, Create and Script will have the implementation of the validate_without_signature. The question do we want to expose internal methods?=) Or do you suggest doing it for internal usage in the fuel-tx?

Macros really break the IDE's ability to do things like go to definition. I'd personally be in favor of avoiding macros and just implementing manually.

I would also like to avoid the use of macros where possible. As already mentioned by others in this PR, macros do not play nicely with the IDEs or fmt.

Could you describe what problems you have in detail, please?
Because it works fine for me. For example, macro here works well for me=)

Yea, it doesn't navigate to the line of method definition in the macro, only to the place of the macro use, but it is already enough to get the implementation because logic will be simple=) Three methods with simple implementation.

I can implement it manually for each field, but in this case, I need to spend 10-12 lines instead of 1.

@Voxelot
Copy link
Member

Voxelot commented Oct 7, 2022

I see the advantage of having a high-level ExecutableTransaction trait that would allow us to restrict which fuel-tx Transaction variants can be used with the VM.

However, I agree with @vlopes11 and @bvrooman that having a trait for each field seems a little overkill. Wouldn't a function like get_transaction_field need to accept any possible type of input anyways in order to return the correct opcode errors? Could you give an example of how the code for get_transaction_field would be cleaner with the approach of field-level getter traits?

@xgreenx
Copy link
Contributor Author

xgreenx commented Oct 7, 2022

It is an example of fields composition which would be nice to have:

// The trait for `CoinPredicate` and `CoinSigned`
// `fuel-vm` can have a method in `ExecutableTransaction` trait like:
// `fn coin_at(index: usize) -> Option<&dyn CoinFields>;`
// Useful here: https://github.com/FuelLabs/fuel-vm/blob/master/src/interpreter/metadata.rs#L111
pub trait CoinFields {
    fn utxo_id(self) -> &UtxoId;
    fn utxo_id_mut(self) -> &mut UtxoId;
    fn utxo_id_offset(self) -> usize;

    fn owner(self) -> &Address;
    fn owner_mut(self) -> &mut Address;
    fn owner_offset(self) -> usize;

    fn amount(self) -> &Word;
    fn amount_mut(self) -> &mut Word;
    fn amount_offset(self) -> usize;

    fn asset_id(self) -> &AssetId;
    fn asset_id_mut(self) -> &mut AssetId;
    fn asset_id_offset(self) -> usize;

    fn tx_pointer(self) -> &TxPointer;
    fn tx_pointer_mut(self) -> &mut TxPointer;
    fn tx_pointer_offset(self) -> usize;

    fn maturity(self) -> &Word;
    fn maturity_mut(self) -> &mut Word;
    fn maturity_offset(self) -> usize;
}

// The trait for `MessagePredicate` and `MessageSigned`
// `fuel-vm` can have a method in `ExecutableTransaction` trait like:
// `fn message_at(index: usize) -> Option<&dyn MessageFields>;`
// Useful here: https://github.com/FuelLabs/fuel-vm/blob/master/src/interpreter/metadata.rs#L260
pub trait MessageFields {
    fn message_id(self) -> &MessageId;
    fn message_id_mut(self) -> &mut MessageId;
    fn message_id_offset(self) -> usize;

    fn sender(self) -> &Address;
    fn sender_mut(self) -> &mut Address;
    fn sender_offset(self) -> usize;

    fn recipient(self) -> &Address;
    fn recipient_mut(self) -> &mut Address;
    fn recipient_offset(self) -> usize;

    fn amount(self) -> &Word;
    fn amount_mut(self) -> &mut Word;
    fn amount_offset(self) -> usize;

    fn nonce(self) -> &Word;
    fn nonce_mut(self) -> &mut Word;
    fn nonce_offset(self) -> usize;

    fn data(self) -> &Vec<u8>;
    fn data_mut(self) -> &mut Vec<u8>;
    fn data_offset(self) -> usize;
}

// The trait for `MessagePredicate` and `CoinPredicate`
// `fuel-vm` can have a method in `ExecutableTransaction` trait like:
// `fn predicate_at(index: usize) -> Option<&dyn PredicateFields>;`
pub trait PredicateFields {
    fn predicate(self) -> &Vec<u8>;
    fn predicate_mut(self) -> &mut Vec<u8>;
    fn predicate_offset(self) -> usize;

    fn predicate_data(self) -> &Vec<u8>;
    fn predicate_data_mut(self) -> &mut Vec<u8>;
    fn predicate_data_offset(self) -> usize;
}

// The trait for `MessagePredicate`, `CoinPredicate`, `MessageSigned` and `CoinSigned`.
// This trait can be useful to calculate the sum of spendable inputs.
pub trait AmountField {
    fn amount(self) -> &Word;
    fn amount_mut(self) -> &mut Word;
    fn amount_offset(self) -> usize;
}

The idea with separate traits is that the user of fuel-tx defines fields composition by himself. We can define those traits in the fuel-tx, but in this case, other crates depend on us and can't have their compositions. As an example, amount field is duplicated in the AmountField and MessageFields, CoinFields traits.

We can avoid duplication with super traits MessageFields: AmountField and CoinFields: AmountField and removing methods for amount. But we can have a case in the future where we can't do this trick.

We can do a one-time overkill change with a separate trait for each field and minimize future changes in this direction right now, or we can try to solve problems on demand and, for now, have traits from the example above=)

…nd `Create`.

Moved all getters from `Transaction` to `Create` and `Script`.

Splitted the previous logic of the transaction into traits and implemented traits for `Create` and `Script`.

Adapted tests to use generic code or code with exactly expected types.

Prepared the code to easily integrate a new `Mint` transaction.

TODO: Add support of metadata
src/checked_transaction.rs Show resolved Hide resolved
use core::hash::Hash;

// TODO https://github.com/FuelLabs/fuel-tx/issues/148
pub(crate) fn next_duplicate<U>(iter: impl Iterator<Item = U>) -> Option<U>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to validation.rs because it is used there.

}

#[cfg(feature = "internals")]
impl Transaction {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of those functions are used in the builder.rs, so I moved them there.

use fuel_types::bytes::{self, SizedBytes};
use fuel_types::Bytes32;

impl Transaction {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All offsets moved to Create and Script accordingly. But those offsets are relative to Create and Script instead of Transaction.
To get the relative offset to the Transaction, you need to add Transaction::offset to this offset.

For example, Create::gas_price_offset right now is 0, when previously Transaction::gas_price_offset was 8 because 8 bytes were reserved by default for TransactionRepr. So right now Transaction::gas_price_offset = Create::gas_price_offset + Transaction::offset.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to remove this change, and now all offset and serialization include 8 bytes for the Transaction enum discriminant.

So Script and Create are serialized as before the change. Transaction is only a wrapper that calls methods of inner type.

Maybe it is unclear why Script and Create have 8 bytes in the canonical serialization/deserialization, but it is simpler for the user and doesn't need to remember that offsets are relative. Also serialization_size and metered_bytes are correct=)

@xgreenx
Copy link
Contributor Author

xgreenx commented Oct 13, 2022

During refactoring, I noticed that we don't charge the user for witnesses. The maximum number of witnesses is 255. Can it cause a problem? The user can always fill in 255 witnesses, 254 are random, and 1 is valid. Do we handle this case somehow in the fuel-core to clean up unused witnesses during block commit?

@Voxelot
Copy link
Member

Voxelot commented Oct 13, 2022

During refactoring, I noticed that we don't charge the user for witnesses. The maximum number of witnesses is 255. Can it cause a problem? The user can always fill in 255 witnesses, 254 are random, and 1 is valid. Do we handle this case somehow in the fuel-core to clean up unused witnesses during block commit?

We will need to setup our own rules to protect against witness data abuse since the specs don't prescribe any solution here.

Re-worked `CheckedTransaction`. Introduced a new `Checked` wrapper that allows to wrap any data in generic way.
Added covertion to "checked" enum for `Transaction`
@freesig
Copy link

freesig commented Oct 18, 2022

Overall this change looks good to me. It's a little hard to tell what has changed vs what has just been moved. @xgreenx Is there anything specific that has changed in behavior or is it purely a refactor?

@xgreenx
Copy link
Contributor Author

xgreenx commented Oct 18, 2022

It is purely refactoring, and most of the code is moved from one place to another.
But with the reworking of some entities, code also was adapted. The main example - offsets #197 (comment)

And with this adaptation, I potentially could introduce bugs=) But I hope our tests are good enough to catch all possible bugs/not expected behaviors

Implemented `Default` for `Script`.
Made checked metadata public.
Made `Stage` also public to allow writing a generic code.
Added input argument to `ConsensusParameters::tx_offset` to highlight breaking work with offsets.
@bvrooman
Copy link
Contributor

bvrooman commented Oct 20, 2022

The refactor looks good. I like the introduction of the separate Create and Script structs, as well as the highly composable traits.

I'm wondering about the introduction of the Checked<Tx: IntoChecked, S: Stage = Fully> and related structs Fully and Partially. Effectively, you have implemented a state machine for Txs:

Tx.check() --> Checked<Tx, Partially>.check() --> Checked<Tx, Fully>

My understanding is that calling the Tx's first check() gives you a partially checked Tx; what "partially checked" means is dependent on what Tx is. This is defined by the into_checked_partially() fn on implementors of IntoChecked. Calling the Tx's second check() then gives you the fully checked Tx; this means to check the Tx's signatures.

I have a couple of questions:

  • Why do we need the intermediate step for partially checked? Can we not do a single check to go from unchecked to fully checked?
  • Can the structs that implement Stage (used for the S: Stage parameter in Checked<Tx: IntoChecked, S: Stage = Fully>) be replaced with a const generic/enum?

@xgreenx
Copy link
Contributor Author

xgreenx commented Oct 20, 2022

Why do we need the intermediate step for partially checked? Can we not do a single check to go from unchecked to fully checked?

IntoChecked has two methods into_checked that returns a fully checked transaction and into_checked_partially. So you have a choice what do you really need=)

Can the structs that implement Stage (used for the S: Stage parameter in Checked<Tx: IntoChecked, S: Stage = Fully>) be replaced with a const generic enum?

Const generic enums is not allowed=( Playground

We can use something like const CHECK: bool, but I don't want to limit us with 2 possible values=)

@xgreenx xgreenx added the breaking A breaking api change label Oct 23, 2022
@xgreenx xgreenx requested a review from Voxelot October 23, 2022 22:09
bvrooman
bvrooman previously approved these changes Oct 23, 2022
Copy link
Contributor

@bvrooman bvrooman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no more questions about the code at this time. I think that we can get the changes in and start to observe how these patterns play out in practice.

src/checked_transaction.rs Outdated Show resolved Hide resolved
src/checked_transaction.rs Outdated Show resolved Hide resolved
Copy link
Member

@Voxelot Voxelot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks for the quick feedback turnaround!

The organizational changes make sense to me and have also cleaned up some duplicate processing as well, nice work!

Copy link
Contributor

@leviathanbeak leviathanbeak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@xgreenx xgreenx merged commit 779338a into master Oct 25, 2022
@xgreenx xgreenx deleted the feature/methods-refactoring branch October 25, 2022 18:43
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
breaking A breaking api change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants