Skip to content
This repository was archived by the owner on Aug 2, 2021. It is now read-only.

swarm: Added lightnode flag #816

Closed
wants to merge 10 commits into from
Closed

swarm: Added lightnode flag #816

wants to merge 10 commits into from

Conversation

agazso
Copy link
Member

@agazso agazso commented Jul 26, 2018

Added --lightnode command line parameter
Added LightNode to Handshake message

This is the Task0 of the Light Node implementation.
For more details see https://hackmd.io/O5PvHQVhQnaNCbhJbAJVog?view

Added --lightnode command line parameter
Added LightNode to Handshake message
}

log.Info("peer created: %s", handshake.peerAddr.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a log.Debug, otherwise logs will be full of it. Also use structured logs:

log.Debug("peer created", "addr", handshake.peerAddr.String())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed logging

)

if err != nil {
t.Fatal(err)
}

if pt.bzz.handshakes[id].LightNode != peerLightNode {
t.Fatal(fmt.Sprintf("peer LightNode flag is %v, should be %v", pt.bzz.handshakes[id].LightNode, peerLightNode))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use Fatalf, instead of Fatal + Sprintf

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to Fatalf

@@ -30,6 +30,8 @@ import (
p2ptest "github.com/ethereum/go-ethereum/p2p/testing"
)

const DefaultVersion = 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a better name for this, and a godoc - what is this DefaultVersion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it to TestProtocolVersion

id := s.IDs[0]

err := s.testHandshake(
correctBzzHandshake(addr),
&HandshakeMsg{Version: 0, NetworkID: 3, Addr: NewAddrFromNodeID(id)},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about changing the hardcoded values to extracted variables. The idea is - if someone changes the DefaultNetworkID in the code, a test should fail, because this is changing the protocol and the contract. Peers would no longer be able to communicate.

You must know very well what you are doing when changing this, and I am afraid that hacking the actual value we tests against (in this case DefaultNetworkID and Version) has the opposite effect - tests will automatically pass, and you will notice that you changed the contract only once you deploy this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a new const called TestProtocolNetworkID and used that in tests. I understand now that why using the DefaultNetworkID was a bad idea, however generally it's a good practice to not used hardcoded constants in the test code (or for that matter in any code), because it may lead to errors in unrelated tests when you change the constant in NetworkID tests.

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with TestProtocolNetworkID

@@ -303,6 +308,12 @@ func envVarsOverride(currentConfig *bzzapi.Config) (config *bzzapi.Config) {
}
}

if lightnodeenable := os.Getenv(SWARM_ENV_LIGHT_NODE_ENABLE); lightnodeenable != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, but this var has a scope of 2 lines, it could very well be just lne ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the name to lne

@@ -30,6 +30,8 @@ import (
p2ptest "github.com/ethereum/go-ethereum/p2p/testing"
)

const DefaultVersion = 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be 6, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to TestProtocolNetworkID. As far as I understand it can be anything that is not 0 based on the existing tests. Feel free to correct me if I am wrong.

}
}

func TestBzzHandshakeLightNodeOn(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of those 2 tests? I think you only need one of them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason I did two tests were to make sure it is set correctly independently of the default value. If we only check the On case (where LightNode is True), then it may happen that in the code it's set to True incorrectly buy tests wouldn't catch that.

I don't have a strong opinion about it but generally I think it doesn't hurt to have an extra test and it may cover errors as described above.

@@ -123,6 +123,11 @@ var (
Usage: "Duration for sync subscriptions update after no new peers are added (default 15s)",
EnvVar: SWARM_ENV_SYNC_UPDATE_DELAY,
}
SwarmLightNodeEnabled = cli.BoolTFlag{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that BoolFlag should be used instead BoolTFlag, if the default value is false.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it to BoolFlag

@GitCop
Copy link

GitCop commented Jul 30, 2018

There were the following issues with your Pull Request

  • Commit: 62deb0ba09e15f3d67340119a4b010cb00fa6192
  • Your subject line is longer than 72 characters

Guidelines are available at https://github.com/ethersphere/go-ethereum


This message was auto-generated by https://gitcop.com

@@ -30,6 +30,11 @@ import (
p2ptest "github.com/ethereum/go-ethereum/p2p/testing"
)

const (
TestProtocolVersion = 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current version is 5. You are changing the handshake message, which means that you need to bump this to 6, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bumped it to 6 to reflect the fact that the handshake is changed. However I think changing here has little significance, because it is only used in the scope of these tests and it is only used in one test for comparing with zero.

I think in order to change it in the protocol, one has to change the version in the BzzSpec that is inside swarm/network/protocol.go.

Also I remember in one of our hangout calls that it may not be necessary to bump this, @janos can you chime in?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I was wrong about the significance of changing this in the tests, because if I change it there and not in BzzSpec the tests will fail.

Still the question is valid, should we change it in BzzSpec and therefore here too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran some tests.

If we change the version, then nodes running version 6 won't connect to nodes that are running version 5 (as expected).

However if we just add the LightNode flag to the handshake without changing the version, the nodes are able to connect to each other. This is a bit surprising to me, but this means it's not strictly necessary to change the version at this point, because currently the code doesn't change any functionality.

In any case in the future it seems like a good idea to bump the version when add the actual LightNode functionality to Kademlia, because from that point nodes will definitely work differently, hence the need for the version change.

So I think the question is should we change the version two times (now and when the Kademlia changes are implemented) or just once (only after the Kademlia changes)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that adding a field to a message is not a breaking change for protocol. It just requires that default value should be handled, and since default for light node flag is false, then it is handled implicitly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I thought nodes won't be able to deserialise the new value. If they can, then perfect!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janos That's what my experiment suggests too, that just adding a LightNode field to the Handshake message works even when connecting to nodes which do not support the LightNode flag.

So basically you suggest that we keep the version right now at 5 and change it to 6 when we actually introduce breaking change in the functionality in Kademlia? @nonsense wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, kept the version at 5 right now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently I don't understand exactly the RLP serialization - will play with this locally, and I trust you that it works. Since this is the case, keep it at 5, no need to upgrade.

@@ -225,16 +240,59 @@ func TestBzzHandshakeVersionMismatch(t *testing.T) {
}

func TestBzzHandshakeSuccess(t *testing.T) {
Copy link
Contributor

@nonsense nonsense Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following 3 tests are basically identical. I think this is introducing too much code with too little test value. Please create one test with a table-driven params, where you have both options - light node on/off. https://github.com/golang/go/wiki/TableDrivenTests

if pt.bzz.handshakes[id].LightNode != peerLightNode {
t.Fatalf("peer LightNode flag is %v, should be %v", pt.bzz.handshakes[id].LightNode, peerLightNode)
}
var lightNodeTests = []struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This var should be scoped within the test, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the var inside the function

Copy link
Contributor

@nonsense nonsense left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

provided @janos approves, you can open on ethereum/go-ethereum and have @gbalint merge it.

@agazso
Copy link
Member Author

agazso commented Jul 31, 2018

Is it possible that the Go 1.9 test is flaky and that's why the Travis build failed? Have you seen similar before?

@nonsense
Copy link
Contributor

nonsense commented Jul 31, 2018

@agazso yes, you are hitting a flaky test on the Go 1.9 build. Ignore or click rebuild. This one in particular is quite rare.

Copy link
Member

@janos janos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nonsense
Copy link
Contributor

nonsense commented Aug 7, 2018

@agazso We are getting a lot of errors in the open cluster, which includes this PR:


swarm_30399 DEBUG[08-07\|13:54:42.104\|p2p/server.go:703] Adding p2p peer name=swarm/v0.3.1-225171a... addr=137.117.226.225:57364 peers=4
--
swarm_30399 WARN [08-07\|13:54:42.104\|swarm/network/protocol.go:254] 27991d4d: handshake failed with remote peer fb24c2c5: Handshake error: Invalid message (RLP error): <= msg #0 (203 bytes): rlp: too few elements for network.HandshakeMsg
swarm_30399 DEBUG[08-07\|13:54:42.104\|p2p/server.go:721] Removing p2p peer id=daca9ff0c3176045 conn=inbound duration=444.603µs peers=3 req=false err="27991d4d: stream protocol closed: Handshake error: Invalid message (RLP error): <= msg #0 (203 bytes): rlp: too few elements for network.HandshakeMsg"


This is the backward-incompatible change I mentioned, and this is why I wanted to bump up the protocol version.

How did you test this to come to the conclusion that RLP can handle both versions of the HandshakeMsg?

@nonsense nonsense closed this Aug 14, 2018
@nonsense
Copy link
Contributor

@gbalint @agazso note that this is backward-incompatible with previous versions. We should probably bump up the protocol versions.

@agazso
Copy link
Member Author

agazso commented Aug 14, 2018

@nonsense That is correct, it is indeed backward incompatible, my tests were incomplete before.

I agree that we should bump the protocol version. The question is how should we do it, because depending on the versioning schema swarm uses, the swarm version should be changed to reflect it is a backward incompatible change. For example with semver it's a minor version change.

@nonsense
Copy link
Contributor

semver for version 0.x.y does not say we should be backward compatible - we can change APIs at any time. This doesn't mean we should, but we could.

@agazso
Copy link
Member Author

agazso commented Aug 14, 2018

@nonsense Sorry, I am a bit confused :)

To my understanding we changed the network API in a way that is backwards incompatible. People who are using Swarm may expect that change is reflected in a Swarm version number change. I don't know the policy of Swarm dealing about these cases.

It's easy just to bump the protocol version, I am asking about the specific steps to do that:

  • shall we create a new branch for it?
  • when will it be released? is it when you release a new minor version?

@nonsense
Copy link
Contributor

We changed the API in a non-backward compatible way with this PR, and it has already been merged. Nodes no longer can connect to each other, and we are currently still using the same version for the protocol, which is bad.

We should at least increase the protocol version.

This will be released with 0.3.2 , which won't be compatible with 0.3.1, because of this change.

I hope that makes sense.

@agazso
Copy link
Member Author

agazso commented Aug 14, 2018

@nonsense I created a PR for the protocol version change: #888

# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants