Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

peer_id from announce query string interpreted as latin1 then converted to UTF-8 before being returned as a transformed and wrong value in the peer list. #460

Open
pdelagrave opened this issue Mar 31, 2023 · 5 comments
Labels

Comments

@pdelagrave
Copy link

Hello,
I'm developing a BitTorrent client from scratch for a toy project and was testing it against this tracker.
Using version 10.0.2 of the tracker from the minimages/bittorrent-tracker docker image.

When announcing to the tracker with a non-UTF-8 peer_id:
curl 'http://1.2.3.4:8001/announce?info_hash=escaped_20_bytes&peer_id=%F0%5B%E2%6F%AA%33%E5%A7%9B%1E%75%52%13%6E%B5%AD%51%57%21%98&port=1234&compact=0

The request succeeds and there's no error. The peer is added to the tracker's internal list.
Then in the response we have the peer list with that new peer where the peer_id has a different value than the one provided in the request.
In this example the transformed peer_id will be: c3b05bc3a26fc2aa33c3a5c2a7c29b1e7552136ec2b5c2ad515721c298

It's a problem because peers can't communicate with a peer which the peer_id is misrepresented by the tracker.

Using the NodeJS API we can achieve that result with:

input = 'f05be26faa33e5a79b1e7552136eb5ad51572198';
latin1_str = Buffer.from(input, 'hex').toString('binary');  // As per the Node docs, 'binary' is the same as 'latin1'.
// the latin1_str == 'ð[âoª3å§\u001euR\u0013nµ­QW!'
Buffer.from(latin1_str, 'utf8').toString('hex'); // == 'c3b05bc3a26fc2aa33c3a5c2a7c29b1e7552136ec2b5c2ad515721c298'

In the "trackers" section of BEP0003 the GET request for announce, for the query parameter peer_id:

A string of length 20 which this downloader uses as its id. Each downloader generates its own id at random at the start of a new download. This value will also almost certainly have to be escaped.

BEP0020 also specifies that the usage of peer_id as a client identification method uses only ascii characters. So there shouldn't be a need to parse the bytes as latin1 or UTF-8 or from latin1 to UTF-8.

I didn't dig much into the codebase of the tracker (I'm quite unfamiliar with javascript and nodejs) but I've seen Buffer.from() and 'binary' used around and it might be the cause.

Thanks!

@ThaUnknown
Copy link
Member

ThaUnknown commented Mar 31, 2023

can you reproduce this on 9.18.6?

@pdelagrave
Copy link
Author

Same problem with 9.18.6.
Also realized that the client list in the /stats page assumes the peer ids to be text strings and not bytes, they aren't escaped and displayed as is.

@github-actions
Copy link

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

@ThaUnknown
Copy link
Member

could you walk me tru this.... slowly? I might be disabled, the issue lies in ENCODING the data, when making a request to the tracker, not decoding? to me it seems like the issue would lie on both ends?

@lirener
Copy link

lirener commented Jan 20, 2024

I may be having a similar issue to you?(我可能遇到了和你类似的问题?)
#508

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants