Talk:BitTorrentSpecification

From TheoryOrg


Talk page for BitTorrentSpecification

There is an active developers' mailing list on ibiblio.org: bittorrent@lists.ibiblio.org, or http://lists.ibiblio.org/mailman/listinfo/bittorrent

The developer IRC channel is #btdevel on irc.freenode.net.

For the specification, there is also this talk page.

Sections Under Dispute

Tracker Response

Another ambiguity in the wording of the Implementor's Note:

It currently reads:

"When a new piece has completed download, HAVE messages (see below) will need to be sent to most active peers"

For increased clarity it should read (assuming I understand the intent correctly):

"When a new piece has completed download, HAVE messages (see below) will need to be sent to the majority of peers that are active."

One could interpret the text as it currently reads in the following way which I am pretty sure is wrong:

"When a new piece has completed download, HAVE messages will need to be sent to the peers that are the most active."

Of course, wording #1 begs the question: Why only a majority of peers and not all peers?

Wording #2 begs its own question: how does one define "most active"?

Messages: bitfield

The specification says:

"A bitfield of the wrong length is considered an error. Clients should drop the connection if they receive bitfields that are not of the correct size, or if the bitfield has any of the spare bits set."

What is the right length? Is it equivalent to the number of pieces in the torrent? Or, is it right simply by agreeing with the specified message length?

Sleek says: The right length in bytes is payload_piece_count divided by 8, rounded up, i.e. (payload_piece_count + 7) / 8 with integer division. Each piece is encoded as one bit, so we have 8 pieces per byte. In the case of 1592 pieces, there should be 199 bytes, and no spare bits. In the case of 1595 pieces, there should be 200 bytes, and 5 spare bits (200 x 8 - 1595). Of course this is only for X (from len=0001+X).
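A minimal sketch of the length check, assuming one bit per piece rounded up to whole bytes (which matches both worked examples above) and the spec's convention that the high bit of the first byte is piece 0, so any spare bits are the low bits of the last byte. The function names are made up for illustration and are not taken from any client.

```python
def bitfield_length(piece_count: int) -> int:
    """Payload size X in bytes: one bit per piece, rounded up to a whole byte."""
    return (piece_count + 7) // 8


def spare_bits(piece_count: int) -> int:
    """Unused bits at the end of the last byte."""
    return bitfield_length(piece_count) * 8 - piece_count


def is_valid_bitfield(payload: bytes, piece_count: int) -> bool:
    """True if the payload has the right length and no spare bit is set."""
    if len(payload) != bitfield_length(piece_count):
        return False
    spare = spare_bits(piece_count)
    if spare == 0:
        return True
    # Piece 0 is the high bit of byte 0, so the spare bits are the
    # lowest `spare` bits of the final byte.
    mask = (1 << spare) - 1
    return (payload[-1] & mask) == 0
```

With 1595 pieces this expects 200 bytes and rejects any payload whose last byte has one of its low 5 bits set.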


Also, the specification says:

"The bitfield message is variable length, where X is the length of the bitfield"

Does this mean X is the length of the bitfield in bits, or bytes? The wording is unclear. I assume it's in bytes, but for newbies like myself it should be clarified.

Sleek says: X is the length in bytes. The final byte may contain spare bits.

Messages: request

EHeM's view

It is tricky to get one's hands on versions old enough. The change from 32KB requests to 16KB requests in the mainline happened somewhere around 3.4.2 or 3.4.1 (hmm, okay earlier than I thought). Until 4.0, the mainline would allow requests up to 128KB. Note that BitTornado forked off prior to 3.4. Further note that the official specification still lists 32KB and 128KB as the sizes!

At the same time I'm highly skeptical of the benefit of smaller requests. Given that the minimum timeslice is 10 seconds, and 5 unchokes, you'll devote 2 seconds of bandwidth to each peer in each choke period. On a link with 1 Mbps bandwidth, that will be 256 KB (2 Mbit / 8 = 256 KB) of data. For 32 KB requests, that is 8 requests; with 16 KB requests, that is 16 requests. Is finer-grained throttling really necessary in this situation?
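The arithmetic above can be checked with a few lines. This is only a sketch of the estimate as stated, assuming binary units (1 Mbit = 1024 x 1024 bits, 1 KB = 1024 bytes) and that the upload bandwidth is split evenly across the unchoked peers; the function name is invented for this example.

```python
KIB = 1024
MBIT = 1024 * 1024


def requests_per_choke_period(link_bps: int, period_s: float,
                              unchoked_peers: int, request_bytes: int) -> float:
    """How many requests of a given size one peer is served per choke period."""
    seconds_per_peer = period_s / unchoked_peers      # 10 s / 5 peers = 2 s
    bytes_per_peer = link_bps * seconds_per_peer / 8  # bits -> bytes
    return bytes_per_peer / request_bytes


for size_kib in (32, 16):
    n = requests_per_choke_period(1 * MBIT, 10, 5, size_kib * KIB)
    print(f"{size_kib} KB requests: {n:g} per peer per choke period")
```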

Another note, uau, you're stating you've changed it 3 times. Interestingly, I can only account for 1 of those times. This seems to suggest the majority view disagrees with you.

Reply from uau

You're wrong about when the mainline size change happened. It was before version 3.1. The official spec was just never updated; yes it did (and some version still does?) contain completely outdated information. The earlier claims you wrote about slice sizes in use were simply false.

There are situations where larger slices would be OK. However you clearly don't understand all the issues involved, and so you should not use the article as a place for dubious implementation ideas you made up yourself. An example of something you failed to consider: you cannot send any protocol data in the middle of uploading a slice yourself - consider this in view of the queuing issue below.

At least the last two versions I fixed had been broken by you. "Majority view" is irrelevant since what you wrote contained claims which were simply false, and rather easily verifiable to be so.

EHeM's reply

Thus demonstrating that some folks read the specification and expect that to be correct, and some folks read the code knowing it does not conform to the specification. I concede that the change was earlier than 3.3 (confirmed by pulling the tarball off my backups). I hadn't checked the actual code since that portion was irrelevant to the experiments I've been conducting. I've edited "View #1" to be closer to what you want, any response? I like keeping the mention of the historic size since it is still "correct" (I guess I have been sucked into academia).

I'm not failing to consider that one cannot advertise possession of a piece in the middle of a PIECE/block upload message. Thing is, as long as piece size divided by block size (number of gaps during transmission) is greater than the number of peers unchoking you (maximum number of simultaneously in-flight pieces), on average you won't have back-to-back HAVEs. Even with a fair number of back-to-back HAVEs, you're likely updating information faster than peers can respond, and therefore the damage to throughput is negligible.

Choose your words carefully. You're implying that it was changed with deliberate malice in mind. This is not so; I was just incorrectly confident due to reading the wrong source of information.

Algorithms: Queuing

EHeM's view

Well, Debian still has a BitTorrent package for mainline 3.4.2, so there is at least one valid client out there that still uses a queue with a static depth of 5. Also, that was an example; that approach used to provide decent performance (links are now too high-bandwidth for a queue that shallow).

Do you really dispute whether queueing requests is necessary? Given a link with 10 Mbps bandwidth and 100 ms round-trip time (hopefully you'll concede that is a reasonable low-end example as of this writing), a 16 KB (128 Kbit) block will be downloaded with 0.0125 s worth of bandwidth. Given 5 unchoking peers, that works out to 0.0625 s for downloading that block. If one does not queue, one will then be waiting for 0.1 s of round-trip time for the next block to start arriving. Of course queueing is bloody darn important!
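To put numbers on that example, here is a rough sketch of the same arithmetic. It assumes binary units (10 Mbit = 10 x 1024 x 1024 bits), an even split of the link across the 5 unchoking peers, and exactly one outstanding request per peer; the idle-fraction figure is derived from those assumptions and is not a measurement.

```python
KIB = 1024
MBIT = 1024 * 1024


def block_transfer_seconds(block_bytes: int, link_bps: int) -> float:
    """Time to move one block if the whole link were devoted to it."""
    return block_bytes * 8 / link_bps


link_bps = 10 * MBIT
rtt_s = 0.1
peers = 5

full_link = block_transfer_seconds(16 * KIB, link_bps)  # whole link for one block
shared = full_link * peers                               # link shared by 5 peers

# With no queue, each block is followed by one round trip of silence
# before the next block starts arriving.
idle_fraction = rtt_s / (shared + rtt_s)

print(f"block time on full link: {full_link:.4f} s")
print(f"block time when shared:  {shared:.4f} s")
print(f"fraction of time idle:   {idle_fraction:.0%}")
```

Under these assumptions the connection would sit idle for more than half of the time, which is the point of the paragraph above.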

Finally, the algorithm suggestion was more or less just that, a suggestion: an algorithm that is feasible to implement and that in theory should provide good performance. Static queues also work, but one must be very careful to ensure the queue depth is high enough! With modern links, queueing 30 blocks or more seems like a more reasonable baseline (note that out-of-order packets need to be accounted for with a larger queue; 100 ms is merely average).
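One way to reason about "high enough" is the bandwidth-delay product: the queue should hold at least as many blocks as can be in flight during one round trip. The sketch below is an illustration of that idea only, not the algorithm from the article and not what any particular client does; the function name and parameters are hypothetical. It gives a lower bound and ignores jitter and reordering, which is why a real queue would want headroom above it.

```python
import math

KIB = 1024
MBIT = 1024 * 1024


def min_queue_depth(link_bps: int, rtt_ms: int, block_bytes: int,
                    peers: int = 1) -> int:
    """Lower bound on outstanding requests per peer to keep the link busy.

    Assumes the link is split evenly across `peers` and that rtt_ms is the
    round-trip time to each of them.
    """
    in_flight_bytes = link_bps * rtt_ms / (8 * 1000 * peers)
    return max(1, math.ceil(in_flight_bytes / block_bytes))


# 10 Mbps link, 100 ms round trip, 16 KB blocks
print(min_queue_depth(10 * MBIT, 100, 16 * KIB))            # single peer
print(min_queue_depth(10 * MBIT, 100, 16 * KIB, peers=5))   # five peers
print(min_queue_depth(100 * MBIT, 100, 16 * KIB))           # faster link
```

The bound grows linearly with bandwidth and latency, so a fixed depth that was adequate on older links falls short on faster ones.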

Reply from uau

No, I don't dispute that queuing requests is necessary. I dispute 1) the factually incorrect information you wrote, and 2) your use of the article to write clueless suggestions you made up yourself that would just mislead readers. I'm sure the readers can come up with bad ideas themselves if they're in need of those.

Btw I see you added "removed personal insults by uau" to the change log list. The "personal insults" apparently referring to the text noting how you had added the earlier incorrect information. It is a fact that you did add falsehoods to the page (including ones that you cannot possibly try to justify as "matters of opinion", such as the request size values being used by clients). If you don't want such things to be mentioned with the name of the earlier editor then you should rethink your own change log entry.

EHeM's reply

Well, as noted above I pretty well concede the 16KB versus 32KB issue. "View #1" was changed to reflect this. If you felt it was necessary to remove the example of a possible dynamic algorithm, you should have done that and only that. I wouldn't have done anything in that case. I was trying to suggest you freely modify View #2 to reflect how you think the section should appear. Is the changed View #1 acceptable?

Please look up the word "falsehood" in a dictionary (Wiktionary will do fine). You will note that it suggests deliberate malice. That I consider an insult. Placing my name there makes it personal. In written work, word choice is crucial. I won't claim I'm a wonder of good word choice myself (far from it), but that one was rather poor if you weren't meaning to give personal insult.

Note that this is the second attempt at resolving this equitably. I'd pointed to the mailing list in the changelog earlier in an attempt to get you there; you could have pointed to this discussion page as well. I don't like cleaning up nasty business, but you cannot accuse me of not trying.

One further note, I didn't originate the section in roughly its present form. The commentary is what I'd added (and now mentioning 16KB blocks instead of 32KB blocks). With clients that have a statically sized queue, it is a highly crucial performance item for people on high bandwidth links.

exchanging multiple torrents between two peers?

Say two clients are both downloading the same two torrents (or client 2 is seeding one or both torrents), and they are connected to each other.

Now client 1 requests a block of a piece from client 2, say piece number 5. It sends a "request" message: <len=0013><id=6><index=5><begin=0><length>

How does client 2 know which of the two torrents client 1 wants a piece of?

And when client 2 sends back a "piece" message: <len=0009+X><id=7><index=5><begin=0><block>, how does client 1 know which torrent this piece belongs to?

Happyjack27 17:47, 16 April 2008 (UTC)

Each torrent is on a separate TCP connection -- Coderjoe 02:20, 22 April 2008 (UTC)
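To illustrate the answer: neither message carries a torrent identifier, because the torrent is fixed for the whole connection by the 20-byte info_hash sent in the handshake (<pstrlen><pstr><reserved><info_hash><peer_id>). A small sketch of the wire layout, with hypothetical helper names; note that in the spec the request message has a fixed length prefix of 13, while piece is 9 + block length.

```python
import struct


def encode_request(index: int, begin: int, length: int) -> bytes:
    """<len=0013><id=6><index><begin><length>, all integers big-endian."""
    return struct.pack(">IBIII", 13, 6, index, begin, length)


def encode_piece(index: int, begin: int, block: bytes) -> bytes:
    """<len=0009+X><id=7><index><begin><block>, where X = len(block)."""
    return struct.pack(">IBII", 9 + len(block), 7, index, begin) + block


def info_hash_from_handshake(handshake: bytes) -> bytes:
    """Extract the info_hash that binds this connection to one torrent."""
    pstrlen = handshake[0]
    start = 1 + pstrlen + 8          # skip pstrlen byte, pstr, reserved bytes
    return handshake[start:start + 20]


# A receiver would look the torrent up by connection, e.g. a dict mapping
# each open socket to the info_hash seen in that socket's handshake.
handshake = b"\x13BitTorrent protocol" + bytes(8) + b"A" * 20 + b"B" * 20
print(info_hash_from_handshake(handshake))
print(encode_request(5, 0, 16384).hex())
```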
