lupinthird/picotorrent


picotorrent: uTP and BitTorrent v1 tooling

This repository implements protocol pieces described in the HTML copies under docs/, with a focus on uTP (BEP 29) and BitTorrent v1 (BEP 3) plus several common extensions used in the wild.

Contents at a glance

| Area | Location | Role |
| --- | --- | --- |
| uTP (micro transport) | utp/ | UDP-based transport: packets, congestion, blocking UtpSocket |
| picotorrent | picotorrent/ | Strict bencode, metainfo parsing, trackers, webseeds, DHT helpers, download/seed session, CLI |
| Assigned numbers (BEP 4) | bittorrent_constants.py | Wire message IDs and reserved handshake bits |
| Peer ID conventions (BEP 20) | bittorrent_peer_ids.py | Known client prefixes and peer-id parsing helpers |

Requirements: Python 3.10 or newer (tomllib is used when available, i.e. on Python 3.11+; Python 3.10 falls back to a small regex read of pyproject.toml).


uTP (BEP 29)

Implemented from docs/bep_0029.rst_post.html:

  • v1 header encode/decode (type/version nibble layout, extensions, selective ACK)
  • Packet types: ST_SYN, ST_STATE, ST_DATA, ST_FIN, ST_RESET
  • Congestion and timeout behavior aligned with the BEP
  • Blocking UDP-backed UtpSocket (connect, accept, sendall, recv, close)
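For orientation, the 20-byte v1 header from BEP 29 can be packed and unpacked with struct as sketched below. The function names and the dictionary return shape are assumptions for illustration and are not the API of utp/; extension parsing (including selective ACK) is left out.

```python
import struct

# BEP 29 packet types, stored in the high nibble of the first byte.
ST_DATA, ST_FIN, ST_STATE, ST_RESET, ST_SYN = range(5)
UTP_VERSION = 1

# type/ver, extension, connection_id, timestamp_us, timestamp_diff_us,
# wnd_size, seq_nr, ack_nr: 20 bytes, all big-endian.
_HEADER = struct.Struct(">BBHIIIHH")


def pack_header(ptype, conn_id, ts_us, ts_diff_us, wnd_size, seq_nr, ack_nr,
                extension=0):
    """Encode a v1 header; counters wrap to their field width."""
    first = (ptype << 4) | UTP_VERSION
    return _HEADER.pack(first, extension, conn_id & 0xFFFF,
                        ts_us & 0xFFFFFFFF, ts_diff_us & 0xFFFFFFFF,
                        wnd_size & 0xFFFFFFFF, seq_nr & 0xFFFF, ack_nr & 0xFFFF)


def unpack_header(data):
    """Decode the fixed header and validate the type/version nibbles."""
    if len(data) < _HEADER.size:
        raise ValueError("packet shorter than the 20-byte uTP header")
    first, extension, conn_id, ts_us, ts_diff_us, wnd_size, seq_nr, ack_nr = (
        _HEADER.unpack_from(data))
    ptype, version = first >> 4, first & 0x0F
    if version != UTP_VERSION or ptype > ST_SYN:
        raise ValueError("not a uTP v1 packet")
    return {"type": ptype, "extension": extension, "connection_id": conn_id,
            "timestamp_us": ts_us, "timestamp_diff_us": ts_diff_us,
            "wnd_size": wnd_size, "seq_nr": seq_nr, "ack_nr": ack_nr}


syn = pack_header(ST_SYN, conn_id=0x1234, ts_us=1, ts_diff_us=0,
                  wnd_size=65536, seq_nr=1, ack_nr=0)
print(syn.hex())
```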

Example:

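from utp import UtpSocket

# Client side; assumes a uTP listener is already accepting on 127.0.0.1:9000.
sock = UtpSocket()
sock.bind(("0.0.0.0", 0))          # any local address, ephemeral UDP port
sock.connect(("127.0.0.1", 9000))  # blocks until the uTP handshake completes
sock.sendall(b"hello over utp")
print(sock.recv())                 # blocks until the peer sends data
sock.close()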

This stack is suitable as a reference and for experiments; it is not a full production uTP implementation (pacing, PMTU, full interoperability tuning, and so on are out of scope unless extended).


picotorrent (picotorrent/)

Metainfo and bencode (BEP 3)

  • Strict bencode decode with dictionary key ordering checks (suitable for info-dictionary hashing)
  • Single-file and multi-file torrents
  • Fields surfaced in reports include: announce, announce-list, url-list (web seeds, BEP 19), nodes (DHT bootstrap hints, BEP 5), private (BEP 27), comments, creation metadata, piece table, info hash (SHA-1 of raw info bytes in the file)
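To illustrate why strict decoding matters for info-dictionary hashing, here is a minimal sketch of a strict bencode decoder and an info-hash helper. It is not the parser in picotorrent/; the names bdecode and info_hash are hypothetical. The key point is that the hash is taken over the raw bytes of the info value as they appear in the file, never over a re-encoded copy.

```python
import hashlib


def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value at data[pos:]; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        digits = data[pos + 1:end]
        body = digits[1:] if digits[:1] == b"-" else digits
        # Strict: no empty integers, no leading zeros, no negative zero.
        if (not body.isdigit() or (body[:1] == b"0" and len(body) > 1)
                or digits == b"-0"):
            raise ValueError("malformed integer")
        return int(digits), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result, previous_key = {}, None
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            if not isinstance(key, bytes):
                raise ValueError("dictionary key is not a byte string")
            # Strict: keys must be unique and sorted as raw bytes.
            if previous_key is not None and key <= previous_key:
                raise ValueError("dictionary keys out of order")
            previous_key = key
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length_digits = data[pos:colon]
        if not length_digits.isdigit() or (
                length_digits[:1] == b"0" and len(length_digits) > 1):
            raise ValueError("malformed string length")
        start = colon + 1
        end = start + int(length_digits)
        if end > len(data):
            raise ValueError("string runs past end of input")
        return data[start:end], end
    raise ValueError("unexpected byte or end of input")


def info_hash(torrent: bytes) -> str:
    """SHA-1 over the raw bytes of the top-level info value."""
    if torrent[:1] != b"d":
        raise ValueError("metainfo must be a dictionary")
    pos = 1
    while torrent[pos:pos + 1] != b"e":
        key, pos = bdecode(torrent, pos)
        start = pos
        _, pos = bdecode(torrent, pos)
        if key == b"info":
            return hashlib.sha1(torrent[start:pos]).hexdigest()
    raise KeyError("info")
```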

Python API:

from picotorrent import parse_torrent_file, format_torrent_report

meta = parse_torrent_file("example.torrent")
print(format_torrent_report(meta))

Download and seeding

The Downloader and Seeder classes in picotorrent/session.py implement a basic end-to-end path:

  • HTTP and UDP tracker announces (BEP 3 / BEP 15), compact peer lists (BEP 23)
  • Optional DHT get_peers when nodes is present in the torrent and the torrent is not private
  • PEX parsing when peers send ut_pex
  • Web seeds (BEP 19 / GetRight-style): if tracker-based peers fail, the downloader tries url-list HTTP(S) URLs, validates piece hashes against the metainfo, and writes output (single file or multi-file tree)
  • Transport: peer connections use TCP by default. The downloader switches to uTP (BEP 29) in two cases: when a peer was advertised with PEX flag 0x04 (supports uTP, BEP 11), or when the remote extension handshake lists ut_holepunch while the current connection is still TCP (BEP 55 implies uTP support for that endpoint), in which case it retries that peer over uTP once. Otherwise traffic stays on TCP.
  • Piece picking (see docs/BitTorrent Request and Choking Algorithms.md for block pipelining): the first piece is chosen at random; once one piece is complete, the downloader requests the lowest missing piece index the peer offers, i.e. sequentially through the concatenated payload. Strict rarest-first is avoided because it often defers common low-index pieces, where early multi-file entries (e.g. images) live, until the end of the job, even though each piece is written as soon as it verifies. Each piece is fetched as pipelined 16 KiB block requests. The downloader waits for unchoke and learns availability from bitfield / have messages before requesting. Peer order favors peers that have delivered more bytes on failed rounds.
  • On-disk layout: as soon as a download starts, the client creates the folder tree and preallocates each output file to its metainfo size (existing files are kept and resized only if their length is wrong). Each piece received from a peer is written after its SHA-1 check using os.write + fsync (no stdio buffering), with SIGINT briefly ignored around that critical section so Ctrl+C is less likely to interrupt between write and sync. Only complete, hash-verified pieces are persisted; the piece in flight at interrupt time is not written. If the torrent uses a very large piece length, few pieces may finish before you stop, so the file can stay mostly zeros until whole pieces complete.
  • Persistent session loop with --session-timeout instead of failing on the first refused connection
  • Extension protocol bit and fast-extension reserved bits set on the wire; extension handshake advertises ut_metadata, ut_pex, ut_holepunch where applicable

This is still a reference client, not a full swarm engine (no parallel multi-peer piece picking, no full endgame cancel fan-out, no 10 s rechoke timer, and choking is not modeled on the seeder beyond basic unchoke).
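The on-disk behavior described in the list above (preallocate, verify, then os.write + fsync with SIGINT ignored) might be sketched like this. The helper names are hypothetical, the sketch handles a single output file only, and it ignores SIGINT only when called from the main thread, because Python allows signal handlers to be changed only there.

```python
import hashlib
import os
import signal
import threading


def preallocate(path: str, size: int) -> int:
    """Open (or create) path, resize only if the length is wrong, return fd."""
    flags = os.O_RDWR | os.O_CREAT | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o644)
    if os.fstat(fd).st_size != size:
        os.ftruncate(fd, size)
    return fd


def write_verified_piece(fd: int, offset: int, data: bytes,
                         expected_sha1: bytes) -> bool:
    """Persist one piece only if its SHA-1 matches; return True on write."""
    if hashlib.sha1(data).digest() != expected_sha1:
        return False
    in_main = threading.current_thread() is threading.main_thread()
    previous = signal.signal(signal.SIGINT, signal.SIG_IGN) if in_main else None
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # force the piece to stable storage before returning
    finally:
        if in_main and previous is not None:
            signal.signal(signal.SIGINT, previous)
    return True
```

A piece that fails verification leaves the file untouched, which matches the rule that only complete, hash-verified pieces are persisted.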

Peer identity

The client peer id is 20 bytes, Azureus-style (BEP 20):

  • Prefix: -pT + four digits + -
  • The four digits come from [project].version in pyproject.toml: major and minor are each encoded as two decimal digits (0–99), e.g. 0.2.0 becomes 0200. Patch and pre-release labels are not encoded in those four digits.

If digit generation fails, the code falls back to 0001.


CLI: btinspect

The same entry point covers inspection, download, seeding, and the optional GUI. After installing the package (pip install -e .), the script name is btinspect. During development you can run without installing:

python -m picotorrent <subcommand> ...

A subcommand is always required; there is no default subcommand.

inspect — print metainfo

btinspect inspect <torrent> [--show-pieces]
python -m picotorrent inspect <torrent> [--show-pieces]

| Argument | Description |
| --- | --- |
| torrent | Path to .torrent |
| --show-pieces | Print every piece SHA-1 (can be very long) |

download — download content

btinspect download <torrent> [--out-dir DIR] [--peer HOST:PORT]
    [--session-timeout SECONDS] [--debug-handshake]
python -m picotorrent download <torrent> [options...]

| Option | Default | Description |
| --- | --- | --- |
| torrent | | Path to .torrent |
| --out-dir | . | Directory for output (single file or top-level multi-file folder) |
| --peer | (none) | Force a single host:port peer; otherwise use tracker + DHT + PEX discovery |
| --session-timeout | 300 | Seconds to keep retrying discovery and connections before giving up |
| --debug-handshake | off | Log each connect attempt and handshake send/receive fields (hex where relevant) |

On success, the CLI prints the path to the written file or directory.

seed — serve bytes for a torrent

btinspect seed <torrent> <data> [--host HOST] [--port PORT]
python -m picotorrent seed <torrent> <data> [options...]

| Argument | Description |
| --- | --- |
| torrent | Path to .torrent |
| data | Path to one file whose bytes are the torrent payload (single-file torrents). For multi-file torrents this must be the concatenation of all files in the order listed in the metainfo (same layout the protocol uses internally). |
| --host | Bind address (default 0.0.0.0) |
| --port | Listen port (default 6881) |

The seeder runs until interrupted (Ctrl+C). Clients must connect to this listener with the same info hash.
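Because pieces are defined over the concatenated payload, a piece in a multi-file torrent can straddle file boundaries. The sketch below maps a byte range in the concatenated payload back to per-file segments; split_range and the (path, size) tuple layout are assumptions for illustration, not part of the picotorrent API.

```python
def split_range(files, offset, length):
    """Map a payload byte range onto files given as (path, size) in metainfo order.

    Returns a list of (path, offset_within_file, byte_count) segments.
    """
    segments = []
    file_start = 0
    for path, size in files:
        file_end = file_start + size
        if length > 0 and offset < file_end:
            count = min(length, file_end - offset)
            segments.append((path, offset - file_start, count))
            offset += count
            length -= count
        file_start = file_end
    if length:
        raise ValueError("range extends past the end of the payload")
    return segments


files = [("a.txt", 10), ("b.txt", 5), ("c.txt", 20)]
print(split_range(files, 8, 10))
```

The same mapping works in both directions: a seeder reading from separate files and a downloader writing a verified piece into a multi-file tree.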

gui — launch desktop UI (tkinter)

btinspect gui
python -m picotorrent gui

The GUI is optional and uses Python's built-in tkinter (no extra package dependency). CLI remains the default mode.
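The subcommands and options documented above can be mirrored in a small argparse sketch. This is an approximation built from the tables in this section, not the parser shipped in picotorrent; build_parser is a hypothetical name, and the float type for --session-timeout is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="btinspect")
    # required=True enforces "no default subcommand".
    sub = parser.add_subparsers(dest="command", required=True)

    inspect_cmd = sub.add_parser("inspect", help="print metainfo")
    inspect_cmd.add_argument("torrent")
    inspect_cmd.add_argument("--show-pieces", action="store_true")

    download = sub.add_parser("download", help="download content")
    download.add_argument("torrent")
    download.add_argument("--out-dir", default=".")
    download.add_argument("--peer", default=None, metavar="HOST:PORT")
    download.add_argument("--session-timeout", type=float, default=300)
    download.add_argument("--debug-handshake", action="store_true")

    seed = sub.add_parser("seed", help="serve bytes for a torrent")
    seed.add_argument("torrent")
    seed.add_argument("data")
    seed.add_argument("--host", default="0.0.0.0")
    seed.add_argument("--port", type=int, default=6881)

    sub.add_parser("gui", help="launch desktop UI (tkinter)")
    return parser


args = build_parser().parse_args(["download", "example.torrent"])
print(args.command, args.out_dir, args.session_timeout)
```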


Installing vs running from source

Editable install (recommended once):

pip install -e .

This registers the btinspect console script from pyproject.toml.

No install (development):

python -m picotorrent inspect path\to\file.torrent
python -m picotorrent download path\to\file.torrent --out-dir .\out
python -m picotorrent seed path\to\file.torrent path\to\data.bin --port 6881

Supporting modules (optional reading)

  • bittorrent_constants.py — BEP 4 message IDs and reserved-bit tables
  • bittorrent_peer_ids.py — BEP 20 client id tables and parse_peer_id()
  • picotorrent/project_version.py — semver from pyproject.toml for peer id digits
  • tests/ — unit tests for bencode, metainfo, uTP packets, project version

Run tests:

python -m pytest -q

Limitations

  • v2 / hybrid torrents are not implemented; metainfo and pieces are v1 SHA-1 only.
  • Downloader is not a full BitTorrent client: choking, pipelining, endgame, and large swarms are only partially or not modeled.
  • uTP is also usable as a standalone library; the BitTorrent peer wire path in this repo uses TCP by default and switches to uTP only in the cases described in the Transport item under Download and seeding.

For protocol text, see the mirrored BEP HTML files under docs/.

About

A Python-based command-line BitTorrent client
