Configuration Guide

PyBGPKITStream provides two configuration models for different use cases:

  • BGPStreamConfig: Query specification (what data to retrieve)
  • PyBGPKITStreamConfig: Implementation details (how to retrieve it)

Implementation details are optional: a BGPStreamConfig alone is enough to create a stream.

Configuration Models

BGPStreamConfig: Query Specification

BGPStreamConfig defines the BGP data query: collectors, time range, data types, and filters. It's format-agnostic and could work with any BGP data provider. See API Reference for full documentation.

PyBGPKITStreamConfig: Implementation Configuration

PyBGPKITStreamConfig extends the query specification with PyBGPKITStream-specific parameters: parser selection, caching strategy, concurrent download limits, and memory options. See API Reference for full documentation.

Creating Streams

You have three approaches:

Approach 1: Simple Queries with BGPStreamConfig

For basic queries without implementation customization, use BGPStreamConfig:

from pybgpkitstream import BGPStreamConfig, BGPKITStream
import datetime

config = BGPStreamConfig(
    start_time=datetime.datetime(2010, 9, 1, 0, 0),
    end_time=datetime.datetime(2010, 9, 1, 2, 0),
    collectors=["route-views.wide"],
    data_types=["updates"],
)

stream = BGPKITStream.from_config(config)
for elem in stream:
    print(elem)

Approach 2: Full Control with PyBGPKITStreamConfig

For explicit control over implementation details, use PyBGPKITStreamConfig with a nested BGPStreamConfig:

from pybgpkitstream import BGPStreamConfig, PyBGPKITStreamConfig, BGPKITStream
import datetime

query = BGPStreamConfig(
    start_time=datetime.datetime(2010, 9, 1, 0, 0),
    end_time=datetime.datetime(2010, 9, 1, 2, 0),
    collectors=["route-views.wide"],
    data_types=["updates"],
)

config = PyBGPKITStreamConfig(
    bgpstream_config=query,
    parser="bgpkit",
    max_concurrent_downloads=20,
    cache_dir="/tmp/bgp_cache",
    ram_fetch=True,
)

stream = BGPKITStream.from_config(config)
for elem in stream:
    print(elem)

Approach 3: Flat PyBGPKITStreamConfig

BGPStreamConfig fields can also be passed directly to PyBGPKITStreamConfig, which removes the need for nesting. This is the most practical approach when you need implementation customization:

from pybgpkitstream import PyBGPKITStreamConfig, BGPKITStream
import datetime

# Pass BGPStreamConfig fields directly; no nesting required
config = PyBGPKITStreamConfig(
    start_time=datetime.datetime(2010, 9, 1, 0, 0),
    end_time=datetime.datetime(2010, 9, 1, 2, 0),
    collectors=["route-views.wide"],
    data_types=["updates"],
    # Implementation parameters
    parser="bgpkit",
    max_concurrent_downloads=10,
    cache_dir="/tmp/bgp_cache",
    ram_fetch=True,
)

stream = BGPKITStream.from_config(config)
for elem in stream:
    print(elem)

Query Parameters

These parameters define the BGP data query and are part of BGPStreamConfig:

Time-Based Selection

import datetime
from pybgpkitstream import BGPStreamConfig

# Specify exact time range
config = BGPStreamConfig(
    start_time=datetime.datetime(2010, 9, 1, 0, 0, tzinfo=datetime.timezone.utc),
    end_time=datetime.datetime(2010, 9, 1, 2, 0, tzinfo=datetime.timezone.utc),
    collectors=["route-views.wide"],
)

Important:

  • Datetimes are assumed to be UTC if no timezone is specified.
  • start_time and end_time must be provided together, or both left as None.
  • Leaving both times as None enables live mode.
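
The rules above can be illustrated with a small standard-library sketch. The helper name normalize_time_range is hypothetical and is not part of PyBGPKITStream; it only mirrors the documented behaviour (naive datetimes treated as UTC, both bounds required together, both None meaning live mode) and does not claim to reproduce the library's internal validation.

```python
import datetime

UTC = datetime.timezone.utc


def normalize_time_range(start, end):
    """Hypothetical helper: apply the documented time rules to a (start, end) pair.

    Returns (None, None) for live mode, otherwise two UTC-aware datetimes.
    """
    # Both bounds must be given together, or both omitted.
    if (start is None) != (end is None):
        raise ValueError("start_time and end_time must be provided together")

    # Both omitted: live mode, nothing to normalize.
    if start is None:
        return None, None

    def as_utc(dt):
        # Naive datetimes are assumed to already be in UTC.
        if dt.tzinfo is None:
            return dt.replace(tzinfo=UTC)
        # Aware datetimes are converted to UTC.
        return dt.astimezone(UTC)

    return as_utc(start), as_utc(end)


start, end = normalize_time_range(
    datetime.datetime(2010, 9, 1, 0, 0),
    datetime.datetime(2010, 9, 1, 2, 0),
)
print(start.isoformat(), end.isoformat())
```

Passing timezone-aware datetimes explicitly, as in the example above this note, avoids relying on the naive-means-UTC assumption.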

Collector Selection

import datetime
from pybgpkitstream import BGPStreamConfig

# Single or multiple collectors
config = BGPStreamConfig(
    start_time=datetime.datetime(2010, 9, 1, 0, 0),
    end_time=datetime.datetime(2010, 9, 1, 2, 0),
    collectors=["route-views.wide", "route-views.sydney", "rrc04"],
)

Common collectors:

  • route-views.*: Route Views collectors
  • rrc*: RIPE NCC RIS collectors (for example rrc04)
  • The full list is available via the BGPKIT Broker API

Data Types

# RIBs: Complete routing table snapshots
config = BGPStreamConfig(
    ...,
    data_types=["ribs"],
)

# Updates: BGP update messages
config = BGPStreamConfig(
    ...,
    data_types=["updates"],
)

# Both
config = BGPStreamConfig(
    ...,
    data_types=["ribs", "updates"],
)

Implementation Parameters

These parameters control how PyBGPKITStream retrieves and processes data. They are part of PyBGPKITStreamConfig:

Parser Selection

from pybgpkitstream import PyBGPKITStreamConfig
import datetime

# Default: pybgpkit (pure Python, slower but no dependencies)
config = PyBGPKITStreamConfig(
    start_time=datetime.datetime(2010, 9, 1, 0, 0),
    end_time=datetime.datetime(2010, 9, 1, 2, 0),
    collectors=["route-views.wide"],
    parser="pybgpkit",
)

# bgpkit-parser (requires system install, fastest)
config = PyBGPKITStreamConfig(
    ...,
    parser="bgpkit",
)

# bgpdump (requires system install)
config = PyBGPKITStreamConfig(
    ...,
    parser="bgpdump",
)

# pybgpstream (requires pip install pybgpstream)
config = PyBGPKITStreamConfig(
    ...,
    parser="pybgpstream",
)
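
Since three of the parsers depend on something installed outside PyBGPKITStream, it can help to check availability before choosing one. The sketch below uses only the standard library. The executable names ("bgpkit-parser", "bgpdump") and the helper names parser_available and pick_parser are assumptions made for illustration, not part of the PyBGPKITStream API; verify the binary names against your own installation.

```python
import importlib.util
import shutil

# Assumed executable and module names for each parser option.
SYSTEM_BINARIES = {"bgpkit": "bgpkit-parser", "bgpdump": "bgpdump"}
PYTHON_MODULES = {"pybgpstream": "pybgpstream"}


def parser_available(name, which=shutil.which, find_spec=importlib.util.find_spec):
    """Hypothetical check: is the backend for this parser name installed?"""
    if name == "pybgpkit":
        # Documented default; assumed to be installed with the package.
        return True
    if name in SYSTEM_BINARIES:
        # System parsers need their executable on PATH.
        return which(SYSTEM_BINARIES[name]) is not None
    if name in PYTHON_MODULES:
        # Python-based parsers need an importable module.
        return find_spec(PYTHON_MODULES[name]) is not None
    raise ValueError(f"unknown parser: {name!r}")


def pick_parser(preferred=("bgpkit", "bgpdump", "pybgpstream"), **checks):
    """Return the first available parser, falling back to the default."""
    for name in preferred:
        if parser_available(name, **checks):
            return name
    return "pybgpkit"


print(pick_parser())
```

The returned string can then be passed as the parser argument of PyBGPKITStreamConfig.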

Caching and Download Strategy

import datetime
from pybgpkitstream import PyBGPKITStreamConfig

config = PyBGPKITStreamConfig(
    ...,
    cache_dir="/tmp/bgp_cache",           # Directory for downloaded files
    ram_fetch=True,                        # Use /dev/shm (Linux) or /Volumes/RAMDisk (macOS) when cache is disabled
    max_concurrent_downloads=10,           # Number of parallel downloads
    chunk_time=datetime.timedelta(hours=2),  # Process data in intervals
)

Parameter details:

  • cache_dir: Persistent storage for MRT files, reused across runs.
  • ram_fetch: When caching is disabled, use shared memory instead of disk temp space. Improves performance at a higher RAM cost.
  • max_concurrent_downloads: Balances download speed against resource consumption.
  • chunk_time: Interval for fetch/parse cycles. Smaller intervals reduce memory usage at the cost of throughput.
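
To make the chunk_time trade-off concrete, the following sketch splits a query window into consecutive intervals of at most chunk_time. The function iter_chunks is hypothetical and only illustrates the idea of fetch/parse cycles; the library's actual chunking logic may differ (for instance in how it aligns chunks to file boundaries).

```python
import datetime


def iter_chunks(start, end, chunk_time):
    """Hypothetical helper: yield (chunk_start, chunk_end) pairs covering [start, end)."""
    if chunk_time <= datetime.timedelta(0):
        raise ValueError("chunk_time must be positive")
    cursor = start
    while cursor < end:
        # The last chunk is truncated so it never runs past the end of the window.
        chunk_end = min(cursor + chunk_time, end)
        yield cursor, chunk_end
        cursor = chunk_end


window_start = datetime.datetime(2010, 9, 1, 0, 0, tzinfo=datetime.timezone.utc)
window_end = datetime.datetime(2010, 9, 1, 5, 0, tzinfo=datetime.timezone.utc)

for chunk_start, chunk_end in iter_chunks(window_start, window_end, datetime.timedelta(hours=2)):
    print(chunk_start.time(), "to", chunk_end.time())
```

With a five-hour window and two-hour chunks this produces three cycles, the last one covering a single hour. A smaller chunk_time means more cycles, each holding fewer files at once.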

Direct BGPKITStream Constructor

It's possible to bypass Pydantic config validation by instantiating BGPKITStream directly:

from pybgpkitstream import BGPKITStream

stream = BGPKITStream(
    collectors=["route-views.wide"],
    data_type=["updates"],
    ts_start=1283203200,  # Unix timestamp
    ts_end=1283289600,    # Unix timestamp
    parser_name="bgpkit",
    max_concurrent_downloads=10,
    cache_dir="/path/to/cache",
    ram_fetch=True,
)
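
Because the direct constructor takes Unix timestamps rather than datetimes, a small conversion step is usually needed. This is a minimal sketch using only the standard library; to_unix is a hypothetical helper, and it assumes that integer seconds in UTC are what ts_start and ts_end expect.

```python
import datetime


def to_unix(dt):
    """Hypothetical helper: convert a datetime to integer Unix seconds (UTC)."""
    if dt.tzinfo is None:
        # Treat naive datetimes as UTC instead of the machine's local timezone.
        dt = dt.replace(tzinfo=datetime.timezone.utc)
    return int(dt.timestamp())


ts_start = to_unix(datetime.datetime(2010, 9, 1, 0, 0))
ts_end = to_unix(datetime.datetime(2010, 9, 1, 2, 0))
print(ts_start, ts_end)  # 1283299200 1283306400
```

Attaching UTC explicitly matters here: calling timestamp() on a naive datetime interprets it in the local timezone, which would shift the query window on machines not set to UTC.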
