FAQ¶

Frequently asked questions about PyBGPKITStream.

Why do I need a library to stream BGP data?¶

Even basic historical BGP data queries often involve several manual steps: locating the correct archive, downloading it, and parsing the contents. For more complex queries, such as those involving multiple collectors or specific time ranges, the difficulty increases as users have to manually manage and synchronize numerous archive files.

The goal of PyBGPStream and PyBGPKITStream is to abstract these complexities away. This allows users to focus on the actual data analysis rather than the infrastructure and boilerplate code required to fetch it.

Why a new library to stream BGP data?¶

PyBGPStream has long been the primary tool for streaming historical BGP data from multiple collectors, but it is no longer actively maintained. As of early 2026, several key features have become unreliable or non-functional, notably support for RIS Live and for data from certain RIS collectors.

PyBGPKITStream was developed to fill this gap, providing a modern, maintained alternative that restores these capabilities while offering the performance benefits of the BGPKIT ecosystem.

What's the difference between PyBGPKITStream and PyBGPStream?¶

PyBGPKITStream is a drop-in replacement for PyBGPStream that uses BGPKIT for retrieving and parsing MRT files. Key advantages:

More flexible: Multiple parser backends
Modern Python: Type hints, pydantic for configuration
No downtime: Uses the BGPKIT broker, which can be self-hosted

Setting up the stream is different, but the output BGP elements have the same format.
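Because the element format is shared, code that consumes PyBGPStream elements should carry over. The sketch below assumes the PyBGPStream element layout (type, time, collector, peer_asn, peer_address, and a fields dictionary with keys such as prefix and as-path). FakeElem and origin_asn are illustrative names, not part of either library; FakeElem only stands in for a real element so the example runs without network access.

```python
from typing import NamedTuple, Optional


class FakeElem(NamedTuple):
    """Stand-in mimicking the PyBGPStream element layout (assumption)."""
    type: str          # "A" announcement, "W" withdrawal, "R" RIB entry
    time: float
    collector: str
    peer_asn: int
    peer_address: str
    fields: dict


def origin_asn(elem) -> Optional[str]:
    """Return the last hop of the AS path, or None if there is no path."""
    hops = elem.fields.get("as-path", "").split()
    return hops[-1] if hops else None


elem = FakeElem("A", 1283299200.0, "rrc04", 2497, "192.0.2.1",
                {"prefix": "198.51.100.0/24", "as-path": "2497 3356 64500"})
print(elem.type, elem.fields["prefix"], origin_asn(elem))
```

With a real stream, the same helper would be called on each element yielded by the loop.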

Can I use this with live data?¶

Yes! Set both start_time and end_time to None for live mode:

from pybgpkitstream import BGPStreamConfig, BGPKITStream

config = BGPStreamConfig(
    collectors=["rrc00"],
    data_types=["updates"],
)

stream = BGPKITStream.from_config(config)
for elem in stream:
    print(elem)  # Live updates
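A live stream has no natural end, so the loop above runs until it is interrupted. One way to bound it is itertools.islice from the standard library. This is a minimal sketch that assumes the stream behaves as a plain iterable, as the for loop above suggests; fake_live_stream and take are hypothetical names used so the example runs offline.

```python
import itertools


def fake_live_stream():
    """Hypothetical stand-in for a live stream: yields forever."""
    for n in itertools.count():
        yield f"update-{n}"


def take(stream, limit):
    """Consume at most `limit` elements, then stop iterating."""
    return list(itertools.islice(stream, limit))


sample = take(fake_live_stream(), 3)
print(sample)
```

Replacing fake_live_stream() with the real stream object would collect the first few live elements and then return.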

How do I filter for specific data?¶

Use FilterOptions:

from pybgpkitstream import BGPStreamConfig, FilterOptions

filters = FilterOptions(
    peer_asn=2497,             # Specific AS
    prefix="192.0.2.0/24",     # Specific prefix
    update_type="announce",    # Only announcements
    ip_version=4,              # IPv4 only
)

config = BGPStreamConfig(
    ...,  # placeholder for the other options (collectors, data_types, time range)
    filters=filters,
)

Can I combine multiple filters?¶

Yes. Filters are combined with AND logic, so an element must match every filter that is set:

filters = FilterOptions(
    peer_asn=2497,
    prefix_super="192.0.2.0/24",
    origin_asn=1234,
    ip_version=4,
)
# Returns: IPv4 prefixes originated by AS 1234
#          that cover 192.0.2.0/24 (more general prefixes),
#          as seen by peer AS 2497
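To make the AND semantics concrete, here is a plain-Python illustration of how such a combination could be evaluated. It is not the library's implementation: matches is a hypothetical helper, elements are represented as simple dictionaries, and prefix_super is assumed to accept the given prefix itself as well as any prefix covering it.

```python
import ipaddress


def matches(elem: dict, peer_asn=None, origin_asn=None,
            prefix_super=None, ip_version=None) -> bool:
    """Return True only if every filter that is set accepts the element."""
    net = ipaddress.ip_network(elem["prefix"])
    checks = []
    if peer_asn is not None:
        checks.append(elem["peer_asn"] == peer_asn)
    if origin_asn is not None:
        hops = elem["as_path"].split()
        checks.append(bool(hops) and hops[-1] == str(origin_asn))
    if ip_version is not None:
        checks.append(net.version == ip_version)
    if prefix_super is not None:
        target = ipaddress.ip_network(prefix_super)
        # Accept the target itself or any prefix that covers it (assumption).
        checks.append(net.version == target.version and target.subnet_of(net))
    return all(checks)


elem = {"peer_asn": 2497, "prefix": "192.0.0.0/16", "as_path": "2497 1234"}
print(matches(elem, peer_asn=2497, prefix_super="192.0.2.0/24",
              origin_asn=1234, ip_version=4))
```

Changing any single field of elem so that it fails its filter makes the whole match fail, which is what AND logic implies.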

How do I cache the MRT archives?¶

Caching and other implementation details are configured through the PyBGPKITStreamConfig object:

import datetime
from pybgpkitstream import PyBGPKITStreamConfig, BGPKITStream

config = PyBGPKITStreamConfig(
    start_time=datetime.datetime(2010, 9, 1, 0, 0),
    end_time=datetime.datetime(2010, 9, 1, 2, 0),
    collectors=["route-views.wide", "rrc04"],
    data_types=["updates"],
    cache_dir="cache",
    max_concurrent_downloads=5
)

stream = BGPKITStream.from_config(config)
for elem in stream:
    print(elem)
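A cache directory grows with the time range and the number of collectors, so it can be useful to check how much disk space it occupies. The helper below is a generic sketch using only the standard library; cache_usage is a hypothetical name, and nothing is assumed about how files are laid out inside cache_dir.

```python
import tempfile
from pathlib import Path


def cache_usage(cache_dir) -> tuple[int, int]:
    """Return (file count, total bytes) for everything under cache_dir."""
    files = [p for p in Path(cache_dir).rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


# Demonstration on a throwaway directory with one fake archive file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "rrc04").mkdir()
    (Path(tmp) / "rrc04" / "updates.gz").write_bytes(b"x" * 10)
    print(cache_usage(tmp))
```

Calling cache_usage("cache") after a run with the configuration above would report the size of that cache.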

The stream is slow. How do I improve it?¶

Switch parser: Use the pybgpstream parser instead of pybgpkit
Add filters: Reduce the dataset
Reduce time window: Request less data

See the Performance Guide for more tips.
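Before and after changing any of these settings, it helps to measure throughput so the effect is visible. The following sketch times how quickly an iterable yields elements; measure_rate is a hypothetical helper, and a range is used in place of a real stream so the example runs offline.

```python
import time


def measure_rate(stream, limit=10_000):
    """Consume up to `limit` elements and return (count, elements per second)."""
    start = time.perf_counter()
    count = 0
    for _ in stream:
        count += 1
        if count >= limit:
            break
    elapsed = time.perf_counter() - start
    return count, (count / elapsed if elapsed > 0 else float("inf"))


count, rate = measure_rate(iter(range(50_000)), limit=1_000)
print(f"{count} elements at {rate:.0f} elements/s")
```

Running the same measurement on streams built with different parsers, filters, or time windows gives a rough basis for comparison.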

Can I process very large datasets?¶

Yes. Process elements one at a time as they are streamed, rather than loading them all into memory:

# Good: constant memory
count = 0
for elem in stream:
    count += 1

# Bad: memory grows linearly with the number of elements
elements = list(stream)  # AVOID for large datasets!
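The same idea extends to aggregation: keep only a running summary instead of the elements themselves. The sketch below tallies elements per peer ASN with collections.Counter, so memory scales with the number of distinct peers rather than the number of elements. It assumes elements expose a peer_asn attribute, as in the PyBGPStream element format; count_by_peer is a hypothetical helper and SimpleNamespace objects stand in for real elements.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_peer(stream):
    """Tally elements per peer ASN while holding only the counters in memory."""
    counts = Counter()
    for elem in stream:
        counts[elem.peer_asn] += 1
    return counts


# Generator of stand-in elements; nothing is materialized as a list.
fake_stream = (SimpleNamespace(peer_asn=asn) for asn in [2497, 3356, 2497])
totals = count_by_peer(fake_stream)
print(totals.most_common())
```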

More Help¶

Issues: GitHub Issues
Discussions: GitHub Discussions