EVKit — Event-based Vision Toolkit

EVKit is a toolkit for storing, loading, and converting event-camera data into the tensor representations used to train deep models. It pairs a compact on-disk format (PODCAST) with CPU C++ kernels that turn raw events into voxel grids, count frames, and time surfaces, served straight from a PyTorch DataLoader.

  • Ready-to-use datasets. DSECDataset, M3EDDataset, and MVSECDataset read PODCAST files from the Hugging Face Hub, downloading each sequence on first access and caching it locally.
  • Three representations. Voxel grid, count frame, and time surface, each backed by a single-pass C++ kernel and a configurable spec.
  • Flexible binning. Four voxel-grid temporal conventions (centered / left / right / spanned) covering the common variants in the literature.
  • Drops into PyTorch. Every dataset is a torch.utils.data.Dataset; collate functions assemble batches on CPU workers with no per-sample Python overhead.

Install

EVKit builds a small set of C++ extensions at install time, so a C++ compiler is required. The recommended workflow uses pixi:

pixi run install-dev      # builds the C++ kernels and installs evkit (editable)
pixi run -e dev test      # run the test suite

Or install into an existing environment with a working compiler:

pip install .

Quick start

from torch.utils.data import DataLoader
from evkit import DSECDataset, VoxelGridSpec, VoxelGridCollate

# Describe the representation you want.
spec = VoxelGridSpec(n_bins=5, separate_polarity=True, bin_weighting="centered")

# Sequences download from Hugging Face on first access and cache in cache_dir.
ds = DSECDataset(
    spec=spec,
    cache_dir="~/.cache/evkit/dsec",
    split="train",
    camera="left",
    snippet_ms=500,
)

collate = VoxelGridCollate.from_dataset(ds.datasets[0])
loader = DataLoader(ds, batch_size=8, num_workers=4, collate_fn=collate)

for voxels in loader:        # (B, n_bins, 2, H, W) with separate_polarity=True
    ...

A runnable version with throughput reporting is in examples/load_dsec.py.

Representations

Each representation is selected by passing the matching spec to the dataset and the matching collate function (or collate_for(spec, ...)) to the DataLoader.

| Spec | Output | Notes |
| --- | --- | --- |
| VoxelGridSpec | (n_bins, H, W) signed, or (n_bins, 2, H, W) split | bin_weighting: centered (default), left, right, spanned |
| CountFrameSpec | (2, H, W) / (H, W) | polarity: sep, sum, diff; counts saturate to dtype |
| TimeSurfaceSpec | (H, W) or (2, H, W) | exponential decay with time constant tau_us |
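To make two of the rows above concrete, here is a minimal NumPy sketch of a saturating count frame and an exponentially decayed time surface. It is an illustration only, not EVKit's C++ kernels: the function names, argument names, and the choice of the newest event as the reference time are all assumptions made for this example.

```python
import numpy as np


def count_frame(x, y, p, height, width, dtype=np.uint8):
    """Per-polarity event counts of shape (2, H, W), saturating at dtype's max.

    Hypothetical helper: p holds polarities as 0/1 integers.
    """
    counts = np.zeros((2, height, width), dtype=np.int64)
    # np.add.at accumulates correctly when a pixel appears more than once.
    np.add.at(counts, (p, y, x), 1)
    # Clamp before casting so large counts saturate instead of wrapping around.
    return np.minimum(counts, np.iinfo(dtype).max).astype(dtype)


def time_surface(x, y, t_us, height, width, tau_us, t_ref_us=None):
    """Merged-polarity time surface of shape (H, W).

    Each pixel holds exp(-(t_ref - t_last) / tau), where t_last is the
    timestamp of the most recent event at that pixel. Pixels that never
    fired are 0. Assumption: t_ref defaults to the newest event timestamp.
    """
    t_us = np.asarray(t_us, dtype=np.float64)
    if t_ref_us is None:
        t_ref_us = t_us.max()
    last = np.full((height, width), -np.inf)
    # Keep the latest timestamp seen at each pixel.
    np.maximum.at(last, (y, x), t_us)
    # exp(-inf) evaluates to 0.0, which covers pixels without events.
    return np.exp((last - t_ref_us) / tau_us)


# Three events on a 2x2 sensor: two positive at (0, 0), one negative at (1, 1).
x = np.array([0, 0, 1])
y = np.array([0, 0, 1])
p = np.array([1, 1, 0])
t = np.array([0, 1000, 2000])

frame = count_frame(x, y, p, height=2, width=2)
surface = time_surface(x, y, t, height=2, width=2, tau_us=1000.0)
```

Here the pixel at (0, 0) last fired one time constant before the reference time, so its surface value is exp(-1), while the pixel at (1, 1) fired at the reference time and holds 1.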

Snippets are sampled from a recording by duration (snippet_ms) with an optional stride_ms for sliding windows.
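The windowing arithmetic can be sketched in a few lines of plain Python. This is an assumption-laden illustration rather than EVKit's sampler: the helper name snippet_windows is hypothetical, the stride is assumed to default to the snippet duration (non-overlapping windows), and a trailing window that does not fit entirely inside the recording is assumed to be dropped.

```python
def snippet_windows(t_start_us, t_end_us, snippet_ms, stride_ms=None):
    """Return (start_us, end_us) windows that fit inside [t_start_us, t_end_us].

    Hypothetical helper. With stride_ms=None, windows are back to back;
    a stride shorter than the snippet gives overlapping sliding windows.
    """
    snippet_us = snippet_ms * 1000
    stride_us = snippet_us if stride_ms is None else stride_ms * 1000
    if snippet_us <= 0 or stride_us <= 0:
        raise ValueError("snippet_ms and stride_ms must be positive")

    windows = []
    start = t_start_us
    # Stop as soon as the next window would run past the end of the recording.
    while start + snippet_us <= t_end_us:
        windows.append((start, start + snippet_us))
        start += stride_us
    return windows


# A 2 s recording cut into 500 ms snippets, without and with a 250 ms stride.
back_to_back = snippet_windows(0, 2_000_000, snippet_ms=500)
sliding = snippet_windows(0, 2_000_000, snippet_ms=500, stride_ms=250)
```

Under these assumptions, halving the stride roughly doubles the number of samples drawn from a recording, at the cost of adjacent samples sharing half of their events.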

Datasets

| Class | Source |
| --- | --- |
| DSECDataset | mavlab-tudelft/dsec_podcast (Hugging Face) |
| M3EDDataset | mavlab-tudelft/m3ed_podcast |
| MVSECDataset | mavlab-tudelft/mvsec_podcast |

All three share the same constructor shape: a representation spec, a cache_dir, an official split (or an explicit sequences list), a camera, and snippet windowing options.

Citation

If you use EVKit, please cite the accompanying paper:

@inproceedings{wu2026evkit,
  title     = {EVKit: An Open-source Flexible Toolkit for Efficient Event Camera
               Data Storage and Loading},
  author    = {Wu, Yilun and de Croon, Guido C. H. E.},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2026},
}

License

MIT — see LICENSE.
