Skip to content

Fix three EXIF reading bugs in exif_read.py#824

Merged
ptpt merged 1 commit into
mainfrom
fix/exif-read-bugs
Jun 24, 2026
Merged

Fix three EXIF reading bugs in exif_read.py#824
ptpt merged 1 commit into
mainfrom
fix/exif-read-bugs

Conversation

@ptpt

@ptpt ptpt commented Jun 24, 2026

Copy link
Copy Markdown
Member

Summary

Fixes three bugs in mapillary_tools/exif_read.py, each with a regression test in tests/unit/test_exifread.py.

1. Epoch GPS date is not rejected (extract_gps_datetime)

The "no GPS fix" placeholder date 1970-01-01 was meant to be discarded, but the guard compared a datetime against a date:

if dt is None or dt == datetime.date(1970, 1, 1):  # datetime == date is always False

datetime.datetime(1970, 1, 1) == datetime.date(1970, 1, 1) is always False, so the guard never fired and a bogus 1970-… timestamp was returned. Fixed by comparing dt.date().

2. Non-numeric SubSec crashes the read (parse_datetimestr_with_subsec_and_offset)

SubSecTime comes from untrusted image metadata. A non-numeric value raised an uncaught ValueError from int(subsec[:6]), aborting the entire capture-time read (the EXIF datetime path isn't wrapped by callers).

This now mirrors exiftool's behavior: use only the leading run of digits and ignore any non-numeric remainder; if there are no leading digits, the sub-second is simply not applied. Verified against exiftool 13.55:

SubSecTimeOriginal     : 12ab
SubSecDateTimeOriginal : 2019:01:01 12:00:00.12   # our reader returns .120000 too

3. Fractional GPS minutes truncated (parse_time_ratios_as_timedelta)

Hours and minutes were floored to int, so a fractional minute in the GPSTimeStamp rationals was silently dropped (e.g. 30.5 min lost 30 s). Now kept as float so the remainder carries into the total.

Cosmetic

ExifRead.extract_height passed "width" as the XMP reason label; corrected to "height".

Tests

Adds TestExtractGpsDatetimeFromEXIF (valid / epoch-rejected / fractional-minute, built end-to-end with PIL) and test_parse_subsec_uses_leading_digits (fully non-numeric ignored; partially-numeric "12ab".120000).

uv run pytest tests/unit/test_exifread.py -q
78 passed

ruff, usort, and mypy clean.

@meta-cla meta-cla Bot added the cla signed label Jun 24, 2026
- extract_gps_datetime: the epoch (1970-01-01) GPS date guard compared a
  datetime to a date, which is always False, so a no-GPS-fix placeholder
  date produced a bogus 1970 timestamp. Compare dt.date() instead.
- parse_datetimestr_with_subsec_and_offset: a non-numeric SubSec value
  raised an uncaught ValueError (SubSec comes from untrusted metadata).
  Now, like exiftool, only the leading run of digits is used and any
  non-numeric remainder is ignored (no leading digits -> sub-second is
  simply not applied), so a single malformed tag no longer aborts the read.
- parse_time_ratios_as_timedelta: hours/minutes were floored to int,
  silently dropping a fractional minute (e.g. 30.5m lost 30s). Keep floats.

Also fixes a cosmetic wrong reason label in ExifRead.extract_height.

Adds regression tests in tests/unit/test_exifread.py.
@ptpt ptpt force-pushed the fix/exif-read-bugs branch from 09b1033 to 68eb4da Compare June 24, 2026 23:13
@ptpt ptpt merged commit 847fd0e into main Jun 24, 2026
19 checks passed
@ptpt ptpt deleted the fix/exif-read-bugs branch June 24, 2026 23:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant