Fix three EXIF reading bugs in exif_read.py #824
Merged
- extract_gps_datetime: the epoch (1970-01-01) GPS date guard compared a datetime to a date, which is always False, so a no-GPS-fix placeholder date produced a bogus 1970 timestamp. Compare dt.date() instead.
- parse_datetimestr_with_subsec_and_offset: a non-numeric SubSec value raised an uncaught ValueError (SubSec comes from untrusted metadata). Now, like exiftool, only the leading run of digits is used and any non-numeric remainder is ignored (no leading digits -> sub-second is simply not applied), so a single malformed tag no longer aborts the read.
- parse_time_ratios_as_timedelta: hours/minutes were floored to int, silently dropping a fractional minute (e.g. 30.5m lost 30s). Keep floats.

Also fixes a cosmetic wrong reason label in ExifRead.extract_height. Adds regression tests in tests/unit/test_exifread.py.
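The epoch guard described in the first item relies on a property of Python's datetime module: a datetime never compares equal to a plain date, even when both name the same day. A minimal sketch of the corrected comparison follows; the helper name reject_epoch and the constant EPOCH_DATE are hypothetical and are not taken from exif_read.py.

```python
import datetime
from typing import Optional

# Placeholder date that some devices write when there is no GPS fix.
EPOCH_DATE = datetime.date(1970, 1, 1)


def reject_epoch(dt: datetime.datetime) -> Optional[datetime.datetime]:
    """Hypothetical helper: return None for the placeholder date, else dt."""
    # Comparing dt itself to EPOCH_DATE would always be False,
    # so the date part is extracted first.
    if dt.date() == EPOCH_DATE:
        return None
    return dt


placeholder = datetime.datetime(1970, 1, 1, 12, 30, tzinfo=datetime.timezone.utc)
real_fix = datetime.datetime(2024, 5, 1, 12, 30, tzinfo=datetime.timezone.utc)

# The buggy form of the guard: datetime vs date is never equal.
buggy_guard_fires = placeholder == EPOCH_DATE
```

With these values, buggy_guard_fires is False, reject_epoch(placeholder) returns None, and reject_epoch(real_fix) returns the timestamp unchanged.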
Summary
Fixes three bugs in mapillary_tools/exif_read.py, each with a regression test in tests/unit/test_exifread.py.

1. Epoch GPS date is not rejected (extract_gps_datetime)
The "no GPS fix" placeholder date 1970-01-01 was meant to be discarded, but the guard compared a datetime against a date: datetime.datetime(1970, 1, 1) == datetime.date(1970, 1, 1) is always False, so the guard never fired and a bogus 1970-… timestamp was returned. Fixed by comparing dt.date().

2. Non-numeric SubSec crashes the read (parse_datetimestr_with_subsec_and_offset)
SubSecTime comes from untrusted image metadata. A non-numeric value raised an uncaught ValueError from int(subsec[:6]), aborting the entire capture-time read (the EXIF datetime path isn't wrapped by callers).
This now mirrors exiftool's behavior: use only the leading run of digits and ignore any non-numeric remainder; if there are no leading digits, the sub-second is simply not applied. Verified against exiftool 13.55.

3. Fractional GPS minutes truncated (parse_time_ratios_as_timedelta)
Hours and minutes were floored to int, so a fractional minute in the GPSTimeStamp rationals was silently dropped (e.g. 30.5 min lost 30 s). Now kept as float so the remainder carries into the total.

Cosmetic
ExifRead.extract_height passed "width" as the XMP reason label; corrected to "height".

Tests
Adds TestExtractGpsDatetimeFromEXIF (valid / epoch-rejected / fractional-minute, built end-to-end with PIL) and test_parse_subsec_uses_leading_digits (fully non-numeric ignored; partially-numeric "12ab" → .120000).
ruff, usort, and mypy clean.
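The leading-digits rule for SubSec can be illustrated with a short sketch. The helper name apply_subsec is hypothetical and this is not the code in exif_read.py. It assumes the digits are a decimal fraction of a second, truncated to six digits and right-padded with zeros to microseconds, which is consistent with the "12ab" → .120000 case in the regression test.

```python
import datetime
import re

# Matches only a leading run of ASCII digits; anything after it is ignored.
_LEADING_DIGITS = re.compile(r"[0-9]+")


def apply_subsec(dt: datetime.datetime, subsec: str) -> datetime.datetime:
    """Hypothetical helper: apply a possibly malformed SubSec string to dt."""
    match = _LEADING_DIGITS.match(subsec)
    if match is None:
        # No leading digits: the sub-second is simply not applied.
        return dt
    # Treat the digits as a decimal fraction: "12" means 0.12 s.
    digits = match.group(0)[:6].ljust(6, "0")
    return dt.replace(microsecond=int(digits))


base = datetime.datetime(2024, 5, 1, 12, 30, 15)

partially_numeric = apply_subsec(base, "12ab")
fully_non_numeric = apply_subsec(base, "abc")
```

Here partially_numeric carries 120000 microseconds and fully_non_numeric equals base, so neither input raises ValueError.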
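The fractional-minute fix can be checked with simple arithmetic: 30.5 minutes is 1830 seconds, while flooring to 30 minutes gives 1800 seconds. The sketch below contrasts the two approaches. Both function names are hypothetical, and Fraction stands in for the EXIF rational values as an assumption; the real function's signature may differ.

```python
import datetime
from fractions import Fraction


def ratios_to_timedelta_floored(
    hours: Fraction, minutes: Fraction, seconds: Fraction
) -> datetime.timedelta:
    """Hypothetical sketch of the old behavior: hours/minutes truncated to int."""
    return datetime.timedelta(
        hours=int(hours), minutes=int(minutes), seconds=float(seconds)
    )


def ratios_to_timedelta(
    hours: Fraction, minutes: Fraction, seconds: Fraction
) -> datetime.timedelta:
    """Hypothetical sketch of the fixed behavior: fractions are kept as floats."""
    # timedelta accepts float arguments and carries the remainder into the total.
    return datetime.timedelta(
        hours=float(hours), minutes=float(minutes), seconds=float(seconds)
    )


# GPSTimeStamp-like rationals for 10 h, 30.5 min, 0 s.
stamp = (Fraction(10, 1), Fraction(61, 2), Fraction(0, 1))

floored = ratios_to_timedelta_floored(*stamp)
kept = ratios_to_timedelta(*stamp)
lost = kept - floored
```

For this input, kept is 10:30:30 and lost is exactly 30 seconds, the amount the floored version dropped.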