Advent of Code 2020 - Day 4
Day 4 of AoC 2020 (Passport Processing) is a fairly straightforward parsing problem. As with the previous posts in this series, we’ll use Python for the task. Spoilers lurk ahead.
Because the input is made up of multiple records separated by blank lines, we can use the following snippet to build the passport database.
from collections import defaultdict

with open('input') as datafile:
    # Split on '\n\n' to separate the individual records
    data = datafile.read().split('\n\n')

passports = []
for pp in data:
    # A defaultdict with a None factory returns None for missing fields,
    # which simplifies handling in later stages
    newpp = defaultdict(lambda: None)
    for elem in pp.split():
        # elem is the k:v pair which we need to split further
        k, v = elem.split(':')
        newpp[k] = v
    passports.append(newpp)
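To see what the parsing step produces, here is a minimal sketch that runs the same logic on an in-memory string instead of a file. The sample text and the parse_passports name are made up for illustration, and the sketch assumes the defaultdict is given a factory that returns None so that missing fields read as None.

```python
from collections import defaultdict

# Hypothetical two-record sample in the same blank-line-separated layout
sample = (
    "ecl:gry pid:860033327 eyr:2020\n"
    "hcl:#fffffd byr:1937\n"
    "\n"
    "iyr:2013 ecl:amb\n"
    "cid:350\n"
)

def parse_passports(text):
    """Return one defaultdict per record; missing fields read as None."""
    passports = []
    for record in text.split('\n\n'):
        pp = defaultdict(lambda: None)
        # str.split() with no argument splits on spaces and newlines alike,
        # so fields spread over several lines end up in the same record
        for elem in record.split():
            k, v = elem.split(':')
            pp[k] = v
        passports.append(pp)
    return passports

parsed = parse_passports(sample)
```

Here parsed holds two records: the first has byr set to '1937', and looking up hgt on the second gives None rather than raising a KeyError.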
Part 1 only requires us to verify that every passport has certain fields present. We can use the following snippet for that:
def valid_part1(pp):
    return all(k in pp for k in ['byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid'])

count_valid_part1 = [valid_part1(pp) for pp in passports].count(True)
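One detail worth keeping in mind when the records are defaultdicts: a membership test with in leaves the dictionary alone, but indexing a missing key inserts it. The sketch below illustrates this, assuming a defaultdict whose factory returns None.

```python
from collections import defaultdict

pp = defaultdict(lambda: None)
pp['byr'] = '1980'

before = 'hgt' in pp   # membership test does not insert anything
value = pp['hgt']      # indexing a missing key inserts it with value None
after = 'hgt' in pp    # the key is now present
```

Because of this, the presence check should use in (as valid_part1 does) and should run before any code that indexes the passports by field name, otherwise previously missing fields would be counted as present.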
Part 2 adds data validation: some fields must fall within a specified range, and others must match a pattern.
For the hair color field (hcl), we need it to match the pattern #[0-9a-f]{6}. Similarly, the passport ID field (pid) needs to have exactly 9 digits. We can use the re module to verify the patterns.
import re

# A '#' followed by exactly six hexadecimal digits (0-9, a-f)
hcl_regex = re.compile(r'#[0-9a-f]{6}')

def valid_hcl(pp):
    hcl = pp['hcl']
    return hcl is not None and hcl_regex.fullmatch(hcl) is not None

# Exactly nine digits, leading zeroes included
pid_regex = re.compile(r'[0-9]{9}')

def valid_pid(pp):
    pid = pp['pid']
    return pid is not None and pid_regex.fullmatch(pid) is not None
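The choice of fullmatch over match matters for the "exactly 9 digits" rule. The following sketch compares the two on a ten-digit string; the test values are made up for illustration.

```python
import re

pid_regex = re.compile(r'[0-9]{9}')

ten_digits = '0123456789'

# match() only anchors at the start, so the first nine digits satisfy it
prefix_ok = pid_regex.match(ten_digits) is not None

# fullmatch() requires the whole string to match, so ten digits are rejected
full_ok = pid_regex.fullmatch(ten_digits) is not None

# A nine-digit string with a leading zero passes
nine_ok = pid_regex.fullmatch('012345678') is not None
```

With match, an over-long ID would slip through; fullmatch avoids having to add explicit anchors to the pattern.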
For the eye color, we need it to be one of a set of values.
ecl_set = {'amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth'}

def valid_ecl(pp):
    return pp['ecl'] in ecl_set
For the fields that need to be within a range, we need to convert them to integers to compare them. We will create a helper function to deal with this.
def valid_range(field, lo, hi):
    try:
        field = int(field)
    except (ValueError, TypeError):
        # If field is None (TypeError) or is not a valid integer string
        # (ValueError), conversion fails and it is not in the range
        return False
    return lo <= field <= hi
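A few sample calls make the helper's behavior concrete. This sketch repeats the function so that it runs on its own; the inputs are made up for illustration.

```python
def valid_range(field, lo, hi):
    try:
        field = int(field)
    except (ValueError, TypeError):
        return False
    return lo <= field <= hi

checks = {
    'in range': valid_range('1980', 1920, 2002),
    'upper bound': valid_range('2002', 1920, 2002),   # bounds are inclusive
    'too large': valid_range('2003', 1920, 2002),
    'missing': valid_range(None, 1920, 2002),         # int(None) raises TypeError
    'not a number': valid_range('19x0', 1920, 2002),  # int('19x0') raises ValueError
}
```

Note that int() is somewhat lenient: it also accepts a leading sign, surrounding whitespace, and underscores between digits. Tokens produced by split() never contain whitespace, but a stricter check on the digits would need a regular expression if the input could contain such values.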
We can now validate the byr, iyr and eyr fields by calling valid_range directly. For the hgt field, we need some additional steps to check that the last two characters are a valid measurement unit and, based on that, validate the remaining digits against the expected range.
def valid_hgt(pp):
    hgt = pp['hgt']
    if hgt is None:
        return False
    unit = hgt[-2:]  # Get the last 2 characters
    if unit == 'cm':
        return valid_range(hgt[:-2], 150, 193)
    elif unit == 'in':
        return valid_range(hgt[:-2], 59, 76)
    return False
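The slicing is what makes the malformed cases fall out naturally. As a small sketch, the hypothetical split_height helper below (not part of the solution itself) shows what the two slices yield for a few inputs.

```python
def split_height(hgt):
    """Split a height string into (number part, last two characters)."""
    return hgt[:-2], hgt[-2:]

normal = split_height('183cm')   # number and unit separate cleanly
no_unit = split_height('60')     # the digits end up in the unit position
only_unit = split_height('cm')   # the number part is an empty string
```

A value with no unit, such as '60', yields a "unit" of '60', which matches neither 'cm' nor 'in'. A bare 'cm' yields an empty number part, which fails integer conversion. Both cases are therefore rejected without special handling.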
We now have enough to validate as per the Part 2 rules.
def valid_part2(pp):
    # all() takes a single iterable, so the checks are collected in a list
    return all([
        valid_range(pp['byr'], 1920, 2002),
        valid_range(pp['iyr'], 2010, 2020),
        valid_range(pp['eyr'], 2020, 2030),
        valid_hgt(pp),
        valid_hcl(pp),
        valid_ecl(pp),
        valid_pid(pp),
    ])

count_valid_part2 = [valid_part2(pp) for pp in passports].count(True)
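As a side note on the built-ins used here, all() accepts exactly one iterable, and counting True values in a list of booleans can be done with either count(True) or sum(). A minimal sketch with made-up results:

```python
checks = [True, True, False]
result = all(checks)   # one iterable argument

# Passing the checks as separate arguments is an error
try:
    all(True, True, False)
except TypeError:
    raised = True
else:
    raised = False

results = [True, False, True, True]
by_count = results.count(True)
by_sum = sum(results)   # True counts as 1, False as 0
```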