Advent of Code was created by software engineer Eric Wastl. It’s an advent calendar in which you are invited to solve one coding challenge per day in the lead-up to December 25th. The challenges unsurprisingly get more and more complicated as the days go.

Each challenge includes a little storyline: in 2020, we are trying to collect stars, the currency in the tropical island we’re trying to escape to.

I’m getting started late and I’m in the middle of exams, so I definitely will not respect the timeline, but I’ll try to get it done over the next few weeks.

If you’re unfamiliar with R, here is my list of resources to get started with R programming.

Edit: I am adding code to this post as I make progress in AoC 2020.

Day 1: Report Repair

On day 1, we are tasked with finding numbers that add up to 2020 in a big dataset, and multiplying them together. Part 1 requires us to find two numbers adding up to 2020, part 2 requires us to find three.

I managed to get the results with a few lines, but I should try to get the same result using a function instead of a loop. As soon as we get to a large three-dimensional array (in part 2), the loop becomes computationally intensive, so I assume there are cleaner solutions for a fairly simple problem like this one.

Part 1

d1_input <- read.table("d1_input.txt")

adds_up <- matrix(NA, ncol = nrow(d1_input), nrow = nrow(d1_input))

for (i in 1:nrow(d1_input)) {
  for (j in 1:nrow(d1_input)) {
    adds_up[i,j] <-  d1_input$V1[i] + d1_input$V1[j]
  }
}

row_id <- unique(which(adds_up == 2020, arr.ind = TRUE)[1,])

d1_input$V1[row_id[1]]*d1_input$V1[row_id[2]]
## [1] 988771

Part 2

adds_up_3 <- array(NA, dim = rep(nrow(d1_input), 3))

for (i in 1:nrow(d1_input)) {
  for (j in 1:nrow(d1_input)) {
    for(k in 1:nrow(d1_input)) {
    adds_up_3[i,j,k] <-  d1_input$V1[i] + d1_input$V1[j] + d1_input$V1[k]
    }
  }
}

row_id_3 <- unique(which(adds_up_3 == 2020, arr.ind = TRUE)[1,])

d1_input$V1[row_id_3[1]]*d1_input$V1[row_id_3[2]]*d1_input$V1[row_id_3[3]]
## [1] 171933104

Day 2: Password Philosophy

On day 2, we are asked to identify, among 200 combinations of letters, passwords that match certain rules. Both can be done using stringr functions (I need to figure out code highlighting on blogdown).

Part 1

d2_input <- read.table("d2_input.txt")

check_validity <- function(rule, letter, password) {
  min <- as.numeric(str_extract(rule, "[0-9]+"))
  max <- as.numeric(str_remove(rule, "[0-9]+-"))
  letter <- str_extract(letter, "[a-z]")
  
  letter_count <- str_count(password, letter)
  
  return(letter_count >= min & letter_count <= max)
}

d2_input %>% 
  mutate(valid = check_validity(V1, V2, V3)) %>% 
  pull(valid) %>%  
  sum()
## [1] 393

Part 2

check_validity_2 <- function(rule, letter, password) {
  position_1 <- as.numeric(str_extract(rule, "[0-9]+"))
  position_2 <- as.numeric(str_remove(rule, "[0-9]+-"))
  letter <- str_extract(letter, "[a-z]")

  letter_1 <- str_sub(password, position_1, position_1)
  letter_2 <- str_sub(password, position_2, position_2)
  
  return((letter_1 == letter & letter_2 != letter) | (letter_1 != letter & letter_2 == letter))
}

d2_input %>% 
  mutate(valid = check_validity_2(V1, V2, V3)) %>% 
  pull(valid) %>% 
  sum()
## [1] 690

Day 3: Toboggan Trajectory

On day 3, we are trying to compute how many trees are going to stand on the toboggan slope between our North Pole home and the Coastal Airport.

This one is interesting: it requires thinking about the input we have not as a data set but as a space in which each cell has an up_down and left-right coordinate corresponding to its location on the mountain slope.

The input indicates whether a tree stands in each cell. Here is what it looks like.

d3_input <- read_csv("d3_input.txt", col_names = FALSE)
length_pattern <- str_count(d3_input$X1[1])
length_input <- nrow(d3_input)
d3_input %>% head()

Part 1

In the first part, we are asked to compute how many trees we will encounter if we follow a slope of right 3 and down 1.

I first transform the input into a matrix of sufficient size. I extract the information on whether there is a tree or not in each cell by extracting the n-th character from the pattern on each line. I then make each cell without a tree a 0 and each cell with a tree a 1.

right_n <- 3
down_n <- 1

map <- matrix(NA, nrow = down_n*length_input, ncol = right_n*length_input)

treemap <- function(i, j) {
  return(if_else(j%%length_pattern == 0,
                 as.numeric(str_sub(d3_input$X1[i],
                                    length_pattern,
                                    length_pattern) == "#"),
                 as.numeric(str_sub(d3_input$X1[i],
                                    j%%length_pattern,
                                    j%%length_pattern) == "#")))
}
for (i in 1:nrow(map)) {
  for (j in 1:ncol(map)) {
    map[i,j] <- treemap(i, j)
  }
}

Then, I follow the slope with a while loop until I get below the bottom of the map, and count the number of times I encounter a tree.

i <- 1
j <- 1
n_trees <- 0

while(i <= nrow(map)) {
  n_trees <- n_trees + map[i,j]
  j <- j + right_n
  i <- i + down_n
}

n_trees
## [1] 159

Part 2

In part 2, we reproduce this process a few times for different slopes, and compute the total product of the number of trees encountered when going down each slope. There is nothing interesting in this chunk compared to the one above.

trees_numbers <- c(0, 0, 0, 0, 0)
k <- 1

for (slope in list(c(1,1), c(3,1), c(5,1), c(7,1), c(1,2))) {

  right_n <- slope[[1]]
  down_n <- slope[[2]]
  
  map <- matrix(NA, nrow = length_input, ncol = right_n*length_input)
  
  treemap <- function(i, j) {
    return(if_else(j%%length_pattern == 0,
                   as.numeric(str_sub(d3_input$X1[i],
                                      length_pattern,
                                      length_pattern) == "#"),
                   as.numeric(str_sub(d3_input$X1[i],
                                      j%%length_pattern,
                                      j%%length_pattern) == "#")))
  }
  for (i in 1:nrow(map)) {
    for (j in 1:ncol(map)) {
      map[i,j] <- treemap(i, j)
    }
  }
  i <- 1
  j <- 1
  n_trees <- 0
  while(i <= nrow(map)) {
    n_trees <- n_trees + map[i,j]
    j <- j + right_n
    i <- i + down_n
  }
  trees_numbers[k] <- n_trees
  k <- k+1
}

prod(trees_numbers, na.rm = TRUE)
## [1] 6419669520

Day 4: Passport Processing

On day 4, we are given a big, messy set of passport data, which we are asked to clean and filter in order to compute the number of “valid” passports.

Day 4’s challenge is easy with stringr functions. The main difficulty is to change this long messy input into a tidy data set. I do so by reading the data as one long character that I split it into discrete observations, from which I extract the value of each available variable using regular expressions.

Part 1

d4_input <- readChar("d4_input.txt", file.info("d4_input.txt")$size) %>%
  str_split("\n\n") %>%
  as.data.frame() %>% 
  as_tibble() %>% 
  rename(entries = c..eyr.2029.iyr.2013.nhcl..ceb3a1.byr.1939.ecl.blu.nhgt.163cm.npid.660456119...)

clean_d4_input <- d4_input %>% 
  mutate(
    entries = str_replace_all(entries, ": ", ":"),
    entries = str_replace_all(entries, "\n", " "),
    entries = str_squish(entries)
  ) %>% 
  mutate(
    eyr = str_remove(str_extract(entries, "eyr:[#A-Za-z0-9]+"), "[a-z]+:"),
    iyr = str_remove(str_extract(entries, "iyr:[#A-Za-z0-9]+"), "[a-z]+:"),
    hcl = str_remove(str_extract(entries, "hcl:[#A-Za-z0-9]+"), "[a-z]+:"),
    byr = str_remove(str_extract(entries, "byr:[#A-Za-z0-9]+"), "[a-z]+:"),
    ecl = str_remove(str_extract(entries, "ecl:[#A-Za-z0-9]+"), "[a-z]+:"),
    hgt = str_remove(str_extract(entries, "hgt:[#A-Za-z0-9]+"), "[a-z]+:"),
    pid = str_remove(str_extract(entries, "pid:[#A-Za-z0-9]+"), "[a-z]+:"),
    cid = str_remove(str_extract(entries, "cid:[#A-Za-z0-9]+"), "[a-z]+:")
  )

clean_d4_input %>%
  filter(
    !is.na(eyr), !is.na(iyr), !is.na(hcl),
    !is.na(byr), !is.na(ecl), !is.na(hgt), !is.na(pid)
    ) %>% nrow()
## [1] 204

Part 2

clean_d4_input %>% 
  mutate(
    byr = as.integer(byr),
    iyr = as.integer(iyr),
    eyr = as.integer(eyr),
    hgt_unit = str_extract(hgt, "[a-z]+"),
    hgt_value = as.integer(str_extract(hgt, "[0-9]+"))
  ) %>% 
  filter(
    !is.na(eyr),
    !is.na(iyr),
    !is.na(hcl),
    !is.na(byr),
    !is.na(ecl),
    !is.na(hgt),
    !is.na(pid),
    byr >= 1920 & byr <= 2002,
    iyr >= 2010 & iyr <= 2020,
    eyr >= 2020 & eyr <= 2030,
    (hgt_unit == "cm" & hgt_value >= 150 & hgt_value <= 193) | (hgt_unit == "in"  & hgt_value >= 59 & hgt_value <= 76),
    str_detect(hcl, "#[0-9a-f]{6}"),
    str_count(hcl) == 7,
    ecl %in% c("amb", "blu", "brn", "gry", "grn", "hzl", "oth"),
    str_detect(pid, "[0-9]{9}"),
    str_count(pid) == 9
  ) %>% nrow()
## [1] 179

Day 5: Binary Boarding

On day 5 we are boarding the plane but our boarding pass is missing. The input is a long list of boarding passes, from which we are asked to deduce our seat ID.

The boarding passes are codified in an interesting way and the main difficulty is to implement the rule that identifies each seat ID from the weird set of letters provided.

Part 1

This is what the input looks like:

d5_input <- read.table("d5_input.txt")
d5_input %>% head()
##           V1
## 1 FBBBBBBRRL
## 2 BBFFFBBLRL
## 3 FBFFBFFLRL
## 4 FFBBBFBLRR
## 5 FFBFBFFLLL
## 6 BFBBBBFLLL
d5_input <- d5_input %>% 
  mutate(
    V1 = str_replace_all(V1, "B", "1"),
    V1 = str_replace_all(V1, "R", "1"),
    V1 = str_replace_all(V1, "F", "0"),
    V1 = str_replace_all(V1, "L", "0")
  ) %>% 
  mutate(
    r_1 = as.numeric(str_sub(V1, 1, 1)),
    r_2 = as.numeric(str_sub(V1, 2, 2)),
    r_3 = as.numeric(str_sub(V1, 3, 3)),
    r_4 = as.numeric(str_sub(V1, 4, 4)),
    r_5 = as.numeric(str_sub(V1, 5, 5)),
    r_6 = as.numeric(str_sub(V1, 6, 6)),
    r_7 = as.numeric(str_sub(V1, 7, 7)),
    c_1 = as.numeric(str_sub(V1, 8, 8)),
    c_2 = as.numeric(str_sub(V1, 9, 9)),
    c_3 = as.numeric(str_sub(V1, 10, 10))
  ) %>% 
  select(-V1)

rows <- 0:127
cols <- 0:7

row_numbers <- rep(NA, nrow(d5_input))
for (i in 1:nrow(d5_input)) {
  row_numbers[i] <- ((((((rows[1:64]+ 64*d5_input$r_1[i])[1:32]
                         + 32*d5_input$r_2[i])[1:16]+16*d5_input$r_3[i])[1:8]
                       + 8*d5_input$r_4[i])[1:4] + 4*d5_input$r_5[i])[1:2]
                     + 2*d5_input$r_6[i])[1] + 1*d5_input$r_7[i]
}


col_numbers <- rep(NA, nrow(d5_input))
for (i in 1:nrow(d5_input)) {
  col_numbers[i] <- ((cols[1:4] + 4*d5_input$c_1[i])[1:2]
                     + 2*d5_input$c_2[i])[1] + 1*d5_input$c_3[i]
}

boarding_passes <- tibble(
  row = row_numbers,
  col = col_numbers
)

boarding_passes %>% 
  mutate(seat_ID = 8*row + col) %>% 
  pull(seat_ID) %>% 
  max()
## [1] 987

Part 2

flight <- matrix(NA, nrow = 128, ncol = 8)

for (i in 1:nrow(d5_input)) {
  flight[row_numbers[i]+1, col_numbers[i]+1] <- "occupied"
}

seat <- which(is.na(flight), arr.ind = TRUE)[which(is.na(flight), arr.ind = TRUE)[, 1] > 12
                                             & which(is.na(flight), arr.ind = TRUE)[, 1] < 124]

(seat[[1]]-1)*8+(seat[[2]]-1)
## [1] 603