Advent of Code, Day 11

I've been working through the puzzles for the Advent of Code, and they're quite a bit of fun. If you enjoy programming and/or computer science, I would highly recommend trying them out if you haven't already.

The problems increase in difficulty as Christmas gets nearer. Up to day 11, the have mostly been quick problems that yield to a relatively straightforward ad-hoc algorithm. Depending on your level of experience, of course Day 11, though, was quite a bit trickier.

This post will walk through the solution for the puzzle. Before reading further, take a look at the problem statement if you are not already familiar with it.

Getting Started

After I first looked at the problem, I realized that I was going to need quite a bit of time to think about it before starting the actual algorithm. Since I was still hoping at the time to make the leaderboard, I decided to postpone the actual algorithm and just work on reading the file to give myself a bit of time to think. I didn't. :(

While parsing the input file doesn't seem terribly interesting, a question that comes up while doing it turns out to be extremely important.

The sample input for this problem is:

The first floor contains a hydrogen-compatible microchip and a lithium-compatible microchip.
The second floor contains a hydrogen generator.
The third floor contains a lithium generator.
The fourth floor contains nothing relevant.

which represents an object distribution like this (courtesy the problem statement):

F4 .  .  .  .  .
F3 .  .  .  LG .
F2 .  HG .  .  .
F1 E  .  HM .  LM

where H, L, M, and G represent hydrogen, lithium, microchip, and generator, respectively.

Since each item is preceding by a, I just split on that, extracted the chips and generators, and processed them separately. If you are interested, my full implementation is here.

The question that I had while writing this part was whether I needed to preserve the names in the representation of the item distribution. Based on the problem statement, it is clear that we absolutely must give both hydrogen items the same tag and both lithium items the same tag, since a chip of one element is protected by a generator of the same element, but it isn't clear those tags need to be hydrogen and lithium. It turns out that we could just as well call hydrogen 1 and lithium 2, as long as we do it consistently. The interactions between two objects depend on whether they are the same element, but not on their names.

I ended up keeping the tags as the element names out of convenience, but knowing that the tags don't matter turned out to be important. Keep it in mind.

Finding the Shortest Path

Now we must turn to the actual problem, which is to determine the minimum number of moves needed to get all the items to the 4th floor, provided that we must take only legal moves, following the constraints provided.

If you think about the current item distribution as a start 'position' and the the goal of all items on the top floor as a final 'position', this problem can be seen as finding the shortest path from the start to the finish. The 'distance' is the number of moves that are needed to get from distribution A to distribution B. This characterization suggests that we think about the states of (elevator, distribution) as nodes in a graph, with the edges leaving each node given by the legal moves available at that state.

At these point, there isn't really a way to get around needing to know some algorithms to find shortest paths. The classic one is Dijkstra's algorithm, which is general enough for any graph and performs a 'best-first search', where the next best node is determined by the distance alone. However, since every move (and therefore every edge) represents the same distance, we don't need the full generality of Dijkstra's algorithm. In this case of equal edge distances, Dijkstra's reduces to simply Breadth first search, which works well for this problem. The important idea is that breadth first search examines all possible states after N moves before examining any after N+1 moves, so it will always find the minimum number of moves first.

I won't go into detail on breadth first search, but here is the pseudocode for the algorithm to solve the puzzle.

moves = []
// (floor, distance, item distribution)
current_state = (0, 0, item distribution)

while moves isn't empty:
    move = moves.pop_front
    current_state = apply move to current_state

    if current_state has all items on top floor:
        return distance // second element of current_state
    else:
        moves.push_back(possible moves from current_state)

I know that is pretty abstract, but the implementation is not bad. The possible moves from a given state can be found by listing all the subsets of the set of items on the current floor and choosing to move either up or down, and then eliminating those that result in an illegal state (e.g. one that involves frying some of your microchips or going to floor -1).

So, we are done, yeah? That wasn't so bad! Pick a language, write some code, and try it.

Even if you get the implementation working properly, we have a pretty big problem. Sure, this algorithm works, but it will take all day (or maybe even all week) to run. Huh. We need to change something if we want the solution before December 25th.

Cutting some Branches, Part I

Before we can speed up the algorithm, we need to understand what exactly is happening. How many moves are available at each state? We can either move up a floor or down a floor, so there are two moves. In addition, we must bring with us either 1 or 2 items (due to the elevator's rules). Let's say there are 3 items on the current floor. Then we can pick any one of them (3 possibilities) or any combination of two of them (another 3), for a total of 6 possible choices of items. Combined with the up/down choice, we have 12 possible moves. This number will be decreased by the restrictions on radiation, but not enough to solve our problem. At every state, we will have ~10 moves, resulting in ~10 states after one move. Each of those states will also have ~10 moves, for ~100 states after two moves. Thus exponential growth in possibilities, which doesn't play nicely with finitely fast computers.

At this point, I modified my program to print out the move queue for each number of moves and took a look at the first few outputs. For the second move, the algorithm considers the possibility of undoing the previous move and returning to the initial state. This is pretty obviously a dumb idea, since going back to where we just where after more moves can't possibly be optimal.

This brings us to the idea of pruning the search tree. If we know that a certain move is not going to lead us to the optimal solution, we can just discard it completely. This doesn't just save us one move, but all the moves that could be taken from the resulting state, and so on. Therefore this idea is actually a really good one, and it cuts out a significant fraction of the moves that need to be considered.

In the specific case of this problem, we can disregard all moves that lead back to a state that has already been examined. The implementation of this isn't that bad either. I just kept a hash table with a state as the key and inserted an element every time a state was examined. Then, when adding new moves, I first consulted the visited hash to see if the resulting state already had an entry, and ignored the move if it did.

Now try writing this up. This modified algorithm probably executes reasonably quickly for the sample input, but is likely still super slow for the full puzzle, depending on your implementation language. Some more work is needed.

Cutting some branches, Part II

The idea of pruning moves that lead to previously-visited states is the only one that came to me fairly quickly.

After thinking for a long time, I returned to the idea discussed above regarding the importance of the tags of the items. As noted, the names don't matter, only whether any two elements represent a pair. Let's consider the initial states of the sample input again. Here it is, for easy reference:

F4 .  .  .  .  .
F3 .  .  .  LG .
F2 .  HG .  .  .
F1 E  .  HM .  LM

Since the names are arbitrary and don't matter, the entire state can be represented just by the floor the elevator is on and a list of paired locations. So we can represent this state as:

{
    // elevator on first floor
    elevator: 0,

    // first pair on floors 0 and 1,
    // second pair on floors 0 and 2
    pairs: [(0, 1), (0,2)]
}

Wait a second. We have lost some information! How are we supposed to know whether the first pair in that list is lithium or hydrogen? From that representation of the state, we could just as well have this state:

F4 .  .  .  .  .
F3 .  .  .  HG .
F2 .  LG .  .  .
F1 E  .  LM .  HM

That's clearly different, isn't it?

But we said before that the names don't matter. What stops us from calling 'lithium' 'hydrogen', and 'hydrogen' 'lithium'?

Nothing. This state is exactly the same as the first, just with some names mixed up. Follow the same sequence given in the problem statement with lithium and hydrogen switched, and you will again solve the puzzle in 11 moves.

The simple idea that the names don't matter turns out to be the key to solving the entire puzzle. We have already determined that we shouldn't bother revisiting states that we have previously considered, but at the time I was considering the two states above to be distinct. For all our intents and purposes, having visited the first state is the same as having visited the second, since both require the same number of moves to complete the puzzle.

With this in mind, I modified my hash table of previously visited states to use the (floor number, list of pairs) representation of the state as the key. Therefore, both of the states above will map to the same entry of the hash table, and we can prune not only moves leading to a state we have seen, but also all states that are the same but with names switched up.

Using this algorithm, my naive and allocation-happy Ruby code gives the solutions for both parts in around 35 seconds, which is fast enough for me. Tada! Puzzle solved.

If you are interested, my complete implementation is available here.

Some Notes

While I consider the puzzle to be solved at this point, if you want to further improve performance there are additional ideas for pruning in this reddit thread. Some of them seem reasonable, but I haven't been able to convince myself that some of the suggestions would work for every input.

If you have any questions, comments, or suggestions regarding this post, you can send me an email (see the about page) or ask a question on /r/adventofcode.