
Tensor Puzzles Walkthrough: Optimizations, Comparing Solutions

In my previous blog post, I introduced the Tensor Puzzles created by Sasha Rush, and walked through my solutions (Tensor-Puzzles-Solutions). In this post, I will be optimizing my solutions to fit the author's first rule:

Each puzzle needs to be solved in 1 line (<80 columns) of code.

I also compare my thought process and solutions to the author's walkthrough and, as a fun exercise, track who has the more concise solution for each puzzle - all in good fun! Note that my "optimized" solutions are not necessarily the most efficient, just the most concise. Several of them are therefore not the most readable, so I have included my original solutions alongside for readability.

Puzzle 1: ones

Optimization: Realized arange(i) already gives a vector of the right shape. By multiplying with 0 and adding 1, I get a ones vector directly, avoiding any matrix operations from my naive solution.

Comparing Notes: While I used arithmetic (multiply by 0, add 1), Sasha used a logical approach with where. His solution leverages the fact that arange(i) > -1 is always True, mapping to 1 via where. Both achieve the same result with different conceptual approaches.
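For concreteness, here's roughly what the two routes look like. The aliases below are my stand-ins for the puzzle notebook's primitives, not its actual scaffolding, and the later sketches in this post reuse them:

```python
import torch

# Assumed stand-ins for the puzzle notebook's two allowed primitives.
arange, where = torch.arange, torch.where

def ones(i):
    # Arithmetic route: zero out the ramp, then shift everything up by one.
    return arange(i) * 0 + 1

def ones_logical(i):
    # Logical route: arange(i) > -1 is always True, so where maps every slot to 1.
    return where(arange(i) > -1, 1, 0)
```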

Shashank: 20 cols, Sasha: 34 cols

Scores: Shashank 1, Sasha 0

Puzzle 2: sum

Optimization: I removed the intermediate variable storing the dot product, directly adding the extra dimension with [None] to match the return type.

Comparing Notes: Both solutions use dot product with a ones vector for summation. While I expanded the result dimension ([None]), Sasha expanded the input dimension ([:, None]). Both approaches ensure correct broadcasting for the final 1D tensor.
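A sketch of my route, reusing ones from the Puzzle 1 sketch (len(a) sidesteps the shape method I had assumed was off-limits - more on that below):

```python
def sum(a):
    # Dot with a ones vector collapses a to a 0-d scalar; [None] lifts it
    # back to the required 1-D shape.
    return (ones(len(a)) @ a)[None]
```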

Shashank: 29 cols, Sasha: 36 cols

Scores: Shashank 2, Sasha 0

💡 I noticed that Sasha used the shape method in his solution, which I had incorrectly assumed was not allowed. It wasn't a blocker in my own solutions, as I just used len with indexing to get the corresponding dimension sizes, but it's a reminder that you can never be too careful about reading the instructions!

Puzzle 3: outer

Optimization: Already had the optimal solution using broadcasting with None. Only removed extra spaces for brevity.

Comparing Notes: Both solutions use identical broadcasting logic: expanding a to 2D with [:, None] and b with [None, :]. This creates aligned dimensions for element-wise multiplication, perfectly matching the outer product definition.
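The shared broadcasting idea, as a sketch:

```python
def outer(a, b):
    # (n, 1) * (1, m) broadcasts to the full (n, m) outer product.
    return a[:, None] * b[None, :]
```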

Shashank: 27 cols, Sasha: 27 cols

Scores: Shashank 2.5, Sasha 0.5

Puzzle 4: diag

Optimization: Combined creation of boolean mask and filtering into a single line, skipping intermediate variables.

Comparing Notes: While I used broadcasting to create a 2D boolean mask, Sasha's solution directly indexes both dimensions with the same range tensor. His approach is more concise and elegant, essentially "zipping" the indices to get diagonal positions.
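My reconstruction of the zipped-indexing idea (not necessarily Sasha's exact line):

```python
def diag(a):
    # The same range in both index slots "zips" to (0, 0), (1, 1), ...
    return a[arange(a.shape[0]), arange(a.shape[0])]
```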

Shashank: 60 cols, Sasha: 48 cols

Scores: Shashank 2.5, Sasha 1.5

Puzzle 5: eye

Optimization: Combined boolean mask creation and conversion to integers into one line by adding 0. Removed extra spaces to further reduce length.

Comparing Notes: Both solutions start with the same boolean mask creation. While I add 0 to convert boolean to integers, Sasha originally used where but then found an even simpler approach by multiplying the boolean mask by 1. His final solution is more elegant.
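A sketch of the shared mask idea, shown with my +0 cast (see the 💡 note below for the broadcasting subtlety):

```python
def eye(j):
    # A column of indices compared against a row of indices broadcasts to a
    # (j, j) boolean mask; + 0 casts it to integers (Sasha's * 1 works too).
    return (arange(j)[:, None] == arange(j)) + 0
```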

Shashank: 47 cols, Sasha: 42 cols

Scores: Shashank 2.5, Sasha 2.5

💡 The key optimization here is that I don't need to add an empty dimension on both sides of the comparison, since adding the empty dimension on the LHS already causes the right side to be implicitly broadcast. Quite clever and ingenious on Sasha's part!

Puzzle 6: triu

Optimization: Used the same approach as eye(), but with <= comparison to get upper triangular pattern. Combined boolean mask and integer conversion into one line.

Comparing Notes: Both solutions follow identical strategy from Puzzle 5, just replacing the equality operator with <=. Like in eye(), the key difference is my redundant [None,:] on the RHS which Sasha avoids through implicit broadcasting.
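The same sketch with the operator swapped:

```python
def triu(j):
    # Identical to eye, but <= keeps the diagonal and everything above it.
    return (arange(j)[:, None] <= arange(j)) + 0
```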

Shashank: 47 cols, Sasha: 42 cols

Scores: Shashank 2.5, Sasha 3.5

❗ And just like that, Sasha takes the lead! Could this be a sign of things to come? 🤔

Puzzle 7: cumsum

Optimization: No real optimizations per se: I just removed the spaces from my original solution.

Comparing Notes: Both solutions leverage the same insight - cumulative sum at position i is the dot product of input with first i ones. While I reused triu(), Sasha directly embedded the triangular matrix logic. Same concept, different implementation choices.
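My triu-reusing version, roughly:

```python
def cumsum(a):
    # triu(n)[k, i] is 1 exactly when k <= i, so column i of the matmul
    # accumulates a[0] + ... + a[i].
    return a @ triu(len(a))
```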

Shashank: 21 cols, Sasha: 73 cols

Scores: Shashank 3.5, Sasha 3.5

Puzzle 8: diff

Optimization: Combined assignment and return into one line with semicolon. In-place assignment avoids creating intermediate arrays while maintaining the required functionality.

Comparing Notes: While I used direct indexing and in-place modification, Sasha used where to handle the edge case at index 0. Both solutions compute differences between consecutive elements, just with different approaches to boundary handling.
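A rough sketch of my semicolon one-liner:

```python
def diff(a, i):
    # Overwrite a[1:] with consecutive differences in place; a[0] is the
    # boundary case and stays untouched.
    a[1:] = a[1:] - a[:-1]; return a
```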

Shashank: 28 cols, Sasha: 64 cols

This one's a bit hard to judge: while I used far fewer characters, my solution crams two statements into the same line, whereas Sasha's, though longer, is a single Python statement. I am gonna split the points with Sasha on this one.

Scores: Shashank 4, Sasha 4

Puzzle 9: vstack

Optimization: Initially tried combining masks and outer products, but exceeded 80 chars. Switched to where which simplified logic and reduced length significantly. Broadcasting eliminated need for explicit outer products.

Comparing Notes: Both solutions evolved similarly - from complex broadcasting to simpler where usage. Sasha's insight about direct broadcasting of a and b without manipulation made his solution more elegant. A great example of "less is more" in tensor operations.
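My reading of the where-based row selector:

```python
def vstack(a, b):
    # A (2, 1) selector column: row 0 is True (take a), row 1 is False
    # (take b), and both branches broadcast across the columns.
    return where(arange(2)[:, None] == 0, a, b)
```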

Shashank: 71 cols, Sasha: 43 cols

Scores: Shashank 4, Sasha 5

Puzzle 10: roll

Optimization: Simplified by removing unnecessary broadcasting with ones. Combined three lines into one using semicolons. Direct array modification and indexing keeps the solution concise.

Comparing Notes: While I manually handled the wrap-around by setting the last index to 0, Sasha used the % modulo operation for cyclic indexing. His solution is more mathematical and eliminates the need for in-place modification.
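A sketch of the modulo version:

```python
def roll(a, i):
    # (k + 1) % i wraps the final index back to 0, so no in-place patching.
    return a[(arange(i) + 1) % i]
```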

Shashank: 38 cols, Sasha: 29 cols

Scores: Shashank 4, Sasha 6

Puzzle 11: flip

Optimization: Had the optimal solution from the start - using a reversed indexing sequence. Only removed spaces for brevity.

Comparing Notes: Both arrived at identical solutions, creating a reversed index sequence to flip the array. A case where tensor indexing has a clear, optimal approach.
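The shared solution, roughly:

```python
def flip(a, i):
    # Index with the reversed ramp i-1, i-2, ..., 0.
    return a[i - arange(i) - 1]
```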

Shashank: 33 cols, Sasha: 33 cols

Scores: Shashank 4.5, Sasha 6.5

Puzzle 12: compress

Optimization: Combined multiple steps into one line using semicolons: summing boolean mask, creating padded array, and filling with masked values.

Comparing Notes: While I used sum to count True values and manual array filling, Sasha used a more sophisticated approach with cumsum of the boolean mask to determine target positions. His solution creates a mapping matrix in one go, though more complex to parse.
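My version expanded back onto separate lines (the one-liner crams these together with semicolons; g.sum() here stands in for however the count was taken):

```python
def compress(g, v, i):
    # Zero-filled output of length i; boolean indexing packs the kept
    # values of v to the left, and the count of True entries marks the split.
    out = ones(i) * 0
    out[:g.sum()] = v[g]
    return out
```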

Shashank: 48 cols, Sasha: 63 cols

Let's split the points on this one: my solution is more concise but crams more than one statement into a line, while Sasha's is a single statement, just longer.

Scores: Shashank 5, Sasha 7

Puzzle 13: pad_to

Optimization: Combined multiple steps into one line using semicolons: creating zero array, finding minimum length, and copying values. Keeps the sequential logic but more concise.

Comparing Notes: While I used direct array manipulation with a temporary zero array, Sasha used matrix multiplication with a carefully constructed diagonal matrix. His approach is more elegant mathematically but longer in implementation.
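My reconstruction of the diagonal-matrix idea (an assumption on the exact form):

```python
def pad_to(a, i, j):
    # An (i, j) "rectangular identity": M[k, l] = 1 iff k == l, so a @ M
    # copies the first min(i, j) entries and zero-pads any remainder.
    return a @ (1 * (arange(i)[:, None] == arange(j)))
```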

Shashank: 45 cols, Sasha: 48 cols

I will give this one to Sasha: his solution is more elegant, and with the spaces removed, it would also take fewer characters.

Scores: Shashank 5, Sasha 8

Puzzle 14: sequence_mask

Optimization: Combined creation of comparison matrices and masking into one line. Used direct comparison of outer products without converting to integers, as multiplication with values handles the conversion. Despite optimization attempts, couldn't bring solution under 80 chars.

Comparing Notes: While I used outer products to create comparison matrices, Sasha used where with broadcasting. His solution directly compares the reshaped lengths with arange, eliminating the need for explicit outer products and making the intent clearer. This was one of the more enlightening solutions from the walkthrough, as I hadn't considered using where with direct broadcasting at all.
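My reading of the where-plus-broadcasting route:

```python
def sequence_mask(values, length):
    # length[:, None] > arange(m) broadcasts to a per-row validity mask;
    # where keeps valid entries and zeroes the rest.
    return where(length[:, None] > arange(values.shape[1]), values, 0)
```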

Shashank: 98 cols, Sasha: 65 cols

Scores: Shashank 5, Sasha 9

💡 This was one of the few problems where I couldn't optimize my solution to be under 80 characters. The author's approach using where with clever broadcasting not only resulted in a more concise solution but also demonstrated to me a completely different way of thinking about the problem.

Puzzle 15: bincount

Optimization: Combined the broadcasting, comparison, and summation into one line. The core idea of comparing each index with input values and summing matches remains the same.

Comparing Notes: Our approaches aligned perfectly here - both using comparisons with broadcasted arrays to count occurrences. The final solutions are nearly identical, just differing in format rather than logic.
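The shared counting idea, sketched in the [:, None] style from the 💡 note below:

```python
def bincount(a, j):
    # M[i, k] is 1 iff a[i] == k; multiplying ones through the rows
    # (ones @ M) sums each column, i.e. counts occurrences of k.
    return ones(len(a)) @ (1 * (a[:, None] == arange(j)))
```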

Shashank: 60 cols, Sasha: 47 cols

Scores: Shashank 5, Sasha 10

💡 The key optimization is Sasha's use of dimension addition with [:, None] instead of explicit outer product, a pattern we've seen work well in previous puzzles too.

Puzzle 16: scatter_add

Optimization: Combined broadcasting, comparison and summation into one line, similar to bincount. Though technically functional, the solution exceeds the 80-character limit significantly.

Comparing Notes: While I used explicit outer products and element-wise operations, Sasha's elegant solution leverages eye matrix indexing to create a mapping matrix. Indexing the identity matrix with the link vector gives the mapping from the columns of values to the correct columns of the output.
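My reading of the eye-indexing trick:

```python
def scatter_add(values, link, j):
    # eye(j)[link] is an (n, j) one-hot routing matrix: row i is one-hot at
    # column link[i], so the matmul routes each value and sums collisions.
    return values @ eye(j)[link]
```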

Shashank: 106 cols, Sasha: 28 cols

Scores: Shashank 5, Sasha 11

💡 I really thought this was the most elegant solution in the entire puzzle set. It gave me a new mental model: instead of looping over indices with broadcasted matrices to get the mapping, I can get the mapping matrix directly by creating an identity matrix and permuting it via indexing with the link vector.


❗ At this point I gave up on the character optimizations of my solutions, because the Speed Run Mode only covers Puzzles 1-16. For the remaining 5 puzzles, I just compare my original solution approach to the author's.

Puzzle 17: flatten

Comparing Notes: Both my solution and Sasha's use the same core idea - using arange to generate linear indices and mapping them to 2D coordinates with division and modulo. The only real difference is that I used regular division followed by typecasting to integer, while Sasha used // integer division directly.
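The shared index arithmetic, in the //-flavored form:

```python
def flatten(a, i, j):
    # Linear index k lands at row k // j, column k % j.
    return a[arange(i * j) // j, arange(i * j) % j]
```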

Puzzle 18: linspace

Comparing Notes: Both approaches use arange as the foundation, scaling and shifting it to cover the desired range. The main difference is in edge-case handling - I used explicit conditionals, while Sasha handled it with the max function.
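A sketch of the max-guarded version (my reconstruction, not necessarily the exact line):

```python
def linspace(i, j, n):
    # Scale and shift the ramp; max(1, n - 1) guards the single-point case
    # instead of an explicit conditional.
    return i + (j - i) * arange(n) / max(1, n - 1)
```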

💡 While I had already used this idea in my solutions, it was spelled out as a good mental model here: vectorization of loops over indices can be achieved directly with arange and broadcasting.

Puzzle 19: heaviside

Comparing Notes: While I used two separate where calls, Sasha's solution neatly combines the logic into a single where statement with a boolean multiplication for the positive case.
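My reading of the single-where version:

```python
def heaviside(a, b):
    # The a == 0 branch takes b; everywhere else (a > 0) * 1 yields 0 or 1.
    return where(a == 0, b, (a > 0) * 1)
```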

Puzzle 20: repeat

Comparing Notes: Both solutions arrived at the same elegant approach using outer product with ones vector. A case where the simplest solution was also the best.
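The shared outer-product idea, as a sketch:

```python
def repeat(a, d):
    # Outer product with a ones column: each of the d rows is a copy of a.
    return ones(d)[:, None] * a[None, :]
```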

Puzzle 21: bucketize

Comparing Notes: While I used a complex approach with multiple comparison matrices, Sasha's solution is much more elegant - directly comparing values with boundaries and summing the resulting boolean matrix to get bucket indices.
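My reconstruction of the compare-and-sum idea:

```python
def bucketize(v, boundaries):
    # Bucket index = how many boundaries sit at or below each value.
    return ones(len(boundaries)) @ (1 * (boundaries[:, None] <= v))
```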

It was really insightful to go over the author's solutions and compare them to my own. Particularly for puzzles 11-16, I was glad to see broadcasting used in ways I had never explicitly considered, where I had relied on indexing instead.

This was also one of those few side projects where I went through the entire process: implementing, optimizing, and comparing solutions, sharing my learning through blog posts, and putting my code up on GitHub (Tensor-Puzzles-Solutions). To celebrate this little win, here is my favorite dog gif from the ones Sasha uses on successful completion of a puzzle: