Tensor Puzzles Walkthrough: Optimizations, Comparing Solutions
By Shashank Shekhar (@sshkhr16)
In my previous blog post, I introduced the Tensor Puzzles created by Sasha Rush, and walked through my solutions (Tensor-Puzzles-Solutions). In this post, I will be optimizing my solutions to fit the author's first rule:
Each puzzle needs to be solved in 1 line (<80 columns) of code.
I also compare my thought process and solutions to the author's walkthrough. As a fun exercise, I keep score of who has the more concise solution for each puzzle - all in good fun! Note that my "optimized" solutions are not necessarily the most efficient, but rather the most concise. So you'll find that several of them are not the most readable, and I have included my original solutions alongside for better readability.
Puzzle 1: ones
Optimization: Realized `arange(i)` already gives a vector of the right shape. By multiplying with 0 and adding 1, I get a ones vector directly, avoiding any matrix operations from my naive solution.
Comparing Notes: While I used arithmetic (multiply by 0, add 1), Sasha used a logical approach with `where`. His solution leverages the fact that `arange(i) > -1` is always True, mapping to 1 via `where`. Both achieve the same result with different conceptual approaches.
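Roughly, the two approaches look like this - a sketch using plain `torch` calls in place of the puzzles' `arange`/`where` helpers (the scalar form of `torch.where` assumes a recent PyTorch):

```python
import torch

# My arithmetic route: zero out the range, then add one.
def ones_arith(i):
    return torch.arange(i) * 0 + 1

# Sasha's logical route: arange(i) > -1 is always True, and where maps True to 1.
def ones_where(i):
    return torch.where(torch.arange(i) > -1, 1, 0)
```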
Shashank: 20 cols, Sasha: 34 cols
Scores: Shashank 1, Sasha 0
Puzzle 2: sum
Optimization: I just removed the intermediate variable storing the dot product, directly adding the extra dimension with `[None]` to match the return type.
Comparing Notes: Both solutions use dot product with a ones vector for summation. While I expanded the result dimension (`[None]`), Sasha expanded the input dimension (`[:, None]`). Both approaches ensure correct broadcasting for the final 1D tensor.
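In sketch form (hypothetical function names, assuming a float input tensor so the matmuls type-check):

```python
import torch

def sum_mine(a):
    # dot with a ones vector gives a 0-d scalar; [None] lifts it to shape (1,)
    return (a @ torch.ones(a.shape[0]))[None]

def sum_sasha_style(a):
    # expand the input to a column instead; the matmul result is already 1-D
    return torch.ones(a.shape[0]) @ a[:, None]
```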
Shashank: 29 cols, Sasha: 36 cols
Scores: Shashank 2, Sasha 0
💡 I noticed that Sasha used the `shape` method in his solution, which I had incorrectly considered a function that was not allowed. It wasn't a blocker in my own solutions, as I just used `len` with indexing to get the corresponding dimension sizes, but it's a reminder that you can never be too careful about reading the instructions!
Puzzle 3: outer
Optimization: Already had the optimal solution using broadcasting with `None`. Only removed extra spaces for brevity.
Comparing Notes: Both solutions use identical broadcasting logic: expanding `a` to 2D with `[:, None]` and `b` with `[None, :]`. This creates aligned dimensions for element-wise multiplication, perfectly matching the outer product definition.
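The shared trick, sketched:

```python
import torch

def outer(a, b):
    # (n, 1) * (1, m) broadcasts to (n, m): exactly the outer product
    return a[:, None] * b[None, :]
```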
Shashank: 27 cols, Sasha: 27 cols
Scores: Shashank 2.5, Sasha 0.5
Puzzle 4: diag
Optimization: Combined creation of boolean mask and filtering into a single line, skipping intermediate variables.
Comparing Notes: While I used broadcasting to create a 2D boolean mask, Sasha's solution directly indexes both dimensions with the same range tensor. His approach is more concise and elegant, essentially "zipping" the indices to get diagonal positions.
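A sketch of Sasha's zipped-indexing idea, using plain `torch` indexing:

```python
import torch

def diag_sasha_style(a):
    # indexing rows and columns with the same range picks (0, 0), (1, 1), ...
    idx = torch.arange(a.shape[0])
    return a[idx, idx]
```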
Shashank: 60 cols, Sasha: 48 cols
Scores: Shashank 2.5, Sasha 1.5
Puzzle 5: eye
Optimization: Combined boolean mask creation and conversion to integers into one line by adding 0. Removed extra spaces to further reduce length.
Comparing Notes: Both solutions start with the same boolean mask creation. While I add 0 to convert boolean to integers, Sasha originally used `where` but then found an even simpler approach by multiplying the boolean mask by 1. His final solution is more elegant.
Shashank: 47 cols, Sasha: 42 cols
Scores: Shashank 2.5, Sasha 2.5
💡 The key optimization here is that I don't need to add an empty dimension on both sides of the comparison, since adding the empty dimension on the LHS already causes the right side to be implicitly broadcasted. Quite clever and ingenious on Sasha's part!
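In sketch form, with the implicit broadcast on the RHS:

```python
import torch

def eye(j):
    # [:, None] on the LHS is enough; the bare RHS broadcasts against it
    return (torch.arange(j)[:, None] == torch.arange(j)) * 1
```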
Puzzle 6: triu
Optimization: Used the same approach as `eye()`, but with a `<=` comparison to get the upper triangular pattern. Combined boolean mask and integer conversion into one line.
Comparing Notes: Both solutions follow the identical strategy from Puzzle 5, just replacing the equality operator with `<=`. Like in `eye()`, the key difference is my redundant `[None, :]` on the RHS, which Sasha avoids through implicit broadcasting.
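The same sketch with `<=`:

```python
import torch

def triu(j):
    # row index <= column index marks the upper triangle
    return (torch.arange(j)[:, None] <= torch.arange(j)) * 1
```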
Shashank: 47 cols, Sasha: 42 cols
Scores: Shashank 2.5, Sasha 3.5
❗ And just like that, Sasha takes the lead! Could this be a sign of things to come? 🤔
Puzzle 7: cumsum
Optimization: No real optimizations per se: I just removed the spaces from my original solution.
Comparing Notes: Both solutions leverage the same insight - the cumulative sum at position i is the dot product of the input with the first i ones. While I reused `triu()`, Sasha directly embedded the triangular matrix logic. Same concept, different implementation choices.
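A sketch of the shared insight, with the triangular matrix embedded directly (cast to the input's dtype so the matmul type-checks):

```python
import torch

def cumsum(a):
    # an upper-triangular ones matrix sums prefixes: out[j] = sum of a[i] for i <= j
    n = a.shape[0]
    tri = (torch.arange(n)[:, None] <= torch.arange(n)).to(a.dtype)
    return a @ tri
```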
Shashank: 21 cols, Sasha: 73 cols
Scores: Shashank 3.5, Sasha 3.5
Puzzle 8: diff
Optimization: Combined assignment and return into one line with semicolon. In-place assignment avoids creating intermediate arrays while maintaining the required functionality.
Comparing Notes: While I used direct indexing and in-place modification, Sasha used `where` to handle the edge case at index 0. Both solutions compute differences between consecutive elements, just with different approaches to boundary handling.
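A sketch of the `where`-based variant as I read it:

```python
import torch

def diff_where(a, i):
    # keep a[0] at index 0; elsewhere subtract the previous element
    # (index 0 wraps to a[-1] in the subtraction, but where discards that value)
    return torch.where(torch.arange(i) == 0, a, a - a[torch.arange(i) - 1])
```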
Shashank: 28 cols, Sasha: 64 cols
This one's a bit hard to judge: while I used far fewer characters, my solution involves two statements crammed into the same line, whereas Sasha's, while longer, is a single Python statement. I am gonna split the points with Sasha on this one.
Scores: Shashank 4, Sasha 4
Puzzle 9: vstack
Optimization: Initially tried combining masks and outer products, but exceeded 80 chars. Switched to `where`, which simplified the logic and reduced length significantly. Broadcasting eliminated the need for explicit outer products.
Comparing Notes: Both solutions evolved similarly - from complex broadcasting to simpler `where` usage. Sasha's insight about direct broadcasting of `a` and `b` without manipulation made his solution more elegant. A great example of "less is more" in tensor operations.
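A sketch of the `where` idea - a `(2, 1)` condition broadcasts `a` and `b` into rows:

```python
import torch

def vstack(a, b):
    # row 0 of the condition is True, row 1 is False: where picks a then b
    return torch.where(torch.arange(2)[:, None] == 0, a, b)
```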
Shashank: 71 cols, Sasha: 43 cols
Scores: Shashank 4, Sasha 5
Puzzle 10: roll
Optimization: Simplified by removing unnecessary broadcasting with `ones`. Combined three lines into one using semicolons. Direct array modification and indexing keeps the solution concise.
Comparing Notes: While I manually handled the wrap-around by setting the last index to 0, Sasha used the `%` modulo operation for cyclic indexing. His solution is more mathematical and eliminates the need for in-place modification.
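Sasha's modulo trick, sketched:

```python
import torch

def roll(a, i):
    # (arange(i) + 1) % i gives [1, 2, ..., i-1, 0]: each slot pulls its right neighbour
    return a[(torch.arange(i) + 1) % i]
```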
Shashank: 38 cols, Sasha: 29 cols
Scores: Shashank 4, Sasha 6
Puzzle 11: flip
Optimization: Had the optimal solution from the start - using a reversed indexing sequence. Only removed spaces for brevity.
Comparing Notes: Both arrived at identical solutions, creating a reversed index sequence to flip the array. A case where tensor indexing has a clear, optimal approach.
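The shared solution, sketched:

```python
import torch

def flip(a, i):
    # indices i-1, i-2, ..., 0 read the array backwards
    return a[i - 1 - torch.arange(i)]
```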
Shashank: 33 cols, Sasha: 33 cols
Scores: Shashank 4.5, Sasha 6.5
Puzzle 12: compress
Optimization: Combined multiple steps into one line using semicolons: summing boolean mask, creating padded array, and filling with masked values.
Comparing Notes: While I used `sum` to count True values and manual array filling, Sasha used a more sophisticated approach with `cumsum` of the boolean mask to determine target positions. His solution creates a mapping matrix in one go, though it's more complex to parse.
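My reading of the `cumsum` mapping trick, sketched with hypothetical names (`g` is the boolean mask, `v` the values, `i` the output length):

```python
import torch

def compress(g, v, i):
    # cumsum of the mask assigns each kept value its packed target slot;
    # "& g[:, None]" zeroes the rows of dropped values before the contraction
    pos = torch.cumsum(g.long(), 0) - 1
    mapping = (pos[:, None] == torch.arange(i)) & g[:, None]
    return v @ mapping.to(v.dtype)
```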
Shashank: 48 cols, Sasha: 63 cols
Let's split the points on this one: my solution is more concise but uses more than one statement in a line, while Sasha's is a single statement but longer.
Scores: Shashank 5, Sasha 7
Puzzle 13: pad_to
Optimization: Combined multiple steps into one line using semicolons: creating zero array, finding minimum length, and copying values. Keeps the sequential logic but more concise.
Comparing Notes: While I used direct array manipulation with a temporary zero array, Sasha used matrix multiplication with a carefully constructed diagonal matrix. His approach is more elegant mathematically but longer in implementation.
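A sketch of the matrix-multiplication idea as I understand it:

```python
import torch

def pad_to(a, i, j):
    # a rectangular (i, j) identity copies min(i, j) values and zero-pads the rest
    return a @ (torch.arange(i)[:, None] == torch.arange(j)).to(a.dtype)
```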
Shashank: 45 cols, Sasha: 48 cols
I will give this one to Sasha: his solution is more elegant, and if you remove the spaces, it also takes fewer characters.
Scores: Shashank 5, Sasha 8
Puzzle 14: sequence_mask
Optimization: Combined the creation of comparison matrices and masking into one line. Used direct comparison of outer products without converting to integers, as multiplication with `values` handles the conversion. Despite optimization attempts, I couldn't bring the solution under 80 chars.
Comparing Notes: While I used outer products to create comparison matrices, Sasha used `where` with broadcasting. His solution directly compares the reshaped `length` with `arange`, eliminating the need for explicit outer products and making the intent clearer. This was one of the more enlightening solutions from the walkthrough, as I hadn't considered using `where` with direct broadcasting at all.
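A sketch of the `where`-with-broadcasting idea (the scalar `0` in `torch.where` assumes a recent PyTorch):

```python
import torch

def sequence_mask(values, length):
    # keep values[n, j] where j < length[n]; zero elsewhere
    return torch.where(torch.arange(values.shape[1]) < length[:, None], values, 0)
```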
Shashank: 98 cols, Sasha: 65 cols
Scores: Shashank 5, Sasha 9
💡 This was one of the few problems where I couldn't optimize my solution to be under 80 characters. The author's approach using `where` with clever broadcasting not only resulted in a more concise solution but also demonstrated a completely different way of thinking about the problem.
Puzzle 15: bincount
Optimization: Combined the broadcasting, comparison, and summation into one line. The core idea of comparing each index with input values and summing matches remains the same.
Comparing Notes: Our approaches aligned perfectly here - both using comparisons with broadcasted arrays to count occurrences. The final solutions are nearly identical, just differing in format rather than logic.
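The shared counting idea, sketched (the ones-vector matmul returns float counts here):

```python
import torch

def bincount(a, j):
    # for each bin k, count how many entries of a equal k
    return torch.ones(a.shape[0]) @ (a[:, None] == torch.arange(j)).float()
```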
Shashank: 60 cols, Sasha: 47 cols
Scores: Shashank 5, Sasha 10
💡 The key optimization is Sasha's use of dimension addition with `[:, None]` instead of an explicit `outer` product, a pattern we've seen work well in previous puzzles too.
Puzzle 16: scatter_add
Optimization: Combined broadcasting, comparison and summation into one line, similar to bincount. Though technically functional, the solution exceeds the 80-character limit significantly.
Comparing Notes: While I used explicit outer products and element-wise operations, Sasha's elegant solution leverages `eye` matrix indexing to create a mapping matrix. Indexing the identity matrix with the link vector gives the mapping from the columns in `values` to the correct columns in the output.
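A sketch of the `eye`-indexing idea:

```python
import torch

def scatter_add(values, link, j):
    # eye(j)[link] routes values[k] to output column link[k];
    # the matmul sums any values that collide on the same column
    return values @ torch.eye(j)[link].to(values.dtype)
```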
Shashank: 106 cols, Sasha: 28 cols
Scores: Shashank 5, Sasha 11
💡 I really thought this was the most elegant solution in the entire puzzle set. It gave me a new mental model to think about: instead of looping over indexes by using broadcasted matrices to get the mapping, I can get the mapping matrix directly by creating an identity matrix and then permuting it by indexing with the link vector.
❗ At this point I gave up on the character optimizations of my solutions, because the Speed Run Mode only covers Puzzles 1-16. For the remaining 5 puzzles, I just compare my original solution approach to the author's.
Puzzle 17: flatten
Comparing Notes: Both mine and Sasha's solutions use the same core idea - using `arange` to generate linear indices and mapping them to 2D coordinates using division and modulo. The only real difference is that I used regular division followed by typecasting to integer, while Sasha used `//` integer division directly.
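The shared index arithmetic, sketched:

```python
import torch

def flatten(a, i, j):
    # linear index k maps to the 2-D coordinates (k // j, k % j)
    k = torch.arange(i * j)
    return a[k // j, k % j]
```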
Puzzle 18: linspace
Comparing Notes: Both approaches use `arange` as the foundation, scaling and shifting it to get the desired range. The main difference is in handling edge cases - I used explicit conditionals while Sasha handled them with the `max` function.
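A sketch of the scale-shift-and-guard idea:

```python
import torch

def linspace(i, j, n):
    # scale and shift arange; max(n - 1, 1) guards the n == 1 edge case
    return i + torch.arange(n) * (j - i) / max(n - 1, 1)
```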
💡 While I had already used this idea in my solutions, it was spelled out as a good mental model here: vectorization of loops over indices can be achieved directly with `arange` and broadcasting.
Puzzle 19: heaviside
Comparing Notes: While I used two separate `where` calls, Sasha's solution neatly combines the logic into a single `where` statement with a boolean multiplication for the positive case.
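A sketch of the combined `where`, as I read it:

```python
import torch

def heaviside(a, b):
    # b where a == 0; otherwise (a > 0) yields 1 for positives, 0 for negatives
    return torch.where(a == 0, b, (a > 0) * 1)
```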
Puzzle 20: repeat
Comparing Notes: Both solutions arrived at the same elegant approach using an `outer` product with a ones vector. A case where the simplest solution was also the best.
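The shared approach, sketched (casting to float so `torch.outer` with a ones vector type-checks):

```python
import torch

def repeat(a, d):
    # outer product with a ones vector stacks d copies of a as rows
    return torch.outer(torch.ones(d), a.float())
```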
Puzzle 21: bucketize
Comparing Notes: While I used a complex approach with multiple comparison matrices, Sasha's solution is much more elegant - directly comparing values with boundaries and summing the resulting boolean matrix to get bucket indices.
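A sketch of the compare-and-sum idea (assuming the convention that a value equal to a boundary falls into the higher bucket):

```python
import torch

def bucketize(v, boundaries):
    # bucket index = how many boundaries each value has reached or passed
    return (boundaries[:, None] <= v).long().sum(0)
```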
It was really insightful to go over the author's solutions and compare them to my own. Particularly for puzzles 11-16, I was really glad to see broadcasting used in ways that I had never explicitly considered, having relied on indexing instead.
This was also one of those few side projects where I went through the entire process of implementing, optimizing, and comparing solutions, sharing my learning process through blog posts, and putting up my code on GitHub Tensor-Puzzles-Solutions. To celebrate this little win, here is my favorite dog gif from the ones that Sasha uses on successful completion of a puzzle: