This project solves Sumoku (Killer Sudoku) problems with various approaches, such as:
- traditional approach (traversing from the left to the right and from the top to the bottom),
- ordering approach (exploring the box that contains least amount of elements)
- MRV (exploring the cell that has the minimum remaining values)
Sumoku is a variation of Sudoku.
The objective is to fill the board with numbers from 1 to 9 and follow the following conditions:
- A number is unique in each row
- A number is unique in each column
- A number is unique in each box
- The sum of all numbers in a box should equal to the number provided
The lady sitting next to me on my flight back home was solving Sudoku problems. I chatted with her saying that I am also a fan of solving Sudoku problems. She tore a page from her Sudoku magazine and handed me. It was not standard Sudoku, but Sumoku-a variation of Sudoku that I have never seen before. I was flabbergasted as it is provided with no hints but only sums to begin with. I gave up solving the puzzle on my flight and started to chat with the lady. I mentioned that I am a software engineer to her. She believes that I am smart enough to solve this Sumoku. When I disembarked the plane, I thought to myself, "Hmm..., what if I write a piece of software that solves Sumoku? This is very similar to Sudoku and I did a LeetCode question like this before." And the rest is history.
The requirements are:
- CMake 3.18 or better; 4.0+ highly recommended
- A C++20 compatible compiler (gcc or llvm)
- The Boost libararies
- Git
- Doxygen (optional, highly recommended)
- fmt 11.0 or higher (will automatically install if not present)
- Catch2 3.8 or higher (will automatically install if not present)
- nanobench 4.3 or higher (will automatically install if not present)
- json 3.9.1 or higher (will automatically install if not present)
- abseil 20250512.1 or newer (will automatically install if not present)
To configure:
cmake -S . -B buildAdd --toolchain=./<your_toolchain_file>.toolchain if you want to use your own toolchain.
Add -GNinja if you have Ninja.
To build without example:
cmake --build buildTo test (--target can be written as -t in CMake 3.15+):
cmake --build build --target testTo run the binary with example layout:
./build/apps/appTo build and test:
cmake --build build -DCMAKE_BUILD_TYPE=Test && cmake --build build --target testRun a specific tag:
./build/tests/solvertestlib "[<tag>]"To build docs (requires Doxygen, output in build/docs/html):
cmake --build build --target docsTo build and run benchmark:
cmake -S . -B build -DCMAKE_BUILD_TYPE=Benchmark && ./build/bench/<name_of_benchmark>The folllowing table shows the latest iteration of the MRV method versus the traditional method (the very first iteration). There is a whopping 10x to 460x improvement (varies depending on the puzzles).
| ns/op | op/s | err% | total | Sumoku Solver Comparison #1 |
|---|---|---|---|---|
| 2,368.01 | 422,295.72 | 2.8% | 0.01 | traditional |
| 1,039.62 | 961,886.33 | 2.2% | 0.01 | MRV |
| ns/op | op/s | err% | total | Sumoku Solver Comparison #2 |
|---|---|---|---|---|
| 3,762.73 | 265,764.19 | 3.1% | 0.01 | traditional |
| 1,232.61 | 811,287.77 | 3.1% | 0.01 | MRV |
| ns/op | op/s | err% | total | Sumoku Solver Comparison #3 |
|---|---|---|---|---|
| 14,046,459.00 | 71.19 | 0.6% | 0.16 | traditional |
| 31,449.08 | 31,797.43 | 0.7% | 0.01 | MRV |
| ns/op | op/s | err% | total | Sumoku Solver Comparison #4 |
|---|---|---|---|---|
| 1,811,875.00 | 551.91 | 2.1% | 0.02 | traditional |
| 20,879.81 | 47,893.16 | 2.2% | 0.01 | MRV |
| ns/op | op/s | err% | total | Sumoku Solver Comparison #5 |
|---|---|---|---|---|
| 775,333.00 | 1,289.77 | 1.9% | 0.01 | traditional |
| 22,021.63 | 45,409.89 | 1.6% | 0.01 | MRV |
| ns/op | op/s | err% | total | Sumoku Solver Comparison #6 |
|---|---|---|---|---|
| 190,116.60 | 5,259.93 | 1.9% | 0.01 | traditional |
| 20,076.13 | 49,810.39 | 2.1% | 0.01 | MRV |
| ns/op | op/s | err% | total | Sumoku Solver Comparison #7 |
|---|---|---|---|---|
| 9,487,000.00 | 105.41 | 0.4% | 0.10 | traditional |
| 21,794.27 | 45,883.62 | 1.7% | 0.01 | MRV |
Note: The above results are generated by building it with -O3 -march=native -flto -ffast-math -DNDEBUG on a Mac Mini with M4 chip with 24 GB of RAM.
Backtracking is a technique that finds solutions for constraint satisfaction problems. A constraint satisfaction problem
Traversing rows and columns to verify that a number is unique is a O(N) operation. In our 9-by-9 grid, it is still manageable but we can reduce it from O(N) to O(1) by using a bitmask. Since we assume the board size (N) is 9 in the standard Sumoku game, we can use uint16_t to represent what numbers are present. The 1st bit in the mask represents if the number 1 is present or not, the 2nd bit in the mask represents if the number 2 is present or not, etc.
Let say we are placing number 7 in a given row, we just need to do the following steps:
- Bit-shift 1 by 7 times
- Find the bitmask that presents the given row
- OR the result from step (1) with the bitmask from step (2)
Let say we need to check if number 5 is already present in a given column, we just need to do the following steps:
- Bit-shift 1 by 5 times
- Find the bitmask that present the given column
- AND the result from step (1) with the bitmask from step (2)
Overall, we need to use two vector of uint16_t's (one for all the rows and another for all the columns) to represent the board.
The chart below shows the improvement from using bitmasks. We can see at least a 120% improvement on different approaches.
| ms/op | op/s | err% | total | Sumoku Solver Comparison #1 |
|---|---|---|---|---|
| 152.02 | 6.58 | 0.4% | 1.69 | traditional |
| 112.37 | 8.90 | 0.2% | 1.24 | traditional w/ bitmasks |
| 13.08 | 76.42 | 0.2% | 0.15 | ordering |
| 11.04 | 90.56 | 0.2% | 0.12 | ordering w/ bitmasks |
| ms/op | op/s | err% | total | Sumoku Solver Comparison #2 |
|---|---|---|---|---|
| 18.57 | 53.85 | 0.2% | 0.21 | traditional |
| 13.95 | 71.67 | 0.2% | 0.15 | traditional with bitmasks |
| 404.15 | 2.47 | 0.1% | 4.45 | ordering |
| 334.99 | 2.99 | 0.2% | 3.68 | ordering w/ bitmasks |
Minimum remaining value (MRV) heuristic is a way prioritize the selection of elements in a search algorithm. We mentioned earlier that backtracking is essentially a depth-first search (DFS) algorithm, and that it searches through all potential solution spaces. We can incorporate MRV heuristic into the search algorithm such that it searches potential solution spaces that have the least amount of possibilities (most constraints).
There are three constraints in Sumoku:
- a number needs to be unique in each row,
- a number needs to be unique in each column, and
- the sum of all elements within a box should equal to the sum that is provided
We iterate all blank elements in each iteration and find the element with the minimum remaining value by calling FindNextBestCell.
inline Selection FindNextBestCell()
{
Selection ret;
size_t curMinCnt = _N + 1;
// Loop through the entire board to find the next best cell
for (size_t r = 0; r < _N; ++r)
{
for (size_t c = 0; c < _N; ++c)
{
// Only check the cell that is empty
if (_board[r][c] == 0)
{
// Get the candidates and the number of candidates
uint16_t candidates = GetCandidates(r, c);
// If there is no candidate available that means we hit a dead end and this tree needs to be pruned
if (candidates == 0) [[unlikely]]
{
return Selection {.deadEnd = true};
}
#ifdef __GNUC__
int curNumOfCandidates = __builtin_popcount(candidates);
#else
int curNumOfCandidates = std::popcount(candidates);
#endif
// Update the return value when the current number of candidates is smaller than the previous one
if (curNumOfCandidates < curMinCnt)
{
curMinCnt = curNumOfCandidates;
ret.r = r;
ret.c = c;
ret.mask = candidates;
// If there is only one candidate then we return the current value early
if (curNumOfCandidates == 1)
{
return ret;
}
}
}
}
}
return ret;
}The function finds the element and return to Backtrack.
The following chart demonstrate the benefit of using MRV approach as it is faster than other solvers in average. The drawback is that this algorithm is required to loop through all remaining elements at every iteration and can be slow if the size of the board is small.
| ms/op | op/s | err% | total | Sumoku Solver Comparison #1 |
|---|---|---|---|---|
| 112.37 | 8.90 | 0.2% | 1.24 | traditional w/ bitmasks |
| 11.04 | 90.56 | 0.2% | 0.12 | ordering w/ bitmasks |
| 17.54 | 57.02 | 0.9% | 0.19 | MRV |
| ms/op | op/s | err% | total | Sumoku Solver Comparison #2 |
|---|---|---|---|---|
| 13.95 | 71.67 | 0.2% | 0.15 | traditional with bitmasks |
| 334.99 | 2.99 | 0.2% | 3.68 | ordering w/ bitmasks |
| 17.10 | 58.48 | 0.4% | 0.19 | MRV |
__builtin_popcount is a GCC special function that returns the number of 1's (set bits) in a given unsigned int. It leverages hardware instructions if possible, which is significantly faster than software implementation. In C++20, one can use std::popcount if the compiler is not GCC compatible. We use a preprocessor to check if the compiler is GCC and call different function accordingly.
#ifdef __GNUC__
int curNumOfCandidates = __builtin_popcount(candidates);
#else
int curNumOfCandidates = std::popcount(candidates);
#endif__builtin_ctz (count trailing zeros) is a GCC special function that returns the number of consecutive 0's from the right (least significant bit). It leverages hardware instructions if possible, which is significantly faster than software implementation. In C++20, one can use std::countr_zero if the compiler is not GCC compatible. If the input is 41 (0b101001) then the return value will be 0 and if the input is 48 (0b110000), then the return value will be 4.
We know that there can be at most 9 elements in a box and the maximum sum is 45 (
constexpr uint16_t GetPossibleNumbersMask(size_t target, size_t count)
{
static constexpr PossibleNumbersTable table{};
return table.get(target, count);
}Now we can use GetPossibleNumbersMask to improve our FindNextBestCell().
uint16_t GetCandidates(size_t r, size_t c)
{
size_t id = _boxID[r][c];
uint16_t forbidden = _rowMask[r] | _colMask[c] | _boxMask[id];
uint16_t ret = 0U;
for (int v = 1; v <= _N; ++v)
{
// Check if the current number is possible
if (!(forbidden & (1U << v)) && (_options[r][c] >> v))
{
ret |= (1U << v);
}
}
return ret;
}The following table shows the improvement by using pre-calculated candidate approach. We see a whopping 120 times improvement over the old method.
| ms/op | op/s | err% | total | Sumoku Solver Comparison #1 |
|---|---|---|---|---|
| 17.54 | 57.02 | 0.9% | 0.19 | MRV |
| 0.14 | 6,920.13 | 1.3% | 0.01 | MRV w/ precalculated candidates |
| ms/op | op/s | err% | total | Sumoku Solver Comparison #2 |
|---|---|---|---|---|
| 17.10 | 58.48 | 0.4% | 0.19 | MRV |
| 0.15 | 6,810.44 | 1.9% | 0.01 | MRV w/ precalculated candidates |
In a typical Sumoku game, the valid number goes from 1 to 9. Therefore the number of combinaitons is
We use i to represents the current combination of numbers. Since the number 0 is not a valid candidate, we just ignore it and let the last bit of i to represent the number 1, the second last bit of i to represent the number 2, etc.
However, since the mask is 0-based, i.e., the last digit represents the number 0, the second last bit represents the number 1, we need to convert i to mask before we can put it to table.
0b001: 1
0b010: 2
0b100: 3
We can get the number that i represents by adding the number of trailing 0's (__builtin_ctz: count trailing zeros) and 1.
And since each 1 bit in i represents a digit, we can get the current number of digits by counting the numbers of 1's in i by using __builtin_popcount.
for (uint16_t i = 0U; i < (1U << MAX_COUNT); ++i)
{
// Get the count of the numbers
int cnt = 0;
#ifdef __GNUC__
cnt = __builtin_popcount(i);
#else
cnt = std::popcount(i);
#endif
uint16_t mask = i;
uint16_t sum = 0;
uint16_t tmp = i;
while (tmp)
{
// Convert to number and add to the current sum
#ifdef __GNUC__
sum += __builtin_ctz(tmp) + 1;
#else
sum += std::countr_zero(tmp) + 1;
#endif
tmp &= (tmp - 1); // removes the rightmost 1 bit
}
// Store the current count and the sum into the table
table[cnt][sum] |= (mask << 1);
}This new version of finding all candidates that can sum up to a target is easier and more straight-forward compared to the dynamic programming that we will cover later.
Dynamic programming is a very powerful algorithm that breaks down complex problems into smaller sub-problems. There are multiple variations of dynamic programming, e.g., recursion, linear, or 2D. In our case, we need to create a 3D version.
This is probably the most challenging part in this repository. A 2D dynamic programming method was used but failed test cases and ChatGPT came to the rescue. (Yes, I am not ashamed that I used ChatGPT for help.)
We know that we need to find all the numbers that can sum up to a target in compile-time so that we can skip the calculations at run-time. The state space is { the number of digits, the target sum }. The value of each element represents the combinations of candidates. There are many combinations and can not be consolidated into one single mask (I made that mistake in the beginning).
The core logic is that how to find the combinations of candidates that can sum up to 22 with three digits that includes the digit 5.
We know that if this statement is true if and only if we can sum up to 17 (
For each element in the 2D dynamic programming grid, we need an additional dimension that stores all the possible combinations.
states[i][j][k] means that the kth combination that can sum up to j by using i digits.
The third dimension is essential since it represents how a specific combination of digits is constructed and two combinations of digits can not be mixed. Meaning that
struct PossibleNumbersTable
{
static constexpr int MAX_SUM = 45;
static constexpr int MAX_COUNT = 9;
std::array<std::array<uint16_t, MAX_SUM + 1>, MAX_COUNT + 1> table {};
consteval PossibleNumbersTable()
{
struct State
{
/// @brief All the masks (combinations)
// For instance, masks[1] with a value of 1b110 means {1, 2} is a valid combination of candidates
uint16_t masks[1 << MAX_COUNT] {}; // there are 2^MAX_COUNT possibilities (in this case 2^9)
/// @brief The number of combinations (also used as an index)
size_t sz = 0;
};
std::array<std::array<State, MAX_SUM + 1>, MAX_COUNT + 1> states {};
// The base case
states[0][0].masks[states[0][0].sz++] = 0;
// Iterate from number 1 to 9
for (int digit = 1; digit <= MAX_COUNT; ++digit)
{
// Iterate the number of digits and target in reverse to ensure each digit is used only once
// NOTE: Iterating forward from k = 1 to k = 9 would allow the current digit to be added
// to a sum that already includes it, leading duplicated digits.
// Reversing the loops ensures we only build results from the previous iteration
for (int c = MAX_COUNT; c >= 1; --c)
{
for (int s = MAX_SUM; s >= digit; --s)
{
// Get the previous state
// The previous state has one less number of digits, therefore it's (c - 1)
// and it has less sum, i.e., (s - num)
State& prev = states[c - 1][s - digit];
State& curr = states[c][s];
// Propagate all candidates from previous state to the current state and
// add the current digit to the mask
for (int i = 0; i < prev.sz; ++i)
{
curr.masks[curr.sz++] = prev.masks[i] | (1u << digit);
}
}
}
}
// Construct the final table that combine masks from all states
for (int c = 0; c <= MAX_COUNT; ++c)
{
for (int s = 0; s <= MAX_SUM; ++s)
{
uint16_t mask = 0;
const State& st = states[c][s];
// Loop through all masks
for (int i = 0; i < st.sz; ++i)
{
mask |= st.masks[i];
}
table[c][s] = mask;
}
}
}
/// @brief Finds all the candidates of k distinct digits that sum up to a given target
/// @param target The target sum
/// @param k The number of distinct digit(s)
/// @return All the candidates in mask format
constexpr uint16_t get(size_t target, size_t count) const
{
if (count > MAX_COUNT || target > MAX_SUM) return 0;
return table[count][target];
}
};target_compile_options is available in modern CMake. We create an interface library that only contains compiler flags and linker flags.
# Compiler flags
add_library(project_options INTERFACE)
# Flags for all build types
target_compile_options(project_options INTERFACE -Wall -Wextra -Wpedantic)
# Flags based on the specified build type
target_compile_options(project_options INTERFACE
$<$<CONFIG:Release>:-O3>
$<$<CONFIG:Debug>:-g -O0>
$<$<CONFIG:Benchmark>:-O3 -march=native -DNDEBUG>
$<$<CONFIG:Test>:-g -O1 -fsanitize=address -fsanitize=undefined>
)
# Linker flags
target_link_options(project_options INTERFACE
$<$<CONFIG:Test>:-fsanitize=address -fsanitize=undefined>
)We can then add project_options in src/CMakeLists.txt, tests/CMakeLists.txt, etc, like this so that the source code inside those subdirectories will be compiled with the flags defined in the main CMakeLists.txt:
target_link_libraries(board_library INTERFACE Boost::boost fmt::fmt project_options)-
$<$<CONFIG:Release>:-O3: adds -O3 flag if the build type is set to Release - -march=native: uses all available instructions on the machine
- -DNDEBUG: disable all standard asserts
Variadic templte was introduced back in C++11 and can be paired with concepts in C++20.
We want to create a function that takes multiple digits and output the mask that contains them. The number of digits varies therefore creatin a variadic function is the best option. Also, we would only take argument that is of type int. Hence we add requires in the template.
template <typename... Args>
requires (std::same_as<Args, int>&&...)
uint16_t GenerateCandidateMask(Args... digit)
{
return ((1U << digit) | ... | 0);
}Variadic functions could be gnarly at first glance. But we can easily break it down and see what it does.
We can expand return ((1U << digit) | ... | 0); to return ((1U << digit1) | (1U << digit2) | (1U << digit3)| 0);
And for the requires part, we need to add && there since it represents the AND operator after expansion:
(std::same_as<Args1, int> && std::same_as<Args2, int> && std::same_as<Args3, int>).
% is a very computationally expensive operator as it can take up to multiple CPU cycles. Whereas, & only requires a single CPU instruction.
In benchmark functions when we need to iterate an array with multiple elements in it, we need to make sure the index stays within bound.
One might use (i++) % sz.
But it can be optimized by using:
(i++) & (sz - 1), when the size of the array is a power of 2.
This is a very common low-level programming techique that optimizes the code.
In our case, if the size of the array is 1024 (which is
We only care about the last 7 bits and by subtracting one from 1024, we are effectively getting the last 7 bits.
This is the math behind it:
0b10000000001
& 0b01111111111
---------------
0b00000000001
nlohmann/json is the golden tool to serialize and deserialize data for C++ projects. It is a header-only and light-weight tool that is easy to use.
In order for nlohmann::json to know the data structure, we need to provide the arguments of a structure by calling NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE. NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE is a macro and the first argument is the name of the structure, followed by the member elements.
#include <nlohmann/json.hpp>
struct Point
{
size_t x;
size_t y;
// Use the default == operator
bool operator==(const Point&) const = default;
};
NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE(Point, x, y) // for nlohmann::json
/// @brief The Sumoku test data structure
struct SumokuTestData
{
size_t N;
std::vector<std::vector<Point>> boxes;
std::vector<int> sums;
std::string label;
};
NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE(SumokuTestData, N, boxes, sums, label) // for nlohmann::jsonData-driven testing is a technique that uses external data and import it into the test environment. This approach eliminates the need to incorportate test cases into the test source code. One can just create new tests in the test directory and rerun the test binary.
We can create a Sumoku test case by running this main function:
int main()
{
SumokuTestData myPuzzle
{
9,
{
{{0, 0}, {0, 1}}, {{0, 2}, {1, 2}}, {{0, 3}, {0, 4}}, {{0, 5}, {1, 5}}, {{0, 6}, {0, 7}}, {{0, 8}, {1, 8}},
{{1, 0}, {2, 0}}, {{1, 1}, {2, 1}}, {{1, 3}, {1, 4}}, {{1, 6}, {1, 7}},
{{2, 2}, {2, 3}}, {{2, 4}, {3, 4}}, {{2, 5}, {2, 6}}, {{2, 7}, {2, 8}},
{{3, 0}, {4, 0}}, {{3, 1}, {3, 2}}, {{3, 3}, {4, 3}}, {{3, 5}, {3, 6}}, {{3, 7}, {3, 8}},
{{4, 1}, {4, 2}}, {{4, 4}, {5, 4}}, {{4, 5}, {5, 5}}, {{4, 6}, {4, 7}}, {{4, 8}, {5, 8}},
{{5, 0}, {5, 1}}, {{5, 2}, {5, 3}}, {{5, 6}, {5, 7}},
{{6, 0}, {6, 1}}, {{6, 2}, {7, 2}}, {{6, 3}, {6, 4}}, {{6, 5}, {7, 5}}, {{6, 6}, {7, 6}}, {{6, 7}, {6, 8}},
{{7, 0}, {7, 1}}, {{7, 3}, {7, 4}}, {{7, 7}, {7, 8}},
{{8, 0}, {8, 1}, {8, 2}}, {{8, 3}, {8, 4}}, {{8, 5}, {8, 6}}, {{8, 7}, {8, 8}}
},
{
8, 10, 13, 7, 11, 14,
11, 8, 9, 10,
12, 15, 6, 13,
5, 11, 10, 9, 7,
14, 11, 6, 8, 12,
10, 9, 13,
7, 11, 8, 10, 14, 9,
12, 13, 5,
15, 10, 7, 11
},
"P7"
};
nlohmann::json j(myPuzzle);
std::ofstream file("./tests/data/puzzle_p8.json");
if (file.is_open())
{
file << j.dump(4);
file.close();
}
return 0;
}nlohmann::json sees the macros defined in board/boardlib.hpp for both SumokuTestData and Point. Therefore it can handle the data structure for us.
We then use this function to load the data (test cases) into the test. Again, since we have the macros for both SumokuTestData and Point, nlohmann::json can deserialize the data.
std::vector<SumokuTestData> LoadAllPuzzles(std::string_view dir)
{
std::vector<SumokuTestData> testCases;
// Iterate over all the json entries in the directory
for (const auto& entry : fs::directory_iterator(dir))
{
if (entry.path().extension() == ".json")
{
std::ifstream file{entry.path()};
nlohmann::json j;
file >> j;
SumokuTestData puzzle = j.get<SumokuTestData>();
testCases.push_back(puzzle);
}
}
return testCases;
}We also need to change the test section in our test case since there are now multiple test cases. Luckily, Catch2 supports this method. We can use DYNAMIC_SECTION and load the test cases by using GENERATE to generate multiple test cases for us. We just need to provide an array or vector of test cases and Catch2 and automatically does it for us.
TEST_CASE("Sumoku (SumokuMRV) Suite", "[SumokuMRV]")
{
// Load all the test cases
static std::string folder = GetTestDataPath();
static std::vector<SumokuTestData> all_puzzles = LoadAllPuzzles(folder);
// Check the vector to make sure it contains at least one test case
REQUIRE_FALSE(all_puzzles.empty());
const SumokuTestData& data = GENERATE(from_range(all_puzzles));
// The section
DYNAMIC_SECTION("Puzzle: " << data.label)
{
solver::SumokuMRV s {data.N, data.boxes, data.sums};
s.Solve();
auto ret = s.GetSolution();
REQUIRE (ret != std::nullopt);
std::vector<std::vector<int>> solution = *ret;
REQUIRE (solution.size() == data.N);
validate_boad_is_square(solution);
validate_sukodu_row_column_constraints(solution);
}
}Note that we want to see the name of the test case should a test case fails, therefore we need to provide the label or the name of a test to the title. This eliminate the need to write multiple sections and the code inside the section can be reused.
Assignment is used with an equal sign (=) before the new value.
We can assign a vector to anothe vector like this:
std::vector<int> a, b;
a = b;
Now the vector a will copy the values from vector b.
Constructor is used with a pair of parentheses () that can construct the object from the get-go.
We can construct a vector like this:
std::vector<int> vec({1, 2, 3, 4});
List-initialization is used with a pair of braces ({}) with the elements inside it.
We can construct a vector like this:
std::vector<int> vec{1, 2, 3, 4};
In this case, the vector contains all the elements inside the braces.
We need to beware of the differences between a constructor and a list-initilization since the result of nlohmann::json j(myPuzzle); is not the same as nlohmann::json j{myPuzzle};.
The first line of code is saying that I want to construct an instance of nlohmann::json that takes an instance of myPuzzle, whereas the second line of code is saying that I want to construct a list of nlohmann::json's that has one instance of myPuzzle. The output of the JSON files will be different as well.