Hacker News | wchargin's comments

Hmm… consider the following: your FSM is acyclic iff you can assign each state an integer depth such that a state at depth d only ever transitions to states at depths strictly greater than d. So consider these tables:

    CREATE TABLE states (
        state order_state PRIMARY KEY,
        depth int NOT NULL,
        UNIQUE(state, depth)
    );
    CREATE TABLE transitions (
        start_state order_state NOT NULL,
        event order_event NOT NULL,
        end_state order_state NOT NULL,
        start_depth int NOT NULL,
        end_depth int NOT NULL,
        CONSTRAINT transitions_start_depth_correct
            FOREIGN KEY(start_state, start_depth) REFERENCES states(state, depth)
            ON UPDATE CASCADE,
        CONSTRAINT transitions_end_depth_correct
            FOREIGN KEY(end_state, end_depth) REFERENCES states(state, depth)
            ON UPDATE CASCADE,
        CONSTRAINT transitions_depth_increases CHECK(end_depth > start_depth),
        PRIMARY KEY(start_state, event)
    );
Let's bang on it for a quick test. You can define a state machine; here's one that roughly matches the regex `^(AB|BA)$` (I know I'm being a bit sloppy):

    fsm=> CREATE DOMAIN order_state AS text;
    CREATE DOMAIN
    fsm=> CREATE DOMAIN order_event AS text;
    CREATE DOMAIN
    fsm=> -- "CREATE TABLE"s as above, then:
    fsm=> INSERT INTO states(state, depth) VALUES('start', 1), ('a', 2), ('b', 2), ('done', 3), ('error', 3);
    INSERT 0 5
    fsm=> INSERT INTO transitions(start_state, event, end_state, start_depth, end_depth) VALUES('start', 'A', 'a', 1, 2), ('start', 'B', 'b', 1, 2), ('a', 'B', 'done', 2, 3), ('b', 'A', 'done', 2, 3), ('a', 'A', 'error', 2, 3), ('b', 'B', 'error', 2, 3);
    INSERT 0 6
And when you need to modify it, you can increase a node's depth to make room for intervening nodes:

    fsm=> UPDATE states SET depth = 4 WHERE state = 'error';
    UPDATE 1
    fsm=> TABLE transitions;
     start_state | event | end_state | start_depth | end_depth 
    -------------+-------+-----------+-------------+-----------
     start       | A     | a         |           1 |         2
     start       | B     | b         |           1 |         2
     a           | B     | done      |           2 |         3
     b           | A     | done      |           2 |         3
     a           | A     | error     |           2 |         4
     b           | B     | error     |           2 |         4
    (6 rows)
But you can't decrease a node's depth too far:

    fsm=> UPDATE states SET depth = 2 WHERE state = 'error';
    ERROR:  new row for relation "transitions" violates check constraint "transitions_depth_increases"
    DETAIL:  Failing row contains (a, A, error, 2, 2).
    CONTEXT:  SQL statement "UPDATE ONLY "public"."transitions" SET "end_state" = $1, "end_depth" = $2 WHERE $3::pg_catalog.text OPERATOR(pg_catalog.=) "end_state"::pg_catalog.text AND $4 OPERATOR(pg_catalog.=) "end_depth""
And you can't introduce transitions that don't increase depth:

    fsm=> INSERT INTO transitions(start_state, event, end_state, start_depth, end_depth) VALUES('done', 'AGAIN!', 'start', 3, 1);
    ERROR:  new row for relation "transitions" violates check constraint "transitions_depth_increases"
    DETAIL:  Failing row contains (done, AGAIN!, start, 3, 1).
Now, I don't know that I would immediately recommend this for high-throughput production use. You're storing "unnecessary" state not once but many times (each state v's depth appears `1 + deg(v)` times: once in `states` and once per incident transition), plus additional indices and lookups. But I do think it meets the desired consistency goals!
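For the curious: the depth invariant is exactly a topological layering of the transition graph. As an illustrative sketch (mine, not part of the SQL above), here's a Rust check that computes such an assignment via Kahn's algorithm, and fails precisely when there's a cycle:

```rust
use std::collections::HashMap;

/// Assign each state a depth so that every transition strictly increases
/// depth (longest-path layering via Kahn's algorithm). Returns None iff the
/// transition graph has a cycle — the same invariant the CHECK enforces.
fn assign_depths(states: &[&str], edges: &[(&str, &str)]) -> Option<HashMap<String, i32>> {
    let mut indeg: HashMap<&str, usize> = states.iter().map(|&s| (s, 0)).collect();
    for &(_, b) in edges {
        *indeg.get_mut(b)? += 1;
    }
    let mut depth: HashMap<&str, i32> = states.iter().map(|&s| (s, 1)).collect();
    let mut queue: Vec<&str> =
        indeg.iter().filter(|&(_, &d)| d == 0).map(|(&s, _)| s).collect();
    let mut seen = 0;
    while let Some(s) = queue.pop() {
        seen += 1;
        for &(a, b) in edges.iter().filter(|&&(a, _)| a == s) {
            let d = depth[a] + 1;
            if d > depth[b] {
                depth.insert(b, d); // push b deeper than its deepest predecessor
            }
            let e = indeg.get_mut(b).unwrap();
            *e -= 1;
            if *e == 0 {
                queue.push(b);
            }
        }
    }
    if seen == states.len() {
        Some(depth.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
    } else {
        None // some states never reached in-degree zero: a cycle remains
    }
}

fn main() {
    let states = ["start", "a", "b", "done", "error"];
    let edges = [("start", "a"), ("start", "b"), ("a", "done"), ("b", "done"),
                 ("a", "error"), ("b", "error")];
    let d = assign_depths(&states, &edges).unwrap();
    assert!(edges.iter().all(|&(x, y)| d[x] < d[y]));
    // A back edge creates a cycle, so no valid assignment exists:
    assert!(assign_depths(&states, &[("start", "a"), ("a", "start")]).is_none());
    println!("ok");
}
```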


Amazing! I also learned about domains from your comment.


Nice work! I had a similar idea and also wrote a little solver in Rust [1]. I chose the same benchmark puzzle as is in your code, and it solves in ~25 microseconds on my laptop (i7-1165G7).

I haven't optimized my approach at all or even run it under a profiler, but it uses SIMD-within-a-register for computing the "flood fill" of which squares a piece can move to, which is a nice efficiency boost. For the simplest example, if you have a bit-vector `v: u64` of positions that a pawn might be in, then `v << 8` is the vector that denotes where it might be after one step, and `((v & can_move::LEFT) << 7) | ((v & can_move::RIGHT) << 9)` denotes what it might be able to capture.
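To make the pawn example above concrete, here's a self-contained sketch on a standard 8x8 bitboard (bit i = rank*8 + file). The file masks, which keep diagonal shifts from wrapping around the board edge, are illustrative stand-ins for the comment's `can_move::LEFT`/`can_move::RIGHT` — not the actual constants in the linked repo:

```rust
// Squares on files A and H wrap around under diagonal shifts, so mask them
// out before shifting. (Names here are illustrative, not from the repo.)
const NOT_FILE_A: u64 = !0x0101_0101_0101_0101;
const NOT_FILE_H: u64 = !0x8080_8080_8080_8080;

/// Squares reachable by one forward pawn step from any position in `v`.
fn pawn_push(v: u64) -> u64 {
    v << 8
}

/// Squares a pawn anywhere in `v` could capture on (both diagonals at once).
fn pawn_captures(v: u64) -> u64 {
    ((v & NOT_FILE_A) << 7) | ((v & NOT_FILE_H) << 9)
}

fn main() {
    let v = 1u64 << 12; // a single pawn on e2 (rank 1, file 4)
    assert_eq!(pawn_push(v), 1 << 20); // e3
    assert_eq!(pawn_captures(v), (1 << 19) | (1 << 21)); // d3 and f3
    // A pawn on file A has only one capture diagonal:
    assert_eq!(pawn_captures(1u64 << 8), 1 << 17); // a2 -> b3 only
    println!("ok");
}
```

The nice part is that `v` can hold many candidate positions at once, so one shift advances all of them in a single instruction.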

I think the `Stepper` implementations and the way they flow into the top-level `captures` function is a cool and powerful approach; I encourage taking a look if it sounds interesting to you.

[1]: https://github.com/wchargin/echo-chess

Direct link to source:

https://github.com/wchargin/echo-chess/blob/main/src/main.rs


This is really neat. Thank you and @thethirdone for putting in the effort to tackle this! Love these creative takes to add efficiency boosts. You guys make HN awesome.

> Multiple white pieces makes the code significantly more complex, but it simply adds a little bit to the statespace.

That would be my guess too. Especially the sacrifice mechanic that needs to be timed just right after X moves/transformations of each white piece.


Hey, I wrote one of these, too! Mine generates and solves all possible puzzles (with my dictionary, there are 54733) in about 1.8 seconds on my six-year-old laptop, and can also typeset them to LaTeX+TikZ to make printable puzzle sheets. That’s 0.03 milliseconds per puzzle. It turns out that you can get this thing to run quite fast once you realize that you can pack words into bitsets and express the whole algorithm in terms of bitwise operations on machine words. Of particular interest is that the running time is output-sensitive: proportional to the number of solutions to the puzzle but not the total size of the dictionary (after one-time preprocessing).

I used the “hard copy” rule set as used in the printed New York Times, which is slightly different from the web rule set: words must be at least five letters long, not four, and are worth 1 point each, or 3 for a pangram. The hard copy puzzles also include three score thresholds, so I had a bit of fun trying to reverse-engineer how those thresholds are chosen. I didn’t get exactly the right function, but I got fairly close, and most importantly the thresholds feel fair when playing. (The online version also has score thresholds, but there are many more of them, and it was easy for me to transcribe thresholds from the archives of the printed copies.)
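A minimal sketch of the word-packing idea under the hard-copy rules described above (the function names and structure here are my own illustration, not the repo's): each word becomes a 26-bit mask of the letters it uses, and validity and scoring reduce to a couple of bitwise tests.

```rust
/// Pack a word's letters into a 26-bit mask (assumes lowercase ASCII).
fn letter_mask(word: &str) -> u32 {
    word.bytes().fold(0, |m, b| m | 1 << (b - b'a'))
}

/// Hard-copy scoring: a word must use only the puzzle's seven letters,
/// include the center letter, and be at least five letters long. A word
/// using all seven letters is a pangram worth 3 points; otherwise 1.
fn score(word: &str, puzzle: u32, center: u32) -> u32 {
    let m = letter_mask(word);
    if word.len() < 5 || m & center == 0 || m & !puzzle != 0 {
        0 // too short, missing the center, or uses a letter outside the puzzle
    } else if m == puzzle {
        3 // pangram
    } else {
        1
    }
}

fn main() {
    let puzzle = letter_mask("aeginrt"); // the seven allowed letters
    let center = letter_mask("g");
    assert_eq!(score("grain", puzzle, center), 1);
    assert_eq!(score("retain", puzzle, center), 0); // missing the center letter
    assert_eq!(score("gain", puzzle, center), 0); // too short
    assert_eq!(score("tangerine", puzzle, center), 3); // pangram
    println!("ok");
}
```

Since the masks are plain machine words, a whole dictionary's masks can be grouped and filtered with the same handful of operations.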

In my admittedly biased yet assuredly humble opinion, both the algorithm performance and the rating threshold estimation are interesting: https://github.com/wchargin/spelling-bee/tree/master#perform...

Oh, and a web-based solver, for convenience: https://wchargin.github.io/spelling-bee/

Lovely to see how the author of this article and I both had a lot of fun with this by taking it in different directions. :-)


Very nice! I also took a bitwise approach and ended up with something that solved all possible puzzles in 5 seconds (in 60 lines of MATLAB). A technique that helped bring the time down was to calculate the points for a given puzzle regardless of the center letter, which gives an upper bound on the points that could come from that puzzle. After you've done that for all the boards, you sort by upper bound and then start calculating the actual scores for the different center-letter options. Doing this for the 538 puzzle, I only had to look at center letters for the top two puzzles, 'aeginrt' and 'adeinrt' (after those, you find a score greater than the upper bound for any of the other puzzles).
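The bound-then-sort pruning described above can be sketched generically (a Rust illustration with toy data and hypothetical names, not the original MATLAB): score candidates by decreasing upper bound, and stop as soon as the best exact score seen meets every remaining bound.

```rust
/// Find the candidate with the highest exact score, using precomputed upper
/// bounds to skip candidates that can't beat the incumbent.
fn argmax_with_bounds<T: Copy>(
    candidates: &[(T, u32)],  // (candidate, upper bound on its exact score)
    exact: impl Fn(T) -> u32, // expensive exact scoring
) -> Option<(T, u32)> {
    let mut sorted = candidates.to_vec();
    sorted.sort_by(|a, b| b.1.cmp(&a.1)); // decreasing upper bound
    let mut best: Option<(T, u32)> = None;
    for &(c, bound) in &sorted {
        // Once the incumbent beats every remaining bound, we're done:
        // bounds only decrease from here.
        if let Some((_, s)) = best {
            if s >= bound {
                break;
            }
        }
        let s = exact(c);
        if best.map_or(true, |(_, cur)| s > cur) {
            best = Some((c, s));
        }
    }
    best
}

fn main() {
    // Toy data: candidate 1 achieves its bound exactly; candidate 0 falls short.
    let cands = [(0usize, 10u32), (1, 9), (2, 3)];
    let exact = |i: usize| [7u32, 9, 3][i];
    assert_eq!(argmax_with_bounds(&cands, exact), Some((1, 9)));
    println!("ok");
}
```

In the puzzle setting, the "bound" is the center-letter-free score of a seven-letter set and "exact" tries each of the seven center choices.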


It's amazing what speedups you get when you have the right model for the problem!


Very cool. I also wrote a solver using the NYT magazine's rules back in 2018, and found the bitwise method of matching both 1- and 3-point values incredibly fast. I just precalculated the hashes and ran a parallel loop over the whole array, but I know there are specialized data structures that games like Scrabble use. I wonder whether those would have sped things up even more.

Anyway, whenever I see a problem that involves letters now, I reach for bit operators first.

