The original article discusses techniques for constraining the weights of a neural network to a submanifold of weight space during training. Your comment discusses interleaving the tokens of an LLM prompt with Unicode PUA code points. These are two almost completely unrelated things, so it is very confusing to me that you are confidently asserting that they are the same thing. Can you please elaborate on why you think there is any connection at all between your comment and the original article?
Our ECC construction induces an emergent modular manifold during KVQ computation.
Suppose we use 3 codeword lanes per codeword, which is our default. Each lane of tokens is based on some prime, p, so collectively the lanes form a CRT-driven codeword (Chinese Remainder Theorem). This is discretely equivalent to labeling every k tokens with a 1x globally unique indexing grammar.
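For anyone unfamiliar with the CRT part of that claim, here is a minimal Rust sketch of the arithmetic only. The primes 3, 5, 7 and the names `encode` and `decode` are assumptions picked for illustration; nothing here reflects the commenter's actual codeword format. It only shows that a triple of residues identifies a position uniquely up to the product of the primes.

```rust
/// Illustrative primes, one per hypothetical "lane" (not from the original comment).
const PRIMES: [u64; 3] = [3, 5, 7];

/// Map a position to its residue in each lane.
fn encode(position: u64) -> [u64; 3] {
    [position % PRIMES[0], position % PRIMES[1], position % PRIMES[2]]
}

/// Recover the position modulo 3 * 5 * 7 = 105 by brute-force search.
/// The Chinese Remainder Theorem guarantees at most one match in that range,
/// and exactly one when every residue is smaller than its prime.
fn decode(residues: [u64; 3]) -> Option<u64> {
    let modulus: u64 = PRIMES.iter().product();
    (0..modulus).find(|&n| encode(n) == residues)
}
```

With these primes, `encode(52)` gives `[1, 2, 3]` and `decode([1, 2, 3])` gives `Some(52)`. Positions that differ by a multiple of 105 collide, which is the sense in which the labeling is unique only within one period.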
That interleaving also corresponds to a triple of adjacent orthogonal embeddings since those tokens still retain a random Gaussian embedding. The net effect is we similarly slice the latent space into a spaced chain of modular manifolds within the latent space every k content tokens.
We also refer to that interleaving as Stiefel frames for similar reasons as the post reads etc. We began work this spring or so to inject that net construction inside the model with early results in similar direction as post described. That's another way of saying this sort of approach lets us make that chained atlas (wc?) of modular manifolds as tight as possible within dimensional limits of the embedding, floating point precision, etc.
We somewhat tongue-in-cheek refer to this as the retokenization group at the prompt level re: renormalization group / tensor nets / etc. Relayering group is the same net intuition or perhaps reconnection group at architecture level.
I'm sorry, but even if I am maximally charitable and assume that everything you are saying is meaningful and makes sense, it still has essentially nothing to do with the original article. The original article is about imposing constraints on the weights of a neural network, during training, so that they lie on a particular manifold inside the overall weight space. The "modular" part is about being able to specify these constraints separately for individual layers or modules of a network and then compose them together into a meaningful constraint for the global network.
You are talking about latent space during inference, not weight space during training, and you are talking about interleaving tokens with random Gaussian tokens, not constraining values to lie on a manifold within a larger space. Whether or not the thing you are describing is meaningful or useful, it is basically unrelated to the original article, and you are not using the term "modular manifold" to refer to the same thing.
hmm / hear you. my point wasn't that we are applying modular manifolds in the same way; it was that we are working on model reliability from two extremal ends using the same principle. there are various ways to induce modular manifolds in a model at various levels of resolution / power. we started at the outside / working-in level, so it works with any black-box model out of the box with zero knowledge needed; we don't even need to know the token dictionary to show the effect.
We're already working on pushing construction deeper into model both architecture and training. currently that's for fine-tuning and ultimately full architecture shrinkage / pruning and raw training vs. just fine-tuning etc.
& it was just great to see someone else using modular manifolds even if they are using them at the training stage vs. inference stage. they're exploiting modular form at training, we're doing it at inference. cool to see.
I would recommend reading beyond the title of a post before leaving replies like this, as your comment is thoroughly addressed in the text of the article:
> At this point you might be wondering, isn’t this a problem in many languages? Doesn’t Java also allow data races? And yes, Java does allow data races, but the Java developers spent a lot of effort to ensure that even programs with data races remain entirely well-defined. They even developed the first industrially deployed concurrency memory model for this purpose, many years before the C++11 memory model. The result of all of this work is that in a concurrent Java program, you might see unexpected outdated values for certain variables, such as a null pointer where you expected the reference to be properly initialized, but you will never be able to actually break the language and dereference an invalid dangling pointer and segfault at address 0x2a. In that sense, all Java programs are thread-safe.
And:
> Java programmers will sometimes use the terms “thread safe” and “memory safe” differently than C++ or Rust programmers would. From a Rust perspective, Java programs are memory- and thread-safe by construction. Java programmers take that so much for granted that they use the same term to refer to stronger properties, such as not having “unintended” data races or not having null pointer exceptions. However, such bugs cannot cause segfaults from invalid pointer uses, so these kinds of issues are qualitatively very different from the memory safety violation in my Go example. For the purpose of this blog post, I am using the low-level Rust and C++ meaning of these terms.
Java is in fact thread-safe in the sense of the term used in the article, unlike Go, so it is not a counterexample to the article's point at all.
> I would recommend reading beyond the title of a post before leaving replies like this, as your comment is thoroughly addressed in the text of the article:
The title is wrong. That's important.
> Java is in fact thread-safe in the sense of the term used in the article
The article's notion of thread safety is wrong. Java is not thread safe by construction, but it is memory safe.
Java also sometimes uses "memory safe" to refer to programs that don't have null pointer exceptions. So in that sense, Java isn't memory safe by construction either.
These terms are used slightly differently by different communities, which is why I discuss this point in the article. But you seem adamant that you have the sole authority for defining these terms so :shrug:
When those US government articles about how we should switch to memory safe languages come out, they refer to Java as a “memory safe language”.
They also count data race freedom as part of memory safety, which I think is wrong (and contradicts their inclusion of Java and even Go in the list of memory safe languages).
So no, I’m not an authority. I’m just following the general trend of how the term is used.
And I've never heard “memory safe” used in relation to not having null pointer exceptions. That’s a new one and sounds nonsensical, frankly.
> They also count data race freedom as part of memory safety, which I think is wrong (and contradicts their inclusion of Java and even Go in the list of memory safe languages).
For Java, there's no contradiction if you define data race freedom as "data races cannot cause arbitrary memory corruption / UB".
> And I've never heard “memory safe” used in relation to not having null pointer exceptions. That’s a new one and sounds nonsensical, frankly.
I was also surprised, but it's what I was told by people working on verification of Java programs. And you can see e.g. at https://link.springer.com/content/pdf/10.1007/978-3-030-1750... that people are proving memory safety of Java programs, which would not make sense at all if all Java programs are memory safe by construction.
If a language is "memory safe", by some definition we expect safety from memory faults (for example, not accessing memory incorrectly).
If a language is "memory safe" but not "thread safe", is the result "the language is free from 'memory faults', unless threads are involved"?
Or to put it another way; when used however the term of art is intended, "memory safety" is meant to provide some guarantees about not triggering certain erroneous conditions. "not thread safe" seems to mean that those same erroneous conditions can be triggered by threads, which seems to amount to '"memory safety" does not guarantee the absence of erroneous memory conditions'.
> If a language is "memory safe" but not "thread safe", is the result "the language is free from 'memory faults', unless threads are involved"?
Yes.
If a language is memory safe but not thread safe, then you can race, but the outcome of those races won't be memory corruption or the violation of the language's type system. It will lead to weird stuff, however: just a different kind of weirdness than breaking out of the language's sandbox.
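A rough sketch of that "weird but contained" outcome, written in Rust. Caveat: safe Rust rejects true data races at compile time, so this models the situation with atomics and a deliberately non-atomic load-then-store sequence; it is an analogy, not what Java or Go do internally. The function name `racy_count` is made up for the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Each thread does a separate load and store instead of one atomic
/// read-modify-write, so increments made by other threads can be lost.
fn racy_count(threads: u64, increments: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    let seen = counter.load(Ordering::Relaxed);
                    counter.store(seen + 1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}
```

With several threads the result can come out lower than `threads * increments`, and it can differ from run to run. That is a logic bug, but the counter is always a valid integer and nothing outside it gets corrupted, which is the distinction the comment above is drawing.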
> If a language is memory safe but not thread safe, then you can race, but the outcome of those races won't be memory corruption or the violation of the language's type system.
By these definitions, doesn't that mean Go is neither memory nor thread safe? It looks like concurrent modification can result in memory corruption, e.g. the attempted access 0x42 example in the article.
> By these definitions, doesn't that mean Go is neither memory nor thread safe?
Yes, with the caveat that you can't treat "memory safe" as a binary condition.
The strictest notion of memory safety is what I call GIMSO: "Garbage In, Memory Safety Out". I.e. there does not exist any sequence of bytes you could feed to the compiler that would result in a memory-unsafe outcome at runtime. Java aims for this. Fil-C does too. JavaScript also does.
But there are languages that I think it's fair to consider to be memory safe that offer escape hatches that violate GIMSO. Rust with `unsafe` is an example. C# with `unsafe` is another. Java if you include `sun.misc.Unsafe` (arguably it's not part of the language).
So I think if a language is memory safe, not thread safe, and the memory safety is gated on thread safety, then it's kinda fair to make statements like, "it's memory safe", if you have fine print somewhere that says "but the memory safety does not hold under the following kinds of races".
All of that said, I'd rather we just said that "memory safety" means what I call "GIMSO". But the ship has sailed. Lots of languages are called "memory safe" to mean something like, "you can get memory safety in this language if you obey certain idioms" - and in Rust that means "don't use unsafe" while in Go that means "don't race in certain ways".
In my opinion this is missing a very important difference between the two approaches: using `unsafe`/`sun.misc.Unsafe` in Rust/C#/Java is a very deliberate choice whose presence can easily be checked syntactically, whereas data races in Go are most often unintended and you can't easily check for their _guaranteed_ absence. Otherwise C/C++ are also "GIMSO" with the caveat "don't UB"!
GIMSO is defined as memory safety without caveats. The only way to get it (currently) in C/C++ is to compile with Fil-C.
You have a good point otherwise, but Go is considered memory safe anyway. And it probably makes sense that it is, since the chances of exploitation due to memory safety issues caused by races in Go are infinitesimal. It’s not at all fair to compare to the exploited-all-the-time issues of C/C++ (when you make the mistake of compiling with something other than Fil-C).
The only physically accurate answer for where to put the far plane is "behind everything you want to be visible". It fundamentally does not make any sense to change the shape of the far plane to "more accurately reflect human visual perception" because there is no far plane involved in human visual perception, period.
You're describing a problem with a particular method of fog rendering. The correct way to address that would be to change how fog is rendered. The perspective projection and the far plane are simply not the correct place to look for a solution to this.
I disagree. This problem exists even when the fog is completely absent and also distorts the objects at the sides of the screen regardless of the fog's presence or absence. I guess you could use fog, rendered in a particular way, to make it less noticeable but it's still there. So the root cause is the perspective projection.
Now, I've googled a bit on my own, trying all kinds of search phrases, and apparently it is a known problem that the perspective projection, when a wide FOV (about 75 degrees and up) is used, will distort objects at the sides of the screen. One of the solutions appears to be a post-processing pass called "Panini Projection" which undoes that damage at the sides of the screen. From what I understand, it uses a cylinder (but not a sphere) as the projection surface instead of a plane.
You originally described a problem where fog had a different falloff in world space at the edges of the screen compared to the center of the screen. The root cause of that is not the perspective projection; it's how the fog is being rendered.
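To make that concrete, here is a small Rust sketch of one common way this happens. It assumes, and this is only an assumption since the thread doesn't say how the fog was implemented, that the fog factor is computed from view-space depth rather than from the actual distance to the camera. The camera sits at the origin looking down +z, the falloff is a simple exponential, and both function names are made up.

```rust
/// Fog factor from view-space depth only (distance along the camera's forward axis).
/// 1.0 means no fog, values near 0.0 mean fully fogged.
fn fog_from_depth(point: [f32; 3], density: f32) -> f32 {
    (-density * point[2]).exp()
}

/// Fog factor from the Euclidean distance between the camera and the point.
fn fog_from_distance(point: [f32; 3], density: f32) -> f32 {
    let [x, y, z] = point;
    let distance = (x * x + y * y + z * z).sqrt();
    (-density * distance).exp()
}
```

For a point straight ahead, such as `[0.0, 0.0, 10.0]`, the two agree. For a point off to the side at the same depth, such as `[10.0, 0.0, 10.0]`, the depth-based version applies less fog than the real distance warrants, so the falloff in world space differs between the center and the edges of the screen. Switching the fog to true distance fixes that without touching the projection.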
The issue you are describing now is called perspective distortion (https://en.wikipedia.org/wiki/Perspective_distortion), and it is something that also happens with physical cameras when using a wide-angle lens. There is no single correct answer for dealing with this: similarly to the situation with map projections, every projection is a compromise between different types of distortion.
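A quick numeric sketch of that trade-off, in Rust. It compares the standard rectilinear mapping (screen x proportional to the tangent of the view angle) with a plain cylindrical mapping (screen x proportional to the angle itself). The cylindrical one is a stand-in to show the idea and is not the actual Panini formula; all names are invented for the example.

```rust
/// Horizontal screen coordinate of a ray at `angle` radians off the view axis
/// under rectilinear (standard perspective) projection onto a plane at distance 1.
fn rectilinear_x(angle: f64) -> f64 {
    angle.tan()
}

/// Screen coordinate under a simple cylindrical projection, where horizontal
/// position is proportional to the angle itself.
fn cylindrical_x(angle: f64) -> f64 {
    angle
}

/// Approximate local stretch: screen distance covered per radian of view angle,
/// estimated with a central difference.
fn stretch(project: fn(f64) -> f64, angle: f64) -> f64 {
    let step = 1e-6;
    (project(angle + step) - project(angle - step)) / (2.0 * step)
}
```

At the center of the screen both mappings have a stretch of about 1. At 45 degrees off-axis, which is the screen edge for a 90 degree horizontal FOV, the rectilinear stretch is about 2 while the cylindrical one stays at 1. The price of the cylindrical mapping is that straight horizontal lines are no longer rendered straight, which is the "every projection is a compromise" point.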
Anyway, if you're writing a ray tracer it's possible to use whatever projection you want, but if you're using the rasterizer in the GPU you're stuck with rectilinear projection and any alternate projection has to be approximated some other way (such as via post-processing, like you mention).
If you assume at the start of your proof that π is rational, it's not clear that you can then still make use of concepts closely related to π. If those concepts depend in any way on π being irrational, then you can't use them to cleanly arrive at the contradiction.
This is fundamentally the same thing as undefined behavior, regardless of whether Odin insists on calling it by a different name. If you don't want behavior to be undefined, you have to define it, and every part of the compiler has to respect that definition. If a use-after-free is not undefined behavior in Odin, what behavior is it defined to have?
As a basic example, if the compiler guarantees that the write will result in a deterministic segmentation fault, then that address must never be reused by future allocations (including stack allocations!), and the compiler is not allowed to perform basic optimizations like dead store elimination and register promotion for accesses to that address, because those can prevent the segfault from occurring.
If the compiler guarantees that the write will result in either a segfault or a valid write to that memory location, depending on the current state of the allocator, what guarantees does the compiler make about those writes? If some other piece of code is also performing reads and writes at that location, is the write guaranteed to be visible to that code? This essentially rules out dead store elimination, register promotion, constant folding, etc. for both pieces of code, because those optimizations can prevent one piece of code from observing the other's writes. Worse, what if the two pieces of code are on different threads? And so on.
If the compiler doesn't guarantee a deterministic crash, and it doesn't guarantee whether or not the write is visible to other code using the same region of memory, and it doesn't provide any ordering or atomicity guarantees for the write if it does end up being visible to other code, and then it performs a bunch of optimizations that can affect all of those things in surprising ways: congratulations, your language has undefined behavior. You can insist on calling it something else, but you haven't changed the fundamental situation.
Your language has behavior not defined within the language, sure. What it does not now have is permission for the compiler to presume that the code never executes with input that would cause the behavior not defined to occur.
The compiler is already doing that when it performs any of the optimizations I mentioned above. When the compiler takes a stack-allocated variable (whose address is never directly taken) and promotes it to a register, removes dead stores to it, or constant-folds it out of existence, it does so under the assumption that the program is not performing aliasing loads and stores to that location on the stack. In other words, it is leaving the behavior of a program that performs such loads and stores undefined, and in doing so it is directly enabling some of the most basic, pervasive optimizations that we expect a compiler to perform.
In a language with raw pointers, essentially all optimizations rely on this type of assumption. Forbidding the compiler from making the assumption that undefined behavior will not occur essentially amounts to forbidding the compiler from optimizing at all. If that is indeed what you want, then what you want is something closer to a macro assembler than a high-level language with an optimizing compiler like C. It's a valid thing to want, but you can't have your cake and eat it too.
When you put it like that, it's actually interesting. If they went ahead and said, "This is a language which by design can't have an optimizing compiler, it's strictly up to the programmer - or the code generator, if used as an intermediate language - to optimize" then it would at least be novel.
But as they don't, I see it more as an attempt to annoy the people who have studied these sorts of things (I guess you are the people who "suck the joy out of programming" in their eyes).
No, the compiler is not "already" doing that. Odin uses LLVM as a backend (for now) and it turns off some of those UB-driven optimizations (as mentioned in the OP).
Some things are defined by the language, some things are defined by the operating system, some by the hardware.
It would be silly for Odin to say "you can't access a freed pointer" because it would have to presume to know ahead of time how you utilize memory. It does not. In Odin, you are free to create an allocator where the `free` call is a no-op, or it just logs the information somewhere without actually reclaiming the 'freed' memory.
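As a rough illustration of the kind of allocator meant here, this is a tiny bump arena sketched in Rust rather than Odin, where `free` is deliberately a no-op. The type `BumpArena` and its methods are hypothetical, and it hands out offsets instead of raw pointers to stay in safe code.

```rust
/// A hypothetical bump arena: allocation advances an offset, `free` does nothing.
struct BumpArena {
    storage: Vec<u8>,
    used: usize,
}

impl BumpArena {
    fn with_capacity(capacity: usize) -> Self {
        BumpArena { storage: vec![0; capacity], used: 0 }
    }

    /// Reserve `len` bytes and return their starting offset, or None if full.
    fn alloc(&mut self, len: usize) -> Option<usize> {
        let end = self.used.checked_add(len)?;
        if end > self.storage.len() {
            return None;
        }
        let start = self.used;
        self.used = end;
        Some(start)
    }

    /// Deliberately a no-op: the bytes stay valid until the arena is dropped.
    fn free(&mut self, _offset: usize) {}

    /// Access a previously reserved range (panics if the range is out of bounds).
    fn bytes_mut(&mut self, offset: usize, len: usize) -> &mut [u8] {
        &mut self.storage[offset..offset + len]
    }
}
```

With an allocator like this, reading a block after calling `free` on it still returns the bytes that were written, because nothing was ever reclaimed or reused.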
I can't speak for gingerBill but I think one of the reasons to create the language is to break free from the bullying of spec lawyers who get in the way of systems programming and suck all the joy out of it.
> it does so under the assumption that the program is not performing aliasing loads and stores to that location on the stack
If you write code that tries to get a pointer to the first variable in the stack, and guess the stack size and read everything in it, Odin does not prevent that, it also (AFAIK) does not prevent the compiler from promoting local variables to registers.
Again, go back to the twitter thread. An explicit example is mentioned:
If you reference a variable, the language spec guarantees that it will have an address that you can take, so there's that. But if you use that address to try to get other stack variables indirectly, then the language does not define what happens in a strict sense, but it's not 'undefined' behavior. It's a memory access to a specific address. The behavior depends on how the OS and the hardware handle that.
The compiler does not get to look at that and say "well this looks like undefined behavior, let me get rid of this line!".
> If you write code that tries to get a pointer to the first variable in the stack, and guess the stack size and read everything in it, Odin does not prevent that, it also (AFAIK) does not prevent the compiler from promoting local variables to registers.
This is exactly what I described above. Odin does not define the behavior of a program which indirectly pokes at stack memory, and it is thus able to perform optimizations which exploit the fact that that behavior is left undefined.
> The compiler does not get to look at that and say "well this looks like undefined behavior, let me get rid of this line!".
This is a misleading caricature of the relationship between optimizations and undefined behavior. C compilers do not hunt for possible occurrences of undefined behavior so they can gleefully get rid of lines of code. They perform optimizing transformations which are guaranteed to preserve the behavior of valid programs. Some programs are considered invalid (those which execute invalid operations like out-of-bounds array accesses at runtime), and those same optimizing transformations are simply not required to preserve the behavior of such programs. Odin does not work fundamentally differently in this regard.
If you want to get rid of a particular source of undefined behavior entirely, you either have to catch and reject all programs which contain that behavior at compile time, or you have to actually define the behavior (possibly at some runtime cost) so that compiler optimizations can preserve it. The way Odin defines the results of integer overflow and bit shifts larger than the width of the operand is a good example of the latter.
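For a sense of what "actually define the behavior" looks like in practice, here is a Rust sketch using standard library integer methods. This is Rust, not Odin, and the thread does not spell out which results Odin picks, so treat the specific choices (wrap on overflow, zero on oversized shift) as illustrative assumptions. The wrapper function names are invented.

```rust
/// Addition that is defined to wrap around on overflow (two's complement).
fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Addition that reports overflow as None instead of wrapping.
fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Left shift that yields 0 when the shift amount is at least the bit width.
/// `checked_shl` returns None for amounts >= 32 on u32, which is mapped to 0 here.
fn shl_or_zero(value: u32, amount: u32) -> u32 {
    value.checked_shl(amount).unwrap_or(0)
}
```

Each of these has exactly one result for every input, so an optimizer has a concrete behavior to preserve. The possible runtime cost mentioned above shows up as the extra comparison hidden inside `checked_add` and `checked_shl`.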
C does have a particularly broad and programmer-hostile set of UB-producing operations, and I applaud Odin both for entirely removing particular sources of UB (integer overflow, bit shifts) and for making it easier to avoid it in general (bounds-checked slices, an optional type). These are absolutely good things. However, I consider it misleading and false to claim that Odin has no UB whatsoever; you can insist on calling it something else, but that doesn't change the practical implications.
> They perform optimizing transformations which are guaranteed to preserve the behavior of valid programs. Some programs are considered invalid (those which execute invalid operations like out-of-bounds array accesses at runtime), and those same optimizing transformations are simply not required to preserve the behavior of such programs.
I think this is the core of the problem and it's why people don't like these optimizations and turn them off.
Again, I'm not the Odin designer nor a core maintainer, so I can't speak on behalf of the language, but from what I understand, Odin's stance is that the compiler may not make assumptions about what kind of code is invalid and whose behavior therefore need not be preserved by the transformations it makes.
> The compiler does not get to look at that and say "well this looks like undefined behavior, let me get rid of this line!".
No production compiler does that (directly). This is silly. We want to help programmers. Compilers sometimes keep the code even if it is known to be UB, just because removing it is unlikely to help optimizations.
But if you are optimizing assuming something does not happen, then you have undefined behavior. And you are always assuming something does not happen when optimizing.
> The compiler is already doing that when it performs any of the optimizations I mentioned above. When the compiler takes a stack-allocated variable (whose address is never directly taken) and promotes it to a register, removes dead stores to it, or constant-folds it out of existence, it does so under the assumption that the program is not performing aliasing loads and stores to that location on the stack. In other words, it is leaving the behavior of a program that performs such loads and stores undefined, and in doing so it is directly enabling some of the most basic, pervasive optimizations that we expect a compiler to perform.
No, that's C-think. Yes, when you take a stack-allocated variable and do those transformations, you must assume away the possibility that there are aliasing accesses to its location on the stack. Thus, those are not safe optimizations for the compiler to perform on a stack-allocated variable.
It's not something you have to do. The model of treating each variable as stack-allocated until proven (potentially fallaciously) otherwise is distinctly C brain damage.
> If that is indeed what you want, then what you want is something closer to a macro assembler than a high-level language with an optimizing compiler like C. It's a valid thing to want, but you can't have your cake and eat it too.
This is a false dichotomy advanced to discredit compilers outside the nothing-must-be-faster-than-C paradigm, and frankly a pretty absurd claim. There are plenty of "high-level" but transparent language constructs that can be implemented without substantially assuming non-aliasing. It's totally possible to lexically isolate raw pointer accesses and optimize around them. There is a history of computing before C! Heck, there are C compilers with "optimization" sets that don't behave as pathologically awfully as mainstream modern compilers do when you turn the "optimizations" off; you have to set a pretty odd bar for "optimizing compiler" to make that look closer to a macro assembler.
It's okay if your compiler can't generate numerical code faster than Fortran. That's not supposed to be the minimum bar for an "optimizing" compiler.
We are talking about Odin, a language aiming to be 'better C' the way Zig is. The literal only reason anyone uses C is to write code that runs as fast as possible, whether for resource-constrained environments or CPU-bound hot-paths. Odin has many features that one would consider warts if you weren't in an environment where you'd otherwise turn to C, such as manual memory freeing. If I were pre-committing to a language that runs five times slower than C, I have no reason to select Odin over C#, a language that runs only ~2.4 times slower than C.
> The model of treating each variable as stack-allocated until proven (potentially fallaciously) otherwise is distinctly C brain damage.
OK, let's consider block-local variables to have indeterminate storage location unless their address is taken. It doesn't substantively change the situation. Sometimes the compiler will store that variable in a register, sometimes it won't store it anywhere at all (if it gets constant-folded away), and sometimes it will store it on the stack. In the last case, it will generate and optimize code under the assumption that no aliasing loads or stores are being performed at that location on the stack, so we're back where we started.
Frankly it seems strange to me to be comparing Vale's generational reference system and Rust's borrow checker directly. They have completely different characteristics and are not direct substitutes for one another.
First, Rust's borrow checker incurs zero runtime overhead for any pointer operations (whether dereferencing, copying, or dropping a pointer) and requires no extra storage at runtime (no reference counts or generation numbers); it's entirely a set of compile-time checks. Generational references, on the other hand, require storing an extra piece of data alongside both every heap allocation and every non-owning reference, and they incur an extra operation at every dereference.
Second, since Rust's borrow checker exists entirely at compile time, it doesn't introduce any runtime failures. If a program violates the rules of the borrow checker, it won't compile; if a program compiles successfully, the borrow checker does not insert any conditional runtime panics or aborts. Generational references, in comparison, consist entirely of a set of runtime checks; you won't find out if you violated the rules of generational references until it happens at runtime during a particular execution and your program crashes.
Finally, Rust's borrow checker applies to references of all kinds, whether they point to a heap-allocated object, a stack-allocated object, an inline field inside a larger allocation, a single entry in an array, or even an object allocated on a different heap and passed over FFI. Its checks still apply even in scenarios where there is no heap. Generational references, on the other hand, are entirely specific to heap-allocated objects. They don't work for stack-allocated objects, they don't work for foreign objects allocated on a different heap, and they don't work in a scenario with no heap at all.
All of these are fundamental differences which mean that Vale's generational reference system is not at all a replacement for Rust's borrow checker. It's not zero-overhead, it doesn't catch errors at compile time, and it's fundamentally specific to heap-allocated objects. In these ways it's more comparable to Rust's Rc, which introduces runtime overhead and is specific to heap-allocated objects, or RefCell, which performs checks at runtime that can result in aborting the program.
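For readers who haven't seen the technique, here is a minimal Rust sketch of a generation-checked arena. It is a generic illustration of the idea, not Vale's implementation; `Handle`, `Slot`, and `Arena` are hypothetical names. It shows the two costs described above: a generation number stored with every slot and every handle, and a comparison on every access that can only fail at runtime.

```rust
/// A handle stores the slot index plus the generation it expects to find there.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle {
    index: usize,
    generation: u64,
}

struct Slot<T> {
    generation: u64,
    value: Option<T>,
}

/// Hypothetical arena illustrating generation checks.
struct Arena<T> {
    slots: Vec<Slot<T>>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { slots: Vec::new() }
    }

    fn insert(&mut self, value: T) -> Handle {
        // Reuse a free slot if one exists, keeping its already bumped generation.
        if let Some(index) = self.slots.iter().position(|s| s.value.is_none()) {
            self.slots[index].value = Some(value);
            return Handle { index, generation: self.slots[index].generation };
        }
        self.slots.push(Slot { generation: 0, value: Some(value) });
        Handle { index: self.slots.len() - 1, generation: 0 }
    }

    fn remove(&mut self, handle: Handle) -> Option<T> {
        let slot = self.slots.get_mut(handle.index)?;
        if slot.generation != handle.generation {
            return None;
        }
        // Bump the generation so outstanding handles to this slot become stale.
        slot.generation += 1;
        slot.value.take()
    }

    /// The runtime check paid on every dereference.
    fn get(&self, handle: Handle) -> Option<&T> {
        let slot = self.slots.get(handle.index)?;
        if slot.generation == handle.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }
}
```

A stale handle compiles fine and is only detected when `get` is called. This sketch returns `None` in that case; a design that aborts instead would match the crash behavior described above.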
I and some other folks in the Rust audio community have put together some low-level bindings for the CLAP API: https://github.com/glowcoil/clap-sys
They're relatively straightforward due to the fact that CLAP is a simple, pure-C ABI, and there are already some fully functional plugins making use of them (e.g. https://github.com/robbert-vdh/nih-plug).
You can't use RAII in Rust? What on earth could this possibly mean? RAII is an extremely pervasive pattern in Rust and is fundamental to many of the safe APIs in the standard library.
To clarify, I was talking about making RAII, not using RAII. And it surprised me too, when I learned that the borrow checker rejects it.
To see it in action: Have a Database object, and try to have multiple Transaction objects that might commit something to it, in their drop().
It's unfortunately not possible, because they can't all have a &mut Database as struct fields.
We can sacrifice speed (by using Cell's copying or Rc's counting) or safety (by using unsafe). Most RAII we see uses unsafe FFI under the hood, which is why it was so surprising to me.
Rust is actually right, you cannot have multiple mutable references to a Database object without things going down the drain. (This is related to the fact that, like other comments said, &mut is an exclusive reference).
However, achieving something like what you want is still more than possible in Rust. You can do this with the pattern of 'interior mutability', which in its simplest form is just a Mutex. This allows upgrading a shared reference to an exclusive reference, so that you can safely mutate an object while upholding the expectations that a mutable reference is exclusive, and a non-mutable reference does not change from under your feet.
Of course, for a database, you will probably want a more advanced implementation of interior mutability, so that you can commit multiple transactions at the same time. (Or not, it seems to work quite well for SQLite.)
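Here is a minimal sketch of that pattern applied to the Database/Transaction example from upthread. The types are hypothetical, the "database" is just a `Vec<String>` behind a `std::sync::Mutex`, and error handling is reduced to `unwrap`, so take it as an outline rather than a real driver design.

```rust
use std::sync::Mutex;

/// Hypothetical database: the committed log sits behind a Mutex so it can be
/// mutated through a shared `&Database` reference.
struct Database {
    committed: Mutex<Vec<String>>,
}

/// A transaction holds a shared reference, so many can coexist.
struct Transaction<'db> {
    db: &'db Database,
    pending: Vec<String>,
}

impl Database {
    fn new() -> Self {
        Database { committed: Mutex::new(Vec::new()) }
    }

    fn begin(&self) -> Transaction<'_> {
        Transaction { db: self, pending: Vec::new() }
    }

    fn snapshot(&self) -> Vec<String> {
        self.committed.lock().unwrap().clone()
    }
}

impl Transaction<'_> {
    fn write(&mut self, entry: &str) {
        self.pending.push(entry.to_string());
    }
}

impl Drop for Transaction<'_> {
    /// Commit on drop: take the lock briefly and append the pending writes.
    fn drop(&mut self) {
        let mut committed = self.db.committed.lock().unwrap();
        committed.append(&mut self.pending);
    }
}
```

Several transactions created with `db.begin()` can be alive at once, and each one commits when it goes out of scope. No `unsafe` is involved; the cost is taking the lock at commit time.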
RAII is a general pattern for tying resource management to the lifetime of objects such that resource allocation is tied to value construction and resource deallocation is tied to value destruction. The smart pointers for allocation in the Rust standard library (Box, Rc, and Arc) are examples of the RAII pattern, since memory allocation happens at creation time (Box::new()) and memory deallocation happens when the Box goes out of scope (in drop()). Another example of RAII in the standard library is File: opening a file means creating a value of type File, and dropping that value means closing the file. Further examples are the smart-pointer guards used for accessing RefCell and Mutex: RefCell::borrow() returns a Ref, and Mutex::lock() returns a MutexGuard; the underlying value can only be accessed while the guard exists, and access is relinquished when the guard is dropped. Given all this, it's absurd to say that Rust doesn't support RAII; it is fundamental to the design of many of Rust's safe APIs.
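The same pattern is easy to write by hand. A minimal sketch, with a made-up `Guard` type that records when it is acquired and released so the ordering is visible:

```rust
use std::cell::RefCell;

/// Hypothetical resource guard: acquiring happens in `new`, releasing in `drop`.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Guard<'a> {
    fn new(name: &'static str, log: &'a RefCell<Vec<String>>) -> Self {
        log.borrow_mut().push(format!("acquire {}", name));
        Guard { name, log }
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

/// Acquire two guards in a scope and return the recorded event order.
fn demo() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Guard::new("outer", &log);
        let _inner = Guard::new("inner", &log);
    } // both guards are dropped here, in reverse order of declaration
    log.into_inner()
}
```

`demo()` returns the events in the order acquire outer, acquire inner, release inner, release outer. Note that the guards hold a shared `&RefCell`, not a `&mut`, which is why two of them can exist at the same time.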
The very specific API design that you've described is not possible in Rust, but it is strange to equate this with the entirety of RAII. In any case, there are many alternative APIs (some with no sacrifice in speed or safety!) that are perfectly possible in Rust.
I think the real mistake was to have "exclusive references" be called "mutable references" in the language. I've taken the habit of saying "mut" as "mutually exclusive" for references. Of course you can't have each Statement keep an exclusive reference to a db object. They're exclusive!
You need shared references for your DB, implying you need interior mutability. This is how Statements are implemented in real-world Rust database drivers such as rusqlite (any operation on a db is done through a shared reference). The fact that a very real package is doing it proves that the pattern you're talking about is, in fact, possible.
You are missing something. For a piece of Rust software to run in any widely used computing environment, it is required to interface with a large body of non-Rust software via a non-typechecked ABI. Moreover, the Rust standard library itself contains many, many instances of the unsafe keyword. The benefits of Rust safety do not come from building a hermetically isolated tower of pure safe Rust code from the ground up, and those benefits do not become null and void the moment you include one C library used via FFI.
Rust safety is about being able to take an unsafe component, encapsulate its implementation details, and encode sound usage patterns for that component in a public API which can then be statically checked by the compiler. This allows the difficult problem of determining whether an entire codebase is sound, memory-safe, and free of undefined behavior to be factored into many smaller, more tractable problems of verifying that individual components are sound given their APIs. You can even do this with wrappers and bindings to C libraries, and there are many examples of this in the Rust ecosystem.
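A small Rust sketch of that encapsulation idea. The function `pair_mut` is a made-up example, not a standard library API. It uses a raw pointer internally, but its checks make the public signature sound, so callers stay in safe code and only this one function needs auditing.

```rust
/// Return mutable references to two distinct elements of a slice.
/// The bounds and distinctness checks are what make the `unsafe` block sound.
fn pair_mut<T>(items: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    if i == j || i >= items.len() || j >= items.len() {
        return None;
    }
    let base = items.as_mut_ptr();
    // SAFETY: both indices are in bounds and differ, so the two references
    // point to different elements and cannot alias.
    unsafe { Some((&mut *base.add(i), &mut *base.add(j))) }
}
```

Callers can, for example, swap two elements through the returned pair, while out-of-range or equal indices just yield `None`. Wrappers around C libraries follow the same shape on a larger scale: the unsafe calls live behind an API whose types rule out the misuse.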