Another idea: have the cubes point an edge straight forward (instead of a face). Then, if each cube has two adjacent dark sides and two adjacent light sides, one could set up two ‘simultaneous’ images: one viewed from the left at 45° and another viewed from the right. (Each pixel would have four possibilities.)
I guess the patents are long expired by now and don't really apply to pixels, but that concept already exists for non-pixelated images. Sadly, those have mostly been replaced by LEDs in the wild:
For this to work, you'd want two adjacent faces painted, rather than opposite faces being painted, which seems to be how they're currently done (unless they only have one face painted?). Then the four possible rotations would allow for each possible pixel-pair. (The cubes could perhaps instead be squat rectangular prisms, to correct the aspect ratio, too.)
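To make the "four rotations, four pixel-pairs" claim concrete, here's a rough sketch in Python. The face order, the names (`FACES`, `visible_pair`, `encode`), and the convention that the left-hand viewer sees face `r` while the right-hand viewer sees face `r + 1` are all my own assumptions for illustration, not anything taken from the actual display.

```python
# Faces going around the cube's vertical axis: two adjacent dark, two adjacent light.
# (Assumed ordering, purely for illustration.)
FACES = ("dark", "dark", "light", "light")


def visible_pair(rotation):
    """Return (left_view, right_view) for an edge-forward cube.

    `rotation` counts quarter turns. Assumption: the viewer on the left
    sees face `rotation`, the viewer on the right sees the next face around.
    """
    left = FACES[rotation % 4]
    right = FACES[(rotation + 1) % 4]
    return left, right


# Invert the mapping: which rotation shows a given (left, right) pair?
ROTATION_FOR_PAIR = {visible_pair(r): r for r in range(4)}


def encode(left_image, right_image):
    """Turn two same-sized images (lists of rows of 'dark'/'light')
    into a grid of quarter-turn counts, one per cube."""
    return [
        [ROTATION_FOR_PAIR[(l, r)] for l, r in zip(left_row, right_row)]
        for left_row, right_row in zip(left_image, right_image)
    ]


if __name__ == "__main__":
    left = [["dark", "light"]]
    right = [["light", "light"]]
    print(encode(left, right))
```

With the faces painted on opposite sides instead (dark, light, dark, light), the same lookup would only ever contain the two mixed pairs, which is why the adjacent painting matters.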
... But that's as far as you could take it, since 16-gons would show at least 7 faces while only having an encoding for 4.
I also thought of using hexagonal prisms, showing two faces at a time in paired colours but using three colours. These would also need much less clearance in order to rotate freely, compared to face-on cubes.
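A quick back-of-the-envelope check of the clearance claim, again as a sketch under my own assumptions: each prism is treated as a regular polygon seen from above, spinning about its centre, and "clearance" is taken as the ratio of the swept circle's diameter to the width the prism occupies at rest. The function name `clearance_ratio` and the angle convention are made up for this example.

```python
import math


def clearance_ratio(n_sides, offset_deg):
    """Swept diameter divided by resting width for a regular n-gon prism.

    The polygon has circumradius 1 and the viewer looks along the y axis.
    `offset_deg` is the angle of the first vertex: 90 puts a vertex (edge of
    the prism) straight forward, 45 puts a square face-on.
    """
    angles = [math.radians(offset_deg + 360.0 * k / n_sides) for k in range(n_sides)]
    xs = [math.cos(a) for a in angles]
    rest_width = max(xs) - min(xs)  # horizontal extent while at rest
    return 2.0 / rest_width         # the swept circle has diameter 2


if __name__ == "__main__":
    print(f"face-on cube:        {clearance_ratio(4, 45):.3f}")
    print(f"edge-forward hexagon: {clearance_ratio(6, 90):.3f}")
    print(f"edge-forward cube:    {clearance_ratio(4, 90):.3f}")
```

Under those assumptions a face-on cube sweeps about 1.41 times its resting width (√2), an edge-forward hexagonal prism about 1.15 times (2/√3), and an edge-forward cube exactly 1, since its resting width already equals its diagonal.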