Julia does exactly that (and, even better than setting a flag, it is based on subtypes). Algorithms are usually written to work with `AbstractArray`. The usual CPU array held in RAM (i.e. the type `Array`) is a subtype. The CUDA array, which is stored on the GPU, is also a subtype. Most general-purpose code does not care which type you use.
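To make the subtype point concrete, here is a minimal sketch that runs without a GPU. `sumsq`, `where_stored` and `FakeDeviceArray` are hypothetical names invented for this illustration; `FakeDeviceArray` only stands in for a real GPU array type and simply wraps a `Vector`.

```julia
# Generic code: written once against AbstractArray, with no flag saying
# where the data lives. sum(abs2, xs) is a whole-array operation.
sumsq(xs::AbstractArray{<:Real}) = sum(abs2, xs)

# Hypothetical stand-in for a device array (an assumption for illustration).
# A real GPU array would keep its data in GPU memory; this wraps a Vector.
struct FakeDeviceArray{T} <: AbstractArray{T,1}
    data::Vector{T}
end

# Minimal AbstractArray interface: size and linear indexing.
Base.size(a::FakeDeviceArray) = size(a.data)
Base.getindex(a::FakeDeviceArray, i::Int) = a.data[i]
Base.IndexStyle(::Type{<:FakeDeviceArray}) = IndexLinear()

# Where a type does need special handling, dispatch picks the method
# from the array's type instead of checking a flag.
where_stored(::Array) = "CPU"
where_stored(::FakeDeviceArray) = "device (pretend)"

cpu = [1.0, 2.0, 3.0]
dev = FakeDeviceArray(cpu)

println(where_stored(cpu), ": ", sumsq(cpu))
println(where_stored(dev), ": ", sumsq(dev))
```

The same `sumsq` method serves both types. With a real GPU array you would generally want to stick to whole-array operations like this (broadcasts, `map`, reductions) rather than element-by-element loops, since indexing single elements on the device tends to be slow.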
Definitely look at the whole JuliaGPU ecosystem, and the CUDAnative.jl package. There are two big questions. The first is how to generate code for targets such as GPUs from a dynamic high-level language such as Julia; that is what CUDAnative.jl achieves. The second is what abstractions Julia should present to programmers so that increasingly large codebases can automatically leverage GPUs. Packages such as CuArrays.jl are early answers, but there is much work to be done here.