Hacker News

Time to train an "LLM" on it and have it reproduce the source code 1-to-1, so I can use it without a license!


Open sourcing my local AI model that trains on this repo and codes an app based on it:

    cp -r gumroad not-gumroad



It's only a copyright issue if I know what the original source code looks like. If I don't know what it looks like, and my autocomplete writes it, how could I possibly know it's stolen?


I think the line between a derivative work and a new work might be drawn differently here.

I mean, if LLMs are trained on it, and on a lot of other things, and an LLM can then output the source code from an input, wouldn't it be open source or public domain?


No


As if any court would accept this. Nice try.


And how exactly are you going to prove derivative work?


OpenAI is basically betting the existence of the company on that.

Meta is betting the existence of their Llama models on it.


I don't think that's true. When ChatGPT generates something that infringes (even on something not in the training data), it is still infringement, and the user cannot use the output for anything they couldn't use the original for.

Luckily, it doesn't do that often under normal use.


But that's the point he's trying to make. When you "teach" an LLM some knowledge, you teach it a set of patterns. It won't necessarily emit code that infringes copyright. Say you load the Gumroad code into Gemini Pro's context and say something like: "Check this app. Analyze the implementation of feature XY... I need you to help me implement feature XY... but in Go". Then you can recreate an entire platform that looks nothing like the original but has the same features, and open source it.


Except they have billions of dollars to make that bet.



