quinnhj's comments

Aquarium (YC S20) | Remote (US) | Full-time | Engineering (Backend, Full Stack, Infra)

Data Management for Deep Learning.

Machine learning models are only as good as the datasets they're trained on, which means most improvements to model performance come from improving the quality and diversity of those datasets. Our platform [1] makes it easy for ML teams to find anomalies + failure patterns in their datasets and fix these problems by editing / adding the right data. So the next time you retrain your model, it just gets better.

We have great customers like Pinterest and Woven Planet, and are backed by top investors like Sequoia Capital and Y Combinator. We're currently growing our engineering team, tackling projects like streaming data pipelines, data clustering for visualization, and tight integrations into third-party platforms.

Product Stack: TypeScript, React, Python

Infra Stack: GCP (and various managed services), Apache Beam, Postgres, Docker, etc.

Our postings are here [2], or just reach out to me over email at quinn@companydomain.

----

[1] https://www.aquariumlearning.com/

[2] https://www.aquariumlearning.com/careers


Aquarium (YC S20) | Remote (US) | Full-time | Engineering (Backend, Full Stack, Infra)

Data Management for Deep Learning.

Machine learning models are only as good as the datasets they're trained on, which means most improvements to model performance come from improving the quality and diversity of those datasets. Our platform [1] makes it easy for ML teams to find anomalies + failure patterns in their datasets and fix these problems by editing / adding the right data. So the next time you retrain your model, it just gets better.

We have great customers like Pinterest and Woven Planet, and are backed by top investors like Sequoia Capital and Y Combinator. We're currently growing our engineering team, tackling projects like streaming data pipelines, data clustering for visualization, and tight integrations into third-party platforms.

Product Stack: TypeScript, React, Python

Infra Stack: GCP (and various managed services), Apache Beam, Postgres, Docker, etc.

Our postings are here [2], or just reach out to me over email at quinn@companydomain.

----

[1] https://www.aquariumlearning.com/

[2] https://jobs.lever.co/aquarium


Aquarium Learning (YC S20) | Remote (US) | Full-time | Engineering (Backend, Full Stack, Infra)

Data Management for Deep Learning.

Machine learning models are only as good as the datasets they're trained on, which means most improvements to model performance come from improving the quality and diversity of those datasets. Our platform [1] makes it easy for ML teams to find anomalies + failure patterns in their datasets and fix these problems by editing / adding the right data. So the next time you retrain your model, it just gets better.

We have great customers like Pinterest and Woven Planet, and are backed by top investors like Sequoia Capital and Y Combinator. We're currently growing our engineering team, tackling projects like streaming data pipelines, data clustering for visualization, and tight integrations into third-party platforms.

Product Stack: TypeScript, React, Python

Infra Stack: GCP (and various managed services), Apache Beam, Postgres, Docker, etc.

Our postings are here [2], or just reach out to me over email at quinn@companydomain.

----

[1] https://www.aquariumlearning.com/

[2] https://jobs.lever.co/aquarium


Tecton is their most recent high-profile ML/AI company: https://a16z.com/2020/04/28/investing-in-tecton/


Other founder here! For a high-level overview of this framing of the problem, I recommend reading this Waymo blog post [1].

One nice feature is that by using embeddings produced by a user's model, which has been trained in the context of their domain, we can do this sort of smart sampling in domains we've never seen before. Embeddings are also naturally anonymized, so we can do this without access to a user's potentially private raw data streams.

[1] https://blog.waymo.com/2020/02/content-search.html
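
To make the embedding idea a bit more concrete, here is a minimal sketch of similarity-based sampling. It is an illustration only, not Aquarium's or Waymo's actual implementation. It assumes each datapoint already has an embedding vector exported from the user's own model, and it ranks an unlabeled pool by cosine similarity to the embedding of one known failure case. The names `cosine_similarity`, `most_similar`, and the `frame_*` IDs are all made up for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    query: Sequence[float],
    pool: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k pool items whose embeddings are closest to the query.

    Only embeddings and opaque IDs are needed here; the raw images or
    sensor data never have to leave the user's side.
    """
    scored = [(item_id, cosine_similarity(query, emb)) for item_id, emb in pool.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 2-D embeddings; real ones would come from the user's model
# and typically have hundreds of dimensions.
pool = {
    "frame_a": [1.0, 0.0],
    "frame_b": [0.9, 0.1],
    "frame_c": [0.0, 1.0],
}
failure_case = [1.0, 0.0]
candidates = most_similar(failure_case, pool, k=2)
```

A brute-force scan like this is fine for small pools; at larger scale you would likely swap in an approximate nearest-neighbor index, but the inputs stay the same: embeddings and IDs only.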

