A CERN for Open Source Large-Scale AI

MrThoughtful · on April 9, 2023

Europe dug itself deep into a hole.

At first, we missed the internet. Because of way too much regulation.

Then we found ourselves using only US software and digital services.

What did we do? We created more regulation. Way more regulation. Burying all hope Europe could ever get a foot in the door of the internet.

Now we are about to miss AI.

OpenAI, Microsoft, Google, Meta, Tesla, NVIDIA ... all the AI players are in the US again.

I don't think more governmental intervention is the right way.

In fact, I think it is counterproductive.

What should we do?

Well, it's a long process. We would have to get to a mindset that gives small entrepreneurs more freedom. Less bureaucracy and less regulation, step by step. I don't think it will happen. The reaction in Europe is always "more regulation". The thought of removing some regulation is probably terrifying to our leaders.

m_ke · on April 10, 2023

I think this has a lot more to do with the US having a larger English-speaking population under one legal framework. All of the capital goes here first, since it has the most upside and makes it easier to expand to other English-speaking countries.

I'm Polish and would never build a tech business targeting the Polish market first over the US, and it has nothing to do with regulations getting in my way.

Building ChatGPT or Google for EU citizens is a lot harder than doing it for English speakers.

Also most of the leading figures in AI are from Europe or Asia but followed the money to the US.

EDIT 3: Americans underestimate the value of being surrounded by oceans instead of Russia and Germany, which they end up attributing to their ingenuity or libertarian leaning version of capitalism.

pk-protect-ai · on April 10, 2023

I believe you are driving the conversation off the course here.

> We would have to get to a mindset that gives small entrepreneurs more freedom. Less bureaucracy and less regulation, step by step.

Can you explain to me how small entrepreneurs will get their hands on the billions needed to train the 175B models we have now, and the next ones, which will have an order of magnitude or even 3 orders of magnitude more weights???

You can't do this without collaboration and lots and lots of money. This requires huge infrastructure for the training, the fine-tuning and incremental training. BLOOM is one example that was made using your taxes, but it lacks RLHF and RLMF. The latter two require dedication and dedicated work on the models.

And this is what the CERN petition is about. I don't give a sht about regulations at this point, as I'm really worried about big corps having sole control of, and a huge advantage in, the technologies of this paradigm shift.

That is why such Open Source initiatives must be pursued.

If you live in Europe, vote for the CERN petition now.

ls-lah_33 · on April 10, 2023

Don't know about regulation, but the development of the internet was initially subsidised by the US government [1]. AI research has also been subsidised in the past. The famous MNIST handwriting dataset being a good example here [2].

[1] https://en.wikipedia.org/wiki/ARPANET

[2] https://www.nist.gov/srd/nist-special-database-19

fancyfredbot · on April 9, 2023

Deepmind is headquartered in the UK (but owned and funded by Google). Graphcore are also UK based.

YetAnotherNick · on April 9, 2023

> but owned and funded by Google

I remember Demis Hassabis (Deepmind founder) said in some podcast that they had a lot of trouble getting decent funding in Europe. In general it seems Europe has much less appetite for funding private research groups than the US, which I think boils down, directly or indirectly, to regulations.

atleastoptimal · on April 9, 2023

This is true, it all comes down to money.

Europe is as capable if not more capable of being the center of engineering, tech and software. However there is no huge hotbed of VC money that allows huge risks.

chpatrick · on April 9, 2023

Europe has much less appetite for funding anything.

louloulou · on April 10, 2023

Except particle physics - which I guess is why they are trying to sell it this way.

tommoor · on April 9, 2023

So, from a regulatory perspective neither of these are in the EU either ;)

ben_w · on April 9, 2023

Only theoretically; the UK, as half the country said would happen, is currently deciding to use its post-Brexit sovereignty to repeatedly freely choose to (almost?) always do whatever the EU decides to do, only without any of the free trade advantages that come with guaranteeing they always will.

anonylizard · on April 9, 2023

This. The EU prides itself in being the 'regulatory superpower'.

Now the EU gets GDPR, while the US gets GPT-4. I cannot think of any influential EU AI companies, except for DeepL, while even the UK has two (Deepmind, Stability). I cannot see a more damning indictment of the EU's tech policy than this.

But this seems to be the future that Europeans want: aging, resentful of change, yet somehow mass importing immigrants to serve the pensioners. Italy's GDP per capita is now 50% of the US's; I think it'll end up being 33% by 2050.

ramblenode · on April 10, 2023

A pretty large majority (79%) of Americans are concerned about how their data is being used by companies. 75% would like some kind of regulation on what companies can do with their data [0]. The fact that the US government is so unwilling/unable to do anything despite such popular support is an indictment of the idea that Americans are better off just because the economy is doing well.

The problem with comparing economies is that something which is a net positive for GDP can still be a net negative for most of the actual people living under that economy. The resource curse of petro states is a classic example, but in the US one could look at synthetic opioids, casinos, strip mining, or even social media. More money is moving, but are most people better off? The US (economy) is the richest it's ever been, but despite this there seems to be a rising tide of pessimism, unaffordability, and distrust of institutions.

[0] https://www.pewresearch.org/fact-tank/2019/11/15/key-takeawa...

cscurmudgeon · on April 10, 2023

Where is the equivalent survey for EU citizens?

Patrickmi · on April 10, 2023

The US stock market by itself is like 2x the size of all of Europe's combined. That in itself deters cooperation, greediness and also innovation.

abetusk · on April 9, 2023

I'm absolutely in favor of the idea of open source AI, providing libre/free/open source code, libre/free/open source data and all other relevant digital artifacts under a libre/free license, but I'm nervous about believing in yet another organization that claims to be open but then rescinds its offer.

I think this was precisely OpenAI's charter when it started [0], and now they've positioned themselves as paternalistic protectors, claiming to have people's best interests at heart when limiting access to the tools they've created. Even Hugging Face has weird licensing for many (most?) of their models [1].

My apologies, but I don't trust LAION. I guess I would trust it more if they talked about which licenses they specifically would support, which they wouldn't and what the repercussions of violating their charter would be.

[0] https://web.archive.org/web/20160220093339/https://openai.co...

[1] https://huggingface.co/models?sort=downloads

mark_l_watson · on April 10, 2023

I mostly use the OpenAI APIs out of some laziness on my part: super easy to integrate with my code, inexpensive, large 3rd party dev community.

All that said, I love what Hugging Face is doing with open models and software. I made a note to check out their licensing, however, based on your comment.

nharada · on April 9, 2023

I support this, and even more so I believe this is necessary if we believe that academia should exist as a place for researchers to do work not motivated by profits. So many AI researchers have moved to work at big tech simply because the resources required for cutting edge ML are so vast.

Having a giant collaborative project would make it much easier for universities to retain professors, attract PhDs/Post-docs, etc. They could do work that aims to benefit society as a whole, instead of just happening to benefit society when it's a lucky side effect of benefiting stockholders.

rajandatta · on April 9, 2023

This petition is an excellent idea. We as humanity need to get closer to the edge of AI development, if not ahead of it, to ensure that a) the wealth and benefits can be generated for societies and b) closed and unaccountable parties are not the sole custodians of such a powerful force.

The idea of CERN has a lot of merit because of CERN's success as a multi-national institution. The model will need to be evolved for AI but that's why we need to start now.

This is the start of a new phenomenon for our societies; not the end. It's better for its impact to be managed by society as a whole.

stathibus · on April 9, 2023

On the contrary, it's totally unnecessary, and the same level of effort will come about naturally from market and industry/academic incentives. CERN was needed because nobody would spend the money on high energy physics otherwise.

seydor · on April 9, 2023

They can apply for a research grant, the EU does give a lot.

But if it goes the way of the Human Brain Project, it will alter course next year and become 100 small projects

Besides, our laws make it virtually impossible to train these models without getting sued. Not to talk about their output, a defamation lawsuit magnet.

EU is pretty much locked out of the future of AI

louloulou · on April 10, 2023

Also, what languages are you focusing on for the training? How will this not result in massive political arguments about where resources are spent?

moelf · on April 9, 2023

except CERN actually does a lot of things backwards in terms of technology (partly due to our unique needs for hardware, but absolutely no excuse for software).

If what CERN is doing had even the remotest real-world application, CERN would look much less world-leading than it does today. The European treaty is what made CERN viable, because otherwise it's very hard to fund ultra-long-term projects with almost no return.

AI is the opposite of that and you probably don't want to have a CERN: plenty of $ is flowing into AI projects without government intervention && you don't want the slowness of CERN.

i-use-nixos-btw · on April 10, 2023

I’m wondering how sustainable this is.

Tech firms are throwing billions at this at the moment, and will continue to do so for a long time. As soon as faster tech emerges, they’ll upgrade - they don’t need to justify it to the public, and justification to shareholders is pretty easy. “We’re trying to secure our position at the forefront of a competitive market”.

Would a publicly funded project be able to keep up with that? Or would there be a big upfront cost, only to find that their technology is in the Stone Age five years from now?

With CERN it makes a lot of sense. There isn’t a vast market out there for it to compete with - just a few organisations that they collaborate and “compete” with. But trying to join the big spenders seems risky.

What am I missing?

amrb · on April 9, 2023

Japan did have the VLSI project, a group of their largest companies: https://youtu.be/bwhU9goCiaI?t=766

For Europe I could see a bigger group of companies, schools and researchers working towards open models with funding. This would allow EU companies to be competitive with the US while not duplicating effort, more eyes on the problem too. Maybe alignment work could be a goal, since the driver is not profit at any cost this time.

novaRom · on April 9, 2023

The problem is interdisciplinary. We have to unite first and make a clear plan which should answer the following questions:

1. How to get a stable non-monopolistic semiconductor industry with multiple players without critical dependencies on a single supplier

2. How to increase energy supply and stabilize energy prices

3. How to transform existing businesses, governments, and society in general to align them with AI-induced disruptive changes

amrb · on April 9, 2023

So they would have to buy the hardware required to support this project, whereas CERN requires research and creates scientific jobs in the EU.

I can see people may not like handing over a lot of money to Nvidia for 1000 cards, unless the hardware ML accelerators are shipping as an alternative.

SanderNL · on April 9, 2023

Wants your full address. “Your data will be secure”, until it is not. This is too much.

lwn · on April 9, 2023

At the next step. It's a no go for me as well.

bsaul · on April 9, 2023

please don't advocate AGAIN for EU politicians to compete against silicon valley. They love to do that and fail miserably all the time (but only after having spent troves of money, of course)

pelasaco · on April 9, 2023

at least the initiative isn't driven by politicians https://laion.ai/team/, but by subprime researchers trying to get their hands on the EU money. It's for sure better, but I'm afraid EU is broken.

ljlolel · on April 9, 2023

Subprime?

Weird way to describe this team, also tons of great researchers are in and from the EU.

One person listed there:

Robin Rombach Member. Stable Diffusion Trainer AI researcher with a focus on deep generative models. Author of VQGAN, Latent Diffusion, Stable Diffusion.

pelasaco · on April 9, 2023

Yeah, subprime. You cannot compare them with the people who founded CERN https://home.cern/about/who-we-are/our-people. But maybe our understandings of subprime are different? I didn't want to offend them. For me subprime means "being of less than top quality", which means good, but not great. Great, but not excellent. So yeah, they are good researchers, some of them are good software developers, but IMO, not enough to lead the CERN for open source large-scale AI.

jononor · on April 13, 2023

Who would be good enough?

freakynit · on April 9, 2023

Can't we use Kickstarter or other fundraising sites to train a powerful GPT-4-like model?

oceanplexian · on April 9, 2023

I would avoid Kickstarter for any AI-related work; they made it clear after banning a few projects that they want to push their flavor of AI ethics on creators (e.g. https://updates.kickstarter.com/ai-current-thinking/). IMO we need an alternative infrastructure for organizing and funding these things, since it's clear Big Tech sees AI as a threat and desperately wants to shut it down.

pixl97 · on April 9, 2023

Corporations are raising billions (with a B) of dollars to create and develop AI models. With that money they are buying up every bit of compute they possibly can. I'm not sure what level of funding you're going to get in your actions, but it's not going to be anywhere close to the levels needed.

olalonde · on April 10, 2023

Which corporations? There were some estimates that GPT-3 cost under $5M to train [0].

[0] https://lambdalabs.com/blog/demystifying-gpt-3

MacsHeadroom · on April 10, 2023

Anthropic plans to spend $1B on compute to train Claude-Next-10X (10x the size of GPT-4 according to their own leaked documents) over the next 18 months. They also plan to raise $5B over the next 5 years. OpenAI has secured over $10B in funding.

GPT-3 is not GPT-3.5 or GPT-4. Each costs exponentially more than the last in compute and dollars.

Tepix · on April 9, 2023

This shouldn't be a one-time affair