Who should control your AI?
AI is rapidly becoming part of our everyday lives; it’s inescapable whether we’re ready for it or not. As its influence grows, critical questions arise: who owns and controls this technology, and what are their intentions? How will it develop over the next few years, and what will its impact be on our lives, and potentially our bank accounts?
While the debate around AI often swings between utopian and dystopian visions, it’s essential to look at the underlying technology and the organisations likely to dominate this space. AI consists of three core components: algorithms, data and computing power.
The algorithm is software coded to infer and capture ‘knowledge’ (the quotation marks are deliberate) from the data used to ‘train’ the model; once trained, the model is released to process queries from the user community. Data for model training and development can either be vendor- or organisation-specific, or generally available from the Internet. Computing horsepower essentially means lots of servers fitted with AI-specific add-on processing cards, whose parallel processing and higher bandwidth deliver better AI performance than general-purpose CPUs.
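The train-then-serve split described above can be sketched in a few lines. This toy ‘model’ simply memorises word frequencies from its training corpus and then answers queries from those frozen counts; the names and structure are purely illustrative, not any vendor’s API:

```python
from collections import Counter

def train(corpus):
    """'Training': capture statistical 'knowledge' from the data
    (here, nothing cleverer than word frequencies)."""
    model = Counter()
    for document in corpus:
        model.update(document.lower().split())
    return model

def serve(model, query):
    """Inference: the frozen model answers queries without
    seeing any new training data."""
    return model.get(query.lower(), 0)

corpus = ["AI needs data", "data trains the model", "the model serves queries"]
model = train(corpus)
print(serve(model, "data"))   # 'data' appeared in two training documents
print(serve(model, "steak"))  # never seen during training
```

Real models capture far richer statistical structure than word counts, of course, but the shape is the same: knowledge is baked in at training time, and serving only draws on what was baked in.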
Let’s explore each one and examine who holds control over them and their potential intentions.
Algorithms: The Core of AI
Firstly, algorithms. These are the core of AI, and where most of the attention is currently focused. AI algorithms, like all other algorithms used for search or optimisation, are normally written by software engineers working for commercial organisations looking to profit from them in some way. There are a number of public-spirited individuals and organisations that open source their algorithms or license them for minimal cost, but these tend to be the exception rather than the rule.
Most algorithms are proprietary and closely guarded intellectual property of the organisations that develop and maintain them. Interestingly, OpenAI, seen as one of the leaders in AI, was originally founded as a charitable organisation with a mission of developing Artificial General Intelligence (AGI) for the common good, but is now a hybrid organisation in which what is ostensibly a charity has a commercial, profit-making subsidiary as its main trading entity, albeit one they describe as a ‘capped profit’ organisation.
The other major players are all commercial, for-profit entities: some develop their own AI/AGI algorithms, such as Google, Meta, Anthropic, Stability AI, IBM, Mistral and X/Twitter with Grok; others license from the developers, such as Microsoft’s and Apple’s relationships with OpenAI and its ChatGPT service, or AWS offering Anthropic’s Claude and Meta’s Llama 3 to their customers. Given this, when using any of these models, one of the key questions to ask, particularly if the service is free to use, is who benefits from you using it, and what do they actually want from you?
Data: The Fuel for AI
The second element is data. All AI algorithms are trained on data sets, plus their associated metadata (data about the data), from which they ‘learn’ the attributes needed to deliver the desired outcomes. This brings several potential issues. Firstly, is the data correct and accurate? Most of the major models are trained on publicly available data from across the internet, and as most people are aware, not all of it is correct and accurate, although to be fair the vast majority is.

The other issue this brings is copyright. Most data is owned by a producer or publishing organisation with a legal right to control access to it. Many of them would like you to pay for access and won’t allow you to copy and share that data without their permission or payment. Data you pay for also tends to be of higher quality: checked by a human, and usually more accurate than generally available data from the internet. This is a growing issue that a number of lawsuits, such as the New York Times’ case against OpenAI, are currently testing. Who (or what) can access it, and how much is that data worth for accurately training AI models? The algorithm developers are keen to get access to the best quality data, witness the number of licensing deals currently being struck between content providers and AI developers.
The second issue is that the data used to train the model can be selective. If you trained a model to identify and comment on food, and the only training data provided was pictures and details of cream buns, then the model would only consider cream buns to be food and would be unable to classify a steak or a stalk of broccoli as food.
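The cream-bun problem can be made concrete with a deliberately trivial lookup-table ‘classifier’ (all labels and items here are illustrative; a real model generalises statistically, but the blind spot is the same):

```python
# 'Train' an is-it-food classifier on a deliberately narrow dataset:
# the model has only ever seen cream buns and their close relatives.
training_data = {"cream bun": "food", "eclair": "food", "doughnut": "food"}

def classify(item):
    """The model only 'knows' what it was trained on;
    anything outside the training distribution draws a blank."""
    return training_data.get(item, "not recognised as food")

print(classify("cream bun"))  # food
print(classify("steak"))      # not recognised as food
print(classify("broccoli"))   # not recognised as food
```

A statistical model fails less abruptly than a lookup table, but the principle holds: no amount of clever inference recovers categories that were simply absent from the training data.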
The third issue is the provenance of the data. Is it free from bias and generally accepted to be truthful? That rather depends on your viewpoint and who you ask. Many people believe that the moon landings of the late 1960s were faked; likewise, many people are not convinced that evolution is a real process, believing instead that everything in the world was created by the man upstairs a few thousand years ago.
In certain pockets of the internet there are copious postings on these topics, and the AI algorithm can only infer from the data it analyses. If your AI’s results asserted either of the previous beliefs, most users would immediately question the data, yet any true believer getting those results would see them as correct. Additionally, AI is now producing copious data itself: it is estimated that up to 30% of current internet content is AI generated, some of which is being used to train newer AI algorithms, so these inaccuracies are repeated and magnified with each iteration.
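The compounding effect of training on AI-generated output can be sketched with some back-of-the-envelope arithmetic. The figures below (5% starting error rate, 30% synthetic share per the estimate above, 1.5× amplification per generation) are assumptions for illustration, not measurements from any real model:

```python
def error_after_generations(base_error, synthetic_share, amplification, generations):
    """Crude compounding model: each training generation inherits its errors
    unchanged via the human-written share of the data, but amplified via the
    synthetic (AI-generated) share. Purely illustrative numbers."""
    error = base_error
    for _ in range(generations):
        error = error * (1 - synthetic_share) + error * amplification * synthetic_share
        error = min(error, 1.0)  # an error rate can't exceed 100%
    return error

# 5% initial error, 30% of training data AI-generated, 1.5x amplification
for generation in (1, 3, 5):
    rate = error_after_generations(0.05, 0.30, 1.5, generation)
    print(f"generation {generation}: error rate {rate:.3f}")
```

Under these assumptions the error rate grows by a fixed factor each generation, which is the intuition behind the ‘magnified with each iteration’ concern: even a modest synthetic share with mild amplification compounds steadily.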
Lastly on this topic, data volumes. The more data a model can be trained on, the more accurate it is likely to be and the better the quality of its results. Therefore, if you are the owner of vast amounts of data (hello Apple, Alphabet/Google, Meta, AWS, Microsoft and others), the chances are that models trained on your data will be more generally useful than one trained on Fordway’s data. As the old adage goes, ‘everything comes to those what has’.
Computing Power: The Backbone of AI
AI needs vast amounts of expensive computing hardware and facilities to operate, specifically specialised AI processing units (AI PUs), otherwise known as NPUs, DPUs, SPUs or TPUs depending on the vendor. This is why Nvidia has recently become one of the most valuable companies in the world: at the moment it has the most capable AI PU available, plus the manufacturing and distribution capacity to service the demand. Other vendors are developing rival AI PUs, particularly for more specialised applications, but have a way to go before they catch up. You also need datacentres with thousands of servers to host the AI PUs, vast amounts of data storage capacity and high-speed networking linking all the components. Who currently has these? Why, it’s Apple, Alphabet/Google, Meta, AWS, Microsoft and their fellow hyperscalers. They are happy to rent you capacity, subject to market demand, but owning it gives them an implicit advantage: they decide who gets to use it.
So, in my view, whilst AI is potentially a massive force for good and will provide many benefits for humanity, the way it is being developed, marketed and used simply reinforces the hegemony of our current digital overlords. Coming back to my original point: if they own and control it, then whilst their offerings may benefit each of us, ultimately the products they introduce exist primarily to generate profits and further their interests. Therefore, look before you leap, and understand what you are signing up to. Whilst the EU and others are looking at regulation and policy to curb the worst excesses of the vendors, the pace of regulation is rather slower than the pace of product development, so regulators will always be playing catch-up after you have signed up for the service and handed your data and bank details to the provider.
In an ideal world, AI would be a force for the common good, like roads and power grids, but in the short to medium term I can’t see any western governments having the desire, ability and resources to set up services competing with those offered by the hyperscalers, who, whilst constrained by market forces, are not altruistic organisations. It would be nice, but I don’t think it will happen for several years to come.