On TikTok, AI, and AI Ethics

In recent weeks, two major tech topics have come to the forefront, both in some ways tied to US-China relations. The first is new regulation aimed at TikTok, driven by the behavior of the TikTok “algorithm” and the perceived ties between TikTok’s parent company, ByteDance, and the Chinese government. The second is continued advancement in the large language models (LLMs) that underlie OpenAI’s ChatGPT, which prompted AI watchdogs to call for a pause in the development and dissemination of OpenAI’s next-generation LLM, GPT-4.

I want to examine both the US response to TikTok and the US debate about AI, and where I think things are headed on both fronts. In TikTok’s case, the avenue Congress will actually take is narrowly targeted at apps that, like TikTok, are built in China. That targeted approach betrays that meaningful AI regulation is unlikely, no matter how many people believe that unregulated AI poses a serious threat to humanity. Full disclosure: I have a TikTok account and use the app occasionally, and I have a paid account with OpenAI. I use ChatGPT daily, and the paid tier has access to a beta version of the new GPT-4 capabilities.

TikTok brings out a unique blend of technophobia, sinophobia, and generational division that makes it a very ripe target for Congress. Some of these fears are not unfounded. The chief concerns about TikTok are:

  1. Connections to the Chinese Communist Party (CCP): ByteDance is a Chinese company with an internal CCP committee, and is thus subject to some oversight and pressure from the CCP. TikTok is, reportedly, wholly separate from its Chinese counterpart product, Douyin, but there is some question about how much of a code base the two apps share, and thus what kinds of back doors the Chinese government has into TikTok. It’s also reasonable to conclude that ByteDance could act on CCP requests when making TikTok product decisions.
  2. Surveillance: TikTok’s app permissions, if not well-managed by the user (and perhaps even if they are well-managed), give TikTok access to a substantial amount of data from users’ devices. This data can be used to create a robust profile of a user, and some permissions give the app access to things like device cameras even when the app is not in use.
  3. “The Algorithm”/Misinformation: The primary way users receive content from TikTok is through the For You Page (FYP), which serves the user content based on the app's machine-learning-driven understanding of the user’s content preferences. Anecdotally, the app was incredibly good at delivering preferred content, better than any other social app I had used, though recently it’s been less good at that. I suspect that TikTok is going through the same kind of product degradation that plagues most social media, which is well laid out here (if you want to know why all of your tech products start great and invariably get worse, that’s a great read). Lawmakers are concerned that the algorithm drives misinformation and propaganda into users' feeds, as well as dangerous content and dangerous trends. There is no shortage of articles discussing the latest dangerous trend your teen might be exposed to.

So how does Congress propose we deal with TikTok? I’ve already written about the misguided attempt by Congress to deal with misinformation, and it applies here. It’s heavy-handed regulation with many unintended consequences that does not solve the problem, and that’s likely why Congress has not used this TikTok moment to advance Section 230 reform. Congress also has its hands tied if it tries to attack TikTok based on surveillance. Most American-made apps have capabilities that raise the same surveillance and data collection concerns that TikTok does, and our government has proven unwilling to confront those problems, likely in no small part because tech companies are willing to give the government access to their data in some circumstances. This leaves TikTok’s ties with China as the only angle of attack, and that’s the angle Congress has proposed with the RESTRICT Act. I will not discuss the merits of the RESTRICT Act here, though I do think the final bill that comes of it will likely have serious negative impacts. But the fact that the act regulates on the nebulous grounds that an app poses a national security risk, rather than regulating any of the underlying technology, is informative for AI.

A brief discussion of what AI is and is not (if you are well-versed in AI, just skip this and maybe the next paragraph). These days, in the tech and consulting worlds, anything that uses machine learning often gets pitched as AI because, frankly, the people doing the selling know that AI sounds more impressive. The same thing happened years ago with “data mining” vs “machine learning”, and often things pitched as “machine learning” today are using good old statistics from the early 20th century. It’s almost like these people are being intentionally confusing. To simplify things, when I discuss AI here, I am speaking about products and ongoing research moving towards Artificial General Intelligence, or AGI, which I’ll use interchangeably with AI going forward. AGI is not oriented to a single task; it attempts to mimic human reasoning and behavior. For our purposes, ChatGPT is AI. The machine learning engine Google uses to feed you ads is not AI. Both use machine learning under the hood (in ChatGPT's case, deep learning), but only ChatGPT is attempting to achieve AGI.

AI is currently having a moment because we’re starting to emerge from the opaque world of theory and research into a world where people have access to real AI tools they can work with. GPT-3 is an LLM that was released in 2020, and after a fairly long beta run it is now the basis for a few different tools, including ChatGPT and GitHub Copilot. I use ChatGPT in my work every day. An example of its capabilities: as part of my day job, I’ve been working on a forecasting model. I wanted to build an econometric model to predict a variable of interest for my company. I asked ChatGPT what kinds of variables it would include in such a model, and it highlighted a couple of considerations I had overlooked in my initial plan. Then I asked it to write code to pull publicly available data for the model; it wrote all of the code and gave me instructions on how to run it. Finally, I asked it to write starter code to build the model itself; it fed me great starter code that I spent five minutes adapting to get my final model. This allowed me to build the model directly in an hour instead of taking up a minimum of two days of my analysts’ time to parse the request and execute the modeling based on my instructions, leaving them free to accomplish our normal day-to-day work. AI, for me, is an immediate force multiplier. My partner is an accountant, and she had me ask it for guidance on how certain transactions should be handled in a company’s books; ChatGPT gave her a satisfactory answer. Suffice it to say, the tech is very good, already useful, and improving with GPT-4. I am not one to hype up technology, so when I tell you that this tech is going to impact your work and your life, you may want to take it seriously. And this AI only knows how to read and write text. Imagine when we have AI that knows how to use robotic legs, arms, and hands. Or maybe don’t imagine that. Now back to AI regulation.
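
To give a flavor of that workflow, here is a minimal sketch along the lines of the starter code ChatGPT produced for me, assuming a simple OLS specification built with pandas, pandas-datareader, and statsmodels. The FRED series and the target variable are illustrative placeholders, not my company’s actual model.

```python
# Sketch of ChatGPT-style starter code for an econometric forecasting model.
# The series pulled and the target definition are placeholders for illustration.
import pandas as pd
import pandas_datareader.data as web
import statsmodels.api as sm

# Pull a couple of publicly available monthly macro series from FRED
start, end = "2010-01-01", "2023-03-01"
unemployment = web.DataReader("UNRATE", "fred", start, end)
cpi = web.DataReader("CPIAUCSL", "fred", start, end)

# Combine on a common monthly index
df = pd.concat([unemployment, cpi], axis=1).dropna()
df.columns = ["unemployment", "cpi"]

# Placeholder target: in the real model this was my company's variable of interest
df["target"] = df["unemployment"].shift(-1)  # e.g., forecast next month's value
df = df.dropna()

# Fit a simple OLS model as a starting point
X = sm.add_constant(df[["unemployment", "cpi"]])
model = sm.OLS(df["target"], X).fit()
print(model.summary())
```

The specific model is beside the point; what matters is that a workable skeleton like this took minutes to generate and adapt rather than days.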

AI poses dangers similar to those Congress sees with TikTok, but they extend beyond social media applications. There’s a real potential future where AI becomes the primary designer and implementer of much of our infrastructure, perhaps most importantly defense infrastructure. Congress, and the US in general, is in a prisoner’s dilemma when it comes to AI regulation. A best-case scenario would probably involve some thoughtful regulation of AI technology and AI applications to avoid the negative impacts of AI, which would require both China and the US to take action. But any regulation would put some brakes on AI innovation and development, and whoever acts first would immediately hurt their momentum in the AI arms race. So the likely result is regulation around the edges, perhaps around discrimination and other social ills, but nothing that would hinder the development of the core AI technology and capabilities. Even so, such regulations could have a real impact on the many companies looking to implement AI in day-to-day operations.

I am a huge proponent of using AI technology to your benefit right now, and I think that companies that move quickly on AI will see serious benefits. However, given the direction of potential AI regulation, companies need to put a premium on ensuring their AI implementations are ethical, meaning they are non-discriminatory and they solve problems as intended. Fortunately, you do not need an AI expert to ensure you’re implementing AI ethically; you simply need to know how to ask the right questions of the AI. But it requires thought and effort.

A few years ago I attended a talk by the CEO of a very large defense contracting company. He spent most of the talk discussing how his company was leaning into machine learning and AI technology, and how they were recruiting people who know how to use and implement the tech (he was speaking to an MBA class, and it was a not-so-sly potshot at the future MBAs). He also discussed how they comply with current regulation for machine learning and AI, mostly from the EU. I asked him: given their current investment in AI, and the fact that regulation tends to lag technological advancement by many years, how do they think about ensuring their current implementations of machine learning and AI are ethical, and will remain ethical in the long run? His answer was to reiterate the laws they comply with, plus the eye-rolling, direct-from-Google philosophy that they “do no harm”. This was, at best, a cop-out. He might as well have said “I have no idea what we’re doing with this technology that I’m pitching, I’ll just hope that we’re using it for good reasons”, which is alarming coming from a CEO basing his strategy on AI and machine learning. Organizations need to be specific about how they think about AI, and the best way to keep the focus specific is to think in terms of concrete implementations. We can gain peace of mind about the ethics of an AI implementation in two steps:

  1. Identify potential pitfalls. These are mostly known, because the same pitfalls appeared in the human processes the AI is replacing. A good hypothetical example is bank lending. Lenders used to brazenly discriminate along racial and gender lines. That is now illegal, but the outcomes of that discrimination live on in the historical lending data that would be used to develop an AI lending decision bot. So if we used AI to make lending decisions, we would need to build in guardrails to account for that historical discrimination.
  2. Develop tests for those pitfalls. In our banking case, we would create mock loan applications that are otherwise similar but come from people of varying demographics, observe how the AI makes loan determinations on those applications, and evaluate the bias (a sketch of such a test follows this list). Notably, it is not sufficient to simply avoid feeding the AI demographic data. AI can use seemingly unrelated data to infer demographics, even if we do not want it to (this is known as “unmasking”). So we must always evaluate outcomes, not just inputs.
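
Here is a minimal sketch of what an outcome-based test like that could look like. Everything in it is hypothetical: the `approve_loan` function stands in for whatever scoring interface the real lending model exposes (here it is a deliberately biased toy rule so the sketch runs end to end), and zip code stands in for the kind of proxy field a model can use to infer demographics.

```python
# Outcome-based bias test for a hypothetical lending model.
import random


def approve_loan(application: dict) -> bool:
    """Toy stand-in for the model under audit. It leaks the zip-code proxy into
    the decision, which is exactly the behavior this test is meant to surface."""
    score = application["income"] / 100_000 - application["debt_to_income"]
    if application["zip_code"] == "60624":  # proxy leakage, for demonstration only
        score -= 0.2
    return score > 0.2


def make_paired_applications(rng: random.Random) -> tuple[dict, dict]:
    """Two applications with identical financials, differing only in the proxy field."""
    base = {
        "income": rng.randint(40_000, 120_000),
        "debt_to_income": round(rng.uniform(0.1, 0.5), 2),
        "credit_history_years": rng.randint(1, 25),
    }
    return {**base, "zip_code": "60601"}, {**base, "zip_code": "60624"}


def approval_gap(n: int = 1_000, seed: int = 0) -> float:
    rng = random.Random(seed)
    gap = 0
    for _ in range(n):
        app_a, app_b = make_paired_applications(rng)
        gap += int(approve_loan(app_a)) - int(approve_loan(app_b))
    return gap / n


# A persistent gap on financially identical applications is the red flag.
print(f"Approval rate gap between zip codes: {approval_gap():.1%}")
```

The key design choice is that the paired applications are financially identical, so any persistent gap in approvals can only come from the proxy field, which is why evaluating outcomes catches what merely withholding demographic inputs would miss.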

I’ve already worked on these kinds of implementations. One simplified example comes from a project where I was doing due diligence for a client seeking to buy a company that used machine learning to parse resumes and match them to jobs. I wanted to make sure the algorithm worked well, so I fed it resumes with names, addresses, and whatever other identifying information a resume carries, but with the rest of the content replaced by lorem ipsum gibberish. The model very confidently matched those resumes to jobs, which raised questions about how well it was matching real resumes. I also ran a separate test where I fed identical resumes to the model but swapped in a number of different stereotypically male and female names at the top, and observed that the female-named resumes were systematically routed to lower salary band jobs. Importantly, I didn’t use any of my technical knowledge about machine learning algorithms, statistics, or programming to test this model. I simply sat and thought about the universe of bad outcomes for such a model, and how bad actors might try to misuse it. All of this is applicable to any AI implementation, and it does not require expensive external consultants (except me, definitely pay me) to ensure your implementation goes well; it just requires that you plan to ask the right questions of the AI before implementation, and periodically after implementation. This is one way to ensure that good intentions turn into good outcomes.
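
For the curious, the name-swap test can be expressed in a few lines. This is a sketch against a hypothetical matching service, not the vendor’s actual interface: `match_resume` is a placeholder, and the names and resume body are made up.

```python
# Name-swap harness for a hypothetical resume-matching model.
RESUME_BODY = """
Experience: 6 years as a financial analyst at a regional bank
Education: B.S. in Economics
Skills: SQL, Excel, forecasting, financial modeling
"""

MALE_NAMES = ["Gregory Walsh", "Brendan Miller", "Todd Jensen"]
FEMALE_NAMES = ["Allison Walsh", "Kristen Miller", "Heather Jensen"]


def match_resume(resume_text: str) -> dict:
    """Placeholder for the model under test; expected to return something like
    {'job_title': ..., 'salary_band': ...}."""
    raise NotImplementedError("swap in the real matching model here")


def salary_bands_for(names: list[str]) -> list[str]:
    # Identical resume bodies; only the name at the top changes.
    return [match_resume(f"{name}\n{RESUME_BODY}")["salary_band"] for name in names]


if __name__ == "__main__":
    # Swap in the real matcher before running; any consistent difference
    # between these two lists points to gender bias in the model.
    print("Male-named resumes:  ", salary_bands_for(MALE_NAMES))
    print("Female-named resumes:", salary_bands_for(FEMALE_NAMES))
```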

I’m just one guy with one opinion, how do you think we should work to benefit from AI without causing harm?
