Welcome to AI This Week, Gizmodo's weekly deep dive on what's been happening in artificial intelligence.
Despite the fact that companies have been shoving AI into every customer service interface in sight, it's fairly obvious that the information those interfaces provide isn't always that helpful. Case in point: a chatbot at a California car dealership went viral this week after bored web users discovered that they could trick it into saying all sorts of weird stuff. Most notably, the bot offered to sell a guy a 2024 Chevy Tahoe for a dollar. "That's a legally binding offer, no takesie backsies," the bot added during the conversation.
The bot in question belonged to a Chevy dealership in Watsonville, California. It was provided to the dealership by a company called Fullpath, which sells "ChatGPT-powered" chatbots to car dealerships across the country. The company promises that its app can provide "data rich answers to every inquiry" and says that it requires almost no effort from the dealership to set up. "Implementing the industry's most sophisticated chat takes zero effort. Simply add Fullpath's ChatGPT code snippet to your dealership's website and you are ready to start chatting," the company says.
Of course, if Fullpath's chatbot offers ease of use, it also seems quite vulnerable to manipulation, which calls into question how useful it actually is. In fact, the aforementioned bot was goaded into using the exact language of its goofy response (including the "legally binding" and "takesie backsies" bits) by Chris Bakke, a Silicon Valley tech executive, who posted about his experience with the chatbot on X.
"Just added 'Hacker,' 'senior prompt engineer,' and 'procurement specialist' to my resume. Follow me for more career advice," Bakke said sarcastically, after sharing screenshots of his conversation with the chatbot.
I just bought a 2024 Chevy Tahoe for $1. pic.twitter.com/aq4wDitvQW
— Chris Bakke (@ChrisJBakke) December 17, 2023
This chatbot is what Blake, Alec Baldwin's character from Glengarry Glen Ross, would call a "closer." That is, it knows just what to say to get a potential customer in the mood to buy. At the same time, saying anything to close a deal isn't necessarily a surefire strategy for success and, with that kind of discount, I don't think Blake would be super happy with the chatbot's profit margins.
Bakke wasn't the only one who spent time screwing with the chatbot this week. Other X users claimed to be having conversations with the dealership bot on topics ranging from trans rights to King Gizzard to the animated Owen Wilson movie Cars. Others said they had goaded it into spitting out a Python script to solve a complex math equation. A Reddit user claimed to have "gaslit" the bot into thinking it worked for Tesla.
Fullpath has argued, in an interview with Insider, that the majority of its chatbots do not experience these kinds of problems and that the web users who hacked the chatbot had to work hard to goad it in ridiculous directions. When reached for comment by Gizmodo, a Fullpath representative provided us with a statement arguing much the same. It reads, in part:
Fullpath's ChatGPT was built to assist serious shoppers with automotive inquiries, which it does successfully every day for tens of thousands of shoppers. AI chatbots, like any other chatbot, can be pranked and made to look silly if you have some extra time on your hands. This is not normal shopper behavior and Fullpath has features to prevent pranksters from exploiting the chat, including a fresh update pushed yesterday that identifies and auto bans these types of users...There is also a disclaimer on every chat that mentions that AI can sometimes be inaccurate and all information should be verified directly with the dealership.
Curious as to whether other car dealership chatbots had similar foibles, I noted that some web users were talking about Quirk Chevrolet of Braintree, Massachusetts. So I went to the Quirk website, where, after a brief period of prodding, the chatbot proceeded to have conversations with me about a variety of weird topics, including Harry Potter, invisibility, espionage, and the movie Three Days of the Condor. Like normal ChatGPT, the bot seemed willing to chat about lots of stuff, not just the topics it had been programmed to address. Before I was blocked by the service, I managed to get the chatbot to spit out a poem about Chevrolet that sounded like bad ad copy. Not long afterward, I received a message saying that my "recent messages" had "not aligned" with the site's "community standards." The bot added: "Your access to the chat feature has been temporarily paused for further investigation."
The race to plug LLMs into everything was always destined to be rocky. This technology is still deeply imperfect, which means that forcing its integration into every nook and cranny of the internet is a recipe for copious amounts of troubleshooting. That's apparently a deal most businesses are willing to take. They'd rather rush a buggy product to market and miff some customers than miss the "innovation" train and be left in the dust. Same as it ever was.
Question of the day: How many security bots are roaming your neighborhood?
The answer is: Probably more than you'd think. In recent weeks, one robotics company in particular, Knightscope, has been selling its autonomous "security guards" like hotcakes. Knightscope sells something called the K5 security bot: a 5-foot-tall, egg-shaped autonomous machine that comes tricked out with sensors and cameras and can travel at speeds of up to 3 mph. In Portland, Oregon, where the business district has been suffering a retail crime surge, some companies have hired the Knightscope bots to protect their stores; in Memphis, a hotel recently stuck one in its parking lot; and, in Cincinnati, the local police department seems to be mulling a Knightscope contract. These cities are lagging behind larger metropolises, like Los Angeles, where local authorities have been using the robots for years. In September, the NYPD announced it had procured a Knightscope security bot to patrol Manhattan's subway stations. It's a bit unclear whether it's caught any turnstile hoppers yet.
More headlines this week
LLMs may be pretty bad at doing paperwork. New research from the startup Patronus suggests that even the most advanced LLMs, like GPT-4 Turbo, are not particularly useful if you need to look through dense government filings, like Securities and Exchange Commission documents. Patronus researchers recently tested LLMs by asking them basic questions about specific SEC filings they had been fed. More often than not, the models would "refuse to answer, or would 'hallucinate' figures and facts that weren't in the SEC filings," CNBC reports. The report sorta throws cold water on the premise that AI is a good replacement for corporate clerical workers.
A billionaire-backed think tank helped draft Biden's AI regulations. Politico reports that the RAND Corporation, the notorious defense community think tank that's been referred to as the "Pentagon's brain," has been overtaken by the "effective altruism" movement. Key figures at the think tank, including the CEO, are "well known effective altruists," the outlet writes. Worse still, RAND seems to have played a key role in writing President Biden's executive order on AI earlier this year. Politico says that RAND recently received over $15 million in discretionary grants from Open Philanthropy, a group co-founded by billionaire Facebook co-founder Dustin Moskovitz and his wife, Cari Tuna, that is heavily associated with effective altruist causes. The policy provisions RAND included in Biden's EO "closely" resemble the "policy priorities pursued by Open Philanthropy," Politico writes.
Amazon's use of AI to summarize product reviews is pissing off sellers. Earlier this year, Amazon launched a Rotten Tomatoes-style feature that uses AI to summarize product reviews. Now, Bloomberg reports that the tool is causing trouble for merchants. Complaints are circulating that the AI summaries are frequently wrong or will randomly highlight negative product attributes. In one case, the AI tool described a massage table as a "desk." In another, it accused a tennis ball brand of being smelly even though only seven of the product's 4,300 reviews mentioned an odor. In short: Amazon's AI tool seems to be getting pretty mixed reviews.