29 Mar 2023 | Jack Benjamin

How digital publishers are fighting for more control over their IP

Feature

Publishers have had potential revenues from their IP stolen by tech vendors, representatives allege, and the problem may only become bigger with the rise of AI, writes Jack Benjamin.

In an open letter published on Monday, Richard Reeves, the managing director of the Association of Online Publishers (AOP), accused a number of unnamed tech vendors of intellectual property (IP) theft.

The letter, which Reeves penned on behalf of digital publishers, described how online publishers have been taken advantage of by “unscrupulous” intermediaries that are extracting and selling increasingly valuable first-party data to advertisers and agencies.

Speaking to The Media Leader, Reeves elaborated on the AOP’s letter, explaining that the broad issue facing publishers is the need to reassert control over their intellectual property.

‘A huge sum of money’

In the letter, Reeves writes that “as fading third-party cookie use has fuelled demand for alternative targeting methods such as contextual advertising, the tactics of some verification technologies are extending beyond their original — and legitimate — purpose,” adding that some vendors are packing extra unseen tags or running bots known as “crawlers” to collect, package, and sell publisher data.

The AOP’s call to action urges advertisers and agencies to boycott any such vendor that cannot verify it received consent from publishers to sell the data it is offering.

Both the timing of the letter’s release and the severity of its language were well considered. The reason for the letter, says Reeves, is that the publishing industry had failed to make its concerns explicit up to this point, despite the issue going on for years.

“We really felt that it was necessary to socialize this issue more widely,” says Reeves, who hopes to take a “conciliatory and pragmatic” approach to negotiating with vendors on respecting publisher rights. “You could call me a naïve optimist, but I hope common sense will prevail. We are not out there looking for confrontation. It’s just good common sense; good practice. But it’s not something that we will necessarily achieve the ends that we desire on our own.”

With the deprecation of the third-party cookie, publishers will lose a considerable amount of potential revenue if their data is taken and sold off by intermediaries without express consent. Danny Spears, chief operating officer at publisher-backed online ads sales house Ozone, describes the scale of the exploitation as large, especially given that many publishers are struggling amid inflationary pressures and declining adspend.

“It’s a huge sum of money,” Spears says. “Whether you consider it a black market or a grey market, the folks who are active in that space don’t report it in a way which is easy to pull apart.”

For publishers that are able to sell their first-party data to advertisers and seek to do so, intermediaries doing the same dilute the revenue a publisher can earn by introducing a competitor that can undercut the asking price.

“[Premium publishers] rightly believe that they are luxury brands, and that should mean they can command a higher price,” Spears argues. “And they should be able to control the environments in which they’re transacted. But actually at the moment, their products, their data has been abstracted by third parties without permission, and is now being sold on a grey market, which dilutes economics.

“Publishers are a key source of value in the advertising supply chain,” Spears continues. “To this point in programmatic, the way that the market has been structured—and I would say, rigged—is such that value in data specifically is leaked out away from the publisher, which means that the value is collected by others.”

‘Provenance and legitimacy is just not on their consideration’

Neither Reeves nor Spears is willing to name any specific vendors publishers have taken issue with. When prompted, Reeves hints that there “is one player in the market where we have spent a disproportionate amount of time [with] because of [their market] dominance.” Spears, meanwhile, tells The Media Leader that there are a number of publicly listed vendors, as well as dozens of small start-ups, that he is aware are contributing to the problem.

“They’re very clear that their individual [unique selling point] is about the frequency and the rigour with which they mine publisher data,” Spears says of his conversations with such vendors. “That’s their point of difference, and they’ll very proudly talk to you about how they do it better than anyone else. No sense of irony. Provenance and legitimacy of their data is just not on their consideration whatsoever.”

When approached by The Media Leader, market-leading tech vendors such as DoubleVerify and Integral Ad Science declined to comment.

Spears goes so far as to imply that large, publicly traded vendors may not be transparent with investors about their business models’ reliance on “stolen” data. “Where they’re monetizing, their share price will be in some way propped up by the fact that they’re using assets they don’t have the right to use,” he says, adding that advertisers that may have otherwise been ignorant of the problem would be “quite shocked if they knew they were investing in that product.”

Spears is grateful that there are now advertisers that are beginning to take notice, though it’s unclear whether the AOP’s letter will sufficiently move them to act on publishers’ behalf.

While Reeves expresses hope that publishers’ concerns would be ameliorated in a self-regulatory fashion, the AOP has nevertheless spoken to legal counsels within publishing organisations, and will consider more radical steps in the future if needed.

Spears says that if attempts at self-regulation don’t work and the problem is not resolved, “publishers will need to be looking at what their other routes potentially look like; more direct lines to deal with the economic damage that is being caused to them.” He suggests that, though he does not speak for any individual publisher himself, legal recourse should be considered by individual publishers should progress not be made in the near term.

Exercising control over AI

The publishing bodies’ complaints over theft are echoed in ongoing anxieties over how advanced AI chatbots may encroach upon publishers’ IP.

Last week, OpenAI, the company behind popular chatbot ChatGPT, added support for plug-ins to the chatbot. Before then, ChatGPT could only draw on information posted online up until 2021; the change now allows ChatGPT to browse the open web, including live data.

The update could have radical implications for digital publishers. In the demo, a user asked ChatGPT to search how 2023 Oscar winners’ box office sales compared to recently released movies. The chatbot completed searches, “clicked on” and scanned articles from Variety and Box Office Mojo, and gave an accurate response (with citations) within 30 seconds.

Because the chatbot automates search, reading, and synthesis, users will no longer necessarily need to read the articles that produce the information the chatbot learns from. As such, outlets could lose out on readership and therefore precious advertising revenue across the digital space.

Publishers represented by the AOP have taken time to consider what posture they should take regarding the rapid development of chatbots. Many publishers are themselves experimenting with using them to cover the news. But Reeves expressed misgivings over the potential for their IP to be further taken advantage of, in ways similar to those employed by tech vendors.

“Does this amplify the need to consider some form of mechanism that really is a precursor to crawlers of any nature, AI included, being deployed?” he asks rhetorically. Reeves goes on to argue that publishers should be able to block chatbots from accessing their sites, or else they hand chatbots the ability to someday make that very content obsolete.

While not going so far as to recommend direct legal action yet, Reeves notes that the recent lawsuit brought by Getty Images against Stability AI, the company behind the Stable Diffusion AI image generator, provides a precedent for legal challenges against AI companies training their bots on content without publisher consent. In the US, publishers appear to be preparing for legal showdowns with Microsoft and Google, who have stakes in different chatbots, over the issue.

‘Enormous amount of wastage’

Reeves also points out that the sheer number of crawlers that are being deployed has a real impact on the digital advertising industry’s carbon footprint. Whether they are used for AI or for contextual audience data, he calls out the “enormous amount of wastage” caused by crawlers that are “unverified, unlicensed, or doing something that none of us particularly agrees is of any great value running amok on the web.”

It is not the first time publishers have brought up the complexity of the digital supply chain as a drag on carbon reduction efforts. At a November webinar hosted by The Media Leader and The Guardian, Julie Richards, director of sustainability & operational transformation at The Guardian, argued that the top priority for publishers looking to reduce their carbon footprint should be to simplify the supply chain.

“If we could cut out some of the intermediaries and shorten the relationship between the advertiser and the publisher, I think that would actually cut out a lot of excess and wasted emissions,” she said.

‘Journalism requires the nuance, empathy, and scrutiny of humans’

Echoing Reeves’ anxieties, Newsworks CEO Jo Allan tells The Media Leader of additional concerns for publishers and consumers alike. She warns: “AI scrapes content with no reference to its sources or recourse to pay for the content it uses and, in a world blighted by fake news, these AI tools could see a further explosion of the issue with questionable content spread quickly with no discernible way to identify fact from fiction.”

The ability of AI to spread fake news was well captured last week, as a fake, AI-generated photo of Pope Francis in a white puffer jacket made the rounds on social media platforms. The concern among many publishers is that very soon, much more dangerous disinformation could be spread just as widely and just as fast, further eroding the already low trust consumers have in the media.

Allan admits, however, that AI is certainly the “next exciting era of the internet”.

“There are clearly opportunities for AI tools to support journalists,” she adds, “but the craft of journalism has, and always will require the nuance, empathy, and scrutiny of humans.”

The highly technical nature of how AI works, as well as the alleged exploitation of publishers in the digital supply chain, only increases the challenge for publishers to drive understanding and engagement in addressing their concerns.

Says Spears: “With any issue, particularly one that is technologically complex, the risk is it hides in the shadows.”