CopyCop+: A Primer in GenAI Disinformation and DNS Intelligence for CTI Analysts

CopyCop refers to two recent Cyber Threat Intelligence (CTI) reports from Recorded Future on GenAI (Generative AI) enabled Russian Influence Operations. The CopyCop campaigns are just the latest examples of ‘narrative laundering’ operations targeting the U.S. and Western democracies. An interesting aspect of these CopyCop campaigns is that they provide a rich set of domain name indicators and adversary tactics, techniques, and procedures (TTPs), which CTI analysts can add to their intelligence collections and use to plan countermeasures and develop proactive defenses against future campaigns.

In this post we will:

  • summarize CopyCop and related campaigns
  • demonstrate an intelligence collection we built from CopyCop domain indicators using the DomainTools Iris Internet Intelligence Platform to show how CTI analysts can use DNS Intelligence and GenAI to model adversary TTP and develop proactive defenses 
  • demonstrate two examples of adversarial GenAI prompts
  • provide an outlook on GenAI trust attitudes, Google’s new ‘high-fidelity grounding initiative’ for managing hallucinations, and new research on multi-agent prompting

Bottom line: DNS is a common feature in most cyber-attacks. Cyber defenders need to collect DNS indicators of attack and use them both to block attacks and build proactive defenses. CTI analysts need to use GenAI to keep up with attackers, while enterprises and governments need to press domain registrars and hosting providers to dismantle malicious domain infrastructure and continue to push for stronger regulations and enforcement. 

About CopyCop+

The term ‘CopyCop+’ extends beyond the specific campaigns identified in Recorded Future’s reports from June 24 and May 9, 2024. [1-3] It also encompasses related research on Russian Influence Operations, most notably NewsGuard reporting from June 18 and 26, [4, 5] and Clemson University research from December 2023. [7] The primary threat actor in all these reports is the same: John Mark Dougan, an American defector to Russia and former US Marine and police officer. For a more comprehensive listing of prior research, please refer to our posts from December 2023, February 2024, and March 2024.

Recorded Future describes CopyCop as a Russia-linked influence network that poses as US local news or political websites. It even maintains an editorial staff of over 1,000 fake personas. While some of the content is human crafted, much of it is plagiarized from Russian propaganda sites and influential sites known to promote disinformation, or generated directly by large language models (LLMs). As of March 2024, CopyCop had published over 19,000 AI-generated posts.

NewsGuard has identified more than 167 websites used in these campaigns and published the list for its subscribers. Of particular interest, NewsGuard conducted an audit of 10 leading LLMs using more than 500 prompts designed to test the LLMs’ propensity to disseminate disinformation based on 19 well-documented false claims. NewsGuard found that the leading LLMs spread false claims 32% of the time, with some LLMs spreading disinformation most of the time. This suggests that the guardrails designed to protect us from adversarial AI are inadequate.

Recorded Future has provided over 100 website domain names used in these campaigns which are now targeting the 2024 US elections. We explore these domains and their implications in greater detail in the next sections.

DNS Intelligence and DomainTools

About DNS Threat Intelligence: DNS service providers like DomainTools, Infoblox, and Palo Alto Networks have long used machine learning (ML) to classify suspicious domains and track the supporting infrastructure from initial domain registration through weaponization. DNS Intelligence helps defenders monitor that infrastructure as it evolves. Palo Alto Networks Unit 42 reports that 85% of malware exploits DNS for malicious activity. [9] Infoblox notes that while adversaries may continuously morph their malware to evade detection, the DNS infrastructure (domain names, domain registrars, hosting providers) is harder to change and therefore serves as a stable means of tracking adversaries over time. Infoblox reports blocking newly registered suspicious domains on average 63 days before they are first deployed in malicious campaigns. [10, 11]
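To make the “block new domains early” idea concrete, here is a minimal sketch of a domain-age check, assuming the python-whois package as the WHOIS source; the 63-day threshold simply mirrors the Infoblox figure above and is illustrative only.

```python
from datetime import datetime

import whois  # pip install python-whois (an assumed WHOIS source; any works)

MAX_AGE_DAYS = 63  # mirrors the Infoblox lead-time figure above; illustrative

def is_newly_registered(domain: str, max_age_days: int = MAX_AGE_DAYS) -> bool:
    """Return True if the domain was registered within the last max_age_days."""
    record = whois.whois(domain)
    created = record.creation_date
    if isinstance(created, list):   # some registrars return multiple dates
        created = min(created)
    if created is None:
        return False                # no registration data; treat as unknown
    return (datetime.now() - created).days <= max_age_days
```

In practice a protective DNS service applies this kind of check, combined with reputation scoring, at resolution time; the sketch only shows the core heuristic.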

DomainTools: Currently, we manage over 100 intelligence collections in DomainTools Iris, representing nearly 14,000 domains. Of these, at least 12 collections contain more than 500 domains associated with disinformation network infrastructures.

We extracted the domain name indicators from the Recorded Future CopyCop reports and ingested them into DomainTools Iris. Figure 1 shows a partial record view of the 107 domains in this collection. As we are interested in understanding the domain infrastructure, the selected view shows 11 columns out of the hundreds an analyst may select. These columns represent key features of the collection: domain name, tag name, domain age (first seen), risk score, registrar, status, domain create date, IP information, ISP information, ASN (Autonomous System Number), and country code.

Figure 1. DomainTools Iris

For a small data set like this (107 domains), an analyst can gain valuable insights into adversary infrastructure and TTPs just by exploring these domain infrastructure features (a scripted version of this profiling appears after the list). For example:

  • Domain Age: 47% are new; first seen in the last 60 days (since 1-May-2024)
  • Domain Registrars: 91% of the domains were registered through NameCheap
  • ISP: 97% of the domains are hosted by Cloudflare
  • Country: 93% are located in the US
  • TLDs (Top level domains): The actors have a strong preference for traditional top-level-domains, with 77% using .com or .org domains
  • Risk Score: As of July 2, 2024, 91 domains (85%) had the highest possible risk score of 100—a score that signifies definitive risk and comes from trusted industry blocklists.
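For readers who prefer to reproduce these percentages programmatically, the following sketch profiles a CSV export of the collection with pandas. The file and column names (copycop_domains.csv, first_seen, registrar, isp, risk_score) are hypothetical placeholders for whatever an Iris export actually produces.

```python
import pandas as pd

# Hypothetical CSV export of the Iris collection; column names are placeholders.
df = pd.read_csv("copycop_domains.csv")

cutoff = pd.Timestamp("2024-05-01")
print(f"New since 1-May-2024: {(pd.to_datetime(df['first_seen']) >= cutoff).mean():.0%}")
print("Top registrar:", df["registrar"].value_counts(normalize=True).head(1))
print("Top ISP:", df["isp"].value_counts(normalize=True).head(1))
tlds = df["domain"].str.rsplit(".", n=1).str[-1]
print(f".com/.org share: {tlds.isin(['com', 'org']).mean():.0%}")
print(f"Risk score 100: {(df['risk_score'] == 100).mean():.0%}")
```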

Interestingly, only three of the 107 domains were tagged in other disinformation campaigns, suggesting that the CopyCop infrastructure was purpose-built for this campaign.

If the analyst wants to drill down on a particular domain, they can pull up additional detail such as a screenshot, as shown for the domain ‘lansingtribune[.]org’ in Figure 2 (warning: these are live disinformation sites; do not visit them directly). At the bottom of the screen, note that the site was built using ‘Ascendor’, a specialized WordPress theme designed for ‘Authentic News’.

Figure 2. DomainTools screenshot Lansingtribune[.]org as of 26-June-2024
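The theme is itself a pivotable artifact. Below is a hedged sketch of fingerprinting a WordPress theme from page HTML; in practice an analyst would run this against cached HTML or through a sandboxed fetcher rather than touching adversary infrastructure directly.

```python
import re

import requests

# WordPress pages typically reference assets under /wp-content/themes/<slug>/
THEME_RE = re.compile(r"/wp-content/themes/([A-Za-z0-9_-]+)/")

def wordpress_theme(url: str) -> str | None:
    """Return the first WordPress theme slug referenced in a page's HTML."""
    html = requests.get(url, timeout=10).text  # use a sandboxed fetcher in practice
    match = THEME_RE.search(html)
    return match.group(1) if match else None
```

Domains in a collection sharing an uncommon theme slug are candidates for sharing an operator.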

The insights described above illustrate powerful DomainTools capabilities: the patterns derived from this CopyCop adversary model can be turned into new queries for proactively hunting domains that match them.
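As a sketch under the assumptions above (registrar, ISP, TLD, and age thresholds taken from this one collection), such a hunt could be expressed as a simple filter over incoming domain records:

```python
from datetime import date

def matches_copycop_profile(record: dict) -> bool:
    """Check a domain record against the registrar/ISP/TLD/age pattern above.

    Assumes record fields mirroring the Iris columns, with first_seen as a
    datetime.date; thresholds are illustrative, taken from this one collection.
    """
    return (
        record["registrar"] == "NameCheap"
        and record["isp"] == "Cloudflare"
        and record["domain"].endswith((".com", ".org"))
        and (date.today() - record["first_seen"]).days <= 60
    )

# new_domain_feed is a hypothetical stream of freshly registered domain records:
# candidates = [r for r in new_domain_feed if matches_copycop_profile(r)]
```

A real hunt would score these features rather than require all of them, since NameCheap and Cloudflare also host vast amounts of legitimate infrastructure.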

Augmenting DNS and CTI with GenAI: Generative AI offers the analyst the ability to discover more from their collections. For example, a senior analyst might want to quickly identify morphological or semantic patterns in domain names suggesting associations with geographical locations, media types, or political ideologies. This type of exploration is a great opportunity for an analyst to enlist the support of an LLM helper. Figure 3 shows a prompt we submitted to the LLMs in our suite, and Figure 4 shows one of the replies.

Figure 3. CopyCop Domain Semantics – Place Names (partial)
Figure 4. Perplexity.ai Results for CopyCop Domain Semantics – Place Names
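For analysts who want to script this pattern rather than work through a chat interface, here is a minimal sketch using the Anthropic Python SDK; the model name and prompt wording are illustrative.

```python
import anthropic  # pip install anthropic; assumes ANTHROPIC_API_KEY in the environment

client = anthropic.Anthropic()

domains = ["lansingtribune[.]org"]  # defanged domains from the collection; extend as needed
prompt = (
    "You are assisting a CTI analyst. For each (defanged) domain name below, identify "
    "semantic patterns: US place names, media-brand words (tribune, chronicle, times), "
    "or political framing. Domains:\n" + "\n".join(domains)
)

reply = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # model name current as of this writing
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
)
print(reply.content[0].text)
```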

While these examples highlight powerful potential for the CTI analyst, there are also limitations. In our example, we gave two prompts requesting semantic analysis of place names and media names to four of the chatbots in our suite, as shown in Table 1. Note the differences in the four chatbots’ responses to the same prompt. This is a reminder that CTI analysts need to understand how their LLMs work, their strengths and limitations, and their task suitability.

Table 1. Comparative Chatbot Performance on Semantic Analysis Prompts

Adversarial AI Use-cases

In this section we consider two Adversarial AI use-cases to demonstrate the dual threats of disinformation and economic manipulation.

Use-case 1 – NewsGuard Disinformation: In Figure 6 we see a partial view of a prompt and three responses from a NewsGuard report on the tendency of the Top-10 GenAI models to mimic and uncritically spread Russian disinformation. [5] In this audit test, the prompt is a false narrative that originated on the Russian disinformation site “SanFranChron” in an article entitled “Chilling Leaks Expose Alleged CIA Plot by Ukrainian Troll Cabal to Subvert 2024 U.S. Election”. NewsGuard researchers found that the original “whistleblower testimony” from “Olesya Movchan” appears to be an AI-generated persona. As reported by NewsGuard, “eight of the 10 chatbots replied with full-fledged local news articles advancing the claim. The responses included specific details pertaining to the false narrative that were not mentioned in the original prompt, mimicking, and in some cases, linking to the original article on the ‘SanFranChron.’”

Figure 6. NewsGuard Survey on Top-10 LLMs Tendency to Mimic Russian Disinformation (partial)
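A NewsGuard-style audit can be approximated in code. The sketch below flags replies that repeat identifying details of the planted story; query_chatbot(), disinformation_prompt, and the marker strings are hypothetical stand-ins for a real harness and a real scoring rubric.

```python
# Details planted in the false-narrative prompt, taken from the case above.
FALSE_NARRATIVE_MARKERS = ["Olesya Movchan", "SanFranChron"]

def advances_false_claim(reply: str) -> bool:
    """Crude check: does a reply repeat identifying details of the planted story?"""
    return any(marker.lower() in reply.lower() for marker in FALSE_NARRATIVE_MARKERS)

# query_chatbot() and disinformation_prompt are hypothetical harness stand-ins:
# results = {bot: advances_false_claim(query_chatbot(bot, disinformation_prompt))
#            for bot in ["gpt-4o", "claude", "perplexity"]}
```

NewsGuard's actual audits use human review of each response; keyword matching like this only catches the most literal forms of mimicry.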

Use-case 2 – Hypothetical Stock Manipulation: We created a fictional scenario involving a reporter from the Atlanta Observer, a known disinformation domain on the CopyCop list with a DomainTools risk score of 100. The prompt contained a number of questionable assertions: it claimed the Atlanta Observer was an authoritative source when in fact it is sketchy, gave an incorrect headquarters location for a real company (AEye Inc., stock symbol LIDR), and offered a fabricated ‘scoop’ (an expected strategic licensing deal between AEye and Goldman Sachs).

Our prompt with known falsehoods and two replies from our chatbot suite are shown in Figure 7. Perplexity.ai gave a responsible reply: it attempted to validate the claims, and when it could not, it explained why, perhaps because its guardrails recognized the prompt as a malign attempt to manipulate a market. The second reply came from Anthropic’s Claude and was similar to the replies from GPT-4 and GPT-4o. In contrast to Perplexity, Claude accepted the fictional narrative in the prompt, even embellishing it with a made-up quote from AEye CEO Blair LaCorte, who had actually left the company in December 2022.

Figure 7. Cybercrank market manipulation prompt with false claims, with Perplexity.ai and Claude replies, 29-June-2024

Implications for AI Safety and Regulation: These scenarios demonstrate the critical need for advanced guardrails in AI systems to prevent their misuse in spreading false information and manipulating markets, as well as the need for more rigorous oversight and proactive measures to ensure AI reliability and ethical use.

Outlook on Trends in Generative AI

Here are three issues that we are watching. 

Trust remains a complex and important issue. According to a June 2023 Ipsos poll, a significant majority of Americans—83%—express distrust towards the companies developing AI technologies. Additionally, 72% are skeptical about the government’s ability to regulate these technologies effectively. When it comes to accountability for the misinformation generated by AI, the public is divided, with 28% believing the government should be responsible and 26% looking towards companies and brands. [12] IDC provides a detailed analysis on how companies are responding to these trust issues and what governance measures they are implementing. [13]

Google Announcements on Data and Model Grounding: Google’s announcement of model and data grounding partnerships with influential third-party data and service providers (Moody’s, MSCI, Thomson Reuters, and ZoomInfo) is a positive attempt to reduce hallucinations and win enterprise customers. This initiative, slated to be integrated into Vertex AI by Q3 2024, could set a precedent for further collaborations between data providers and AI developers. [14]

Advancements in Mixture of Agents Research: In our last post we described our Federated Prompt Management concept and prototype, which harnesses the power of multiple LLMs in an attempt to mitigate hallucinations. We see the new research paper ‘Mixture-of-Agents Enhances Large Language Model Capabilities’, which suggests that a collaborative agent framework can significantly enhance the capabilities and reliability of large language models, as a promising and validating sign. [15]
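In spirit, the Mixture-of-Agents approach looks like the following sketch: several ‘proposer’ models answer independently and an aggregator model synthesizes their drafts. query_model() is a hypothetical wrapper around each provider’s API, and the paper’s actual architecture uses multiple layered aggregation rounds rather than the single pass shown here.

```python
PROPOSERS = ["gpt-4o", "claude-3-5-sonnet", "llama-3-70b"]  # illustrative model names
AGGREGATOR = "gpt-4o"

def mixture_of_agents(question: str) -> str:
    """Single-layer MoA: independent drafts, then one aggregation pass."""
    drafts = [query_model(m, question) for m in PROPOSERS]  # query_model() is hypothetical
    synthesis_prompt = (
        "Synthesize one accurate answer from the candidate responses below, "
        "discarding any claim the candidates do not agree on.\n\n"
        + "\n\n".join(f"Candidate {i + 1}: {d}" for i, d in enumerate(drafts))
        + f"\n\nQuestion: {question}"
    )
    return query_model(AGGREGATOR, synthesis_prompt)
```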

Editor’s Notes:

  • Credit to ChatGPT for valuable assistance with threat analysis research and proofreading. 
  • The Featured Image for this blog is a human-created mashup sparked by inspiration from The Economist.  

References

  1. Recorded Future: Insikt Group – Russia-Linked CopyCop Expands to Cover US Elections, Target Political Leaders, 24-June-2024
  2. Recorded Future: Insikt Group – Russia-Linked CopyCop Uses LLMs to Weaponize Influence Content at Scale, 9-May-2024
  3. The Economist – A Russia-linked network uses AI to rewrite real news stories, 10-May-2024
  4. NewsGuard – ‘Information Voids’ Cause AI Models to Spread Russian Disinformation, 26-June-2024
  5. NewsGuard – Top 10 Generative AI Models Mimic Russian Disinformation Claims A Third of the Time, Citing Moscow-Created Fake Local News Sites as Authoritative Sources, 18-June-2024
  6. Data & Society – Data Voids: Where Missing Data Can Easily Be Exploited, May 2018
  7. Linvill, Darren and Warren, Patrick – Infektion’s Evolution: Digital Technologies and Narrative Laundering, 15-Dec-2023
  8. NewsGuard – Bloom Social Analytics: Russia resurrects its NATO-Ukraine false narrative: NATO troops in coffins, 2-July-2024
  9. Palo Alto Networks – Stop Attackers from Using DNS Against You, 2-Jan-2023
  10. Infoblox – Break the Ransomware Supply Chain with DNS Threat Intel, 31-May-2024
  11. Infoblox – Enhancing Security Using Microsoft Zero Trust DNS and Protective DNS Service, 4-June-2024
  12. Ipsos – Few Americans trust the companies developing AI systems to do so responsibly, 27-July-2023
  13. IDC – GenAI and Trust: How Companies Are Thinking About the Trustworthiness of AI and GenAI Tools, 2024
  14. VentureBeat – Google partners with Thomson Reuters, Moody’s and more to give AI real-world data, 27-June-2024
  15. arXiv:2406.04692 [cs.CL] – Mixture-of-Agents Enhances Large Language Model Capabilities, 7-June-2024
