What Is a Knowledge Graph?

A knowledge graph is a structured representation of entities (people, organisations, locations, events) and the relationships between them, stored in a format that supports efficient traversal and querying. It surfaces connections that would be invisible in unstructured text. In journalism, knowledge graphs let investigators ask questions like "which individuals appear in both Dataset A and Dataset B?" or "which companies share directors with Company X?", questions that manual review of raw documents cannot answer at scale.
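To make the two example questions concrete, here is a minimal sketch in plain Python. It assumes the graph is held as a list of (person, relation, company) triples. The names, the `DIRECTOR_OF` relation label, and both helper functions are hypothetical and only illustrate the idea; a newsroom would typically run such queries inside a graph database instead.

```python
from collections import defaultdict

# Hypothetical sample data: (person, relation, company) triples.
EDGES = [
    ("Alice Moreau", "DIRECTOR_OF", "Company X"),
    ("Alice Moreau", "DIRECTOR_OF", "Harbour Holdings"),
    ("Ben Okafor", "DIRECTOR_OF", "Company X"),
    ("Ben Okafor", "DIRECTOR_OF", "Northgate Ltd"),
    ("Carla Ruiz", "DIRECTOR_OF", "Southbank Ltd"),
]


def companies_sharing_directors(edges, target):
    """Return {company: sorted list of directors it shares with target}."""
    directors_of = defaultdict(set)  # company -> set of directors
    for person, relation, company in edges:
        if relation == "DIRECTOR_OF":
            directors_of[company].add(person)

    target_directors = directors_of.get(target, set())
    shared = {}
    for company, directors in directors_of.items():
        if company == target:
            continue
        overlap = directors & target_directors
        if overlap:
            shared[company] = sorted(overlap)
    return shared


def in_both(dataset_a, dataset_b):
    """Return the names that appear in both datasets, sorted."""
    return sorted(set(dataset_a) & set(dataset_b))


if __name__ == "__main__":
    print(companies_sharing_directors(EDGES, "Company X"))
    print(in_both(["Alice Moreau", "Carla Ruiz"], ["Carla Ruiz", "Dev Patel"]))
```

The sketch matches names by exact string equality. Real datasets usually need entity resolution first (spelling variants, transliterations, shared names), which is outside the scope of this example.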

Google's Knowledge Graph, which powers the information panels in Google search results, is the most widely known example. Newsroom knowledge graphs, by contrast, are purpose-built for investigative and intelligence use cases: they integrate news archive data, public records, corporate filings, and investigative datasets into a single queryable network.

Knowledge Graphs in Major Investigations

The International Consortium of Investigative Journalists (ICIJ) relied on Linkurious-powered graph visualisation in the Panama Papers and Paradise Papers investigations. It enabled analysts to map relationships between hundreds of thousands of offshore entities, directors, and beneficial owners; the Panama Papers leak alone comprised 11.5 million documents. Without graph-based analysis, investigations on this scale would have been impractical.

Neo4j, a commercial graph database, has become one of the most widely used tools for investigative journalism knowledge graphs. Its Cypher query language lets journalists express relationship queries naturally ("find all paths of length ≤ 3 between Person A and Corporation B"), and its visualisation tools make network analysis accessible to reporters who are not data scientists.
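The bounded path question quoted above can be illustrated without a database. In Cypher it might be written roughly as MATCH p = (a {name: 'Person A'})-[*..3]-(b {name: 'Corporation B'}) RETURN p, where the `name` property is an assumed schema. The sketch below is a plain-Python stand-in for that query, not Neo4j's implementation: it treats edges as undirected and enumerates simple paths of at most three hops. All entity names, relation labels, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample graph: (source, relation, target) triples.
EDGES = [
    ("Person A", "DIRECTOR_OF", "Shell Co 1"),
    ("Shell Co 1", "OWNS", "Corporation B"),
    ("Person A", "ASSOCIATE_OF", "Person C"),
    ("Person C", "DIRECTOR_OF", "Shell Co 2"),
    ("Shell Co 2", "OWNS", "Corporation B"),
    ("Person C", "SHAREHOLDER_OF", "Shell Co 3"),
    ("Shell Co 3", "OWNS", "Shell Co 2"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring relation types."""
    adjacency = defaultdict(set)
    for source, _relation, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def paths_up_to(adjacency, start, goal, max_hops=3):
    """Return all simple paths from start to goal using at most max_hops edges."""
    results = []
    stack = [[start]]  # each stack entry is a partial path
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            results.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # hop budget used up; do not extend further
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in path:  # keep paths simple (no revisits)
                stack.append(path + [neighbour])
    # Shortest paths first, ties broken alphabetically.
    return sorted(results, key=lambda p: (len(p), p))


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    for path in paths_up_to(adjacency, "Person A", "Corporation B", max_hops=3):
        print(" -> ".join(path))
```

Capping the hop count matters in practice: the number of candidate paths grows quickly with each additional hop, so an unbounded search over a large graph can become very slow.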

AI and Knowledge Graph Construction

The most time-consuming aspect of knowledge graph journalism has historically been entity extraction and relationship mapping from unstructured text, a process that required substantial manual annotation. Modern NLP tools and LLMs have dramatically lowered this barrier. Named entity recognition (NER) models can automatically extract people, organisations, locations, and events from large document corpora. LLMs can extract relationship assertions from text ("X served as director of Y from 2018 to 2022") and structure them into graph edges. Omniscient AI's newsroom intelligence layer uses entity extraction and relationship mapping to maintain a continuously updated knowledge graph of the news sources, journalists, organisations, and topics it monitors.
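As an illustration of how a relationship assertion becomes a graph edge, the sketch below turns sentences of the form "X served as director of Y from 2018 to 2022" into structured edge records. It uses a deliberately simple regular expression as a stand-in for an NER model or LLM, so it only handles this one sentence shape. The pattern, field names, and sample sentence are assumptions made for the example and do not describe any particular product's pipeline.

```python
import re

# Toy pattern for one sentence shape only; a real pipeline would use an
# NER model or an LLM rather than a hand-written regular expression.
PATTERN = re.compile(
    r"(?P<person>[A-Z][\w.'-]*(?: [A-Z][\w.'-]*)*) served as "
    r"(?P<role>[a-z ]+?) of "
    r"(?P<org>[A-Z][\w&.'-]*(?: [A-Z][\w&.'-]*)*) "
    r"from (?P<start>\d{4}) to (?P<end>\d{4})"
)


def extract_edges(text):
    """Return a list of edge dicts for every matching assertion in text."""
    edges = []
    for match in PATTERN.finditer(text):
        edges.append({
            "source": match["person"],
            # "chief executive" -> "CHIEF_EXECUTIVE_OF"
            "relation": match["role"].strip().upper().replace(" ", "_") + "_OF",
            "target": match["org"],
            "start_year": int(match["start"]),
            "end_year": int(match["end"]),
        })
    return edges


if __name__ == "__main__":
    sample = (
        "Jane Doe served as director of Acme Holdings from 2018 to 2022. "
        "The filing was submitted late."
    )
    for edge in extract_edges(sample):
        print(edge)
```

Keeping the start and end years on the edge preserves the time dimension of the assertion, which lets later queries ask who held a role at a given date. Extracted edges should still be checked against the source document before publication, since automated extraction can misread text.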