Data on the Russian invasion of Ukraine available in near real time
March 9, 2022
ANN ARBOR – To track and share data on events unfolding in Ukraine, Yuri Zhukov, Associate Professor of Political Science and Research Associate Professor at the Center for Political Studies, has launched VIINA: Violent Incident Information from News Articles on the 2022 Russian Invasion of Ukraine. VIINA is a near-real-time, multi-source event data system for the invasion.
“I wanted to make these data available immediately because media sites in both countries are already being shut down, due to either censorship (in Russia) or military operations (in Ukraine),” said Zhukov. “It is thus essential that researchers have access to information about the war, as reported across media organizations and other actors in the information space.” Because different media cover different types of events, VIINA’s multi-source approach is designed to capture a more accurate picture of events as they unfold.
The platform gives researchers access to data derived from Ukrainian and Russian news reports, each geocoded and classified into standard conflict event categories through machine learning.
VIINA is freely available for use by students, journalists, policymakers, and researchers. Using an automated web scraping routine that runs every 6 hours, VIINA extracts the text of news reports published by each source along with their associated metadata, including publication time, date, and web URLs. GIS-ready data can be downloaded from VIINA, with temporal precision down to the minute.
VIINA draws on news reports from a variety of Ukrainian and Russian news providers. Data sources currently include news wires, TV stations, newspapers, and online publications in both countries. Zhukov plans to expand these sources as the conflict unfolds to include OSINT social media feeds and other key sources. The set of sources may also change during the war, both because military operations, cyber attacks, and state censorship interrupt journalistic activity and because new data become available from other information providers.
ISR Communications