On this page you can find all the documentation about the DeVoteD project. DeVoteD explores data about voter turnout and democracy indices in order to understand how much exercising the right to vote affects democracy indices. You can find all the scripts for our data on the project's GitHub page.
To carry out our research case, we collected data from different sources and re-used it to create our own dataset. We aimed to re-use datasets that are free of cognitive biases, prejudice and discrimination, and that are fair, reliable, legally valid, relevant, consistent and accurate. [ Coming soon ]
The datasets used to investigate the relationship between voting and democracy indices include the following data from the respective sources:
To manage the mash-up of different datasets with different licenses, we followed the Guidelines
for Open Data provided by the EU. In accordance with these guidelines, we aimed to make
our research data findable, accessible, interoperable and reusable (FAIR).
Findable: the first step in (re)using data is to find them. Metadata and data should be
easy to find for both humans and computers. Machine-readable metadata are essential for automatic
discovery of datasets and services, so this is an essential component of the FAIRification process.
Accessible: once users find the required data, they need to know how the data can be accessed, possibly including authentication and authorisation.
Interoperable: the data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
Reusable: the ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
The principles mentioned above cover three types of entities: data, metadata and infrastructure. Based on our analysis, our research data are almost fully compliant with the FAIR principles, with a few exceptions due to missing license specifications.
The data we collected for our research derive from different sources and are therefore subject to different types of license, when one is specified. Where the publisher specified a license, we found the Creative Commons CC0 1.0 public-domain dedication. It allows the user to freely use, share, modify, and distribute the material for any purpose without permission or attribution. It places the work in the public domain and does not allow the user to apply legal restrictions to others.
The original datasets used:
V-Dem Core: we couldn't find information about the license. The dataset is maintained
regularly on an annual basis. Since 2014, at least one new version has been released each year.
Party Facts: the licence used is CC0 1.0. The last version was published in
2023. However, as the news section of the documentation shows, the project updates the datasets regularly; the most recent update was made
on May 5, 2025.
IDEA Voter Turnout Database: we couldn't find information about the license. The
dataset is maintained regularly on an annual basis. Since 2014, at least one new version has been
released each year.
Manifesto Project: we couldn't find information about the licence or about the
version of the data. Leaving data without a license amounts to abandoning it, which means that
other parties could re-copyright it. However, the project has published explicit terms of use stating that redistribution of the data is forbidden unless
explicitly authorized in writing by the project.
If sharing is approved, the user must include all accompanying files, including the Terms of Use
document. The user must also clearly identify the data's provenance, properly cite the dataset in any
publications, notify the Manifesto Project of any published research using their data, and provide
them with a copy. In conclusion, the Manifesto Project's Terms of Use are much more restrictive than an
open license. For researchers, the data are usable and valuable, but not "open" in the legal sense used by the open
science and open data communities. Maintenance is regular, as declared by the organization itself and
shown by the annual release of new, updated and corrected versions of the dataset.
While the IDEA Voter Turnout Database provides its data exclusively in XLSX, all the other data are provided in a variety of formats, including open ones. V-Dem and the Manifesto Project provide data in CSV, Stata, R and SPSS formats, letting users choose among different tools to analyse the data. The Manifesto Project also provides data in XLSX. Party Facts provides data in the open TAB format, i.e. as tab-separated values. The legal situation of the Manifesto Project dataset affected our research to some extent: the data were only available behind a login.
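Since the open distributions mentioned above are delimited text (CSV and tab-separated values), a single reader can handle both. The following is a minimal sketch using Python's standard `csv` module; the column names and sample rows are invented for illustration and do not reflect the actual files of any provider.

```python
import csv
import io


def read_delimited(text, delimiter):
    """Parse delimited text with a header row into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Hypothetical samples: real files have different columns and many more rows.
tab_sample = "country\tparty_name\nDEU\tExample Party\n"
csv_sample = "country,year,index_value\nDEU,2018,0.8\n"

tab_rows = read_delimited(tab_sample, delimiter="\t")
csv_rows = read_delimited(csv_sample, delimiter=",")

print(tab_rows[0]["party_name"])
print(csv_rows[0]["index_value"])
```

Note that `csv.DictReader` returns every value as a string, so numeric columns need an explicit conversion before analysis.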
Given that all the datasets used, except for V-Dem, are exclusively publicly shared aggregate
datasets, in which the information collected was never about individuals but about institutions, the GDPR
does not apply.
The V-Dem project ensures the anonymity of its expert coders,
both to address current and potential future security concerns and to comply with legal requirements such as the
GDPR, by applying data encryption, access control and separation of the coded data from the personal data of the individual
coders (see the Methodology section of the FAQ).
The other projects did not take such measures because their data
were not gathered by individual coders but taken from publicly available official results.
How sustainable and bias-free are our data providers?
V-Dem: continually reviews its methodology, and occasionally adjusts it, with the goal of improving quality. V-Dem recruits its experts rigorously at a global level. Experts are usually academics or professionals with specialist and evidenced knowledge in one or more domains. Approximately two-thirds are nationals or residents of the country they provide information on. The quality and impartiality of the data are highly dependent on the Country Experts. Consequently, V-Dem pays a great deal of attention to their recruitment and uses the following selection criteria: validated expertise; local, in-depth knowledge; seriousness of purpose; impartiality; and diversity in professional background among the Experts. V-Dem does not reveal the identity of the Experts. V-Dem uses Bayesian Item-Response Theory (IRT) to convert the ordinal responses experts provide into continuous estimates of the concepts being measured. This makes it possible to estimate the traits of the concepts, and it allows for the possibility that experts have different thresholds for their ratings. The method also lets expert reliability vary idiosyncratically, accounting for the concern that not all experts are equally expert on all concepts and cases.
Party Facts: aggregates no individual-level data and does not have a coder network. It also lacks documented ethics or bias-control protocols. Data reliability depends on contributor reports and mapping validation processes.
IDEA Voter Turnout Database: uses official electoral data. There is no mention of data collector protection or coder ethics. There is a potential bias issue: risks stem from reliance on potentially falsified official data, and no mitigation strategy is described.
Manifesto Project: is transparent about its coding methodology, though without explicit mention of binding coder ethics or diversity protocols.
The DeVoteD catalogue and dataset were created within a course at the University of Bologna and are not actively maintained, while the datasets used for this catalogue are maintained by their respective institutions. However, our scripts remain available and can be rerun at any time on new files. If you notice that one of our input files is available in a newer version, we would be glad to hear about it so that we can update our file with the automated script. Our scripts are licensed under CC 4.0. We invite the community to update our files and contribute the updated files to our GitHub project. We will review the files and add them if they are correct.
On the Visualisation page the user can find some graphical representations intended to help them better understand the data we collected. There is a map visualisation which allows the user to select one of the economic and political aspects we analysed, and shows it both on the map and in the corresponding bar chart. By selecting a country on the map, the user can see the total value on the map, and the value year by year from 2008 to 2018 in the bar chart. In the Graphs section we made available a bubble chart representing the total number of displaced people per country between 2008 and 2018, and a graph that lets you visualise two different economic or political aspects at the same time, as a bar or line chart.
We used the DCAT Application Profile for data portals in Europe, version 3.0.1, to encode the metadata about all our data, including the original datasets and our own dataset Flucht. Click here to see the code on our GitHub project.
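To give an idea of what such a metadata record contains, here is a minimal sketch that builds a DCAT-style dataset description as JSON-LD using only Python's standard library. It is not the project's actual metadata and has not been validated against DCAT-AP 3.0.1; the helper name `describe_dataset`, the title, the description and the `example.org` URL are hypothetical placeholders.

```python
import json


def describe_dataset(title, description, access_url, license_uri, media_type_uri):
    """Build a small JSON-LD record using DCAT and Dublin Core terms."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:accessURL": {"@id": access_url},
                "dct:license": {"@id": license_uri},
                "dcat:mediaType": {"@id": media_type_uri},
            }
        ],
    }


# Hypothetical example values; only the CC0 and IANA URIs are real identifiers.
record = describe_dataset(
    title="Example source dataset",
    description="Placeholder description of one input dataset.",
    access_url="https://example.org/data/example.tab",
    license_uri="https://creativecommons.org/publicdomain/zero/1.0/",
    media_type_uri="https://www.iana.org/assignments/media-types/text/tab-separated-values",
)

print(json.dumps(record, indent=2))
```

Recording the license on each distribution makes the missing-license cases discussed above explicit: a distribution without a `dct:license` entry is immediately visible in the metadata.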
The DeVoteD project successfully investigated the relationship between voter turnout and democracy indices through the integration of four major political datasets spanning 2008-2018. While the project achieved its technical objectives of creating a FAIR-compliant dataset, the empirical findings challenge conventional assumptions about the relationship between electoral participation and democratic quality.
The project demonstrates exemplary data integration practices by successfully combining:
V-Dem Core Dataset: High-level democracy indices with sophisticated uncertainty measures
Party Facts: Comprehensive party system mappings across countries
Manifesto Project: Political party positioning data
IDEA Voter Turnout Database: Global electoral participation statistics
The integration process followed FAIR principles rigorously, achieving near-complete compliance with findability, accessibility, interoperability, and reusability standards. The project's transparent documentation and version control represent best practices in reproducible research.
Despite methodological rigor, several quality issues emerged:
Licensing Inconsistencies:
The most significant limitation involves incompatible licensing frameworks. While Party Facts uses CC0 1.0, the Manifesto Project employs restrictive terms prohibiting redistribution without explicit authorization. V-Dem and IDEA databases lack clear licensing information, creating legal uncertainty for data reuse.
Coverage Biases:
All source datasets exhibit Western-centric biases in both geographical coverage and analytical frameworks. This systematically underrepresents non-liberal democratic experiences and may not capture diverse forms of democratic governance prevalent in non-Western contexts.
Temporal Limitations:
The 2008-2018 timeframe, while substantial, misses recent democratic developments including the rise of populist movements and democratic backsliding phenomena that became prominent after 2018.
The correlation analysis reveals a counterintuitive finding that challenges fundamental assumptions about democratic participation:
Key Correlation Values
Liberal Democracy ↔ Voter Turnout: r = 0.25
Electoral Democracy ↔ Voter Turnout: r = 0.22
Participatory Democracy ↔ Voter Turnout: r = 0.19
Deliberative Democracy ↔ Voter Turnout: r = 0.22
Egalitarian Democracy ↔ Voter Turnout: r = 0.26
Voter turnout shows only weak-to-moderate positive correlations with all democracy indices (r = 0.19-0.26), with the strongest relationship occurring with egalitarian democracy (r = 0.26). These correlations, while statistically significant, explain less than 7% of variance in democratic quality.
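The "less than 7% of variance" figure follows from squaring the reported coefficients. The sketch below reproduces that arithmetic and includes a plain Pearson correlation function for reference; the function names are our own for illustration, and the project's scripts may compute the correlations differently.

```python
import math

# Correlations with voter turnout, as reported above.
reported_r = {
    "liberal": 0.25,
    "electoral": 0.22,
    "participatory": 0.19,
    "deliberative": 0.22,
    "egalitarian": 0.26,
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def variance_explained(r):
    """Share of variance explained in a simple linear relationship (r squared)."""
    return r * r


shares = {name: variance_explained(r) for name, r in reported_r.items()}
for name, share in shares.items():
    print(f"{name:>14}: r = {reported_r[name]:.2f}, r^2 = {share:.4f}")
```

Even the largest coefficient, r = 0.26 for egalitarian democracy, gives r² = 0.0676, which is where the upper bound of 7% comes from.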
Implications:
This suggests that higher voter turnout does not necessarily indicate stronger democracy. Democratic quality appears to depend more heavily on institutional factors, governance structures, and systemic features than on the mere quantity of electoral participation.
All democracy indices show consistent moderate negative correlations with the Herfindahl-Hirschman Index (r ≈ -0.54 to -0.55), confirming that democratic quality increases with political competition and decreases with party system concentration.
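For readers unfamiliar with the Herfindahl-Hirschman Index, it is the sum of squared shares: values near 1 indicate a party system dominated by one party, lower values indicate more competition. The sketch below assumes vote shares given as fractions and rescales them to sum to 1; the source does not state which scale or normalisation the project used, and the sample shares are invented.

```python
def herfindahl_hirschman_index(vote_shares):
    """Sum of squared shares, after rescaling the shares to sum to 1."""
    if any(share < 0 for share in vote_shares):
        raise ValueError("vote shares must be non-negative")
    total = sum(vote_shares)
    if total <= 0:
        raise ValueError("at least one positive vote share is required")
    return sum((share / total) ** 2 for share in vote_shares)


# Hypothetical party systems, not taken from the project's data.
fragmented = [0.25, 0.25, 0.25, 0.25]  # four equally sized parties
dominant = [0.85, 0.10, 0.05]          # one party far ahead of the rest

hhi_fragmented = herfindahl_hirschman_index(fragmented)
hhi_dominant = herfindahl_hirschman_index(dominant)

print(f"fragmented: {hhi_fragmented:.4f}")
print(f"dominant:   {hhi_dominant:.4f}")
```

Rescaling is a design choice with a cost: if the shares of some parties are missing, the remaining shares are inflated and the index overstates concentration.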
The weak turnout-democracy correlation suggests that participation quality matters more than participation quantity. High turnout in systems with limited competition, restricted media freedom, or weak institutional constraints may not enhance democratic governance. This finding has several implications:
Policy Focus: Electoral reforms should prioritize institutional quality over turnout maximization
Measurement: Democratic assessments should weight institutional factors more heavily than participation rates
Intervention Design: Democracy promotion efforts should address systemic constraints before focusing on voter mobilization
The strong correlations between democratic dimensions suggest that piecemeal democratic reforms may be insufficient. Countries seeking democratic improvement may need comprehensive approaches addressing multiple institutional domains simultaneously rather than focusing on single aspects like electoral procedures or civil liberties in isolation.
Aggregation Effects: Country-level analysis may obscure within-country variation in democratic experience across regions, social groups, or time periods.
Causal Inference: Correlation analysis cannot establish causal relationships between turnout and democracy. Future research should employ longitudinal designs or natural experiments to identify causal mechanisms.
Measurement Validity: Expert-based democracy ratings, while sophisticated, may not capture citizen experiences of democratic governance or alternative democratic traditions.
Further Data Gathering: We ended up working with insufficient per-party vote share data, which may have caused miscalculations of the concentration indices. Future data gathering should fill in the missing data.
Disaggregated Analysis: Examine subnational variation in turnout-democracy relationships
Temporal Dynamics: Investigate how turnout-democracy relationships evolve during democratic transitions
Qualitative Integration: Combine quantitative measures with qualitative assessments of democratic experience
Contemporary Update: Extend analysis to post-2018 period to capture recent democratic developments
The project's GitHub repository, automated scripts, and DCAT-AP compliant metadata provide a sustainable foundation for future research. The modular approach allows researchers to update individual datasets while maintaining overall coherence.
By identifying licensing obstacles and documenting integration challenges, the project contributes to broader conversations about open science infrastructure in political research. The detailed legal and ethical analysis provides a template for similar projects.
Based on integration experience, the project recommends that major political datasets:
1. Adopt standardized open licenses (preferably CC0 or CC BY)
2. Provide machine-readable metadata following DCAT standards
3. Implement regular update cycles with clear versioning
4. Document methodology and bias mitigation strategies transparently
The DeVoteD project successfully demonstrates that rigorous data integration can yield important insights into fundamental questions about democracy and participation.
The counterintuitive finding that voter turnout correlates only weakly with democratic quality challenges conventional wisdom and suggests new directions for both research and policy.
While technical execution was exemplary, the project highlights persistent challenges in political data infrastructure, particularly regarding licensing compatibility and bias mitigation. These findings emphasize the need for coordinated efforts to improve data sharing standards in political science research.
The results suggest that democracy is not simply about getting more people to vote, but about creating institutional conditions where citizen participation can meaningfully influence governance outcomes. This insight has profound implications for how we measure, understand, and promote democratic governance in an era of global democratic challenges.