PrefLib | Datasets

"Voter Autrement" Online Experiments in French Presidential Elections

00073

Experiment Election

This dataset contains data from online voting experiments conducted in 2017 and 2022 during the French presidential elections. In these experiments, participants were asked to test alternative voting methods to elect the French president, more precisely approval voting, evaluative voting, Borda, IRV and majority judgement in 2022. The experiments took place online. The datasets for the individual years (which contain more information) can be found here, together with more datasets of voting experiments conducted during large-scale political elections.

Consists of 18 data files.

Details

Download [zip, 412.38 KB]

Voting Experiments on IRV and Borda in French Presidential Elections

00072

Election Politics

This dataset contains the data collected during two voting experiments. The first one was carried out in the city of Faches-Thumesnil, in northern France, during the first round of the presidential election in April 2007. Participants were invited to vote using the Instant Runoff Voting method (also called Hare's method). The second one was carried out in the city of Fort-de-France in Martinique during the first round of the 2017 presidential election. Participants were invited to vote using IRV and Borda-4 (ranking only 4 candidates). This dataset and more datasets collected during voting experiments can be found here.

Consists of 3 data files.

Details

Download [zip, 11.94 KB]

"Voter Autrement" In Situ Experiments in French Presidential Elections

00071

Experiment Election

This dataset contains data from voting experiments conducted in 2007, 2012 and 2017 during the French presidential elections in different cities. In these experiments, participants were asked to test alternative voting methods to elect the French president, more precisely approval voting and evaluative voting (with scales depending on the year and the city). The experiments took place in situ in polling stations during the first round of the presidential election (using paper ballots). The datasets for the individual years (which contain more information) can be found here, together with more datasets of voting experiments conducted during large-scale political elections.

Consists of 33 data files.

Details

Download [zip, 169.41 KB]

Opinions of the Supreme Court of the US

00075

Politics

The dataset is based on the opinions authored and joined by the justices between 1946 and 2021, derived from the Supreme Court Database. The Court consists of 9 justices who vote on each case about which of the two parties to the case wins. The Court then publishes a majority opinion explaining the Court’s reasoning. Justices can also submit concurring opinions and dissenting opinions, and join any of the opinions submitted by others. Concurring opinions explain additional or alternative reasons, written by justices who voted with the majority. Dissenting opinions explain why a justice did not vote with the majority. Note that there might be several dissenting and concurring opinions. Each opinion, concurrence, or dissent becomes a ballot “approving” the justices that joined in it. The data was converted to the appropriate format by Théo Delemazure.
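The conversion from opinions to approval ballots described above might look roughly like the following sketch. The `Opinion` class, the placeholder justice names, and the choice to count an opinion's author among the approved justices are all assumptions made for illustration; the actual conversion script is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    # Hypothetical record for one opinion in a case.
    kind: str                    # "majority", "concurring", or "dissenting"
    author: str
    joined_by: tuple[str, ...]   # justices who joined the opinion


def opinion_to_ballot(opinion: Opinion) -> frozenset[str]:
    """Return the set of justices 'approved' by this opinion.

    Assumption: the author is treated as having joined their own opinion.
    """
    return frozenset((opinion.author, *opinion.joined_by))


# One illustrative case with placeholder justice names J1..J9.
case = [
    Opinion("majority", "J1", ("J2", "J3", "J4", "J5")),
    Opinion("concurring", "J2", ("J3",)),
    Opinion("dissenting", "J6", ("J7", "J8", "J9")),
]

# Each opinion becomes one approval ballot over the justices.
ballots = [opinion_to_ballot(op) for op in case]
```

Under these assumptions, a single case yields one ballot per published opinion, so a justice can be approved on several ballots of the same case.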

Consists of 152 data files.

Details

Download [zip, 247.42 KB]

Pol.is

00069

Election

Polis is survey software that allows users to share their opinions and ideas on statements and comments regarding a given theme.

The data presented here is a reformatted version of the Polis data available on GitHub under a Creative Commons Attribution 4.0 International license. The data has been reformatted to fit the PrefLib categorical preferences format.

Each file describes a Polis poll. The alternatives correspond to the statements/comments from the polls. For each statement, a voter can express a positive opinion (corresponding to the "Approved" category), a negative opinion (corresponding to the "disapproved" category), or the absence of an opinion (the "Neutral/Skipped" category). Voters do not have to express an opinion, or absence thereof, for all statements/comments.

The data has been provided by Simon Rey, as part of the development of the Proportionality Press within the Fair online group decision making (FairOGD) project of Jan Maly.

Consists of 20 data files.

Details

Download [zip, 1.78 MB]

Vote Pluriel - Online Experiment during the 2012 French Presidential Election

00074

Experiment Election

This dataset contains data from the online experiment "Vote Pluriel" conducted during the 2012 French presidential election. In this experiment, participants were invited to vote using different voting methods: approval voting and instant-runoff voting (IRV). For IRV, participants had to rank at least 3 candidates. The experiment was conducted online, and participants were recruited through various channels, including social media and email lists. This dataset and more similar datasets collected during voting experiments can be found here.

Consists of 2 data files.

Details

Download [zip, 26.36 KB]

Habermas Machine Deliberation

00070

Election Deliberation

This dataset collects the votes of participants in mini-jury experiments. For each mini-jury, five participants had to give their opinion on a specific topic and an LLM was used to generate four statements aggregating the participants' opinions. Participants could then rank the different propositions, and Schulze's method was used to select the winning statement. The original data was collected by Google DeepMind, and was converted to the appropriate format by Théo Delemazure.
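For readers unfamiliar with Schulze's method, the following is a minimal sketch of it for complete strict rankings. The function name `schulze_winners`, the statement labels, and the five example rankings are made up for illustration; this is not the code used in the original experiments, and it does not handle ties or partial rankings within a ballot.

```python
from itertools import permutations


def schulze_winners(rankings, candidates):
    """Return the Schulze winner(s) for a list of complete strict rankings."""
    # d[a][b] = number of voters who rank a above b.
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ranking in rankings:
        pos = {c: i for i, c in enumerate(ranking)}
        for a, b in permutations(candidates, 2):
            if pos[a] < pos[b]:
                d[a][b] += 1

    # p[a][b] = strength of the strongest path from a to b.
    # A direct link a -> b exists only if a beats b head to head.
    p = {
        a: {b: (d[a][b] if d[a][b] > d[b][a] else 0) for b in candidates}
        for a in candidates
    }
    # Floyd-Warshall style widest-path computation.
    for k in candidates:
        for i in candidates:
            if i == k:
                continue
            for j in candidates:
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))

    # A winner's strongest path to every rival is at least as strong as the reverse.
    return [
        a for a in candidates
        if all(p[a][b] >= p[b][a] for b in candidates if b != a)
    ]


# Illustrative mini-jury: five participants rank four statements.
statements = ["S1", "S2", "S3", "S4"]
rankings = [
    ["S1", "S2", "S3", "S4"],
    ["S1", "S3", "S2", "S4"],
    ["S2", "S1", "S4", "S3"],
    ["S3", "S1", "S2", "S4"],
    ["S1", "S2", "S4", "S3"],
]
winners = schulze_winners(rankings, statements)
```

In this example S1 beats every other statement head to head, so it is the unique Schulze winner.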

Consists of 2710 data files.

Details

Download [zip, 3.14 MB]

Poland Local Elections

00068

Election Politics

This dataset collects voting data from recent Polish local elections. In 2014, in all cities with up to 100 000 inhabitants a first-past-the-post system was used. For this, all cities with up to 20 000/50 000/100 000 inhabitants were divided into 15/21/23 constituencies, respectively. The dataset consists of elections from 1317 cities (excluding ones with low vote length). In a file, each constituency is considered to be a voter, ranking the alternatives as in the election results of that constituency.
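As a rough illustration of how one constituency's result can be turned into a single "voter", the sketch below orders alternatives by vote count. The function name, the example counts, and the decision to place alternatives with equal counts in the same indifference class are assumptions; the dataset's actual tie handling is not stated here.

```python
def constituency_to_ranking(vote_counts):
    """Turn one constituency's vote counts into a ranking with ties.

    Returns a list of indifference classes, best first. Alternatives with
    equal counts share a class (an assumption made for this sketch).
    """
    by_count = {}
    for alternative, votes in vote_counts.items():
        by_count.setdefault(votes, []).append(alternative)
    return [sorted(by_count[v]) for v in sorted(by_count, reverse=True)]


# Hypothetical result of one constituency.
result = {"A": 520, "B": 310, "C": 310, "D": 45}
ranking = constituency_to_ranking(result)
```

A whole city then becomes one election whose voters are its constituencies, each contributing one such ranking.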

This dataset was donated by Niclas Boehmer.

Consists of 1315 data files.

Details

Download [zip, 639.08 KB]

Comparative Study of Electoral Systems

00067

Election Politics

This dataset presents data collected as part of the Comparative Study of Electoral Systems. This study consists of post-election studies from (federal) elections from different countries. In some of these post-election studies, participants were asked to rank all important political parties or leaders in their country that they know on a scale from 0 to 10 according to how much they agree with the views of the party. For each of the 174 post-election studies where this question was asked, a data file was created with the parties as candidates. Each voter in the data file then corresponds to a participant in the survey and ranks the parties according to the participant's answer. Check the website of the CSES for more details.

This dataset was donated by Niclas Boehmer.

Consists of 305 data files.

Details

Download [zip, 960.75 KB]

United Kingdom General Elections

00066

Election Politics

This dataset collects voting data from recent UK general elections. For each general election, the UK territory is divided into constituencies. In a file, each constituency is considered to be a voter, ranking the alternatives as in the election results of that constituency.

This dataset was donated by Niclas Boehmer.

Consists of 13 data files.

Details

Download [zip, 11.51 KB]

Marble League (FKA Marble Olympics)

00065

Sport

The Marble League (formerly known as the MarbleLympics) is an annual tournament where marbles from different teams compete against each other in a number of different sports events (see here for more details).

For each instance of the league, several events are organised that all lead to an intermediate ranking of the competitors. In the files, each event corresponds to a voter ranking the alternatives as they were ranked in the event.

This dataset was donated by Niclas Boehmer.

Consists of 4 data files.

Details

Download [zip, 4.21 KB]

Eurovision Song Contest

00064

Election

This dataset collects the votes from the Eurovision Song Contest. Every candidate is a country (more precisely, its representative singer) and every voter is also a country. Originally, the contest consisted only of a final; from 2004 onwards, semi-finals were added.

This dataset was donated by Niclas Boehmer.

Consists of 73 data files.

Details

Download [zip, 64.47 KB]

CTU AG1 Tutorial Time Selection

00063

Election

This dataset contains the results of surveying students of the Czech Technical University in Prague about their preferred tutorial time. Each student selected, from the set of predefined alternatives, those that fit into their schedule.

The data on this page has been donated by Dušan Knop and Šimon Schierreich.

Consists of 1 data file.

Details

Download [zip, 2.21 KB]

Alternative Order Experiment

00062

Experiment

This dataset contains the results of a simple experiment regarding voting over landscape images with varying displaying order. There are 19 agents, each voting in two rounds. Eight images (alternatives) are denoted by A through H. In the first round, the images were displayed in the sequence A, B, C, D, E, F, G, H, while in the second round, the sequence was D, C, B, A, H, G, F, E.

To allow voters to be matched between the first and second rounds, in addition to our standard file formats, we provide a CSV file with the preferences submitted by each voter in both rounds.

These data were donated by Honorata Sosnowska from the SGH Warsaw School of Economics. The work concerned with the data was supported by the SGH Warsaw School of Economics grant KAE/S21 and the National Center for Science grant UMO-2018/31/B/HS4/01005 Opus 16.

Consists of 3 data files.

Details

Download [zip, 2.53 KB]

Kusama Network

00061

Election

Certain blockchain protocols conduct approval-based committee elections on a day-to-day basis. Specifically, these elections occur in blockchains using the Nominated Proof-of-Stake (NPoS) protocol. In this system, a subset of stakeholders, called validators, are elected to run the consensus protocol, which is crucial for the integrity of the blockchain. The problem of selecting the validators can be modeled as a committee election.

This dataset presents the voting data of the Kusama network, a blockchain system that implements the Nominated Proof-of-Stake (NPoS) protocol. The dataset contains 96 elections from the Polkadot blockchain. These elections contain roughly 2000 candidates and 10 000 voters each.

Note that in practice voters are assigned weights (that are of highly different scales). We cannot represent these weights in the PrefLib format. Each ".cat" file containing the approval ballots is therefore accompanied by a ".dat" file that describes the weights.

This dataset has been converted into the PrefLib format based on the sources provided by Niclas Böhmer (available here).

Consists of 1520 data files.

Details

Download [zip, 509.90 MB]

Polkadot Network

00060

Election

Certain blockchain protocols conduct approval-based committee elections on a day-to-day basis. Specifically, these elections occur in blockchains using the Nominated Proof-of-Stake (NPoS) protocol. In this system, a subset of stakeholders, called validators, are elected to run the consensus protocol, which is crucial for the integrity of the blockchain. The problem of selecting the validators can be modeled as a committee election.

This dataset presents the voting data of the Polkadot network, a blockchain system that implements the Nominated Proof-of-Stake (NPoS) protocol. The dataset contains 96 elections from the Polkadot blockchain. These elections contain between 18 202 and 48 025 voters and between 920 and 1080 candidates.

Note that in practice voters are assigned weights (that are of highly different scales). We cannot represent these weights in the PrefLib format. Each ".cat" file containing the approval ballots is therefore accompanied by a ".dat" file that describes the weights.

This dataset has been converted into the PrefLib format based on the sources provided by Niclas Böhmer (available here).

Consists of 496 data files.

Details

Download [zip, 404.32 MB]

Camp Songs

00059

Election

The dataset consists of two pre-camp surveys, conducted for youth summer camps in 2022 and 2023 in Poland. Several weeks before each camp, campers were asked to fill out a survey, which included (among others) two questions related to CCM-genre music pieces. Responses to each question form an approval election. So, for each year, we obtained two elections.

In the first question, survey participants (the voters) were presented with approximately 80 song titles (the candidates). The participants were asked to select at least 15 of their favorite ones to sing during camp activities. The lower bound on the number of selections was not enforced. In the second question, which involved far fewer songs (around ten), the participants were asked to select songs they would like to learn. This time, no number of choices was suggested, so some participants selected none.

The questions remained the same in both years; however, the presented songs were different. Specifically, in 2022, the first and second questions involved 78 and 8 songs, respectively. A year later, the respective numbers were 82 and 10. The survey had 39 participants in 2022 and 56 in 2023.

The data on this page has been donated by Andrzej Kaczmarczyk.

Consists of 4 data files.

Details

Download [zip, 10.12 KB]

NSW Legislative Assembly Election Data

00058

Election Politics STV

The New South Wales (NSW) Legislative Assembly is the lower of two houses of the Parliament of New South Wales, an Australian state. The Assembly comprises 93 seats, each representing one of 93 Districts.

In these elections, voters submitted Optional Preferential Votes; these ballots required at least one candidate to be specified. The outcome of each election was determined by the Instant-Runoff Voting (IRV) social choice function.
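A minimal sketch of IRV on optional preferential ballots is given below. The function name `irv_winner`, the alphabetical tie-breaking rule, and the convention that the majority is computed over non-exhausted ballots only are assumptions for illustration; the official NSW counting rules contain details that are not reproduced here.

```python
from collections import Counter


def irv_winner(ballots, candidates):
    """Instant-runoff voting on partial rankings.

    Each ballot is a tuple of candidates in preference order and may omit
    candidates. A ballot with no remaining candidate is exhausted and is
    ignored in later rounds. Ties are broken alphabetically (hypothetical rule).
    """
    remaining = set(candidates)
    while True:
        tally = Counter({c: 0 for c in remaining})
        for ballot in ballots:
            # Highest-ranked candidate on this ballot that is still in the race.
            top = next((c for c in ballot if c in remaining), None)
            if top is not None:
                tally[top] += 1
        active = sum(tally.values())  # number of non-exhausted ballots
        leader, leader_votes = max(tally.items(), key=lambda kv: (kv[1], kv[0]))
        if len(remaining) == 1 or 2 * leader_votes > active:
            return leader
        # Eliminate the candidate with the fewest first preferences.
        loser = min(tally.items(), key=lambda kv: (kv[1], kv[0]))[0]
        remaining.remove(loser)


# Illustrative district: the last ballot ranks only C and becomes exhausted.
ballots = [("A",)] * 4 + [("B", "C")] * 4 + [("C", "B")] * 2 + [("C",)]
winner = irv_winner(ballots, ["A", "B", "C"])
```

In the example, C is eliminated first, two of C's ballots transfer to B and one is exhausted, after which B holds 6 of the 10 remaining ballots and wins.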

The data sets posted below correspond to each of the NSW districts in the 2015, 2019 and 2023 NSW Legislative Assembly elections. The elements numbered 1-93 correspond to the 2015 election, those numbered 94-186 correspond to the 2019 election, and those numbered 187-279 correspond to the 2023 election.

These datafiles comprise all formal votes cast in each contest, with informal votes omitted.

Consists of 279 data files.

Details

Download [zip, 1.93 MB]

Parliamentary Elections

00057

Election Politics

This dataset gathers parliamentary elections. The Austrian elections were provided by Martin Lackner.

Consists of 9 data files.

Details

Download [zip, 6.87 KB]

Seasons Power Ranking

00056

Election

This dataset contains elections generated from weekly power rankings. Specifically, the underlying power ranking data (kaggle.com/masseyratings/rankings) contains weekly power rankings of college basketball teams (between 2001 and 2021), college baseball teams (between 2010 and 2021), and college American football teams (between 1997 and 2021) from different media outlets and ranking systems.

For each of the three sports (basketball, baseball, American football), for each season and each ranking system, we created an election where each vote corresponds to the power ranking of the teams in one week of the season according to the ranking system.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete. The name of each patch starts with the sport, followed by the relevant year and ranking system.

The combined power rankings and weekly power rankings datasets were generated from the same data.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 4015 data files.

Details

Download [zip, 26.98 MB]

Combined Power Ranking

00055

Election

This dataset contains elections generated from weekly power rankings. Specifically, the underlying power ranking data (kaggle.com/masseyratings/rankings) contains weekly power rankings of college basketball teams (between 2001 and 2021), college baseball teams (between 2010 and 2021), and college American football teams (between 1997 and 2021) from different media outlets and ranking systems.

For each of the three sports (basketball, baseball, American football), for each season, we created an election where each vote corresponds to the power ranking of the teams in one of the weeks of the season according to one of the ranking systems.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete. The name of each patch starts with the relevant sport, followed by the relevant year.

The season power rankings and weekly power rankings datasets were generated from the same data.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 53 data files.

Details

Download [zip, 20.77 MB]

Weeks Power Ranking

00054

Election

This dataset contains elections generated from weekly power rankings. Specifically, the underlying power ranking data (kaggle.com/masseyratings/rankings) contains weekly power rankings of college basketball teams (between 2001 and 2021), college baseball teams (between 2010 and 2021), and college American football teams (between 1997 and 2021) from different media outlets and ranking systems.

For each of the three sports (basketball, baseball, American football), for each week in one of the seasons, we created an election where each vote corresponds to the power ranking of the teams in this week according to one of the ranking systems.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete. The name of each patch starts with the sport, followed by the day on which the power rankings were published (there is one day from each covered week).

The combined power rankings and season power rankings datasets were generated from the same data.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 956 data files.

Details

Download [zip, 28.18 MB]

Formula 1 Races

00053

Election

This dataset contains elections generated from the Formula 1 World Championship. The underlying Formula 1 data (kaggle.com/rohanrao/formula-1-world-championship-1950-2020) contains the finishing times of all drivers in all laps of races taking place between 1950 and 2020. For each race (taking place between 1950 and 2020), we created an election where each vote corresponds to a lap in the race and ranks the drivers by the time they spent in this lap.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete. The name of each patch starts with the year in which the race took place followed by the name of the race.

The Formula 1 seasons dataset contains elections generated from the same data.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 454 data files.

Details

Download [zip, 1.16 MB]

Formula 1 Seasons

00052

Election

This dataset contains elections generated from the Formula 1 World Championship. The underlying Formula 1 data (kaggle.com/rohanrao/formula-1-world-championship-1950-2020) contains the finishing times of all drivers in all laps of races taking place between 1950 and 2020. For each year, we created an election where each vote corresponds to a race in this year and ranks the drivers by their total finishing time in this race.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete.

The Formula 1 races dataset contains elections generated from the same data.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 71 data files.

Details

Download [zip, 126.86 KB]

Countries Ranking

00051

Election

This dataset contains elections generated from indicator-based rankings of countries. For each year between 2005 and 2016, the underlying country ranking data (based on the popular world happiness report; kaggle.com/alcidesoxa/world-happiness-report-2005-2018) contains different quantitative indicators for the happiness of citizens from over 100 countries. For each year, we created an election where the countries are the candidates and each vote ranks them according to one indicator.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete.

Other indicator-based rankings have been used to create the university rankings and city rankings datasets.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 12 data files.

Details

Download [zip, 83.53 KB]

Movehub City Ranking

00050

Election

This dataset contains an election generated from indicator-based rankings of cities. The underlying city ranking data (kaggle.com/blitzr/movehub-city-rankings) contains twelve quantitative indicators for the life quality in 216 different cities determined by movehub.com. We created a single election where each city is a candidate and each vote corresponds to the ranking of the cities with respect to one of the indicators.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete.

Other indicator-based rankings have been used to create the country rankings and university rankings datasets.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 1 data file.

Details

Download [zip, 7.21 KB]

Multilaps Competitions

00049

Election

This dataset contains elections generated from multi-lap sports competitions.

The underlying mylaps data contains the completion time of athletes in each lap of a multi-lap competition (specifically, speed skating and cycling competitions) crawled from results.sporthive.com. For each race, we created an election in which the athletes are the candidates and each vote corresponds to one lap and ranks the athletes by their completion time.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 635 data files.

Details

Download [zip, 1.25 MB]

Spotify Countries Chart

00048

Election

This dataset contains elections generated from charts on Spotify. For each day between the 1st of January 2017 and the 9th of January 2018, the Spotify data (kaggle.com/edumucelli/spotifys-worldwide-daily-song-ranking) contains a daily ranking of the 200 most listened-to songs in 53 different countries. In our elections, candidates model songs. For each month and each country, we created an election where each vote corresponds to the ranking of the songs on one day of the month in the country.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete. The name of a patch starts with an abbreviation of the relevant country followed by the relevant year and the month. Note that the names of candidates are the IDs of the respective songs on Spotify (e.g., candidate 10nqz67NQWWa7XPq7ycihi corresponds to "Welcome to New York" by Taylor Swift, open.spotify.com/track/10nqz67NQWWa7XPq7ycihi).

The Spotify daily charts dataset contains elections generated from the same data.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 645 data files.

Details

Download [zip, 12.78 MB]

Spotify Daily Chart

00047

Election

This dataset contains elections generated from charts on Spotify. For each day between the 1st of January 2017 and the 9th of January 2018, the Spotify data (kaggle.com/edumucelli/spotifys-worldwide-daily-song-ranking) contains a daily ranking of the 200 most listened-to songs in 53 different countries. In our elections, candidates model songs. For each day between the 1st of January 2017 and the 9th of January 2018, we created an election where each vote corresponds to the ranking of the songs on this day in one of the 53 countries.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete. Note that the names of candidates are the IDs of the respective songs on Spotify (e.g., candidate 10nqz67NQWWa7XPq7ycihi corresponds to "Welcome to New York" by Taylor Swift).

The Spotify country charts dataset contains elections generated from the same data.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 362 data files.

Details

Download [zip, 24.60 MB]

Global University Ranking

00046

Election

This dataset contains elections generated from indicator-based rankings of universities. For each year between 2012 and 2015, the university ranking data (kaggle.com/mylesoneill/world-university-rankings) contains rankings of universities according to different criteria provided by three systems. For each year, we created an election where the universities are the candidates and each vote ranks them according to one criterion used by one of the three systems.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete.

Other indicator-based rankings have been used to create the country rankings and city rankings datasets.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 4 data files.

Details

Download [zip, 113.24 KB]

Tennis Ranking

00045

Election

This dataset contains elections generated from tennis world rankings. The underlying tennis data (kaggle.com/mimoopoo/atp-tennis-rankings-1990-to-2019) contains weekly rankings of the top 100 male tennis players published by the ATP between January 1990 and September 2019. For each year, we created an election where each player is a candidate and each vote corresponds to the ranking of the players in one week.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete.

Other sports world rankings have been used to create the table tennis world rankings and boxing world rankings datasets.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 29 data files.

Details

Download [zip, 275.82 KB]

Table Tennis Ranking

00044

Election

This dataset contains elections generated from table tennis world rankings. The underlying table tennis data (kaggle.com/romanzdk/ittf-table-tennis-player-rankings-and-information) contains the monthly ITTF ranking of the top 500-1500 male and female table tennis players between 2001 and 2020. For each year, we created an election where each player is a candidate and each vote corresponds to the ranking of the players in one month.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete. The name of each patch starts with the relevant gender followed by the year.

Other sports world rankings have been used to create the boxing world rankings and tennis world rankings datasets.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 38 data files.

Details

Download [zip, 1.54 MB]

Cycling Races

00043

Election

This dataset contains elections generated from road bicycle racing competitions. It consists of two parts.

Tour de France. For each edition of the Tour de France between 1903 and 2021, the underlying Tour de France data (procyclingstats.com) contains the completion times of all riders for each stage. For each edition, we created an election in which the riders are the candidates and each vote corresponds to a stage and ranks the riders by their completion time.

Giro d'Italia. For each edition of the Giro d'Italia between 1910 and 2020, the underlying Giro d'Italia data (procyclingstats.com) contains the completion times of all riders for each stage of the edition. For each edition, we created an election in which the riders are the candidates and each vote corresponds to a stage and ranks the riders by their completion time.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete. The name of each patch starts either with gdi (Giro d'Italia) or tdf (Tour de France), followed by the respective year.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 196 data files.

Details

Download [zip, 1.34 MB]

Boxing

00042

Election

This dataset contains elections generated from boxing world rankings. The underlying boxing data (kaggle.com/martj42/ufc-rankings) contains the Ultimate Fighting Championship rankings of the top 16 fighters in twelve different weight classes in different weeks between February 2013 and August 2021. For each year and weight class, we created an election where each fighter is a candidate and each vote corresponds to the ranking of the fighters in one week.

Each patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete. The name of each patch starts with the relevant weight class followed by the year.

Other sports world rankings have been used to create the table tennis world rankings and tennis world rankings datasets.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 99 data files.

Details

Download [zip, 133.51 KB]

Boardgames Geek Ranking

00041

Election

This dataset contains an election generated from board game charts.

The underlying board games data (kaggle.com/mseinstein/bgg_top2000) contains a weekly ranking of the 2000 most popular board games on boardgamegeek.com between October 2018 and December 2021. We created a single election where each game is a candidate and each vote corresponds to the ranking of the games in one week.

The patch contains the raw election and a post-processed version where some candidates and voters are deleted to make the election complete.

This dataset is part of a larger study by Boehmer and Schaar (see this page for a more detailed description of the dataset, the post-processing, and pointers to similar datasets). If you have any questions, please contact: niclas.boehmer@tu-berlin.de.

Consists of 1 data file.

Details

Download [zip, 1.14 MB]

Breakfast Items

00035

Election

This dataset was collected by Green and Rao (1972), and is a standard example in the literature on multidimensional unfolding (which is about embedding preferences in Euclidean space). They obtained strict preference rankings over a collection of 15 sweet breakfast items (such as toast, muffins, donuts) from "a group of 42 respondents, 21 Wharton MBA students and their wives. The questionnaire was self-administered separately by husband and wife. All subjects independently filled out the same questionnaire and received compensation for their efforts."

Green and Rao asked for these preferences in 6 different situations: "Overall preferences", "When I'm having a breakfast consisting of juice, bacon and eggs, and beverage", "When I'm having a breakfast consisting of juice, cold cereal, and beverage", "When I'm having a breakfast consisting of juice, pancakes, sausage, and beverage", "Breakfast, with beverage only", "At snack time, with beverage only". The rankings for each of these situations are provided in separate data patches. A .csv file presents the rankings of the same respondent across the different data files; rankings in odd positions (1st, 3rd, ...) come from the MBA students, and the ranking on the following line comes from that student's wife.

This dataset was digitized by Dominik Peters from the listings provided in the appendix of the 1972 book.

Consists of 7 data files.

Details

Download [zip, 10.14 KB]

Computer Science Conference Bidding Data

00039

Matching

This dataset contains the bidding data from 3 Computer Science conferences. It contains the bids of all reviewers (aside from a small number of opt-outs) over a subset of papers at the conference.

The bidding language for these conferences is yes/maybe/conflict. In order to make these more useful for PrefLib users, we have converted them to incomplete partial orders of the form {yes} > {maybe} > {no response}. The papers for which a reviewer had a conflict have been removed from their preference list. All reviewers had different preference orderings, hence each file contains as many entries as reviewers.
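
A minimal Python sketch of this conversion is given below. The input layout (a dictionary mapping paper ids to bid strings) and the function name bids_to_order are assumptions made for illustration; the original conference exports may be structured differently.

```python
def bids_to_order(bids, all_papers):
    """Convert one reviewer's bids into [yes_set, maybe_set, no_response_set].

    bids maps a paper id to 'yes', 'maybe' or 'conflict' (assumed labels).
    Papers with a conflict are dropped from the reviewer's preference list.
    """
    yes = {p for p, b in bids.items() if b == "yes"}
    maybe = {p for p, b in bids.items() if b == "maybe"}
    conflict = {p for p, b in bids.items() if b == "conflict"}
    # Every paper without any bid falls into the lowest indifference class.
    no_response = set(all_papers) - yes - maybe - conflict
    return [yes, maybe, no_response]
```

The returned list reads from most to least preferred, with each set being one indifference class.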

Consists of 3 data files.

Details

Download [zip, 10.71 KB]

Project Bidding Data

00038

Matching

This dataset contains bids of students over a set of projects for student/project allocations at the School of Computing Science, University of Glasgow. Each project is supervised by an individual, each of whom has a maximum supervision capacity. There are 8 years' worth of data in this set, with between 31 and 51 students and between 56 and 155 projects. This data was kindly donated by David Manlove, who collected it.

In addition to the strict and incomplete preference profiles of the students we have extended the profiles with all unranked items tied at the end. We have also posted .dat files containing the supervisor identifiers and capacities. The format for the .dat files is Supervisor ID, Capacity, Projects; where Projects is a space separated list of the projects supervised by the Supervisor. Each project has a capacity of 1 while each supervisor has a variable capacity. In academic sessions 2007-08 and 2008-09 there were no supervisor capacities in force, thus the projects and supervisors are in 1-1 correspondence.
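
As a rough sketch, the supervisor .dat files could be read as follows. It assumes that the three fields on each line are separated by commas and that the project list is the third field, which matches the description above but has not been checked against the actual files; parse_supervisors is a hypothetical helper name.

```python
def parse_supervisors(lines):
    """Parse lines of the assumed form 'supervisor_id, capacity, p1 p2 p3'."""
    supervisors = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split into at most three fields; the last one holds the project list.
        sup_id, capacity, projects = (part.strip() for part in line.split(",", 2))
        supervisors[sup_id] = {
            "capacity": int(capacity),
            "projects": projects.split(),  # space-separated project ids
        }
    return supervisors
```

A line that does not have all three fields would raise an error here, so real files may need extra handling (for example for header or comment lines).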

Consists of 8 data files.

Details

Download [zip, 29.87 KB]

AAMAS Bidding Data

00037

Matching

This dataset contains the bids of reviewers over papers from the 2015, 2016 and 2021 Autonomous Agents and Multiagent Systems Conference.

For the years 2015 and 2016, inclusion in these data sets was explicitly opt-in; 2015 contains 9,817 bids of 201 reviewers over 613 papers; this represents about 40% of the actual 22,360 bids of 281 reviewers over 670 papers. The 2016 data contains 161 out of 393 reviewers with bids over 442 out of 550 papers. For the year 2021, 526 submissions, 71 SPC members, and 596 regular PC members passed the checks (not opting out, etc.).

The bidding language for these conferences is yes/maybe/no/conflict. In order to make these more useful for PrefLib users, we have converted them to categorical data of the form {yes} > {maybe} > {no response} > {no} > {conflict}. Note that not all years have the same categories. We are deeply grateful to the IFAAMAS board and Rafael Bordini, Edith Elkind, John Thangarajah, and David Shield for approving, coordinating, and providing this dataset. The 2021 data has been generously provided by Ulle Endriss.

Consists of 3 data files.

Details

Download [zip, 201.89 KB]

Cities Survey

00034

Election

This dataset contains noisy input from two surveys, one about cost of living and one about population, of 392 individuals over 36 alternatives for cost of living and 48 alternatives for population. Each individual provided a ranking of six given cities in terms of cost of living and a ranking of six countries in terms of population.

The data were collected among participants of the 3rd PatrasIQ research and technology exhibition, in Patras, Greece in April 2016. We received input from 392 volunteers; each of them was given a random bundle of six cities (from a pool of 36) and a random bundle of six countries (from a pool of 48), and was asked to give a strict ranking of the given cities and countries in terms of his/her estimation about their cost of living indices and population (in decreasing order), respectively.

In the cost of living treatment, each city appears in at least 57 and at most 70 bundles/votes. The alternative ids define a ground truth, i.e., a strict ranking of all 36 cities according to cost of living index data retrieved from numbeo.com in April 2016. In the population treatment, each country appears in at least 47 and at most 52 bundles/votes. The alternative ids define a ground truth, i.e., a strict ranking of all 48 countries according to population data retrieved from wikipedia.org in April 2016.

The data on this page has been donated by Iannis Caragiannis.

Consists of 2 data files.

Details

Download [zip, 7.66 KB]

San Sebastian Poster Competition

00033

Election

Approval Ballots from the San Sebastian Poster Competition held during The Summer School on Computational Social Choice organized by COST Action IC1205 at the Miramar Palace in San Sebastian in July 2016. This set has two elections of approval ballots with 17 alternatives and about 60 voters each. The data on this page was donated by Ulle Endriss.

Two elections were held, using approval voting. In the first election the alternatives were posters A1-A17; in the second election the alternatives were posters B1-B17. There were 67 eligible voters (56 summer school participants, including the 34 poster presenters, as well as 7 lecturers and 4 organizers). Of these, 65 voters participated in the first election and 60 voters participated in the second election (1 voter did not vote in either election). The elections were conducted using the Whale3 system of Sylvain Bouveret. Most of the posters are available at the summer school website.

The original data file (00033-00000001.dat) includes one column per poster. Each of the two sets of posters is ordered by the number of approvals received. Each row corresponds to a voter. The voters are ordered by the number of approvals they have given across both elections, except that the 7 voters who only participated in one of the two elections are listed last. The other files are converted into standard PrefLib format where all approved alternatives are considered a tied equivalence class.

Consists of 3 data files.

Details

Download [zip, 6.14 KB]

Education Surveys in Informatics (Cujae)

00032

Election

This dataset contains the results of surveying students and professors in the Faculty of Informatics, Instituto Superior Politecnico Jose Antonio Echeverria (Cujae, Havana, Cuba) about their preferences on courses and the most important aspects affecting their performance as students and professionals. Answers include ties and missing elements. These surveys, conducted in 2015, include criteria about different numbers of aspects (6 to 32 candidates) and 13 courses.

This dataset was donated by Alejandro Rosete Suarez and Milton Garcia Borroto and may be augmented with new surveys in the future.

Consists of 10 data files.

Details

Download [zip, 8.23 KB]

Vermont District Races

00031

Election Politics

This dataset contains votes for 15 different races for various public offices held in Vermont in 2014. This data was collected and donated by Jeremy A. Hansen. There are 3 to 6 candidates and 532 to 1960 voters in these data files. Not all races were competitive so not every race is reported for every district.

Consists of 15 data files.

Details

Download [zip, 8.12 KB]

UK Labor Party Leadership Vote

00030

Election Politics

The 2010 UK Labor Party Leadership Vote is posted at www.rangevoting.org. This set contains the votes cast by all 266 MPs over the 5 leadership candidates. The votes are incomplete strict orders which we have posted along with extensions placing all unranked candidates tied at the end and pairwise graphs.
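
The extension step mentioned above (placing all unranked candidates tied at the end) can be sketched as follows. The function name and the list-of-lists representation of a weak order are assumptions for illustration, not the PrefLib file format itself.

```python
def extend_with_ties(order, candidates):
    """Extend an incomplete strict order so that unranked candidates tie at the end.

    order is a list of ranked candidates, best first. The result is a list of
    indifference classes, each given as a list of candidates.
    """
    ranked = set(order)
    unranked = [c for c in candidates if c not in ranked]
    result = [[c] for c in order]  # each ranked candidate is its own class
    if unranked:
        result.append(unranked)  # a single tied class at the bottom
    return result
```

For example, a ballot ranking only candidates 3 and 1 out of five candidates becomes 3 > 1 > {2, 4, 5}.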

Consists of 1 data file.

Details

Download [zip, 2.33 KB]

Proto French Election Ratings

00029

Combinatorial Politics

This analog dataset to the 2002 French Presidential Election Dataset was collected by Jean-Francois Laslier, Karine Van der Straeten and Michel Balinski. It consists of 398 approval ballots and subjective ratings on a 20 point scale collected over potential candidates for the 2002 French Presidential election cast by students at Institut d’Etudes Politiques de Paris.

This dataset preserves both the approval ballots and the subjective ratings of the candidates by each of the voters. The approvals are coded as either 1.0 for approved or 0.0 for not approved. The subjective ratings are on a 20-point scale, where a score of -1.0 indicates that no input was provided (as opposed to a rating of 0.0, the lowest possible).
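
Because -1.0 marks a missing rating rather than a low one, it should be excluded before aggregating. The sketch below illustrates this for per-candidate mean ratings; the row layout (one list of ratings per voter, one position per candidate) and the function name are assumptions, not the actual structure of the data file.

```python
MISSING = -1.0  # code for "no input provided", as described above


def mean_ratings(rows):
    """Per-candidate mean rating, ignoring entries equal to MISSING.

    rows is a list of per-voter rating lists of equal length.
    Returns None for a candidate that nobody rated.
    """
    if not rows:
        return []
    means = []
    for j in range(len(rows[0])):
        values = [row[j] for row in rows if row[j] != MISSING]
        means.append(sum(values) / len(values) if values else None)
    return means
```

Treating -1.0 as a genuine score would pull every average down, which is why the filter comes before the mean.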

Consists of 1 data file.

Details

Download [zip, 6.43 KB]

APA Election Data

00028

Election

This dataset contains the results of the elections of the American Psychological Association between 1998 and 2009. The voters are allowed to rank any number of the 5 candidates without ties. Each of these elections has 5 candidates and between 13,318 and 20,239 voters.

These data were donated by Michal Regenwetter and Anna Popova from the University of Illinois at Urbana-Champaign. The work that analyzed this data was supported by National Science Foundation grants SES # 08-20009, ICES # 1216016 (PI: M. Regenwetter), the University Library at the University of Illinois at Urbana-Champaign (PI: A. Popova), and the Basic Research Program at HSE (S. Popov). We thank the American Psychological Association for permitting access to its election ballot data.

Consists of 12 data files.

Details

Download [zip, 35.96 KB]

Proto French Election

00027

Election Politics

This analog dataset to the 2002 French Presidential Election Dataset was collected by Jean-Francois Laslier, Karine Van der Straeten and Michel Balinski. It consists of 398 approval ballots collected over potential candidates for the 2002 French Presidential election cast by students at Institut d'Etudes Politiques de Paris.

This dataset is interesting as its companion dataset Proto French Election Ratings has both the subjective evaluations of the candidates, along with the approvals. This dataset only preserves the approval ballots cast by the students. As the candidate set is the potential presidential candidates (and thus, not the exact set used in ED-00026), this is presented as a separate dataset.

Consists of 1 data file.

Details

Download [zip, 2.44 KB]

2002 French Presidential Election

00026

Election Politics

The 2002 French Presidential Election Dataset was collected by Jean-Francois Laslier and Karine Van der Straeten. It consists of 2,597 approval ballots collected in parallel to the actual election in 6 different districts in France.

The approval votes were collected at a set of polling stations in France during the first round of voting in the 2002 French National Election. Voters in these districts were informed prior to the election that they would have the ability to cast an approval ballot along with their normal ballot for the election. Overall, over 75% of those who turned up to vote participated in the experiment. Each of the files represents one district voting on the same election. There are between 367 and 476 voters (2,597 in all) and 16 candidates. Additional details on the method used to collect the data and the results of the analysis can be found in the required citation for the use of this dataset.

Consists of 6 data files.

Details

Download [zip, 27.55 KB]

Mechanical Turk Puzzle

00025

Election

The Mechanical Turk Puzzle datasets come from Andrew Mao and were collected using Mechanical Turk. These data sets each contain elections with 793-797 voters over 4 candidates.

Each of the candidates corresponds to an instance of the sliding puzzle game presented to a user on Mechanical Turk, who is asked to rank the items from those in a position closest to solution (first) to those requiring the most moves to complete (last). Thus, for all of these data sets there is a ground truth ranking which corresponds to the candidate names in sorted order. In the Puzzle task, each task contains elements requiring d, d+3, d+6, and d+9 moves to complete, where d = {5, 7, 9, 11}. This allows for more noise to be introduced to various iterations of the task. For each d, 40 sets of puzzles were placed on Mechanical Turk and were ranked by 20 users. As per the data owner's request these 160 individual trials have been aggregated into a single file for each d. The individual trial runs are available upon request.
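
Since the ground truth is the candidate names in sorted order, the noise in a single vote can be measured by counting the candidate pairs it orders differently from that ground truth (the Kendall tau distance). The following is a minimal sketch; the function name is hypothetical and a vote is assumed to be a list of candidate names, best first.

```python
from itertools import combinations


def kendall_tau_to_truth(ranking):
    """Number of candidate pairs ordered differently from the sorted ground truth."""
    truth = sorted(ranking)  # ground truth: candidate names in sorted order
    position = {candidate: i for i, candidate in enumerate(ranking)}
    # For each pair (a, b) with a before b in the truth, count it if the vote flips it.
    return sum(1 for a, b in combinations(truth, 2) if position[a] > position[b])
```

With 4 candidates the distance ranges from 0 (vote equals the ground truth) to 6 (vote is the exact reverse).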

Consists of 4 data files.

Details

Download [zip, 3.12 KB]

Mechanical Turk Dots

00024

Election

The Mechanical Turk Dots datasets come from Andrew Mao and were collected using Mechanical Turk. These data sets each contain elections with 794-800 voters over 4 candidates.

Each of the candidates corresponds to random dots presented to a user on Mechanical Turk, who is asked to rank the items from those containing the fewest dots (first) to those containing the most dots (last). Thus, for all of these data sets there is a ground truth ranking which corresponds to the candidate names in sorted order. In the Dots task, each task contains elements with 200, 200+i, 200+2i, and 200+3i dots, where i = {3, 5, 7, 9}. This allows for more noise to be introduced to various iterations of the task. For each i, 40 sets of puzzles were placed on Mechanical Turk and were ranked by 20 users. As per the data owner's request these 160 individual trials have been aggregated into a single file for each i. The individual trial runs are available upon request.

Consists of 4 data files.

Details

Download [zip, 3.06 KB]

Takoma Park Election Data

00023

Election Politics STV

The Takoma Park Data contains the results from the 2007 Takoma Park, MD special election for city council. The set contains one election with 4 candidates and about 400 voters.

Note that these elections were conducted under a ranked voting system which allowed blank entries. In processing this data for PrefLib we have ignored blanks and only report the order over the candidates.

The data on this page was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 1 data file.

Details

Download [zip, 2.11 KB]

San Leandro Election Data

00022

Election Politics STV

The San Leandro data contains the results from several elections, including mayor and city council elections, held in San Leandro, CA between 2010 and 2012. The set contains 3 distinct elections with between 4 and 7 candidates and about 25,000 voters each.

Note that these elections were conducted under a ranked voting system which allowed blank entries. In processing this data for PrefLib we have ignored blanks and only report the order over the candidates.

The data on this page was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 3 data files.

Details

Download [zip, 9.96 KB]

San Francisco Election Data

00021

Election Politics STV

The San Francisco data contains the results from several elections, including board of supervisors, district attorney, and mayoral elections, held in San Francisco, CA between 2008 and 2012. The set contains 14 distinct elections with between 4 and 25 candidates and between 18,000 and 195,000 voters.

Note that these elections were conducted under a ranked voting system which allowed blank entries. In processing this data for PrefLib we have ignored blanks and only report the order over the candidates.

The data on this page was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 14 data files.

Details

Download [zip, 184.38 KB]

Pierce Election Data

00020

Election Politics STV

The 2008 Pierce Data contains the results from several elections, including county executive, held in Pierce, WA in 2008. The set contains 4 distinct elections with between 4 and 7 candidates and between 40,000 and 300,000 voters.

Note that these elections were conducted under a ranked voting system which allowed blank entries. In processing this data for PrefLib we have ignored blanks and only report the order over the candidates.

The data on this page was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 4 data files.

Details

Download [zip, 14.64 KB]

Oakland Election Data

00019

Election Politics STV

The 2010 Oakland Data contains the results from the city council and mayoral elections held in Oakland, CA in 2010. The set contains 7 distinct elections with between 4 and 11 candidates and between 900 and 145,000 voters.

Note that these elections were conducted under a ranked voting system which allowed blank entries. In processing this data for PrefLib we have ignored blanks and only report the order over the candidates.

The data on this page was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 7 data files.

Details

Download [zip, 44.10 KB]

Minneapolis Election Data

00018

Election Politics STV

The 2009 Minneapolis Data contains the results from the election for the Parks and Rec Commissioner and Tax Assessor in Minneapolis, MN. The set contains about 30,000 votes over 7-400 candidates. The full data sets contain ballots along with write-in candidates (Mickey Mouse and Yoda are well represented). The No Write In files contain the same votes with any write-ins removed and the votes modified accordingly.

Note that these elections were conducted under a ranked voting system which allowed blank entries. In processing this data for PrefLib we have ignored blanks and only report the order over the candidates.

The data on this page was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 4 data files.

Details

Download [zip, 58.32 KB]

Berkeley Election Data

00017

Election Politics STV

The 2010 Berkeley Data contains the results from a city council election (District 7) in Berkeley, CA. The set contains about 4,000 votes over 4 candidates.

Note that these elections were conducted under a ranked voting system which allowed blank entries. In processing this data for PrefLib we have ignored blanks and only report the order over the candidates.

The data on this page was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 1 data file.

Details

Download [zip, 2.91 KB]

Aspen Election Data

00016

Election Politics STV

The 2009 Aspen Data contains the results from the mayoral and city council elections held in Aspen, CO in 2009. The data contains two different elections with about 2,500 votes each over 5 and 11 candidates.

Note that these elections were conducted under a ranked voting system which allowed blank entries. In processing this data for PrefLib we have ignored blanks and only report the order over the candidates.

The data on this page was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 2 data files.

Details

Download [zip, 30.04 KB]

Clean Web Search

00015

Election

This dataset contains the results of comparing websearches across Bing, Google, Yahoo, and Ask. This data is provided by Robert Bredereck at TU Berlin. Robert provides tools to compute Kemeny rankings on this data at his website at TU Berlin.

These data files differ from the other set of web data in that these files are forced to be complete. This means that the results are restricted to only those candidates (sites) that appear in all three datasets. The data files marked big contain around 200 (max 242) candidates each while the data files marked small contain between 10 and 50 candidates. The search queries are shown in the names of the individual data files below. For the WebImpact files the number of search results for a particular term was used to create a complete ranking over the search terms. These files measure the web impact of various world cities and countries. We have extended this data into tournament graphs and weighted majority graphs.

Consists of 79 data files.

Details

Download [zip, 142.54 KB]

Sushi Data

00014

Election

This dataset contains the results of a series of surveys conducted by Toshihiro Kamishima asking 5000 individuals for their preferences about various kinds of sushi. There are three different datasets that were elicited in different ways:

Element Series 00000001 contains 10 complete strict rank orders of 10 different kinds of sushi.
Element Series 00000002 contains individual's strict rank ordering of 100 different kinds of sushi (candidates).
Element Series 00000003 contains individual's scoring of sushi items on a scale of 0-4, with repeats allowed.

This dataset contains 14 files in total including soc, soi, toi, and toc files.

Note that the dataset was originally converted incorrectly. This has been fixed as of Jan 2016; please re-download.

Due to licence issues we require that you go through Toshihiro Kamishima's website to obtain the datafiles and observe the following licence terms:

We involve Toshihiro Kamishima, his colleagues, and their employers. You involve the user of this data and his/her colleagues, and their employers. We are NOT liable for any damages or losses, arising out of or related to your use or inability to use this data set. You can use this data set for any research purpose. You must not redistribute without our permission. We would like you to acknowledge the use of these program codes or data sets in publications by citing one of our related publications, if you could.

Consists of 3 data files.

Details

Download [zip, 421.64 KB]

T Shirt

00012

Election

This dataset contains complete rank orderings of T-Shirt designs voted on by members of the Optimization Research Group at NICTA. There are 11 designs (candidates) and 30 votes about these designs. Voters were required to submit complete strict orders.

This data has been kindly donated by Carleton Coffrin.

Consists of 1 data file.

Details

Download [zip, 1.50 KB]

Web Search

00011

Election

This dataset contains the results of comparing websearches across Bing, Google, Yahoo, and Ask. This data is provided by Robert Bredereck at TU Berlin. Robert provides tools to compute Kemeny rankings on this data at his website at TU Berlin.

The data files marked big contain around 2000 candidates each while the data files marked small contain between 100 and 200 results. The search queries are shown in the names of the individual data files below. For the WebImpact files the number of search results for a particular term was used to create a complete ranking over the search terms. These files measure the web impact of various world cities and countries. The results are not complete and not every candidate (website) is ranked by all the voters (search engines). We have extended this data into tournament graphs and weighted majority graphs, and created a toc dataset where all unranked candidates are tied at the end of the rankings.

Consists of 77 data files.

Details

Download [zip, 6.19 MB]

Skiing Competitions

00010

Election Sport

This dataset contains the Cross Country Skiing and Ski Jumping results from the 2006-2009 World Championships. This data is provided by Robert Bredereck at TU Berlin. Robert provides tools to compute Kemeny rankings on this data at his website at TU Berlin.

The results from each competition in the season provide a rank ordering over the candidates (competitors). We have created a toc datafile where all unranked candidates are tied at the end of the rankings.

Note that this dataset used to contain the Formula 1 data. A larger set of the F1 data is now available in the 00052 and 00053 datasets.

Consists of 2 data files.

Details

Download [zip, 19.43 KB]

Trip Advisor Data

00040

Combinatorial

This dataset contains 675,069 reviews of 1,851 hotels across the world scraped from Trip Advisor. The data was scraped and donated by Hongning Wang.

One file contains the numerical aspect ratings provided by the users, along with other information about the hotel. The other files contain the text of the users' reviews (split into 3 files). These reviews have been slightly modified: all excess spaces and tabs have been removed and all commas have been changed to semi-colons.

Both files are encoded in the dat format but are actually CSV files. The first line of each file explains the fields within the file. Some of the usernames are encoded in Unicode so please be careful when parsing the files!
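
A minimal sketch of reading such a file with Python's standard csv module is shown below. It assumes comma-separated fields and a header on the first line, as described above; the UTF-8 encoding in the usage note and the helper name read_dat_rows are assumptions that should be checked against the actual files.

```python
import csv


def read_dat_rows(fileobj):
    """Read a .dat file that is really CSV; the first line supplies the field names."""
    reader = csv.DictReader(fileobj)
    return [dict(row) for row in reader]
```

A file would typically be opened with an explicit encoding so that non-ASCII usernames survive, for example open(path, newline="", encoding="utf-8"). Since commas inside review text were replaced with semi-colons, a plain comma delimiter should be enough to split the fields.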

Consists of 4 data files.

Details

Download [zip, 77.34 MB]

Kidney Data

00036

Matching

This dataset contains 310 instances of synthetic kidney donor pools. The data was generated using a state of the art donor pool generation method (described in Saidman et al., Increasing the opportunity of live kidney donation by matching for two-and three-way exchanges. Transplantation 81(5), 2006) and was donated by John Dickerson. John has recently posted his generation as well as his exchange solving code online; it is available here.

The dataset consists of 10 randomly generated instances of kidney exchanges with 16, 32, 64, 128, 256, 512, 1024, 2048 patients and, as a percentage of the pool, altruists at 0%, 5%, 10%, and 15% for a total of 310 data files. The main components use the wmd data format. Each edge has a source and multiple destinations to represent the patients that can receive a kidney from the source. All edges have weight 1 unless they connect from a patient to an altruist (who does not need a kidney), which have weight 0.

There is a dat file associated with each kidney exchange datafile. This file contains some extra fields that may be of interest to researchers. Specifically, the file contains the following fields: Pair: the index number of the pair in the corresponding wmd file; Patient: the blood type of the person needing the kidney; Donor: the blood type of the person donating the kidney; Wife-P?: 1 if the person needing the kidney is the wife of the donor; %Pra: the panel reactive antibody level of the patient, discretized into three levels; Out-Deg: the number of nodes in the wmd file that can receive a kidney from this donor; Altruist: 1 if the corresponding pair is an altruist.

Consists of 310 data files.

Details

Download [zip, 198.24 MB]

Social Recommendation

00013

Combinatorial

This dataset contains the Facebook Social Graph and full ratings of 16 restaurants and 23 pubs by 93 users.

You can find anonymous versions of the social network and the items ratings. It includes three files:

links.csv - This is an edge list that contains the Facebook social friendship ties of all the participants. These links are undirected.
pubs.csv - The file contains the list of participants and their ratings for 23 Pubs.
rest.csv - The file contains the list of participants and their ratings for 16 Restaurants.

Each line in the rating files (pubs.csv and rest.csv) represents a participant with the structure: userid,X1,...,Xn. The userid in these files corresponds with the ids in the links.csv file.
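
The line structure userid,X1,...,Xn can be parsed with a few lines of Python. This sketch assumes that the files have no header row and that every rating is numeric; neither assumption has been verified against the actual files, and parse_ratings is a hypothetical helper name.

```python
def parse_ratings(lines):
    """Parse lines of the form 'userid,X1,...,Xn' into {userid: [X1, ..., Xn]}."""
    ratings = {}
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        user_id, values = parts[0], parts[1:]
        ratings[user_id] = [float(value) for value in values]
    return ratings
```

The resulting user ids can then be matched against the endpoints of the edges in links.csv.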

The data on this page has been donated by Lihi Dery.

Consists of 3 data files.

Details

Download [zip, 216.78 KB]

AGH Course Selection

00009

Election

This dataset contains the results of surveying students at AGH University of Science and Technology about their course preferences. Each student provided a rank ordering over all the courses with no missing elements. There are 9 courses to choose from in 2003 and 7 in 2004.

The data on this page has been donated by Piotr Faliszewski.

Consists of 2 data files.

Details

Download [zip, 2.71 KB]

Glasgow City Council

00008

Election Politics STV

This data set contains the results of the 2007 Glasgow City Council elections, separated by ward. There are 21 wards, each with different candidates and voters. These files report the results of all the ward-level elections, which were originally held under STV. In this data set there is a maximum of 13 candidates and a minimum of 8 candidates. The maximum number of voters is 12,744 and the minimum is 5,199.

The data presented here was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 21 data files.

Details

Download [zip, 458.76 KB]

Electoral Reform Society (ERS) Data

00007

Election

This dataset contains the results of 86 separate elections held by non-profit organizations, trade unions, and professional organizations. They were originally donated by Nicolaus Tideman, who secured NSF funding to have the ballots tabulated. The ballots are from elections held under various voting rules requiring incomplete strict orders. The tabulated results were initially collected by the Electoral Reform Society in the UK in order to support the adoption of STV and other range voting methods.

The files contain vote records with a maximum of 29 candidates and as few as 3; the number of voters ranges from 9 to 3419. The toc files have all unranked candidates tied, at the end of the order. Additionally, some of these are complete sets of ballots from the given elections and some are random samples from the set of all ballots.

Consists of 87 data files.

Details

Download [zip, 301.97 KB]

Skate Data

00006

Election Sport

This dataset contains figure skating rankings from various competitions during the 1998 season including the World Juniors, World Championships, and the Olympics. These data sets generally have 10-25 candidates (skaters) and 8-10 judges (voters).

The candidates (skaters) are listed in skating order: the first candidate skated first, and so on down the list. We have maintained this order as presented in the original versions of this dataset.

Consists of 48 data files.

Details

Download [zip, 42.80 KB]

Burlington Election Data

00005

Election Politics STV

The 2009 Burlington, Vermont Mayoral Election Data is posted online at www.rangevoting.org. It shows a number of interesting features when evaluated with the IRV method. Namely, the candidate with the most first-choice votes in the first round (the plurality leader) does not emerge as the winner of the election.
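The effect described above, where the first-round leader under IRV is not the eventual winner, can be reproduced on a toy profile. The sketch below is a simplified IRV count written for illustration; the ballot counts are hypothetical and are not the Burlington data, the function names are assumptions, and ties for elimination are broken arbitrarily.

```python
from collections import Counter


def plurality_leader(ballots):
    """Return the candidate with the most first-choice votes."""
    tallies = Counter()
    for count, ranking in ballots:
        if ranking:
            tallies[ranking[0]] += count
    return tallies.most_common(1)[0][0]


def irv_winner(ballots):
    """Simplified IRV: repeatedly eliminate the candidate with the fewest
    top-choice votes until someone holds a majority of the active ballots.

    ballots -- list of (count, ranking) pairs; a ranking may be partial.
    """
    remaining = {c for _, ranking in ballots for c in ranking}
    while True:
        tallies = Counter({c: 0 for c in remaining})
        for count, ranking in ballots:
            # A ballot counts for its highest-ranked candidate still in the race.
            top = next((c for c in ranking if c in remaining), None)
            if top is not None:
                tallies[top] += count
        total = sum(tallies.values())
        leader, votes = max(tallies.items(), key=lambda kv: kv[1])
        if votes * 2 > total or len(remaining) == 1:
            return leader
        loser = min(tallies.items(), key=lambda kv: kv[1])[0]
        remaining.remove(loser)


# Hypothetical profile: A leads the first round but loses after C is eliminated.
profile = [
    (40, ("A", "C", "B")),
    (35, ("B", "C", "A")),
    (25, ("C", "B", "A")),
]
print(plurality_leader(profile), irv_winner(profile))
```

In this profile no candidate has a first-round majority; once C is eliminated, C's ballots transfer to B, who then overtakes A.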

The 2006 Burlington, Vermont Mayoral data presented here was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 4 data files.

Details

Download [zip, 9.63 KB]

Netflix Prize Data

00004

Election

The Netflix Prize was a competition devised by Netflix to improve the accuracy of its recommendation system. To facilitate this, Netflix released real movie ratings from the users of the system. Any set of movies can be transformed into an election via a process outlined by Mattei, Forshee, and Goldsmith (reference below).

The data sets posted below correspond to 100 random 3-candidate and 4-candidate elections drawn from Data Set 1 in the paper "An Empirical Study of Voting Rules and Manipulation with Large Datasets." The elements numbered 1 - 100 are all 3-candidate elections and the elements 101 - 201 are all 4-candidate elections.

Consists of 200 data files.

Details

Download [zip, 95.62 KB]

Mariner Path Selection

00003

Election

The Mariner Trajectory Selection Data Set contains the votes cast by the various science teams responsible for selecting the trajectory for the 1977 interplanetary satellite. A total of 10 science teams voted over 32 possible paths. All of these votes are complete, but indifference was allowed between some of the paths.

Consists of 1 data file.

Details

Download [zip, 1.85 KB]

Debian Project Data

00002

Election

The Debian Project Leader Elections are held yearly with most of the ballots available online.

We have captured several years of data below, including the vote for the Debian logo. In some years there were only a few candidates, and we have omitted these years. The included data sets have between 4 and 9 candidates depending on the instance, and about 400 individual votes per instance.

Consists of 8 data files.

Details

Download [zip, 21.85 KB]

Irish Election Data

00001

Election Politics STV

The Dublin North, West, and Meath data sets contain a complete record of votes for two separate elections held in Dublin, Ireland in 2002. The votes were posted online but have since been removed.

The votes themselves are not complete rankings: many are partial votes over the candidate set. The North data set contains 43,942 votes over 12 candidates, the West data set contains 29,988 votes over 9 candidates, and the Meath data set contains 64,081 votes over 14 candidates.

The Meath data presented here was donated by Jeffrey O'Neill who runs the site OpenSTV.org.

Consists of 3 data files.

Details

Download [zip, 608.90 KB]