The gerda package provides functions to access and work
with GERDA datasets. The German Election Database (GERDA) provides data
on German elections spanning federal elections (since 1953 at the county
level, 1980 at the municipal level), state (Landtag) elections,
local (Kommunal) elections, mayoral (Bürgermeister)
elections, European Parliament elections, and county (Kreistag)
elections. All election datasets include turnout and vote shares for all
major parties. GERDA also supplies geographically harmonized datasets
that account for changes in municipal boundaries and mail-in voting
districts.
In addition to election results, the package provides county-level socioeconomic covariates from INKAR, municipality-level data from the German Census 2022, and a party crosswalk that maps GERDA party names to standardized ParlGov attributes.
GERDA was compiled by Vincent Heddesheimer, Florian Sichart, Andreas Wiedemann and Hanno Hilbig. For additional information, see also the GERDA website (www.german-elections.com) and the accompanying publication: doi.org/10.1038/s41597-025-04811-5
This vignette will introduce you to the main functions of the package and demonstrate how to use them.
To see a list of all available GERDA electoral result datasets, you
can use the gerda_data_list() function:
gerda_data_list()
#> municipal_unharm Local elections at the municipal level (1990-2020, unharmonized).
#> municipal_harm Local elections at the municipal level (1990-2020, harmonized).
#> municipal_harm_25 Local elections at the municipal level, harmonized to 2025 boundaries.
#> state_unharm State elections at the municipal level (2006-2019, unharmonized).
#> state_harm State elections at the municipal level (2006-2019, harmonized).
#> state_harm_21 State elections at the municipal level, harmonized to 2021 boundaries.
#> state_harm_23 State elections at the municipal level, harmonized to 2023 boundaries.
#> state_harm_25 State elections at the municipal level, harmonized to 2025 boundaries.
#> federal_muni_raw Federal elections at the municipal level (1980-2025, raw data).
#> federal_muni_unharm Federal elections at the municipal level (1980-2025, unharmonized).
#> federal_muni_harm_21 Federal elections at the municipal level (1990-2025, harmonized to 2021 boundaries).
#> federal_muni_harm_25 Federal elections at the municipal level (1990-2025, harmonized to 2025 boundaries).
#> federal_cty_unharm Federal elections at the county level (1953-2021, unharmonized).
#> federal_cty_harm Federal elections at the county level (1990-2021, harmonized).
#> county_elec_unharm County (Kreistag) elections at the municipal level, unharmonized.
#> county_elec_harm_21 County (Kreistag) elections, harmonized to 2021 boundaries.
#> county_elec_harm_21_cty County (Kreistag) elections aggregated to county level, harmonized to 2021 boundaries.
#> county_elec_harm_21_muni County (Kreistag) elections at the municipal level, harmonized to 2021 boundaries.
#> european_muni_unharm European Parliament elections at the municipal level, unharmonized.
#> european_muni_harm European Parliament elections at the municipal level, harmonized.
#> mayoral_unharm Mayoral election results at the municipal level, unharmonized.
#> mayoral_harm Mayoral election results at the municipal level, harmonized.
#> mayoral_candidates Mayoral candidates (person-level).
#> mayor_panel Mayor panel (person-level, one row per mayor-term).
#> mayor_panel_harm Mayor panel (person-level, harmonized to current boundaries).
#> mayor_panel_annual Mayor panel at annual frequency (one row per municipality-year).
#> mayor_panel_annual_harm Mayor panel at annual frequency, harmonized to current boundaries.
#> ags_crosswalks Crosswalks for municipalities (1990-2025).
#> cty_crosswalks Crosswalks for counties (1990-2025).
#> ags_1990_to_2023_crosswalk Municipality crosswalk: 1990 boundaries to 2023 boundaries.
#> ags_1990_to_2025_crosswalk Municipality crosswalk: 1990 boundaries to 2025 boundaries.
#> crosswalk_ags_2021_to_2023 Municipality crosswalk: AGS 2021 to AGS 2023 (targeted).
#> crosswalk_ags_2021_2022_to_2023 Municipality crosswalk: AGS 2021 and 2022 to AGS 2023 (targeted).
#> crosswalk_ags_2023_to_2025 Municipality crosswalk: AGS 2023 to AGS 2025 (targeted; RDS only).
#> crosswalk_ags_2023_24_to_2025 Municipality crosswalk: AGS 2023 and 2024 to AGS 2025 (targeted; RDS only).
#> crosswalk_ags_2024_to_2025 Municipality crosswalk: AGS 2024 to AGS 2025 (targeted; RDS only).
#> ags_area_pop_emp Crosswalk covariates (area, population, employment) for municipalities (1990-2025).
#> ags_area_pop_emp_2023 Crosswalk covariates (area, population, employment) for municipalities, harmonized to 2023 boundaries.
#> cty_area_pop_emp Crosswalk covariates (area, population, employment) for counties (1990-2025).

This function displays a formatted table with the names and
descriptions of all available datasets. You can use the
file_name column from this output to specify which dataset
you want to load using the load_gerda_web() function.
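Because the dataset names follow a regular pattern, they can be filtered with a regular expression before you pick one to load. The sketch below hard-codes a few names copied from the listing above rather than relying on the structure of the object returned by gerda_data_list(), which is not documented here. Note that "unharm" contains "harm", so the pattern anchors on the underscore.

```r
# Names copied by hand from the gerda_data_list() output shown above.
dataset_names <- c(
  "federal_muni_raw",
  "federal_muni_unharm",
  "federal_muni_harm_21",
  "federal_muni_harm_25",
  "federal_cty_unharm",
  "federal_cty_harm"
)

# Keep only harmonized federal datasets. Matching "_harm" (with the leading
# underscore) excludes the "_unharm" variants.
harmonized <- grep("^federal_.*_harm", dataset_names, value = TRUE)
print(harmonized)
```

Any of the resulting names can then be passed as the file_name argument of load_gerda_web().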
The main function for loading GERDA data is
load_gerda_web(). This function allows you to load a
specific dataset from a web source. Here’s an example of how to use
it:
# Load the municipal harmonized dataset
municipal_harm_data <- load_gerda_web("municipal_harm", verbose = TRUE, file_format = "rds")

The load_gerda_web() function takes the following parameters:

- file_name: A character string with the name of the dataset to load, e.g. "federal_cty_harm" (as shown in the gerda_data_list() output). The function supports fuzzy matching, so close misspellings will produce a helpful suggestion.
- verbose: If set to TRUE, it prints messages about the loading process (default is FALSE).
- file_format: Specifies the format of the file to load, either "rds" or "csv" (default is "rds"). Both formats return the same tibble, so this choice only affects download size and speed.

Here’s an example of a typical workflow using the gerda package:
gerda_data_list()

If you are using add_gerda_covariates() or
add_gerda_census(), you can skip this section: the helpers
detect the level of your data and use the correct join keys
automatically. If you are writing a manual left_join() or
merging against other sources, the table below shows which identifier
and time columns each family carries.
| Dataset family | Geographic id | Time column |
|---|---|---|
| municipal_*, state_*, federal_muni_*, european_muni_*, mayoral_* | ags (8-digit municipality) | election_year (+ election_date where available) |
| federal_cty_harm | county_code (5-digit county) | election_year |
| federal_cty_unharm | county_code + ags alias (see Deprecations) | election_year + year alias |
| county_elec_* (with _cty suffix) | county_code (5-digit county) | election_year |
| county_elec_* (without _cty suffix) | ags (8-digit municipality) | election_year |
| mayor_panel / mayor_panel_harm | ags + person_id | election_date |
| mayor_panel_annual / mayor_panel_annual_harm | ags + person_id | year |
| gerda_covariates() (INKAR, county-level) | county_code (5-digit county) | year (not election_year) |
| gerda_census() (Zensus 2022, municipality-level) | ags (8-digit municipality) | time-invariant (2022) |
| ags_crosswalks, ags_1990_to_2023_crosswalk, ags_1990_to_2025_crosswalk, crosswalk_ags_* | Pair of AGS codes at source and target vintages | Vintage is encoded in column names |
| cty_crosswalks | Pair of 5-digit county codes at source and target | Vintage is in column names |
Two things to watch for when joining manually:

- gerda_covariates() uses year, not election_year. If you merge it directly into federal election data, you need a rename: by = c("county_code" = "county_code", "election_year" = "year").
- Where a dataset carries a county column, it is a county name or partial code, not the 5-digit AGS-based county code. Use substr(ags, 1, 5) to extract the county key when you want to join against county-level data.

The gerda package includes county-level socioeconomic
and demographic covariates from INKAR (Indikatoren und Karten zur Raum-
und Stadtentwicklung). These covariates can be easily merged with GERDA
election data to enrich your analyses. INKAR data is available from 1995
to 2022, so covariates can be matched to federal elections from 1998
onwards (earlier elections fall outside the INKAR coverage window).
The easiest way to add covariates to your election data is using the
add_gerda_covariates() function:
library(dplyr)
# Load election data and add covariates
merged <- load_gerda_web("federal_cty_harm") %>%
add_gerda_covariates()
# Your data now includes 30 county-level covariates!

Under the hood, add_gerda_covariates() merges on county code and election year. Before merging, it checks that the required columns (county_code or ags, and election_year) are present.

The covariates dataset includes 30 variables across 10 categories (for the full list of variable names, units, and descriptions, see gerda_covariates_codebook()).

To see detailed information about each covariate, including units and missing data patterns, call gerda_covariates_codebook(). For more control, you can access the raw covariates data with gerda_covariates().
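A manual merge against the raw covariates has to handle the two pitfalls noted earlier: the county key must be derived from the 8-digit AGS, and the covariates carry year rather than election_year. The following is a minimal sketch using small made-up data frames in place of the real tables. The AGS codes, the values, and the column name unemployment_rate are placeholders, and base R merge() is used only to keep the example free of dependencies; a dplyr left_join() with the by argument shown above is the equivalent.

```r
# Toy stand-ins for a municipal election table and the county-level covariates.
# All codes and values are invented for illustration.
elections <- data.frame(
  ags = c("01999001", "01999001", "09999001", "09999001"),
  election_year = c(2017, 2021, 2021, 2025),
  turnout = c(0.71, 0.74, 0.77, 0.80),
  stringsAsFactors = FALSE
)

covariates <- data.frame(
  county_code = c("01999", "01999", "09999"),
  year = c(2017, 2021, 2021),
  unemployment_rate = c(7.9, 7.1, 8.6),  # hypothetical covariate name
  stringsAsFactors = FALSE
)

# Derive the 5-digit county key from the 8-digit AGS. The AGS must be stored
# as character, otherwise leading zeros are lost.
elections$county_code <- substr(elections$ags, 1, 5)

# Left join: election_year on the left is matched to year on the right.
# all.x = TRUE keeps election rows that have no covariate match.
merged <- merge(
  elections, covariates,
  by.x = c("county_code", "election_year"),
  by.y = c("county_code", "year"),
  all.x = TRUE
)
print(merged)
```

The 2025 row survives the join with a missing covariate value, which is the behaviour to expect for election years outside the INKAR coverage window.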
Coverage varies by variable: core indicators (demographics, economy,
labor market) are available for all 7 federal election years
(1998-2021). Newer INKAR indicators (e.g., childcare, some healthcare
variables) are available for 2-3 recent elections only. Consult the
codebook’s missing_pct column to check per-variable
availability before analysis.
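One way to act on this is to screen variables by their share of missing values before building a model. The sketch below uses a made-up codebook with three placeholder variables. Only the missing_pct column name comes from the text above; the variable column name, the 0-100 scale, and the 10 percent cutoff are assumptions for illustration.

```r
# Toy codebook; variable names and percentages are invented.
codebook <- data.frame(
  variable = c("indicator_a", "indicator_b", "indicator_c"),  # assumed column name
  missing_pct = c(0, 4.2, 68.5),                              # assumed 0-100 scale
  stringsAsFactors = FALSE
)

# Keep variables whose share of missing values is at or below the cutoff.
max_missing <- 10
usable <- codebook$variable[codebook$missing_pct <= max_missing]
print(usable)
```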
The gerda package includes municipality-level data from
the German Census 2022 (Zensus 2022). This cross-sectional snapshot
covers approximately 10,800 municipalities and can be merged with any
GERDA election dataset.
The main advantage of this covariate data is that it is observed at the municipal level (unlike the county-level INKAR data). This allows for more fine-grained analyses of local election outcomes. However, the census is a single time point (2022), so it does not vary across election years. This means that the resulting merged dataset will have time-invariant covariates, i.e. each municipality receives the same census values for all election years. Users should not conduct analyses that rely on over-time variation in these covariates.
The census data includes 14 indicators across four categories.
Since the census is a 2022 snapshot, the same values are attached to all election years (see also the note above).
Most census variables have >95% municipality coverage.
avg_household_size_census22 has approximately 12.5% missing
values because Destatis suppresses data for small municipalities under
its disclosure rules.
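The time-invariance of the census merge can be made concrete with a small sketch. It uses made-up data frames rather than the real tables: the AGS codes and values are placeholders, and only the column name avg_household_size_census22 is taken from the text above. Because the census has no time column, the join key is ags alone.

```r
# Toy election panel: two municipalities, two election years each.
elections <- data.frame(
  ags = c("01999001", "01999001", "09999001", "09999001"),
  election_year = c(2017, 2021, 2017, 2021),
  stringsAsFactors = FALSE
)

# Toy census snapshot: one row per municipality, no year column.
census <- data.frame(
  ags = c("01999001", "09999001"),
  avg_household_size_census22 = c(1.9, NA),  # NA mimics a suppressed value
  stringsAsFactors = FALSE
)

# Left join on ags only; every election year receives the same census row.
merged <- merge(elections, census, by = "ags", all.x = TRUE)

# Count distinct census values per municipality across election years.
values_per_muni <- tapply(
  merged$avg_household_size_census22, merged$ags,
  function(x) length(unique(x))
)
print(values_per_muni)
```

Each municipality shows exactly one distinct value, so these covariates are absorbed by municipality fixed effects and cannot identify within-municipality changes over time.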
The party_crosswalk() function provides a mapping
between GERDA party names and standardized party information from the
ParlGov database. This is particularly useful for linking GERDA data
with other political science datasets or for obtaining standardized
party characteristics.
The function takes two main parameters:
- party_gerda: A character vector of GERDA party names
- destination: The name of the column from the ParlGov view_party table to map to

You can map GERDA party names to various standardized party characteristics, including:

- left_right: Left-right position scores
- party_name_english: English party names
- party_name_short: Short party names
- country_name: Country names

# Map GERDA party names to left-right positions
parties <- c("cdu", "spd", "linke_pds", "fdp")
left_right_scores <- party_crosswalk(parties, "left_right")
print(left_right_scores)

# Map to English party names
english_names <- party_crosswalk(parties, "party_name_english")
print(english_names)

This function is especially useful when you want to:
The gerda package provides easy access to a wide range
of German election and related data. By using the
gerda_data_list() function to explore available datasets
and load_gerda_web() to load them, you can quickly
incorporate this data into your research or analysis projects.
For more information or to provide feedback, please contact hhilbig@ucdavis.edu or visit the GitHub repository at https://github.com/hhilbig/gerda.