widr provides direct API access to the World Inequality Database (WID) from R. It offers validated variable codes, structured downloads as standard data frames, and helpers for currency conversion, inequality measurement, and plotting. widr is an independent implementation, unaffiliated with the World Inequality Lab (WIL) or the Paris School of Economics. Data are sourced from WID and maintained by WIL.
WID variables follow a four-part grammar:

```
<type:1> <concept:5-6> [<age:3>] [<pop:1>]
```

| Component | Width | Example | Meaning |
|---|---|---|---|
| type | 1 letter | s | share |
| concept | 5-6 letters | ptinc | pre-tax national income |
| age | 3 digits | 992 | adults 20+ |
| pop | 1 letter | j | equal-split between spouses |

sptinc992j denotes the share of
pre-tax national income for equal-split adults
aged 20+.
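
The grammar can be made concrete with a small base-R parser. This is only an illustrative sketch: `parse_wid_code()` is a hypothetical helper and not part of widr (the package's own `wid_decode()` does this job), and the regular expression assumes lowercase codes with the widths given in the table above.

```r
# Hypothetical helper, not part of widr: split a WID variable code into
# the four components of the grammar (type, concept, age, pop).
parse_wid_code <- function(code) {
  pattern <- "^([a-z])([a-z]{5,6})([0-9]{3})?([a-z])?$"
  m <- regmatches(code, regexec(pattern, code))[[1]]
  if (length(m) == 0) stop("not a well-formed WID variable code: ", code)
  # Optional groups that did not match come back as empty strings.
  list(type = m[2], concept = m[3], age = m[4], pop = m[5])
}

str(parse_wid_code("sptinc992j"))
```

Because the pattern is greedy, a code with no age digits is ambiguous between a six-letter concept and a five-letter concept followed by a pop letter. A pattern alone cannot settle that; checking the pieces against a list of known concepts would be needed.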
The full catalogue is available at World Inequality Database; widr bundles it as six searchable reference tables.
```r
wid_search("national income")                        # keyword search across concepts
wid_decode("sptinc992j")                             # parse into components
wid_encode("s", "ptinc", age = "992", pop = "j")     # build from components
wid_is_valid(series_type = "s", concept = "ptinc")   # non-throwing validation
```

The six reference tables (wid_series_types, wid_concepts, wid_ages, wid_pop_types, wid_percentiles, wid_countries) are lazy-loaded and compiled from the codes dictionary by an independent script.

download_wid() returns a wid_df, a classed data.frame fully compatible with dplyr, ggplot2, and base R. At minimum, supply indicators or areas; the remaining filters default to "all", except ages ("992") and pop ("j").

```r
library(widr)

# Top 1% pre-tax income share, United States, 2000-2022
top1 <- download_wid(
  indicators = "sptinc992j",
  areas = "US",
  perc = "p99p100",
  years = 2000:2022
)
top1
#> <wid_df> 23 rows | 1 countries | 1 variables
#> country variable percentile year value age pop
#> 1 US sptinc992j p99p100 2000 0.168 992 j
#> ...
```

Data are retrieved from the WID webservice at https://rfap9nitz6.execute-api.eu-west-1.amazonaws.com/prod.

Many series are linearly interpolated between survey years. Pass include_extrapolations = FALSE to retain only directly observed data points.

metadata = TRUE attaches source and methodological documentation as the "wid_meta" attribute; the shape of the data frame is unchanged.
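
The two flags described above can be sketched as follows, using only parameters listed in the parameter table. The calls need the widr package and network access to the WID webservice. The indicator, country, and years are arbitrary choices for illustration, and the contents of the metadata attribute are not shown because the text does not specify its structure.

```r
library(widr)

# Keep only directly observed points (drop interpolated ones).
observed <- download_wid(
  indicators = "sptinc992j",
  areas = "US",
  perc = "p99p100",
  years = 2000:2022,
  include_extrapolations = FALSE
)

# Same query with source documentation attached as an attribute.
with_meta <- download_wid(
  indicators = "sptinc992j",
  areas = "US",
  perc = "p99p100",
  years = 2000:2022,
  metadata = TRUE
)

# The metadata travels as an attribute, so the columns are untouched.
names(with_meta)
str(attr(with_meta, "wid_meta"), max.level = 1)
```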
| Parameter | Default | Description |
|---|---|---|
| indicators | "all" | Variable codes |
| areas | "all" | ISO-2 country / region codes |
| years | "all" | Integer vector or "all" |
| perc | "all" | Percentile codes, e.g. "p99p100" |
| ages | "992" | Three-digit age code |
| pop | "j" | Population unit |
| metadata | FALSE | Attach source info as attr(., "wid_meta") |
| include_extrapolations | TRUE | Include interpolated points |
| cache | TRUE | Cache responses to disc |
| verbose | FALSE | Print progress messages |

wid_df is a plain data.frame subclass;
dplyr verbs and ggplot2 work without any unwrapping:
```r
library(dplyr)
library(ggplot2)

top1 |>
  wid_tidy(country_names = FALSE) |>
  filter(year >= 1990) |>
  ggplot(aes(year, value)) +
  geom_line(colour = "#58a6ff", linewidth = 0.9) +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(title = "Top 1% pre-tax income share - United States",
       x = NULL, y = NULL) +
  theme_minimal()
```

wid_tidy() coerces year to integer and value to double, and optionally appends indicator, series_type, type_label, and country_name columns.

wid_query() builds a query; wid_filter() updates it; wid_fetch() executes it. This is useful when iterating over parameter combinations or embedding queries in analysis pipelines.
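
The build / update / execute pattern can be illustrated in base R without touching the network. The sketch below is an assumption-laden stand-in: `new_query()`, `update_query()`, `run_query()`, and `describe_request()` are hypothetical names, not widr functions, and the executor is a stub that only describes the request. The signatures of `wid_query()`, `wid_filter()`, and `wid_fetch()` are not given in this text, so they are not used here.

```r
# Hypothetical helpers illustrating the build / update / execute pattern.
# They are NOT widr functions; the executor is a stub that makes no API call.
new_query <- function(...) structure(list(...), class = "sketch_query")

update_query <- function(query, ...) {
  # Return a modified copy; the original query object is left untouched.
  structure(modifyList(unclass(query), list(...)), class = "sketch_query")
}

run_query <- function(query, fetch) fetch(unclass(query))

# Stub executor: reports what would be requested instead of downloading.
describe_request <- function(params) {
  paste(params$indicators, params$areas, params$perc, sep = " / ")
}

base_query <- new_query(indicators = "sptinc992j", areas = "US", perc = "p99p100")

# Iterate over parameter combinations by updating one base query.
grid <- expand.grid(areas = c("US", "FR"), perc = c("p99p100", "p0p50"),
                    stringsAsFactors = FALSE)
requests <- vapply(seq_len(nrow(grid)), function(i) {
  q <- update_query(base_query, areas = grid$areas[i], perc = grid$perc[i])
  run_query(q, describe_request)
}, character(1))
requests
```

Keeping the base query immutable means each iteration starts from the same known state, which is the main reason to prefer query objects over repeating a long argument list.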
By default, all responses are cached to disc, keyed to the exact query parameters, and persist across sessions. wid_cache_list() and wid_cache_clear() manage the cache.
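
What "keyed to the exact query parameters" means can be sketched with a small base-R disk cache. This is not widr's implementation: `cache_key()`, `cached_fetch()`, and `fake_fetch()` are hypothetical, the hashing scheme is an assumption, and the example writes to a temporary directory, which does not survive the session (a persistent directory would be needed for that).

```r
# Hypothetical sketch of a parameter-keyed disk cache; not widr's code.
cache_key <- function(params) {
  params <- params[order(names(params))]   # argument order must not matter
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeLines(deparse(params), tmp)
  unname(tools::md5sum(tmp))               # hash of the deparsed parameters
}

cached_fetch <- function(params, fetch, dir) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  path <- file.path(dir, paste0(cache_key(params), ".rds"))
  if (file.exists(path)) return(readRDS(path))   # cache hit: no request made
  result <- fetch(params)
  saveRDS(result, path)                          # store the response on disk
  result
}

# Stub "download" that counts how often it is really called.
n_calls <- 0
fake_fetch <- function(params) {
  n_calls <<- n_calls + 1
  data.frame(country = params$areas, year = params$years)
}

cache_dir <- file.path(tempdir(), "wid-cache-sketch")
unlink(cache_dir, recursive = TRUE)              # start from an empty cache

a <- cached_fetch(list(areas = "US", years = 2000:2002), fake_fetch, cache_dir)
b <- cached_fetch(list(years = 2000:2002, areas = "US"), fake_fetch, cache_dir)
n_calls   # 1: the second call was served from the cache
```

Any change to a parameter value produces a different key, so a modified query triggers a fresh request rather than returning stale data.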
Monetary series (types a, m,
t) are in local currency at the prior year’s prices.
wid_convert() fetches the appropriate WID exchange-rate
series and divides in one step. Dimensionless series (types
s, g, etc.) pass through unchanged with a
message.
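
The division step described above can be sketched in base R. `convert_sketch()` is a hypothetical helper, not widr's `wid_convert()`; the rates are assumed to be expressed as local-currency units per one unit of the target currency, and the numbers are toy values rather than WID data.

```r
# Hypothetical helper, not widr's implementation. `rates` is assumed to hold
# local-currency units per one unit of the target currency, named by country.
convert_sketch <- function(df, rates) {
  type <- substr(df$variable, 1, 1)          # series type = first letter
  monetary <- type %in% c("a", "m", "t")
  if (!all(monetary)) message("Dimensionless series passed through unchanged.")
  rate <- unname(rates[df$country])          # look up each row's rate
  df$value[monetary] <- df$value[monetary] / rate[monetary]
  df
}

# Toy numbers for illustration only; these are not WID data.
toy <- data.frame(
  country  = c("AA", "BB", "BB"),
  variable = c("aptinc992j", "aptinc992j", "sptinc992j"),
  value    = c(100, 50, 0.2),
  stringsAsFactors = FALSE
)
toy_rates <- c(AA = 1, BB = 0.5)

converted <- convert_sketch(toy, toy_rates)
converted$value
```

The share row (type s) keeps its value of 0.2, while the two average-income rows are divided by their country's rate.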
```r
# Bottom 50% average income, four countries - convert to 2022 USD PPP
download_wid("aptinc992j", areas = c("US", "FR", "CN", "IN"), perc = "p0p50") |>
  wid_convert(target = "ppp", base_year = "2022")
```

Supported targets: "lcu" (no conversion), "usd", "eur", "gbp", "ppp", "yppp".

The inequality measurement helpers operate on data already in memory; no additional API calls are needed. A share (s) series with contiguous pXpY codes covering the full distribution is required.
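
Why the brackets must be contiguous and cover the full distribution becomes clear when a Gini coefficient is computed from bracket shares. The sketch below is a hypothetical base-R helper, not widr's `wid_gini()`, whose method is not described in this text. It assumes incomes are equal within each bracket, so the result is a lower bound, and the shares used are made-up toy values.

```r
# Hypothetical helper, not widr's wid_gini(): Gini coefficient from bracket
# shares, assuming incomes are equal within each bracket (a lower bound).
gini_from_shares <- function(percentile, share) {
  m <- regmatches(percentile, regexec("^p([0-9.]+)p([0-9.]+)$", percentile))
  if (any(lengths(m) != 3)) stop("percentile codes must look like pXpY")
  lower <- vapply(m, function(x) as.numeric(x[2]), numeric(1))
  upper <- vapply(m, function(x) as.numeric(x[3]), numeric(1))
  o <- order(lower)
  lower <- lower[o]; upper <- upper[o]; share <- share[o]
  n <- length(lower)
  # Contiguity: start at 0, end at 100, each bracket starts where the last ended.
  gaps <- c(lower[1], upper[-n] - lower[-1], upper[n] - 100)
  if (any(abs(gaps) > 1e-8)) stop("brackets must be contiguous and cover p0 to p100")
  pop <- c(0, upper) / 100                     # cumulative population share
  lorenz <- c(0, cumsum(share)) / sum(share)   # cumulative income share
  # Trapezoid rule: Gini = 1 - 2 * area under the Lorenz curve.
  1 - sum(diff(pop) * (head(lorenz, -1) + tail(lorenz, -1)))
}

# Toy shares for illustration only; these are not WID data.
g <- gini_from_shares(c("p0p50", "p50p90", "p90p100"), c(0.2, 0.4, 0.4))
g   # about 0.42
```

A missing or overlapping bracket would leave part of the Lorenz curve undefined, which is why the helper stops instead of guessing.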
All plot functions return ggplot objects and accept
additional layers:
```r
# Time series - one line per country; facet = TRUE for separate panels
wid_plot_timeseries(shares,
                    country_labels = c(US = "United States", FR = "France",
                                       DE = "Germany", CN = "China"))

# Cross-country bar chart for a single year
wid_plot_compare(shares, year = 2020)

# Lorenz curve
wid_plot_lorenz(dist, country = "US")
```

```r
library(widr); library(dplyr); library(ggplot2)

download_wid(
  indicators = "aptinc992j",
  areas = c("US", "FR", "CN", "IN"),
  perc = "p0p50",
  years = 1990:2022
) |>
  wid_convert(target = "ppp", base_year = "2022") |>
  wid_tidy(country_names = TRUE) |>
  ggplot(aes(year, value, colour = country_name)) +
  geom_line(linewidth = 0.8) +
  scale_y_continuous(labels = scales::dollar_format()) +
  labs(title = "Bottom 50% average pre-tax income",
       subtitle = "2022 USD PPP · equal-split adults 20+",
       x = NULL, y = NULL, colour = NULL)
```

| Function | Purpose |
|---|---|
| download_wid() | Download data; returns a wid_df |
| wid_decode() / wid_encode() | Parse or build variable codes |
| wid_validate() / wid_is_valid() | Validate code components |
| wid_search() | Keyword search across reference tables |
| wid_tidy() | Decode columns, coerce types |
| wid_convert() | Currency conversion |
| wid_metadata() | Retrieve source information |
| wid_gini() | Gini coefficient |
| wid_top_share() | Top fractile income / wealth share |
| wid_percentile_ratio() | Percentile ratio (e.g. P90/P10) |
| wid_plot_timeseries() | Time-series line chart |
| wid_plot_compare() | Cross-country bar / point chart |
| wid_plot_lorenz() | Lorenz curve |
| wid_query() / wid_filter() / wid_fetch() | Reusable query objects |
| wid_set_key() | Set API key |
| wid_cache_list() / wid_cache_clear() | Cache management |

Full code dictionary: vignette("code-dictionary") · wid.world/codes-dictionary