Data pipeline: Gapminder
Interactive demo: from raw data to insights in 4 steps. Data source: Gapminder.org
Step 1: Raw Data Ingestion
The raw data is loaded from Gapminder. Note the data quality problems: duplicate rows, null values, empty names, and invalid values (highlighted in red in the demo).
SQL: Data Ingestion

```sql
-- Step 1: Raw data ingestion
-- Source: gapminder.org (via github.com/jennybc/gapminder)
-- Years: 2002 and 2007 for all countries
SELECT country, continent, year,
       life_exp, population, gdp_per_cap
FROM gapminder_raw;
-- 51 rows loaded (46 real + 5 dirty rows)
```

51 rows loaded · 5 rows with issues

| # | country | continent | year | lifeExp | pop | gdpPercap |
|---|---|---|---|---|---|---|
| 1 | Sweden | Europe | 2002 | 80.04 | 8954175 | 29341.63 |
| 2 | Sweden | Europe | 2007 | 80.884 | 9031088 | 33859.75 |
| 3 | Norway | Europe | 2002 | 79.05 | 4535591 | 44683.98 |
| 4 | Norway | Europe | 2007 | 80.196 | 4627926 | 49357.19 |
| 5 | Denmark | Europe | 2002 | 77.18 | 5374693 | 32166.5 |
| 6 | Denmark | Europe | 2007 | 78.332 | 5468120 | 35278.42 |
| 7 | Germany | Europe | 2002 | 78.67 | 82350671 | 30035.8 |
| 8 | Germany | Europe | 2007 | 79.406 | 82400996 | 32170.37 |
| 9 | France | Europe | 2002 | 79.59 | 59925035 | 28926.03 |
| 10 | France | Europe | 2007 | 80.657 | 61083916 | 30470.02 |
| 11 | Finland | Europe | 2002 | 78.37 | 5193039 | 28204.59 |
| 12 | Finland | Europe | 2007 | 79.313 | 5238460 | 33207.08 |
| 13 | Japan | Asia | 2002 | 82 | 127065841 | 28604.59 |
| 14 | Japan | Asia | 2007 | 82.603 | 127467972 | 31656.07 |
| 15 | China | Asia | 2002 | 72.028 | 1280400000 | 3119.28 |
| 16 | China | Asia | 2007 | 72.961 | 1318683096 | 4959.11 |
| 17 | India | Asia | 2002 | 62.879 | 1034172547 | 1746.77 |
| 18 | India | Asia | 2007 | 64.698 | 1110396331 | 2452.21 |
| 19 | South Korea | Asia | 2002 | 77.045 | 47969150 | 19233.99 |
| 20 | South Korea | Asia | 2007 | 78.623 | 49044790 | 23348.14 |
| 21 | Bangladesh | Asia | 2002 | 62.013 | 135656790 | 1136.39 |
| 22 | Bangladesh | Asia | 2007 | 64.062 | 150448339 | 1391.25 |
| 23 | United States | Americas | 2002 | 77.31 | 287675526 | 39097.1 |
| 24 | United States | Americas | 2007 | 78.242 | 301139947 | 42951.65 |
| 25 | Brazil | Americas | 2002 | 71.006 | 179914212 | 8131.21 |
| 26 | Brazil | Americas | 2007 | 72.39 | 190010647 | 9065.8 |
| 27 | Canada | Americas | 2002 | 79.77 | 31902268 | 33328.97 |
| 28 | Canada | Americas | 2007 | 80.653 | 33390141 | 36319.24 |
| 29 | Mexico | Americas | 2002 | 74.902 | 102479927 | 10742.44 |
| 30 | Mexico | Americas | 2007 | 76.195 | 108700891 | 11977.57 |
| 31 | Argentina | Americas | 2002 | 74.34 | 38331121 | 8797.64 |
| 32 | Argentina | Americas | 2007 | 75.32 | 40301927 | 12779.38 |
| 33 | Nigeria | Africa | 2002 | 46.608 | 119901274 | 1615.29 |
| 34 | Nigeria | Africa | 2007 | 46.859 | 135031164 | 2013.98 |
| 35 | South Africa | Africa | 2002 | 53.365 | 44433622 | 7710.95 |
| 36 | South Africa | Africa | 2007 | 49.339 | 43997828 | 9269.66 |
Showing 36 of 51 rows
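
The four kinds of quality problem named in Step 1 (duplicate rows, null values, empty names, invalid values) can be flagged with a single query. The sketch below is one possible way to do it, not the demo's own implementation. It uses Python's built-in `sqlite3` with an in-memory `gapminder_raw` table. The first three rows are copied from the table above; the four dirty rows are hypothetical, invented for illustration because the demo's actual dirty rows are not among the 36 shown. The function name `flag_issues`, the validity thresholds, and the choice of `(country, year)` as the duplicate key are all assumptions.

```python
import sqlite3

# Rows 1-3 are taken from the table above. Rows 4-7 are HYPOTHETICAL dirty
# rows made up for this sketch; they are not the demo's real dirty rows.
ROWS = [
    ("Sweden", "Europe", 2002, 80.04, 8954175, 29341.63),
    ("Sweden", "Europe", 2007, 80.884, 9031088, 33859.75),
    ("Japan", "Asia", 2002, 82.0, 127065841, 28604.59),
    ("Sweden", "Europe", 2002, 80.04, 8954175, 29341.63),  # duplicate of row 1
    ("Japan", "Asia", 2007, None, 127467972, 31656.07),    # null life_exp
    ("", "Europe", 2007, 79.0, 5000000, 30000.0),          # empty country name
    ("Norway", "Europe", 2007, -5.0, 4627926, 49357.19),   # negative life_exp
]

# Each row gets at most one label; the first matching rule wins.
# The range checks (e.g. life_exp <= 120) are assumed thresholds.
ISSUE_QUERY = """
SELECT row_id, country, year, issue FROM (
    SELECT r.rowid AS row_id, r.country AS country, r.year AS year,
        CASE
            WHEN r.country IS NULL OR TRIM(r.country) = '' THEN 'empty name'
            WHEN r.life_exp IS NULL OR r.population IS NULL
                 OR r.gdp_per_cap IS NULL THEN 'null value'
            WHEN r.life_exp <= 0 OR r.life_exp > 120
                 OR r.population <= 0 OR r.gdp_per_cap <= 0 THEN 'invalid value'
            -- A later row with the same (country, year) counts as a duplicate.
            WHEN r.rowid > (SELECT MIN(r2.rowid) FROM gapminder_raw AS r2
                            WHERE r2.country = r.country
                              AND r2.year = r.year) THEN 'duplicate'
        END AS issue
    FROM gapminder_raw AS r
)
WHERE issue IS NOT NULL
ORDER BY row_id
"""


def flag_issues(rows):
    """Load rows into an in-memory gapminder_raw table and return flagged rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            """CREATE TABLE gapminder_raw (
                   country TEXT, continent TEXT, year INTEGER,
                   life_exp REAL, population INTEGER, gdp_per_cap REAL)"""
        )
        conn.executemany(
            "INSERT INTO gapminder_raw VALUES (?, ?, ?, ?, ?, ?)", rows
        )
        return conn.execute(ISSUE_QUERY).fetchall()
    finally:
        conn.close()


for row_id, country, year, issue in flag_issues(ROWS):
    print(row_id, repr(country), year, issue)
```

Labelling rows rather than deleting them keeps the raw table intact, so a later cleaning step can decide what to do with each kind of issue.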
Step 1 / 4