Task 43 Open Datasets

Portal

Contents (values, duration)

Data’s public availability

Used in a published study?

Stakeholder 

Additional information

A2e The Data Access Portal for the U.S. Department of Energy, Wind Energy Technologies Office’s Atmosphere to Electrons (A2e) initiative contains datasets supporting its projects. Most project data are publicly available. Registration is required for data download. Some datasets have restricted access to project members only. A2e maintains a publications list and guidelines for citing the data are on the website. Government Data types include: archive data; source code; images; raw data; structured data; other data types; meteorological data; power generation data
Copernicus Open Data Copernicus houses the remote sensing data from several satellites and a global network of thousands of land, air, and marine-based sensors that create the most detailed pictures of Earth. Copernicus is the largest space data provider in the world, currently delivering 16 terabytes per day. The vast majority of data on Copernicus is free and open. A self registration is needed but the self-registration process is automatic and immediate. Government A maximum of 2 concurrent downloads per user is allowed in order to ensure a download capacity for all users
DEA Energy System Data The Danish Energy Agency (DEA) provides data that give an overview of the Danish energy supply system, including wind power. The Master Data Register of Wind Turbines is a national database which contains all Danish power-producing wind turbines > 6 kW. The Register has information on location, technical specifications and output for each wind turbine.  Time resolution: monthly.  Data go back to 2002. https://ens.dk/en/our-services/statistics-data-key-figures-and-energy-maps/overview-energy-sector Data can be directly downloaded. Government DEA also provides monthly energy production and consumption statistics.
DOE’s OEDI The Open Energy Data Initiative (OEDI) is a centralized repository of energy research datasets aggregated from the U.S. Department of Energy’s Programs, Offices, and National Laboratories.  Principal investigators (PIs) funded by DOE programs are required to submit the datasets related to their projects. As of the last access in March 2022, there were more than 5,600 submissions.  There is a wide variety of datasets, most of which are open access.  For example, the US Offshore Wind Resource Data for 2000-2019 is one of the submissions. Data can be directly downloaded. Government Some datasets are large.
DSWE datasets The datasets used in the book together with extension datasets, totaling 1 GB of data.  Most of them are ten-minute SCADA data, with multiple channels, multiple turbines and multiple years available.  More details are on a separate page. All data immediately available. For detailed information, please see the separate page. Academia Reuse permission: free to reuse provided the data sources/book are credited. Applicable to all data subsets.
DTU Wind Databank A large number of datasets and file collections from DTU Wind Energy. The content comprises two large groups: collections/online resources (usually .pdf files) and datasets (.nc and .txt files).  The datasets include SCADA data, turbulence measurements, turbine load measurements, participation data, resource data, among others. All data immediately available. After opening a dataset, the page shows links to related publications. Academia Reuse permission: after opening a dataset, the page shows the license for that specific dataset.  Nearly all the datasets are licensed under CC-BY.
ECMWF Datasets The European Centre for Medium-Range Weather Forecasts (ECMWF) operates the Copernicus Climate Change Service (C3S) on behalf of the European Union. ECMWF provides various open datasets.  Among them are the Copernicus Climate Data Store and the Copernicus Atmosphere Data Store, which house European and global climate data dating as far back as 1950. Most of the datasets are open access.  A free registration may be needed. Some related publications may be found here. Government 1. It is unclear how this data platform is related to the Copernicus Open Data. 2. NORA3 is the downscaling of ERA5, hosted on the Copernicus Climate Data Store.
EDP Open Data Sixteen datasets in the Open Wind Farm dataset.  For detailed information, see the separate page. Yes, but a sign-in is needed. Once registered and signed in, no further request is needed. Please see the references on the separate page. Industry Reuse permission:  The first 15 datasets follow the open reuse license (CC-BY-SA), but the reuse of the last dataset is restricted.
EIA Data The US Energy Information Administration (EIA) provides data for electric power plants, capacity, generation, fuel consumption, sales, prices and customers. Time resolution: monthly.  Historic data goes back to 1970. Data can be directly downloaded. Some analysis can be found at a companion page. Government This site includes data from all sorts of power generation. Wind power is just one of them.
ELEXON BMRS Data This data site provides operational data for the UK electricity market, including Electricity Data Summary, REMIT, Transparency Data, Transmission data, Generation data, Demand data (day-ahead forecast, renewables power forecast), and Balancing data. The data can be accessed using an API for near-real-time access. Free download of data, subject to a usage license. As this data is free to use, multiple studies have used the data, such as this example. Mixed, mainly industry This site includes a glossary and a software tool to help filter the data.
Elia Belgium Open Data Sixty datasets are arranged in six themes: balancing, transmission, power generation, congestion management, load, studies.  Specifically, the wind power generation data can be found through Link 1 and Link 2.  For the wind power data, the datasets include actual measured power and three sets of forecasts – most recent, day-ahead, and week-ahead. Data available from January 2017 up to present. Time granularity: 15 minutes. All data immediately available. This web page houses a number of studies and reports. Industry Reuse permission: Elia Open Data License is CC BY 4.0.
EMODnet EMODnet is a long-term, marine-data initiative funded by the European Maritime and Fisheries Fund. The map page lets the user discover and access data per theme, platform, recording age, depth, provider, area. For each connected dataset/platform, a dedicated platform page is available. These pages provide the user with metadata, plots, download features, platform products – e.g. monthly averages or wind plots – more info and links, as well as statistics on the use of the data from that platform. Data quality information is available in connection with datasets. Data can be directly downloaded. Research Users are advised to visit the summary information page of EMODnet.
ENERGYDATA.INFO This site houses 879 datasets relevant to the energy sector, including but not limited to wind energy.  This site is maintained by the World Bank Group. The datasets cover many different countries in the world.  They include wind speed measurements, wind power potentials, electricity transmission networks, solar irradiance, terrain/geo information data, hydropower, among other things. Most of the data are open and free to download.  Each dataset’s license is clearly stated. Four of the 879 datasets are labeled as “Not Open Data”.  Open datasets can be directly downloaded. Mixed stakeholders but mostly government Nearly 80% of the datasets are in one of the following data formats: xlsx/csv, GeoJSON, and SHP.
ENGIE Open Data The 10-minute environmental measurements and wind turbine power measurements from four 2MW class wind turbines on the La Haute Borne wind farm in north-eastern France. There are two data files, one from 2013 to 2016 and the other from 2017 to 2020. The turbine’s GPS information is also provided. The data file can be directly downloaded. Industry The meaning of the variable names in the data file is explained on the data description page.  The turbine information is on this page.
HErZ Reanalysis Data Hans-Ertel-Centre (HErZ) for Weather Research’s regional reanalyses provide quality‐controlled and homogenized data sets as a basis for the detection and assessment of regional climate change in past and future, the development of applications in various fields as well as the verification and calibration of impact models. Three reanalysis datasets are available on the web site, which have different spatial resolutions, and different space and time coverage. Two of the datasets (COSMO-REA2 and COSMO-REA6) can be directly downloaded.  To download the COSMO-REA12 dataset, one needs to register an account. These links provide references: Ref Page 1, Ref Page 2, and Ref Page 3. Academia File formats: GRIB 1 (COSMO-REA2 and COSMO-REA6), and GRIB 2 (COSMO-REA12).
IEEE DataPort Various datasets from medicine to energy systems After signing up to IEEE, any dataset is currently available (may change in the future, after finishing the test phase and defining cost models) Yes, but only trackable with high research effort Mainly research, but basically anybody with an IEEE account

1. Offers “online” data analysis based on cloud services (Amazon-based)

2. Anybody with an account can contribute datasets

3. Data challenges are offered

4. Any uploaded dataset and data analysis gets a DOI automatically

5. Once uploaded, a dataset can be connected to ORCID

Open-Power-System-Data

Various data packages about data in the energy sector including 1) time series of grid loading, wind and solar power production and prices; 2) weather data; 3) power capacity installed and more

All data is available for download immediately without registration:

1)  the data as csv-files

2)  metadata as json-file

The list of papers citing the data (which has a DOI) is provided on the website Research, but the datasets are licensed as open data, so also industry can use it, even for profit

1) External parties can also provide datasets to share on this platform under an open data license

2) DOI makes it easy to track the use of the data

3) They list further data platforms and projects: https://open-power-system-data.org/data-projects

KNMI Data Platform (KDP) This is the data platform of the Royal Netherlands Meteorological Institute (KNMI), the Dutch national weather service.  It currently houses 252 datasets, arranged in seven groups: precipitation, climate, wind, temperature, sunshine and radiation, weather forecast, and seismology and acoustics.  Time resolutions vary, from a few seconds to ten minutes. Each dataset’s reuse permission is clearly specified. The vast majority of the datasets are open. Please cross reference Dutch Offshore Wind Atlas website. Government Most of the datasets are in the NetCDF format, followed by the ASCII and HDF5 formats.  The three formats account for nearly 94% of the datasets on the platform.
Marine Data Exchange (MDE) The Crown Estate collected offshore environmental survey datasets from the offshore renewable and marine aggregates industries.  The datasets that are no longer commercially sensitive, labeled as “reached a Financial Investment Decision (FID)”, are made public on the MDE website. Wind resource data are not released until at least 2 years after the date of data collection and after the project has reached FID. The public data portal lists data in five survey schemes: marine aggregate, research, tidal, wave, and wind. The datasets on the public data portal are immediately available and free to download. For some datasets, there are accompanying reports. Industry A non-exclusive, non-transferable license is granted.  Also, permission is granted to download and reproduce in hard copy outputs and static digital formats (e.g. PDF, TIFF or jpeg files), not to be used in GIS applications (e.g. webGIS) which can be accessed from outside the licensee business or over an intra / internet.
MassCEC Metocean Data Initiative Two main categories of datasets: the Lidar Data and the Metocean Data. The Lidar data were collected by a vertical profiling WindCube LIDAR located in Massachusetts state waters one mile south of Martha’s Vineyard and near federal offshore wind energy areas. The Metocean data were measured by sensors installed directly onto an Air-sea Interaction Tower, which include: cup anemometers measuring wind speed and direction, temperature, relative humidity, water temperature, air pressure, acoustic doppler current profiler. The data availability is from May 5, 2017 through August 31, 2020 for Lidar and March 3, 2017 through Sept 1, 2020 for Metocean. One needs to log in to an FTP site to view and download the data.  Both the FTP (web access) link and the username/password for logging in are provided on the MassCEC Metocean Data Initiative website. After logging in, reports are available on the same FTP site. Academia/Industry After logging in, the raw data are in the folder named “Station_Data”.  The datasets are in text or Excel formats.
Met Éireann Open Data Met Éireann, Ireland’s National Meteorological Service, is the leading provider of weather information and related services in Ireland. Its mission is to monitor, analyze and predict Ireland’s weather and climate and to provide a range of high quality meteorological and related information to the public and to specific customers.  The site houses 2,000 open datasets, including historical measurements, current observations and forecasts. Datasets can be directly downloaded. Government The most popular datasets include: Weather Warnings; Met Éireann Weather Forecast API; Met Éireann Live Text Forecast Data; MÉRA (Met Éireann ReAnalysis) Climate Reanalysis; Latest Observations; and Rainfall Radar.
NASA Earth Data NASA’s Goddard Earth Sciences Data and Information Systems Center (GES-DISC) is one of 12 NASA Science Mission Directorate Data Centers that provide Earth science data, information, and services. This site archives data sets applicable to several NASA Earth Science Focus Areas including: Atmospheric Composition, Water & Energy Cycles, and Climate Variability. As of the last access, the site archives over 149 million files with a total size of over 3,000 TB. It has distributed over 3 billion files, with a data volume of over 34,000 TB. A free account registration is required for accessing and downloading data. Government One may consult scientists and data specialists via the Earthdata User Forum.
National Data Buoy Center The US National Data Buoy Center maintains a current database of meteorological and hydrological data, historical data, and written information generated by the National Weather Service or received from other official sources. Typical data include wind speed, air temperature, sea temperature, sea level pressure, peak wind, wind gust, significant wave height, average wave period, and dominant wave period. Data history dates back to the 1990s.  Availability in time and data type varies for different buoy stations. Most datasets are directly downloadable. Government To access the data, go to the interactive map.  Then, click on a Buoy Center.  From there, either click on “View Detail” or “View History”.  Some Buoy Centers do not have “View History”.
National Institute of Wind Energy (India) Data collected by a pulsed Offshore LiDAR that was placed near a 120 m meteorological mast located at Wind Turbine Test Station, Kayathar, Tirunelveli district, Tamilnadu, India.  Two years of data are available, from Dec 2017 through Nov 2019, as 10-minute average data, for a total of 33 GB. Yes, the data can be directly downloaded. There are reports on the same website. Academia and Government Two large downloads, each for a year’s worth of the offshore LiDAR data.
NCEI Climate Forecast Data National Centers for Environmental Information (NCEI) provides access to near real-time historical model data, while real-time data are available on NCEP servers through the Climate Forecast System (CFS) 7-day rotating archive.  Temporal resolution: hourly, spatial resolution: half degree (approximately 56 km).  The Operational Forecasts Time Frame is April 1, 2011—Present and the historical Reforecasts Time Frame is Dec. 12, 1981—March 2011. Data sets can be directly downloaded. Saha et al. 2010; Saha et al. 2014. Government File Format: GRIB2 and/or BUFR.
NORA3 The 3 km Norwegian reanalysis (NORA3) data is a freely available high-resolution data set, to be used for offshore wind resource assessment and wind power estimates. NORA3 is a high-resolution atmospheric dynamic downscaling of the ERA5 data from ECMWF. The downscaling of ERA5 is performed by the NWP model HARMONIE-AROME (H-A). Data are dated back to 1984. Time resolution is every 6 hours. Data can be directly downloaded Solbrekke et al. 2021 Academia The data format is .nc
NREL Wind Integration Data National Renewable Energy Lab (NREL)’s wind integration data sets provide the energy industry with a consistent set of wind profiles for the United States. There are two sets of data.  The Eastern and Western Wind Integration Data Sets have the ten-minute time-series wind data for 2004, 2005, and 2006, and the Wind Integration National Data Set Toolkit provides an update and expansion of the Eastern Wind Integration Data Set and Western Wind Integration Data Set that includes meteorological conditions and turbine power for more than 126,000 sites in the continental United States for the years 2007–2013. Data can be directly downloaded. Follow the instructions on the webpage. References can be found at Study 1, Study 2, and this page. Government Wind Integration Data and Tools is part of the datasets and tools of the Grid Modernization effort of NREL.
NWTC Information Portal This portal has three datasets: the airfoil datasets from the Ohio State University wind tunnel tests, the Met datasets from the 135-meter turbine inflow research towers at NREL’s Flatirons Campus, and the Metocean distribution parameters for three representative sites (a West Coast site, an East Coast site and a Gulf of Mexico site).  The Met data can be traced back to September 23, 1996.  Data collection is ongoing, and the data are updated with a one-day delay. The data resolution between Sept 23, 1996 and Aug 23, 2001 is 10 minutes (and also hourly), and the data resolution from Aug 24, 2001 to present is 1 minute (and also hourly). Yes, the data can be directly downloaded. For the airfoil data, a report accompanies each dataset.  The Metocean distribution parameter data are used in Stewart et al. (2015). Government The airfoil data and Metocean parameter data are readily available from the NWTC Portal website. For the Met data, one needs to go to the Measurement and Instrumentation Data Center website and click on the “Daily plots and raw data files” link. The direct link on the NWTC Portal website directs one to a graphic display.
NYSERDA Floating LiDAR Buoy Data Data from two floating LiDAR buoys deployed in the New York Bight: Hudson South and Hudson North.  Data resolution: ten minutes and hourly. Datasets can be directly downloaded.  An email address is asked for but providing it is optional. The same website lists DNV reports and monthly data reports. Industry When the website was accessed on March 6, 2022, the raw datasets listed were only for March 1, 2022 (both Hudson North and Hudson South) and Sept 21, 2021 (only for Hudson North).
PANGAEA Data Sharing Platform PANGAEA is a public data sharing platform.  In spirit, it is similar to Zenodo, but PANGAEA concentrates on data related to earth and environmental science. At the time of access (March 6, 2022), it houses a total of 409,494 datasets.  The best way to search is to click “Search” in the upper-right corner of the website.  The available datasets are arranged by topics and locations.  Two examples are a Doppler Wind-LiDAR dataset at the offshore wind farm “Global Tech I” in the German North Sea and a dataset of vertical profiles of wind speed and wind direction from 40 to 500 m at Braunschweig, North German Plain, Lower Saxony, Germany. Almost all datasets are directly downloadable. Some of the papers that used the datasets are listed on the dataset webpage. Mixed, mostly academia. Permission is specified for each dataset. Most datasets are granted the public access license CC-BY.
RVO Offshore Wind Farm Data The Netherlands Enterprise Agency (RVO) collected the site data for the following wind farm zones in the North Sea: Borssele, Hollandse Kust (zuid), Hollandse Kust (noord), Hollandse Kust (west), Ten noorden van de Waddeneilanden (TNW) and IJmuiden Ver. For each wind farm zone, the data are organized in four aspects: General information: Introduction to the wind farm zone, Project and Site Description, Maps, GIS Viewer, Revision Log and Q&A Log; Obstructions: UXO and Archaeology; Soil: Geological desk study, geophysical and geotechnical surveys and morphodynamics; Wind & Water: Wind Resource Assessment, Metocean desk study and Metocean measurement campaign. Data can be directly downloaded. Companion reports and presentations can be found at Page 1 and Page 2. Government An overview of the wind farm zones is available here.
Spain Hourly Energy Generation & Forecasting Data This dataset contains 4 years of hourly data for electrical consumption, generation, pricing, and weather in Spain and the respective forecasts by the Transmission Service Operator (TSO).  The time span is from Jan 1 2015 through Dec 31 2018. One needs to sign in to Kaggle and then download the datasets.  A public use license is granted. There are codes, discussions and comments on the same website. Individual There are two data files, both in csv format.
Tall Tower Datasets The Tall Tower Dataset is an open initiative aiming to boost the utilization of hub-height (around 100 meters above ground) wind observations. The data resolution varies, from 1-minute to hourly. The datasets come from 181 different tower sites all over the world, some of which may overlap with the other open data sources listed in this table. The datasets include five principal measurements: wind speed, wind direction, temperature, relative humidity, pressure. Most of the datasets can be directly downloaded. The reuse permission is CC BY-NC 4.0 (no commercial use). Publications that used or benefited from the datasets are listed on the website. Academia

1. A note on Technical Information about the datasets is available, providing much needed background and descriptive information.

2. The data format used on this website is NetCDF (.nc).

Turkish Wind Turbine Data A single turbine in Turkey (precise location unknown), 10-minute data, five data channels: Date/Time, Active Power, Wind Speed, Anticipated Power using OEM’s power curve, and Wind Direction.  The data cover the year 2018. There is a total of 50,530 data records, about 96% of a whole year of ten-minute data. One needs to sign in to Kaggle to download the data.  Signing in to Kaggle is free. There is a code board and a discussion board associated with the data in Kaggle. Unknown This is a relatively small dataset, only 4 MB in csv format.
UCSD Coastal Data Coastal Data Information Program (CDIP) is an extensive network for monitoring waves and beaches along the coastlines of the United States.  Time resolution is 30 minutes. Data are dated back to 1975. Datasets are organized by station, which is defined as a location where CDIP maintains sensors and collects wave and climatological data. These stations are named according to their geographic locales. Data can be directly downloaded from an HTTP server. Data can also be accessed programmatically by Python. There are CDIP models on the website. Academia The data format is netCDF.
Windcube 400S Dynamic Scanning Data Set Radial wind velocity measurements from a scanning-head Doppler lidar (Windcube 400S) collected over a year-long measurement campaign at the Mount Mercer wind farm (Western Victoria, Australia). All timestamps in the data set are expressed in UTC. The raw data is in a zip file of 57GB. One single zipped file that is directly downloadable. Used in an Energies paper by Pichault et al. (2021). Academia Downloading takes a long time.
Wind Turbine Power Performance Data The 10-minute environmental measurements and wind turbine power measurements for the whole year of 2017.  The wind turbine is a 2.5 MW Clipper Liberty turbine.  There is an associated 130 m met mast which spans the entire vertical diameter of the turbine rotor. The data file has 52,416 data rows (plus a header row) and 726 columns for the environmental and power variables (plus two timestamp-related columns). The data were obtained at the Eolos Wind Research Station operated by the University of Minnesota. The data file can be directly downloaded. Brian Davison’s Thesis (2019). Academia The on-site file, structure.csv, provides the explanation of the 726 environmental and power variables in the raw data file.
Zenodo Data Sharing Platform

Similar to IEEE DataPort, Zenodo houses a large number of wind or wind energy related datasets.  These datasets are not centrally organized, and have to be searched for individually.  Examples include Australian Energy Market Operator, Data Science for Wind Energy Community (some overlap with other datasets mentioned above), HomogWS-se, Kelmarsh wind farm data, MAR-ERA5MERIDA, NEWA, Pays d’Othe (France) Wind Farm, Penmanshiel wind farm data, Sisal-Yucatán vertical wind, and the UEPS and UEBB datasets.

New additions:

Björkö Wind Turbine Version 1 (45kW) high frequency Structural Health Monitoring (SHM) data.

All data immediately available. Many datasets are associated with a publication, which used that specific dataset in the study.  For instance, the UEPS and UEBB datasets are used in the study of Barber and Nordborg (2020). Mixed stakeholders, as data contributors are from many different sectors (government, industry, academia, individuals). At Zenodo, one can do a dataset search using different keywords or keyword combinations.  For instance, if one uses “wind” to search, one gets this list.  If one uses “wind lidar”, one gets this list.
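
Several entries in this table quote a record count alongside a fixed time resolution, for example the Turkish Wind Turbine Data (50,530 ten-minute records for 2018, described as about 96% of a full year). A minimal sketch of how such a coverage figure can be checked is given below. The function names are hypothetical and chosen only for illustration, and the calculation assumes a regular sampling step with no duplicated timestamps, which has not been verified against the data files.

```python
from datetime import datetime, timedelta


def expected_records(start: datetime, end: datetime, step_minutes: int = 10) -> int:
    """Number of fixed-step timestamps in the half-open interval [start, end)."""
    return int((end - start) / timedelta(minutes=step_minutes))


def coverage(actual_records: int, start: datetime, end: datetime,
             step_minutes: int = 10) -> float:
    """Fraction of the expected timestamps that are present (assumes no duplicates)."""
    return actual_records / expected_records(start, end, step_minutes)


# Calendar year 2018 at ten-minute resolution (365 days x 144 records per day).
year_2018 = (datetime(2018, 1, 1), datetime(2019, 1, 1))
print(expected_records(*year_2018))                  # 52560
print(round(100 * coverage(50530, *year_2018), 1))   # 96.1
```

The same check applied to the 52,416 rows reported for the 2017 Wind Turbine Power Performance Data gives roughly 99.7% coverage, since 52,416 records correspond to 364 full days of ten-minute data.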

Contact us

sarah.barber@ost.ch

shawn.sheng@nrel.gov
