A selection of pre-SKA public data sets is available on an Amazon Web Services (AWS) Public Data Archive.
About the Data:
The data are observations with the Murchison Widefield Array (MWA), a Square Kilometer Array (SKA) precursor in Western Australia. This particular dataset is from the Epoch of Reionization (EoR) project, which aims to detect signatures of the first stars and galaxies forming and the effect of these early stars and galaxies on the evolution of the universe, a key science driver of the SKA. Nearly 2 PB of such observations have been recorded to date; this is a small subset that has been exported from the MWA data archive in Perth.
Location:
The data are in s3://mwapublic, which is an S3 bucket on AWS. The data are all in the uvfits format, which can be read using many radio astronomy analysis packages, including pyuvdata (https://pyuvdata.readthedocs.io). Also included are metafits files, which hold detailed telescope metadata in a FITS binary table format. Much of the metadata is also in the uvfits files, but the metafits files may contain more details of interest to the advanced user. Metafits files can be read using many FITS readers, including astropy.io.fits.
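As a rough illustration of reading both file types, the sketch below uses pyuvdata's UVData.read and astropy.io.fits.open. The function names and the returned fields are illustrative choices, not part of the archive, and the libraries are imported inside the functions only so that the snippet loads even where they are not installed.

```python
def summarize_uvfits(path):
    """Read a uvfits file with pyuvdata and return a few basic properties."""
    from pyuvdata import UVData  # imported lazily; requires pyuvdata

    uvd = UVData()
    uvd.read(path)  # pyuvdata works out the file type from the file itself
    return {
        "n_times": uvd.Ntimes,
        "n_baselines": uvd.Nbls,
        "n_freqs": uvd.Nfreqs,
        # freq_array is in Hz; min/max work regardless of its exact shape
        "freq_min_mhz": float(uvd.freq_array.min()) / 1e6,
        "freq_max_mhz": float(uvd.freq_array.max()) / 1e6,
    }


def summarize_metafits(path):
    """Open a metafits file with astropy and list its HDUs and table columns."""
    from astropy.io import fits  # imported lazily; requires astropy

    summary = []
    with fits.open(path) as hdul:
        for hdu in hdul:
            # Only binary-table HDUs carry named columns.
            if isinstance(hdu, fits.BinTableHDU):
                columns = list(hdu.columns.names)
            else:
                columns = []
            summary.append((hdu.name, type(hdu).__name__, columns))
    return summary
```

For a file already copied from the bucket, a call such as summarize_uvfits("example.uvfits") would return the summary dictionary; example.uvfits is a hypothetical file name.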
Detailed Data description:
The following is a high-level description of the files and folders in s3://mwapublic/uvfits:
- Total size 114.9 TB
- s3://mwapublic/uvfits/4.1
  - Observations from the 2013-2014 EoR observing season (high-band, EoR0 field) that were used in Beardsley et al. 2016 (10.3847/1538-4357/833/1/102)
  - 6804 observations lasting 2 minutes each
  - Data taken between August 2013 and April 2014
  - High-band data: frequencies 167-198 MHz
  - EoR0 Field: Centered on RA 0, Dec -27 degrees
  - Preprocessed with the Cotter package to remove RFI and instrument systematics
- s3://mwapublic/uvfits/5.1
  - Data taken for a survey of the diffuse emission above the horizon during EoR0 observations
  - 928 observations lasting 2 minutes each
  - Observations were taken in November 2015
  - Cover RA from -50 to 150 degrees and Dec from -55 to 5 degrees
  - Includes 35 unique pointings
  - High-band data: frequencies 167-198 MHz
  - Preprocessed with the Cotter package to remove RFI and instrument systematics
- s3://mwapublic/uvfits/5.1/SSINS
  - 917 observations (matching files in the enclosing folder)
  - Further processed with the SSINS package (https://github.com/mwilensky768/SSINS) to remove very faint RFI
- s3://mwapublic/uvfits/5.2
  - Observations from the 2013-2014 EoR observing season, including low-band and high-band and EoR1 and EoR2 fields
  - 6764 observations lasting 2 minutes each
  - Data taken between August 2013 and April 2014
  - High-band data: 167-198 MHz
  - Low-band data: 139-170 MHz
  - EoR1 Field:
  - EoR2 Field:
  - Preprocessed with the Cotter package to remove RFI and instrument systematics
- s3://mwapublic/metafits
  - Contains the “metafits” metadata files
  - Total size 380 MB
  - Subfolders 4.1 (43.6 MB), 5.1 (38.3 MB), and 5.2 (298.1 MB) contain the metafits corresponding to the data in the s3://mwapublic/uvfits subfolders
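The figures above can be turned into a quick estimate of observing time per folder, which may help when deciding how much to download. This is only a back-of-the-envelope sketch: it assumes every observation lasts exactly 2 minutes, and the variable and function names are illustrative.

```python
MINUTES_PER_OBSERVATION = 2

# Observation counts quoted in the folder descriptions above.
observation_counts = {
    "uvfits/4.1": 6804,
    "uvfits/5.1": 928,
    "uvfits/5.2": 6764,
}

# Metafits subfolder sizes in MB, as quoted above.
metafits_sizes_mb = {"4.1": 43.6, "5.1": 38.3, "5.2": 298.1}


def observing_hours(n_observations, minutes_each=MINUTES_PER_OBSERVATION):
    """Total observing time in hours for a number of fixed-length observations."""
    return n_observations * minutes_each / 60


hours_per_folder = {
    folder: observing_hours(count) for folder, count in observation_counts.items()
}
total_metafits_mb = round(sum(metafits_sizes_mb.values()), 1)
```

Under these assumptions the 4.1 folder holds about 226.8 hours of data, and the three metafits subfolders sum to the quoted 380 MB.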
Accessing the data on AWS:
First create an AWS access key:
- In the AWS console (upper right), <user> → My Security Credentials → Users
- Click on <user> → Security credentials
- You can only see the Secret Access Key when you make a new access key
- There is a limit of 2 access keys, so delete one and create a new one if you already have 2
- Copy the Access Key and Secret Access Key to a safe place; you can’t see the Secret Access Key again after it’s been created
Next, install the AWS CLI with apt-get install awscli or conda install awscli.
Then run aws configure and enter the following at the prompts:
- AWS Access Key ID = your Access Key
- AWS Secret Access Key = your Secret Access Key
- Default region name = us-east-1
- Default output format = leave blank
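To confirm that the configuration step took effect, one option is to inspect the files the AWS CLI writes by default, ~/.aws/config and ~/.aws/credentials. The sketch below is a minimal check using only the Python standard library; the function name is illustrative, and it assumes the default profile and the default file locations (which can be overridden with the AWS_CONFIG_FILE and AWS_SHARED_CREDENTIALS_FILE environment variables).

```python
import configparser
from pathlib import Path


def check_default_profile(aws_dir=None):
    """Return (region, has_keys) for the [default] profile.

    aws_dir defaults to ~/.aws, where `aws configure` stores its settings.
    Missing files are treated as "not configured" rather than as errors.
    """
    aws_dir = Path(aws_dir) if aws_dir else Path.home() / ".aws"

    config = configparser.ConfigParser()
    config.read(aws_dir / "config")  # silently skips a missing file

    credentials = configparser.ConfigParser()
    credentials.read(aws_dir / "credentials")

    region = config.get("default", "region", fallback=None)
    has_keys = credentials.has_option(
        "default", "aws_access_key_id"
    ) and credentials.has_option("default", "aws_secret_access_key")
    return region, has_keys
```

After following the steps above, check_default_profile() should return ("us-east-1", True). The function never returns or prints the key values themselves.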
To download data, use aws s3 cp s3://path/to/data path/to/destination
To list the contents of the bucket, use aws s3 ls s3://path/to/data
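For scripted access, the two commands above can be wrapped in a few lines of Python. The sketch below only assembles and runs the same aws s3 ls and aws s3 cp commands through subprocess; the helper names and the example file name are hypothetical, and it assumes the AWS CLI is installed and configured as described.

```python
import shlex
import subprocess

BUCKET = "s3://mwapublic"


def ls_command(prefix):
    """Build the command that lists a folder, e.g. prefix='uvfits/4.1'."""
    # The trailing slash makes `aws s3 ls` show the folder's contents.
    return ["aws", "s3", "ls", f"{BUCKET}/{prefix.strip('/')}/"]


def cp_command(key, destination, dryrun=False):
    """Build the command that copies one object to a local destination."""
    command = ["aws", "s3", "cp", f"{BUCKET}/{key.lstrip('/')}", destination]
    if dryrun:
        # --dryrun prints what would be copied without transferring data.
        command.append("--dryrun")
    return command


def run(command):
    """Echo a command, run it, and return its standard output."""
    print("$", shlex.join(command))
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, print(run(ls_command("metafits/4.1"))) lists that folder, and run(cp_command("metafits/4.1/example.metafits", ".", dryrun=True)) previews a copy; example.metafits is a placeholder, so take real names from the listing first.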