Welcome to our third newsletter. We try to keep the release schedule close to one month, not exceeding two, and balanced between being informative and not too chatty.

Apart from the regular project progress and IT news, there are several chapters on policies that will affect how observations are done and what is required to access data in the future. There is also a section on licensing issues for data that is no longer under embargo.

Taras Yakobchuk introduces the new tool he is developing for visualizing and analyzing calibrated GRIS/GREGOR data. The tool is intended not only to help experts analyze data offered by SDC but also to give access to laypersons who are not experts in dealing with this type of data.

We would like to encourage you to comment openly on any part. Feedback is always welcome and helps us to deliver a better product.


Editorial

📰 Editorial

Peter Caligari

Project Status

SDC Project Status 03-2021

Petri Kehusmaa

Solution Analysis and Design Phase steps.

The SDC project team has been working hard to find the best possible hardware and software components to build a robust platform for the solar community. The project has now entered a phase in which we are creating a detailed solution design, meaning that we have already identified many technical pieces that will be included in the final version of SDC. The team is now looking for the best ways to integrate these pieces. This involves a lot of investigation into technical details and testing of different scenarios.

📋 Summary 06 Apr

Regretfully, we are drifting toward a two-month release schedule rather than the initially envisioned one-month plan, as much work is going on in parallel.

Hopefully, you will still find this newsletter helpful and informative. The tools section is updated, and IT news is up to date. 

As always, we would like to encourage you to comment openly on any subject or raise new topics to which you think we do not pay enough attention in the context of SDC. Feedback is always welcome and helps us to deliver a better product.


Editorial


📰 Editorial


📜Aperio

Traditionally, data processing in solar physics is done on files. While this option will remain available in SDC, it is not the best way to deal with large data sets, where computations need to be done where the data resides and not vice versa. To interface with data in SDC programmatically, APIs are needed for the most common programming languages, such as Python and IDL.

We are pleased to have won Aperio Software to develop a Python API for SDC. Aperio Software is heavily involved in the development of SunPy and Astropy, community efforts to develop Python packages for solar physics and astronomy. Drew Leonard, one of the founders of Aperio, developed the prototype for the VTF pipeline. The contract is divided into a design phase and an implementation phase. During the former, Drew will clarify, through workshops and one-on-one meetings, what is expected from the future API and what requirements it must meet. Expect the first version in mid-2022.

Project Status


SDC Project Status 03-2021 (06.07.2021)


Petri Kehusmaa

Solution Development and Integration Phases.

Solution Development and Integration

The project has now shifted into a phase where we are building the actual SDC platform and creating/acquiring all necessary components. These components are in-house developed software for instrument pipelines and analysis, compute, network and storage hardware, middleware (RUCIO, Kubernetes, Docker, etc.), and governance/management/documentation software like Jira Service Management and Confluence.

There is still some work to be done to find all suitable solution components and thus shape the final scope of SDC. We aim to build SDC as a service platform for the solar community with a continuous focus on users and platform development.

📋 Summary

Current project health

Current project status

Project constraints

 

YELLOW

“Create Detailed Solution Design” phase in progress.

Finalizing some tasks for solution design and creating solution components. 

Governance model not finalized and implementation not started yet.

Resources and their availability.

Technology POCs taking more time than predicted.

 📊 Project status

Accomplishments

  • High-level solution design

  • Started collecting data policies

  • Clarified embargo policies

  • Listed essential use cases for SDC

  • Some software components created (GRIS Viewer)

  • The hardware acquisition process started

  • RUCIO test environment established

 

Next steps

  • Continue selecting and creating solution components and planning component integrations

 

Risks & project issues

  • Lack of resources

  • Resource availability

  • Multiple process implementations at the same time

  • No agreed governance model

Governance


👩‍⚖️ Policies, Frameworks & Governance

  • The ITIL v4 process model will be partially adopted for service management purposes

  • Data policies definition started

  • SDC governance model and scope to be decided

Products & Tools


🛠 SDC Products & Tools


Standardized GRIS Pipeline

Carl Schaffer

The GRIS reduction pipeline was merged to a common version in collaboration with M. Collados (IAC, GRIS PI). The versions running at OT and Freiburg now both produce data that is compatible with downstream SDC tools. The latest version of the pipeline can always be found on the KIS GitLab server. The current OT version will be synced to the ulises branch and merged into the main production branch periodically.

GRIS VFISV-Inversion pipeline

Vigeesh Gangadharan

Pipeline code for performing Milne-Eddington inversions of GRIS spectropolarimetric data is now available at:

https://gitlab.leibniz-kis.de/sdc/grisinv

Conferences & Workshops

📊 Conferences & Workshops

Nazaret Bello Gonzalez

Forthcoming Conferences/Workshops of Interest 2021

Every second Thursday, 12:30-13:30 CET

PUNCH Lunch Seminar (see SDC calendar invitation for Zoom links)

  • 11 Feb 2021: PUNCH4NFDI and ESCAPE - towards data lakes

  • 25 Feb 2021: PUNCH Curriculum Workshop

April week 12-16 (3 days, TBD)

ESCAPE WP4 Technology Forum 

June 01-02 (16:00 - 17:30)

15th International dCache Workshop

June 10-11

13th International Workshop on Science Gateways | IWSG 2021

Topics:

  • Architectures, frameworks and technologies for science gateways

  • Science gateways sustaining productive collaborative communities

  • Support for scalability and data-driven methods in science gateways

  • Improving the reproducibility of science in science gateways

  • Science gateway usability, portals, workflows and tools

  • Software engineering approaches for scientific work

  • Aspects of science gateways, such as security and stability

June 28, 2021:

Data-intensive radio astronomy: bringing astrophysics to the exabyte era

Topics: 

  • Data-intensive radio astronomy, current facilities and challenges

  • Data science and the exascale era: technical solutions within astronomy

  • Data science and the exascale era: applications and challenges outside astronomy

SDC participation in Conferences & Workshops

Nov. 26, 2020:

2nd SOLARNET Forum Meeting for Telescopes and Databases

Talk: Big Data Storage -- The KIS SDC case, NBG, PC & PK, 2nd SOLARNET Forum (Nov 26)
Nazaret Bello Gonzalez, Petri Kehusmaa, Peter Caligari


The pipeline uses the Very Fast Inversion of the Stokes Vector (VFISV, Borrero et al. 2011) code v5.0 (node for spectrograph data) as the main backend to carry out a Milne-Eddington Stokes inversion for individual spectral lines.
The current implementation of the pipeline is a Python MPI wrapper around the VFISV code that makes it easy to work with GRIS data. The inversion for the desired spectral line is performed by VFISV, and the buffer with the inversion results is passed to the Python module. The Python module propagates the keywords from level 1 (L1), packages the inversion results, and either writes a FITS file (when used through the command-line interface) or returns an NDarray (when called from within a Python script).
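As a rough illustration of the wrapper idea described above, the following sketch shows one way the pixels of a map could be divided among MPI ranks and the results gathered again. It is a single-process emulation under our own assumptions: split_pixels, invert_pixel and run_all are hypothetical names, invert_pixel is only a placeholder and not the VFISV algorithm, and none of this is taken from the grisinv code.

```python
def split_pixels(n_pixels, n_ranks):
    """Return one (start, stop) index range per rank, covering 0..n_pixels.

    Hypothetical helper: the first `n_pixels % n_ranks` ranks receive one
    extra pixel, so the load differs by at most one pixel between ranks.
    """
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    base, extra = divmod(n_pixels, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def invert_pixel(stokes):
    """Placeholder for the per-pixel inversion; NOT the VFISV algorithm."""
    return sum(stokes) / len(stokes)


def run_all(pixels, n_ranks):
    """Emulate scatter -> per-rank work -> gather inside a single process."""
    results = []
    for start, stop in split_pixels(len(pixels), n_ranks):
        # In a real MPI program each rank would process only its own slice.
        results.extend(invert_pixel(p) for p in pixels[start:stop])
    return results
```

In an actual MPI setting, each rank would work on its own slice and the root rank would collect the partial results; the sketch only shows the bookkeeping.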

For more information on installing and using the pipeline, check the above GitLab repository.

Please report any issues with the code using the link below:

https://gitlab.leibniz-kis.de/sdc/grisinv/-/issues/new?issue

SDC data archive

https://sdc.leibniz-kis.de/

Get access to data from GRIS/GREGOR and LARS/VTT instruments and the ChroTel full-disc telescope at OT.

Updates as of July 2021

  • The detail pages for observations have been reworked; see an example here:

    • Added dynamic carousel of preview data products

    • Added flexible selection for downloading associated data

  • VFISV inversion results have been added for most of the GRIS observations. The website now includes information on line-of-sight velocity and magnetic field strength.

  • The development process has been streamlined:

    • automated test deployments for quicker iterations and fixes

    • Changes to the UI will occur in regular sprints. We’re currently collecting ideas here

  • Added historic ChroTel data for 2013, thanks to Andrea Diercke from AIP for contacting us and providing us with this supplemental archive.

GRISView

Taras Yakobchuk

GRISView is a new visualization and analysis tool for working with calibrated GRIS/GREGOR datasets as distributed by the SDC website. It is written in Python, with a GUI built using the cross-platform Qt framework.

Currently implemented features include:

  • Quick panning and zooming of map images and spectra using mouse

  • Multiple POI (point-of-interest) and ROI (rectangle-of-interest) for easy inspection of spectral changes across the map

  • Distance measurement between multiple map points given in different units

  • Intensity profile plots along a given line segment, with the option to link several profiles for checking radial profiles

  • Interactive color bars used to view histogram, adjust image contrast, select and modify the viewing color scheme

  • Generating contours for map images, easy levels adjustment, and color setting

  • Browsing spectra with cursor moving using keyboard and mouse shortcuts, quick navigation using marker list

  • Relative scale for quickly evaluating wavelength differences at the cursor position

  • Viewing FITS file headers of observations

  • Support for both individual observations and time-series

Next, it is planned to add the following:

  • Exporting current spectra and map plots as images and data files

  • Derived quantities visualization e.g. Q/I, V/I, DOLP (degree of linear polarization) etc.

  • Various normalizations of spectra e.g. to a selected signal level, local continuum, quiet Sun

  • Spectral line fitting and line parameters determination

  • Saving and restoring working sessions
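For readers unfamiliar with the derived quantities named in the list above, here is a minimal sketch using the standard definitions: the normalized Stokes parameters Q/I, U/I and V/I, and the degree of linear polarization DOLP = sqrt(Q² + U²) / I. The function names are ours and purely illustrative; this is not GRISView code.

```python
import math


def normalized_stokes(i, q, u, v):
    """Return (Q/I, U/I, V/I) for a single spectral sample."""
    if i == 0:
        raise ValueError("Stokes I must be non-zero")
    return q / i, u / i, v / i


def dolp(i, q, u):
    """Degree of linear polarization: sqrt(Q^2 + U^2) / I."""
    if i == 0:
        raise ValueError("Stokes I must be non-zero")
    return math.hypot(q, u) / i
```

For example, a sample with I = 10, Q = 3 and U = 4 has a DOLP of 0.5.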

Feedback welcome

We strongly encourage all colleagues to try out this new tool and provide feedback. Instructions for installing and using the program can be found on the tool's GitLab page:

https://gitlab.leibniz-kis.de/sdc/gris/grisview

Please report any issues and bugs on the program GitLab page or using the direct link:

https://gitlab.leibniz-kis.de/sdc/gris/grisview/-/issues/new?issue

Conferences & Workshops


📊 Conferences & Workshops


Forthcoming Conferences/Workshops of Interest 2021

Every second Thursday, 12:30-13:30 CET (currently on summer break)

PUNCH Lunch Seminar (see SDC calendar invitation for zoom links)

KIS internal Typo3 Editors' training

July 13 & 14, 2021, 10:00 - 12:00 CEST (registration needed!)

SDC Collaborations


🤲 SDC Collaborations


 Nazaret Bello Gonzalez

SOLARNET https://solarnet-project.eu

KIS coordinates the SOLARNET H2020 Project that brings together European solar research institutions and companies to provide access to the large European solar observatories, supercomputing power and data. KIS SDC is actively participating in WP5 and WP2 in coordinating and developing data curation and archiving tools in collaboration with European colleagues.
Contact on KIS SDC activities in SOLARNET: Nazaret Bello Gonzalez nbello@leibniz-kis.de

 ESCAPE https://projectescape.eu/

KIS is a member of the European Science Cluster of Astronomy & Particle Physics ESFRI Research Infrastructures (ESCAPE H2020, 2019 - 2022) project, which aims to bring together people and services to build the European Open Science Cloud. KIS SDC participates in WP4 and WP5 to bring ground-based solar data into the broader astronomical VO and to develop tools for handling large solar data sets.

Contact on KIS SDC activities in ESCAPE: Nazaret Bello Gonzalez nbello@leibniz-kis.de

 

EST https://www.est-east.eu/

KIS is one of the European institutes strongly supporting the European Solar Telescope project. KIS SDC represents the EST data centre development activities in a number of international projects like ESCAPE and the Group of European Data Experts (GEDE-RDA).

Contact on KIS SDC as EST data centre representative: Nazaret Bello Gonzalez nbello@leibniz-kis.de

 

PUNCH4NFDI https://www.punch4nfdi.de

KIS is a participant (not a member) of the PUNCH4NFDI Consortium. PUNCH4NFDI is the NFDI (National Research Data Infrastructure) consortium of particle, astro-, astroparticle, hadron, and nuclear physics, representing about 9,000 scientists with a Ph.D. in Germany, from universities, the Max Planck Society, the Leibniz Association, and the Helmholtz Association. PUNCH4NFDI aims to set up a federated and "FAIR" science data platform, offering the infrastructures and interfaces necessary for the access to and use of data and computing resources of the involved communities and beyond. PUNCH4NFDI has been granted funds and will officially start its activities on October 1, 2021. KIS SDC aims to become a full member of PUNCH and federate our efforts on ground-based solar data dissemination to the broad particle and astroparticle communities.

Contact on KIS SDC as PUNCH4NFDI participant: Nazaret Bello Gonzalez nbello@leibniz-kis.de & Peter Caligari cale@leibniz-kis.de

 IT news


🖥 IT news


Peter Caligari

Ongoing & Future developments

Webpage

[KIS] The design of the new website is essentially complete. We are currently making some final technical adjustments to the web server and Typo3. The website is already running on the deployment (VMware) server at KIS and is publicly available at the web address:

https://newwww.leibniz-kis.de

After the content has been moved, the server will be renamed http://www.leibniz-kis.de, and the old site will be shut down.

One of the reasons for the relaunch was to improve support for the particular browsers used by people with disabilities. This requires specific fields in the back-end to be filled in so that the page content can be appropriately classified. We will hold a training course on handling the Typo3 back-end in general, with a focus on the above points, on

July 13 & 14, 2021, 10:00 CEST (Editors' training)

We currently plan to avoid any user login in the front end. This would allow us to do without cookies altogether, making those annoying GDPR popups unnecessary. However, this means we might not have any restricted areas on the website at all (including an Intranet)! This is a radical approach, and we might not be able to follow it through strictly (see below). In that case, the Intranet on the website will be limited to purely informational pages; any documents now downloadable on the old website should be migrated to the cloud (wolke7). In any case, Typo3 allows hosting multiple websites under a single installation, sharing the basic design and resources. Therefore, any websites requiring user registration and login (like the Intranet or a possible OT webpage) might be built as separate websites, keeping the publicly accessible website login-free.

Network

Status of the dedicated 10 Gbit line between KIS & OT

[KIS] [SDC] We are currently in the process of ordering the first storage node for SDC. This node consists of a DELL R740XD2 (2 CPUs, 24x16 TB disks, 10 Gbit Ethernet). The price (including VAT) per usable TB of storage is on the order of 160 €/TB.

We will use this machine as a test-bed for the technology envisioned for SDC and will already store there all raw data from observations in 2021 as well as the large simulation files accumulating on mars.

SDC will consist of at least 4 similar nodes. This is the first one. As soon as the remaining hosts are set up, we will move any data still on this first host to the new SDC cluster and then join the host to the cluster as well.

[SDC] In parallel, we are looking into outsourcing seldom-accessed files to the public cloud. Within the framework of the SDC, it is planned to use the latter mainly to flexibly cover short-term peaks in demand.

The costs per TB of storage space in the cloud are strongly dependent on capacity and, above all, the access pattern. They vary between approx. 60-200 €/TB/a. Access-independent models, in which only a fixed fee is charged per stored GB, but no fees for downloading or uploading, are at the upper end of this scale. At the lower end are public providers such as Amazon, Google and Microsoft, which charge a relatively high fee for each type of data access in addition to the (relatively cheap) price of simple storage.

Additionally, licence fees of a similar magnitude for the software that moves files between the cloud and the local storage at the KIS are required. 

We are currently obtaining concrete offers to outsource 100 TB for 1 year to a public cloud. The pricing models are so complicated that we can determine the resulting costs only through a limited real-world test. 
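To give a feeling for the order of magnitude, the following small sketch applies the quoted range of approx. 60-200 €/TB/a to the planned 100 TB for one year. It covers the storage price only: per-access fees (relevant at the lower end of the range) and the software licence fees mentioned above are deliberately left out, and the function name is ours.

```python
def storage_cost_range(capacity_tb, years, low_rate=60.0, high_rate=200.0):
    """Return (low, high) storage cost in EUR.

    Rates are in EUR per TB per year; the defaults are the approximate
    bounds quoted in the text. Access and licence fees are NOT included.
    """
    if capacity_tb < 0 or years < 0:
        raise ValueError("capacity and duration must be non-negative")
    volume = capacity_tb * years  # TB-years
    return volume * low_rate, volume * high_rate


low, high = storage_cost_range(100, 1)
print(f"{low:.0f} - {high:.0f} EUR")  # prints "6000 - 20000 EUR"
```

The wide spread between the two bounds is one reason why only a limited real-world test can show the actual costs.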

We will intentionally design the integration so that it will become apparent to all users which files are in the cloud and which are not. Although this is cumbersome (and artificially induced), we deem this awareness essential (at least initially, where we have no experience of the potential costs involved). The exact model is still to be worked out, and we will inform you about it again in due course. 

[OT] The two new nodes for jane have arrived at OT. The installation will be done as soon as either Peter Caligari can travel there or we can get a DELL technician up to the telescopes. Due to Covid-19, the time scale for this installation remains unclear. We will keep you informed.

[KIS] [OT] The missing network equipment for the end at KIS will be installed in the second week of July. We will then try to establish the link remotely from Freiburg with the help of personnel at the telescopes.

Test of (application) firewalls at KIS

[KIS] [OT] Firewall testing at KIS (see https://leibniz-kis.atlassian.net/l/c/rF8kmXjv ) has been completed. Two manufacturers are still being considered, and a final choice will be made as soon as possible.

We (IT) still strongly advocate high-availability setups for both KIS (in Freiburg) and OT: KIS because it will host a significant part of SDC, and OT because there are no trained personnel on-site and shipping replacements to the Canary Islands takes time.

Storage

[KIS] [SDC] We are currently setting up one DELL R740XD2 as a (fake) dCache cluster running two (redundant) dCache pools on VMware. This host serves as a testbed for simulating hardware and network failures in the future dCache cluster while providing a (hopefully) failure-tolerant net capacity of about 100 TB to KIS, alleviating the currently pressing storage shortage.

Starting in July, six more comparable hosts will be purchased through a public tender. These will have a similar setup and form storage Tier1 (near-line) of SDC at KIS. We expect the hosts to arrive in late September.

We use ZFS on virtualized Debian servers as the basis for the individual dCache nodes. ZFS uses copy-on-write, checksums all blocks on disk, and provides self-healing. Zpools will most probably use RAIDZ or RAIDZ2, and every file will reside on at least two different servers. At the time of writing, the only other file system offering similar features is BTRFS, but support for BTRFS was recently pulled from some major distributions (e.g. CentOS, the distro that has mainly been used at KIS so far).
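The idea behind checksumming combined with redundant copies can be illustrated with a toy example. The sketch below keeps two copies of every block plus a checksum recorded at write time; on read, a copy whose checksum no longer matches is repaired from the intact one. This is a conceptual model only, with names of our own choosing (MirroredStore, checksum); it does not describe how ZFS or dCache are actually implemented.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Return the SHA-256 hex digest of a block."""
    return hashlib.sha256(block).hexdigest()


class MirroredStore:
    """Toy two-copy block store; illustrative only, not ZFS or dCache."""

    def __init__(self):
        self.copies = [{}, {}]  # two replicas: block id -> bytes
        self.sums = {}          # block id -> checksum recorded at write time

    def write(self, block_id, data: bytes):
        self.sums[block_id] = checksum(data)
        for copy in self.copies:
            copy[block_id] = data

    def read(self, block_id) -> bytes:
        expected = self.sums[block_id]
        good = None
        # Find the first replica whose content still matches the checksum.
        for copy in self.copies:
            if checksum(copy[block_id]) == expected:
                good = copy[block_id]
                break
        if good is None:
            raise IOError(f"all copies of block {block_id!r} are corrupt")
        # Self-heal: overwrite any replica that fails verification.
        for copy in self.copies:
            if checksum(copy[block_id]) != expected:
                copy[block_id] = good
        return good
```

Silent corruption of a single copy is thus detected and repaired on the next read, whereas corruption of both copies is at least reported instead of returning bad data.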

[SDC] The 100 TB space on Microsoft Azure for cold data still needs some configuration. As of today, the third-party software responsible for moving files between our Isilon storage cluster (mars) and the cloud has problems doing so from Linux clients in a satisfactory way. The manufacturer of the software is working on the issue.

Current Resources

Compute nodes

hostname | # of CPUs & total cores | RAM [GB]
patty [KIS], legs & louie [KIS] (installed but not publicly available yet. Nearly there…) | 2 x AMD EPYC 7742, 128 cores | 1024
itchy & selma [KIS] | 4 x Xeon(R) CPU E5-4657L v2 @ 2.40GHz, 48 cores | 512
scratchy [KIS], quake & halo [KIS/seismo], hathi [OT] | 4 x Intel(R) Xeon(R) CPU E5-4650L @ 2.60GHz, 32 cores | 512

Central storage space

Total available disk space for /home (KIS, OT), /dat (KIS, OT), /archive (KIS), /instruments (OT)

name | total [TB, gross] | free [TB, gross]
mars [KIS] | 758 | 39
quake [KIS/seismo] | 61 | 0
halo [KIS/seismo] | 145 | 44.5
jane [OT] | 130 (-> 198) | 23
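To put the figures above into perspective, the following short sketch computes the share of free space per storage system. The numbers are copied from the table; for jane, the current total of 130 TB is used rather than the planned 198 TB. The helper name free_percent is ours.

```python
# (total, free) in TB gross, copied from the table above.
STORAGE = {
    "mars": (758, 39),
    "quake": (61, 0),
    "halo": (145, 44.5),
    "jane": (130, 23),  # current total; the planned 198 TB is not used here
}


def free_percent(total_tb, free_tb):
    """Return the free space as a percentage of the total capacity."""
    if total_tb <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * free_tb / total_tb


for name, (total, free) in STORAGE.items():
    print(f"{name}: {free_percent(total, free):.1f}% free")
```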

