SDC got funded

Svetlana Berdyugina

The establishment of the Science Data Center (SDC) as the third strategic pillar of KIS is an important development for the institute because the SDC connects the two other pillars, fundamental research and observatory operation. Establishing the SDC was identified as one of the strategic goals of KIS as early as 2013, and it was one of the recommendations of the Leibniz Evaluation in 2015. Our funding application to the national Joint Scientific Conference (GWK) for the strategic expansion of KIS dedicated to the SDC was finally approved on November 13, 2020. From 2021, 1.4 M€ per year will be allocated to the SDC. This will lead to a noticeable increase in staff, an innovative computing infrastructure, and broader scientific collaborations in the areas of scientific computing, big data, and artificial intelligence. In addition to enhancing scientific activities at KIS, the SDC will become a valuable big-data infrastructure in Germany and beyond.

📸 Introduction to the SDC Team

The tasks at the start of SDC are manifold. They range from establishing standards and policies, curating and injecting data from the OT and other sources, setting up the necessary IT infrastructure, to creating a new project plan in record time.

As of January 2021, this challenge is being met by twelve members with overlapping responsibilities and expertise (at least two more engineers are expected to join in the course of 2021). Strategic decisions are taken by the board, consisting of Svetlana Berdyugina as PI, Petri Kehusmaa for management & project planning, Nazaret Bello Gonzalez leading the scientific efforts in SDC, and Peter Caligari as overall project head & responsible for IT.

Name (% FTE in SDC)

Responsibility

Svetlana Berdyugina (10%) [Board, Science]

Project PI, internal & external collaborations, high-level data

Janek Beck (100%) [devops]

IT specialist trainee, application and pipeline development

Nazaret Bello González (50%) [Board, Science, Data]

Project Lead Scientist, data calibration, curation & dissemination, high-level data, internal and external scientific collaborations, workshops and training events

Peter Caligari (80%) [Board, IT, Technology]

Project head, technology development and realization, compute & data infrastructure, network, budget & personnel responsibility, head of IT

Vigeesh Gangadharan (60%) [Science, Data, devops]

High-level data tools development

Andriy Gorobets (50%) [Science, Data, devops]

Data curation & analysis, high-level data development

Marco Günter (20%) [IT, Technology, devops]

System IT, compute, infrastructure and data servers, network

Petri Kehusmaa (100%) [Board, Technology]

Project manager & senior systems architect, project plan development, governance, risk assessment

Markus Knobloch (40%) [IT, Data, devops]

OT interfaces and data injection (hardware side), metadata standards, non-scientific IT, instructor for trainees

Sophie Müller (100%) [devops]

IT specialist trainee, application and pipeline development

Carl Schaffer (100%) [Data, Technology, devops]

System architect, lead software developer, OT data injection (software side), ML, AI, web applications

Taras Yakobchuk (50%) [Science, Data, Technology, devops]

System architect, high-level data & data management

[devops] We plan to outsource the development of APIs (Application Programming Interfaces) for access to science-ready data (L1 onwards) to Drew Leonard from Aperio, the software development firm that designed and implemented the prototype of the VTF pipeline in Python. API development will begin with Python, followed later by IDL and possibly other languages.

We are currently drawing up a contract and expect Aperio to begin as soon as possible. The development will be split into a design phase and an implementation phase. The design phase will require input from scientists who intend to use the SDC, so that the product benefits future users as much as possible. We expect the first version of these APIs to be ready for the initial launch of SDC at the end of 2021.

Peter Caligari


SDC Project Status 02-2021


Petri Kehusmaa

The SDC project kick-off was held in October 2020. Since then, we have come a long way. We have held a workshop with scientists to gather requirements for this new service platform. We have also tested many different technology options as candidate building blocks for the future infrastructure.

We have now entered the Solution Analysis and Design phase. In this phase we will evaluate different conceptual design components, select those that go into our final blueprint, and implement them later this year.

The project team has also deployed Atlassian Jira for service management and documentation. In the future, Jira Service Management will serve SDC as a “Help Desk” ticketing system and knowledge base for platform users. It will provide automated workflows, a support portal, a feedback system and many more new services for the community.

[Diagram: Solution Analysis and Design phase steps]

📋 Summary

Current project health: Green

Current project status: Solution Analysis and Design in progress. The project is slightly behind schedule.

Project constraints: Resources. Technology POCs are taking more time than predicted.

📊 Project status

Accomplishments

  • Workshop

  • Technology POCs

  • Requirements validation

  • Atlassian Jira deployment for the project team

Next steps

  • Solution Analysis and Design

  • Define data policies

  • Define embargo rules

Risks & project issues

  • Lack of resources

Definition & Status of the Project Requirements

In addition to political, technical and security requirements, the workshops in late 2020 helped us clarify the expectations that scientists from different disciplines have of the SDC. We continuously update a list of all identified demands. The list includes each requirement's current priority, the person responsible (if one has been assigned), and its current status.

/wiki/spaces/SP/pages/124813373


🛠 SDC Products & Tools


Nazaret Bello Gonzalez

SDC data archive

http://sdc.leibniz-kis.de:8080/

Get access to data from GRIS/GREGOR and LARS/VTT instruments and the ChroTel full-disc telescope at OT.

Speckle reconstruction 

https://gitlab.leibniz-kis.de/sdc/speckle-cookbook

This tutorial helps the user run KISIP (Wöger & von der Lühe, 2008) on her favourite BBI and/or HiFI imaging data.
Contact: Vigeesh Gangadharan (vigeesh@leibniz-kis.de)

Coming soon: 

A Jupyter Notebook by Vigeesh Gangadharan to assist the user with VFISV inversions of GRIS data, including features such as wavelength calibration. Stay tuned!


📊 Conferences & Workshops


Nazaret Bello Gonzalez

Forthcoming Conferences/Workshops of Interest 2021

Feb 11

PUNCH4NFDI Open Data Workshop (Registration required!)

Feb 8-12

ESCAPE School:  First Science with interoperable data

The Virtual Observatory (VO) is opening new ways of exploiting the huge amount of data provided by the ever-growing number of ground-based and space facilities, as well as by computer simulations. The goal of the school is twofold:

  • Expose participants to the variety of VO tools and services available today so that they can use them efficiently for their own research.

  • Gather requirements and feedback from participants.

Every second Thursday, 12:30-13:30 CET

PUNCH Lunch Seminar (see SDC calendar invitation for Zoom links)

  • 11 Feb 2021: PUNCH4NFDI and ESCAPE - towards data lakes

  • 25 Feb 2021: PUNCH Curriculum Workshop

April, week of 12-16 (3 days, TBD)

ESCAPE WP4 Technology Forum 

June 10-11

13th International Workshop on Science Gateways | IWSG 2021

Topics:

  • Architectures, frameworks and technologies for science gateways

  • Science gateways sustaining productive collaborative communities

  • Support for scalability and data-driven methods in science gateways

  • Improving the reproducibility of science in science gateways

  • Science gateway usability, portals, workflows and tools

  • Software engineering approaches for scientific work

  • Aspects of science gateways, such as security and stability

June 28, 2021:

Data-intensive radio astronomy: bringing astrophysics to the exabyte era

Topics: 

  • Data-intensive radio astronomy, current facilities and challenges

  • Data science and the exascale era: technical solutions within astronomy

  • Data science and the exascale era: applications and challenges outside astronomy

SDC participation in Conferences & Workshops

Nov. 26, 2020:

2nd SOLARNET Forum Meeting for Telescopes and Databases

Talk: Big Data Storage -- The KIS SDC case, Nazaret Bello Gonzalez, Petri Kehusmaa & Peter Caligari, 2nd SOLARNET Forum (Nov 26)


🤲 SDC Collaborations


 Nazaret Bello Gonzalez

SOLARNET https://solarnet-project.eu

KIS coordinates the SOLARNET H2020 Project that brings together European solar research institutions and companies to provide access to the large European solar observatories, supercomputing power and data. KIS SDC is actively participating in WP5 and WP2, coordinating and developing data curation and archiving tools in collaboration with European colleagues.
Contact on KIS SDC activities in SOLARNET: Nazaret Bello Gonzalez nbello@leibniz-kis.de

 ESCAPE https://projectescape.eu/

KIS is a member of the European Science Cluster of Astronomy & Particle Physics ESFRI Research Infrastructures (ESCAPE H2020, 2019 - 2022) Project, which aims to bring together people and services to build the European Open Science Cloud. KIS SDC participates in WP4 and WP5 to bring ground-based solar data into the broader Astronomical VO and to develop tools to handle large solar data sets.

Contact on KIS SDC activities in ESCAPE: Nazaret Bello Gonzalez nbello@leibniz-kis.de

 

EST https://www.est-east.eu/

KIS is one of the European institutes strongly supporting the European Solar Telescope project. KIS SDC represents the EST data centre development activities in a number of international projects like ESCAPE and the Group of European Data Experts (GEDE-RDA).

Contact on KIS SDC as EST data centre representative: Nazaret Bello Gonzalez nbello@leibniz-kis.de

 

PUNCH4NFDI https://www.punch4nfdi.de

KIS is a participant (not a member) of the PUNCH4NFDI Consortium. PUNCH4NFDI is the NFDI (National Research Data Infrastructure) consortium of particle, astro-, astroparticle, hadron and nuclear physics, representing about 9,000 scientists with a Ph.D. in Germany, from universities, the Max Planck Society, the Leibniz Association, and the Helmholtz Association. PUNCH4NFDI aims to set up a federated and "FAIR" science data platform, offering the infrastructures and interfaces necessary for access to and use of the data and computing resources of the involved communities and beyond. PUNCH4NFDI is currently competing with other consortia to be funded by the DFG (final response expected in spring 2021). KIS SDC aims to become a full member of PUNCH and to federate our efforts on ground-based solar data dissemination with the broader particle and astroparticle communities.

Contact on KIS SDC as PUNCH4NFDI participant: Nazaret Bello Gonzalez nbello@leibniz-kis.de & Peter Caligari cale@leibniz-kis.de

 


🖥 IT news


Peter Caligari

Ongoing & Future developments

Webpage

[KIS] Our website is based on a 5-year-old version of Typo3, which poses an increased security risk. Furthermore, the current design meets only the most basic requirements for accessibility for disabled people and for presentation on mobile devices with small displays, both of which the Ministry expected us to consider and implement.

Therefore, at the beginning of 2020, KIS-IT planned a relaunch based on a commercially available template for Typo3. However, this had to be stopped because of the increasing workload on IT from SDC and because there were only two people in IT last year.

KIS therefore decided to outsource the relaunch. The aforementioned template will still be the basis, but an external web agency will do the design and initial implementation. We have the first design drafts and are currently streamlining our site tree. Feel free to drop by anytime if you are interested in the process.
Costs for the relaunch are of the order of 13 k€.

Network

Dedicated 10 Gbit line between KIS & OT

[KIS, OT] We are currently setting up a dedicated 10 Gbit line between the OT and the KIS (technically, we get a wavelength in a multiplexed dark fibre, a so-called Lambda).

On Spanish territory, the line is completely free of charge. Nevertheless, the (throughput independent) annual costs for the remaining distance amount to approx. 45 k€ (incl. VAT). Additionally, there are one-time set-up costs of about 30 k€. The costs are

We expect a significantly lower latency on the new line than on the existing internet connection (which will not be affected in any way).
Therefore, we will mainly use it for remote observations and data transport within the framework of the SDC.
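
To give a feeling for what a 10 Gbit line means for data transport, the following minimal sketch estimates the ideal transfer time for a given data volume. It assumes decimal units (1 TB = 10^12 bytes); the function name and the `efficiency` parameter (a stand-in for protocol overhead and competing traffic) are purely illustrative and not part of any SDC tool.

```python
def transfer_time_hours(volume_tb: float,
                        line_gbit_s: float = 10.0,
                        efficiency: float = 1.0) -> float:
    """Ideal time in hours to move `volume_tb` terabytes over the line.

    efficiency is a hypothetical fudge factor in (0, 1]; 1.0 means the
    full nominal line rate is available for payload data.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = volume_tb * 1e12 * 8                      # TB -> bits
    seconds = bits / (line_gbit_s * 1e9 * efficiency)
    return seconds / 3600.0


# Example volumes: 1 TB, one 32 TB storage node, 130 TB
for volume in (1, 32, 130):
    print(f"{volume:>4} TB: {transfer_time_hours(volume):6.1f} h at full line rate")
```

At the nominal rate, 1 TB takes 800 seconds, so real transfers of tens of TB are a matter of hours rather than days, provided the end systems can keep up.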

The SDC is striving for cooperation with the SCC/KIT in Karlsruhe. By coincidence, the 10 Gbit line enters German territory at the DFN node in Campus North of the KIT. We hope to connect KIT and KIS using the same line without significantly higher costs (despite the high expected load).

New (application) firewalls at KIS & OT

[KIS, OT] In parallel, we will replace the existing firewalls at both locations (KIS and OT) with modern application firewalls. The current ones are simple packet filters that classify traffic by port only (all traffic on ports 80 and 443 is treated as web traffic, and any web traffic that is not on 80 or 443 is not recognized as such). So-called application firewalls do not rely on ports but rather examine all traffic for specific patterns to assign it to a particular application or usage.
These machines need to be kept under maintenance to cope efficiently with new threats. Per site, we expect initial costs of the order of 20-30 k€ for a redundant cluster of two machines, and around 2-3 k€ in maintenance costs each.
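
The difference between the two approaches can be illustrated with a toy sketch. It is not how any particular firewall product works: the function names are hypothetical, and the three payload signatures (HTTP request methods, the first bytes of a TLS handshake record, the SSH identification string) are deliberately simplified examples.

```python
WEB_PORTS = {80, 443}
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ")


def classify_by_port(dst_port: int) -> str:
    """Packet-filter view: only the destination port decides."""
    return "web" if dst_port in WEB_PORTS else "unknown"


def classify_by_payload(payload: bytes) -> str:
    """Application-firewall view (toy): inspect the first payload bytes."""
    if payload.startswith(HTTP_METHODS):
        return "web"
    # TLS record header: content type 0x16 (handshake), major version 0x03
    if len(payload) >= 3 and payload[0] == 0x16 and payload[1] == 0x03:
        return "tls"
    if payload.startswith(b"SSH-"):
        return "ssh"
    return "unknown"


# Web traffic on a non-standard port: missed by port, found by payload
print(classify_by_port(8080), classify_by_payload(b"GET / HTTP/1.1\r\n"))
# SSH tunnelled over port 443: looks like web by port, but not by payload
print(classify_by_port(443), classify_by_payload(b"SSH-2.0-OpenSSH_8.4\r\n"))
```

The two example calls show both failure modes of a pure port filter: legitimate web traffic that goes unrecognized, and non-web traffic that passes as web.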

Storage

[KIS] Storage at the KIS is becoming increasingly scarce. Unlike for the storage system at OT, additional storage nodes can no longer be bought for the KIS system.

We are currently scanning all files at KIS to determine the amount of data that has not been accessed for a considerable amount of time. We are planning to invest in a slower tier to which such data will be moved automatically. While still visible, such data would take longer to access than frequently accessed data on the primary tier (however, accessing data on the slower tier will automatically and transparently move it back to the primary tier). Even this new slower tier is not primarily intended as an archive. Expected costs for such a system are 50-70 k€ (for approx. 0.5 PB, incl. VAT).
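
A scan of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool actually used at KIS: the function name and the idle threshold are assumptions, and access times are only meaningful if the file system records them (mount options such as `noatime` or `relatime` limit their accuracy).

```python
import os
import stat
import time
from typing import Optional, Tuple


def cold_data_summary(root: str, max_idle_days: float,
                      now: Optional[float] = None) -> Tuple[int, int]:
    """Return (total_bytes, cold_bytes) for regular files below `root`.

    A file counts as "cold" if its last access time is older than
    `max_idle_days`. Symbolic links and special files are skipped.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400
    total = cold = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)      # do not follow symlinks
            except OSError:
                continue                 # file vanished or is unreadable
            if not stat.S_ISREG(st.st_mode):
                continue
            total += st.st_size
            if st.st_atime < cutoff:
                cold += st.st_size
    return total, cold
```

Calling, for example, `cold_data_summary("/dat", 365)` would report how many bytes below `/dat` have not been read for a year; the ratio of the two numbers gives a first estimate of how much capacity a slower tier would need.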

[SDC] Mainly static files of common interest that are too large to store on offline media like external disks or tape will go to SDC once it is operational.

[OT] We will upgrade the central storage at OT (jane) with two additional nodes of 32 TB each to cope with the tight storage situation experienced during observations in 2020 (partly because observations were mostly run remotely due to COVID-19). Jane will then consist of a total of 6 nodes.

Current Resources

Compute nodes

hostname

# of CPUs & total cores

RAM [GB]

patty [KIS], marge & homer [KIS] (coming soon…)

2 x AMD EPYC 7742, 128 cores

1024

itchy & selma [KIS]

4 x Xeon(R) CPU E5-4657L v2 @ 2.40GHz, 48 cores

512

scratchy [KIS], quake & halo [KIS/seismo], hathi [OT]

4 x Intel(R) Xeon(R) CPU E5-4650L @ 2.60GHz, 32 cores

512

Central storage space

Total available disk space for /home (KIS, OT), /dat (KIS, OT), /archive (KIS), /instruments (OT)

name

total [TB, gross]

free [TB, gross]

mars [KIS]

758

39

quake [KIS/seismo]

61

0

halo [KIS/seismo]

145

44.5

jane [OT]

130 (-> 198)

23

