Performance Evaluation of Distributed Time Series Databases for Industrial Control Systems

  • job categories:

    Master Thesis / Internship

  • job posting number:

    IPE 12-20

  • starting date:

    By arrangement

  • contact person:

    Dr. Suren Chilingaryan, Dr. Andreas Kopmann

Job description:      

Slow control systems of large scientific experiments comprise several thousand sensors that monitor the operation of the instrumentation and the properties of the ongoing experiment. This information is crucial for understanding the measured data and should be preserved long after active experimentation has finished. Recording and organizing slow control data is challenging because of increasing sampling rates and the sheer amount of data accumulated over the experiment's lifetime. A robust distributed time-series database is required to ensure uninterrupted recording and to provide fast access to the stored historical data.

We aim to evaluate and compare several time-series databases as candidates for archiving the data produced by the control system of the KATRIN (KArlsruhe TRItium Neutrino) project. The student is expected to review time-series databases (e.g. InfluxDB, VictoriaMetrics, and Timescale) and then to provide a detailed evaluation report on several candidate solutions suited to storing high volumes of time-series data. The selected engine will be integrated with the existing data exploration and archival systems operating at IPE.

The ideal database will:

  • Reliably store high-bandwidth data streams;

  • Scale well in a cluster environment;

  • Include intelligent caching mechanisms to speed up queries;

  • Extract standard statistical information and provide a programming interface to compute custom properties;

  • Support geo-distributed operation modes;

  • Integrate with data analysis tools such as Apache Spark.

 

Personal qualification:

Required Skills:

A very good understanding of relational and NoSQL database technologies. Good programming skills, preferably with prior experience in Java and Python. Familiarity with statistical methods and with environments such as Jupyter Notebooks, MATLAB, and R.

 

 

Contact person:       

Suren Chilingaryan suren.chilingaryan∂kit.edu IPE

Phone: +49 721 / 608 26579

Andreas Kopmann andreas.kopmann∂kit.edu IPE

Phone: +49 721 / 608 24910