Rasdaman

From Bauman National Library
This page was last modified on 18 June 2016, at 14:24.
Revision as of 14:24, 18 June 2016 by egor zorin (Talk | contribs) (Start)

Rasdaman
Developer(s) Rasdaman GmbH
Stable release
rasdaman 9.1 / July 10, 2015 (2015-07-10)
Development status Active
Written in Java, C++
Operating system Windows, Linux, OS X
Website www.rasdaman.org

Definition

Rasdaman ("raster data manager") is an Array DBMS[1], that is, a database management system that adds capabilities for storing and retrieving massive multi-dimensional arrays, such as sensor, image, and statistics data. A frequently used synonym for arrays is raster data, as in 2-D raster graphics; this is what motivated the name rasdaman. However, rasdaman places no limit on the number of dimensions: it can serve, for example, 1-D measurement data, 2-D satellite imagery, 3-D x/y/t image time series and x/y/z exploration data, 4-D ocean and climate data, and data with dimensions beyond the spatio-temporal ones.
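
To make the array model concrete, here is a minimal Python sketch of an n-dimensional array defined over a regular grid with inclusive lower and upper bounds per axis. The class name GridArray and its methods are hypothetical and chosen only for illustration; they are not part of rasdaman, and the flat row-major storage is an assumption of this sketch, not a description of how rasdaman stores data.

```python
class GridArray:
    """Toy n-D array: one value per cell of a regular integer grid."""

    def __init__(self, domain, fill=0):
        # domain: one inclusive (low, high) pair of bounds per axis
        self.domain = list(domain)
        self.extents = [hi - lo + 1 for lo, hi in self.domain]
        size = 1
        for extent in self.extents:
            size *= extent
        self.cells = [fill] * size  # flat, row-major storage (sketch assumption)

    @property
    def dimensions(self):
        return len(self.domain)

    def _offset(self, point):
        # Translate grid coordinates into a position in the flat cell list.
        if len(point) != self.dimensions:
            raise IndexError("wrong number of coordinates")
        offset = 0
        for coord, (lo, hi), extent in zip(point, self.domain, self.extents):
            if not lo <= coord <= hi:
                raise IndexError("coordinate outside the array domain")
            offset = offset * extent + (coord - lo)
        return offset

    def get(self, point):
        return self.cells[self._offset(point)]

    def set(self, point, value):
        self.cells[self._offset(point)] = value


# 1-D measurement data with five samples
measurements = GridArray([(0, 4)])

# 3-D x/y/t image time series: 4 x 3 pixels, 2 time steps
image_series = GridArray([(0, 3), (0, 2), (0, 1)])
image_series.set((3, 2, 1), 255)
```

The same class covers any number of axes, which mirrors the point that the array concept is not tied to two dimensions.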

Description

Rasdaman stores and queries massive multi-dimensional arrays, such as sensor, image, simulation, and statistics data arising in domains like earth, space, and life sciences. This array analytics engine distinguishes itself by its flexibility, performance, and scalability. Rasdaman can process arrays residing in file system directories as well as in databases. Rasdaman pioneered Array Databases as the first fully implemented, operationally used system with an array query language and an optimized, scalable processing engine.

Rasdaman makes it possible to build systems for analyzing multidimensional data sets. Such data sets are queried with rasql, an SQL-like query language. Rasdaman also provides means for distributed query processing and for building cluster solutions; for example, a recent demonstration involved more than a thousand cluster nodes jointly processing a single database query.


Versions

There are two Rasdaman database versions:

  • The rasdaman community version, developed from the source code by third-party developers
  • The official version of rasdaman, supported by the company

Both versions are fully functional. The main difference is that the official version adds performance improvements, plus some specific extension functions and maintenance tools.

Open Source

Start

  1. Open the console
  2. Run the demo script to load the demo data
    $ install_demo.sh
  3. Run a query with the rasql command-line tool and write the result to a file:
    $ rasql -q "select png( NIR ) from NIR" --out file
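
The query step above can also be scripted. The following Python sketch only assembles the same rasql command line as in step 3; the helper name build_rasql_command is hypothetical. It assumes that rasql is on the PATH and that the demo collection NIR has been loaded, neither of which the sketch checks.

```python
import shlex


def build_rasql_command(query):
    """Return the argument list for the rasql call shown in step 3."""
    # "-q" passes the query text; "--out file" is taken verbatim from step 3.
    return ["rasql", "-q", query, "--out", "file"]


command = build_rasql_command("select png( NIR ) from NIR")

# Print a shell-safe rendering of the command for inspection.
rendered = " ".join(shlex.quote(part) for part in command)
print(rendered)
```

To execute the command against a running server, the list can be passed to subprocess.run(command, check=True) from the Python standard library.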

Query language

The rasdaman query language, rasql, offers raster processing formulated through expressions over raster operations in the style of SQL. Consider the following query: "The difference of the red and green channels of all images in the collection LandsatImages in which the red channel intensity exceeds 127 somewhere". In rasql, it is expressed as

    select ls.red - ls.green
    from LandsatImages as ls
    where max_cells( ls.red ) > 127
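
To illustrate what this query computes, here is a plain Python sketch of its semantics on a tiny in-memory collection. It is not how rasdaman evaluates the query: images are modeled as dictionaries of nested lists, the sample values are made up, and plain Python integers stand in for rasdaman cell types. The function names are chosen to mirror the query and are otherwise hypothetical.

```python
def max_cells(channel):
    """Largest cell value in a 2-D channel (nested lists)."""
    return max(max(row) for row in channel)


def cellwise_difference(a, b):
    """Subtract channel b from channel a, cell by cell."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def select_red_minus_green(collection):
    """Emulate: select ls.red - ls.green from ... where max_cells(ls.red) > 127."""
    return [
        cellwise_difference(image["red"], image["green"])
        for image in collection
        if max_cells(image["red"]) > 127
    ]


# Two made-up 2x2 images; only the first has a red value above 127.
landsat_images = [
    {"red": [[10, 200], [30, 40]], "green": [[5, 100], [10, 20]]},
    {"red": [[10, 20], [30, 40]], "green": [[1, 2], [3, 4]]},
]

result = select_red_minus_green(landsat_images)
```

The where clause filters whole images, while the select expression is applied to every cell of each image that passes the filter.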

Rasql is a full query language, supporting select, insert, update, and delete. Additionally, it introduces the concept of a partial update, which selectively updates parts of an array. Given the potentially large size of arrays, this feature is highly relevant in practice, e.g., for updating satellite image maps with newly incoming imagery.
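
The idea of a partial update can be sketched in plain Python as overwriting only a rectangular region of a larger 2-D array. This illustrates the concept only and is not rasql syntax; the function name partial_update and the sample values are hypothetical.

```python
def partial_update(array, origin, patch):
    """Overwrite the region of `array` that starts at `origin` with `patch`, in place."""
    row0, col0 = origin
    if row0 < 0 or col0 < 0:
        raise ValueError("origin must be non-negative")
    if row0 + len(patch) > len(array):
        raise ValueError("patch exceeds the array height")
    for i, patch_row in enumerate(patch):
        target_row = array[row0 + i]
        if col0 + len(patch_row) > len(target_row):
            raise ValueError("patch exceeds the array width")
        # Same-length slice assignment replaces only the covered cells.
        target_row[col0:col0 + len(patch_row)] = patch_row
    return array


# A 4x4 "map" of zeros receives a new 2x2 image tile at row 1, column 2.
image_map = [[0] * 4 for _ in range(4)]
partial_update(image_map, (1, 2), [[7, 8], [9, 1]])
```

Only the four cells covered by the tile change; the rest of the array is left untouched, which is what makes this attractive for very large arrays.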

Queries are formulated in a declarative style: they express what the result should look like, not how to compute it. This allows for extensive optimization on the server side. Further, rasql is safe in evaluation: every valid query is guaranteed to terminate in finite time.

References

Notes

  1. Array database management systems (DBMSs) provide database services specifically for arrays (also called raster data), that is: homogeneous collections of data items (often called pixels, voxels, etc.), sitting on a regular grid of one, two, or more dimensions. Often arrays are used to represent sensor, simulation, image, or statistics data.