Metakit

From Bauman National Library
Revision as of 07:48, 21 April 2016 by igor nikolaev (Talk | contribs)

Metakit is an efficient embedded database library with a small footprint. It fills the gap between flat-file, relational, object-oriented, and tree-structured databases, supporting relational joins, serialization, nested structures, and instant schema evolution. There is a C++ API, a Python binding called Mk4py, and a Tcl binding called Mk4tcl. You can manipulate and exchange data between any of these.

Data files are portable. The library has been used on Unix, Windows, Macintosh, VMS, and others, spanning a range of 16- to 64-bit architectures, from PDAs to S/390s.

Metakit is in use in various commercial projects and products on millions of desktops.

General

MetaKit is a freely available, cross-platform, open source (MIT-style licensed) database library that you can include in your own programs. It has been developed by Jean-Claude Wippler, the man behind Equi4 Software. MetaKit comes with an API for C++, one for Python, and an API for Tcl called "Mk4tcl".

Like every other database or storage library, MetaKit helps you manage the data you want to store on disk. However, it does things somewhat differently from many other database libraries, which gives it a number of unique advantages, some of which are ideally suited to scripting languages like Tcl.

  • Both the code and datafiles are portable. All byte ordering is managed by the library.
  • Store multiple nested data structures, to create document-centric applications.
  • You'll have to see this to believe it: restructure files on-the-fly, while open.
  • Complementing commit/rollback of changes, data can also be serialized.
  • The use of Stable Storage ensures that files cannot be corrupted by crashes.
  • Files are opened without reading data. Files are memory-mapped if the O/S supports it.
  • The API mimics container classes. Quickly get sizes and iterate over rows.
  • Sorting, relational join / group by, set operations, permutations, hashing.
  • The largest int defines storage format. String/binary data is stored as var-sized.
  • Can be linked shared or statically, for hassle-free deployment of components.
  • The library is extremely small; unused functions are stripped off in static links.
  • Only a small interface is exposed. One header file lists all the classes you need.
  • Usable from Python and Tcl as well as C++.
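
The point above that the largest int defines the storage format can be illustrated with a small sketch. This is not MetaKit code and does not use the library. The helper name bits_needed and the list of candidate widths are assumptions made for illustration, and only non-negative values are handled.

```python
def bits_needed(values):
    """Return the smallest candidate bit width that can hold every value.

    Hypothetical helper, not part of MetaKit. Assumes non-negative integers
    and power-of-two widths from 1 to 64 bits.
    """
    values = list(values)
    if any(v < 0 for v in values):
        raise ValueError("this sketch only handles non-negative integers")
    largest = max(values, default=0)
    for width in (1, 2, 4, 8, 16, 32, 64):
        # A width of n bits can represent the values 0 .. 2**n - 1.
        if largest < (1 << width):
            return width
    raise OverflowError("value does not fit in 64 bits")


if __name__ == "__main__":
    for column in ([0, 1, 1, 0], [3, 7, 12], [100, 200, 255], [70000]):
        print(column, "->", bits_needed(column), "bits per entry")
```

In this model a column of flags costs one bit per row, and the per-entry width grows only when a larger value appears in the column.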

MetaKit Concepts

There are only a few concepts that you'll need to know to use MetaKit. Most of these should be familiar to you if you've used any other database or storage mechanism, though perhaps under a different name.

Datafile. MetaKit stores all its data in one or more self-contained data files on disk. These live like any other file in the file system; you open them with MetaKit when you want to access your data, and you close them afterwards. When you open a file, you specify a 'tag' that is associated with the open file (e.g. "db"). You can have multiple data files open at once, each with a different tag.

View. Views let you partition your data files into one or more separate areas, each of which may hold different types of data. The description of what data each view can hold is referred to as its layout, or structure. Views are specified as tag.viewname (e.g. "db.addressbook"). Views are equivalent to 'tables' in many other databases.

Row. A row holds a collection of data related to the same object, such as the name, address, etc. of a person in an address book. This is commonly referred to as a 'record' in many other databases. A view is actually made up of an array of rows, which are referred to by a zero-based index (i.e. the first row is '0', the second is '1', etc.). You can refer to an individual row within a view by tag.viewname!index (e.g. "db.addressbook!3").

Properties. A property is an individual data item. Each row holds one or more properties. Every row within a single view contains the same properties, though of course they will likely have different values. So for example, every row within the view may have a name and address property, but the values will be different for each row. Properties are commonly known as 'fields' in many other databases. Note that properties are not referred to directly by a path notation, as views and rows are, but are accessed through MetaKit commands like mk::get and mk::set.
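
The path notation described above (tag, tag.viewname, tag.viewname!index) can be modelled in a few lines of Python. This is a minimal sketch that does not use MetaKit or Mk4py. The names MkPath, parse_path, get_property and open_files are hypothetical, the in-memory dictionary merely stands in for an open datafile, and only the three path forms mentioned in this section are handled.

```python
from typing import NamedTuple, Optional


class MkPath(NamedTuple):
    """Parts of a path such as "db.addressbook!3" (hypothetical type)."""
    tag: str
    view: Optional[str] = None
    index: Optional[int] = None


def parse_path(path: str) -> MkPath:
    """Split "tag", "tag.view" or "tag.view!index" into its parts."""
    head, bang, index_text = path.partition("!")
    tag, _, view = head.partition(".")
    if not tag:
        raise ValueError(f"missing datafile tag in {path!r}")
    if not bang:
        return MkPath(tag, view or None)
    if not view or not index_text.isdecimal():
        raise ValueError(f"malformed row path {path!r}")
    return MkPath(tag, view, int(index_text))


# Toy stand-in for open datafiles: tag -> view name -> list of rows.
# Every row in a view carries the same properties, with different values.
open_files = {
    "db": {
        "addressbook": [
            {"name": "Ann", "address": "1 Main St"},
            {"name": "Bob", "address": "2 Side Rd"},
        ]
    }
}


def get_property(path: str, prop: str):
    """Look up one property of one row, given a zero-based row path."""
    p = parse_path(path)
    if p.view is None or p.index is None:
        raise ValueError("a row path of the form tag.view!index is required")
    return open_files[p.tag][p.view][p.index][prop]


if __name__ == "__main__":
    print(parse_path("db.addressbook!1"))
    print(get_property("db.addressbook!1", "name"))
```

The lookup function plays the role that a command such as mk::get plays in Mk4tcl: the path selects a row, and the property is named separately rather than being part of the path.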

Documentation

C++

The documentation for the C++ API is generated by Doxygen. There are also some older docs and an intro from the pre-2.3 days. There is C++ sample code in the demo/ and examples/ areas of the source code distribution.

Riccardo Cohen has written a small tutorial with some nice code examples, see his Metakit C++ tutorial on the web.

Further information can be gleaned from the 140+ small C++ snippets which form the regression test suite in the tests/ directory.

Python

The Mk4py page describes the use of MK from Python. There is Python sample code in the examples/ area of the source code distribution.

An annotated, newer version of the above has been created by Brian Kelley, with several examples, explaining new features such as hashes and ordered views.

Tcl

The Mk4tcl page describes the procedural interface from Tcl to MK. Mark Roseman has written a tutorial for Mk4tcl, which is an excellent way to get started in Tcl. There is Tcl sample code in the examples/ area of the source code distribution.

There is a newer Oomk binding (Object Oriented MK) which exposes more of MK's C++ core to Tcl, much as the Python binding does. It is based on Mk4too (a relatively unknown part of Metakit for Tcl which is best left alone now that Oomk exists).

The "Starkits" chapter in the 4th edition of Practical Programming in Tcl and Tk by Brent Welch includes documentation about the Mk4tcl binding.

References