Apache CouchDB

From Bauman National Library
This page was last modified on 25 June 2016, at 22:01.
Developer(s) Apache Software Foundation
Development status Active
Written in Erlang
Platform Cross-platform
Type Document-oriented database
License Apache license 2.0 [1]
Website couchdb.apache.org

Apache CouchDB, commonly referred to as CouchDB, is open source database software that focuses on ease of use and having an architecture that "completely embraces the Web". It has a document-oriented NoSQL database architecture and is implemented in the concurrency-oriented language Erlang; it uses JSON to store data, JavaScript as its query language using MapReduce, and HTTP for an API.

General

CouchDB is often categorized as a “NoSQL” database, a term that became increasingly popular in late 2009 and early 2010. While this term is a rather generic characterization of a database or data store, it does clearly define a break from traditional SQL-based databases. A CouchDB database has no schema and no rigid, pre-defined data structures such as tables. Data is stored in CouchDB as JSON documents. The structure of the data, or documents, can change dynamically to accommodate evolving needs.

Main features

ACID Semantics

CouchDB provides ACID semantics for updates to individual documents. It does this by implementing a form of Multi-Version Concurrency Control (MVCC): readers see a consistent snapshot and are never blocked by writers, so CouchDB can handle a high volume of concurrent readers and writers without locking.
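As a rough illustration of revision-checked updates, the sketch below models a tiny in-memory store in Python. It is not CouchDB's implementation: the `TinyStore` class and its integer revision counter are hypothetical simplifications. In CouchDB itself, revisions are strings kept in a document's `_rev` field, and an update that presents a stale revision is rejected with an HTTP 409 Conflict response.

```python
class ConflictError(Exception):
    """Raised when a writer presents a stale revision."""


class TinyStore:
    """Hypothetical in-memory model of revision-checked updates."""

    def __init__(self):
        self._docs = {}  # doc_id -> (rev, body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        # Return a copy so a reader's snapshot is unaffected by later writes.
        return rev, dict(body)

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        if rev != current_rev:
            raise ConflictError(f"{doc_id}: expected rev {current_rev}, got {rev}")
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


store = TinyStore()
rev1 = store.put("post-1", {"Subject": "I like Plankton"})

# Two clients read the same revision...
rev_a, doc_a = store.get("post-1")
rev_b, doc_b = store.get("post-1")

# ...the first write succeeds, the second is rejected as stale.
doc_a["Author"] = "Rusty"
rev2 = store.put("post-1", doc_a, rev=rev_a)

try:
    store.put("post-1", doc_b, rev=rev_b)
    second_write = "accepted"
except ConflictError:
    second_write = "rejected"

print(rev1, rev2, second_write)
```

No lock is held between the read and the write; the second client simply has to re-read the document and retry.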

Built for Offline

CouchDB can replicate to devices (like smartphones) that can go offline and handle data sync for you when the device is back online.

Distributed Architecture with Replication

CouchDB was designed with bi-directional replication (or synchronization) and offline operation in mind. This means that multiple replicas can each hold their own copy of the same data, modify it, and sync those changes at a later time.

Document Storage

CouchDB stores data as "documents": one or more field/value pairs expressed as JSON. Field values can be simple things like strings, numbers, or dates, but ordered lists and associative arrays can also be used. Every document in a CouchDB database has a unique ID, and there is no required document schema.

Eventual Consistency

CouchDB guarantees eventual consistency to be able to provide both availability and partition tolerance.

Map/Reduce Views and Indexes

The stored data is structured using views. In CouchDB, each view is constructed by a JavaScript function that acts as the Map half of a map/reduce operation. The function takes a document and emits zero or more key/value rows for it. CouchDB indexes views and keeps those indexes updated as documents are added, removed, or updated.
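The map step can be pictured with a small Python simulation. This is only a sketch of the idea: real CouchDB map functions are written in JavaScript and produce rows by calling `emit(key, value)`, whereas here a Python generator stands in for that call. The names `map_by_tag` and `build_view`, and the sample documents, are hypothetical.

```python
def map_by_tag(doc):
    """Map step: yield one (key, value) row per tag in the document."""
    for tag in doc.get("Tags", []):
        yield tag, doc["Subject"]


def build_view(docs, map_fn):
    """Run map_fn over every document and return rows sorted by key."""
    rows = []
    for doc in docs:
        for key, value in map_fn(doc):
            rows.append({"id": doc["_id"], "key": key, "value": value})
    return sorted(rows, key=lambda row: (row["key"], row["id"]))


docs = [
    {"_id": "a", "Subject": "I like Plankton", "Tags": ["plankton", "baseball"]},
    {"_id": "b", "Subject": "Baseball scores", "Tags": ["baseball"]},
    {"_id": "c", "Subject": "Untagged note"},  # yields no rows at all
]

view = build_view(docs, map_by_tag)

# Because rows are sorted by key, a lookup by key is a contiguous range.
baseball = [row["id"] for row in view if row["key"] == "baseball"]
print(baseball)
```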

HTTP API

Every item has a unique URI that is exposed via HTTP. CouchDB uses the HTTP methods POST, GET, PUT and DELETE for the four basic CRUD (Create, Read, Update, Delete) operations on all resources. CouchDB also offers a built-in, web-based administration interface called Futon.
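To make the mapping between CRUD operations and HTTP methods concrete, the following Python sketch builds, but does not send, a few requests using only the standard library. The server address, the database name `blog`, the document ID `post-1` and the revision string `1-abc` are placeholders chosen for illustration.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5984"  # assumed local server address


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for a CouchDB resource."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# "blog" and "post-1" are hypothetical database and document names.
create_auto_id = build_request("POST", "/blog", {"Subject": "Hello"})
create = build_request("PUT", "/blog/post-1", {"Subject": "I like Plankton"})
read = build_request("GET", "/blog/post-1")
# Deleting (like updating) an existing document needs its current revision;
# "1-abc" is a placeholder, not a real revision ID.
delete = build_request("DELETE", "/blog/post-1?rev=1-abc")

for req in (create_auto_id, create, read, delete):
    print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would send it to a running server.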

Key Characteristics

Documents

A CouchDB document is a JSON object that consists of named fields. Field values may be strings, numbers, dates, or even ordered lists and associative maps. An example of a document would be a blog post:

{
    "Subject": "I like Plankton",
    "Author": "Rusty",
    "PostedDate": "5/23/2006",
    "Tags": ["plankton", "baseball", "decisions"],
    "Body": "I decided today that I don't like baseball. I like plankton."
}

In the above example document, Subject is a field that contains the single string value "I like Plankton". Tags is a field containing the list of values "plankton", "baseball", and "decisions". A CouchDB database is a flat collection of these documents. Each document is identified by a unique ID.
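Since the document is plain JSON, any JSON library can read it. The short Python sketch below parses the example and inspects its fields. Note that JSON has no native date type, so PostedDate arrives as a string; interpreting it as month/day/year is an assumption about the example.

```python
import json
from datetime import datetime

raw = """
{
    "Subject": "I like Plankton",
    "Author": "Rusty",
    "PostedDate": "5/23/2006",
    "Tags": ["plankton", "baseball", "decisions"],
    "Body": "I decided today that I don't like baseball. I like plankton."
}
"""

doc = json.loads(raw)

# Assumption: the example date uses month/day/year order.
posted = datetime.strptime(doc["PostedDate"], "%m/%d/%Y").date()

print(type(doc["Subject"]).__name__)  # str
print(type(doc["Tags"]).__name__)     # list
print(posted.isoformat())
```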

Views

To add structure back to semi-structured data, CouchDB integrates a view model that uses JavaScript for description. Views are the method of aggregating, joining and reporting on the documents in a database, and are built on demand. Because views are built dynamically and don’t affect the underlying documents, you can have as many different view representations of the same data as you like. Incremental updates to documents do not require full re-indexing of views.
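One way to picture why a changed document does not force a full re-index is the Python sketch below. It keeps the rows produced for each document separately, so an update re-runs the map function for that document only. The `IncrementalView` class is a hypothetical model, not CouchDB's actual index structure.

```python
class IncrementalView:
    """Hypothetical index that re-maps only the documents that changed."""

    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.rows_by_doc = {}  # doc_id -> list of (key, value) rows
        self.map_calls = 0

    def update(self, doc):
        """Replace the rows of one changed document; others stay untouched."""
        self.map_calls += 1
        self.rows_by_doc[doc["_id"]] = list(self.map_fn(doc))

    def remove(self, doc_id):
        """Drop the rows of a deleted document."""
        self.rows_by_doc.pop(doc_id, None)

    def rows(self):
        """Return all (key, value, doc_id) rows sorted by key."""
        all_rows = [
            (key, value, doc_id)
            for doc_id, rows in self.rows_by_doc.items()
            for key, value in rows
        ]
        return sorted(all_rows)


def by_author(doc):
    yield doc["Author"], 1


view = IncrementalView(by_author)
for doc in ({"_id": "a", "Author": "Rusty"}, {"_id": "b", "Author": "Ann"}):
    view.update(doc)

# Only document "a" changes, so only one extra map call is needed.
view.update({"_id": "a", "Author": "Zed"})

print(view.rows())
print(view.map_calls)
```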

Schema-Free

Unlike SQL databases, which are designed to store and report on highly structured, interrelated data, CouchDB is designed to store and report on large amounts of semi-structured, document-oriented data. CouchDB greatly simplifies the development of document-oriented applications, such as collaborative web applications.

In an SQL database, the schema and storage of the existing data must be updated as needs evolve. With CouchDB, no schema is required, so new document types with new meaning can be safely added alongside the old. For applications that require robust validation of new documents, however, custom validation functions can be defined. The view engine is designed to easily handle new document types and disparate but similar documents.
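The idea behind a custom validation function can be sketched in Python as follows. In CouchDB such functions are written in JavaScript and stored in a design document under the name `validate_doc_update`; the Python version here only mirrors the concept. The `type` field, the list of required fields and the rule about authors are assumptions made up for the example.

```python
class Forbidden(Exception):
    """Raised when a new document revision fails validation."""


REQUIRED_POST_FIELDS = ("Subject", "Author", "Body")  # assumed rule set


def validate_doc_update(new_doc, old_doc):
    """Reject blog posts that miss required fields or change their author."""
    if new_doc.get("type") != "post":
        return  # other document types are accepted unchanged
    missing = [f for f in REQUIRED_POST_FIELDS if f not in new_doc]
    if missing:
        raise Forbidden("missing fields: " + ", ".join(missing))
    if old_doc is not None and old_doc.get("Author") != new_doc["Author"]:
        raise Forbidden("Author may not change")


def check(new_doc, old_doc=None):
    """Run the validator and describe the outcome as a string."""
    try:
        validate_doc_update(new_doc, old_doc)
        return "ok"
    except Forbidden as err:
        return f"forbidden ({err})"


good = {"type": "post", "Subject": "Hi", "Author": "Rusty", "Body": "..."}
print(check(good))
print(check({"type": "post", "Subject": "Hi"}))
print(check({"type": "comment", "text": "schema-free still works"}))
```

Documents of other types pass through untouched, so validation can be introduced for one document type without imposing a schema on the rest of the database.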

Distributed

CouchDB is a peer-based distributed database system. Any number of CouchDB hosts (servers and offline clients) can have independent "replica copies" of the same database, where applications have full database interactivity (query, add, edit, delete). When back online or on a schedule, database changes can be replicated bi-directionally.

CouchDB has built-in conflict detection and management and the replication process is incremental and fast, copying only documents changed since the previous replication. Most applications require no special planning to take advantage of distributed updates and replication.
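The incremental part of replication can be illustrated with a simplified Python model in which each replica numbers its changes and a replication run copies only what changed after the last checkpoint. The `Replica` class and `replicate` function are hypothetical, and the sketch deliberately leaves out conflict detection.

```python
class Replica:
    """Hypothetical replica that records an update sequence per change."""

    def __init__(self):
        self.docs = {}     # doc_id -> body
        self.changes = []  # list of (seq, doc_id), oldest first
        self.seq = 0

    def save(self, doc_id, body):
        self.seq += 1
        self.docs[doc_id] = dict(body)
        self.changes.append((self.seq, doc_id))


def replicate(source, target, since=0):
    """Copy documents changed after `since`; return (checkpoint, copied ids)."""
    changed = {doc_id for seq, doc_id in source.changes if seq > since}
    for doc_id in sorted(changed):
        target.save(doc_id, source.docs[doc_id])
    return source.seq, sorted(changed)


laptop, server = Replica(), Replica()
laptop.save("a", {"n": 1})
laptop.save("b", {"n": 2})

checkpoint, first = replicate(laptop, server)

laptop.save("b", {"n": 3})  # edited while offline
checkpoint, second = replicate(laptop, server, since=checkpoint)

print(first, second, server.docs["b"])
```

The second run transfers only document "b", because it is the only one changed since the previous checkpoint.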

Unlike cumbersome attempts to bolt distributed features on top of the same legacy models and databases, replication in CouchDB is the result of careful ground-up design, engineering and integration. This replication framework provides a comprehensive set of features:

  • Master → Slave replication
  • Master ↔ Master replication
  • Filtered Replication
  • Incremental and bi-directional replication
  • Conflict management
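Replication is requested over HTTP by posting a JSON description to the server's `/_replicate` resource. The Python sketch below only builds such request bodies; it does not contact a server. The database names, the remote URL and the filter name `app/published_only` are hypothetical.

```python
import json


def replication_body(source, target, continuous=False, filter_name=None):
    """Build the JSON body for a POST to the /_replicate resource."""
    body = {"source": source, "target": target}
    if continuous:
        body["continuous"] = True
    if filter_name:
        body["filter"] = filter_name  # written as "designdoc/filtername"
    return body


LOCAL = "blog"                           # hypothetical local database
REMOTE = "http://example.org:5984/blog"  # hypothetical remote database

# Bi-directional replication is expressed as two one-way replications.
push = replication_body(LOCAL, REMOTE)
pull = replication_body(REMOTE, LOCAL)
filtered = replication_body(LOCAL, REMOTE, filter_name="app/published_only")

for body in (push, pull, filtered):
    print(json.dumps(body, sort_keys=True))
```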

These replication features can be combined to solve a wide range of problems. In addition to its replication features, CouchDB's reliability and scalability are further enhanced by its implementation in the Erlang programming language. Erlang has built-in support for concurrency, distribution and fault tolerance, and has been used for years to build reliable systems in the telecommunications industry. By design, the Erlang language and runtime are able to take advantage of newer hardware with multiple CPU cores.

Installation

Installation from binaries

This is the simplest way to go.

  1. Get the latest Windows binaries from the CouchDB web site. Old releases are available in the archive.
  2. Follow the installation wizard steps.
  3. Open Futon. If you did not select the option to start CouchDB automatically after installation, start it manually first.
  4. It’s time to Relax!

Installation from sources

Setting Up Cygwin

Before starting any Cygwin terminals, run:

set CYGWIN=nontsec

To set up your environment, run:

[VS_BIN]/vcvars32.bat

Replace [VS_BIN] with the path to your Visual Studio bin directory.

You must check that:

  • The which link command points to the Microsoft linker.
  • The which cl command points to the Microsoft compiler.
  • The which mc command points to the Microsoft message compiler.
  • The which mt command points to the Microsoft manifest tool.
  • The which nmake command points to the Microsoft make tool.

If you do not do this, the build may fail because the Cygwin tools of the same name in /usr/bin are picked up instead.
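The five checks above could be automated with a small script. The Python sketch below is one possible approach, under the assumption that a Python interpreter is available inside the Cygwin environment; the function name `find_shadowed` is made up for this example.

```python
import shutil

TOOLS = ("link", "cl", "mc", "mt", "nmake")


def find_shadowed(tools=TOOLS, which=shutil.which):
    """Return tools that are missing or resolve to a copy in /usr/bin."""
    problems = {}
    for tool in tools:
        path = which(tool)
        if path is None:
            problems[tool] = "not found"
        elif path.replace("\\", "/").startswith("/usr/bin/"):
            # A Cygwin tool of the same name would be used by the build.
            problems[tool] = path
    return problems


if __name__ == "__main__":
    for tool, issue in find_shadowed().items():
        print(f"{tool}: {issue}")
```

An empty result means that none of the five commands is missing or resolved from /usr/bin.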

Building Erlang

You must include Win32 OpenSSL, built statically from source. Use exactly the same version as required by the Erlang/OTP build process.

You can skip the GUI tools by running:

echo "skipping gs" > lib/gs/SKIP
echo "skipping ic" > lib/ic/SKIP
echo "skipping jinterface" > lib/jinterface/SKIP

Follow the rest of the Erlang instructions as described.

After running:

./otp_build release -a

You should run:

./release/win32/Install.exe -s

This will set up the release/win32/bin directory correctly. The CouchDB installation scripts currently write their data directly into this location.

To set up your environment for building CouchDB, run:

eval `./otp_build env_win32`

To set up the ERL_TOP environment variable, run:

export ERL_TOP=[ERL_TOP]

Replace [ERL_TOP] with the Erlang source directory name.

Remember to use /cygdrive/c/ instead of c:/ as the directory prefix.

To set up your path, run:

export PATH=$ERL_TOP/release/win32/erts-5.8.5/bin:$PATH

If everything was successful, you should be ready to build CouchDB.

Relax.

Building CouchDB

Note that win32-curl is only required if you wish to run the developer tests.

The documentation step may be skipped using --disable-docs if you wish.

Once you have satisfied the dependencies you should run:

./configure \
    --with-js-include=/cygdrive/c/path_to_spidermonkey_include \
    --with-js-lib=/cygdrive/c/path_to_spidermonkey_lib \
    --with-win32-icu-binaries=/cygdrive/c/path_to_icu_binaries_root \
    --with-erlang=$ERL_TOP/release/win32/usr/include \
    --with-win32-curl=/cygdrive/c/path/to/curl/root/directory \
    --with-openssl-bin-dir=/cygdrive/c/openssl/bin \
    --with-msvc-redist-dir=/cygdrive/c/dir/with/vcredist_platform_executable \
    --disable-init \
    --disable-launchd \
    --prefix=$ERL_TOP/release/win32

This command could take a while to complete.

If everything was successful you should see the following message:

You have configured Apache CouchDB, time to relax.

Relax.

To install CouchDB you should run:

make install

If everything was successful you should see the following message:

You have installed Apache CouchDB, time to relax.

Relax.

To build the .exe installer package, you should run:

make dist

Alternatively, you may run CouchDB directly from the build tree, but to avoid any contamination do not run make dist after this.

First run

You can start the CouchDB server by running:

$ERL_TOP/release/win32/bin/couchdb.bat

When CouchDB starts it should eventually display the following message:

Apache CouchDB has started, time to relax.

Relax.

To check that everything has worked, point your web browser to:

http://127.0.0.1:5984/_utils/index.html

From here you should run the verification tests in Firefox.
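The same check can be made without a browser. A running CouchDB server answers a request to its root URL with a small JSON greeting that contains the fields `couchdb` and `version`. The Python sketch below parses such a greeting; the address is the one used above, and the version in the sample payload is invented for the offline demonstration.

```python
import json
import urllib.request


def parse_welcome(payload):
    """Return the server version if payload looks like a CouchDB greeting."""
    info = json.loads(payload)
    if info.get("couchdb") != "Welcome":
        raise ValueError("unexpected response from server")
    return info.get("version")


def server_version(base_url="http://127.0.0.1:5984/"):
    """Fetch the root resource of a running server and return its version."""
    with urllib.request.urlopen(base_url, timeout=5) as resp:
        return parse_welcome(resp.read().decode("utf-8"))


# Offline demonstration with a sample payload; the version is made up.
sample = '{"couchdb": "Welcome", "version": "0.0.0"}'
print(parse_welcome(sample))
```

Calling `server_version()` against a started server performs the actual request.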

References

  1. "Apache license".