Understanding CouchDB, a Database That Embraces the Web

In my last blog, I talked about NoSQL databases and their types. This blog focuses on CouchDB, a document-oriented store in which each document is made up of named fields (key/value pairs).

CouchDB is accessed over HTTP and exchanges data in JSON format, which makes it simple to use from Web applications. Developers seeking a reliable database should look at CouchDB because every change is stored as a new document revision, which supports redundancy and conflict resolution. And because CouchDB uses multi-version concurrency control (MVCC), ‘writes’ do not block ‘reads’.

Additionally, CouchDB boasts a strong replication model that allows for master-master replication and filtered replication streams. This opens up interesting possibilities for scaling a database.

Advantages of using CouchDB

  • Open Source
  • Provides ACID (Atomicity, Consistency, Isolation, Durability) semantics
  • Can handle a high volume of concurrent readers and writers without locking, thanks to MVCC (Multi-Version Concurrency Control)
  • Supports HTTP and REST protocols
  • Bi-directional replication, continuous or ad-hoc with conflict detection
  • Previous revisions of documents remain available until the database is compacted
  • Views can be built using embedded map/reduce functions
  • Authentication and attachment handling
  • Real-time update notifications via the ‘_changes’ feed, comparable to RDBMS triggers

Best used: For accumulating, occasionally changing data on which pre-defined queries are run, and in places where versioning is important.

Examples:

  • CMS systems: master-master replication is an especially interesting feature here, allowing easy multi-site deployments.
  • ERP applications where product media files such as images and documents can be managed and served.

Naming a Couch document

Every document stored in a CouchDB database has a document ID (_id). The _id is a case-sensitive string that uniquely identifies the document. Two documents in the same database cannot have the same identifier; if they do, they are considered the same document. It is recommended to avoid special characters in document IDs. If you do use special characters, make sure they are properly URL-encoded and decoded.
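As a rough illustration of the URL encoding point above, here is a minimal Java sketch that percent-encodes a document ID before placing it in a URL. The class name, the method names, the host http://localhost:5984 and the ID "report 2015/Q1" are all made up for this example and are not part of CouchDB.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class DocIdEncoding {
    // Percent-encode a document ID so it can be used as one URL path segment.
    static String encodeDocId(String docId) {
        // URLEncoder produces form encoding, where a space becomes "+".
        // In a URL path a space should be "%20", so convert it afterwards.
        // A literal "+" in the ID has already become "%2B" at this point.
        return URLEncoder.encode(docId, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Build the document URL; hostUrl and dbName are placeholder values.
    static String docUrl(String hostUrl, String dbName, String docId) {
        return hostUrl + "/" + dbName + "/" + encodeDocId(docId);
    }

    public static void main(String[] args) {
        System.out.println(docUrl("http://localhost:5984", "media-db", "report 2015/Q1"));
    }
}
```

With this sketch, the ID "report 2015/Q1" ends up in the URL as report%202015%2FQ1. Since IDs are case-sensitive, "Report" and "report" would address two different documents.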

Special Fields

Field Name           Description
_id                  The unique identifier of the document
_rev                 The current MVCC-token/revision of this document
_attachments         If the document has attachments, _attachments holds a (meta-)data structure
_deleted             Indicates that this document has been deleted and previous revisions will be removed on next compaction run
_revisions           Revision history of the document
_revs_info           A list of revisions of the document, and their availability
_conflicts           Information about conflicts
_deleted_conflicts   Information about conflicts
_local_seq           Sequence number of the revision in the database (as found in the _changes feed)

What a typical Couch document looks like:

{
  "_id": "sdfsdfsdfff2f2af0c50e6c0",
  "_rev": "20-bbc1d13bd06186e52a761e0687df7f07",
  "Make": "Dodge",
  "createdDateStamp": "2015-01-09 : 18:02:26",
  "Interior Color": "Black",
  "Exterior Color Code": "PXT",
  "Engine Name": "HEMI 5.7L V8 370hp 395ft. lbs.",
  "user_id": "pkollias",
  "Interior Color Code": "X9",
  "Style": "R/T 4dr Sedan",
  "Equipment": {
    "Towing": [{
      "Evidence of Hitch Installed?": "None"
    }],
    "Radio": [{
      "Radio Present?": "Factory"
    }],
    "Headrests": [{
      "Headrests Present?": "No"
    }, {
      "Quantity Missing": "0"
    }],
    "Entertainment": [{
      "Evidence of Entertainment System Installed?": "None"
    }]
  }
}

CouchDB has inbuilt support for HTTP methods

GET

To retrieve a document, simply perform a GET operation at the document’s URL.

PUT

To create a new document you can either use a POST operation or a PUT operation. To create/update a named document using the PUT operation, the URL must point to the document’s location.

POST

The POST operation can be used to create a new document with a server generated DocID. To do so, the URL must point to the database’s location. To create a named document, use the PUT method instead.

DELETE

To delete a document, perform a DELETE operation at the document’s location, passing the rev parameter with the document’s current revision. If successful, it will return the revision id for the deletion stub.
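The four operations above can be sketched with the HTTP client that ships with Java 11 and later (java.net.http), which is a different client from the Apache HttpClient used in the snippets further down. This is only a sketch: the base URL, the document ID "car-1" and the revision "1-abc" are invented for illustration, and the requests are built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

class CouchRequests {
    // Placeholder server and database; replace with real values.
    static final String BASE = "http://localhost:5984/media-db";

    // GET: read a document from its own URL.
    static HttpRequest getDoc(String docId) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + docId))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // PUT: create or update a named document at its own URL.
    // For an update, the JSON body should carry the current "_rev".
    static HttpRequest putDoc(String docId, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + docId))
                .header("Content-Type", "application/json")
                .PUT(BodyPublishers.ofString(json))
                .build();
    }

    // POST: create a document at the database URL; the server picks the ID.
    static HttpRequest postDoc(String json) {
        return HttpRequest.newBuilder(URI.create(BASE))
                .header("Content-Type", "application/json")
                .POST(BodyPublishers.ofString(json))
                .build();
    }

    // DELETE: remove a document, passing its current revision as "rev".
    static HttpRequest deleteDoc(String docId, String rev) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + docId + "?rev=" + rev))
                .DELETE()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest[] requests = {
            getDoc("car-1"),
            putDoc("car-1", "{\"Make\":\"Dodge\"}"),
            postDoc("{\"Make\":\"Dodge\"}"),
            deleteDoc("car-1", "1-abc")
        };
        // Print the method and target of each request without sending it.
        for (HttpRequest r : requests) {
            System.out.println(r.method() + " " + r.uri());
        }
    }
}
```

To actually send one of these requests, it could be passed to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) against a running CouchDB server.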

Attachments

Documents can have attachments just like emails. There are two ways to use attachments:

  • the first one is a REST API that addresses individual attachments by URLs;
  • the second stores attachments inline in your document, under the _attachments field.

Please note that attachment names may contain embedded / characters, which are sent unescaped to CouchDB. You can use this to provide a subtree of attachments under a document.
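To make the two styles a little more concrete, here is a small Java sketch. It assumes the usual layout of an inline attachment (an entry under _attachments with a content_type and Base64-encoded data) and the URL pattern database/document/attachment for standalone attachments. The class and method names, the host and the file names are invented for this example, and the JSON is assembled by hand on the assumption that the name and content type need no JSON escaping.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class AttachmentSketch {
    // Style 1: URL of a standalone attachment addressed through the REST API.
    // The attachment name may contain "/" to form a subtree under the document.
    static String attachmentUrl(String hostUrl, String dbName, String docId, String name) {
        return hostUrl + "/" + dbName + "/" + docId + "/" + name;
    }

    // Style 2: a JSON document that carries one attachment inline.
    // The attachment bytes are Base64-encoded into the "data" field.
    static String docWithAttachment(String name, String contentType, byte[] bytes) {
        String data = Base64.getEncoder().encodeToString(bytes);
        return "{\"_attachments\":{\"" + name + "\":{"
                + "\"content_type\":\"" + contentType + "\","
                + "\"data\":\"" + data + "\"}}}";
    }

    public static void main(String[] args) {
        System.out.println(attachmentUrl("http://localhost:5984", "media-db", "car-1", "photos/front.jpg"));
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(docWithAttachment("note.txt", "text/plain", content));
    }
}
```

The inline form is convenient when the document and a small file are created together; the URL form avoids the Base64 overhead for larger files.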

Creating a Couch Document

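// Requires Apache HttpClient 4.x and org.json on the classpath
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
import org.json.JSONObject;

public class CouchClient {
    // Or, over HTTPS: "https://USERNAME:PASSWORD@SERVER_NAME:6984"
    String hostUrl = "http://USERNAME:PASSWORD@SERVER_NAME:5984";

    JSONObject createDocument(String databaseName, JSONObject jsonDoc) throws Exception {
        // POST to the database URL so that CouchDB generates the document ID
        HttpPost httpPostRequest = new HttpPost(hostUrl + "/" + databaseName);
        StringEntity body = new StringEntity(jsonDoc.toString(), "UTF-8");
        httpPostRequest.setEntity(body);
        httpPostRequest.setHeader("Accept", "application/json");
        httpPostRequest.setHeader("Content-type", "application/json");
        HttpResponse httpResponse = new DefaultHttpClient().execute(httpPostRequest);
        HttpEntity entity = httpResponse.getEntity();
        if (entity != null) {
            // Read the response body as a String and transform it into a JSONObject
            String resultString = EntityUtils.toString(entity, "UTF-8");
            return new JSONObject(resultString);
        }
        return null; // the server sent no response body
    }
}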

Adding Documents as Attachments

// URLEncoder targets form data and turns spaces into "+"; a URL path needs "%20" instead
String encodedName = java.net.URLEncoder.encode(fileName, "UTF-8").replace("+", "%20");
HttpPut httpPutRequest = new HttpPut(hostUrl + "/" + dbName
        + (docID != null ? "/" + docID : "") + "/" + encodedName
        + (revID != null ? "?rev=" + revID : ""));

ByteArrayEntity body = new ByteArrayEntity(attachData); // attachData is the file content as a byte array
httpPutRequest.setEntity(body);
httpPutRequest.setHeader("Content-type", attachFileType); // content type of the attached file
HttpResponse httpResponse = new DefaultHttpClient().execute(httpPutRequest);

Get Documents

// Query a view by key; the path and query string must not contain spaces
String criteria = "_design/views/_view/get_docs_by_id_name?key=10000";
HttpGet httpGetRequest = new HttpGet(hostUrl + "/" + dbName + "/" + criteria);
HttpResponse httpResponse = new DefaultHttpClient().execute(httpGetRequest);
HttpEntity entity = httpResponse.getEntity();
if (entity != null) {
    // Read the response body as a String (org.apache.http.util.EntityUtils)
    String resultString = EntityUtils.toString(entity, "UTF-8");
    // Transform the String into a JSONObject; the matching entries are in its "rows" array
    JSONObject jsonResult = new JSONObject(resultString); // get the attachments from the JSON
}

Delete Attachments by filename

String couchUrl = hostUrl + "/" + databaseName + "/" + documentId + "/" + fileName + "?rev=" + revisionId;
 HttpDelete httpDeleteRequest = new HttpDelete(couchUrl);
 httpDeleteRequest.setHeader("Accept", "application/json");
 httpDeleteRequest.setHeader("Content-type", "application/json");
 HttpResponse httpResponse = (HttpResponse) new DefaultHttpClient().execute(httpDeleteRequest);

Delete All Attachments

// Deleting the document itself also removes all of its attachments
String couchUrl = hostUrl + "/" + databaseName + "/" + documentId + "?rev=" + revisionId;
 HttpDelete httpDeleteRequest = new HttpDelete(couchUrl);
 httpDeleteRequest.setHeader("Accept", "application/json");
 httpDeleteRequest.setHeader("Content-type", "application/json");
 HttpResponse httpResponse = (HttpResponse) new DefaultHttpClient().execute(httpDeleteRequest);

GET /{db}/_changes

This request retrieves a sorted list of the changes made to documents in the database, in the order in which they were applied, from the database’s _changes resource.

This request can be used to listen for updates and modifications to the database for post-processing or synchronization. A continuously connected changes feed is a reasonable approach for generating a real-time log for most applications.

  • http://HOSTNAME:5984/media-db/_changes
  • http://HOSTNAME:5984/media-db/_changes?since=10 (Returns only the changes made after the given sequence number.)
  • http://HOSTNAME:5984/media-db/_changes?feed=continuous&heartbeat=5000 (Keeps the connection open and sends each change in the media-db database as it happens. While there are no changes, CouchDB sends an empty line every 5 seconds to keep the connection alive.)
  • http://HOSTNAME:5984/media-db/_changes?filter=myfilters/filterMyDocs (A custom filter is used here to monitor only a specific set of documents rather than all Couch documents in the database. The filter function is stored in the design document _design/myfilters under filters.filterMyDocs.)
    function(doc, req) {
      if (doc["fileCategory"] == 'Financial Data') {
        return true;
      } else {
        return false;
      }
    }

    This filter passes only those Couch documents whose ‘fileCategory’ attribute is equal to ‘Financial Data’.
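The URLs in the list above can also be assembled in code. The following Java sketch builds a _changes URL from a set of query parameters and shows one way to skip heartbeat lines when reading a continuous feed. The class and method names are invented for this example, parameter values are assumed to need no URL encoding, and the sample feed lines are simplified stand-ins for what CouchDB actually sends.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ChangesFeed {
    // Build the _changes URL for a database from ordered query parameters.
    static String changesUrl(String hostUrl, String dbName, Map<String, String> params) {
        String base = hostUrl + "/" + dbName + "/_changes";
        if (params.isEmpty()) {
            return base;
        }
        String query = params.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
        return base + "?" + query;
    }

    // In a continuous feed each change arrives as one line of JSON and a
    // heartbeat arrives as an empty line; keep only the lines with a change.
    static List<String> dropHeartbeats(List<String> lines) {
        return lines.stream()
                .filter(line -> !line.isBlank())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("feed", "continuous");
        params.put("heartbeat", "5000");
        params.put("filter", "myfilters/filterMyDocs");
        System.out.println(changesUrl("http://HOSTNAME:5984", "media-db", params));

        // Simplified sample lines, including one heartbeat (the empty string).
        List<String> received = List.of("{\"seq\":11,\"id\":\"doc-a\"}", "", "{\"seq\":12,\"id\":\"doc-b\"}");
        System.out.println(dropHeartbeats(received));
    }
}
```

A LinkedHashMap is used so that the parameters appear in the URL in the order they were added, which keeps the output predictable.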

In the next blog we will see how to render a page using Couch Show/List functions and Couch Views…

Author

  • Sangamaeswaran T

    Sangamaeswaran works as Technical Leader with Trigent Software. He has more than ten years of experience in Java/J2ee web development and strong expertise in SAAS applications, CXF, RESTful Web Services, OFBiz, Spring, DOJO and Couch DB.