In my last blog, I talked about NoSQL databases and their types. This blog focusses on CouchDB, a document-based store for documents made up of tagged elements.
CouchDB exposes its data as JSON over HTTP, which makes it simple to use from Web applications. Developers should look to CouchDB when seeking a reliable database, as every change is stored as a new document revision. This gives CouchDB a solid basis for redundancy and for detecting and resolving conflicts. Because CouchDB uses multi-version concurrency control (MVCC), ‘writes’ do not block ‘reads’.
Additionally, CouchDB boasts a strong replication model that allows for Master-Master replication and filtered replication streams. This robust replication model opens up interesting possibilities for scaling a database.
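The revision-based write model described above can be pictured with a toy, in-memory sketch. This is not how CouchDB is implemented; the class `ToyRevisionStore`, its methods and the integer "generation" counter are hypothetical names used only to illustrate the idea that a write succeeds only when the writer has seen the latest revision, while readers are never blocked.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy in-memory model of revision-checked writes; an illustration, not CouchDB's implementation. */
final class ToyRevisionStore {

    /** Signals a stale revision, loosely analogous to an HTTP 409 Conflict response. */
    static final class ConflictException extends RuntimeException {
        ConflictException(String message) {
            super(message);
        }
    }

    /** One immutable stored revision: a generation counter plus the document body. */
    private static final class Entry {
        final int generation;
        final String body;

        Entry(int generation, String body) {
            this.generation = generation;
            this.body = body;
        }
    }

    private final ConcurrentMap<String, Entry> docs = new ConcurrentHashMap<>();

    /**
     * Stores a new revision only if the caller saw the latest one.
     * Pass 0 as expectedGeneration when creating a new document.
     * Returns the generation of the newly stored revision.
     */
    int put(String id, int expectedGeneration, String body) {
        Entry updated = docs.compute(id, (key, current) -> {
            int currentGeneration = (current == null) ? 0 : current.generation;
            if (expectedGeneration != currentGeneration) {
                // The mapping is left unchanged when the function throws.
                throw new ConflictException("stale revision for document " + key);
            }
            return new Entry(currentGeneration + 1, body);
        });
        return updated.generation;
    }

    /** Returns the latest body without locking, or null if the document does not exist. */
    String get(String id) {
        Entry current = docs.get(id);
        return (current == null) ? null : current.body;
    }
}
```

In this sketch a second writer that still holds generation 1 after someone else has written generation 2 gets a `ConflictException` and has to re-read the document before retrying, which mirrors the way a CouchDB client must supply the current `_rev` when updating.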
Advantages of using CouchDB
- Open Source
- Provides ACID (Atomicity, Consistency, Isolation, Durability) semantics
- Can handle a high volume of concurrent readers and writers without locking, thanks to MVCC (Multi-Version Concurrency Control)
- Supports HTTP and REST protocols
- Bi-directional replication, continuous or ad-hoc with conflict detection
- Previous versions of documents are made available
- Views can be built using embedded map/reduce functions
- Authentication and attachment handling
- Real-time updates via ‘_changes’ like RDBMS triggers
Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run, i.e. places where versioning is important.
Examples:
- CMS systems, where master-master replication is an especially interesting feature because it allows easy multi-site deployments.
- ERP applications where product media files, such as images and documents, can be managed and served.
Naming a Couch document
Every Couch document stored in a CouchDB database has a document ID (_id). The _id is a case-sensitive string that uniquely identifies the document. Two Couch documents in the same database cannot have the same identifier, because CouchDB would treat them as the same document. It is recommended to avoid special characters in document IDs. If you do use special characters, make sure the ID is properly URL-encoded and decoded.
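As a minimal sketch of the URL-encoding point above, the helper below percent-encodes a document ID before it is placed in a URL path. The class name `DocIdUrls`, its methods and the host and database values in the usage note are hypothetical, chosen only for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper for building document URLs from arbitrary document IDs. */
final class DocIdUrls {
    private DocIdUrls() {}

    /** Percent-encodes a document ID so it can be used as a single URL path segment. */
    static String encodeDocId(String docId) {
        try {
            // URLEncoder targets form encoding and turns spaces into "+",
            // so convert those to "%20" for use inside a URL path.
            // A literal "+" in the ID has already become "%2B" at this point.
            return URLEncoder.encode(docId, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Joins host URL, database name and encoded document ID. */
    static String documentUrl(String hostUrl, String dbName, String docId) {
        return hostUrl + "/" + dbName + "/" + encodeDocId(docId);
    }
}
```

For example, `DocIdUrls.documentUrl("http://localhost:5984", "media-db", "invoice 2015/01")` yields `http://localhost:5984/media-db/invoice%202015%2F01`, so the space and the slash inside the ID cannot be mistaken for URL syntax.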
Special Fields
Field Name | Description |
---|---|
_id | The unique identifier of the document |
_rev | The current MVCC-token/revision of this document |
_attachments | If the document has attachments, _attachments holds a (meta-)data structure |
_deleted | Indicates that this document has been deleted and previous revisions will be removed on next compaction run |
_revisions | Revision history of the document |
_revs_info | A list of revisions of the document, and their availability |
_conflicts | Information about conflicts |
_deleted_conflicts | Information about deleted conflicting revisions |
_local_seq | Sequence number of the revision in the database (as found in the _changes feed) |
What a typical Couch document looks like:

    {
      "_id": "sdfsdfsdfff2f2af0c50e6c0",
      "_rev": "20-bbc1d13bd06186e52a761e0687df7f07",
      "Make": "Dodge",
      "createdDateStamp": "2015-01-09 : 18:02:26",
      "Interior Color": "Black",
      "Exterior Color Code": "PXT",
      "Engine Name": "HEMI 5.7L V8 370hp 395ft. lbs.",
      "user_id": "pkollias",
      "Interior Color Code": "X9",
      "Style": "R/T 4dr Sedan",
      "Equipment": {
        "Towing": [{ "Evidence of Hitch Installed?": "None" }],
        "Radio": [{ "Radio Present?": "Factory" }],
        "Headrests": [{ "Headrests Present?": "No" }, { "Quantity Missing": "0" }],
        "Entertainment": [{ "Evidence of Entertainment System Installed?": "None" }]
      }
    }
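A `_rev` value such as the one in the document above starts with a number that grows with each update, followed by a dash and a hash. The sketch below extracts that numeric prefix; the class `Revisions` and the method `generation` are hypothetical names, and applications should normally treat `_rev` as an opaque token that is simply passed back to CouchDB.

```java
/** Hypothetical helper that reads the numeric prefix of a CouchDB revision token. */
final class Revisions {
    private Revisions() {}

    /** Returns the number before the first dash, e.g. 20 for "20-bbc1d13b...". */
    static int generation(String rev) {
        int dash = rev.indexOf('-');
        if (dash <= 0) {
            throw new IllegalArgumentException("not a revision token: " + rev);
        }
        // Throws NumberFormatException if the prefix is not a number.
        return Integer.parseInt(rev.substring(0, dash));
    }
}
```

A call like `Revisions.generation("20-bbc1d13bd06186e52a761e0687df7f07")` returns 20, which can be handy in logs to see roughly how often a document has been updated.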
CouchDB has inbuilt support for HTTP methods
GET
To retrieve a document, simply perform a GET operation at the document’s URL.
PUT
To create a new document you can either use a POST operation or a PUT operation. To create/update a named document using the PUT operation, the URL must point to the document’s location.
POST
The POST operation can be used to create a new document with a server generated DocID. To do so, the URL must point to the database’s location. To create a named document, use the PUT method instead.
DELETE
To delete a document, perform a DELETE operation at the document’s location, passing the rev parameter with the document’s current revision. If successful, it will return the revision id for the deletion stub.
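The four operations above map directly onto request objects. The sketch below uses the JDK's built-in `java.net.http` API (Java 11 or later) rather than the Apache HttpClient used in the rest of this post, purely because it needs no extra dependency. The class `CouchRequests` and its method names are hypothetical, and `dbUrl` is assumed to be the full database URL, for example `http://localhost:5984/media-db`.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical factory that builds (but does not send) CouchDB document requests. */
final class CouchRequests {
    private CouchRequests() {}

    /** GET at the document's URL retrieves the document. */
    static HttpRequest getDocument(String dbUrl, String docId) {
        return HttpRequest.newBuilder(URI.create(dbUrl + "/" + docId))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    /** PUT at the document's URL creates or updates a named document. */
    static HttpRequest putDocument(String dbUrl, String docId, String json) {
        return HttpRequest.newBuilder(URI.create(dbUrl + "/" + docId))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** POST at the database's URL creates a document with a server-generated ID. */
    static HttpRequest postDocument(String dbUrl, String json) {
        return HttpRequest.newBuilder(URI.create(dbUrl))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** DELETE at the document's URL, passing the current revision as the rev parameter. */
    static HttpRequest deleteDocument(String dbUrl, String docId, String rev) {
        return HttpRequest.newBuilder(URI.create(dbUrl + "/" + docId + "?rev=" + rev))
                .header("Accept", "application/json")
                .DELETE()
                .build();
    }
}
```

Any of these requests could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. The sketch assumes the document ID and revision are already URL-safe.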
Attachments
Documents can have attachments just like emails. There are two ways to use attachments:
- the first one is a REST API that addresses individual attachments by URLs;
- the second is inline, with the attachment embedded in the document itself.
Please note that attachments may have embedded / characters that are sent unescaped to CouchDB. You can use this to provide a subtree of attachments under a document.
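For the inline variant, CouchDB expects the attachment inside the document's `_attachments` field, with a `content_type` and the Base64-encoded `data`. The following is a minimal sketch of building such a document body by hand; the class `InlineAttachments` and its method are hypothetical, and the sketch assumes the file name and content type contain no characters that would need JSON escaping. In real code a JSON library would be the safer choice.

```java
import java.util.Base64;

/** Hypothetical helper that builds a document body carrying one inline attachment. */
final class InlineAttachments {
    private InlineAttachments() {}

    /**
     * Returns a JSON document whose _attachments field holds the given bytes.
     * Assumes fileName and contentType need no JSON escaping.
     */
    static String documentWithAttachment(String fileName, String contentType, byte[] data) {
        // Inline attachment data is transferred as Base64 text.
        String base64 = Base64.getEncoder().encodeToString(data);
        return "{\"_attachments\":{\"" + fileName + "\":{"
                + "\"content_type\":\"" + contentType + "\","
                + "\"data\":\"" + base64 + "\"}}}";
    }
}
```

The resulting string can be sent as the body of the PUT or POST request that creates the document, so the document and its attachment are stored in one round trip.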
Creating a Couch Document
    String hostUrl = "http://USERNAME:PASSWORD@SERVER_NAME:5984";
    // or, over HTTPS: "https://USERNAME:PASSWORD@SERVER_NAME:6984"
    // Create a document with a server-generated ID by POSTing JSON to the database URL
    HttpPost httpPostRequest = new HttpPost(hostUrl + "/" + databaseName);
    StringEntity body = new StringEntity(jsonDoc.toString(), "UTF-8");
    httpPostRequest.setEntity(body);
    httpPostRequest.setHeader("Accept", "application/json");
    httpPostRequest.setHeader("Content-type", "application/json");
    HttpResponse httpResponse = new DefaultHttpClient().execute(httpPostRequest);
    HttpEntity entity = httpResponse.getEntity();
    if (entity != null) {
        // Read the content stream
        InputStream instream = entity.getContent();
        // Convert content stream to a String
        String resultString = FileUtil.convertStreamToString(instream);
        instream.close();
        // Transform the String into a JSONObject
        JSONObject jsonResult = new JSONObject(resultString);
        return jsonResult;
    }
Adding Documents as Attachments
    // URLEncoder produces "+" for spaces; use "%20" because the name goes into the URL path
    String encodedFileName = java.net.URLEncoder.encode(fileName, "UTF-8").replace("+", "%20");
    HttpPut httpPutRequest = new HttpPut(hostUrl + "/" + dbName
            + (docID != null ? "/" + docID : "")
            + "/" + encodedFileName
            + (revID != null ? "?rev=" + revID : ""));
    ByteArrayEntity body = new ByteArrayEntity(attachData); // attachData: byte array with the file content
    httpPutRequest.setEntity(body);
    httpPutRequest.setHeader("Content-type", attachFileType);
    HttpResponse httpResponse = new DefaultHttpClient().execute(httpPutRequest);
Get Documents
    // Query a view by key; the URL must not contain spaces
    String criteria = "_design/views/_view/get_docs_by_id_name?key=10000";
    HttpGet httpGetRequest = new HttpGet(hostUrl + "/" + dbName + "/" + criteria);
    HttpResponse httpResponse = new DefaultHttpClient().execute(httpGetRequest);
    HttpEntity entity = httpResponse.getEntity();
    if (entity != null) {
        // Read the content stream
        InputStream instream = entity.getContent();
        // Convert content stream to a String
        String resultString = FileUtil.convertStreamToString(instream);
        instream.close();
        // Transform the String into a JSONObject
        JSONObject jsonResult = new JSONObject(resultString);
        // get the attachments from the JSON
    }
Delete Attachments by filename
    // Delete a single attachment; rev must be the document's current revision
    String couchUrl = hostUrl + "/" + databaseName + "/" + documentId + "/" + fileName + "?rev=" + revisionId;
    HttpDelete httpDeleteRequest = new HttpDelete(couchUrl);
    httpDeleteRequest.setHeader("Accept", "application/json");
    httpDeleteRequest.setHeader("Content-type", "application/json");
    HttpResponse httpResponse = new DefaultHttpClient().execute(httpDeleteRequest);
Delete All Attachments
    // Deleting the document itself removes it together with all of its attachments
    String couchUrl = hostUrl + "/" + databaseName + "/" + documentId + "?rev=" + revisionId;
    HttpDelete httpDeleteRequest = new HttpDelete(couchUrl);
    httpDeleteRequest.setHeader("Accept", "application/json");
    httpDeleteRequest.setHeader("Content-type", "application/json");
    HttpResponse httpResponse = new DefaultHttpClient().execute(httpDeleteRequest);
GET /{db}/_changes
This request returns a sorted list of the changes made to documents in the database, in the order in which they were applied. The list is served from the database’s _changes resource.
This request can be used to listen for updates and modifications to the database for post-processing or synchronization. A continuously connected changes feed is a reasonable approach for generating a real-time log for most applications.
- http://HOSTNAME:5984/media-db/_changes
- http://HOSTNAME:5984/media-db/_changes?since=10 (Returns from the change immediately after the given sequence number.)
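- http://HOSTNAME:5984/media-db/_changes?feed=continuous&heartbeat=5000 (Keeps the connection open and streams changes to the media-db database as they happen; the heartbeat makes CouchDB send an empty line every 5 seconds so that the connection stays alive.)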
- http://HOSTNAME:5984/media-db/_changes?filter=myfilters/filterMyDocs (A custom filter is used here because I want to monitor only a specific set of documents, not all Couch documents within the database.)

    function(doc, req) {
      // Pass only documents whose fileCategory is 'Financial Data'
      return doc["fileCategory"] == 'Financial Data';
    }

This filter passes only those Couch documents whose ‘fileCategory’ attribute equals ‘Financial Data’.
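The feed URLs listed above can also be assembled programmatically. Below is a minimal sketch; the class `ChangesUrls` and its method are hypothetical, and the parameter names (`since`, `feed`, `heartbeat`, `filter`) are simply the ones used in the examples above. Parameter values are URL-encoded, so a filter name such as `myfilters/filterMyDocs` appears as `myfilters%2FfilterMyDocs`, which is an equivalent way of writing the same query value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that builds _changes feed URLs from query parameters. */
final class ChangesUrls {
    private ChangesUrls() {}

    /** Returns hostUrl/dbName/_changes, followed by the encoded query parameters, if any. */
    static String changesUrl(String hostUrl, String dbName, Map<String, String> params) {
        String base = hostUrl + "/" + dbName + "/_changes";
        if (params.isEmpty()) {
            return base;
        }
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> param : params.entrySet()) {
            query.add(param.getKey() + "=" + encode(param.getValue()));
        }
        return base + "?" + query;
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

Passing a `LinkedHashMap` keeps the parameters in insertion order, so `feed=continuous` followed by `heartbeat=5000` reproduces the continuous-feed URL shown in the list.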
In the next blog we will see how to render a page using Couch Show/List functions and Couch Views…