
© Vasan Subramanian 2017

V. Subramanian, Pro MERN Stack, DOI 10.1007/978-1-4842-2653-7_6

CHAPTER 6

Chapter 6 ■ Using MongoDB

Collections

A collection is like a table in a relational database. It is a set of documents, and you access each document via the collection. Just like in a relational database, you can have a primary key and indexes on the collection. But there are a few differences.

A primary key is mandated in MongoDB, and it has the reserved field name _id.

Even if you don’t supply an _id field when creating a document, MongoDB creates this field and auto-generates a unique key for every document. More often than not, the auto-generation is used as is, since it is convenient and guaranteed to produce unique keys even when multiple clients are writing to the database simultaneously. MongoDB uses a special data type called the ObjectId for the primary key.

The _id field is automatically indexed. Apart from this, indexes can be created on other fields, and this includes fields within embedded documents and array fields.

Indexes are used to efficiently access a subset of documents in a collection.

Unlike a relational database, MongoDB does not require you to define a schema for a collection. The only requirement is that all documents in a collection must have a unique _id, but the actual documents may have completely different fields. In practice, though, the documents in a collection usually do share the same fields. Although a flexible schema may seem very convenient for schema changes during the initial stages of an application, it can cause problems if you don’t have some kind of schema checking in the application code.
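The kind of schema checking mentioned here can be as small as a function that runs before every insert. The following is a minimal sketch, not a MongoDB feature: the function name validateEmployee and the rules it enforces are assumptions made for illustration, using an employee document with a nested name and an age.

```javascript
// Hypothetical application-level check; the rules are illustrative assumptions.
// `doc` is assumed to be an object.
function validateEmployee(doc) {
  const errors = [];
  if (!doc.name || typeof doc.name.first !== 'string') {
    errors.push('name.first must be a string');
  }
  if (!doc.name || typeof doc.name.last !== 'string') {
    errors.push('name.last must be a string');
  }
  if (typeof doc.age !== 'number') {
    errors.push('age must be a number');
  }
  return errors; // an empty array means the document passed every check
}

// A document with a missing last name and a string where a number is expected.
const problems = validateEmployee({name: {first: 'John'}, age: '44'});
console.log(problems);
```

An application would call such a function before handing the document to the database and reject the write if the returned array is not empty.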

Query Language

Unlike the universal English-like SQL in a relational database, the MongoDB query language is made up of methods to achieve various operations. The main methods for read and write operations are the CRUD methods. Other methods include aggregation, text search, and geospatial queries.

All methods operate on a collection and take parameters as JavaScript objects that specify the details of the operation. Each method has its own specification. For example, to insert a document, the only parameter you need is the document itself. For querying, the parameters are a match specification and a list of fields to return.

Unlike relational databases, there is no method that can operate on multiple collections at once. All methods operate on only one collection at a time. If you need to combine the results from multiple collections, you must query each collection separately and combine the results in the client. In a relational database, you can use joins to combine tables on fields that are common to the tables, so that the result includes the contents of both tables. You can’t do this in MongoDB and many other NoSQL databases. This restriction is what lets NoSQL databases scale by sharding, that is, by distributing the documents of a single collection across multiple servers.

Also, unlike relational databases, MongoDB encourages denormalization, that is, storing related parts of a document as embedded subdocuments rather than in separate collections, the way a relational database would use separate tables. Take the example of people (name, gender, etc.) and their contact information (primary address, secondary address, etc.). In a relational database, you would have separate tables for People and Contacts, then join the two tables when you need all of the information together. In MongoDB, on the other hand, you store the list of contacts within the same People document, thus avoiding a join.
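To make the contrast concrete, the sketch below shows both shapes as plain JavaScript data, along with the client-side work needed to combine two separately queried result sets. The field names (contacts, personId, type, address) and the sample values are assumptions for illustration only.

```javascript
// Denormalized: contacts are embedded in the person document itself.
const person = {
  _id: 1,
  name: 'John Doe',
  gender: 'M',
  contacts: [
    {type: 'primary', address: '1 Main St'},
    {type: 'secondary', address: '2 Side St'},
  ],
};

// Normalized: two separate result sets that the client must combine itself.
const people = [{_id: 1, name: 'John Doe', gender: 'M'}];
const contacts = [
  {personId: 1, type: 'primary', address: '1 Main St'},
  {personId: 1, type: 'secondary', address: '2 Side St'},
];

// Client-side "join": attach to each person the contacts that reference it.
const joined = people.map(p => ({
  ...p,
  contacts: contacts
    .filter(c => c.personId === p._id)
    .map(({type, address}) => ({type, address})),
}));

console.log(joined[0].contacts.length);
```

With the embedded form, a single read of the person document returns everything; with the separate form, the client has to run two queries and the mapping step above.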


Installation

MongoDB can be installed easily on OS X, Windows, and most distributions based on Linux. The installation instructions are different for each operating system and have a few variations depending on the OS flavor as well. Please install MongoDB by following the instructions at the MongoDB website (https://docs.mongodb.com/manual/installation/ or search for “mongodb installation” in your search engine).

Choose version 3.2 or higher, preferably the latest, as some of the examples use features introduced only in version 3.2. Most installation options let you install the server, the shell, and tools all in one. Check that this is the case; if not, you may have to install them separately.

After installation, ensure that you have started MongoDB server (the name of the daemon or service is mongod), if it is not already started by the installation process. Test the installation by running the mongo shell like this:

$ mongo

On a Windows system, you may need to add .exe to the command. The command may require a path depending on your installation method. If the shell starts successfully, it will also connect to the local MongoDB server instance. You should see the version of MongoDB printed on the console, the database it is connecting to (the default is test), and a command prompt, like this:

MongoDB shell version: 3.2.4
connecting to: test

>

If, instead, you see an error message, revisit the installation and server starting procedure.

The mongo Shell

The mongo shell is an interactive JavaScript shell, very much like the Node.js shell. In this interactive shell, there are a few non-JavaScript conveniences apart from the full power of JavaScript. In this section, you’ll look at the basic operations that are possible via the shell, those that are most commonly used. For a full reference of all you can do with the shell, take a look at the MongoDB documentation.

By default, the shell connects to a database called test. At any time, to know the current database, use the special command db like this:

> db

It should print the current database, which is, by default, test. To connect to another database, say a playground database, do this:

> use playground


Note that a database does not have to exist to connect to it. The first document creation will initiate the database creation if it doesn’t exist. The same applies to collections: the first creation of a document in a collection creates the collection. You can see the proof of this by listing the databases and collections in the current database:

> show databases

> show collections

You will see that playground is not listed in the databases, and the collections list is empty. Let’s create an employees collection by inserting a document. To insert a document, you use the insert() method on the collection, which you refer to as a property of the special variable db, named after the collection:

> db.employees.insert({name: {first: 'John', last: 'Doe'}, age: 44});

Now, if you list the databases and collections, you will find both playground and employees listed. Let’s also make sure that the first employee record has been created. To list the contents of a collection, you need to use the find() method of the collection:

> db.employees.find();

{ "_id" : ObjectId("57b1caea3475bb1784747ccb"), "name" : { "first" : "John", "last" : "Doe" }, "age" : 44 }

You can see that _id was automatically generated and assigned. If you wanted a prettier, indented listing of employees, you should use the pretty() method on the results of find() like this:

> db.employees.find().pretty()
{
        "_id" : ObjectId("57b1caea3475bb1784747ccb"),
        "name" : {
                "first" : "John",
                "last" : "Doe"
        },
        "age" : 44
}

Now, insert a few more documents with different names and ages. Add a middle name for someone, like this:

> db.employees.insert({name: {first: 'John', middle: 'H', last: 'Doe'}, age: 22});


This is what a flexible schema lets you do: you can enhance the schema whenever you discover a new data point that you need to capture, without having to explicitly modify the schema. In this case, it is implicit that any employee document where the middle field under name is missing indicates an employee without a middle name. If, on the other hand, you added a field that didn’t have an implicit meaning when absent, you’d either have to handle the absence in the code, or run a migration to modify all documents and add the field with a default value.

Note that MongoDB automatically generated the primary key for each of the documents, which is displayed as ObjectId("57b1caea3475bb1784747ccb") in the find() output. Just to reiterate, the ObjectId is a special data type, which is why it is displayed like that. You can convert an ObjectId to and from strings, which you’ll see a little later.

The insert() method can take in an array when inserting multiple documents together. The variations insertOne() and insertMany() were introduced in version 3.2 to make it more explicit whether the parameter is a single document or an array.
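As a sketch of how the two variations read in practice, the hypothetical helper below (addEmployees is not a MongoDB function) picks the method from the shape of its argument. It assumes it is handed a collection object, such as db.employees, that exposes insertOne() and insertMany().

```javascript
// Hypothetical helper: `collection` is assumed to expose insertOne() and
// insertMany(), as a collection does in MongoDB 3.2 and later.
function addEmployees(collection, docs) {
  // An array means several documents; anything else is treated as one.
  return Array.isArray(docs)
    ? collection.insertMany(docs)
    : collection.insertOne(docs);
}
```

In the mongo shell, this could be called as addEmployees(db.employees, [{name: {first: 'Jane', last: 'Roe'}, age: 30}]). Calling insertOne() or insertMany() directly is just as good; the point is that the method name states whether one document or an array is expected.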

To retrieve only the documents that you’re interested in, you need to supply a filter to find(). The filter specification is an object where the property name is the field that you want to filter on, and the value is the value that you want to match. Say you want to find all employees aged 44; this is what you would do:

> db.employees.find({age: 44});

The output would be similar to the output of find() without filters as described previously. The filter is actually a shortcut for age: {$eq: 44}, where $eq is the operator.

Other operators for comparison are available, such as $gt for greater than. If you need to compare and match fields within embedded documents, you can refer to the field using the dot notation (which will require you to put quotes around the field name). If there are multiple field specifications, all of them have to match.
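These matching rules can be modeled in a few lines of plain JavaScript. This is only an illustrative model that runs in memory, not how MongoDB is implemented: the functions getField and matches are assumptions, they understand just $eq, $gt, and $gte, they treat any object value in the filter as an operator specification, and they ignore array fields.

```javascript
// Resolve a dotted path such as 'name.first' against a nested document.
function getField(doc, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]), doc);
}

// The only comparison operators this sketch understands.
const operators = {
  $eq: (actual, expected) => actual === expected,
  $gt: (actual, expected) => actual > expected,
  $gte: (actual, expected) => actual >= expected,
};

// True only if every field specification in the filter matches.
function matches(doc, filter) {
  return Object.keys(filter).every(path => {
    const spec = filter[path];
    // A plain value is shorthand for {$eq: value}.
    const conditions =
      spec !== null && typeof spec === 'object' ? spec : {$eq: spec};
    const actual = getField(doc, path);
    return Object.keys(conditions).every(
      op => operators[op](actual, conditions[op]));
  });
}

const employees = [
  {name: {first: 'John', last: 'Doe'}, age: 44},
  {name: {first: 'John', middle: 'H', last: 'Doe'}, age: 22},
];
// Both field specifications must hold, so only the first employee qualifies.
const found = employees.filter(
  e => matches(e, {'name.first': 'John', age: {$gte: 44}}));
console.log(found.length);
```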

A second parameter can be passed to restrict the fields that are returned. This is called the projection. The format of this specification is an object with one or more field names as the key and the value as 0 or 1, to indicate exclusion or inclusion. Unfortunately, you cannot combine 0s and 1s: you can either start with nothing and include the fields you want using 1s, or start with everything and exclude fields using 0s. The _id field is an exception; it is always included unless you specify a 0. The following will find employees whose first name is John, aged 44 or more, and print only their first names and ages:

> db.employees.find({'name.first': 'John', age: {$gte: 44}}, {'name.first': 1, age: 1})
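The effect of an inclusion projection, including the special handling of _id, can be modeled in memory as follows. This is an illustrative sketch only: the function project is an assumption, it handles just inclusion (values of 1) plus the _id: 0 exception, and it does not attempt to cover array fields.

```javascript
// Illustrative in-memory model of an inclusion projection.
function project(doc, projection) {
  const result = {};
  // _id is included unless the projection explicitly says _id: 0.
  if (projection._id !== 0 && '_id' in doc) {
    result._id = doc._id;
  }
  for (const path of Object.keys(projection)) {
    if (path === '_id' || projection[path] !== 1) continue;
    const keys = path.split('.');
    let source = doc;
    let target = result;
    for (let i = 0; i < keys.length; i++) {
      const key = keys[i];
      if (source == null || !(key in Object(source))) break; // field absent
      if (i === keys.length - 1) {
        target[key] = source[key]; // copy the requested leaf field
      } else {
        target[key] = target[key] || {}; // descend into the embedded document
        target = target[key];
        source = source[key];
      }
    }
  }
  return result;
}

const sample = {_id: 7, name: {first: 'John', last: 'Doe'}, age: 44};
console.log(project(sample, {'name.first': 1, age: 1}));
```

Passing {_id: 0, age: 1} instead would leave only the age field in the result.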

The method findOne() is a variation that returns a single document rather than a cursor that can be iterated over.


In order to update a document, you first need to specify the filter that matches the document to update, and then specify the modifications. The filter specification is the same as for a read operation. Typically, the filter is the ID of the document so that you are sure you update one and only one document. You can replace the entire document by supplying the document as the second parameter. If you want to only change a few fields, you do it by using the $set operator, like this:

> db.employees.update({_id: ObjectId("57b1caea3475bb1784747ccb")}, {$set: {age: 44}})

You identify the document to modify using its primary key, _id. In order to generate the special data type ObjectId, you need to use the mongo shell’s built-in function ObjectId() and pass it a string representation of the ID. The update results in a message like this (the output may vary depending on the version of MongoDB you have installed):

WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

The update() method can take more options, one of which is the upsert option. When this is set to true, MongoDB will find the document based on the filter, but if it doesn’t find one, it will create one based on the document supplied. Of course, this isn’t useful if you are using the object ID to identify the object. It’s useful if the key to search for is a unique key (like the employee number). The second parameter can be either the entire document or a patch specification using $set; in the latter case, the new document is created from the equality fields in the filter, with the $set fields applied.

The variations updateMany() and updateOne() were introduced in version 3.2 to make the intention of the update explicit. Using these variations is recommended over the plain update(), because the method you call, rather than the number of documents the filter happens to match, determines whether one or many documents are updated.
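A short sketch of how the explicit variations might be wrapped in application code follows. The helper names setAge and setForAll are hypothetical, and the collection argument is assumed to be an object such as db.employees that exposes updateOne() and updateMany().

```javascript
// Hypothetical helpers; `collection` is assumed to expose updateOne() and
// updateMany(), as a collection does in MongoDB 3.2 and later.

// Change the age of exactly one employee, identified by its primary key.
function setAge(collection, id, age) {
  return collection.updateOne({_id: id}, {$set: {age: age}});
}

// Apply the same field changes to every employee that matches the filter.
function setForAll(collection, filter, changes) {
  return collection.updateMany(filter, {$set: changes});
}
```

In the mongo shell, setAge(db.employees, ObjectId("57b1caea3475bb1784747ccb"), 45) would touch at most one document, whereas setForAll(db.employees, {age: {$gte: 44}}, {senior: true}) would touch every match (the senior field is made up for this example).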

To delete a document, use the remove method with a filter, just as in the find method:

> db.employees.remove({"_id" : ObjectId("57b1caea3475bb1784747ccb")})

WriteResult({ "nRemoved" : 1 })

If you expect a field to be used often for filtering, you should create an index on that field to make the search more efficient. Failing this, MongoDB has to scan the entire collection for a match. To create an index on the age field, do this:

> db.employees.createIndex({age: 1})


Shell Scripting

A mongo shell script is a regular JavaScript program, with all the collection methods available as built-ins. One difference from the interactive shell is that you don’t have the convenience commands such as use and the default global variable db. You must initialize them within the shell script programmatically, like this:

var db = new Mongo().getDB("playground");

Add a few more statements in a script file, the same that you typed in the mongo shell for inserting and reading documents. To execute the script, supply it as a parameter to the mongo shell like this (if you have saved the file as test.mongo.js):

$ mongo test.mongo.js
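Putting the pieces together, a complete test.mongo.js might look like the sketch below. The helper insertAndList and the sample employee are assumptions made for illustration; Mongo and printjson are provided by the mongo shell, and the typeof guard is only there so that the file also loads without side effects in other JavaScript environments.

```javascript
// test.mongo.js: a sketch of a complete script. insertAndList() is a
// hypothetical helper that takes the database handle as a parameter.
function insertAndList(db, show) {
  db.employees.insert({name: {first: 'Jane', last: 'Roe'}, age: 30});
  // A script does not echo results the way the interactive shell does,
  // so each document is passed to `show` explicitly.
  db.employees.find().forEach(show);
}

// Run only inside the mongo shell, where Mongo and printjson are defined.
if (typeof Mongo !== 'undefined') {
  var db = new Mongo().getDB('playground');
  insertAndList(db, printjson);
}
```

Each run of the script inserts one more document, so the listing grows by one employee every time it is executed.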