Monday, November 6, 2017

Reasons to use MongoDB (NoSQL) as your ECM's Database

In this article we will explore the advantages of using NoSQL databases such as MongoDB to store data and documents.

MongoDB is currently one of the most popular NoSQL databases. Unlike relational databases, it does not store data in tables. Instead, it stores documents in a JSON-like format (JSON stands for JavaScript Object Notation), a standard used by a large number of current applications. This makes integration between MongoDB and those applications much easier.
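As a minimal sketch of what such a document can look like, the following Python snippet builds a hypothetical ECM record and serializes it to JSON with the standard library. The field names are assumptions made up for illustration; they do not come from any particular system.

```python
import json

# Hypothetical ECM record; every field name here is an illustrative assumption.
invoice = {
    "title": "Invoice 2017-114",
    "type": "invoice",
    "tags": ["finance", "2017"],
    "author": {"name": "A. Smith", "department": "Accounting"},
    "pages": 3,
}

# Serialize to JSON text and parse it back.
# Nested objects and arrays survive the round trip unchanged.
as_json = json.dumps(invoice, indent=2)
restored = json.loads(as_json)

print(as_json)
```

Note that the nested "author" object and the "tags" array live inside the record itself, where a relational design would typically need extra tables and joins.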

The term NoSQL stands for "Not only SQL". NoSQL databases do not rely on a relational model, which is useful when the structure of your data can vary. Schemas can be changed without stopping the database, so NoSQL databases often adapt to real projects more easily than an entity-relationship model.
NoSQL databases also have a decentralized structure that lends itself to distributed deployments, which makes them easy to scale. The scalability is horizontal: you can add more machines with less computing capacity each, instead of resorting to a single, more powerful machine. This makes a NoSQL database a reasonable choice if you do not have a large budget for equipment.

Another advantage is that NoSQL databases are optimized for queries over large amounts of data. To give you an idea, Facebook, Twitter, Reddit and Foursquare all use NoSQL databases.

As for limitations, NoSQL databases generally do not offer such strict control over the atomicity of transactions, which is a significant advantage of relational databases. Atomicity is what allows a complete operation involving several tables to be performed without partial changes being exposed before the transaction ends. That is, either the entire transaction is carried out or none of it is. For example, atomicity ensures that a bank transfer is never left half done: if money is credited to one account, it must be debited from the other. Although this guarantee has a performance cost, atomicity is what maintains the integrity of the data and allows an orderly rollback if necessary. Many NoSQL databases, on the other hand, offer only eventual consistency.
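The bank transfer example can be sketched with SQLite, a relational database that ships with Python. This is only an illustration of the all-or-nothing behavior described above; the table, the account names and the `transfer` helper are assumptions made for the example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # both updates are rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the second call the credit to the destination account has already been executed when the check fails, yet the rollback undoes it together with the debit, so no money appears or disappears.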

The relational model has been in use for more than 40 years, so relational database products and tools are very mature. NoSQL databases are not yet fully standardized: each has its own query characteristics and does not necessarily maintain compatibility with SQL statements. However, the very large amounts of data that must be handled today open up broad prospects for these data stores. NoSQL is a good option for companies that run into performance, scalability or cost problems caused by large volumes of data.

Use in Document Management Systems

In a document management system with large volumes of documents and a high number of query, write and update transactions, a NoSQL approach can be much more efficient.

Relational databases require a schema to be defined: a structure, described in a formal language interpreted by the database engine, that lays out the skeleton of the tables and their relationships. This introduces a limitation, since the metadata of the stored documents is restricted to the fields and data types defined in that schema. This limitation does not exist in a NoSQL database.
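To make the difference concrete, here is a small sketch that uses a plain Python list as a stand-in for a collection. It is not the MongoDB API: the `find` helper and all field names are hypothetical, and the point is only that records with different metadata can sit side by side without a schema change.

```python
# In-memory stand-in for a document collection (not the MongoDB API).
# Each record carries whatever metadata makes sense for its type.
collection = [
    {"title": "Contract A", "type": "contract",
     "parties": ["Acme", "Globex"], "expires": "2019-01-31"},
    {"title": "Site photo", "type": "image", "width": 1920, "height": 1080},
    {"title": "Invoice 114", "type": "invoice",
     "total": 1250.50, "currency": "EUR"},
]


def find(docs, **criteria):
    """Hypothetical helper: return docs whose fields equal every criterion."""
    return [
        d for d in docs
        if all(d.get(key) == value for key, value in criteria.items())
    ]


invoices = find(collection, type="invoice")

# A new kind of metadata can be added later without any migration step.
collection.append({"title": "Memo", "type": "memo", "confidential": True})
confidential = find(collection, confidential=True)

print(invoices)
print(confidential)
```

Documents that lack a queried field are simply skipped, which is roughly how a missing field behaves in an equality query against a document store.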

In addition, within each table you must define constraints on the rows and columns, as well as the data type that can be stored in each column. With NoSQL, the type of each field is taken from the data itself, so these definitions do not have to be written up front, which reduces development time.
The high performance and scalability of MongoDB make it ideal for a system of this type. It exposes data as JSON documents, stores them internally in a binary format called BSON, and supports a dynamic schema. In a relational database, files can be stored as BLOB (binary large object) data types, whose maximum size depends on the engine (about 4 GB in some products), which limits the maximum size of a stored document. Although MongoDB supports a maximum size of 16 MB per document, this limitation can be overcome with GridFS, which divides the file into chunks for storage, so the total size is virtually unlimited.
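The chunking idea behind GridFS can be sketched in a few lines of plain Python. This is not the GridFS API, only an illustration of splitting and reassembling a file; the helper names are assumptions, and the chunk size used is the GridFS default of 255 KiB, which real drivers let you change.

```python
CHUNK_SIZE = 255 * 1024  # GridFS default chunk size (255 KiB)


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return chunk records, each holding a sequence number and its bytes."""
    return [
        {"n": index, "data": data[offset:offset + chunk_size]}
        for index, offset in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks):
    """Rebuild the original bytes by joining chunks in sequence order."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


# A 20 MiB file, which is larger than the 16 MB single-document limit.
payload = bytes(20 * 1024 * 1024)
chunks = split_into_chunks(payload)

print(len(chunks), "chunks, last one holds", len(chunks[-1]["data"]), "bytes")
```

Because every chunk is far below the per-document limit, the size of the whole file is bounded only by the number of chunks that can be stored.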

MongoDB also has a feature called "sharding", which balances load between servers by assigning different data to each of them, so that queries and inserts are distributed. Because documents resemble the objects used in application code, little or no mapping between application objects and database objects is needed. MongoDB keeps the working set in memory, which allows faster access to data. It also supports dynamic queries through a powerful document-based query language.
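The way sharding spreads data can be illustrated with a simplified hash-based router. This is a sketch under stated assumptions, not how MongoDB is implemented: real clusters divide the shard key space into ranges managed by the cluster itself, while here a hash of the key is simply taken modulo a fixed list of hypothetical shard names.

```python
import hashlib
from collections import Counter

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard names


def shard_for(key: str) -> str:
    """Pick a shard deterministically from a hash of the document key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[bucket]


# Route 9000 hypothetical document ids and count how many land on each shard.
doc_ids = [f"doc-{i}" for i in range(9000)]
load = Counter(shard_for(doc_id) for doc_id in doc_ids)

for shard in SHARDS:
    print(shard, load[shard])
```

The same key always maps to the same shard, so a lookup by key only has to contact one server, while inserts for different keys are spread across all of them.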

These general characteristics make MongoDB a very suitable database engine for a document management system.
