From RDBMS to NoSQL – 1 week with MongoDB

I have talked before about how I felt restricted rather than served by the relational model in my current project. The Scrum master suggested trying MongoDB, since it seemed suitable for the task at hand, and so I did. It was a very interesting experience.

MongoDB is a document-based database and one of the most popular NoSQL DBMSs available. The basic premise is that you store your data in “documents”, which are essentially a more sophisticated JSON representation of your data entities. Documents are stored in “collections”, which are usually schema-less. That is to say, you don’t have to shape your data in strict ways to store and retrieve it. This postpones and reduces the database design effort that a relational model demands up front. For example, say you have a lot of similar objects that you want to store and that you retrieve in similar use cases. In an RDBMS, you will likely have to structure them as a base table with a type field plus several secondary tables that implement the different types and refer back to the base table. What about Mongo? Well, you can simply dump all of these “different” documents in the same collection, and each document carries exactly the fields its type needs! When you add a new type or modify an existing one, you might not need to do anything database-wise. Awesome, right? But this isn’t even the best part!
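To make the “one collection, many shapes” idea concrete, here is a minimal sketch in plain Python. It does not talk to a real MongoDB server: a list of dicts stands in for a collection, and the `shapes` data, its field names, and the `find` helper are all made up for illustration.

```python
# A plain list of dicts stands in for a MongoDB collection.
# The "shapes" data and every field name here are hypothetical.
shapes = [
    {"_id": 1, "type": "circle", "radius": 2.0},
    {"_id": 2, "type": "rectangle", "width": 3.0, "height": 4.0},
    {"_id": 3, "type": "circle", "radius": 0.5, "label": "small"},
]


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc
        for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# Each document only carries the fields its own type needs.
circles = find(shapes, type="circle")
rectangles = find(shapes, type="rectangle")
```

With the pymongo driver, the rough equivalent would be `db.shapes.insert_many(shapes)` followed by `db.shapes.find({"type": "circle"})`, assuming a database handle named `db`.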

The best part, for me at least, is the arbitrary complexity of the documents! A document can have sub-documents that have their own sub-documents, and so on, almost ad infinitum (MongoDB caps nesting at 100 levels and a single document at 16 MB). MongoDB recommends embedding data that belongs together as much as possible. In my case, I have templates with various entries that differ a lot in their attributes and can have entries of their own. This was annoying to implement in an RDBMS, since I had to create several tables and follow a specific order of operations when saving a new template in order to satisfy all the foreign keys and the like. In Mongo, I just dump the whole template in a collection and that’s it. It also makes a lot of sense, since these entries really belong to the template, and if I delete the template I want them gone too. Now that they are literally a part of the template “document”, deleting the document deletes them too, without running a lot of queries on a lot of tables while paying attention to foreign-key constraints. Obviously, there are times when embedding is inappropriate and you need to point to other documents by reference instead.
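As a rough illustration of embedding, here is a sketch of what such a template could look like as a single nested document. The structure and every field name (`entries`, `kind`, `label`, `options`) are invented for this example and are not the real schema of my project. A dict keyed by `_id` stands in for the collection.

```python
# One nested document holds the template and everything it owns.
# All field names and values are hypothetical.
template = {
    "_id": "onboarding",
    "title": "Onboarding form",
    "entries": [
        {"kind": "text", "label": "Name"},
        {
            "kind": "group",
            "label": "Address",
            "entries": [
                {"kind": "text", "label": "Street"},
                {"kind": "choice", "label": "Country", "options": ["DE", "FR"]},
            ],
        },
    ],
}


def count_entries(node):
    """Count the entries nested under a template or entry, at any depth."""
    return sum(1 + count_entries(child) for child in node.get("entries", []))


# A dict keyed by _id stands in for the "templates" collection.
templates = {template["_id"]: template}
total_entries = count_entries(template)

# Removing the document removes all embedded entries with it:
# there are no child tables to clean up in a particular order.
del templates["onboarding"]
```

With pymongo this would be roughly a single `db.templates.insert_one(template)` to save and a single `db.templates.delete_one({"_id": "onboarding"})` to delete, again assuming a database handle named `db`.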

This leads us to the parts where Mongo seems less than ideal. The schema-free nature and the focus on rich, flexible documents come at the expense of inter-document data integrity. There seems to be no real equivalent of foreign-key constraints in MongoDB, so the application has to maintain referential integrity on its own. The same issue pops up, to some extent, for the integrity of the collections themselves. How far to restrict the fields a collection accepts, and how much freedom to leave for future changes, is a compromise the designer has to make.
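Here is a small sketch of what “the application maintains integrity on its own” can mean in practice. It uses plain Python lists as stand-ins for two collections. The `users` and `templates` data, the `owner_id` field, and both helper functions are hypothetical.

```python
# Two stand-in collections; "owner_id" plays the role of a foreign key.
# All names and values are hypothetical.
users = [{"_id": "u1", "name": "Ada"}, {"_id": "u2", "name": "Linus"}]
templates = [
    {"_id": "t1", "owner_id": "u1"},
    {"_id": "t2", "owner_id": "u9"},  # points at a user that does not exist
]


def dangling_references(children, parents, ref_field):
    """Return the _ids of children whose reference matches no parent.

    A child without the reference field at all is reported as dangling too.
    """
    parent_ids = {parent["_id"] for parent in parents}
    return [
        child["_id"]
        for child in children
        if child.get(ref_field) not in parent_ids
    ]


def delete_user(user_id):
    """Refuse to delete a user that some template still references."""
    if any(t.get("owner_id") == user_id for t in templates):
        raise ValueError(f"user {user_id!r} still owns templates")
    users[:] = [u for u in users if u["_id"] != user_id]


broken = dangling_references(templates, users, "owner_id")
```

In an RDBMS the database itself would reject both the dangling `t2` row and the deletion of a referenced user. Here those checks live in application code and only work if every code path remembers to call them. For the second trade-off, restricting which fields a collection accepts, MongoDB does let you attach a `$jsonSchema` validator to a collection, which is one way to pick a point on that compromise.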

All in all, Mongo was a rather exciting experience. I think it suits my project well, since the project has a lot of complex nested entities that solely own their members, and the project itself is in flux, with many changes expected in the future. It is certainly possible to structure an RDBMS well enough to adapt to future changes, but Mongo makes this task a breeze. On the other hand, if you have fixed, well-defined structures that are mostly not owned by a single parent entity, or if you need strong integrity guarantees at the database level, then a classic RDBMS is probably your best choice.
