MongoDB - Document schema - Performance vs. modification anomaly trade-off
I need to design a document schema for a blog application.
One approach is the MongoDB document below, designed primarily to match the application's data access patterns.
> db.posts.find().pretty()
{
    "_id": ObjectId("5099f5eabcf1bf2d90ea41ad"), // post 1
    "author": "xyz",
    "body": "this test body",
    "comments": [
        {
            "body": "this comment",
            "email": "alan@tech.com",
            "author": "alan donald"
        },
        {
            "body": "this comment\r\n",
            "email": "alan@tech.com",
            "author": "alan donald"
        }
    ],
    "date": ISODate("2012-11-07T05:47:22.941Z"),
    "permalink": "this_is_a_test_post",
    "tags": [
        "cycling",
        "mongodb",
        "swimming"
    ],
    "title": "this test post"
}
The schema above supports the following data access patterns:
1) collecting the most recent blog entries for the blog home page
2) collecting all the information needed to display a single post
3) collecting all comments by a single author
but not:
providing a table of contents by tag
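The tag listing is the awkward case because the tags live inside each post, so every post has to be read and its tags array fanned out. A rough sketch of that fan-out follows, written as plain JavaScript over an in-memory array so that it runs without a server. The function name tableOfContentsByTag and the second sample post are illustrative assumptions, and the pipeline in the comment (built from the $unwind, $group and $sort stages) is only one way this might be expressed in the shell.

```javascript
// Hypothetical sample data shaped like the embedded posts schema.
const posts = [
  { permalink: "this_is_a_test_post", tags: ["cycling", "mongodb", "swimming"] },
  { permalink: "another_post", tags: ["mongodb"] }, // made-up second post
];

// Roughly what this pipeline would compute on the server (untested sketch):
//   db.posts.aggregate([
//     { $unwind: "$tags" },
//     { $group: { _id: "$tags", posts: { $push: "$permalink" } } },
//     { $sort: { _id: 1 } }
//   ])
function tableOfContentsByTag(allPosts) {
  const byTag = new Map();
  for (const post of allPosts) {
    // Fan out the embedded tags array, one entry per (tag, post) pair.
    for (const tag of post.tags || []) {
      if (!byTag.has(tag)) byTag.set(tag, []);
      byTag.get(tag).push(post.permalink);
    }
  }
  // Sort the entries by tag name.
  return [...byTag.entries()]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([tag, permalinks]) => ({ tag, posts: permalinks }));
}

const toc = tableOfContentsByTag(posts);
console.log(JSON.stringify(toc, null, 2));
```

Every post is visited once, which is the cost the embedded layout imposes on this particular access pattern.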
Another approach is a document schema that leans toward a relational design, like this:
> db.posts.find().pretty()
{
    "_id": "post1", // use ObjectId BSON type
    "title": "this test post",
    "body": "this test body",
    "date": ISODate("2012-11-07T05:47:22.941Z")
}
> db.comments.find().pretty()
{
    "_id": 3, // use ObjectId BSON type
    "post_id": "post1",
    "author": "alan donald",
    "author_email": "alan@tech.com",
    "nth": 0,
    "body": "this comment"
}
{
    "_id": 4, // use ObjectId BSON type
    "post_id": "post1",
    "author": "alan donald",
    "author_email": "alan@tech.com",
    "nth": 1,
    "body": "this comment\r\n"
}
> db.tags.find().pretty()
{
    "_id": 5, // use ObjectId BSON type
    "tag": "cycling",
    "post_id": "post1"
}
{
    "_id": 6, // use ObjectId BSON type
    "tag": "mongodb",
    "post_id": "post1"
}
{
    "_id": 7, // use ObjectId BSON type
    "tag": "swimming",
    "post_id": "post1"
}
Comparison:
1) MongoDB does not inherently support join operations among collections. So approach 1 looks better, because approach 2 requires multiple queries and a join on the results of those queries in the application.
2) Approach 1 looks better because embedding (pre-joining) the comments in the post document keeps the data consistent, even though MongoDB lacks foreign key constraints.
3) Before version 4.0, MongoDB does not support multi-document transactions, but it does support atomic operations at the single-document level. So approach 1 looks better.
Posts and comments have a one-to-many relation. The "many" side may be large or few.
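The "multiple queries plus a join on the results" in point 1 could look roughly like the following. It is a sketch over in-memory arrays standing in for the posts and comments collections of approach 2. The helper names findPost, findComments and loadPostWithComments are hypothetical, and each of the first two stands in for one round trip to the database.

```javascript
// In-memory stand-ins for the collections of approach 2 (sample data from the question).
const postsCollection = [
  { _id: "post1", title: "this test post", body: "this test body" },
];
const commentsCollection = [
  { _id: 4, post_id: "post1", author: "alan donald", nth: 1, body: "this comment\r\n" },
  { _id: 3, post_id: "post1", author: "alan donald", nth: 0, body: "this comment" },
];

// Query 1: fetch the post (would be a single lookup by _id on posts).
function findPost(postId) {
  return postsCollection.find((p) => p._id === postId) || null;
}

// Query 2: fetch its comments in order (would be a find on comments, sorted by nth).
function findComments(postId) {
  return commentsCollection
    .filter((c) => c.post_id === postId) // filter returns a copy, so sort is safe
    .sort((a, b) => a.nth - b.nth);
}

// The "join": the application stitches the two results together.
function loadPostWithComments(postId) {
  const post = findPost(postId);
  if (post === null) return null;
  return { ...post, comments: findComments(postId) };
}

const joined = loadPostWithComments("post1");
console.log(joined.comments.map((c) => c._id));
```

With approach 1 the same page needs only the first query, since the comments already travel with the post.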
Question:
With approach 1, each document (post) in the db.posts collection contains multiple comments with redundant data. This enhances performance but is prone to modification anomalies. Is there a better approach to the schema design?
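To make the modification anomaly concrete: with approach 1 the commenter's email is copied into every embedded comment, so changing it means touching every copy. A minimal in-memory sketch, in which the function updateCommenterEmail and the replacement address are made up for illustration:

```javascript
// Embedded posts as in approach 1; the same commenter appears in several places.
const embeddedPosts = [
  {
    permalink: "this_is_a_test_post",
    comments: [
      { body: "this comment", email: "alan@tech.com", author: "alan donald" },
      { body: "this comment\r\n", email: "alan@tech.com", author: "alan donald" },
    ],
  },
];

// Rewrites every embedded copy of oldEmail and reports how many were touched.
// Missing even one copy would leave the data inconsistent.
function updateCommenterEmail(allPosts, oldEmail, newEmail) {
  let touched = 0;
  for (const post of allPosts) {
    for (const comment of post.comments || []) {
      if (comment.email === oldEmail) {
        comment.email = newEmail;
        touched += 1;
      }
    }
  }
  return touched;
}

const copiesUpdated = updateCommenterEmail(embeddedPosts, "alan@tech.com", "alan@example.com");
console.log(copiesUpdated);
```

On the server this would be an update spanning many documents, which is where the single-document atomicity noted in the comparison stops helping.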
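In approach 1, storing the comments in an array has a limitation in MongoDB: a single BSON document cannot exceed 16 MB, so the array cannot grow without bound. See https://docs.mongodb.com/manual/reference/limits
Also, from MongoDB 3.2 onwards you can use $lookup in the aggregation pipeline to perform a join.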
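A $lookup stage for the collections in approach 2 might look like the sketch below. The stage fields (from, localField, foreignField, as) are the documented $lookup parameters. The helper name postWithCommentsPipeline is an assumption, and the pipeline is built as a plain object here so that it can be inspected without a running server.

```javascript
// Builds an aggregation pipeline that fetches one post and joins its comments.
// Intended use in the shell (not executed here):
//   db.posts.aggregate(postWithCommentsPipeline("post1"))
function postWithCommentsPipeline(postId) {
  return [
    { $match: { _id: postId } },
    {
      $lookup: {
        from: "comments",        // collection to join with
        localField: "_id",       // field on the posts document
        foreignField: "post_id", // field on the comments documents
        as: "comments",          // output array field added to each post
      },
    },
  ];
}

const pipeline = postWithCommentsPipeline("post1");
console.log(JSON.stringify(pipeline, null, 2));
```

$lookup performs a left outer join, so a post with no comments still comes back, with an empty comments array.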