Document databases and BigTable

Sumit Rawal answered on May 20, 2023 Popularity 3/10 Helpfulness 1/10

Contents


More Related Answers

  • big data
  • create table using existing table bigquery
  • get big data from database

  • Document databases and BigTable

    0

    The BigTable paper describes how Google developed their own massively scalable database for internal use, as basis for several of their services. The data model is quite different from relational databases: columns don’t need to be pre-defined, and rows can be added with any set of columns. Empty columns are not stored at all.

    BigTable inspired many developers to write their own implementations of this data model; amongst the most popular are HBase, Hypertable and Cassandra. The lack of a pre-defined schema can make these databases attractive in applications where the attributes of objects are not known in advance, or change frequently.

    Document databases have a related data model (although the way they handle concurrency and distributed servers can be quite different): a BigTable row with its arbitrary number of columns/attributes corresponds to a document in a document database, which is typically a tree of objects containing attribute values and lists, often with a mapping to JSON or XML. Open source document databases include Project Voldemort, CouchDB, MongoDB, ThruDB and Jackrabbit.

    How is this different from just dumping JSON strings into MySQL? Document databases can actually work with the structure of the documents, for example extracting, indexing, aggregating and filtering based on attribute values within the documents. Alternatively you could of course build the attribute indexing yourself, but I wouldn’t recommend that unless it makes working with your legacy code easier.

    The big limitation of BigTables and document databases is that most implementations cannot perform joins or transactions spanning several rows or documents. This restriction is deliberate, because it allows the database to do automatic partitioning, which can be important for scaling — see the section on distributed key-value stores below. If the structure of your data is lots of independent documents, this is not a problem — but if your data fits nicely into a relational model and you need joins, please don’t try to force it into a document model.

    Popularity 3/10 Helpfulness 1/10 Language whatever
    Source: Grepper
    Link to this answer
    Share Copy Link
    Contributed on May 20 2023
    Sumit Rawal
    0 Answers  Avg Quality 2/10


    X

    Continue with Google

    By continuing, I agree that I have read and agree to Greppers's Terms of Service and Privacy Policy.
    X
    Grepper Account Login Required

    Oops, You will need to install Grepper and log-in to perform this action.